-
Notifications
You must be signed in to change notification settings - Fork 6
Changed vocabulary term indexing from string to integer. #397
Conversation
search/__init__.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the SearchQuerySet know that v_id
is a vocabulary id and not a term id or other number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can only be a vocabulary. It so happens it's an integer, but if you look at the lines 25-26 this replaces, the logic has not changed at all.
It can never be a term because, in the key/value store of Elasticsearch, the vocabulary integers are keys and the term integers are values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To more fully elaborate, unless we add hard-coded values to course, run, and resource_type that are integers for some reason, there can be no collision.
bccbb1f
to
dec56e4
Compare
ui/views.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename to vocab_id and term_data for consistency with the above for loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or whatever names make sense in this context, as long as they're the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in HipChat, I've decided not to do this. It adds duplicated code in multiple places to add vocab_
and term_
, then requires more code to strip them out again for database lookups.
Less code is always better, and it's more efficient as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't argue further on changing the facet key, but please change v_id to something more descriptive. If it can contain resource_type
or run
it shouldn't be named v_id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed:
diff --git search/__init__.py search/__init__.py
index b4067e7..e4915b9 100644
--- search/__init__.py
+++ search/__init__.py
@@ -22,6 +22,6 @@ def get_sqs():
sqs = sqs.facet(facet)
# Add dynamic facets (from taxonomy). Certain characters cause problems,
# so use the primary key.
- for v_id in Vocabulary.objects.all().values_list("id", flat=True):
- sqs = sqs.facet(v_id)
+ for vocabulary_id in Vocabulary.objects.all().values_list("id", flat=True):
+ sqs = sqs.facet(vocabulary_id)
return sqs
diff --git ui/views.py ui/views.py
index ba914a2..8080449 100644
--- ui/views.py
+++ ui/views.py
@@ -165,12 +165,12 @@ def get_vocabularies(facets):
id__in=term_ids).values_list('id', 'label')
}
vocabularies = {}
- for v_id, t_data in facets["fields"].items():
- if not v_id.isdigit():
+ for vocabulary_id, term_data in facets["fields"].items():
+ if not vocabulary_id.isdigit():
continue
- vocab = (v_id, vocabs[int(v_id)])
+ vocab = (vocabulary_id, vocabs[int(vocabulary_id)])
vocabularies[vocab] = []
- for t_id, count in t_data:
+ for t_id, count in term_data:
vocabularies[vocab].append((t_id, terms[int(t_id)], count))
return vocabularies
dec56e4
to
a477a5e
Compare
I will merge this after tests pass |
Cool, thanks. 😺 |
Changed vocabulary term indexing from string to integer.
Prevents any funny business with special characters. Incidentally saves
space in Elasticsearch.
Fixes #384
Replaces and closes #374.
Replaces and closes #316.