Skip to content
This repository was archived by the owner on Jan 28, 2020. It is now read-only.

Conversation

ShawnMilo
Copy link
Contributor

Prevents any funny business with special characters. Incidentally saves
space in Elasticsearch.

Fixes #384
Replaces and closes #374.
Replaces and closes #316.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the SearchQuerySet know that v_id is a vocabulary id and not a term id or other number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can only be a vocabulary. It so happens it's an integer, but if you look at the lines 25-26 this replaces, the logic has not changed at all.

It can never be a term because, in the key/value store of Elasticsearch, the vocabulary integers are keys and the term integers are values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To more fully elaborate, unless we add hard-coded values to course, run, and resource_type that are integers for some reason, there can be no collision.

@ShawnMilo ShawnMilo force-pushed the bug/skm/384_nonalpha branch 2 times, most recently from bccbb1f to dec56e4 Compare July 21, 2015 15:24
ui/views.py Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename to vocab_id and term_data for consistency with the above for loop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or whatever names make sense in this context, as long as they're the same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in HipChat, I've decided not to do this. It adds duplicated code in multiple places to add vocab_ and term_, then requires more code to strip them out again for database lookups.

Less code is always better, and it's more efficient as-is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't argue further on changing the facet key, but please change v_id to something more descriptive. If it can contain resource_type or run it shouldn't be named v_id

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed:

diff --git search/__init__.py search/__init__.py
index b4067e7..e4915b9 100644
--- search/__init__.py
+++ search/__init__.py
@@ -22,6 +22,6 @@ def get_sqs():
         sqs = sqs.facet(facet)
     # Add dynamic facets (from taxonomy). Certain characters cause problems,
     # so use the primary key.
-    for v_id in Vocabulary.objects.all().values_list("id", flat=True):
-        sqs = sqs.facet(v_id)
+    for vocabulary_id in Vocabulary.objects.all().values_list("id", flat=True):
+        sqs = sqs.facet(vocabulary_id)
     return sqs
diff --git ui/views.py ui/views.py
index ba914a2..8080449 100644
--- ui/views.py
+++ ui/views.py
@@ -165,12 +165,12 @@ def get_vocabularies(facets):
             id__in=term_ids).values_list('id', 'label')
     }
     vocabularies = {}
-    for v_id, t_data in facets["fields"].items():
-        if not v_id.isdigit():
+    for vocabulary_id, term_data in facets["fields"].items():
+        if not vocabulary_id.isdigit():
             continue
-        vocab = (v_id, vocabs[int(v_id)])
+        vocab = (vocabulary_id, vocabs[int(vocabulary_id)])
         vocabularies[vocab] = []
-        for t_id, count in t_data:
+        for t_id, count in term_data:
             vocabularies[vocab].append((t_id, terms[int(t_id)], count))
     return vocabularies

Prevents any funny business with special characters. Incidentally saves
space in Elasticsearch.

Fixes #384
Replaces and closes #374.
Replaces and closes #316.
@ShawnMilo ShawnMilo force-pushed the bug/skm/384_nonalpha branch from dec56e4 to a477a5e Compare July 21, 2015 16:42
@noisecapella
Copy link
Contributor

I will merge this after tests pass

@ShawnMilo
Copy link
Contributor Author

Cool, thanks. 😺

noisecapella added a commit that referenced this pull request Jul 21, 2015
Changed vocabulary term indexing from string to integer.
@noisecapella noisecapella merged commit ebb5bfe into master Jul 21, 2015
@noisecapella noisecapella deleted the bug/skm/384_nonalpha branch July 21, 2015 17:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Handle non-alphanumeric characters in terms.

2 participants