Fully implement and test projection filtering #2352

sutartmelson · 2017-09-20T17:17:42Z

brianhelba · 2017-09-22T01:49:15Z

girder/models/model_base.py

+    :type fields: `str, list of strings, or tuple of strings for fields to be included from the
+        document, or dict for an inclusion or exclusion projection`.
+    :param overwrite: Additional document key(s) to be included or not excluded in fields.
+    :type overwrite: `str, list of strings, or tuple of strings for fields to be included from the


I realize that other places in Girder are flexible like this, but IMO, it's burdensome to implement this sort of argument flexibility. str, list, and tuple have distinctive behavior and semantics, and there's only one that's actually appropriate here: list. I see nothing wrong with expecting overwrite to always be passed as a list.

brianhelba · 2017-09-22T01:49:52Z

girder/models/model_base.py

+    if fields is None:
+        return fields
+
+    overwrite = list(overwrite)


No need for this, we're not mutating overwrite.

brianhelba · 2017-09-22T01:52:42Z

girder/models/model_base.py

-            # If this is a list/tuple/set, that means inclusion
-            return True
+            # Inclusion projection (str, list, or tuple)
+            copy = list(fields)


Since the parent API allows a str, I suppose we need to support that case here. As-is, this will turn each of the characters in the string into a list element, so you need to specially check if fields is a string and place it in a list with [fields].

This bug wasn't caught because the tests don't ever pass a standalone str. Since we need to support that, we should test it at least once.
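
To make the pitfall concrete, here is a minimal sketch. The helper name `normalize_fields` is hypothetical and is not part of Girder; it only illustrates the special case for a standalone str. On Python 2, `six.string_types` would take the place of `str`.

```python
def normalize_fields(fields):
    """Return an inclusion projection as a new list of field names.

    Hypothetical helper for illustration; not Girder's implementation.
    """
    if isinstance(fields, str):
        # A bare string names a single field. Calling list() on it would
        # split it into individual characters instead.
        return [fields]
    return list(fields)


# The bug: list() iterates over the characters of a string.
assert list('name') == ['n', 'a', 'm', 'e']

# The special case keeps the field name intact.
assert normalize_fields('name') == ['name']
assert normalize_fields(['name', 'size']) == ['name', 'size']
```

A test that passes a bare string, like the last two assertions, is the kind of case the comment asks for.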

brianhelba · 2017-09-22T01:55:38Z

girder/models/model_base.py

+
+    :param fields: A mask for filtering result documents by key, or None to return the full
+        document, passed to MongoDB find() as the `projection` param.
+    :type fields: `str, list of strings, or tuple of strings for fields to be included from the


We need to support fields being a list (which is correct), or a str (for legacy purposes), but a tuple makes no sense semantically (IMO, that's intended for structures where the ordinal position has semantic meaning). I'd cut all the references to tuple throughout this PR. It will still work, but there's no reason we need to explicitly document that.

brianhelba · 2017-09-22T02:00:19Z

girder/models/model_base.py

+            copy = list(fields)
+            for entry in overwrite:
+                if entry not in copy:
+                    copy.append(entry)


This would be simpler (and theoretically more efficient) as:

copy = list(set(fields) | set(overwrite))
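
As a rough comparison of the two approaches (a sketch with made-up field names): the set union yields the same members as the loop, but it does not preserve the original order of fields.

```python
fields = ['name', 'size', 'created']
overwrite = ['access', 'name']

# Loop version from the diff: keeps the order of `fields` and appends
# only the entries that are not already present.
merged_loop = list(fields)
for entry in overwrite:
    if entry not in merged_loop:
        merged_loop.append(entry)

# Set-union version: same members, but the resulting order is arbitrary.
merged_set = list(set(fields) | set(overwrite))

assert merged_loop == ['name', 'size', 'created', 'access']
assert sorted(merged_set) == sorted(merged_loop)
```

For an inclusion projection the order of the field names should not change which fields come back, so the difference is mostly cosmetic. It does matter for any test that compares the result against an ordered list.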

brianhelba · 2017-09-22T02:08:32Z

girder/models/model_base.py

+                if k not in whitelist:
+                    del doc[k]
+    else:
+        fields = list(fields)


No need to make a copy of fields here (iterating through a tuple works the same, anyway). If it's a string, we would need to wrap it in a list with [fields], since list(fields) would split it into characters.

Also, let's add a test for the case when fields is a string.

brianhelba · 2017-09-22T02:10:59Z

girder/models/model_base.py

+                    del doc[k]
+    else:
+        fields = list(fields)
+        for k in list(doc.keys()):


If you want to iterate over keys, it's most memory-efficient to do:

for k in six.viewkeys(doc):

in Girder, or:

for k in doc.viewkeys():

in Python 2, or:

for k in doc.keys():

in Python 3.

sutartmelson · 2017-09-22

The problem I was having is that the dictionary changes during iteration, since I'm deleting keys. Copying doc.keys() to a list solved that problem.
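
The situation described here can be reproduced with a small example (the field names are invented). On Python 3, deleting from a dict while iterating over its live key view raises RuntimeError; iterating over a snapshot of the keys avoids that.

```python
doc = {'secret': 'x', '_id': 1, 'name': 'a'}
whitelist = {'_id', 'name'}

# Iterating over the live key view while deleting fails on Python 3.
broken = dict(doc)
try:
    for k in broken.keys():
        if k not in whitelist:
            del broken[k]
    raised = False
except RuntimeError:
    # "dictionary changed size during iteration"
    raised = True

# Iterating over a snapshot of the keys is safe.
fixed = dict(doc)
for k in list(fixed.keys()):
    if k not in whitelist:
        del fixed[k]

assert raised
assert fixed == {'_id': 1, 'name': 'a'}
```

The snapshot costs one extra list of keys, which is the memory trade-off the preceding comment is pointing at.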

brianhelba · 2017-09-22T02:12:56Z

girder/models/model_base.py

+                    del doc[k]
+    else:
+        fields = list(fields)
+        for k in list(doc.keys()):


As this is now, it's got time complexity O(doc * fields). It's possible to get it down to O(fields * log(doc)), which for fields << doc would actually matter.
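
One way to reduce the cost is sketched below, under the assumption that fields is a plain list of top-level key names (no dotted paths). Instead of scanning every key of the document against a list, it looks up each requested field directly. The function name is hypothetical, and unlike the code in the diff it returns a new dict rather than deleting keys in place.

```python
def filter_inclusion(doc, fields):
    """Return a new dict with only the requested keys that exist in doc.

    Hypothetical sketch: one average-case O(1) dict lookup per field, so
    the work scales with len(fields) instead of len(doc) * len(fields).
    """
    return {k: doc[k] for k in fields if k in doc}


doc = {'_id': 1, 'name': 'a', 'size': 10, 'secret': 'x'}

assert filter_inclusion(doc, ['name', 'size']) == {'name': 'a', 'size': 10}
# Requested fields that are missing from the document are skipped.
assert filter_inclusion(doc, ['name', 'missing']) == {'name': 'a'}
```

If in-place deletion has to be kept, converting fields to a set before the loop gives a similar benefit, since each membership test then no longer scans a list.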

brianhelba · 2017-09-22T02:14:42Z

girder/models/model_base.py

@@ -1267,24 +1338,16 @@ def load(self, id, level=AccessType.ADMIN, user=None, objectId=True,

        # Ensure we include access and public, they are needed by requireAccess
        loadFields = copy.copy(fields)


No need to copy here, if we're doing it inside _overwriteFields.

brianhelba · 2017-09-22T02:19:01Z

tests/cases/model_test.py

+        retval = _overwriteFields(exclusionProjDict, ['newValue'])
+        self.assertEqual(retval, exclusionProjDict)
+        retval = _overwriteFields(inclusionProjDict, ['newValue'])
+        self.assertEqual(retval, {


Let's test the fact that retval should not be mutated by _overwriteFields.

sutartmelson · 2017-09-22

retval is never passed to _overwriteFields, so I'm not sure what that would be testing against.

Or are you saying that the first argument, in this case inclusionProjDict, should not be mutated by _overwriteFields?
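
A non-mutation check along the lines being discussed might look like the following sketch. `overwrite_fields_sketch` is a hypothetical stand-in written for this example, not Girder's `_overwriteFields`; the point is the pattern of snapshotting the argument before the call and comparing it afterwards.

```python
import copy


def overwrite_fields_sketch(fields, overwrite):
    """Hypothetical stand-in: add `overwrite` keys to an inclusion dict."""
    result = dict(fields)  # work on a copy so the caller's dict is untouched
    for key in overwrite:
        result[key] = True
    return result


inclusion = {'name': True, 'size': True}
snapshot = copy.deepcopy(inclusion)

retval = overwrite_fields_sketch(inclusion, ['newValue'])

assert retval == {'name': True, 'size': True, 'newValue': True}
assert inclusion == snapshot      # the argument was not mutated
assert retval is not inclusion    # a new object was returned
```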

brianhelba

LGTM!

brianhelba · 2017-09-27T16:39:19Z

@sutartmelson Actually, there are 2 uncovered lines in _isInclusionProjection. Would it be easy to add another test statement to cover these?

sutartmelson · 2017-09-27T16:41:07Z

@brianhelba yup, I'm on it.

brianhelba · 2017-09-27T18:39:03Z

@manthey Can you take a look?

manthey · 2017-09-27T19:19:00Z

It looks fine.

One petty comment: at least in early versions of Python, newdict = original.copy() was mildly faster than copying a dict via newdict = dict(original), so to my eyes the constructor looks less efficient (and less obvious) than original.copy() or dict.copy(original). I haven't checked, so it is distinctly possible that they are internally the same at this point.
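
Both spellings produce an equal, independent shallow copy; a quick check is sketched below. The timing calls are included only as a way to measure locally. No figures are claimed here, since the relative speed depends on the interpreter version.

```python
import timeit

original = {'access': 1, 'public': 1}

via_method = original.copy()
via_constructor = dict(original)

assert via_method == via_constructor == original
assert via_method is not original and via_constructor is not original

# Both are shallow copies: top-level changes do not leak back.
via_method['name'] = 1
assert 'name' not in original

# Optional local measurement; results vary by Python version and machine.
t_method = timeit.timeit('d.copy()', globals={'d': original}, number=100000)
t_ctor = timeit.timeit('dict(d)', globals={'d': original}, number=100000)
print('copy(): %.4fs, dict(): %.4fs' % (t_method, t_ctor))
```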

brianhelba

Two more petty issues that I noticed while I was using this code in my own PR.

brianhelba · 2017-09-27T20:52:27Z

girder/models/model_base.py

-        :param exc: If not found, throw a ValidationException instead of
-            returning None.
+        :type fields: list or dict
+        :param exc: If not found, throw a ValidationException instead of returning None.
        :type exc: bool
        :raises ValidationException: If an invalid ObjectId is passed.
        :returns: The matching document, or None if no match exists.
        """



No need for an extra newline here.

brianhelba · 2017-09-27T20:52:58Z

girder/models/model_base.py

-            else:
-                loadFields = list(set(loadFields) | {'access', 'public'})
+        loadFields = fields
+        overwriteFields = {'access', 'public'}


You can put this definition inside the if not force: block.

brianhelba · 2017-09-27T20:55:12Z

@sutartmelson This still LGTM. Feel free to fix the nits raised by myself and @manthey in an additional commit if you'd like (I'll reapprove), or just merge this yourself if you're happy as-is.

brianhelba · 2017-09-28T03:30:48Z

@sutartmelson Never mind, let's merge as-is. I'll implement my suggestions in a forthcoming PR.

sutartmelson added the bug and domain: server labels Sep 20, 2017

Fully implement and test projection filtering

5340fb2

sutartmelson force-pushed the inclusion-projection branch from e5b10e0 to 5340fb2 Compare September 20, 2017 19:30

brianhelba requested changes Sep 22, 2017

Deprecate str support for fields param

cbbf841

sutartmelson force-pushed the inclusion-projection branch from b431c6b to cbbf841 Compare September 25, 2017 14:52

brianhelba previously approved these changes Sep 27, 2017

Test _isInclusionProjection edge cases

1e040b3

sutartmelson dismissed brianhelba’s stale review via 1e040b3 September 27, 2017 16:50

Refactor as static methods

59e6e5f

brianhelba approved these changes Sep 27, 2017

brianhelba merged commit 0f23904 into master Sep 28, 2017

brianhelba deleted the inclusion-projection branch September 28, 2017 03:30

brianhelba mentioned this pull request Sep 28, 2017

Ensure that the "fields" parameter to "Model.load" always works #2366

Merged
