Skip to content

Conversation

@malthejorgensen
Copy link
Contributor

Currently doing a query like BlogPost.objects(authors__in=author).count(), where author is a single MongoEngine-Document, raises an obscure error:

raise ValidationError(message, errors=errors, field_name=field_name)
mongoengine.errors.ValidationError: u'id' is not a valid ObjectId, it must
be a 12-byte input or a 24-character hex string

This pull requests adds a "sanity" check, and raises a TypeError explaining that you need to use an iterable with the in-operator.

The correct thing to do is: BlogPost.objects(authors__in=[author]).count(), but that's not obvious when you read the ValidationError shown above.

Review on Reviewable

Copy link
Member

@wojcikstefan wojcikstefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @malthejorgensen, this is great! It's almost ready to merge except for a few tweaks/generalizations. Will you be able to address my comments any time soon? Also, sorry it took that long to get back to you!

value = value['_id']

elif op in ('in', 'nin', 'all', 'near') and not isinstance(value, dict):
if isinstance(value, BaseDocument):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all of the above truly expect an iterable, then we should raise an error for any non-iterable, not just a BaseDocument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the problem is actually that BaseDocument is iterable.

Copy link
Contributor Author

@malthejorgensen malthejorgensen Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the checks mentioned in this StackOverflow thread will say that BaseDocument is iterable. (for k in user will iterate over each field of the document user)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I missed that, thanks for clearing that up! I think in such case we can change this condition to:

# Raise an error if the in/nin/all/near param is not iterable. We need a
# special check for BaseDocument, because - although it's iterable - using
# it as such in the context of this method is most definitely a mistake.
if not hasattr(value, '__iter__') or isinstance(value, BaseDocument):
   raise TypeError(...)

The comment is crucial, too, given that this is a pretty unintuitive isinstance check.What do you think @malthejorgensen ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I'll make the change

BlogPost.objects(authors__in=author).count()

# Check correct usage
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block is redundant - we already check it above.

self.fail("Using __in-operator raised type error when it shouldn't have")

User.drop_collection()
BlogPost.drop_collection()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are redundant - by convention we should just take care of dropping existing collections at the beginning of each test (which you already do).

@malthejorgensen
Copy link
Contributor Author

I'll take a look at it now :)

@malthejorgensen
Copy link
Contributor Author

malthejorgensen commented Dec 5, 2016

  • I've added a separate test, so that there's now one for non-iterable values and one for Document values.
  • I've removed the redundant part of the test (default behaviour of __in), and the redundant database cleanup (at the end of the test)
  • I've changed the check so that there's a check for iterables and one for Document values

There's a bunch of small commits so squashing would make sense, and I also have a version of the branch that is rebased on the latest master -- but pushing it will remove the code review you made.

Copy link
Member

@wojcikstefan wojcikstefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution @malthejorgensen

@wojcikstefan
Copy link
Member

@malthejorgensen you can now push the rebased version of the branch. Ping me when you do and I'll merge it. Thanks again!

@malthejorgensen malthejorgensen force-pushed the feature/in-operator-catch-no-list branch 2 times, most recently from 0266c21 to 1a01a1d Compare December 6, 2016 13:08
@malthejorgensen
Copy link
Contributor Author

malthejorgensen commented Dec 6, 2016

Absolutely no problem – I'm just glad to help! I'm only sorry that the process is turning out to be a bit of hassle for you.

I'm now the getting the flake8 error ./mongoengine/queryset/transform.py:30:1: C901 'query' is too complex (44) from Travis – at the time where I originally forked, I guess the flake8-checks weren't there.

I have a different branch (feature/in-operator-catch-no-list-flake-C901) which should pass, where I've moved the value preparation out into its own function, but if we trust flake8's analysis, it's really just hiding an overarching problem – namely that the query()-function is getting too complex. And the decision of how to break it into smaller pieces, is a hard one.

We could also change the max-complexity=42-setting in setup.cfg, but that's also not ideal.

Ping @wojcikstefan

@wojcikstefan
Copy link
Member

@malthejorgensen I apologize for all the back and forth, but since I merged #1428 a) there are conflicts again, b) setup.cfg increased allowed complexity to 45, which should cover your tweak (if it doesn't just increase it for now - I'll deal with simplifying it soon).

Also, please use _import_class('BaseDocument') inline in the method (e.g. like CachedReferenceField) so that we don't have to introduce a circular import.

Doing a query like `BlogPost.objects(authors__in=author).count()`, where
`author` is a single Document, will now raise a TypeError explaining
that you cannot use a single `Document` with the `in`-operator.
i.e. `BlogPost.objects(authors__in=[author]).count()`

Similarly using the `in`-operator with a non-iterable will also raise a
TypeError explaining that you need to use an iterable with the `in`-operator.

The same applies to the `nin`, `all`, and `near` operators.

Note:
  The added code uses `_import_class('BaseDocument')` to break any circular
  dependency.
`queryset/transform.py::query()` has complexity 47.
@malthejorgensen malthejorgensen force-pushed the feature/in-operator-catch-no-list branch from 9fb9788 to de8d65d Compare December 12, 2016 09:51
@malthejorgensen
Copy link
Contributor Author

No problem @wojcikstefan – I'm just sorry to be taking your time for a small PR like mine
I think it should be ready now

@wojcikstefan
Copy link
Member

Thank you very much @malthejorgensen !

@wojcikstefan wojcikstefan merged commit 76524b7 into MongoEngine:master Dec 13, 2016
@wojcikstefan
Copy link
Member

FYI I made you tests a bit more concise while retaining all of the logic we want to test: 8e884fd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants