This repository has been archived by the owner on Jan 23, 2024. It is now read-only.

Bulk update limit #92

Merged
merged 16 commits into master on Aug 15, 2016

Conversation

@wojcikstefan (Member) commented Aug 12, 2016

Primarily to minimize the effects of a poorly constructed request.

After this change, flask-mongorest limits bulk updates to 1k documents by default. If more documents than that would be affected, a 400 response is returned.

This PR also introduces a method that you can use to validate the request before a bulk update starts processing.
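
A minimal sketch of what tuning that cap could look like on a resource, assuming the limit ends up exposed as a bulk_update_limit class attribute (the attribute name, the Post document, and the module path are assumptions):

from flask_mongorest.resources import Resource

from myapp.documents import Post  # hypothetical mongoengine document


class PostResource(Resource):
    document = Post
    # Assumed attribute name; the PR only guarantees a 1k default and a 400 response when exceeded.
    bulk_update_limit = 500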

thomasst and others added 8 commits August 10, 2016 21:13
- Introduces two helper methods (`update_objects` and `update_object`) that can be overridden in subclasses
- Makes sure `view_method` is propagated to subresources
- Fixes an issue where `has_change_permission` is called after the object is updated in bulk updates
- Fixes an issue where `obj.save()` is unnecessarily called after `update_object()` (which already saves) in bulk updates
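
Building on the PostResource sketch above, a hedged illustration of overriding the per-object helper from the first commit; the signature is assumed rather than taken from this PR, and updated_by is a hypothetical audit field:

class AuditedPostResource(PostResource):
    def update_object(self, obj, data=None, save=True, parent_resources=None):
        # Assumed signature: adjust the object before delegating to the default update/save logic.
        obj.updated_by = 'bulk-import'
        return super(AuditedPostResource, self).update_object(
            obj, data=data, save=save, parent_resources=parent_resources)
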
@wojcikstefan (Member, Author)

@thomasst pls review

@thomasst (Member)

I was initially thinking we would make the default implementation of validate_bulk_update do the check, but either way is fine with me.

@wojcikstefan (Member, Author)

Yea, I hear ya, though I'd prefer to expose the validation method before we hit the database. Basically, it's a similar concept to what validate_request does for single PUTs and POSTs. I think of the extra bulk_update_limit check as something separate.
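
For context, a hedged sketch of the kind of per-object check validate_request handles for single PUTs and POSTs; the hook signature, the exception type, and the is_locked field are assumptions:

from flask_mongorest.exceptions import ValidationError
from flask_mongorest.resources import Resource


class PostResource(Resource):
    def validate_request(self, obj=None):
        # Assumed hook: called with the parsed payload available as self.data.
        if obj is not None and obj.is_locked and 'status' in self.data:
            raise ValidationError('The status of a locked post cannot be changed.')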

@thomasst (Member)

Some more thoughts on the validation logic:

  • What if validate_request was called for the PUT request (with obj == None initially), and you could implement your validation based on the view_method?
  • Then, validate_bulk_update could take a list of objs as a parameter.
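
A hedged sketch of what that proposal might look like from a subclass's point of view; none of this reflects the merged implementation, and the hook signatures are illustrative:

from flask_mongorest import methods
from flask_mongorest.exceptions import ValidationError
from flask_mongorest.resources import Resource


class PostResource(Resource):
    def validate_request(self, obj=None):
        # Proposed: also called once for the bulk PUT itself, with obj left as None.
        if obj is None and self.view_method == methods.BulkUpdate and 'status' not in self.data:
            raise ValidationError('Bulk updates must include a status.')

    def validate_bulk_update(self, objs):
        # Proposed: receives the list of objects the bulk update would touch.
        if len(objs) > 1000:
            raise ValidationError('Bulk update would affect too many documents.')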

@wojcikstefan (Member, Author) commented Aug 15, 2016

@thomasst I looked into your suggestion and I don't like the outcome. I find it more confusing than the original implementation (i.e. the one from this PR). A few reasons why:

  1. It's confusing that first you call validate_request w/o an obj and then you call it again once per obj - this seems like a poor design, where a single method has many different roles.
  2. Behavior of validate_request would be inconsistent, because we'd call it before get_objects for bulk update, and after get_object for a single PUT.
  3. It might be a breaking change that requires us to inspect all existing resources and ensure validate_request's code differentiates properly between single PUT and bulk PUT requests.

Given all of the above, I propose we move forward with this implementation.

@thomasst (Member)

Okay, I agree that the purpose of the current version of validate_request should be to validate the input data for the given obj that is being modified (or None if we're creating an object). However, the naming of validate_request is confusing since it implies that it's called once per request, and not once per object, so maybe the naming should be improved. Note that we're already doing a "hack" in the task view so we can read out the update dict and figure out if certain fields were supplied with the request and are being modified.

Secondly, after thinking about it, I'm a bit hesitant about adding another method-specific hook (validate_bulk_update), because it seems very specific to bulk updates (e.g. why not also have validate_update / validate_create?) and can probably be implemented more generically. Also, in this PR you're implementing validation logic in get_objects and not even using the method to validate the bulk update.

My proposals are:

  • Either introduce a generic request validation method that is called once per request, has access to the parsed request data, and lets you implement per-request validation.
  • Or, don't introduce any new validation methods, and let subclasses implement validation by either overriding the view methods (def put), or by e.g. overriding get_objects() and checking for view_method == methods.BulkUpdate.
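
A hedged sketch of the second option; it assumes the current method is exposed as self.view_method and treats the result of get_objects as a sized collection, both of which are assumptions about the surrounding code:

from flask_mongorest import methods
from flask_mongorest.exceptions import ValidationError
from flask_mongorest.resources import Resource


class PostResource(Resource):
    def get_objects(self, **kwargs):
        objs = super(PostResource, self).get_objects(**kwargs)
        # Illustrative per-request check, applied only to bulk PUTs.
        if self.view_method == methods.BulkUpdate and len(objs) > 1000:
            raise ValidationError('Bulk update would affect too many documents.')
        return objs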

@wojcikstefan (Member, Author)

Yea, I agree adding that extra validation method only makes things more inconsistent. Given time constraints, I'll go with not introducing any new methods.

@philfreo (Member) commented Aug 15, 2016

@thomasst (Member)

After giving this another look I'm seeing another important issue: get_objects no longer returns a queryset, and the bulk limit validation doesn't take subclasses into account.

One of the nice things about mongorest since its beginning has been that you could override get_objects on the resource and modify the parent queryset, like this:

class MyResource(Resource):
    def get_objects(self):
        return super(MyResource, self).get_objects().filter(status='X')

This now breaks on multiple levels: 1) we can't call .filter() anymore since the queryset is pre-evaluated, 2) even if we rewrite the method to not rely on a queryset, the bulk validation doesn't take the new filter into account.

@thomasst (Member)

I haven't fully thought it through yet, but would keeping get_objects as a queryset and implementing the length check in process_objects work?
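
A hedged sketch of that idea; process_objects is only the hook proposed here, and its name, signature, and the bulk_update_limit attribute are assumptions:

from flask_mongorest.exceptions import ValidationError
from flask_mongorest.resources import Resource


class PostResource(Resource):
    bulk_update_limit = 1000  # assumed attribute from this PR

    def process_objects(self, qs):
        # The queryset stays lazy here, so filters added by subclasses in get_objects are respected.
        if qs.count() > self.bulk_update_limit:
            raise ValidationError('Bulk update would affect too many documents.')
        return qs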

@wojcikstefan (Member, Author)

I don't think that premise holds. Only get_queryset has that guarantee; get_objects afaik returned a list of objects in most cases.

@wojcikstefan merged commit cf5535b into master on Aug 15, 2016