Race conditions for parallel requests due to cache #371
Comments
I think this is caused by the fact that
Thanks for the report and the test case. Indeed your test passes when removing the caching mechanism.
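For readers without the original test case, here is a minimal sketch of the failure mode, assuming a parser that memoizes the decoded body on a shared instance. `SharedCacheParser` and its methods are hypothetical stand-ins, not webargs' actual code, and the two "parallel" requests are replayed step by step so the interleaving is deterministic.

```python
import json


class SharedCacheParser:
    """Toy parser that memoizes the decoded body on the instance (hypothetical)."""

    def __init__(self):
        self._cache = {}  # shared by every request that uses this instance

    def parse_field(self, raw_body, name):
        # Decode the body once, then reuse it for every later field lookup.
        if "json" not in self._cache:
            self._cache["json"] = json.loads(raw_body)
        return self._cache["json"].get(name)

    def clear_cache(self):
        self._cache = {}


parser = SharedCacheParser()  # one module-level instance serving all requests

body_a = '{"user": "alice", "password": "a-secret"}'
body_b = '{"user": "bob", "password": "b-secret"}'

# Request A parses its first field and fills the cache.
a_user = parser.parse_field(body_a, "user")
# Request B starts before A has finished and cleared the cache,
# so B is served A's decoded body instead of its own.
b_user = parser.parse_field(body_b, "user")
b_password = parser.parse_field(body_b, "password")
parser.clear_cache()

print(a_user, b_user, b_password)  # alice alice a-secret
```

Request B never sees its own data, which matches the observation that the test passes once caching is removed.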
Thanks for reporting @ThiefMaster. I won't have time to work on this for the next few days, but I'd gladly review/merge/release a patch. @ThiefMaster @lafrech Are either of you up for this?
I'm looking into fixing it right now.
My approach would be passing around the cache explicitly (and creating the cache dict in
How much of the parsing-related functions are considered part of the public API? Adding a new
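A rough sketch of the explicit-cache idea, under the assumption that the dict is created in the top-level parse call (the comment above does not survive far enough to say where). The function names and signatures are illustrative only, not the ones used in webargs.

```python
import json


def parse_field(raw_body, name, cache):
    """Look up one field, decoding the body at most once per `cache` dict."""
    if "json" not in cache:
        cache["json"] = json.loads(raw_body)
    return cache["json"].get(name)


def parse(raw_body, field_names):
    # The cache lives only for the duration of this call (assumed placement),
    # so two concurrent calls can never observe each other's data.
    cache = {}
    return {name: parse_field(raw_body, name, cache) for name in field_names}


result_a = parse('{"name": "alice", "age": 30}', ["name", "age"])
result_b = parse('{"name": "bob", "age": 41}', ["name", "age"])
```

The cost is an extra parameter on every helper that touches the cache, which is why this changes the signatures of any parsing functions that are part of the public API.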
I added a possible fix in #372
Yes. Here's a quick draft, FlaskParser only: https://github.com/marshmallow-code/webargs/commits/fix_cache_race_condition. You just beat me to it, with a patch covering all parsers.
I'm afraid the change will be a breaking one. We could also release a fix on the 5.x branch that would only disable the cache.
The problem I see there is that this will re-parse the json body for every single field in the schema. However, I think we could fix it for flask by storing it as an attribute on the flask req context instead of using the cache...
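The per-request storage idea could look roughly like this. `FakeRequest` is a stand-in for a framework request object (it is not Flask's real class), and `_parsed_json` is a hypothetical attribute name; the point is only that the memoized body is attached to the request rather than to the parser.

```python
import json


class FakeRequest:
    """Stand-in for a framework request object (assumption, not Flask's API)."""

    def __init__(self, raw_body):
        self.raw_body = raw_body
        self.decode_count = 0  # instrumentation for this example only


def get_json(req):
    # Memoize on the request itself: the decoded body cannot outlive the
    # request or be handed to a different one.
    if not hasattr(req, "_parsed_json"):
        req.decode_count += 1
        req._parsed_json = json.loads(req.raw_body)
    return req._parsed_json


req_a = FakeRequest('{"name": "alice", "age": 30}')
req_b = FakeRequest('{"name": "bob", "age": 41}')

# Interleaved field lookups, as two concurrent requests might perform them.
a_name = get_json(req_a).get("name")
b_name = get_json(req_b).get("name")
a_age = get_json(req_a).get("age")
b_age = get_json(req_b).get("age")
```

Each body is still decoded only once per request, so the benefit of caching is kept without sharing state between requests.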
Thinking about this again, why don't we create a new Parser instance whenever we parse something? That way
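One way to read the "new instance per parse" suggestion is to keep the user-facing object as configuration only and do the actual work on a throwaway copy. The sketch below assumes a toy `Parser` class; its attributes and methods are made up for illustration and do not mirror the real webargs implementation.

```python
import copy
import json


class Parser:
    """Toy parser: shared configuration plus per-parse state (hypothetical)."""

    def __init__(self, location="json"):
        self.location = location  # configuration, safe to share
        self._cache = {}          # per-parse state, not safe to share

    def _parse_field(self, raw_body, name):
        if "json" not in self._cache:
            self._cache["json"] = json.loads(raw_body)
        return self._cache["json"].get(name)

    def parse(self, raw_body, field_names):
        # Work on a shallow copy with its own empty cache, so concurrent
        # calls on the same Parser object never share decoded data.
        worker = copy.copy(self)
        worker._cache = {}
        return {n: worker._parse_field(raw_body, n) for n in field_names}


parser = Parser()
result_a = parser.parse('{"name": "alice"}', ["name"])
result_b = parser.parse('{"name": "bob"}', ["name"])
```

Callers keep using a single long-lived `parser` object, and the original instance's cache is never populated.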
Sure. I can't comment about the performance impact. I suppose it would be more important in Tornado, as the cache was introduced for Tornado in the first place. Anyway, that would be a fallback solution, until user code is migrated to webargs 6.0.
I don't think this can be done in a non-breaking fashion either, but it could be worth investigating. I suppose you mean make
Users instantiating a Parser themselves (e.g. https://github.com/Nobatek/flask-rest-api/blob/master/flask_rest_api/arguments.py#L14) should modify their code.
Check #373 - this should be backwards-compatible, without losing the benefit of caching.
Nice. We could use this as a 5.x hotfix. @sloria would you like to review and release? I think it is good to go. I'd support issuing a breaking change / less twisted implementation 6.0 version. I see two options:
The latter sounds more appealing. Besides, it only breaks the code for users instantiating the parser themselves, but not for those calling
For 6.0 the second option indeed sounds better - passing around the cache all the time is pretty ugly.
Shall we file a CVE for this? Everything that allows validation to be circumvented is a potential security issue. The fact that it is cross-requests makes it even worse. Imagine a resource allowing a user to set personal info, including a password. If a user repeatedly submits information that is intentionally long to deserialize, which could be achieved if the schema contains a self-nested structure for instance, or accepts long arrays, it is likely that everyone using that same resource at the same time will end up with the password set by the long request.
The patch is released in 5.1.3. I've also requested a CVE ID and will report on Tidelift once that's done. We can discuss refactoring the solution in #374. Thank you @ThiefMaster for the quick response on this.
Because the cache is no longer used field-by-field to fetch data, there's significantly less value in keeping it. Combined with the fact that each parser instantiation was already clearing the cache to avoid a security bug ( marshmallow-code#371 ), the cache is no longer actually used at all in most (any?) contexts. Remove the cache and all of the machinery associated with it (Parser._clear_cache, Parser._clone, and relevant checks). Resolves marshmallow-code#374
I just noticed that something in webargs or marshmallow isn't thread-safe. Take this minimal example:
Run it with threading enabled:
Now send two requests in parallel, with different values:
The output from these two requests is:
Clearly not what one would have expected! 💣
The output of the print statement showing the request data and what the field receives confirms the issue:
Tested with the latest marshmallow/webargs from PyPI, and also the marshmallow3 rc (marshmallow==3.0.0rc4, webargs==5.1.2).