Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[aiohttp] Multipart handling #987

Closed
wants to merge 11 commits into from

Conversation

kornicameister
Copy link
Contributor

@kornicameister kornicameister commented Jul 5, 2019

Changes proposed in this pull request::

  • different handling for multipart and non-multipart requests
  • tests confirming the behavior
  • minor list to typle changes for collection that ought to be immutable

Fixes #975
Supersedes #976

Potentially requires:

@kornicameister kornicameister mentioned this pull request Jul 5, 2019
@kornicameister kornicameister changed the title [aiohttp] Multipart handling [WIP] [aiohttp] Multipart handling Jul 5, 2019
@kornicameister
Copy link
Contributor Author

I stumbled into one issue.
There's a slight problem when it comes to handling mixed multipart the one that's 3rd example (/mixed) endpoint in newly added openapi spec.

Will try to dig into that. I might need to compare flask implementation to make it reliable the same.

Copy link
Contributor

@prawn-cake prawn-cake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job. Had some comments, the one starting with (nit) address some potential issues rather than request for changes.

# body has been read or not
# body_exists refers to underlying stream of data
'has_body': req.body_exists,
'can_ready_body': req.can_read_body,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo here -> can_read_body

logger.debug('Reading multipart data from request')
data = yield from req.post()

files = {k: v for k, v in data.items() if isinstance(v, web.FileField)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Having one loop will save some milliseconds for some high-load connexion service

if isinstance(v, web.FileField) and k not in files:
files[k] = v
elif isinstance(v, web.FileField) and k in files:
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any concerns of using collections.defaultdict instead of handling an exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I have doubts if that is correct. It works for many files but if I mix normal fields and files everything goes wrong.

Also form is processed in weird way, I will post details łatwe for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting thing. I noticed that if you try to submit multiple fields for Flask it will ignore them and take just the first value.
I might want to create an issue on its own for that :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: #992

@kornicameister
Copy link
Contributor Author

Ok, along the way I noticed that file params in aiohttp at least are not propely taken care of if pythonParams is in place. Eh..I will create an issue later.
In the meantime, we need #954 to be merged (right now I modified that part of code myself).

@kornicameister
Copy link
Contributor Author

Ok, I think that is good to go for some initial review. My tests pass so I believe that if we can ask for some input from @dtkav or @hjacobs. Next step, I want to do is to enable/check uploading files directly (without multipart) but for sure, not in this PR.

@kornicameister kornicameister changed the title [WIP] [aiohttp] Multipart handling [aiohttp] Multipart handling Jul 8, 2019
@@ -216,8 +216,9 @@ def redirect(request):
@aiohttp_jinja2.template('index.j2')
@asyncio.coroutine
def _get_swagger_ui_home(self, req):
return {'openapi_spec_url': (self.base_path +
self.options.openapi_spec_path)}
return {'openapi_spec_url': (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated but flake8 was unhappy about it.

@kornicameister
Copy link
Contributor Author

rebased that on top the master to include fixes for pytest incompatibility

@kornicameister
Copy link
Contributor Author

@hjacobs sorry for being so annoying 😺 but we kinda need this. I think I 2 different PRs that my PR might require to work properly (they are enlisted in top comment).

Would you mind checking them out first. I actually left some reviews there but 2nd pair of eye would be most welcomed nad helpful.

@kornicameister
Copy link
Contributor Author

@prawn-cake do you have any idea if form items (those that are not files) shoudl reach handler exploded or wrapped in body?

@prawn-cake
Copy link
Contributor

@kornicameister not sure if I got what you mean.

@kornicameister
Copy link
Contributor Author

I meant the case where you have multipart/request and how it suppose to be passed to handler?
Each part of multipart should be sent as a separate argument apart from files. All parts apart from files should be sent to handler as form argument same as application/json goes in as body.

@prawn-cake
Copy link
Contributor

Okay, got it. Good question, I have no answer.
What would be the pros and cons of each in your view?

In general, it's a bit hard to read code-specific comments without code context. Took me awhile to wrap my head around

@kornicameister
Copy link
Contributor Author

Yeah, I know...I suppose the pros can be that if you're dealing with either json request or form request you expect consistent behavior in a handler no matter what is sent. It is always called a body, but then again I have little to no idea about the meaning of x-body-name. Ughr...didn't want to say that, but honestly...there's a mess enough to make it hard to make changes reliably.

@prawn-cake
Copy link
Contributor

About x-body-name

In the OpenAPI 3.x.x spec, the requestBody does not have a name.
By default it will be passed in as 'body'. 
You can optionally provide the x-body-name parameter in your requestBody schema to override the name of the parameter that will be passed to your handler function.

And going back to the original question

do you have any idea if form items (those that are not files) shoudl reach handler exploded or wrapped in body?

If exploded option is set in the openapi file, then I'd expect handler gets it already unpacked (exploded), I'd minimize handler's work to do extra parsing etc and other logic which doesn't seem to belong to it.

@aneuway2
Copy link

aneuway2 commented Sep 26, 2019

[Swagger 3, Python 3, Flask]

Just had a similar issue between application/json and multipart/form inputs

Oddly i was able to work around this by putting the application/json input before the multipart/form in the swagger json.

Connexion returned the application/json endpoint as a byte though which was unexpected but I could easily work around this by converting the dict in the {"body: {} back to the appropriate type using .decode('utf8') and a combination of json.loads

@ddurham2
Copy link
Contributor

ddurham2 commented Nov 17, 2019

I did not find that this fixed #975. My openapi spec has the request contain two items in a multipart/form-data: an application/json object and a file upload. It still hits the same stack traces as #975.

@kornicameister
Copy link
Contributor Author

hey @ddurham2 .
Sorry, but most likely I won't be able to work on this PR anymore. Anyone is welcome to pick it up and start it over.

@hjacobs if you want, feel free to close it. I am keeping it around for anyone to just have a branch to start from if they want.

@ddurham2
Copy link
Contributor

ddurham2 commented Nov 17, 2019

@kornicameister, too bad. I may continue looking at it. Did you find a suitable alternative? Did you abandon connexion or use something other than its aiohttp driver? Thanks

@kornicameister
Copy link
Contributor Author

@ddurham2 I try to play around axion (my own invention) where I try to combine typings and OpenAPI. It is purely academical and for fun and I do not expect that it will replace connexion, at least not yet ;-). The problem in connexion, is that it is really hard to wrap head around current implementation ;( which is yet another reason for me to not use it now at least with aiohttp.

@ddurham2
Copy link
Contributor

ddurham2 commented Nov 19, 2019

@prawn-cake: I'd like to see connexion handle things with aiohttp as well as it does with the flask api. However, in some dark corners, there are some things in abstract.py that seem to assume flask's datatypes (e.g. operations.abstract.py:_get_file_arguments() assumes a werkzeug.datatypes.MultiDict in that it calls getlist, but aiohttp passed along a normal dict). Hence, is the goal with aiohttp to make aiohttp_api.py pass along flask types (making abstract.py happy) or to alter abstract.py to handle either flask or aiohttp types? (thereby making it rather not abstract)

I'm inclined to convert things to flask types in aiohttp_api.py:get_request() when constructing ConnectionRequest().. but then these things get passed to handlers.. which, when writing for aiohttp connexion seems to ask the handler writer to expect aiohttp types and to return them as well.

I need a direction. Thanks

@ddurham2
Copy link
Contributor

ddurham2 commented Nov 19, 2019

Well, I discovered that this "dark corner" was committed to master just a few days ago (#1000). But my general question still stands.

@kornicameister
Copy link
Contributor Author

@ddurham2 that's exactly my point. There's no solid contract that Flask/aiohttp or any other thing could rely on.

@wilcoschoneveld
Copy link

What is the status on this PR?

@@ -273,7 +273,7 @@ def _get_body_argument(self, body, arguments, has_kwargs, sanitize):

if x_body_name in arguments or has_kwargs:
Copy link
Contributor

@ddurham2 ddurham2 Jan 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kornicameister, you and I are both trying to get aiohttp to work well with connexion. I can't wait until this PR is accepted.

Possibly another thing to add in this area of change:
Can you at all fathom why the 'or has_kwargs' is here??? I'm trying to create decorators for my handlers. The decorators generically pass all args along as **kwargs. However, when **kwargs is present in the handler, then this code kicks in and rather than returning parameters from the body broken out, it returns the body itself as a single arg. This makes much more difficult to create decorators.

There's a similar condition (but not problematic for me) also on line 263 above.

The original change doesn't describe why. It's just "connexion 2.0" in the blame.

I would propose removing the 'or has_kwargs' so that parameters are the same whether you break them out as named arguments in the handler or accept any named argument in the handler. Thoughts?

For now I've worked around it by expanding 'body' into the kwargs at the beginning of the decorator's wrapper, but I don't know if that handles all cases, and it just seems inconsistent of connexion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddurham2 I have tried to untagle the code in different parts of connexion without too much luck. I am afraid I can't help you. Due to various issues, like this one, we're forced to move away from connexion thus it is no longer my top priority to invest my time here.

Although, I am tracking what's going on, and my original opinion on state of affairs in connexion had not changed. Without strong API contract enforced & required by connexion it will be hard to make connextion, Flask and aiohttp implementation compliant.
And, since that contract is not in place, it is close to impossible or, at the very least, really hard to ensure proper implementation.
One example of such situation is handling this multipart, I believe I've already mentioned that, where files from request are assumed to be dist-ish structure and as long as dict-ish structure is returned from Flask and aiohttp there little thing that makes a difference and that's implementation of that dict-ish being. TBH & AFAIR connexion requires that being to just be iterable or have get method which is rather poor expectation (all hail strong typed languages where weird stuff like that has lower chance to occur :/ ).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were up to me, I'd have ditched that PR and focus on work to prepare strong internal API contract toward making aiohttp and Flask plugins.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you find an alternative to connexion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are alternatives but concept is a mirror reflection, essentially it is code that comes first.
I am spending my time playing around with strongly typed self-creation :) now, it is available in my repositories ( axion ). It's no place to talk about it, but if you decide to check it out, let me know. It's mess from place to place but maybe sometime later I will clean it all up.

ddurham2 added a commit to ddurham2/connexion that referenced this pull request Apr 22, 2020
ddurham2 added a commit to ddurham2/connexion that referenced this pull request Apr 22, 2020
…uplicate keys as the rest of connexion expects

    - Based parly on existing PR: spec-first#987
@kornicameister
Copy link
Contributor Author

I am closing this PR. @ddurham2 took it over in #1222

@kornicameister kornicameister deleted the fix_975 branch May 22, 2020 18:04
RobbeSneyders pushed a commit that referenced this pull request Feb 18, 2022
* Added unit tests to demonstrate the problems of #975
    - Taken mostly from existing PR: #987

* now splitting out multipart POSTs into files[] and form[], handling duplicate keys as the rest of connexion expects
    - Based parly on existing PR: #987

* rewrote how operations/openapi.py::_get_body_argument() works to better build the arguments[] list according to what the spec says and what the handler accepts.  This fixes a bug when requests contain mixed files and form values and the handler is expecting variable names matching the request property names.

* Adding unit tests to improve code converage test

* post merge fixes - using 'async' keyword now in new unit test file

* unit test improvements -- now testing the contents of the files we upload too

* making some code a bit clearer regarding duplicate names of file submissions

* fixing up unit tests since merging main

* fixing isort-check-tests and flake8

* clarified a comment

* comment correction

* after discussions with maintainer, reverted _get_body_argument back to the original where it does not attempt to break out the body into individual arguments for the handler.  But left in changes that make the normal behavior of not passing a body argument to a handler without one more consistent when the body itself is empty or not an object type.

* fixing unit tests after after reverting _get_body_argument behavior
vbxx3 pushed a commit to Savannah-Group/connexion that referenced this pull request Apr 9, 2022
* Added unit tests to demonstrate the problems of spec-first#975
    - Taken mostly from existing PR: spec-first#987

* now splitting out multipart POSTs into files[] and form[], handling duplicate keys as the rest of connexion expects
    - Based parly on existing PR: spec-first#987

* rewrote how operations/openapi.py::_get_body_argument() works to better build the arguments[] list according to what the spec says and what the handler accepts.  This fixes a bug when requests contain mixed files and form values and the handler is expecting variable names matching the request property names.

* Adding unit tests to improve code converage test

* post merge fixes - using 'async' keyword now in new unit test file

* unit test improvements -- now testing the contents of the files we upload too

* making some code a bit clearer regarding duplicate names of file submissions

* fixing up unit tests since merging main

* fixing isort-check-tests and flake8

* clarified a comment

* comment correction

* after discussions with maintainer, reverted _get_body_argument back to the original where it does not attempt to break out the body into individual arguments for the handler.  But left in changes that make the normal behavior of not passing a body argument to a handler without one more consistent when the body itself is empty or not an object type.

* fixing unit tests after after reverting _get_body_argument behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

File uploading with aiohttp (status 500)
5 participants