Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for aiohttp and multipart/form-data uploads #1222

Merged
merged 16 commits into from
Feb 18, 2022

Conversation

ddurham2
Copy link
Contributor

@ddurham2 ddurham2 commented Apr 22, 2020

I'm trying to clean some things up around aiohttp and file uploads, and close a few tickets in the process. There are a couple of abandoned PRs (mentioned below) which I'm trying to revitalize efforts around.

Changes proposed by this fix:

  • in aiohttp request handling, splitting out multipart POSTs into files[] and form[], handling duplicate keys as the rest of connexion expects
  • rewrote how operations/openapi.py::_get_body_argument() works to better build the arguments[] list according to what the spec says and what the handler accepts. This fixes a bug when requests contain mixed files and form values and the handler is expecting variable names matching the request property names
  • introduce unit tests to demonstrate the problem

Replaces PRs:

Fixes:

Should Fix: (I consulted the original authors, but no response)

Relates To:

setup.py Outdated Show resolved Hide resolved
…uplicate keys as the rest of connexion expects

    - Based parly on existing PR: spec-first#987
…er build the arguments[] list according to what the spec says and what the handler accepts. This fixes a bug when requests contain mixed files and form values and the handler is expecting variable names matching the request property names.
@coveralls
Copy link

coveralls commented Apr 22, 2020

Coverage Status

Coverage increased (+0.04%) to 96.33% when pulling 13488a5 on ddurham2:fix_975-renewed into 1b78f03 on zalando:master.

@ddurham2 ddurham2 marked this pull request as ready for review April 23, 2020 07:32
@dfeinzeig
Copy link
Contributor

@ddurham2 did you get a chance to confirm the "may fixes" as confirmed fixes? i desperately need the fix for 992 but also need 1193. thanks!

@dfeinzeig
Copy link
Contributor

oh, these fixes are specific to aiohttp and don't seem to work with flask?

@ddurham2
Copy link
Contributor Author

Some of the changes could actually influence Flask behavior since they change shared code.

@ddurham2 ddurham2 force-pushed the fix_975-renewed branch 2 times, most recently from 28d6074 to c277980 Compare May 20, 2020 22:46
@ddurham2
Copy link
Contributor Author

@hjacobs any thoughts on merging this? Anything I need to do to prepare it further? Thanks!

# pass the body as a whole as an argument named, 'body', along with the
# individual body properties. STOP.
#
# #3 Else, we pass the body's individual properties which exist in [arguments].
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this piece makes sense to add. IIRC this was the behavior for swagger2, but I think it's an anti-pattern. Passing in a body object avoids some issues like parameter name conflicts, and is much less "magical".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd sure break a lot of code to remove #3. It's the whole "Automatic Parameter Handling" that's mentioned in the examples in README.rst. This PR is only preserving that behavior, not changing it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @dtkav here. This isn't the current behavior for OpenAPI 3 and I don't think we should support it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RobbeSneyders, thanks for taking the time to look at this.

I just wanted to get clarification on this. So you're saying that if there's a body parameter named 'x' declared in the spec and there's an argument named 'x' in the handler, then it should not try to match that body parameter to the handler argument? Isn't that inconsistent with how query and path parameters work? Is there a reason they should differ in behavior? I'm guessing what @dtkav was getting at was that body params may conflict with the others? But couldn't they conflict with each other just the same? Is there a conflict resolution strategy for path and query?

If the fear is breaking existing apps, that's understandable. My thought was that dealing with body directly is breaking the abstraction that connexion is trying to put over those lower level details.

Again, I'm willing to change it.. just trying to understanding the reasoning for future reference.

Thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're saying that if there's a body parameter named 'x' declared in the spec and there's an argument named 'x' in the handler, then it should not try to match that body parameter to the handler argument?

Indeed, we should only pass the body as a whole. I don't think this is inconsistent with the query and path parameters. Query and path parameters are top level, just as the body. The content of the body is one level below. We only pass the top level.

Next to this, passing the body is a lot less magical as @dtkav mentioned. The user knows what to expect, which isn't the case with the current implementation in this PR, shown by the 5 paragraph comment needed to explain the behavior 🙂

Path and query parameters can indeed also conflict with each other. Currently the query parameters overwrite the path parameters. Probably not ideal, however conflicting parameters should ideally be avoided by the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I reverted back to the original behavior for _get_body_argument() which no longer breaks body parameters into handler arguments.

However, in the process I learned, or.. remembered (this original change being so long ago now).. that the reason I though body should be split out is because uploaded files in the body are broken out as individual handler arguments
which is an existing behavior apparently.

My main change here was to make the aiohttp handler consistent with flask. I did that by constructing the ConnexionRequest() object in aiohttp_api.py the same as flask does regarding files.

So, I'm just pointing out that some future cleanup might want to make the file elements stay inside of body, where I'd think you'd say they belong due to the reasoning that they're not top-level parameters. But I think it doesn't work that way because this is basically flask's existing behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're ready to merge now. Thanks!

@RichardBruskiewich
Copy link

RichardBruskiewich commented Mar 9, 2021

@dtkav @kornicameister (cc: @ddurham2)

I don't know if this is the best context to raise this, but it isn't just the multipart/form-data which is broken here, but also, the application/x-www-form-urlencoded processing.

A similar runtime error to the multipart upload occurs in the current validation.py which has the following code:

                data = dict(request.form.items()) or (request.body if len(request.body) > 0 else {})
                data.update(dict.fromkeys(request.files, ''))  # validator expects string..

The problem arises in that request.body from a application/x-www-form-urlencoded POST is a binary (byte) string of urlencoded parameters. Needless to say, a byte string doesn't have an update method, so data.update throws an exception.

What I notice is the following in PR #1222 is that that form argument is added to the ConnexionRequest object:

       # within async def get_request(cls, req):
        return ConnexionRequest(url=url,
                                headers=headers,
                                body=body,
                                json_getter=lambda: cls.jsonifier.loads(body),
                                files={},
                                form=form,
                                files=files,
                                context=req)

That's a good step forward. However, where is it initialized with the contents of the body taken as urlencode parameters?

I'm not quite sure where to intervene in the code, but this bug is currently a show stopper for my use of AIOHTTP in my project (which is in a somewhat urgent need-to-get-the-prototype-running state for an NIH funded biomedical information system project).

Cheers
Dr. Bruskiewich

@RichardBruskiewich
Copy link

@ddurham2, thank you for your diligent work on this PR. I actually forked 'connexion' then merged your PR code, with perhaps just a few local tweaks (not sure if some of them are still necessary), and the PR essentially fixed my problems with the 'multipart/form-data' uploads, so that I was finally able to get an aiohttp version of my application working.

I hope that the Connexion team will soon consider merging this PR or an equivalent, to make things work for everyone.

@ddurham2
Copy link
Contributor Author

ddurham2 commented Mar 12, 2021

@RichardBruskiewich, Thanks for your kind words. Glad it worked for you too. As for merging, sadly, I'm increasingly fearful that this project is abandoned by the authors. I'm hoping a new official fork can start or at least that the maintainers will happily give maintenance perms to someone else.

@ddurham2
Copy link
Contributor Author

humble bump

@RobbeSneyders RobbeSneyders changed the base branch from master to main February 11, 2022 07:53
@coveralls
Copy link

coveralls commented Feb 11, 2022

Pull Request Test Coverage Report for Build 1851302698

  • 22 of 22 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 97.079%

Totals Coverage Status
Change from base Build 1819970533: 0.02%
Covered Lines: 2858
Relevant Lines: 2944

💛 - Coveralls

@ddurham2
Copy link
Contributor Author

@RobbeSneyders - Hi Robbe.. I am SO pleased to see connexion getting the needed attention and development it needs. I have a project that heavily depends on it, and it has the most convenient design for implementing an openapi service based on python, and best of all it implements an aiohttp backend which is important for me.

That said, I would request that you take a look at this pull request and consider merging it. It fixes a few linked issues. And a few other people complaining about the same problem. The main problem solved is that connexion cannot handle a form submission that has files and other parameters at the same time.

I've had to maintain a custom fork with these changes since finding its problem and creating the fix, and really would like to get back on the main release.

Thank You

Copy link
Member

@RobbeSneyders RobbeSneyders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @ddurham2!

If you can drop the automatic body parsing, we should be able to merge.

connexion/apis/aiohttp_api.py Outdated Show resolved Hide resolved
# pass the body as a whole as an argument named, 'body', along with the
# individual body properties. STOP.
#
# #3 Else, we pass the body's individual properties which exist in [arguments].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @dtkav here. This isn't the current behavior for OpenAPI 3 and I don't think we should support it.

connexion/apis/aiohttp_api.py Show resolved Hide resolved
…o the original where it does not attempt to break out the body into individual arguments for the handler. But left in changes that make the normal behavior of not passing a body argument to a handler without one more consistent when the body itself is empty or not an object type.
@@ -6,11 +6,11 @@ info:
title: "{{title}}"
version: "1.0"
paths:
"/mixed_single_file":
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I just moved the bottom two endpoints to the top because that's the same order they are in in the test handler and test requester files

Copy link
Member

@RobbeSneyders RobbeSneyders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ddurham2, LGTM :)

@RobbeSneyders RobbeSneyders merged commit eb97cf9 into spec-first:main Feb 18, 2022
vbxx3 pushed a commit to Savannah-Group/connexion that referenced this pull request Apr 9, 2022
* Added unit tests to demonstrate the problems of spec-first#975
    - Taken mostly from existing PR: spec-first#987

* now splitting out multipart POSTs into files[] and form[], handling duplicate keys as the rest of connexion expects
    - Based parly on existing PR: spec-first#987

* rewrote how operations/openapi.py::_get_body_argument() works to better build the arguments[] list according to what the spec says and what the handler accepts.  This fixes a bug when requests contain mixed files and form values and the handler is expecting variable names matching the request property names.

* Adding unit tests to improve code converage test

* post merge fixes - using 'async' keyword now in new unit test file

* unit test improvements -- now testing the contents of the files we upload too

* making some code a bit clearer regarding duplicate names of file submissions

* fixing up unit tests since merging main

* fixing isort-check-tests and flake8

* clarified a comment

* comment correction

* after discussions with maintainer, reverted _get_body_argument back to the original where it does not attempt to break out the body into individual arguments for the handler.  But left in changes that make the normal behavior of not passing a body argument to a handler without one more consistent when the body itself is empty or not an object type.

* fixing unit tests after after reverting _get_body_argument behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants