Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Requests containing files cannot be replayed. #64

Closed
bboe opened this issue Apr 30, 2015 · 12 comments
Closed

Requests containing files cannot be replayed. #64

bboe opened this issue Apr 30, 2015 · 12 comments
Milestone

Comments

@bboe
Copy link
Contributor

bboe commented Apr 30, 2015

When specifying files via files=[...] requests results in choosing a different multipart_boundary each time, thus making it quite difficult to match. There also appear to be some issues with cassettes and binary data for body matching.

Perhaps when using the betamax adapter, the multipart_boundary can be fixed to a constant value rather than a uuid that requests uses.

@sigmavirus24
Copy link
Collaborator

So there's going to be a severe layers violation here if we try to do that. The preparation of that multipart body happens well before the betamax adapter sees it. That stuff happens before we even get the data.

There's no way we as a library can do this and I'm not certain what the best fix is. As I see it, we have two options:

  1. Users could use the MultipartEncoder which allows users to stream large uploads & to select their own boundary.
  2. In the betamax-matchers package, we could use the undocumented MultipartDecoder to parse the body and do matching that way.

Personally I'm leaning towards 2. Thoughts @bboe?

@sigmavirus24
Copy link
Collaborator

Note: Currently the MultipartDecoder is designed to work with responses, but it can just as easily work with request objects.

@bboe
Copy link
Contributor Author

bboe commented May 4, 2015

I suppose #2 is the way to go if we cannot change the multipart boundary. My other option would be to get requests to accept specifying the multipart boundary (urlib3 does, but not requests) so that I can set it static (there's not reason to change the boundary per request).

I started down this former path (not actual decoding though) and ran into another set of encoding errors between versions of python as the request in memory is in binary format.

@sigmavirus24
Copy link
Collaborator

My other option would be to get requests to accept specifying the multipart boundary (urlib3 does, but not requests) so that I can set it static (there's not reason to change the boundary per request).

So this isn't something people frequently ask for. The "blessed" way of doing that is with the MultipartEncoder from the toolbelt. I think @Lukasa would agree that there isn't a good API that we can think of that would provide that functionality in requests.

I started down this former path (not actual decoding though) and ran into another set of encoding errors between versions of python as the request in memory is in binary format.

You mean the request stored by Betamax or something else? I suspect that the MultipartDecoder will be able to handle this. I haven't used it extensively, but it appeared to be very robust and well built.

Since this seems amenable, I'll prioritize this. I'm running the PSF's elections next week so I may not be able to attend to this until after I kick those off. By the @bboe, if you're not already a contributing member of the PSF, you should become one. It's free!

@sigmavirus24 sigmavirus24 modified the milestone: 0.5.0 May 27, 2015
sigmavirus24 added a commit to betamaxpy/betamax_matchers that referenced this issue Jun 16, 2015
@sigmavirus24
Copy link
Collaborator

@bboe I started working on this in betamaxpy/betamax_matchers@6155015

@bboe
Copy link
Contributor Author

bboe commented Jun 16, 2015

Awesome! Thanks @sigmavirus24

@sigmavirus24
Copy link
Collaborator

So I have a question for you @bboe:

If you take a look at betamaxpy/betamax_matchers@2199dea, you'll see that a first implementation of this makes it fairly... naïve. Can you point me to how you're using files= in requests? I think I'm going to make it sort the parts anyway, but that will have to happen by each part's headers which should be the same for parts in semantically equivalent multipart/form-data requests.

I haven't tested any of this, but if you can provide early feedback that'd be awesome.

@bboe
Copy link
Contributor Author

bboe commented Jun 16, 2015

It's a little complicated but here goes.

The following are the tests that I would like to run. They are commented out currently because they are not replayable. Also the test methods should be decorated with betamax, rather than calling betamax_init inside the tests:
https://github.com/praw-dev/praw/blob/master/tests/test_images.py#L86

PRAW uses files by preparing a request associated with a dummy session:
https://github.com/praw-dev/praw/blob/f85fbba050d08bc1b0102979a9b3c91805878a79/praw/internal.py#L160

This prepared request can be used in one of two ways. Most commonly it is sent to a rate limit handler that ensures requests are only made within the API's time limits:
https://github.com/praw-dev/praw/blob/f85fbba050d08bc1b0102979a9b3c91805878a79/praw/handlers.py#L101

The second way is to permit distributed rate limit handling. In that case, the prepared request is serialized and transmitted to another process that handles the request:

I hope that helps some. I really appreciate you looking into it. I should have some time this weekend to be of any assistance.

@sigmavirus24
Copy link
Collaborator

So looking at the upload_image handler I think I should be able to solve your problem by sorting the parts. I'll take a much closer look tonight though to be safe.

@sigmavirus24
Copy link
Collaborator

Oh and speaking of a betamax decorator, we've had a similar request here: #61

@sigmavirus24
Copy link
Collaborator

To make your testing easier @bboe I have released betamax_matchers 0.2.0 (https://github.com/sigmavirus24/betamax_matchers#multipart-form-data-body-matcher). Please file bugs there if you find problems with the matcher.

🍺 Cheers!

@bboe
Copy link
Contributor Author

bboe commented Jun 18, 2015

Thank you! I will look at it this weekend. I appreciate your efforts 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants