Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow custom headers in multipart/form-data requests #1936

Merged
merged 15 commits into from
Jan 13, 2022

Conversation

adriangb
Copy link
Member

@adriangb adriangb commented Nov 12, 2021

Multipart requests allow HTTP headers to be set on individual form items.
This pull request allows multipart file uploads to specify those additional headers on a per-file basis.

Similar pull request against the Starlette project: encode/starlette#1311

Edit by @tomchristie: Updated description

if "Content-Type" in headers:
raise ValueError(
"Content-Type cannot be included in multipart headers"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we don't need to do this check.

It's odd behaviour for the developer to set the content_type and then override it with the actual value provided in the custom headers. But it's not broken.

My preference would be that we don't do the explicit check here. In the case of conflicts I'd probably have header values take precedence.

I'm not absolute on this one, but slight preference.

Copy link
Member Author

@adriangb adriangb Jan 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I thought it'd be a good idea to check what requests does here. It looks like it silently ignores the header in the header. That is:

requests.post("http://example.com", files=[("test", ("test_filename", b"data", "text/plain", {"Content-Type": "text/csv"}))])

Gets sent as text/plain.

Digging into why this is the case, it seems like it's just an implementation detail in urllib3. It happens here.

I'm not sure what the right thing to do here is, but if you feel like it's best to go with no error and making header values take precedence, I'm happy to implement that.

Another alternative would be to have the 3rd parameter be either a string representing the content type or a headers dict. We can't really make the 3rd parameter always be a headers dict because that would be a breaking change for httpx.
This would eliminate the edge case, but deviates from requests' API. It seems pretty reasonable that if I'm specifying headers I'm doing advanced stuff and so specifying the content type in the headers directly would not be an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the right thing to do here is, but if you feel like it's best to go with no error and making header values take precedence, I'm happy to implement that.

I reckon let's do that, yeah.

Another alternative would be to have the 3rd parameter be either a string representing the content type or a headers dict. We can't really make the 3rd parameter always be a headers dict because that would be a breaking change for httpx.

I actually quite like that yes, neat idea. The big-tuples API is... not helpful really. But let's probably just go with the path of least resistance here. Perhaps one day we'll want an httpx 2.0, where we gradually start deprecating the various big-tuples bits of API in favour of a neater style.

Copy link
Member Author

@adriangb adriangb Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reckon let's do that, yeah.

👍 donzo

Another alternative would be to have the 3rd parameter be either a string representing the content type or a headers dict. We can't really make the 3rd parameter always be a headers dict because that would be a breaking change for httpx.

I actually quite like that yes, neat idea. The big-tuples API is... not helpful really. But let's probably just go with the path of least resistance here. Perhaps one day we'll want an httpx 2.0, where we gradually start deprecating the various big-tuples bits of API in favour of a neater style.

Agreed! I added a comment in the code explaining the reasoning behind the big tuple API (inherited from requests) and how we might want to change it in the future.

filename, fileobj = value # type: ignore
else:
# corresponds to (filename, fileobj, content_type, headers)
headers = {k.title(): v for k, v in headers.items()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should .title() case here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... I see the comparison case. Huh. Fiddly.

httpx/_multipart.py Outdated Show resolved Hide resolved
if content_type is not None and "Content-Type" not in headers:
# note that unlike requests, we ignore the content_type
# provided in the 3rd tuple element if it is also included in the headers
# requests does the opposite
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay maybe we should instead do it the other way. If the 4-tuple is used, just ignore the content_type variable. That'd be okay enough, matches requests more closely, and we can forget about fiddly case-based header checking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requests does the opposite: it ignores the header in the 4th tuple element. so we'll still need the case-based header checking if we want to do exactly what requests does. either way, we need to know if the content type header exists in the 4th element tuple so we can either ignore the 3rd element or overwrite it with the 3rd element.

Copy link
Member

@tomchristie tomchristie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this pull request except that I would rather we don't force-change the casing on the headers. That introduces a hidden little bit of behaviour surprise that I'd rather avoid.

Obvs we do still want to do a case-insensitive comparison for the Content-Type case tho.

content_type = guess_content_type(filename)

if content_type is not None and "Content-Type" not in headers:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps...

has_content_type_header = any(["content-type" in key.lower() for key in headers])
if content_type is not None and not has_content_type_header:
    ...

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adapted it to any("content-type" in key.lower() for key in headers) (so it'll stop early).
Also removed the {header.title() ...} line.

@tomchristie tomchristie changed the title feat: allow specification of additional headers in multipart/form-data requests Allow custom headers in multipart/form-data requests Jan 13, 2022
@tomchristie
Copy link
Member

Lovely stuff. 👍

@tomchristie tomchristie merged commit 0f1ff50 into encode:master Jan 13, 2022
@adriangb adriangb deleted the multipart-advanced branch January 13, 2022 09:16
@tomchristie tomchristie mentioned this pull request Jan 26, 2022
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants