net/http: Allow obtaining original header capitalization #37834

Open
JohnRusk opened this issue Mar 13, 2020 · 13 comments

@JohnRusk JohnRusk commented Mar 13, 2020

What version of Go are you using (go version)?

1.13

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Windows, x64

What did you do?

Make an HTTP request, read its headers, send those headers somewhere else.

What did you expect to see?

Headers are preserved exactly

What did you see instead?

Header capitalization gets canonicalized

I realize this has been discussed before (about 7 years ago).

However, in the time since then many users have been adversely affected by this. E.g.
traefik/traefik#466
Azure/azure-storage-azcopy#113

So I'd like to find out: would the Go team consider, not a change to the current behaviour, but simply a way for an http.Response to provide an additional map from canonicalizedName to originalName? If we could get that map, those of us who really need header case preservation could use it to achieve what we need: find out the original capitalization when we read a response, then directly manipulate the outbound request map when we forward that data on.

I'd be happy to contribute the code for the above, if we expected it to be accepted.

Contributor

@toothrot toothrot commented Mar 13, 2020

Hey @JohnRusk! I understand that one of your concerns is with Azure/azure-storage-azcopy#113, in which the API is expecting capitalized headers, despite RFC2616 specifying that header fields are to be treated as case-insensitive.

Issue #5022 was resolved by preserving case on outbound requests:
f0396ca

I don't think it's reasonable to keep a mapping of every header field transformation around for every request. As @bradfitz mentioned, HTTP/2 explicitly lowercases all fields. Making changes here for broken servers (which I understand are out of your control) doesn't seem reasonable to me.

Is it possible to keep a map of these special header fields around in your application and re-write them on your outbound request? I assume there is a subset that are sensitive, and it is not all HTTP headers. Their casing is preserved on outbound requests as I understand #5022.

/cc @bradfitz @rsc

Author

@JohnRusk JohnRusk commented Mar 13, 2020

> Is it possible to keep a map of these special header fields around in your application and re-write them on your outbound request?

Not really, because the set of header fields that exists is customer-determined, and may vary from customer to customer and even from file to file.

> I don't think it's reasonable to keep a mapping of every header field transformation around for every request.

What about some kind of hook then: a function that gets called as the header names are pulled off the wire, before they are canonicalized? If the hook is nil, it doesn't get called.

e.g.(roughly)

headerName := ... <raw value off the wire>
if request.headerNameNotifier != nil {
   request.headerNameNotifier(headerName)
}
headerName = canonicalize(headerName)

Then customers like us could provide request.headerNameNotifier and everyone else could leave it nil.

Contributor

@bradfitz bradfitz commented Mar 14, 2020

There are a few dup bugs in this space with a bunch of conversation. I can't find them at the moment (GitHub search is failing me) but they're somewhere.

Author

@JohnRusk JohnRusk commented Mar 14, 2020

I had trouble searching too. I have managed to find a few though. Are any of these the ones you were thinking of?

  • #5022. I see that there you proposed a workaround for writing (which has been implemented, albeit in a different form, if I understand correctly). But there's still no workaround for reading.

  • #18495 I notice there you wrote "I don't want to complicate Go and encourage buggy libraries from assuming case". I think the issue many of us have is that we have to interact with non-Go code that already assumes case, and assumes that we will preserve case (which we can't, when we're coding in Go). To be clear, no other library is assuming that we will be case sensitive, but they are assuming we will be case preserving when we handle their data.

  • #18196

  • #22864

  • Related : #18476

  • Related, but not a dup: #13767

  • Related, I think: #29965

Author

@JohnRusk JohnRusk commented Mar 23, 2020

@ACECEO yes, it's a workaround in the sense that it works. No in the sense that it's a totally different product, with a different emphasis and style of usage. For customers who use the tool I work on, AzCopy, this Go issue remains a blocker.

Author

@JohnRusk JohnRusk commented Mar 26, 2020

@bradfitz Any update re thoughts on this? As noted above, the key issue is that we currently can't write case-preserving code that reads headers and forwards them on. In our tool, AzCopy, this is a big deal for certain customers who need to move their data around in Azure.


@zezha-msft zezha-msft commented Apr 14, 2020

Hi @ACECEO @toothrot, any additional thought?


@cy33hc cy33hc commented Jun 4, 2020

RFC2616

I've read RFC2616; it only says that header names are case-insensitive. Nothing says that you should modify the headers as you forward them.

Contributor

@bradfitz bradfitz commented Jun 4, 2020

I feel like I've replied to this a number of places, but I'll summarize here as well.

  • HTTP/2 makes this whole issue moot. So this bug only matters for HTTP/1<->HTTP/1 interactions, which becomes less relevant with each passing month.
  • Code that's sensitive to HTTP/1 header case is already broken; catering towards such code implicitly encourages such brokenness or at least allows it to continue, without providing pressure on those applications to get fixed.

As such, any fix here must have super minimal cost, both in terms of API surface (cognitive load cost, reading more godoc) and runtime cost (we shouldn't allocate or populate a new data structure with this info unless the calling code opted in to wanting that). Notably, we can't change the representation of http.Header at this point.

It probably needs to be opt-in at the http.Server level. Perhaps a new bool there. That means "middleware" can't assume the original case is present, and can't ask for it to be present retroactively, as we'd only parse/preserve it if they opted in. So perhaps that's too inconvenient.

Maybe we just add a new RawHTTP1Header []byte field to http.Request, documenting that it's only for HTTP/1 and it's the exact bytes from the Request-Line to the final closing newlines. Then let code outside of the standard library deal with it. It'd also need to be documented that the slice is not a copy and it aliases some internal buffer and is only valid until the next request on the connection. That kinda sucks as an API requirement, but it's not too dissimilar to e.g. https://golang.org/pkg/bufio/#Reader.ReadSlice. And it sucks less than making all users pay for an allocation that very little code will ever use.

If somebody has an alternate API proposal or implementation I'll take a look, assuming it meets these general requirements.

Author

@JohnRusk JohnRusk commented Jun 4, 2020

Hi Brad. That suggestion about RawHTTP1Header sounds like a good option. Minimal impact, but enough to let us do what we need to. I presume the internal buffer, which it aliases, already exists. Is that right? If so, what about exposing it as a method instead of a field? Perhaps called something like GetRawHTTP1Header() or CopyRawHttp1Header()? The method would make a copy from the internal buffer, but the copy would only happen if the method was called. So the only users who would pay the price of the allocation are those who actually need and use this feature. Perhaps that would be cleaner, in terms of API design, because the result of the method call is indeed a copy, and so there's no need to worry about the duration of its validity. (You just need to call the method before the next request is made on the connection.)

Will it work client side too? (Just checking because you mention http.Server in the earlier paragraph, but we need it client-side, and I imagine other users will too.)

Contributor

@bradfitz bradfitz commented Jun 5, 2020

> Perhaps called something like GetRawHTTP1Header() or CopyRawHttp1Header()? The method would make a copy from the internal buffer.

My reservation with a method is that they render pretty prominently in godoc HTML compared to, say, a struct field.

> Perhaps that would be cleaner, in terms of API design, because the result of the method call is indeed a copy, and so there's no need to worry about the duration of its validity.

It would be cleaner, but OTOH this is definitely in the realm of "you should know what you're doing" territory so making a super comfortable & safe API isn't the highest priority.

BTW, the other constraint that would need to be documented on the lifetime of this slice is that it's only valid before the Request.Body is read from. Because that can also advance the buffer.

But now that I think about it more, I'm not even sure that this API works, as I think we only use a 2KB or 4KB bufio.Reader to read from the connection (so that's all we have in memory at a time) and we permit by default up to 1MB of headers (https://golang.org/pkg/net/http/#DefaultMaxHeaderBytes). So RawHTTP1Header []byte on by default won't work, at least in the general case. (And documenting that it only works for "small" HTTP requests seems pretty terrible.)

So we probably do need some sort of opt-in mechanism, and then it can't work in HTTP middleware packages that don't have control over the http.Server setup.

> Will it work client side too?

Yeah, it probably should. I don't think we'd want to accept a change that only did one, lest it turn out that whatever API we pick isn't sufficient for the other. Seeing it work with both would be a good sign that the API was sufficient.

Author

@JohnRusk JohnRusk commented Jun 6, 2020

Good point re whether it should be documented prominently or not. (I agree with you that "not prominent" is preferable).

The question of whether it only works on small requests seems key. I think folks who have meaningful info in headers (such as some of our users) could easily have more than 4KB. Do you have any other suggestions? Shall we (our team here) see if we can come up with any alternative that meets the general requirements that you've outlined?
