Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HEAD for GET routes #238

Closed
alehano opened this issue Aug 1, 2017 · 58 comments
Closed

HEAD for GET routes #238

alehano opened this issue Aug 1, 2017 · 58 comments

Comments

@alehano
Copy link

alehano commented Aug 1, 2017

Standard go http router allows to do HEAD requests to GET endpoints and return the same response but without body.
Chi response 405 for GET routes with HEAD method.

@pkieltyka
Copy link
Member

pkieltyka commented Aug 1, 2017

Make sure to specify .Get() and .Head() requests separately. 405 measures method is not allowed, which leads me to think you didnt define it.

@alehano
Copy link
Author

alehano commented Aug 2, 2017

Yes, I can set .Get() and .Head() method for the same route. But It's duplicate code.
Maybe you can add .GetHead() method to set GET and HEAD requests for the same route or something like this?
But for me it's better to respond for HEAD requests in .Get() directly.
Web crawlers and robots may do HEAD requests, so I want always to respond, instead they can think my page is broken.

According to HTTP/1.1 RFC:

9.4 HEAD
The HEAD method is identical to GET except that the server MUST NOT
return a message-body in the response. The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request. This method can
be used for obtaining metainformation about the entity implied by the
request without transferring the entity-body itself. This method is
often used for testing hypertext links for validity, accessibility,
and recent modification.

@vektah
Copy link

vektah commented Aug 2, 2017

A way to automatically respond to a HEAD request using the GET handler if a HEAD handler hasn't been provided would be nice 👍

@pkieltyka pkieltyka reopened this Aug 2, 2017
@pkieltyka
Copy link
Member

@alehano I'll give this some more thought. I could add .GetHead() but, if they both point to the same handler then it will be up to the handler to ensure there is no message body returned based on r.Method == "HEAD" in the handler.

I could set an EmptyBodyHandler set as the default for a HEAD in that situation, unless overridden with a .Head(). The changes to logic is why chi settles to do less and leaves it to the developer's choice.

My one thought is to offer chi.Options{} as an argument to NewRouter() to set such a behaviour, but it goes against my wishes to keep chi super lean and small.

@alehano
Copy link
Author

alehano commented Aug 2, 2017

I love that chi lean and small.
Now I made a helper:

func GetHead(r chi.Router, pattern string, h http.HandlerFunc) {
	r.Get(pattern, h)
	r.Head(pattern, h)
}

And it works as expected. Go http decide return body or not, based on method. Bit it's doesn't looks nice, as well as adding .GetHead() to the chi.
Why not just make .Get() method like this:

func (mx *Mux) Get(pattern string, handlerFn http.HandlerFunc) {
	mx.handle(mGET, pattern, handlerFn)
	mx.handle(mHEAD, pattern, handlerFn)
}

Maybe with checking if Head() not already added.

@pkieltyka
Copy link
Member

@alehano it doesnt feel right to me to have GET and HEAD do the same thing as a default. But if others feel that is how it should work, I can make a concession for sure and add that line.

@alehano
Copy link
Author

alehano commented Aug 2, 2017

@pkieltyka Ok, I also want to hear other's thoughts.
For me, I don't see useful cases to have separate .Head() method. Because it always must be the same as GET, but without body.

Btw this how Go http package handle this:
https://github.com/golang/gofrontend/blob/master/libgo/go/net/http/transfer.go#L226
https://github.com/golang/gofrontend/blob/master/libgo/go/net/http/transfer.go#L105
https://github.com/golang/gofrontend/blob/master/libgo/go/net/http/transfer.go#L110

func noResponseBodyExpected(requestMethod string) bool {
	return requestMethod == "HEAD"
}

// ...
t.ResponseToHEAD = noResponseBodyExpected(t.Method)
// ...

if t.ResponseToHEAD {
	t.Body = nil
	if chunked(t.TransferEncoding) {
	t.ContentLength = -1
	}
} 

@c2h5oh
Copy link
Contributor

c2h5oh commented Aug 2, 2017

@pkieltyka he has a point:

https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

9.4 HEAD

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

The response to a HEAD request MAY be cacheable in the sense that the information contained in the response MAY be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current entity (as would be indicated by a change in Content-Length, Content-MD5, ETag or Last-Modified), then the cache MUST treat the cache entry as stale.

emphasis mine.

@DmitriyMV
Copy link

DmitriyMV commented Aug 2, 2017

IIRC currently neither gorilla/mux nor julienschmidt/httprouter do that, so the implicit conversion from GET to HEAD doesn't look THAT useful. I may be wrong about that and in no way state that this shouldn't be done. I'm just a bit concerned about additional work that app may need to do, and which is going to be wasted.

The standard mux analogy is interesting, but Go default mux have several pain points, so I wouldn't call it a great example.

@alehano
Copy link
Author

alehano commented Aug 2, 2017

One more reason of uselessness separate .Head(). For example, you want to add handler to .Head() for adding some Header values. But in this case it'll remove Content-Length header. So it's contradicts to RFC. Or you have to calculate Content-Length of correspondent GET method, which is unlikely somebody will do it.

@DmitriyMV
Copy link

@alehano
Not exactly - https://stackoverflow.com/questions/3854842/content-length-header-with-head-requests

If our GET method result is dynamically generated - this clause can be averted.

@alehano alehano closed this as completed Aug 2, 2017
@alehano alehano reopened this Aug 2, 2017
@DmitriyMV
Copy link

DmitriyMV commented Aug 2, 2017

Well - in case our GET request handler has to fetch data from remote sources (such as DB, storages, API) - our "implicit" HEAD handler which will be actually a GET handler and it will be forced to get and wait the required data. And then discard it.

This is also concerning when we have cache system with complex rules - I can imagine the scenario when the cache gets invalidated two times - on HEAD and then on GET.

@VojtechVitek
Copy link
Contributor

If we do the automatic HEAD response, should we also think of having something similar for OPTIONS?

@pkieltyka
Copy link
Member

pkieltyka commented Aug 2, 2017

@DmitriyMV if we do implicitly set Head() in Get(), I would still allow users to override the Head handler by another explicit call to .Head() after the .Get(), so that should take care of it.

@vektah
Copy link

vektah commented Aug 3, 2017

Rather than get setting head, I was thinking more along the lines of a HEAD request falling through to get at request time:

  • If a HEAD handler is defined, use it. This makes optimisation possible / avoiding costly operations.
  • If a GET handler is defined, call it and throw the body away. Its extra work but http compliant without any work.
  • Otherwise send method not allowed

Additionally when a HEAD request is made to a GET handler, immediately after the headers are sent the context could be cancelled, avoiding external work and database calls where its supported.

@DmitriyMV
Copy link

DmitriyMV commented Aug 3, 2017

Please, NO implicit settings. If I want this behavior I will enable it explicitly. Adding new method is better than redefining behavior of the existing ones.

I would also like to point out, that HEAD lookup time will increase depending on the size of existing GET routes.

@vektah
Copy link

vektah commented Aug 3, 2017

Please, NO implicit settings. If I want this behavior I will enable it explicitly.

What about something like

router.MethodNotAllowedHandler = chi.PromoteHeadToGet

I would also like to point out, that HEAD lookup time will increase depending on the size of existing GET routes.

I'm not sure that's true, last I looked the same tree node had all methods on it. Admittedly its changed a bit since regexs were introduced.

@DmitriyMV
Copy link

I think it will be router.MethodNotAllowed(chi.PromoteHeadToGet).

@pkieltyka
Copy link
Member

I think the approach that makes sense is:

  1. Create new chi pkg-level emptyBodyHandler which just executes w.Write(nil)
  2. Implicitly set the HEAD method in the Mux#Get method as so..
func (mx *Mux) Get(pattern string, handlerFn http.HandlerFunc) {
	mx.handle(mGET, pattern, handlerFn)
	mx.handle(mHEAD, pattern, emptyBodyHandler)
}

Users can still explicitly set the HEAD for a route with Mux#Head

thoughts?

@DmitriyMV
Copy link

DmitriyMV commented Aug 3, 2017

That solution will not set the headers which GET handlerFn could set. The problem as I see it currently is that we want the GET handler headers but we do not want GET handler body.

@alehano
Copy link
Author

alehano commented Aug 3, 2017

router.MethodNotAllowed(chi.PromoteHeadToGet)

Looks good. If it possible to implement that way. Standard go http (on which chi built), already get rid of body in case HEAD request. So, we dont need to invent bycicle, just need a way to set the same handler for two methods.

@pkieltyka
Copy link
Member

router.MethodNotAllowed(chi.PromoteHeadToGet) is clever way to enable this option but it might be too slick. I'll give this some more thought considering everyone's feedback. Thanks all.

@VojtechVitek
Copy link
Contributor

Another approach:

r := chi.NewRouter()
r.Use(HeadForceEmptyBody)
r.GetHead("/handler", MyHandler) // both HEAD and GET
func HeadForceEmptyBody(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
                if r.Method == "HEAD" {
                    w = middleware.NewWrapResponseWriter(ioutil.Discard, 1)
                }
		next.ServeHTTP(w, r)
	})
}

@vektah
Copy link

vektah commented Aug 3, 2017

Yeah that would work, and it's pretty straightforward. The only thing I dont think like about it is it leaves the solo Get broken from a http perspective. We can catch it with a linter pass, but ideally the default is safe after configuring the router.

@DmitriyMV
Copy link

No changing of the default behavior, please.

@DmitriyMV
Copy link

DmitriyMV commented Aug 4, 2017

The middleware approach looks interesting, but it implies that we didn't define any proper HEAD handlers. The MethodNotFound approach is "pay later" rather than "pay upfront". Personally, I like to "pay later".

@pkieltyka
Copy link
Member

fixed in 1b51a16

@DmitriyMV
Copy link

DmitriyMV commented Aug 22, 2017

@pkieltyka Sigh. You just broke the backward compatibility. Can we please have any sort of switch to turn this off or on?

Also, please don't do things like that with future minor versions. Reserve them for major ones. I like chi, but with changes like this one, I will be forced to move my company projects away from it. Sorry.

@pkieltyka
Copy link
Member

@DmitriyMV are you referring to the FindRoute() definition change? or the fact that an undefined HEAD route will match its GET route? I believe by the RFC a HEAD should exist for GET routes: #238 (comment)

You of course can still explicitly define a Head handler for a route, etc.

@DmitriyMV
Copy link

DmitriyMV commented Aug 22, 2017

@pkieltyka I mean that I currently have custom defined NotFoundHandler handler and you just broke my code.

@pkieltyka
Copy link
Member

I have just released the change under https://github.com/go-chi/chi/tree/v3.2.0

You can use https://github.com/go-chi/chi/tree/v3.1.5 for compatibility with your app.

@pkieltyka
Copy link
Member

pkieltyka commented Aug 22, 2017

@DmitriyMV custom not found handlers still work as before, perhaps you had some weird logic handling HEAD requests in a custom not found handler? if so, perhaps setup a middleware looking for r.Method == "HEAD" and respond that way, which is a better approach (if you're doing some caching stuff)

again, we have version control / dependency management for a reason

@DmitriyMV
Copy link

@pkieltyka

This code:

package main

import (
	"net/http"

	"github.com/go-chi/chi"
)

func main() {
	r := chi.NewMux()
	r.Get("/hello", func(w http.ResponseWriter, r *http.Request) {
		println("I was called with METHOD", r.Method)
		w.Write([]byte("Greetings!"))
	})
	r.MethodNotAllowed(func(writer http.ResponseWriter, request *http.Request) {
		if request.Method == "HEAD" {
			println("I'm HEAD request! Hurray!")
		} else {
			println("I'm " + request.Method + " method request")
		}
	})
	http.ListenAndServe(":3333", r)
}

Behaves differently after commit 1b51a16ac893c99cc5978fca77249701129eedaf. Now there is no way to define such handler at all.

I agree about VCS point. But I also think, that inviting people to discuss breaking change, only to accept it unconditionally after two hours and no discussion is not a good way to have any sort of discussion. Given the fact, that better solutions were proposed without breaking existing code.

Regardless I thank you for your time and effort and I will show myself out.

@pkieltyka
Copy link
Member

pkieltyka commented Aug 22, 2017

indeed, that will not work as you've outlined there. But, Im not sure what you're trying to solve. Can you explain the logic you have where you must have a MethodNotAllowed handler that checks for HEAD requests to execute some logic? can you describe your use case.

@DmitriyMV
Copy link

CORS requests. I know you can use "/" - tho not every case. I also know that your new behavior actually contradicts RFC because we are still returning headers which we do not guarantee to not change between requests.

Anyway - the deed is done. I see no point in continuing this discussion. I wish you all the luck, but from now I will stay away from chi and advice my colleagues to do the same.

@pkieltyka
Copy link
Member

an alternate approach: #248 - I do prefer its explicitness in setting this behaviour, as well better than MethodNotAllowed which feels like abusing the API.

@pkieltyka
Copy link
Member

one more approach: #249

@pkieltyka
Copy link
Member

@DmitriyMV / others - chi is a project Im obviously passionate about and I've spent a number of years designing and iterating with great care. I do my best to be responsive and take my time with implementations because I care about the architectural integrity of the project. However, like everyone else in OSS, I have tons of other responsibilities outside of this project and for the most part it's maintained by just myself - sometimes, I will make quick judgements and they cannot all be perfect. I should have waited for the feedback of others before merging/making a release, but I was excited to come to a simple solution so I hit the big green "merge" button. But guess what, we can iterate forward easily - I've already offered two alternate solutions, and you can lock onto a previous tag until one lands.

In after 2 years of working on chi, no one has ever told me they will recommend against it because of a minor commit - thats unfortunate. Sure, recommend against it because theres a better option or a fundamental flaw, but not because you didn't have your way. This is my first OSS project with larger adoption, and its interesting to see the sentiment after a small difference in preference. I can only imagine how hard the cryptocurrency space must have it.

btw, side note: depending on what you're building.. I highly recommend checking out gRPC and as well if you're targeting the Web/browsers, check out https://github.com/improbable-eng/grpc-web. gRPC in my opinion is the future for writing APIs and services - and its getting better everyday.

@DmitriyMV
Copy link

DmitriyMV commented Aug 23, 2017

Look, I get it. I got angry too because I spent last 20 days tracking this issue, and searching for the better solution (I actually did look into RFCs and "headers problems"), even notifying you in your merge request about explicit switch variable, so you could think about it. Only to find it merged without any sort of answer, tho you yourself invited other people to discuss the proposed solution. So - yes, I got mad.

The actual proper solution should remove the set of headers, but the problem is - we cannot change ResponseWriter.Headers after we had called Write or WriteHeader. This is problematic because we should not return Content-Length for HEAD requests if the response will be dynamic. So at this moment, I really don't have a better solution other than proxying ResponseWriter which will get into GET handler for the HEAD request and modify it afterward in our "middleware". That, in turn, results in another problem - that ResponseWriter can contain a growing set of interfaces (more here github.com/felixge/httpsnoop). So atm - I really don't have any proper fix.

And this is the main reason why I got angry - how easy it was for you to break the existing behavior, even though the solution doesn't introduce proper fix. I know that breaking changes are sometimes necessary - you can't compromise your clients' security, for example. You may also want to change API because it can be simplified or made more robust. Or just because you think it will be for the better.

But you should always remember, that in OSS, other people depend on your code. And the more project matures, the more people use it. They are willing to discuss your ideas. They are willing to help you. They are even willing to migrate if your changes are actually good, and you provided them a "transition period". They, essentially, trust you. And when you break this trust, bad things happen...

I cannot monitor every commit and change individually - because I too have a lot of responsibilities and routine on my day to day job. This is why we have semver - so we collect the things we want to fix and only when we absolutely sure, we break the existing code. I pinned my code to version 3.1.5, but that does not mean others did it too. What is more important - that pin removes my ability to receive code updates - including security ones, if you find and fix a bug in your later versions.

Again - I'm not against breaking changes. I know that software evolves. But I'm against breaking it sporadically without really serious considerations or giving people a transition period. This kills any sort of trust, because other people's code has "just stopped working" even if they had used it incorrectly from the semantic point of view.

@pkieltyka
Copy link
Member

All good man. If you can look at #248 and let me know what you think - I am leaning towards that solution since it provides most flexibility to users through middleware. It adds a few APIs to deal with the routing during execution. It adds slightly more overhead for HEAD requests that use this middleware, but its negligible in a real-world scenario.

Also, check out: https://stackoverflow.com/questions/3854842/content-length-header-with-head-requests

@DmitriyMV
Copy link

DmitriyMV commented Aug 23, 2017

I did - I actually was the one who linked this answer in this issue 😉. There is the even direct answer to the question:

So the real answer is, you don't need to include Content-Length, but if you do, you should give the correct value.

So even if your new solution is no longer breaks existing code, it still is incorrect. I only glanced at the code - do you really need to re-assign Routes field variable every time?

I will look further into it, including the ability to change headers later this or next week. Currently, I tend towards this https://github.com/felixge/httpsnoop/blob/master/wrap_generated_gteq_1.8.go#L66 even tho it looks horrid.

P.S. Yes - we did look into GRPC - most of our new services are using github.com/grpc-ecosystem/grpc-gateway. Not all of them tho.

@pkieltyka
Copy link
Member

I'll look into what is the correct behaviour of the HEAD later, that can be solved in the middleware impl. But regarding the PR, I will have to re-assign Routes each time to make sure the context state is valid before its routed, it its also just a pointer value so there isn't much overhead. I benchmarked the execution and it has negligible impact.

@VojtechVitek
Copy link
Contributor

VojtechVitek commented Aug 24, 2017

I like both #248 and #249 solutions. But imho, I'm leaning towards the Middleware approach.

#248 - Middleware approach

FindHandler() is something that might be very powerful for lot more use cases, where you can change the route dynamically based on request and/or context. However, it can introduce endless routing loop if used incorrectly. Do you guys have any more use cases where this could be useful?

func FindHandler(rctx *Context, method, path string) http.Handler

Do we need to pass context into the function? And why?

I'm thinking of another use case that could be useful in tests and/or docgen. Take a look at this signature:

func Find(method, route string) (http.Handler, ...func(http.Handler) http.Handler)

it would be more consistent with chi.Walk() and you could for example use it in your tests to get list of middlewares for a specific endpoint, ie. to check if you applied some ACL middleware. That could be very handy.

#249 - Config approach

The Config approach is very simple and elegant in this context. However, once we introduce Config, we'd probably end up adding more and more setting fields into it, effectively making the API more complex with each such release. I like the simplicity of chi at this moment and I'm asking myself a question - do we need Config in the long run - or can we live without it and keep making strict decisions?

How would Config work with sub-routers, by the way? Would we need to pass the same Config when initializing the sub-routers? Or is the root Config propagated down the chain somehow automatically?

@DmitriyMV
Copy link

DmitriyMV commented Aug 24, 2017

Regarding Config - judging from the source code it's done automatically for new sub-routers as long as you use Route method. If you wish to explicitly mount the sub-router - you will have to initialize it correctly.

While Config approach is simpler, the middleware approach is more explicit and doesn't introduce this fix into the core of this router. That alone will allow us to use corrent "dirty" fix (documenting the fact, that headers will not be stripped from the resulting response) and introduce the proper one (once this golang/go#4146 or golang/go#16522 will be resolved) without breaking existing code.That and @VojtechVitek points about dynamic routing.

@pkieltyka
Copy link
Member

yea, I think #248 is better too because it keeps the core lean and extends the api slightly to offer more power for people to program their own middlewares to change routing behaviour for their needs.

btw, @VojtechVitek - FindHandler requires the *Context because its actually doing routing through the tree, where it stores the url param values during traversal. Perhaps there is a better name than FindHandler, as what its doing is Routing the method and path to a handler. Maybe, Traverse(...) http.Handler or Seek(...) http.Handler or Match(...) http.Handler

@DmitriyMV
Copy link

@lwc @alehano @vektah @c2h5oh any thoughts on the matter?

@alehano
Copy link
Author

alehano commented Aug 25, 2017

I agree with @VojtechVitek thoughts. I prefer #248 approach.

@cardonator
Copy link

#248 is probably for the best. I think the concern above about starting to rely too much on Config is important. That probably best avoided for as long as possible.

@pkieltyka
Copy link
Member

please review #248 again when you guys have a sec

@pkieltyka
Copy link
Member

merged #248

@DmitriyMV
Copy link

Thanks! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants