Redirects w/ POST on HTTP 302/303/308: Found #29

Closed
EvanCarroll opened this Issue May 5, 2011 · 17 comments

Comments

Projects
None yet
5 participants

BACKGROUND

rfc1945 HTTP 1.0

   302 Moved Temporarily

   The requested resource resides temporarily under a different URL.
   Since the redirection may be altered on occasion, the client should
   continue to use the Request-URI for future requests.

   The URL must be given by the Location field in the response. Unless
   it was a HEAD request, the Entity-Body of the response should
   contain a short note with a hyperlink to the new URI(s).

   If the 302 status code is received in response to a request using
   the POST method, the user agent must not automatically redirect the
   request unless it can be confirmed by the user, since this might
   change the conditions under which the request was issued.

       Note: When automatically redirecting a POST request after
       receiving a 302 status code, some existing user agents will
       erroneously change it into a GET request.

rfc2616 HTTP 1.1

10.3.3 302 Found

   The requested resource resides temporarily under a different URI.
   Since the redirection might be altered on occasion, the client SHOULD
   continue to use the Request-URI for future requests.  This response
   is only cacheable if indicated by a Cache-Control or Expires header
   field.

   The temporary URI SHOULD be given by the Location field in the
   response. Unless the request method was HEAD, the entity of the
   response SHOULD contain a short hypertext note with a hyperlink to
   the new URI(s).


   If the 302 status code is received in response to a request other
   than GET or HEAD, the user agent MUST NOT automatically redirect the
   request unless it can be confirmed by the user, since this might
   change the conditions under which the request was issued.

      Note: RFC 1945 and RFC 2068 specify that the client is not allowed
      to change the method on the redirected request.  However, most
      existing user agent implementations treat 302 as if it were a 303
      response, performing a GET on the Location field-value regardless
      of the original request method. The status codes 303 and 307 have
      been added for servers that wish to make unambiguously clear which
      kind of reaction is expected of the client.

So what happened is after the release of HTTP 1.0, prior to the advent of HTTP 1.1 which brought codes 303, and 307 the spec read that the client MUST NOT redirect POST verbs temporarily. Unfortunately, there was no mechanism in the spec to redirect POST verbs -- slight oversight -- and the browsers were redirecting solely on the basis of Location header. It is not common place for HTTP 302 to be legacy for the more explicit HTTP 303. This functionality should be permitted in Request too.

Both 302, and 303 require user-approval. But, because Request won't be able to easily get that I suggest we support a feature to auto-approve on behalf of the user.

In essence, if there is a Location header the client should reissue the request as a GET rather than a POST. I'm going to create an option for this functionality and default it to enabled, per the expected behaviors. I'm also going to amend HTTP 308 to redirect with the POST verb, per the explicit word of the HTTP 1.1 spec.

EvanCarroll added a commit to EvanCarroll/request that referenced this issue May 5, 2011

Minor code cleanup and fixed #29 with regards to redirecting HTTP 302…
…/303/307 and POST, added myself to contributors in package.json
Owner

mikeal commented May 24, 2011

so, at the moment we're just forwarding for all verbs.

this would essentially add an option to "turn off" forwarding on 302 for the POST verb.

if the user wanted to turn this off, why wouldn't they just turn off forwarding on POST?

do you have a specific use case?

Because you're not forwarding all Verbs! You're trying to obey the spec, but it's being done wrong. Just look at the HEAD for the latest code base: https://github.com/mikeal/request/blob/master/main.js

     if (response.statusCode >= 300 && 
          response.statusCode < 400  && 
          options.followRedirect     && 
          options.method !== 'PUT' && 
          options.method !== 'POST' &&
          response.headers.location) {

You're quite explicitly not forwarding PUT or POST

Owner

mikeal commented May 24, 2011

what do you think about just removing the verb check?

Bad idea, it would forever remove the ability to have a true HTTP 1.0, or HTTP 1.1 compatible client. In addition it doesn't solve one of the bigger problems solved with my patch:

      if (
        options.method == 'POST' &&
        (
          response.statusCode === 303 ||
          ( response.statusCode === 302 && options._http302ashttp303 )
        )
      ) {
        options.method = 'GET';
      }

So what you want is just if this browser convention convention is followed, for the verb to switch. You might also want to one day have the ability to provide a callback that does the approval component. So you follow redirects if they can be followed automagically, or you can provide user-approval if anyone builds upon this library and wishes to do that. My patch seeds that ability with options.approveRedirect, a separate state from followRedirect.

Owner

mikeal commented May 24, 2011

this is the part of the patch i'm most concerned with. changing the request's semantics from POST to GET makes me nervous. it makes sense in the browser, but I think you'll want much more predictable behavior when using request than you might in the browser. a POST should attempt to POST data and follow forwarding logic (in accordance with the spec), if the spec says we shouldn't forward then I think we should bail out and give the callback the 30x response, not change the semantics to a GET.

that are too many lazy servers that return 200 when they mean 201 so it's common when doing statusCode checks to just check ( status > 200 && status < 300 ), if we change the semantics to a GET then the user will think the data was written on 200.

You must have the ability to follow the browser convention, and the HTTP spec. The browser convention is redirect POSTs to GETs in HTTP1.0's 302 responses. This has to be implemented or people will forever complain, rightfully so, about having to do additional work. I was trying to scrape Craigslist when I noticed this.

Check out this https://post.craigslist.org/c/hou?lang=en

Now the form is destined for this resource: https://post.craigslist.org/k/<token>, but alas a 302 is issued which will push you back over to https://post.craigslist.org/k/<token>?s=<option>, where you can continue your quest. That 302 in this context, in all major browsers, results in a GET request.

I would actually think forwarding POSTs, without changing the VERB, could be dangerous. Remember the server doesn't redirect the request in HTTP, it just tells the browser to do it. POSTs and PUTs could be destructive. Also, no where in the spec except do you ever forward a POST as a POST without user interaction. My suggestion is that we have two modes, one which permits some sort of functionality to obtain API approval, and another that automatically approves.

Owner

mikeal commented May 25, 2011

I'm not disagreeing that we should add this feature, I just don't think it should be the default.
I'm inclined to leave the no forwarding for POST and PUT defaults in and add an option to do things the browser way. I agree that auto-forwarding POST and PUT could be dangerous and destructive, but the browser conventions aren't well known by most developers and I'm afraid it's too far away from their expectaions.
Basically, defaults should be 1) safe 2) close to expectations. That doesn't mean we should not implement non-default functionality that maps to the browser conventions, we just need to keep the defaults in line with expectations.
Ideally what I'd like to see for this is, a new option that turns on this specific behavior, and another option that puts request in to "browser mode", which would set this and possibly future options.
The request.defaults call can be used to return a wrapper around the request module that changes the defaults, so using "browser mode" all the time would only be a one line code change.

In almost every case, the expectation will be to do what the browser does. They're using this library to integrate with HTTP servers, but I doubt very many of them have ever even touched the spec. They'll be dumbfounded when they browser is redirecting and their library isn't, even though the option is enabled. They'll wonder why they get "redirects" sometimes and not other times. Essentially after their long investigative task, they'll have to come to understand...

HTTP 302 redirects GET->GET transparently in the spec, Request.js, and the browser; but, if you post to something that redirects HTTP 302 all browsers mistakenly do this automatically, the spec requires user validation on the redirect, and Request does this through an additional option.

I think the reference implementation should be first and foremost the browser, with ability to strictly adhere to HTTP 1.0 and 1.1. If you go a different route, I would suggest making the library HTTP 1.0/1.1 aware first. Because, it isn't now...

Owner

mikeal commented May 25, 2011

if i'm talking to couchdb, and i try to write something to the wrong location and a load balancer in front of couchdb does a redirect i do not want that to transform in to a GET.

Then the load balancer should issue a HTTP 1.1 - 307 Temporary Redirect not a 302 redirect.

 Note: RFC 1945 and RFC 2068 specify that the client is not allowed
  to change the method on the redirected request.  However, most
  existing user agent implementations treat 302 as if it were a 303
  response, performing a GET on the Location field-value regardless
  of the original request method. The status codes 303 and 307 have
  been added for servers that wish to make unambiguously clear which
  kind of reaction is expected of the client.

Even the spec makes notes of the implementations breaking it.

Contributor

isaacs commented May 25, 2011

I'd recommend keeping the current behavior as-is.

  1. redirecting on a GET is mostly harmless, and expected.
  2. redirecting on a POST is kind of strange, and unexpected, especially if it sends the data to a different server or something.
  3. changing the verb is OK for browsers, but way too magical for a programmatic http client.
  4. the current behavior is to have request throw up its hands, and say "Sorry, I don't know what to do with this," and call your callback. It's easy enough from there to continue if you want, or make a GET, or just forward the request.

Compare to curl, where there is simply one flag to either always follow (and forward) all Location: headers, or not. Request is already pushing the bounds of cleverness. It's not a web browser, it's a programmatic http client. If you want a web browser, check out PhantomJS or Zombie.

Contributor

polotek commented May 26, 2011

I agree with @isaacs. I think safety and configurability trumps convention here. Also it's an important point that request is a server-side lib. If there is a convention that should inform this decision, it is other server-side libs, not the browser. And I doubt you'll find much consensus there.

Owner

mikeal commented May 26, 2011

i'm still of the opinion that it would be good to implement a "browser mode" that defaults request to behave almost exactly like a web browser. I just don't think that it should be the default.

Contributor

polotek commented May 26, 2011

@mikeal. Yeah I'm all for adding configs to get different behavior. But I would be careful calling it "browser mode". Or you'll get a lot more complaints or suggestions about what people expect from the browser. And then you'll remember that it's "browsers" and they do things differently. Rabbit hole.

Owner

mikeal commented May 26, 2011

browsers handle HTTP more or less the same, it's one of the few things they do similarly. i agree that it's a rabbit hole but it's a problem domain i think that someone needs to solve and I don't see a better place to do it than request.

Owner

mikeal commented Jul 21, 2011

i'm closing this out for now. there isn't any agreement on how to handle this in a way that everyone expects so I'm starting to think that it's not up to request to solve this problem and that this logic should be pushed higher up in the application layer.

@mikeal mikeal closed this Jul 21, 2011

Shame, this would have been a nice convenience feature. Let me know if you would still consider it and if I get time I will implement it.

@floatdrop floatdrop referenced this issue in sindresorhus/got Apr 28, 2015

Merged

auto-redirect only on GET and HEAD methods #57

mgorczyca added a commit to mgorczyca/request that referenced this issue May 13, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment