Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added CookieJar for storing all cookies on redirects #129

Closed
wants to merge 10 commits into from

Conversation

icemilo
Copy link

@icemilo icemilo commented May 4, 2015

Hi, I love needle which simplifies http requests.
I needed to store all the cookies for future request to the same origin, but needle only stores the latest cookie from the last redirect.

I thought someone else might need this feature so I created a pull request.
I've added a global variable to store all the cookies site sets on redirects and requests with cookies saved in the global cookieJar. And also, I have added a config key as follow_set_all_cookies defaults to false. Also, added exports.cookies for future needle requests.

@tomas
Copy link
Owner

tomas commented May 5, 2015

Hey @icemilo, thanks for the PR. I don't mind adding "cookie jar" support into Needle (it seems like a few lines of code away), but I wouldn't do it using a global variable, because this makes it very unclean as you would easily share cookies across different requests. I think I'd prefer users to pass in explicitly the cookie jar they wanted to use for a specific request, e.g.:

  var needle = require('needle'),
      jar = {};

  needle.get('foobar.com/path', { cookie_jar: jar }, function(err, resp) {
    console.log(jar); // { cookie1: 'something' }
    needle.get('foobar.com/whatever', { cookie_jar: jar }, function(err, resp) {
      console.log(jar); // { cookie1: 'something', cookie2: 'something_else' }
    })
  })

Or maybe you could even use needle.defaults({ cookie_jar: jar }) function for this.

  needle.defaults({ cookie_jar: jar });

  needle.get('foobar.com/path', function(err, resp) {
    console.log(jar); // { cookie1: 'something' }
    needle.get('foobar.com/whatever', function(err, resp) {
      console.log(jar); // { cookie1: 'something', cookie2: 'something_else' }
    })
  })

I haven't tested this yet, but it might work.

@icemilo
Copy link
Author

icemilo commented May 6, 2015

Hey @tomas, I really appreciate for the suggestion. I didn't think of putting "cookie_jar" object as an option or default variable. To satisfy the needs, I have added cookie_jar to defaults and options. Please check on the commit 9fc4742
Thx

@lucaswxp
Copy link

Will this be implemented?

@tomas
Copy link
Owner

tomas commented Dec 23, 2015

Hey @icemilo, sorry I've taken so long to look into this. Your PR looks good for merging but I'm trying not to merge major changes without including tests. Would you be willing to add one or two so I can get this into master? :)

@icemilo
Copy link
Author

icemilo commented Dec 24, 2015

@tomas Alright, now that you've come back from a long journey, I will include tests when I have some time to work this one. It should not take long ;)

@tomas
Copy link
Owner

tomas commented Dec 24, 2015

Cool, thanks mate.

@icemilo
Copy link
Author

icemilo commented Dec 30, 2015

@tomas Could you check my commits for tests and see if it satisfies? :)

@tomas
Copy link
Owner

tomas commented Dec 30, 2015

@icemilo Looks good. I think I'll tweak the syntax a bit but everything seems OK. 👍

@icemilo
Copy link
Author

icemilo commented Dec 31, 2015

@tomas Alright, thanks for considering this PR to be merged :) Have a great New Year's eve 👍

@rchipka
Copy link

rchipka commented Jun 7, 2016

Can't this be implemented using only res.cookies and not a global cookie jar?

In my opinion, it would make more sense if res.cookies accumulated any redirect cookies. That way users don't have to worry about having a global cookies object or setting a config option.

We could pass redirect cookies to the next request and then store them in the res.cookies object with any other cookies that were set.

@tomas
Copy link
Owner

tomas commented Jun 7, 2016

@rchipka I was actually in the process of getting this merged and got to the same conclusion. My only doubt is whether you'd always want to have all of the redirect cookies stored and not just the last ones.

@rchipka
Copy link

rchipka commented Jun 7, 2016

@tomas I think that res.cookies should essentially reflect the current Cookies header, which would include any of the previously acquired redirect cookies according to the follow_set_cookies option.

I think that if needle were to have a global cookie jar, it should at least be a fully featured one. For example, it should be domain specific instead of just sending all the global cookies along in every request. It also shouldn't require any additional options.

@icemilo
Copy link
Author

icemilo commented Jun 7, 2016

@rchipka Thanks for the opinion. I needed this feature because long ago, I came across some endpoints which would redirect to another endpoint about 3 times and each endpoint needed all the cookies on redirects. Because follow_set_cookies option would just pass the cookies from the previous to the next endpoints, I decided to make a cookie jar which can be provided by user for later uses. Although this can be implemented internally(ex. having local cookie jar inside Needle), I preferred to have this open to options where its explicitly accessible for developers.

@rchipka
Copy link

rchipka commented Jun 7, 2016

@icemilo I think that users should configure options (including cookies) via the global default options or per-request options.

Giving the request function the same cookie object that the developer provided has no added benefit over the developer just setting any cookies in a global or per-request fashion. It'd be different if the developer gave a more sophisticated cookie object that needle could interact with to resolve domain specific cookies etc.

I'm not advising we go that route either as it adds even more complexity than requiring the developer to provide the cookie object as an option in the first place. I just see no reason why the developer wouldn't just use the cookies option, which would work just like the cookie_jar option. Setting a cookie globally is simply a matter of changing the defaults.

I think that we should choose one method and one option.

@icemilo
Copy link
Author

icemilo commented Jun 7, 2016

@rchipka If we would want to go with overriding defaults option then it would make more send if we have a global variable in Needle which acts as a cookie_jar, and let the users provide their own variable to override the default in Needle. If we would want to provide per-request options then there is no point of providing this option to users as the users have to write functions to request to each redirects manually .

Again, the reason why developers would not just use cookies option is because we don't want to manually create requests to each redirects if there are more than 5 redirects. They don't want to write 5 requests and handle 5 callbacks to just get one result from an endpoint. And follow_set_cookies option just passes the cookies from a previous endpoint to the next endpoint, which the next endpoint does not get the cookies from previous, previous endpoint.

From your points, I'm positive for having domain specific cookie jars and I think this can be an improvement later on.

I hope this makes it more clearer.

@rchipka
Copy link

rchipka commented Jun 7, 2016

@icemilo I guess my point is that the cookie_jar option is no different than the cookies option. The cookie_jar option just provides a global state for all cookies per needle instance. In my opinion, the state of the cookies should be per-request rather than per-instance, and if a developer wants to keep a cookie and use it globally, they can easily do so by setting the cookies option after they have access to the cookie.

The issue seems to be access to the cookies. The redirect cookies are completely inaccessible to the developer, and while a global cookie_jar object may retain these cookies, the context is completely lost. The developer has no idea which request acquired cookie (without writing even more code). The developer couldn't choose to modify or reject a specific cookie on the fly, or choose to only let a specific cookie be brought into the global settings.

The developer has no way, via a simple cookie_jar object, to know these things immediately. Which could be changed if Needle implemented a complicated cookie_jar object that did these things already. I suggest that Needle keep advanced cookie handling out of the code base and instead leave it up to the developer or some sort of add-on that gives Needle true standards compliant support for cookies.

While a developer would be able to modify the global cookies using this cookie_jar object, they would have no idea when to do so without additional checking and other overhead, which negates any convenience of having the global object.

The solution to this would be to have the previous endpoint pass any cookies to the next endpoint. Then, if the developer wants to update the "cookie jar" with any new cookies, they would simply copy the cookies from res.cookies into the needle defaults.

This would allow any developer to access any cookie and do anything with it that they choose, without adding any options/overhead/confusion/complexity. It's the way I'd expect Needle to work.

@icemilo
Copy link
Author

icemilo commented Jun 7, 2016

@rchipka this feature is not enabling developers to view and edit cookies. It's for the endpoints after, for example, 3 redirects needing cookies from the first endpoint. Because redirects are internally done in Needle, the third endpoint does not get the cookies from the first endpoint. And this is the only reason for cookie jar feature.

@rchipka
Copy link

rchipka commented Jun 7, 2016

@icemilo That's true, however this feature would imply to the developer that they can have some sort of control to view and edit the cookies because they have to manually set an option with a cookie jar object.

My point is, why not skip both of these things and simply pass along the cookies after each redirect. This way redirects 2 and 3 would receive all the cookies from redirect 1. Redirect 3 would receive all the cookies from redirect 1 and 2, etc. And then, after all the redirecting has finished, the developer could see all the cookies from all the redirects inside of res.cookies, without having any cookies automatically set globally.

This is the way I would expect Needle to work by default when using the follow_set_cookies option.

@icemilo
Copy link
Author

icemilo commented Jun 9, 2016

@rchipka Yes, that was my expectation when I first tried follow_set_cookies option. But because a lot of people are using Needle right now and we don't wanna break their things. So I made an extra option instead.

Guess this decision has to be made by @tomas. Will toss this decision to you :)

@tomas
Copy link
Owner

tomas commented Jun 15, 2016

After a discussion with my pillow I think I agree that the default behaviour when setting follow_set_cookies is to have them persist along different requests.

@icemilo Would you adapt your code so we can get this merged?

(And thanks for the discussion!)

@icemilo
Copy link
Author

icemilo commented Jun 15, 2016

@tomas Thanks for the decision(I know it would've been tough for you!) I will adapt my code soon and let you know.

@tomas
Copy link
Owner

tomas commented Aug 8, 2016

@icemilo Any news? I can take it from here if you prefer. :)

@icemilo
Copy link
Author

icemilo commented Aug 9, 2016

@tomas Hey, sorry to keep you waiting there. Its been a really busy days for me and finally done with it today(finally pushing to production tomorrow!) So I guess I can finish by this week. Thanks for your patience!

@smartpunter
Copy link

C'mon, such an easy fix taking so long.
Needle is a great lib, the only thing lacking is no-hassle cookie support through all requests, which is almost ready.

@tomas
Copy link
Owner

tomas commented Oct 21, 2016

@smartpunter You're right. I'll do the final changes when I have a minute. Thanks for the heads up!

@icemilo
Copy link
Author

icemilo commented Oct 24, 2016

@tomas Hey there, sorry for delaying this PR for too long. I have been really busy these few months. Anyways, I have updated the code, could you take a look? Thanks!

Copy link

@rchipka rchipka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@icemilo 👍 Thanks for updating this. I've made a few suggestions in my review.

@tomas The suggestions aren't anything major, so if the tests are passing and everything works this could be okay to merge as-is. However, I recommend you take a look and them and decide if any of the changes should be made before merging this.

@@ -415,7 +419,11 @@ Needle.prototype.send_request = function(count, method, uri, config, post_data,
set_timeout(config.read_timeout);

if (headers['set-cookie'] && config.parse_cookies) {
resp.cookies = cookies.read(headers['set-cookie']);
if (redirect_cookies !== null) {
resp.cookies = Object.assign(redirect_cookies, cookies.read(headers['set-cookie']));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Object.assign() not available before Node v5


if (config.follow_set_referer)
config.headers['referer'] = uri; // the original, not the destination URL.

config.headers['host'] = null; // clear previous Host header to avoid conflicts.

debug('Redirecting to ' + url.resolve(uri, headers.location));
return self.send_request(++count, method, url.resolve(uri, headers.location), config, post_data, out, callback);
return self.send_request(++count, method, url.resolve(uri, headers.location), config, post_data, out, cookies.read(headers['set-cookie']), callback);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using cookies.read() again here, why not just use resp.cookies directly?

if (config.follow_set_cookies && resp.cookies)
config.headers['cookie'] = cookies.write(resp.cookies);
if (config.follow_set_cookies && resp.cookies) {
config.headers['Cookie'] = cookies.write(Object.assign(resp.cookies, cookies.read(headers['set-cookie'])));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me this could all be consolidated on line 422. If the user wants to follow_set_cookies, then parse_cookies has to be true anyways.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. No need to re-parse the header.

@tomas
Copy link
Owner

tomas commented Oct 25, 2016

Ok, I took @icemilo's code and refactored it to remove duplication and support older versions of Node (where Object.assign is not available).

I also made sure that if the last request (non-redirect) doesn't contain cookies, resp.cookies should still return the ones that were set by previous (redirect) requests. This implementation would simply forget the previous/stored ones, which I feel is a key part in this whole "remember cookies" mumbo jumbo.

@tomas tomas closed this Oct 25, 2016
@rchipka
Copy link

rchipka commented Oct 25, 2016

@tomas Good catch. Glad this issue is resolved. Hopefully no response cookies are lost anymore!

@icemilo
Copy link
Author

icemilo commented Oct 25, 2016

@rchipka Thanks for code review, I guess I need to be more careful! @tomas Thanks for merging this PR, although my code wasn't perfect. Thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants