Add User-Agent for open call #20

Closed
wants to merge 1 commit into from

Conversation

krisdigital

Got 403 when trying to identify https://www.ultimate-guitar.com/modules/rss/all_updates.xml.php

Maybe solves #19?

@geoffw8

geoffw8 commented Jan 5, 2019

I mean, this repo has nothing to do with me, but your fix works so I've approved it. Ideally it would have a spec, but I haven't looked at the rest of the coverage. I can't merge anyway, so it's up to @damog.

@damog
Owner

damog commented Jan 5, 2019

I disagree that this is how the user agent should be overridden. Ideally, a set of headers is passed over to override the request, and not force all subsequent ones to Mozilla or whatever. Happy to look into it but happier to receive a patch that would do just that.
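
A minimal sketch of the per-request approach described above, not code from this pull request. `FeedFetcher` and its `headers:` parameter are hypothetical names chosen for illustration. The sketch relies only on open-uri's documented behaviour of treating String keys in the options hash as request header fields, and it uses `URI.open` because the bare `open(url)` form shown elsewhere in this thread no longer handles URLs on Ruby 3.0 and later.

```ruby
require "open-uri"

# Hypothetical wrapper (not part of Feedbag): headers are supplied by the
# caller and applied per instance, so nothing is forced on other requests.
class FeedFetcher
  attr_reader :headers

  # headers: a Hash of header name => value, e.g. { "User-Agent" => "feedbag" }.
  # With the default (empty) Hash, open-uri's own defaults stay untouched.
  def initialize(headers: {})
    @headers = headers.dup.freeze
  end

  # Fetches the URL and returns the response body as a String.
  # open-uri treats String keys in the Hash as request header fields.
  def fetch(url)
    URI.open(url, headers, &:read)
  end
end

# Usage (performs a real network request, so it is left commented out):
# fetcher = FeedFetcher.new(headers: { "User-Agent" => "feedbag" })
# fetcher.fetch("https://www.ultimate-guitar.com/modules/rss/all_updates.xml.php")
```

Callers who pass no headers get the default open-uri behaviour, while a caller hitting a server that rejects the default user agent can override it for that fetcher only.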

@krisdigital
Author

@damog I don't know if it is overkill to make the accept-header customizable, or how many feeds have this problem. But it is strange behaviour, considering:

curl https://www.ultimate-guitar.com/modules/rss/all_updates.xml.php -I -H 'User-Agent:'
HTTP/2 200 
date: Sun, 06 Jan 2019 21:33:10 GMT
content-type: text/xml; charset=ISO-8859-1
server: nginx
...
open("https://www.ultimate-guitar.com/modules/rss/all_updates.xml.php")
OpenURI::HTTPError: 403 Forbidden

 open("https://www.ultimate-guitar.com/modules/rss/all_updates.xml.php", "User-Agent" => "feedbag")
=> #<Tempfile:/var/folders/61/d5yqnx1x20v2gn095pmqcpfc0000gn/T/open-uri20190106-20239-18y8z4h>

So curl works without a user-agent, but open does not. It does not matter which user-agent is used: as long as it is not empty, it works.

@damog
Owner

damog commented Jan 12, 2019

@krisdigital: I tried debugging this for a bit today without much luck. Is there a simple way to see the full request headers from OpenURI?

@krisdigital
Author

@damog It is mysterious. Here is what I found out:

Open-Uri default headers

Accept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3
Accept: */*
User-Agent: Ruby
Host: localhost:8000

curl default headers

Host: localhost:8000
User-Agent: curl/7.54.0
Accept: */*

I used a little Python script: https://gist.github.com/phrawzty/62540f146ee5e74ea1ab

So in my case, the server I was trying to speak to seems to have a problem with the user agent "Ruby". 💎
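
The header comparison above can also be reproduced in Ruby alone. The following is a rough sketch, not the script from the gist: `capture_request_headers` is a hypothetical helper that starts a one-shot local TCP server, points open-uri at it, and returns the request headers it received. It assumes loopback connections are allowed and passes `proxy: false` so that a proxy configured in the environment is not used for the local request.

```ruby
require "open-uri"
require "socket"

# Hypothetical helper: returns the headers open-uri sends, as a Hash with
# lower-cased header names (HTTP header names are case-insensitive).
def capture_request_headers(extra_headers = {})
  server = TCPServer.new("127.0.0.1", 0) # port 0: the OS picks a free port
  port = server.addr[1]
  captured = {}

  listener = Thread.new do
    client = server.accept
    client.gets # skip the request line, e.g. "GET / HTTP/1.1"
    # Header lines follow until the first empty line.
    while (line = client.gets) && line != "\r\n"
      name, value = line.chomp.split(": ", 2)
      captured[name.downcase] = value
    end
    client.write("HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n" \
                 "Content-Length: 2\r\nConnection: close\r\n\r\nok")
    client.close
  end

  # String keys are sent as headers; the Symbol key :proxy is an open-uri option.
  URI.open("http://127.0.0.1:#{port}/", extra_headers.merge(proxy: false)).close
  listener.join
  captured
ensure
  server&.close
end

# Usage:
# capture_request_headers.each { |name, value| puts "#{name}: #{value}" }
# capture_request_headers("User-Agent" => "feedbag")["user-agent"]
```

Called without arguments, this should show the same defaults as the list above, including a user agent of "Ruby". Passing a `"User-Agent"` entry shows the overridden value instead.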

@damog
Owner

damog commented Jan 19, 2019

@krisdigital -- interesting. I guess the user agent could be sent as Feedbag, but I'd hate to break the simplicity of using the default open-uri behavior. Regardless, if that server in particular doesn't like Ruby, well, I guess it's ultimately up to the admin whether to allow automated access.

@krisdigital
Author

@damog Understandable! I think we can close the ticket then. Thanks for looking into it!
