
403 Forbidden #23

Closed
benboughton1 opened this issue Aug 8, 2016 · 17 comments

Comments

@benboughton1

Is anyone else experiencing 'USGS not currently responding to requests'?

When catching the error it is giving me a 403 Forbidden.

My username and password work in the Earth Explorer web interface when I try to download the exact file using the link that LANDSAT-Download generates.

I have logged out of Earth Explorer before using this script as well.

@dswanepoel

dswanepoel commented Aug 9, 2016

I'm also getting a 403 Forbidden since yesterday when trying to log in via Python and wget.

@mkmitchell
Contributor

I'm also experiencing this error.

@greenspin

Same error here.

@dswanepoel

dswanepoel commented Aug 9, 2016

This may be caused by the addition of a CSRF token in the ERS login form. I don't recall that being present before.

@olivierhagolle
Owner

Dear all,
Thanks for signalling the change in USGS policy.
I am on holiday with a poor connection. I can try that next week, but I am not sure I know how to handle such a token in a Python login. Does any of you know?
Best regards,
Olivier

@mkmitchell
Contributor

Can do. I have a working example for the no_proxy case, so I'll change the proxy version to match what I have, but I can't test it.

@olivierhagolle
Owner

Great! Thanks a lot! Please open a pull request when you are ready. I'll try it as soon as possible, and will try to implement the proxy part, at least with CNES's proxy (which is a hard one).
Best regards
Olivier

@timburgess

timburgess commented Aug 11, 2016

Looking at the form HTML, it appears there are two hidden fields, csrf_token and __ncforminfo. I imagine that both of those would have to be supplied in the form POST.

[Attached screenshot: screen shot 2016-08-12 at 9 28 17 am]

@greenspin

Although I'm using Java with Apache HttpClient to download the Landsat files, I was facing the same problem there. Here is how I got it running again; it might be helpful for you as well:
The __ncforminfo token is not important: the login works even without posting it. The csrf_token must be read out and submitted again. The important change for me was to send the whole header information again when posting the username and password together with the csrf token. Here is the Java code, for completeness:

HttpClientContext context = HttpClientContext.create();
CookieStore cookieStore = new BasicCookieStore();
context.setCookieStore(cookieStore);
CloseableHttpClient client = HttpClientBuilder.create().build();
HttpGet get = new HttpGet("https://ers.cr.usgs.gov/login/");
get.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
HttpResponse response = client.execute(get, context);

Get the information for the csrf token from the response of the GET method.

List<NameValuePair> paramList = new ArrayList<NameValuePair>();
paramList.add(new BasicNameValuePair("username", user));
paramList.add(new BasicNameValuePair("password", pwd));
paramList.add(new BasicNameValuePair("csrf_token", csrf_token));
HttpPost post = new HttpPost("https://ers.cr.usgs.gov/login/");
post.setHeaders(get.getAllHeaders());
UrlEncodedFormEntity urlEncodedFormEntity = new UrlEncodedFormEntity(paramList, "UTF-8");
post.setEntity(urlEncodedFormEntity);
HttpResponse response2 = client.execute(post, context);

This gives me a 302, and I am then ready to download the files.
Hope this will help. Good luck, Gunther
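
For readers of the Python script, the flow described above (GET the login page, read csrf_token, POST it back with the same headers and cookies) could be sketched roughly as follows. This is a minimal Python 3 sketch, not the script's actual code (the thread's own snippets are Python 2): the function names `extract_csrf_token`, `build_login_post` and `login` are hypothetical, the URL and Accept header are copied from the Java code, and the form field names are the ones reported in this thread.

```python
import re
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

LOGIN_URL = "https://ers.cr.usgs.gov/login/"  # URL as given in the thread
# Same Accept header as in the Java example; reused for both GET and POST.
BROWSER_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
}


def extract_csrf_token(html):
    """Return the csrf_token value from the login page HTML, or None."""
    m = re.search(r'<input [^>]*?name="csrf_token"[^>]*?value="([^"]*)"', html)
    return m.group(1) if m else None


def build_login_post(html, user, pwd, headers=BROWSER_HEADERS):
    """Build (but do not send) the login POST, reusing the GET headers."""
    token = extract_csrf_token(html)
    if token is None:
        raise ValueError("csrf_token not found in login page")
    body = urllib.parse.urlencode(
        {"username": user, "password": pwd, "csrf_token": token}
    ).encode("utf-8")
    return urllib.request.Request(LOGIN_URL, data=body, headers=dict(headers))


def login(user, pwd):
    """GET then POST through one cookie-aware opener (needs network access)."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    get = urllib.request.Request(LOGIN_URL, headers=BROWSER_HEADERS)
    with opener.open(get) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return opener.open(build_login_post(html, user, pwd))
```

Unlike the Java code, which reports the 302 itself, urllib follows redirects by default, so the response returned by `login` would be the page reached after the redirect.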

@mkmitchell
Contributor

I did a quick fix in case anyone needs this. I'm sure Olivier will make this much cleaner.
I had to pip install BeautifulSoup to parse the html.

If you need this going asap this works for me.

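import sys
import urllib
import urllib2
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3; with bs4, use "from bs4 import BeautifulSoup"

def connect_earthexplorer_no_proxy(usgs):
    # Keep cookies across requests so the session survives the login
    cookies = urllib2.HTTPCookieProcessor()
    opener = urllib2.build_opener(cookies)
    urllib2.install_opener(opener)

    # Fetch the login form and read the hidden csrf_token field
    soup = BeautifulSoup(urllib2.urlopen("https://ers.cr.usgs.gov/login").read())
    token = soup.find('input', {'name': 'csrf_token'})
    # Post the credentials together with the token
    params = urllib.urlencode(dict(username=usgs['account'], password=usgs['passwd'], csrf_token=token['value']))
    request = urllib2.Request("https://ers.cr.usgs.gov/login", params, headers={})
    f = urllib2.urlopen(request)
    data = f.read()
    f.close()
    # The sign-in prompt in the response means the login was rejected
    if data.find('You must sign in as a registered user to download data or place orders for USGS EROS products')>0 :
        print "Authentification failed"
        sys.exit(-1)
    return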

@olivierhagolle
Owner

Thanks a lot, Mike. It looks much simpler now, and it works!
I did not know about the BeautifulSoup library. The only drawback is that we need to install it.
Olivier

@dswanepoel

Here is an alternative using regex (not as robust, but with no external dependency):

import re
...
data = urllib2.urlopen("https://ers.cr.usgs.gov/login").read()
m = re.search(r'<input .*?name="csrf_token".*?value="(.*?)"', data)
if m:
    token = m.group(1)

Another possible alternative that doesn't require external dependencies is https://docs.python.org/2/library/htmlparser.html
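
As a rough illustration of that parser-based alternative, here is a minimal sketch in Python 3, where the module is named `html.parser` (the link above is for the Python 2 `HTMLParser` module). The class and function names are hypothetical, and the sample HTML in the usage note is made up for illustration, not copied from the real login page.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```

With this sketch, `hidden_fields(page).get("csrf_token")` would return the token or `None`. Because the parser reads attributes as a set, it does not depend on `name` appearing before `value` or on double quotes, which is where the regex above is less robust.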

@mkmitchell
Contributor

Nice using regex! I was going to look into that today.

@Opadera

Opadera commented Aug 17, 2016

@mkmitchell
I get "TypeError: 'module' object is not callable" using your code posted above. Any idea how to solve this?
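
This message is what Python raises when a module object is called as if it were a function or class. One possible cause here, which is an assumption since the thread does not show which import Opadera used, is `import BeautifulSoup`, which binds the name to the module rather than to the class inside it. The Python 3 sketch below reproduces the same kind of error with a standard-library module standing in for BeautifulSoup; `call_name` is a hypothetical helper.

```python
import json  # stands in for any module; used here only to trigger the error


def call_name(obj):
    """Try to call obj with no arguments; return the error text, or None."""
    try:
        obj()
    except TypeError as exc:
        return str(exc)
    return None


module_error = call_name(json)             # calling the module itself fails
class_error = call_name(json.JSONDecoder)  # calling a class inside it works
print(module_error)
```

If that is the cause, importing the class explicitly should help: `from BeautifulSoup import BeautifulSoup` for BeautifulSoup 3, or `from bs4 import BeautifulSoup` for BeautifulSoup 4.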

@olivierhagolle
Owner

olivierhagolle commented Aug 17, 2016

I am testing dswanepoel's suggestion, which seems to work well. That will let us avoid the BeautifulSoup dependency (no Soup in summer ;) )

I will push the new version soon. I still need to test the proxy version.
Olivier

@olivierhagolle
Owner

Done.

@christophe-06

Dear Mr. Hagolle,
I used the Landsat-8 download script (with a list of products as input) a few months ago without any problem. Today, after one or two downloads, I get this error: "CSRF_Token not found". Is it a limitation on the USGS side? Thanks for your help.
Christophe
