Problem downloading a file via http: SSLCertVerificationError #186

Closed
MarkWieczorek opened this issue Jun 21, 2020 · 4 comments

@MarkWieczorek

I am having problems using pooch to download files of the Earth's magnetic field from the SWARM mission that are available on an ESA web site. The URL is

https://swarm-diss.eo.esa.int/?do=download&file=swarm%2FLevel2longterm%2FMLI%2FSW_OPER_MLI_SHA_2D_00000000T000000_99999999T999999_0501.ZIP

and the errors include

SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)

and

SSLError: HTTPSConnectionPool(host='swarm-diss.eo.esa.int', port=443): Max retries exceeded with url: /?do=download&file=swarm%2FLevel2longterm%2FMLI%2FSW_OPER_MLI_SHA_2D_00000000T000000_99999999T999999_0501.ZIP (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)')))

I tried using Python's requests to do this manually, and got similar errors.

With requests, I found that I could specify the option verify=False, which the docs describe as follows:

verify – (optional) Either a boolean, in which case it controls whether we verify the server’s TLS certificate, or a string, in which case it must be a path to a CA bundle to use. Defaults to True.

So, if there is no simple solution to this problem, perhaps pooch could let you specify verify as an optional argument that is then passed on to requests. I understand that this is not ideal, and it generates the following warning:

response = requests.get('https://swarm-diss.eo.esa.int/?do=download&file=swarm%2FLevel2longterm%2FMLI%2FSW_OPER_MLI_SHA_2D_00000000T000000_99999999T999999_0501.ZIP', verify=False)
/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning,

However, if we are verifying the hash of the file against a known value, a corrupted or tampered download would still be caught.
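To make the hash argument concrete, here is a minimal sketch of that kind of check using only the standard library. The helper name `sha256_matches` is hypothetical and is not part of pooch; it only illustrates comparing a file's SHA256 digest with a value you already trust.

```python
import hashlib
import os
import tempfile


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the SHA256 digest of the file equals expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# Demonstration on a throwaway file instead of a real download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name

trusted_hash = hashlib.sha256(b"example payload").hexdigest()
print(sha256_matches(demo_path, trusted_hash))  # True
print(sha256_matches(demo_path, "0" * 64))      # False
os.remove(demo_path)
```

The check is only as good as the source of the trusted hash: if the hash comes from the same unverified connection as the file, it does not add protection.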

@leouieda
Member

@MarkWieczorek you can pass arguments to requests.get through keyword arguments of HTTPDownloader:

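import pooch

# Keyword arguments given to HTTPDownloader are forwarded to requests.get
downloader = pooch.HTTPDownloader(verify=False)
fname = pooch.retrieve(url, hash, downloader=downloader)
# or, with an existing Pooch instance:
fname = POOCH.fetch("mefile.txt", downloader=downloader)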

They are probably using an expired SSL certificate. Do you get the same error when accessing the website through a browser?

Alternatively, you could try using http instead of https if the server allows it.
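Since verify also accepts a path to a CA bundle (per the requests docs quoted above), a variant that keeps verification switched on might look like the sketch below. The function name `fetch_with_ca_bundle` and the `ca_bundle` argument are hypothetical, and it assumes you have a PEM file containing the certificate chain for the server; whether that resolves this particular ESA server's error is untested.

```python
def fetch_with_ca_bundle(url, known_hash, ca_bundle):
    """Download url with pooch, verifying TLS against a custom CA bundle.

    ca_bundle is assumed to be a path to a PEM file holding the
    certificate chain for the server (hypothetical, supplied by you).
    """
    import pooch  # third-party; imported here so the sketch loads without it

    # verify=<path> is forwarded to requests.get, like verify=False above.
    downloader = pooch.HTTPDownloader(verify=ca_bundle)
    return pooch.retrieve(url, known_hash=known_hash, downloader=downloader)
```

Compared with verify=False, this keeps the certificate check in place and produces no InsecureRequestWarning.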

@MarkWieczorek
Author

Thanks. That worked.

For reference, the file downloads fine from a web browser, but I had the same problem with both http and https when using requests (which is odd, since I didn't think that http used SSL). I am also in contact with the people in charge of the ESA web site to see if they can fix this on their end.

The only things that would be useful in this circumstance would be to

  • add this flag to the "custom downloaders" section in the web documentation, and
  • have the possibility to suppress the warning message (but I didn't see the option to do this with requests.)

@leouieda
Member

Yeah, this is definitely something that should be fixed on their end as well.

> have the possibility to suppress the warning message (but I didn't see the option to do this with requests.)

I imagine they are using the warnings built-in package, which you can use to suppress the warnings. But it might be good to keep them so you notice when they fix the issue.
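As a rough sketch of what that could look like with the built-in warnings package: the context manager below silences one warning category only inside a block. The names `ignoring`, `DemoInsecureWarning`, and `noisy_download` are hypothetical stand-ins so the example runs without any network access; in real use the category would presumably be urllib3.exceptions.InsecureRequestWarning and the body would be the pooch or requests call.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def ignoring(category):
    """Temporarily ignore warnings of the given category."""
    with warnings.catch_warnings():
        # The filter is removed again when the block exits.
        warnings.simplefilter("ignore", category=category)
        yield


class DemoInsecureWarning(Warning):
    """Stand-in for urllib3.exceptions.InsecureRequestWarning."""


def noisy_download():
    """Stand-in for a download made with verify=False."""
    warnings.warn("Unverified HTTPS request is being made.", DemoInsecureWarning)
    return "payload"


with ignoring(DemoInsecureWarning):
    result = noisy_download()  # no warning is reported inside this block
```

Scoping the filter to a block means the warning is still reported for any other unverified request made elsewhere in the program.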

> add this flag to the "custom downloaders" section in the web documentation

Agreed! See #187

Closing this as it seems to have been resolved.

@MarkWieczorek
Author

I found two solutions to this problem:

  1. It turns out that ESA has an FTP mirror. Using ftp instead of https works with no problems.
  2. Setting verify=False works, but it generates an annoying warning that is unnecessary since we verify the file hash. This warning can be disabled using the command requests.packages.urllib3.disable_warnings(requests.packages.urllib3.exceptions.InsecureRequestWarning)
