install trouble - error with pip #156

Closed
aoneill-usgs opened this issue Jun 14, 2023 · 13 comments

@aoneill-usgs

Describe the Question

I can't complete a clean install and am having trouble with pip. Installation instructions #1-#3 run fine; I only encounter the error once I call pip in step #4.
I get an SSL: CERTIFICATE_VERIFY_FAILED error, even though I have confirmed that the right SSL certificates are loaded (a git clone succeeds on this computer). Any advice or tips are appreciated.

Screenshots

Screenshot 2023-06-14 153650

ALL software version info

Dell Desktop

  • OS: Windows 10 Enterprise
  • Using Chrome and Edge
@2320sharon 2320sharon added the installation Any problems or questions about installation label Jun 14, 2023
@dbuscombe-usgs
Member

Andy, I have the exact same problem. I never try to use pip on the DOI network (I just do it at home). But I'm looking into it and hopefully I'll have some better instructions later

@aoneill-usgs
Author

aoneill-usgs commented Jun 14, 2023 via email

@dbuscombe-usgs
Member

I did this first to see where my pip.ini file was

(base) PS C:\Users\dbuscombe> pip config -v list
For variant 'global', will try loading 'C:\ProgramData\pip\pip.ini'
For variant 'user', will try loading 'C:\Users\dbuscombe\pip\pip.ini'
For variant 'user', will try loading 'C:\Users\dbuscombe\AppData\Roaming\pip\pip.ini'
For variant 'site', will try loading 'C:\Users\dbuscombe\AppData\Local\miniconda3\pip.ini'

None of those files existed, so I made a file C:\Users\dbuscombe\pip\pip.ini and copied this into it

[user]
	name = Dan Buscombe
	email = dbuscombe@gmail.com
[core]
	autocrlf = true
[credential "helperselector"]
	selected = <no helper>
[http]
	sslCAInfo = C:\Users\dbuscombe\Documents\DOIRootCA2.cer
[global] 
    cert = C:\Users\dbuscombe\Documents\DOIRootCA2.cer

I could then run pip commands without SSL warnings, e.g. pip -v list or python -m pip install --upgrade pip
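For reference, here is a minimal sketch of writing a pip.ini that contains only the certificate setting. It assumes that the `[global]` `cert` entry is the part pip acts on, and that the other sections shown above were carried over from a git config. The helper name `write_pip_ini` and the temporary directory are illustrative only; on a real machine the target would be one of the paths printed by `pip config -v list`.

```python
import configparser
import tempfile
from pathlib import Path


def write_pip_ini(ini_path: Path, cert_file: str) -> Path:
    """Write a pip.ini whose [global] section points pip at a CA certificate.

    Hypothetical helper for illustration; not part of pip or coastseg.
    """
    # interpolation=None keeps Windows paths literal (no '%' handling)
    config = configparser.ConfigParser(interpolation=None)
    config["global"] = {"cert": cert_file}
    ini_path.parent.mkdir(parents=True, exist_ok=True)
    with open(ini_path, "w") as f:
        config.write(f)
    return ini_path


# Demonstration in a throwaway directory rather than the real user profile
demo_dir = Path(tempfile.mkdtemp())
demo_ini = write_pip_ini(
    demo_dir / "pip" / "pip.ini",
    r"C:\Users\dbuscombe\Documents\DOIRootCA2.cer",
)
print(demo_ini.read_text())
```

Running `pip config -v list` afterwards should show whether pip picked the file up from the location you chose.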

@dbuscombe-usgs
Member

dbuscombe-usgs commented Jun 15, 2023

Further details. I made sure that my pip.ini, .condarc and .gitconfig were present and consistent. These are the paths and contents of those files, for reference:

c:\Users\me\pip.ini

[user]
	name = Dan Buscombe
	email = dbuscombe@gmail.com
[core]
	autocrlf = true
[credential "helperselector"]
	selected = <no helper>
[http]
	sslCAInfo = C:\Users\dbuscombe\Documents\DOIRootCA2.cer
[global] 
    cert = C:\Users\dbuscombe\Documents\DOIRootCA2.cer

c:\Users\me\AppData\Local\miniconda3\.condarc

ssl_verify: C:\Users\dbuscombe\Documents\DOIRootCA2.cer

c:\Users\me\.condarc

channels:
  - conda-forge
  - defaults
ssl_verify: C:\Users\dbuscombe\Documents\DOIRootCA2.cer
solver: libmamba

c:\Users\me\.gitconfig

[user]
name = dbuscombe-usgs
email = dbuscombe@gmail.com
[credential]
	helper = wincred
[filter "lfs"]
	clean = git-lfs clean -- %f
	smudge = git-lfs smudge -- %f
	process = git-lfs filter-process
	required = true
[http]
	sslCAInfo = C:\\Users\\dbuscombe\\Documents\\DOIRootCA2.cer

But I still get ssl errors when I try to download data from zenodo (required for creating ROIs)

⚠️Error
⚠️HTTPSConnectionPool(host='zenodo.org', port=443): Max retries exceeded with url: /record/7814755/files/global_shoreline_5deg_327.geojson?download=1 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)')))

Troubleshooting this, I am able to make the following unverified server request

var = requests.get("https://zenodo.org/record/7786276/files/global_shoreline_5deg_0.geojson?download=1", verify=False)

(response from server = 200 i.e. OK)

So I propose the following changes to make this work:

with requests.get(url, stream=True, verify=False) as response: on
https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/downloads.py#L33

with requests.get(url, stream=True, verify=False) as r: on
https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/common.py#L907

response = requests.get(root_url, verify=False) on https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/zoo_model.py#L232

@dbuscombe-usgs
Member

It may also require this

import requests
from urllib3.exceptions import InsecureRequestWarning

# Suppress only the single warning from urllib3 needed.
requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)

https://stackoverflow.com/questions/15445981/how-do-i-disable-the-security-certificate-check-in-python-requests

@2320sharon
Collaborator

2320sharon commented Jun 20, 2023

Hi @dbuscombe-usgs

Thank you for identifying the cause of the problem: SSL certificate verification is failing because of the restricted network. Your solution does work, but disabling verification leaves the user vulnerable to man-in-the-middle attacks.

Luckily, there is another option. We can provide the path to the same .cer file that was used in pip.ini and use it to verify the request. Unfortunately, I don't have the same setup as you, so I can't test this.

Can you run this simple script and see if it works? Hopefully this takes care of the ssl verification error.

All you need to do is:

  • Modify the cert_path
  • Run the script
import requests

# Path to your .cer file
cert_path = "C:\\Users\\user\\Documents\\DOIRootCA2.cer"

# The URL you want to download the data from
url = "https://zenodo.org/record/7814755/files/global_shoreline_5deg_327.geojson?download=1"

# Use the 'verify' parameter to specify your custom certificate
response = requests.get(url, verify=cert_path)

# Now you can check if this solution worked
print(response)

@dbuscombe-usgs
Member

I can confirm that this works on my network. So, going forward, it appears the user will have to provide their certificate path to download any zenodo files. This has to happen for any functionality of the tool (e.g. image downloading). I think this will require modifying existing code in the coastseg package.

So I propose the following changes to make this work:

with requests.get(url, stream=True, verify=cert_path) as response: on
https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/downloads.py#L33

with requests.get(url, stream=True, verify=cert_path) as r: on
https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/common.py#L907

response = requests.get(root_url, verify=cert_path) on https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/zoo_model.py#L232

On the first change, is there any disadvantage to providing a new argument to def download_url_dict(url_dict): on https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/downloads.py#L31, i.e. def download_url_dict(url_dict, cert_path)

Same idea with https://github.com/Doodleverse/CoastSeg/blob/76937f8737cb3944bfab267b7e6f2074c232cc47/src/coastseg/common.py#LL71C1-L71C79, just changing this line def get_filtered_files_dict(directory:str, file_type:str, sitename:str)->dict: to def get_filtered_files_dict(directory:str, file_type:str, sitename:str, cert_path:str)->dict:

The issue is that these would have to be optional arguments. I think the default would be an empty string, because this won't affect all users.
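A minimal sketch of how such an optional argument could be mapped onto the `verify` parameter. The helper name `resolve_verify` is hypothetical and not part of coastseg. The sketch assumes it is safer to convert an empty string into `True` (requests' default) than to pass it straight through, since a falsy `verify` value may be treated as "do not verify".

```python
from typing import Union


def resolve_verify(cert_path: str = "") -> Union[str, bool]:
    """Return the value to pass as `verify=` to requests.get.

    Hypothetical helper: an empty cert_path means the user has no custom
    certificate, so fall back to True (normal verification).
    """
    return cert_path if cert_path else True


# Users on an unrestricted network never pass cert_path
print(resolve_verify())
# Users behind an intercepting proxy pass the path to their root certificate
print(resolve_verify("C:\\Users\\user\\Documents\\DOIRootCA2.cer"))
```

Inside a download function the call would then look like `requests.get(url, stream=True, verify=resolve_verify(cert_path))`, so existing callers that omit the argument keep the current behaviour.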

@2320sharon
Collaborator

Hey @dbuscombe-usgs

Thanks for trying out the code; I'm happy to hear it works. Thanks also for pointing out the code that needed to change. I modified each of the functions so that, before the request is made, the code checks whether a file called certifications.json exists. If it does, the code reads the file and checks whether the cert_path (the path to the certificate file) exists. If the cert_path exists, it is passed as the verify parameter of the GET request; otherwise a standard GET request is made.

I added two new functions to common.py. get_cert_path_from_config returns the cert_path if both the config file and the file at cert_path exist. get_response returns the response from requests.get, using the cert file for verification when one is available.

Here is how the get_response function is used in download_url in common.py:

    # gets the response from the API and uses the cert file if it exists
    response = get_response(url, stream=True)
    with response as r:
        logger.info(r)
        if r.status_code == 404:

And here are the two new functions in common.py

import json
import os

import requests


def get_cert_path_from_config(config_file='certifications.json'):
    """
    Get the certification path from the given configuration file.

    This function checks if the configuration file exists, reads the config file contents, and gets the certification path.
    If the certification path found in the config file is a valid file, it returns the certification path. Otherwise,
    it returns an empty string.

    Args:
        config_file (str): The path to the configuration file containing the certification path. Default is 'certifications.json'.
    
    Returns:
        str: The certification path if the config file exists and has a valid certification path, else an empty string.
    """

    # Check if the config file exists
    if os.path.exists(config_file):
        # Read the config file
        with open(config_file, 'r') as f:
            config = json.load(f)
        
        # Get the cert path
        cert_path = config.get('cert_path')
        
        # If the cert path is a valid file, return it
        if cert_path and os.path.isfile(cert_path):
            return cert_path
    
    # If the config file doesn't exist, or the cert path isn't in it, or the cert path isn't a valid file, return an empty string
    return ''

def get_response(url, stream=True):
    """
    Get the response from the given URL with or without a certification path.

    This function uses the get_cert_path_from_config() function to get a certification path, then sends an HTTP request (GET) to the
    specified URL. The certification is used if available, otherwise the request is sent without it. The stream parameter
    defines whether or not the response should be loaded progressively, and is set to True by default.

    Args:
        url (str): The URL to send the request to.
        stream (bool): If True, loads the response progressively (default True).

    Returns:
        requests.models.Response: The HTTP response object.
    """
    cert_path = get_cert_path_from_config()
    if cert_path:  # if cert_path is not empty
        response = requests.get(url, stream=stream, verify=cert_path)
    else:  # if cert_path is empty
        response = requests.get(url, stream=stream)
    return response

I can't test this on my computer because I don't have the correct setup. Can you pull the issue_156 branch to your local repository, run pip install -e ., then modify the certifications.json to include the path to your cert, to see if this code works? I've already verified that the code works if no cert path is provided or the certifications.json does not exist.
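For anyone editing certifications.json by hand, here is a small sketch of generating it instead. The `cert_path` key is the one read by get_cert_path_from_config above. The helper name `write_cert_config`, the single-key layout, and the throwaway directory are assumptions for illustration. Writing the file with `json.dump` avoids mistakes with backslash escaping in Windows paths.

```python
import json
import tempfile
from pathlib import Path


def write_cert_config(cert_path: str, config_file: Path) -> Path:
    """Write a JSON config holding the certificate path (hypothetical helper)."""
    with open(config_file, "w") as f:
        # json.dump escapes backslashes, so Windows paths stay valid JSON
        json.dump({"cert_path": cert_path}, f, indent=2)
    return config_file


# Demonstration with a placeholder file standing in for the real .cer
demo_dir = Path(tempfile.mkdtemp())
fake_cert = demo_dir / "DOIRootCA2.cer"
fake_cert.write_text("placeholder, not a real certificate")
config_file = write_cert_config(str(fake_cert), demo_dir / "certifications.json")

# Read it back the same way get_cert_path_from_config does
with open(config_file, "r") as f:
    loaded = json.load(f)
print(loaded["cert_path"])
```

Because the default config_file in get_cert_path_from_config is a relative path, the real file presumably needs to sit in the directory the tool is run from.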

image

@dbuscombe-usgs
Member

Hi @2320sharon thanks for this. I will try your proposed workflow today

@dbuscombe-usgs
Member

So, I've never checked out a specific branch ... what am I doing wrong?

git checkout -b issue_156

then

git pull

?

(that doesn't seem to do anything)

@dbuscombe-usgs
Member

Never mind, for now I just downloaded the zipped version of the branch. I'll test the new workflow now.

@dbuscombe-usgs
Member

Success!!! Files now download from zenodo. We're back in business - thank you so much @2320sharon !

@2320sharon
Collaborator

This bug fix has been implemented in coastseg 0.0.71
