Python scraper with no changes started failing with ssl error 'certificate verify failed' on 1 Oct 2021 #1276

Open
cofiem opened this issue Oct 17, 2021 · 1 comment

cofiem commented Oct 17, 2021

A scraper I built started failing with an SSL error on 1 Oct 2021.

The scraper code had not changed, and I'm not sure how to fix this.

Injecting configuration and compiling...
-----> Python app detected
-----> Installing python-3.6.2
-----> Installing pip
-----> Installing requirements with pip
       Collecting lxml==4.6.3
       Downloading lxml-4.6.3-cp36-cp36m-manylinux2014_x86_64.whl (6.3 MB)
       Collecting requests==2.26.0
       Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB)
       Collecting certifi>=2017.4.17
       Downloading certifi-2021.10.8-py2.py3-none-any.whl (149 kB)
       Collecting charset-normalizer~=2.0.0
       Downloading charset_normalizer-2.0.7-py3-none-any.whl (38 kB)
       Collecting idna<4,>=2.5
       Downloading idna-3.3-py3-none-any.whl (61 kB)
       Collecting urllib3<1.27,>=1.21.1
       Downloading urllib3-1.26.7-py2.py3-none-any.whl (138 kB)
       Installing collected packages: urllib3, idna, charset-normalizer, certifi, requests, lxml
       Successfully installed certifi-2021.10.8 charset-normalizer-2.0.7 idna-3.3 lxml-4.6.3 requests-2.26.0 urllib3-1.26.7

-----> Discovering process types
       Procfile declares types -> scraper
Injecting scraper and running...
Reading petition list
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/connection.py", line 426, in connect
    tls_in_tls=tls_in_tls,
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 450, in ssl_wrap_socket
    sock, context, tls_in_tls, server_hostname=server_hostname
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/app/.heroku/python/lib/python3.6/ssl.py", line 401, in wrap_socket
    _context=self, _session=session)
  File "/app/.heroku/python/lib/python3.6/ssl.py", line 808, in __init__
    self.do_handshake()
  File "/app/.heroku/python/lib/python3.6/ssl.py", line 1061, in do_handshake
    self._sslobj.do_handshake()
  File "/app/.heroku/python/lib/python3.6/ssl.py", line 683, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:748)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/app/.heroku/python/lib/python3.6/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='epetitions.brisbane.qld.gov.au', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:748)'),))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scraper.py", line 254, in <module>
    petitions.run()
  File "scraper.py", line 42, in run
    petition_list_page = self.download_html(self.petition_list)
  File "scraper.py", line 208, in download_html
    page = requests.get(url)
  File "/app/.heroku/python/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/app/.heroku/python/lib/python3.6/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/app/.heroku/python/lib/python3.6/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/app/.heroku/python/lib/python3.6/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/app/.heroku/python/lib/python3.6/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='epetitions.brisbane.qld.gov.au', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:748)'),))
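
The traceback bottoms out in the standard library's ssl module, so the OpenSSL build linked into this Python 3.6.2 install and the CA bundle in use may both matter. The following is a minimal diagnostic sketch, assuming it runs in the same environment as the scraper. `tls_environment` is a hypothetical helper name, not part of the scraper, and it only reports what is installed; it does not fix anything.

```python
import ssl
import sys


def tls_environment():
    """Collect interpreter and OpenSSL details relevant to certificate verification."""
    try:
        # certifi is third-party; requests uses its CA bundle by default.
        import certifi
        bundle = certifi.where()
    except ImportError:
        bundle = None
    return {
        "python": sys.version.split()[0],
        "openssl": ssl.OPENSSL_VERSION,
        "default_cafile": ssl.get_default_verify_paths().cafile,
        "certifi_bundle": bundle,
    }


if __name__ == "__main__":
    for key, value in tls_environment().items():
        print(f"{key}: {value}")
```

Posting this output alongside the traceback would show which OpenSSL version performed the failing handshake.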

cofiem commented Oct 17, 2021

Ah, the problem might be the expiration of the DST Root CA X3 root certificate, which Let's Encrypt certificates chained to, on 30 Sept 2021.

https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/
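
To check the timing, here is a small sketch that compares a certificate's expiry against a given moment. It assumes the `notAfter` value of DST Root CA X3 is `Sep 30 14:01:15 2021 GMT`; that value is not taken from this issue. `is_expired` is a hypothetical helper, and the Brisbane offset of UTC+10 is likewise an assumption about how the failure date was read.

```python
import ssl
from datetime import datetime, timedelta, timezone

# Assumed notAfter of DST Root CA X3 (not stated in this issue).
DST_ROOT_CA_X3_NOT_AFTER = "Sep 30 14:01:15 2021 GMT"

# Assumed local offset for Brisbane (UTC+10).
BRISBANE = timezone(timedelta(hours=10))


def is_expired(not_after, at):
    """Return True if a certificate with this notAfter string has expired at `at`."""
    expiry_epoch = ssl.cert_time_to_seconds(not_after)
    return at.timestamp() > expiry_epoch


def expiry_local(not_after, tz):
    """Return the expiry moment as an aware datetime in the given timezone."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=tz)


if __name__ == "__main__":
    issue_opened = datetime(2021, 10, 17, tzinfo=timezone.utc)
    print(is_expired(DST_ROOT_CA_X3_NOT_AFTER, issue_opened))
    print(expiry_local(DST_ROOT_CA_X3_NOT_AFTER, BRISBANE).isoformat())
```

Under those assumptions the expiry falls just after midnight on 1 Oct 2021 in Brisbane, which would be consistent with the failure date in the title.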
