Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSL/TLS error in Django connecting to api.rollbar.com on exceptions #215

Open
taylor-cedar opened this issue Oct 30, 2017 · 17 comments
Open
Assignees
Milestone

Comments

@taylor-cedar
Copy link

taylor-cedar commented Oct 30, 2017

This error happens for about 2 out of every 3 requests to send an exception to rollbar. It occasionally works, but mostly doesn't. The following stack trace is printed only occasionally. Sometimes it will fail with no exception stack trace.

Traceback (most recent call last):
   File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
     chunked=chunked)
   File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 346, in _make_request
     self._validate_conn(conn)
   File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 850, in _validate_conn
     conn.connect()
   File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 326, in connect
     ssl_context=context)
   File "/usr/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 329, in ssl_wrap_socket
     return context.wrap_socket(sock, server_hostname=server_hostname)
   File "/usr/lib/python3.6/ssl.py", line 401, in wrap_socket
     _context=self, _session=session)
   File "/usr/lib/python3.6/ssl.py", line 808, in __init__
     self.do_handshake()
   File "/usr/lib/python3.6/ssl.py", line 1061, in do_handshake
     self._sslobj.do_handshake()
   File "/usr/lib/python3.6/ssl.py", line 683, in do_handshake
     self._sslobj.do_handshake()
 ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:749)

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 440, in send
     timeout=timeout
   File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
     _stacktrace=sys.exc_info()[2])
   File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
     raise MaxRetryError(_pool, url, error or ResponseError(cause))
 urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.rollbar.com', port=443): Max retries exceeded with url: /api/1/item/ (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:749)'),))

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/usr/lib/python3.6/site-packages/rollbar/__init__.py", line 1209, in _send_payload
     _post_api('item/', payload, access_token=access_token)
   File "/usr/lib/python3.6/site-packages/rollbar/__init__.py", line 1255, in _post_api
     verify=SETTINGS.get('verify_https', True))
   File "/usr/lib/python3.6/site-packages/rollbar/lib/transport.py", line 7, in post
     return _session.post(*args, **kw)
   File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 555, in post
     return self.request('POST', url, data=data, json=json, **kwargs)
   File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
     resp = self.send(prep, **send_kwargs)
   File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
     r = adapter.send(request, **kwargs)
   File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 506, in send
     raise SSLError(e, request=request)
 requests.exceptions.SSLError: HTTPSConnectionPool(host='api.rollbar.com', port=443): Max retries exceeded with url: /api/1/item/ (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:749)'),))

Python 3.6.1
Django 1.11.1
uwsgi 2.0.14
requests 2.18.4
open-ssl 1.0.2k 26 Jan 2017
Docker Alpine Linux

Please let me know if I can provide more info. It seems like some kind of race condition, but not sure, since it sometimes works. Maybe it's a thread safety issue?

@taylor-cedar
Copy link
Author

I am testing with the debug error view, so maybe related to this? #180 . Not sure because I can't see clubhouse.

@rivkahstandig3636
Copy link
Contributor

Hi @taylor-cedar, Are you still having this issue? If so, what version of pyrollbar are you on?

@rivkahstandig3636
Copy link
Contributor

Closing due to inactivity.

@iasci
Copy link

iasci commented Jun 1, 2018

I think this issue is still valid. We are experiencing the same problem with the following setup:

  • Ubuntu 16.04.4 LTS
  • OpenSSL 1.0.2g 1 Mar 2016
  • Django 1.10

I read some similar issues on other client libraries and seems like it can be related to thread (un)safety of requests's sessions. See: https://stackoverflow.com/questions/18188044/is-the-session-object-from-pythons-requests-library-thread-safe

@mjhanke
Copy link

mjhanke commented Sep 18, 2019

Mentioned in #180, but we're running django version 2.1.2 and rollbar version 0.14.5 and are still seeing a very similar traceback as mentioned above:

[ERROR]	2019-09-17T04:19:55.506Z	d8024161-f0a6-4e42-b421-dbdf8ad4cb6c	Exception while posting item SSLError(MaxRetryError("HTTPSConnectionPool(host='api.rollbar.com', port=443): Max retries exceeded with url: /api/1/item/ (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))",),)
Traceback (most recent call last):
  File "/var/task/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/var/task/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/var/task/urllib3/connectionpool.py", line 850, in _validate_conn
    conn.connect()
  File "/var/task/urllib3/connection.py", line 326, in connect
    ssl_context=context)
  File "/var/task/urllib3/util/ssl_.py", line 329, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/var/lang/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/var/lang/lib/python3.6/ssl.py", line 817, in __init__
    self.do_handshake()
  File "/var/lang/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/var/lang/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:852)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/task/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/var/task/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/var/task/urllib3/util/retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.rollbar.com', port=443): Max retries exceeded with url: /api/1/item/ (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/task/rollbar/__init__.py", line 1336, in _send_payload
    _post_api('item/', payload_str, access_token=access_token)
  File "/var/task/rollbar/__init__.py", line 1385, in _post_api
    proxy_password=SETTINGS.get('http_proxy_password'))
  File "/var/task/rollbar/lib/transport.py", line 33, in post
    return _session().post(*args, proxies=proxies, **kw)
  File "/var/task/requests/sessions.py", line 555, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/var/task/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/var/task/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/var/task/requests/adapters.py", line 506, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='api.rollbar.com', port=443): Max retries exceeded with url: /api/1/item/ (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))

Other useful context might be that this code is executing in an AWS lambda function environment

@bxsx bxsx closed this as completed Nov 15, 2021
@bxsx bxsx reopened this Nov 15, 2021
@subodhpareek18
Copy link

Facing the same issue while on flask. It's working fine when running on my local, but when deployed to AWS inside a docker container get an error.

Flask==0.10.1
rollbar~=0.15
[pid: 30|app: 0|req: 4/19] 69.162.124.233 () {46 vars in 720 bytes} [Fri Feb  4 07:51:23 2022] HEAD / => generated 0 bytes in 4 msecs (HTTP/1.1 302) 4 headers in 407 bytes (0 switches on core 0)
ERR0204 07:46:23.343 31 __init__.py:1499] Exception while posting item SSLError(SSLError(SSLEOFError(8, u'EOF occurred in violation of protocol (_ssl.c:590)'),),)
Traceback (most recent call last):
  File "/home/tvm/backend/venv/lib/python2.7/site-packages/rollbar/__init__.py", line 1497, in _send_payload
    _post_api('item/', payload_str, access_token=access_token)
  File "/home/tvm/backend/venv/lib/python2.7/site-packages/rollbar/__init__.py", line 1552, in _post_api
    proxy_password=SETTINGS.get('http_proxy_password'))
  File "/home/tvm/backend/venv/lib/python2.7/site-packages/rollbar/lib/transport.py", line 44, in post
    return _session().post(*args, proxies=proxies, **kw)
  File "/home/tvm/backend/venv/lib/python2.7/site-packages/requests/sessions.py", line 511, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/home/tvm/backend/venv/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/tvm/backend/venv/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/home/tvm/backend/venv/lib/python2.7/site-packages/requests/adapters.py", line 447, in send
    raise SSLError(e, request=request)
SSLError: EOF occurred in violation of protocol (_ssl.c:590)

@jsam
Copy link

jsam commented May 15, 2023

Any update on this issue? We are facing the same problem using 0.16.2

@danielmorell
Copy link
Collaborator

I have a hunch this may be connected with how we handle threading and sessions when transporting payloads. Because our default 'threading' handler does not reuse transport threads we throw away each new requests.Session() we create after the payload has been sent. This is probably causing an unnecessary number of SSL handshakes and/or this may not be completely thread safe.

I would suggest trying our experimental thread_pool handler in rollbar==0.16.4b0. It creates a pool of worker threads that will retain the requests session between payload sends.

@jackie-greenbaum
Copy link

We are running into this issue on our AWS Lambda environment as well. Unfortunately, @danielmorell 's suggestion above of upgrading to 0.16.4b0 and using the thread_pool handler did not seem to fix the issue.

I noticed in my internal testing that I was only able to reproduce the bug when reporting to Rollbar from multiple lambda instances concurrently (though I'm not sure if the concurrency, the higher volume of requests in a short period, or something else entirely is the cause).

Our solution for now is to override rollbar.send_payload() to catch ConnectTimeoutError and ReadTimeoutError, wait a few seconds, and retry.

@echee-insitro
Copy link

We're also running into this issue in AWS lambda. The lambda runs 1x/daily with a reserved concurrency of 1. 0.16.4b0 didn't fix it.

@ghost
Copy link

ghost commented Jul 3, 2023

@danielmorell Isaac created a batched transformer that might be able to resolve this issue. Can you look into it to see if it could resolve this issue? It was this PR: #421

@danielmorell
Copy link
Collaborator

@rollbar-bborsits This seems like it has to do with the transport process but probably not the serialization and scrubbing process.

@ghost
Copy link

ghost commented Jul 6, 2023

@danielmorell Do you need more info to debug this issue?

@jeremy-insitro
Copy link

We've also been frequently experiencing these SSL errors with Flask (only when deployed to AWS inside a docker container), but setting handler="blocking" when initializing Rollbar with rollbar.init(...) seems to have fixed the issue for us.

We're running pyrollbar 0.16.3 and flask 2.0.2

@ghost ghost added the flask label Jul 25, 2023
@ghost ghost assigned ghost and danielmorell Jul 25, 2023
@ghost ghost removed their assignment Jul 25, 2023
@sqpollen
Copy link

sqpollen commented Jul 27, 2023

Also getting this error on a Wagtail/Django deployment on AWS App Runner. Does not happen on a standard EC2 deployment.

rollbar==0.16.3

ERROR:rollbar:Exception while posting item ConnectTimeout(MaxRetryError("HTTPSConnectionPool(host='api.rollbar.com', port=443): Max retries exceeded with url: /api/1/item/ (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fec47b2df10>, 'Connection to api.rollbar.com timed out. (connect timeout=3)'))"))

settings.py

ROLLBAR = {
     'access_token': 'token goes here,
     'environment': 'staging',
     'root': BASE_DIR,
 }

@danielmorell
Copy link
Collaborator

This error is not caused by the Python Rollbar SDK. It likely comes from the SSL implementation available to requests, or from a networking issue between your server, and the Rollbar backend. I would recommend reading this issue psf/requests#3605 for some context from the maintainers of requests.

I won't restate everything in that issue thread. However, one important take away is that if the issue is intermittent, it is likely a networking issue. If it is consistent, you may want to check the version of OpenSSL as well as the rest of your networking dependencies.

Lastly, the final error is thrown when the max retries is exceeded. If this is being throw intermittently, we could add a config option to silence these exceptions, and possibly write them to an agent file. What are everyone's thoughts on this?

@echee-insitro
Copy link

A config option to prevent pyrollbar exceptions from interrupting the application sounds good. Perhaps instead of silencing, if it's possible to write to the root python logger / stderr handler, while avoiding recursive logging by the rollbar handler itself? The idea is to preserve visibility into pyrollbar problems -- e.g. visible as warnings in AWS cloudwatch logs. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests