Telemetry sporadic crash due to HTTP timeout #478

Closed
charettes opened this issue Oct 20, 2023 · 8 comments

Comments

@charettes

charettes commented Oct 20, 2023

We're experiencing sporadic crashes of the thread in charge of sending telemetry to split.io and it's polluting our logs with unnecessary noise.

Is it possible for TelemetryAPI not to crash with APIException when it encounters errors in its record_* methods, and instead direct them to a logger that can be silenced?

Alternatively, is it possible to disable telemetry entirely, as it's not a feature we're interested in?
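
For illustration, the behaviour requested above could look roughly like the sketch below: a wrapper catches whatever a telemetry call raises and sends it to a named logger instead of letting it propagate. The names `record_safely`, `flaky_record_init`, and the logger name `telemetry_sketch` are hypothetical and are not part of the splitio SDK.

```python
import logging

# Hypothetical logger name for this sketch; not the SDK's actual logger.
_LOGGER = logging.getLogger("telemetry_sketch")


def record_safely(record_func, *args, **kwargs):
    """Call record_func and log any exception instead of raising it.

    Returns True on success and False if the call failed.
    """
    try:
        record_func(*args, **kwargs)
        return True
    except Exception:
        # exc_info=True keeps the traceback available when the logger is enabled.
        _LOGGER.error("Telemetry call failed", exc_info=True)
        return False


def flaky_record_init(config):
    """Stand-in for a telemetry call that times out (hypothetical)."""
    raise TimeoutError("The read operation timed out")


# The application can silence this one logger without touching any other.
_LOGGER.setLevel(logging.CRITICAL)
ok = record_safely(flaky_record_init, {"some": "config"})
```

With this shape, a failed telemetry call becomes a log record that the application can filter, rather than an exception that ends the thread.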

@hbqdev

hbqdev commented Oct 24, 2023

Hi @charettes

Can you share the version of the SDK, any logs that you have regarding the crash, and your SDK config parameters?

Also, are you able to reproduce this issue?

Regards,

@charettes
Author

We are using splitio-client==9.5.1, and the traceback looks like this:

timeout: The read operation timed out
  File "urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
    # Permission is hereby granted, free of charge, to any person obtaining a copy
  File "urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "http/client.py", line 1377, in getresponse
    response.begin()
  File "http/client.py", line 320, in begin
    version, status, reason = self._read_status()
  File "http/client.py", line 281, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "urllib3/contrib/pyopenssl.py", line 326, in recv_into
    raise timeout("The read operation timed out")

ReadTimeoutError: HTTPSConnectionPool(host='telemetry.split.io', port=443): Read timed out. (read timeout=1.5)
  File "requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "urllib3/util/retry.py", line 410, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "urllib3/packages/six.py", line 735, in reraise
    raise value
  File "urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "urllib3/connectionpool.py", line 428, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "urllib3/connectionpool.py", line 335, in _raise_timeout
    raise ReadTimeoutError(

ReadTimeout: HTTPSConnectionPool(host='telemetry.split.io', port=443): Read timed out. (read timeout=1.5)
  File "splitio/api/client.py", line 140, in post
    response = requests.post(
  File "requests/api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "ddtrace/contrib/requests/connection.py", line 97, in _wrap_send
    response = func(*args, **kwargs)
  File "requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)

HttpClientException: requests library is throwing exceptions
  File "splitio/api/telemetry.py", line 64, in record_init
    response = self._client.post(
  File "splitio/api/client.py", line 149, in post
    raise HttpClientException('requests library is throwing exceptions') from exc

APIException: Init config data not flushed properly.
  File "threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "splitio/client/factory.py", line 164, in _update_status_when_ready
    config_post_thread = threading.Thread(target=self._telemetry_submitter.synchronize_config(), name="PostConfigData")
  File "splitio/sync/telemetry.py", line 48, in synchronize_config
    self._telemetry_api.record_init(self._telemetry_init_consumer.get_config_stats())
  File "splitio/api/telemetry.py", line 79, in record_init
    raise APIException('Init config data not flushed properly.') from exc

@hbqdev

hbqdev commented Oct 25, 2023

Hi @charettes
Thank you for the information. Can you please confirm whether the app is actually crashing, or whether it's just the log messages?

The telemetry config call happens only while the SDK is initializing; it is not an ongoing call.

Regards

@charettes
Author

@hbqdev there is an unhandled exception in the thread. It does not crash the app, but it does crash the thread itself.

Since it's a benign error, we'd like a way to silence it, either by disabling telemetry entirely or by having the error redirected to a logger that we can silence.

The error happens in record_init in this case, but we have very similar crashes in both record_stats and record_unique_keys when the split.io servers take a while to answer.
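
As a possible stopgap on the application side, uncaught exceptions raised in threads can be routed to a logger with `threading.excepthook` (Python 3.8+), and that logger can then be silenced. This is only a minimal sketch: the logger name `thread_errors` and the `failing_target` function are made up for the example, and the hook applies to every thread in the process, not just the SDK's.

```python
import logging
import threading

# Hypothetical logger name; pick one that fits the application's logging setup.
thread_logger = logging.getLogger("thread_errors")
seen = []  # Records thread names, only so the sketch can be checked.


def log_thread_exception(args):
    """Send uncaught thread exceptions to a logger instead of stderr."""
    name = args.thread.name if args.thread else "<unknown>"
    seen.append(name)
    thread_logger.error(
        "Uncaught exception in thread %s",
        name,
        exc_info=(args.exc_type, args.exc_value, args.exc_traceback),
    )


threading.excepthook = log_thread_exception


def failing_target():
    """Stand-in for a thread target that hits a timeout (hypothetical)."""
    raise TimeoutError("The read operation timed out")


# Silence the logger, then show that the thread dies while the program continues.
thread_logger.setLevel(logging.CRITICAL)
worker = threading.Thread(target=failing_target, name="PostConfigData")
worker.start()
worker.join()
still_running = True  # Reached even though the worker thread raised.
```

The worker thread ends when its target raises, while the main thread keeps going, which matches the behaviour described above: the thread crashes, the app does not.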

@hbqdev

hbqdev commented Oct 25, 2023

Hi @charettes
Thank you for the update. We are reviewing this with our team and will get back to you.

Regards

@hbqdev

hbqdev commented Nov 6, 2023

Hi @charettes
This issue should now be fixed. Can you please try again?

Regards

@charettes
Author

charettes commented Nov 7, 2023

Thank you @hbqdev, I'll update to 9.6.0 and report back on whether it addresses the issue.

@hbqdev

hbqdev commented Nov 7, 2023

Thank you @charettes

hbqdev closed this as completed Nov 7, 2023