Add optional retries to connection attempts #293

Merged (2 commits, Mar 10, 2022)

mew1033 commented Nov 19, 2019

I have a long-running process that connects to Splunk and executes and manages multiple jobs over (potentially) hours. Every now and then I get a "Connection reset by peer" error that blows everything up. This change should let me build my Splunk connection object with a little retry logic built into the SDK.

Traceback I'm getting for reference:

Traceback (most recent call last):
  File "/opt/myCode/myCode.py", line 2130, in <module>
    main()
  File "/opt/myCode/myCode.py", line 1308, in main
    config_data['fields']['matchFields'])
  File "/opt/myCode/myCode.py", line 839, in rebuild_lookups_from_database
    while not all([z.is_done() for z in jobs]):
  File "/usr/lib/python2.7/site-packages/splunklib/client.py", line 2703, in is_done
    if not self.is_ready():
  File "/usr/lib/python2.7/site-packages/splunklib/client.py", line 2715, in is_ready
    response = self.get()
  File "/usr/lib/python2.7/site-packages/splunklib/client.py", line 1009, in get
    return super(Entity, self).get(path_segment, owner=owner, app=app, sharing=sharing, **query)
  File "/usr/lib/python2.7/site-packages/splunklib/client.py", line 766, in get
    **query)
  File "/usr/lib/python2.7/site-packages/splunklib/binding.py", line 290, in wrapper
    return request_fun(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/splunklib/binding.py", line 71, in new_f
    val = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/splunklib/binding.py", line 680, in get
    response = self.http.get(path, all_headers, **query)
  File "/usr/lib/python2.7/site-packages/splunklib/binding.py", line 1184, in get
    return self.request(url, { 'method': "GET", 'headers': headers })
  File "/usr/lib/python2.7/site-packages/splunklib/binding.py", line 1242, in request
    response = self.handler(url, message, **kwargs)
  File "/usr/lib/python2.7/site-packages/splunklib/binding.py", line 1386, in request
    response = connection.getresponse()
  File "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/ssl.py", line 757, in recv
    return self.read(buflen)
  File "/usr/lib64/python2.7/ssl.py", line 651, in read
    v = self._sslobj.read(len or 1024)
error: [Errno 104] Connection reset by peer
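
The retry idea described above can be sketched roughly as follows. This is a minimal, generic illustration in Python 3, not the code from this PR: the helper name `call_with_retries` and its parameters (`retries`, `delay`, `sleep`) are hypothetical, and the 10-second default only mirrors the retry time discussed later in this thread. It assumes the failure surfaces as `ConnectionResetError` (Errno 104, as in the traceback).

```python
import time


def call_with_retries(func, retries=3, delay=10.0, sleep=time.sleep):
    """Call func() and retry when the peer resets the connection.

    Hypothetical helper for illustration only.
    retries: number of extra attempts allowed after the first failure.
    delay:   seconds to wait between attempts.
    sleep:   injectable wait function, so tests do not actually sleep.
    """
    attempt = 0
    while True:
        try:
            return func()
        except ConnectionResetError:
            # Out of retries: let the original error propagate to the caller.
            if attempt >= retries:
                raise
            attempt += 1
            sleep(delay)
```

A polling call such as the `is_done()` check in the traceback could then be wrapped as `call_with_retries(lambda: job.is_done())`, where `job` is a placeholder for an existing job object. Only connection resets are retried here; any other exception still propagates immediately, which keeps genuine errors visible.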


mew1033 commented Dec 16, 2019

I've been running with these changes on my copy of the SDK for the past three weeks, and they have completely solved my connection reset problem. I'm not sure what else should be tested, but this is working well for me.


mew1033 commented Mar 17, 2020

Any comments here? This has definitely helped with my processes.

shakeelmohamed commented

Hi @mew1033, we unfortunately have not had a chance to look at this PR due to resource constraints. I will personally see if we can get this merged in sooner rather than later. Thank you for your patience and understanding.

shakeelmohamed self-assigned this Mar 25, 2020

mew1033 commented Mar 31, 2020

@shakeelmohamed Sounds good. Thank you for your response. I'll keep using my patched version for now.


mew1033 commented Mar 26, 2021

@shakeelmohamed Just wanted to follow up on this PR. Any chance of getting it merged? I just rebased onto the current develop branch.

shakeelmohamed removed their assignment Apr 5, 2021

ashah-splunk commented

Hi @mew1033, thank you for your PR. We have looked at the changes and wanted to ask why the default retry time is set to 10s. Could you also resolve the merge conflicts and, if possible, add a unit test for the changes? Thank you for your patience and understanding.

ashah-splunk changed the base branch from develop to optional-retry March 10, 2022 13:12
ashah-splunk merged commit 1cdd852 into splunk:optional-retry Mar 10, 2022