Timeouts and Aborted Connections after running code for long times #617

Open
tednopple opened this issue Jun 5, 2015 · 36 comments · May be fixed by #675
Comments

@tednopple tednopple commented Jun 5, 2015

I am using the Twitter API to scan through tweets and users, but because of Twitter's API limits, scanning thousands of users/tweets can take hours. Fine. However, when I run the code through Python using Tweepy, it gets interrupted after a few hours and I get one of two error messages:

tweepy.error.TweepError: Failed to send request: ('Connection aborted.', error(54, 'Connection reset by peer'))

Another error:

tweepy.error.TweepError: Failed to send request: HTTPSConnectionPool(host='api.twitter.com', port=443): Read timed out. (read timeout=60)

It seems Twitter has a timeout limit that stops access after a certain point? Or not? Is this a Twitter API issue, or a Tweepy one?

This doesn't happen for small sizes, only when the code has to spend more than about an hour running. Of course one sees the problem with walking away from a script that should take hours to run, only to come back and find that it has aborted partway through.

No one seems to know why this happens. I have posted on several forums and spoken to several people. It could be a bug.

@jamesonwatts jamesonwatts commented Jul 2, 2015

Any luck with this? I'm having the same problem

@natejlong natejlong commented Jul 7, 2015

Also having this problem unfortunately

@ThaWeatherman ThaWeatherman commented Jul 22, 2015

Bump, since I am too QQ...if it helps, here is my stack trace from within IPython

@mustekito mustekito commented Jul 25, 2015

I'm getting this error also

@astreim astreim commented Aug 4, 2015

Me too. Any help?

@eabanoz eabanoz commented Aug 20, 2015

Besides these errors, I have also gotten this error:

RATE LIMIT - waiting 15 minutes...
Traceback (most recent call last):
  File "C:/Users/eabanoz/Desktop/Python_Scripts/Tweepy_Followers.py", line 73, in
    print e.message[0]["code"]
TypeError: string indices must be integers, not str
TweepError Failed to send request: ('Connection aborted.', error(10054, 'An existing connection was forcibly closed by the remote host'))

Process finished with exit code 1

@ThaWeatherman ThaWeatherman commented Aug 23, 2015

Just so everyone knows, the sixohsix twitter library works great. No issues with retry failing.

@lbvienna lbvienna commented Sep 7, 2015

Also getting this error. Anyone figure out a fix?

@scubjwu scubjwu commented Sep 22, 2015

I got the same error two days ago. You can quickly reproduce it by turning off your network connection. After briefly checking the Tweepy source code, I think the error comes from the requests library, because the default timeout value in Tweepy is set to 60 s. So just let your program sleep for a while, make sure the network connection is okay, and the code should be able to resume.

@jcapde jcapde commented Sep 22, 2015

I agree with @scubjwu. In my case it happened when the connection dropped. I "fixed" it by also catching ReadTimeoutError and ConnectionError in the except clause that handles Timeout:
except (Timeout, ssl.SSLError, ReadTimeoutError, ConnectionError) as exc:

Note that ReadTimeoutError belongs to the urllib3 package:
from requests.exceptions import Timeout, ConnectionError
from requests.packages.urllib3.exceptions import ReadTimeoutError
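
The same idea can also be applied from the calling side, without editing the library. Below is a minimal sketch, assuming a hypothetical helper named call_with_retry and a made-up flaky_fetch function that stands in for a real API call. The demo uses Python's built-in ConnectionError and TimeoutError so that it runs without any third-party package; in real use the retry_on tuple would presumably hold the exception classes listed above.

```python
import time


def call_with_retry(func, retry_on, attempts=5, wait_seconds=60, sleep=time.sleep):
    """Call func(); if it raises one of retry_on, wait and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(wait_seconds)


# Demo: a hypothetical call that fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("Connection aborted.")
    return "page of results"


result = call_with_retry(
    flaky_fetch,
    retry_on=(ConnectionError, TimeoutError),
    wait_seconds=0,  # no real waiting in the demo
)
print(result, "after", calls["count"], "calls")
```

Passing the exception classes in as a tuple keeps the helper independent of any particular HTTP library, and the attempt limit stops a permanently broken connection from looping forever.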

@tednopple tednopple commented Sep 22, 2015

SORRY. This was for a client, and Tweepy (or whoever) apparently couldn't care less about responding to the issue. The project was also complete, so I stopped following the thread.

Our workaround was to divide the tweets up into smaller batches. We ran the code several times at once using different keys and smaller batches, then joined the files later.

For future projects I probably won't use Tweepy, due to lack of support. We were just in too deep to turn to another library this time.

Though, after reading the thread, it looks like it could be a connection issue. We were working on a company server, so maybe the connection dropped every once in a while. However, I seem to recall this happening on a home computer as well, and someone with more expertise saying this was not the cause.

Solutions so far:
-@jcapde87's try/except solution also seems like a good idea.
-Use another library (Twitter)
-Divide data/calls into smaller batches and combine results later.

:(

@scubjwu scubjwu commented Sep 22, 2015

Hi @tednopple. I am just curious why using smaller batches would help ;-) Is it because smaller batches shorten the time needed to process the data, so that the code doesn't time out? My code just fetches the data from Twitter and stores it in a database, so the data processing time is negligible. My feeling is that the timeout error may be caused by rate limiting. You could always try to catch the exception and let the code resume after sleeping for a while.

@tednopple tednopple commented Sep 22, 2015

Yes, I think smaller batches just cut the processing time down. We were scanning followers. People with hundreds of thousands of followers caused the timeout, so we divided the list to process fewer than 100,000 at a time.

We wrote an exception handler to catch rate limits and wait the required 15 minutes. Despite this, the code would just abort altogether with large data after hours of running. And the error was a Tweepy one.
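
The batching workaround can be sketched roughly as follows. This is only an illustration: the batches helper, the made-up ID list, and the batch size of 50,000 are assumptions, not the code that was actually used.

```python
def batches(items, size):
    """Yield consecutive slices of items, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


follower_ids = list(range(250_000))  # made-up IDs standing in for real ones
chunks = list(batches(follower_ids, 50_000))
print([len(chunk) for chunk in chunks])
```

Each chunk can then be processed in its own run (possibly with its own keys) and the result files joined afterwards, so a dropped connection costs one batch rather than the whole job.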

@aspectlab aspectlab commented Sep 25, 2015

Thanks @jcapde87. Big help.

@ccphillippi ccphillippi linked a pull request that will close this issue Nov 22, 2015
@akisxyz akisxyz commented Dec 25, 2015

Thanks, @jcapde87 and @ccphillippi.

@allo- allo- commented Mar 8, 2016

I've got another error message to add:

File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 149, in execute
    raise TweepError('Failed to send request: %s' % e)
tweepy.error.TweepError: Failed to send request: [Errno 110] Connection timed out

Another wish:
Can you split the TweepError into more detailed Exceptions? Like TweepyTimeout(TweepError), so it can be caught for a retry?

@hcmbg hcmbg commented Apr 30, 2016

I was having this ReadTimeoutError issue. I resolved it by catching the TweepError. @allo- you can access the TweepError's attributes (reason, response, api_code) to handle TweepError more specifically. For the particular 'Failed to send request' error (e.g., raised in line 189), the TweepError object only has reason.

In the try except block, I added:

except TweepError as e:
    if 'Failed to send request:' in e.reason:
        print("Time out error caught.")
        time.sleep(180)
        continue

I have a while loop that wraps around the try/except block, so whenever this TweepError is raised, the code sleeps for 180 seconds and then continues executing the while loop. You can test this by running the code without internet access: reconnect within those 180 seconds and it should work.
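
For readers who want to see the whole loop, here is a self-contained sketch of that pattern. It makes several assumptions: RequestError is a local stand-in for TweepError (it only mimics the reason attribute), and process_all and fake_fetch are hypothetical names invented for the illustration. The point it shows is that the position in the list is kept across failures, so the failed item is retried rather than skipped.

```python
import time


class RequestError(Exception):
    """Stand-in for TweepError; carries the message in `.reason`."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


def process_all(user_ids, fetch, pause_seconds=180, sleep=time.sleep):
    """Fetch every ID, pausing and retrying the same ID on send failures."""
    results = {}
    index = 0
    while index < len(user_ids):
        user_id = user_ids[index]
        try:
            results[user_id] = fetch(user_id)
        except RequestError as exc:
            if "Failed to send request:" in exc.reason:
                sleep(pause_seconds)
                continue  # index is unchanged, so the same ID is retried
            raise  # anything else is not a connection problem
        index += 1
    return results


# Demo: the fetch for ID 2 fails once with a send failure, then works.
failed_once = set()


def fake_fetch(user_id):
    if user_id == 2 and user_id not in failed_once:
        failed_once.add(user_id)
        raise RequestError("Failed to send request: Read timed out.")
    return "profile-%d" % user_id


naps = []  # records the pauses instead of really sleeping
profiles = process_all([1, 2, 3], fake_fetch, sleep=naps.append)
print(profiles, naps)
```

Errors whose reason does not match are re-raised on purpose, so that problems unrelated to the connection are not silently retried.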

@ZeerakW ZeerakW commented Jul 4, 2016

Interestingly, I've found that @jcapde87's solution isn't working in Python 3. Does anyone else have the same experience?

@raiprabh raiprabh commented Jul 28, 2016

A quick query: can specifying a longer timeout via api.timeout on the Tweepy API object help reduce the occurrence of this issue?

@sam-s sam-s commented Oct 9, 2016

I still observe this with python 2.7 and tweepy 3.5

@neuhaus neuhaus commented Dec 4, 2016

What's the best way to deal with broken connections?
I'm seeing this error when using the stream API (stream.filter() ) with tweepy 3.5.0 and python 3.5.2 after the program has run for a while.

@ZeerakW ZeerakW commented Dec 4, 2016

@neuhaus It seems it's a timeout issue, so I would suggest letting the program sleep for now.

@sam-s have you tried using the fix from #675?

@sam-s sam-s commented Dec 4, 2016

@ZeerakW
Yes I did a month ago and commented there that it's an improvement but not a complete fix.

@kjoth kjoth commented Feb 9, 2017

Is there any other alternative to Tweepy?

@ThaWeatherman ThaWeatherman commented Feb 14, 2017

@kjoth I answered this in a comment in this thread: #617 (comment)

@kjoth kjoth commented Feb 14, 2017

@ThaWeatherman That's great. Thanks dude. I will check it out.

@neuhaus neuhaus commented Feb 21, 2017

A code example of how to catch a timeout and re-establish the connection would be great.

@KhoaDuongUQ KhoaDuongUQ commented Apr 22, 2018

@jcapde Can you provide an example of how to catch a timeout and re-establish connection?

@greysonevins greysonevins commented Jun 19, 2018

Any updates on this error?

@KhoaDuongUQ KhoaDuongUQ commented Jun 19, 2018

@greysonevins Apparently not, but you can follow the 'old' method (without using Cursor) in this link and it works perfectly.

@vzts vzts commented Jun 21, 2018

In my case I didn't want to change the code inside the package, and my main thread was not doing anything else, so I did the following after reading through the library code:

import logging
from ssl import SSLError

import tweepy
from requests.exceptions import Timeout, ConnectionError
from urllib3.exceptions import ReadTimeoutError

listener = TweetsStreamListener(...)
stream = tweepy.Stream(...)

# create a zombie: restart the stream whenever it stops
while not stream.running:
    try:
        # start stream synchronously (the default; older Tweepy spelled this
        # async=False, but `async` is a reserved word from Python 3.7 on)
        logging.info("Started listening to twitter stream...")
        stream.filter(...)
    except (Timeout, SSLError, ReadTimeoutError, ConnectionError) as e:
        logging.warning("Network error occurred. Keep calm and carry on. %s", e)
    except Exception as e:
        logging.error("Unexpected error! %s", e)
    finally:
        logging.info("Stream has crashed. System will restart twitter stream!")
logging.critical("Somehow zombie has escaped...!")

Hope it helps.

@vzts vzts commented Jun 21, 2018

It would be best if there were a PR for this that generalizes the exceptions caught here. I've seen that somebody already sent related PRs like #675 and #797. Maybe the maintainers are busy or there are some tricky issues, but the PRs have not been merged yet. Anyway, my example may be an easy workaround if it suits your case.

@lwahedi lwahedi commented Jun 28, 2018

I incorporated a fix for the problem when using the cursor (tested when collecting followers from a user with too many followers to successfully collect before). It's in my Tweepy fork here:
https://github.com/lwahedi/tweepy.git@master

@lwahedi lwahedi commented Jul 9, 2018

Never mind, it's not working. I just got lucky with timeouts. I will post again when I do manage to fix it.

@davekaj davekaj commented Apr 1, 2019

@vzts's solution worked. In the try statement I added time.sleep(5) so that it doesn't retry hundreds of times in a few minutes.
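
A fixed five-second pause can also be replaced by a delay that grows after each consecutive failure. The sketch below is one possible variant, not part of the original workaround: backoff_delays, keep_running, and fake_stream are hypothetical names, and the built-in ConnectionError stands in for the network exceptions used in the thread.

```python
import time


def backoff_delays(base=5, factor=2, cap=300):
    """Yield 5, 10, 20, ... seconds, never exceeding `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def keep_running(start, retry_on, max_restarts, sleep=time.sleep):
    """Run start(); after a retry_on error, wait and start it again."""
    delays = backoff_delays()
    restarts = 0
    while True:
        try:
            return start()
        except retry_on:
            if restarts == max_restarts:
                raise  # give up and surface the last error
            restarts += 1
            sleep(next(delays))


# Demo: a pretend stream that drops three times before ending normally.
drops = {"left": 3}


def fake_stream():
    if drops["left"] > 0:
        drops["left"] -= 1
        raise ConnectionError("Connection reset by peer")
    return "stream closed normally"


waits = []  # records the delays instead of really sleeping
outcome = keep_running(fake_stream, (ConnectionError,), max_restarts=10, sleep=waits.append)
print(outcome, waits)
```

Growing the delay keeps the reconnect rate low during a long outage, while the first retry still happens quickly after a brief blip.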

@t-duan t-duan commented Sep 8, 2019

I tried re-creating the API object with api = tweepy.API(timeout=600) (the default is 60) every time after time.sleep(15*60). It seems to work.
