New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix streams timeout & hangup behavior + ensure python2.6 compat #203
Conversation
FYI, here is the script I use to test the different combinations of modes/traffic/endpoint/python_version. I've successfully tried timeout and hangup situations on most combinations. Here are examples to run different possible combinations (assuming you have a simple keys.py file with your KEY, SECRET, OAUTH_TOKEN, OAUTH_SECRET set in the same dir): python2.6 test_stream.py block low regular python2.7 test_stream.py noblock low regular python3.2 test_stream.py noblock high user python2.6 test_stream.py timeout20 low+ user python2.6 test_stream.py timeout60 low+ user ... A good way to test Twitter's keepalive signal is to use timeout lower than 30 (which seems to be the average seconds between two signals) on the lowtraffic and then to 60 (twitter recommands 90) to understand the problem @ksecrist was pointing out. Interesting thing as well is that with the userstream, it turns out we can apparently run multiple connections at once (it refuses after a various total number which seems to be lower when in timeout mode) # -*- coding: utf-8 -*- import sys from twitter import OAuth, TwitterStream from keys import KEY, SECRET, OAUTH_TOKEN, OAUTH_SECRET # USAGE: # python test_stream.py [block|noblock|timeout] [low|high|high][+] [regular|user] if len(sys.argv) > 3: stream = sys.argv[3].lower() else: stream = "regular" if len(sys.argv) > 2: traffic = sys.argv[2].lower() else: traffic = "low" if len(sys.argv) > 1: mode = sys.argv[1].lower() else: mode = 'block' args = {'auth': OAuth(OAUTH_TOKEN,OAUTH_SECRET,KEY,SECRET)} qargs = {} if mode == 'noblock': args['block'] = False if mode.startswith('timeout'): args['timeout'] = int(mode.replace('timeout', '')) if stream == 'user': args['domain'] = 'userstream.twitter.com' method = 'user' else: method = 'statuses/filter' if traffic.endswith('+'): qargs['delimited'] = 'length' if traffic.startswith('high'): qargs['track'] = 'bieber,gaga' else: qargs['track'] = 'youpiiiiii' ct = 0 for a in getattr(TwitterStream(**args), method)(**qargs): if not a: continue ct += 1 if isinstance(a, dict): if "id_str" in a: print ct, a['id_str'], a['text'] elif "friends" in a: print len(a['friends']) else: print a else: print a |
I've versionned this as a gist for further use https://gist.github.com/RouxRC/9028951 |
@RouxRC This patch causes
The TwitterStream is returning @adonoho This branch seems to include your work from the other PR. Are you willing to close that one? Further, are there any specific tests you'd like to run with this PR to ensure that it supports your needs? |
@sixohsix I guess when ran without option,
@adonoho what do you think? |
@RouxRC the weird thing is that it returned none immediately, not after 60 sec. If you prevent the immediate return, I think that's fine. If I explicitly set timeout=None will I get None results.? I would prefer not to get them. But if so, that can be fixed in another PR. |
@sixohsix If you just remove |
Regarding the first yield, this is due to the fact the read buffer loop happens before the read socket one, so at the first start it necessarily yeilds none except if we add some specific logic. This was the current behavior already with non-blocking mode, I don't feel like this is a real issue that would need a specific if |
I believe it should be fixed. It utterly broke the test script. How many other users will update and discover their scripts immediately fail, despite never failing before? I can fix it myself later if you're busy, but I want to remove that initial None before I put this in the next release. Sorry for brevity. Writing from my phone. |
In this case, I believe it's better to back port to the no None yielded in timeout mode. This is the only way to ensure total portability to the existing users since removing only the first yield won't change the fact they will get other ones later when they might not expect them. |
Ahh now I understand! Yes I'd like you to backport. Thank you! |
done. I kept the change in the twitter-stream-sample though since It sounds more logical to me that the sample would use the same mode for each optional stream (the user and sitestream ones are already using the regular blocking mode, public one was the only one set with timeout) |
@RouxRC/Benjamin, While we have very different styles of coding, I've been very respectful of your code and ask the same in return. Making changes to my pull requests without any public justification is not an OK process. I've chosen my variable names and order of tests for specific reasons. If you are going to pull code out of my patches, please stick with the order I've chosen. For example, you've changed the sense of the boolean from Please, be respectful. Return my variable names to what I chose. If you think they should be different, make the case for your opinion in this public fora. Let the rest of us judge your decision on the technical merits. Anon, |
@adonoho I must admit I do not understand your reaction. I just reproposed your refacto since my PR was fixing a line from the code that was disappearing. Exactly as you did earlier with a previous concurrent PR of mine and which I never complained about even though I then had to repropose many things, and we are here now still repushing some fixes I did 2 months ago. The only change I did from your PR was the change of the variable name, since I believe saying modern_python is a judgment choice which should not be reflected in some code. But I seriously do not care at all about the name and will definitely not lose my time fighting about this... Same about the order of the if, I'll change it right now just for your pleasure since you seem to take this very seriously... Regarding versions of Python, I respect your position, just know that python3 users are highly less numerous than python2, and that python 2.6 is still the official release on various linux distributions like debian stable, which host thousands of production environments... |
Apart from that, any mind on the real questions at stake here? |
I've never had to set up contributor guidelines before, but this is the first time two people have been competing for bugfixes and questioning some fundamental stuff in this project. So, for what it's worth... If you do build on top of someone else's PR, please fetch and checkout their branch instead of including the work as a diff. By doing that you will keep their credentials in the history, which makes It's my fault for not calling this out in the past. I'll keep my eyes on it better in the future. This PR is fine as it is, but it'll be the last one to do so. I'm not hugely interested in variable naming controversies, but since it's become a hot topic, the name As for supporting 2.6, I have no trouble doing that perpetually for the API component, so long as it doesn't slow down other work. Right now the extent of that support is about six lines of code. That's fine by me. 2.6 stays in the codebase. I'll run a few last tests, and then, hopefully, merge and release this. Again, thank you for your contributions, and your patience with me. |
Thanks for your patience and openness Mike, and apologies for the situation which I really didn't expect at all and would have tried to avoid otherwise, sorry again about that. |
Fix streams timeout & hangup behavior + ensure python2.6 compat Here are the changes to fix the misbehavior of the timeout in case of low tweets by catching Twitter's keep-alive heartbeat signals thanks to the select.select originally added in #178 and still problematic as pointed out by @ksecrist in #202 I also generalized the hangup to all cases since there is no reason to stay in infinite loop after a hangup in non-blocking mode. And to make things easier and avoid merging issues, I adapted the refacto and fixed python2.6 compatibility from @adonoho's #201
Here are the changes to fix the misbehavior of the timeout in case of low tweets by catching Twitter's keep-alive heartbeat signals thanks to the select.select originally added in #178 and still problematic as pointed out by @ksecrist in #202
I also generalized the hangup to all cases since there is no reason to stay in infinite loop after a hangup in non-blocking mode.
And to make things easier and avoid merging issues, I adapted the refacto and fixed python2.6 compatibility from @adonoho's #201