New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix broken userstream and all other streams supposed to break after Jan 13 + add image support + fix diverses issues #195
Conversation
Twitter is moving all of its APIs towards HTTP1.1. https://dev.twitter.com/blog/deprecating-http-1.0-streaming-api Userstream passed already mid-november and has been borken for this lib since On Jan 6, Twitter will apply it as weel for all the other stream methods, meaning they probably would be all broken. The problem was in two pieces. First with this commit, we stop asking for gzip encoded data for streaming calls: for the HTPP1.0 calls, this was not taken into account anyway, but with HTTP1.1 the answer is now chunked and should therefore not caught zipped. (First part of fix for userstream and soon all streams cf python-twitter-tools#190 python-twitter-tools#116)
Chunked data sent via HTTP1.1 by Twitter is not as straight forward to read, chunks indicating size of tweets are present no matter what, so we find the json start in the buffer and forget the rest for now. (Second part of correction for userstream and soon all streams. Fix python-twitter-tools#190 python-twitter-tools#116)
Instead of trashing it as in previous commit, return the sizes of the messages sent by the API as regular messages if requested through the normal API argument 'delimited=length' https://dev.twitter.com/docs/streaming-apis/parameters#delimited (Last part of userstream and other streams fix)
When running a stream with a low traffic, the current timeout logic was hanging indefinitely without processing. This rewritten logic seems to be working fine now with both http1.0 and 1.1 and active or low traffic. @dkanygin: would love to get your code review on this (cf python-twitter-tools#178)
The existing error was never reached, luckily since it was using undefined arguments. moved it in its right place and added handling ssl errors as TwitterHTTPError. (Fix python-twitter-tools#136)
Added all missing methods from https://dev.twitter.com/docs/api/1.1 Also included some of the streaming methods which work with both GET and POST but accept arguments like "track" which can quickly require POST. (Closes python-twitter-tools#187 python-twitter-tools#145 python-twitter-tools#188)
Oups, the default for all other methods should be True, not False
Adapt argument calls to the intended behavior Fixes python-twitter-tools#177 python-twitter-tools#192 python-twitter-tools#126 python-twitter-tools#10
So the calls to send media (update_with_media, update_profile_image, update_profile_background_image and update_profile_banner) require to send queries differently, as multipart form, and oauth needs to be signed without taking any parameter into account. This does the trick, following the rules here and there: https://dev.twitter.com/docs/uploading-media https://dev.twitter.com/docs/api/1.1/post/statuses/update_with_media https://dev.twitter.com/discussions/1059 Closes python-twitter-tools#127
The fix python-twitter-tools#64 by @grahame was fixing the issue of sending tweets with '~' for python 3 but not for python 2 due to urllib.urlencode's lack of safe functions. Here's a dirty hack to do it anyway. Any better fix welcome :)
convert non string before replacing in it... (follows da33e70 )
Even when there is a low traffic, we know we didn't lose the connection yet when ready_to_read indicates a signal from Twitter, so we don't yield any timeout message and continue, but we should at least yield None like in the non-blocking mode to allow iterator's use to get the hand back and exit cleanly if required. Completes 3df3413 Ping @dkanygin python-twitter-tools#178
TwitterHTTPError should not be tried to be made from SSLError, the original behavior of just raising this is best, my bad... (Follows 040f13e )
The deadline was pushed by Twitter from Jan 6 to Jan 13 : https://dev.twitter.com/blog/deprecating-http-1.0-streaming-api |
Thank you, RouxRC. Would you recommend that those of us who need the streaming API switch to your fork? |
Hi @rspeer, well apparently my fix was not enough, userstream was still broken which I just fixed with the previous commit. pip uninstall twitter pip install git+https://github.com/RouxRC/twitter.git@fix-broken-streams |
There was a terrifying moment today when both the HTTP 1.0 and HTTP 1.1 streams went down. But I can confirm that, after that interruption, I could access the streaming API through RouxRC's branch. |
I had no idea about this change until this morning when the stream stopped working. RouxRC branch worked for me too. |
RouxRC, First thank you for your changes. I have a few comments.
Again, thank you for making your changes. They are helpful. Anon, |
Hi @adonoho, First of all, yes I regret having done all of these changes altogether. The thing is I started working with it and therefore kept completing my fix after having immegred myself into the code. But I should definitely have separated the commits into different pull requests per functionality, my bad. Regarding Python 3, as I explained at first, I did not get a chance to try it with python 3 and that's definitely a mistake. I'm surprised though that the use of "re" would break python3 use, it is fairly standard generic code, could you explain what's breaking? Finally, thanks for req.data and the None on lower thing, it does indeed work with both urllib's python3 and urllib2's python 2 now that get_data was removed in python 3.4, I'll add this fix in a commit right away. |
RouxRC, Allow me to say again, thank you for your changes. My comments were intended to be constructive criticism and I was hoping that you took them as such. Large pull requests are hard for any project owner to digest. As Mr. Verdone has been silent during this little Twitter crisis, I can only assume he's busy with more pressing issues. Regardless, when he returns, a smaller pull request that just addresses the streaming issue would probably be welcome. (I'm not opining on the worthiness of any of the changes. This is just a process observation about how to help the project owner digest your changes.) I will dig into what was happening with Python 3 later today. In particular, there was something weird causing Python to crash using strings. As that is the major difference between the two languages and you did not explicitly test with Python 3, I just rolled back to Python 2 to get my stream back online. I'll get you more data soon. Allow me to add though that putting the regular expression engine into the middle of tweet processing is probably not an efficient solution. What are you trying to really do here? What was the motivation for this fix? And what other ideas did you examine before you fired up the I will help you track this down. Nonetheless, when one submits pull requests to a library that publicly and not accidentally supports both Python 2 & 3, then good engineering dictates that one should test with both worlds. I want to encourage you to be the best engineer you can be and not just a coder. Again, thank you for your changes. Anon, |
Andrew, as I already said, I totally understand the pain of such a big pull request and I certainly should have try and make it more atomic. A few of these commits could certainly be taken totally separately as other pull requests. But others also build corrections on top of the new streaming handling. So splitting the lot is going to be time-consuming and I won't have the time in the coming days to resplit it soon unfortunately. Regarding re, the choice was made in order to handle weird hexadecimals lines by which begin some paquets sent by Twitter through certain streams (the userstream especially) looking like that sometimes two sometimes four characters: 7d3 {status:".... My issue was to remove them and only them (without removing the delimited length of tweets that twitter sends when delimited=true is set and which looks exactly the same). This regexp use does the trick by catching heading hexa and hexa-like on a separate line but not the delimited length that would start the packet after the removal of the first hexa head. But indeed using regexp can be core-consuming and maybe some other logic could replace it more efficiently. Any proposition is very welcome. Again I did all of this as an urgent fix when realizing the library was gonna break, because I'm using the library and wanted my streams to keep working, there may be better way to address all these problems and I'll be happy to use them if they're implemented another way. |
Update to use OAuth, take in command line arguments and modify the imports to function from within the module.
thanks @dkanygin
I removed the old twitter package using pip and installed your twitter package, and the ircbot connects and joins but does nothing. In the console, I see this error: Exception while querying twitter: I verified that /usr/local/lib/python2.7/dist-packages/twitter/ was gone after I ran 'pip uninstall twitter' so there shouldn't be any old files hanging around unless I'm missing something else. |
Hello @geeknik I don't know much about the package's ircbot, but looking at the error and the code, it looks like it is specifying the api_version to 1 instead of using the default 1.1 (and therefore I guess the bot was probably not working much more with the old version). It's easy to fix, I've just included a commit which should resolve it. You can also just delete line 151 of twitter/ircbot.py |
Thanks @RouxRC, that solved one problem, but now we've got another one and I don't know anything about Twitter's changes to be able to fix this one on my own: Exception while querying twitter: Any ideas? =) |
Thanks to @boogheta for solving my last group of errors, my IRC bot is back in business! ;) |
@geeknik I've never used it but it looks to me like the ircbot part of this lib hasn't been maintained for a while, this call to friends_timeline is a method that was deprecated by Twitter in API v1.1 : https://dev.twitter.com/docs/api/get-statusesfriends_timeline But if you want a proper IRC bot using twitter, I can only point you to the project I'm maintaining which is my reason for helping on this library :) It's called gazouilleur with a lot of other features: http://github.com/RouxRC/gazouilleur PS: @boogheta and myself are the same, it's just my professional account |
As said in PS : @boogheta and myself are the same, my message was just intented to explain the commit :) |
Oh, sorry, I missed the PS. Thanks for your help! =) |
I don't know if this is connected with just this version, but I found, that searching for keyword phrases stopped working, for instance, if I am searching for "icecream" I get results, but searching for "ice cream" ( i.e. [...].statuses.filter(track = 'ice+cream') returns no results. |
@gnembon Have you tried track='ice cream' ? I do not recall ever having to use "+" in the stream queries and I do get results without it when I do not with the +. By the way, I believe @adonoho's way of dealing with the streams is definitely more efficient than what I did here and if you only need the streams fixed, you should probably use his branch (see #196). |
Hi @RouxRC , I cherry-picked your update to twitter-globals, and your authorship change since those were pretty free-standing. For the rest, resubmit whenever you have the time and inclination. Thanks for your work! It is appreciated. |
Closing this here and reopening with merged cherrypicks of the only needed commits (see #197) |
Merry Christmas!... Twitter is moving all of its streaming APIs towards HTTP1.1 chunked encoded transfer and breaking the whole lib for us! https://dev.twitter.com/blog/deprecating-http-1.0-streaming-api
Userstream passed already mid-november and has been broken since, cf #190
I first didn't pay much more attention to it than reading the issue thread since I only use the public statuses.filter stream. But I just found out about this announce and browsing issues it became clear the userstream issue here was coming from this change, which did happen on November 23rd in fact (https://dev.twitter.com/discussions/23224#comment-53758) just as the issues reported here confirm.
The thing is the announce also says that on January 13, Twitter will apply it as well for all the other streaming calls, meaning they probably will all be broken as well.
So this motivated me to fix the issue for userstream and supposedly for the other ones further on.
The solution was to stop asking for gzip encoded data for streaming calls since they already answer not zipped with the HTTP1.0 calls (like statuses/filter for now) and should not be zipped when dealing with chunked transfer. This also required to adjust the parsing of the chunks to separate "delimited" length sent by twitter from actual data.
I've tried it out successfully on a combination of calls on (public stream HTTP1.0 | user stream HTTP1.1chunked) x (not blocking | blocking | with timeout) x (low traffic | high traffic | high traffic with argument 'delimited') with python 2.6 & 2.7 only. I don't believe I've modified anything incompatible but any tryout from py3 users would be great.
I also took the occasion to fix a couple other issues, including fixing some hanging when using the timeout with low traffic, providing TwitterHTTPErrors on streaming calls errors, fixing the broken follow create/destroy with cmdline and the irc bot, and completing the POST methods list.
Considering the urge, I will personnally migrate my IRC bot gazouilleur to my branch until this is pulled or handled otherwise.
Happy tweeting holidays!
PS: Twitter also announced an SSL change for Jan 6, but since all the calls are already using secure by default, we should't be at risk here.
edit 28/12:
I've completed this with fixing the tilde sending issue for python 2, and I added the support for API calls to send images as discussed in #127 , it now works when feeding the corresponding argument with raw read of a file. For instance: