Using StreamingAPI (statuses/filter), Second Last Tweet Shows up in get_iterator() Loop #20

Closed
jgcaruso opened this issue Mar 24, 2014 · 3 comments

@jgcaruso

I am using the following pattern to process tweets in real-time:

r = api.request('statuses/filter', {'track': 'mykeyword'})
for item in r.get_iterator():
    print item

I am using the latest version of TwitterAPI available on pip (2.1.9).

Whenever a matching tweet is posted, the previous tweet that matched the filter is printed on the screen within about a second, rather than the new one. I found a similar report for another Python Twitter Streaming API library, where they seem to have found a solution: http://stackoverflow.com/questions/10083936/streaming-api-with-tweepy-only-returns-second-last-tweet-and-not-the-immediately

I'll lay out an example of what is happening:

  1. start the streaming api request
  2. tweet something that matches the filter "this is a test of the mykeyword filter"
  3. wait a long time (I waited several minutes as a test), no output is displayed by my application
  4. tweet something else that matches the filter "this is a second mykeyword filter test"
  5. the first tweet is output almost immediately (within a second)
  6. tweet something else right away that matches the filter "this is a third mykeyword filter test"
  7. the second tweet is output almost immediately
  8. continue steps 6 and 7 to see immediate results

It is almost as if each tweet is received immediately and pushes out the previously received tweet, but the newest tweet itself is not passed to the iterator. It just sits there until something else comes in, and only then can it be processed.
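
The behaviour described above is consistent with a reader that hands data on only in fixed-size chunks. The following is a minimal simulation of that idea, not the actual code in TwitterAPI or Requests; `simulate_stream`, `fake_tweet`, and the payload size are made up for illustration.

```python
def simulate_stream(arrivals, chunk_size):
    """Return the complete lines a consumer sees after each arrival.

    Models a blocking reader that releases data to the line splitter
    only once a full chunk of `chunk_size` bytes has accumulated.
    """
    unread = b""   # bytes received from the network but not yet released
    pending = b""  # released bytes that do not yet form a complete line
    seen = []
    for data in arrivals:
        unread += data
        # Release only whole chunks; the remainder keeps waiting.
        usable = len(unread) - (len(unread) % chunk_size)
        pending += unread[:usable]
        unread = unread[usable:]
        # Everything before the last "\r\n" is a complete line.
        *lines, pending = pending.split(b"\r\n")
        seen.append([line.decode() for line in lines])
    return seen


def fake_tweet(n):
    # Hypothetical payload, padded so one tweet spans more than one chunk.
    return ('{"id": %d, "text": "%s"}\r\n' % (n, "x" * 700)).encode()


tweets = [fake_tweet(1), fake_tweet(2), fake_tweet(3)]
buffered = simulate_stream(tweets, chunk_size=512)
unbuffered = simulate_stream(tweets, chunk_size=1)

for label, result in (("512-byte chunks", buffered), ("1-byte chunks", unbuffered)):
    print(label, [[line[:9] for line in lines] for lines in result])
```

In this simulation, with 512-byte chunks the first arrival produces no complete line and each later arrival releases the previous tweet, while with 1-byte chunks every tweet is released as soon as it arrives.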

@geduldig added the bug label Mar 24, 2014
@geduldig self-assigned this Mar 24, 2014
@jgcaruso
Author

I also just stumbled upon this, which seems like it may be more relevant than the link I posted above: ryanmcgrath/twython#202

It specifically references using the Requests library.

@geduldig
Owner

I was able to confirm the issue. The last link you cited regarding buffer sizes was spot on. The default buffer (for streaming) is 512 bytes. Setting this to 1 byte fixes the issue, and the fix is in the latest TwitterAPI release (2.1.13). However, reading one byte at a time is not great for efficiency. A better solution is to use Twitter's "delimited" parameter for streaming endpoints, which I will probably add in a future release. Regardless, it should be good now.
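
As a rough illustration of the "delimited" idea (a sketch, not what TwitterAPI does): it assumes each message is preceded by its byte length as a decimal number on its own line, and that blank lines are keep-alives. `read_delimited` and `frame` are hypothetical helpers, and an in-memory `io.BytesIO` stands in for the network stream.

```python
import io
import json


def read_delimited(stream):
    """Yield decoded JSON messages from a length-prefixed byte stream."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream
        header = header.strip()
        if not header:
            continue  # blank keep-alive line, nothing to parse
        length = int(header)
        # Read exactly the announced number of bytes for this message.
        body = stream.read(length)
        if len(body) < length:
            return  # stream ended in the middle of a message
        yield json.loads(body.decode("utf-8"))


def frame(message):
    # Assumed framing: "<byte count>\r\n" followed by the payload itself.
    body = (json.dumps(message) + "\r\n").encode("utf-8")
    return str(len(body)).encode("ascii") + b"\r\n" + body


raw = (
    frame({"id": 1, "text": "first"})
    + b"\r\n"  # keep-alive between messages
    + frame({"id": 2, "text": "second"})
)
messages = list(read_delimited(io.BytesIO(raw)))
print(messages)
```

Because the reader knows how many bytes to expect, it can ask for exactly that many instead of waiting for a fixed-size buffer to fill or reading one byte at a time.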

Thanks for catching this!
Jonas

@jgcaruso
Author

I just pulled down your change and it works great.

Thanks for the quick turnaround!
