Streaming API silently not working #9

Closed
nicola opened this Issue Jun 27, 2012 · 35 comments


nicola commented Jun 27, 2012

Hello there. It worked yesterday, but today it doesn't, and I didn't change anything in the code.

Here is my code:

var Twit = require('twit');

var T = new Twit({
    consumer_key: 'vroFy2FvOR8En7saib1A4Q'
  , consumer_secret: '**'
  , access_token: '10818002-'
  , access_token_secret: '*'
});

var stream = T.stream('statuses/sample');

stream.on('tweet', function (tweet) {
  console.log(tweet);
});

stream.on('error', function (err) {
  console.log(err);
});

stream.on('limitation', function (msg) {
  console.log(msg);
});

If I run node twit.js I get nothing and the process exits. Any hints?

Owner

ttezel commented Jun 27, 2012

I ran your exact code above locally and it works for me (I used my own OAuth credentials). Can you double-check your credentials?

nicola commented Jun 27, 2012

I did check my credentials. On my laptop it doesn't work, but if I upload it to Heroku it works. Is there a way I can debug this?

thanks n

Owner

ttezel commented Jun 27, 2012

Okay, can you check whether both your laptop and your Heroku server are running the latest version of twit (0.1.6)?

nicola commented Jun 30, 2012

Yep, same. Could it be Twitter API IP blocking? If so, how can I debug that?

Owner

ttezel commented Jul 1, 2012

Yes, Twitter's API could be blocking you. Try a REST API request like so:

T.get('statuses/home_timeline', function (err, reply) {
  if (err)
    return console.log('err', err)

  console.log('reply', reply)
})

The err object should give you an HTTP status code and any additional info provided by Twitter. You should still be able to make unauthenticated requests, by the way.

nicola commented Jul 3, 2012

I didn't get any error.
The REST APIs work but streaming doesn't; the streams just stopped on the server too. :(

nicola commented Jul 3, 2012

How do I check whether I'm blocked or not? Is there any workaround? (Even if I use another app/user it still doesn't work.)

zeiban commented Aug 3, 2012

I would like to second this issue; I'm running into the same problem. Sometimes it will work for hours and then suddenly stop working. Sometimes when I stop node and restart it, it starts working again, but not always. I don't think it's a rate limit, and I'm not blocked, as I can still make non-streaming requests. I'm running node v0.8.4 on a Windows desktop. I thought maybe it was a limit in Node's request pool, but twit is the only code making requests.

Owner

ttezel commented Aug 3, 2012

Hey, this usage looks fine. Are you running the latest version of twit?


zeiban commented Aug 3, 2012

Yes, I just ran an npm install twit just in case. I've got it running now with some extra debug output in the twit module; hopefully it will give me a clue about what's going on when it happens again.

zeiban commented Aug 3, 2012

I don't know if this is related, but I started listening for 'error' events to try to figure out what's going on, and I noticed that I'm getting the occasional parse error. It looks like it's trying to parse a partial JSON string. Here is an example:

Error parsing twitter reply: d_image_url_https":"https:\/\/si0.twimg.com\/profile_background_images\/583032745\/0bpxjch9h6zpm2r16v7x.jpeg","profile_sidebar_fill_color":"ffffff","url":"http:\/\/blog.logitech.com\/","name":"Logitech","listed_count":1337,"profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1675825649\/Logitech-logo-stacked-100k_normal.jpg","id":10503302,"verified":true,"time_zone":"Pacific Time (US & Canada)","utc_offset":-28800,"profile_sidebar_border_color":"ddffcc"},"id":231463598440988673}

This obviously isn't a complete JSON string. It's almost as if the first part was lost.

Owner

ttezel commented Aug 3, 2012

Yes, this is normal. Twitter sometimes doesn't send parseable JSON data. Twit parses every reply received from Twitter, and the 'error' event is emitted for replies that are not parseable. Otherwise, if the reply is parseable, the object is emitted from the 'tweet', 'limit', or other events (as specified in the readme).

Cheers,

Tolga


zeiban commented Aug 3, 2012

So the tweet is just lost if Twitter decides to send partial JSON data? The Logitech tweet in my example above was never emitted as a 'tweet'; I only know about it because I caught the 'error' event for it.

Owner

ttezel commented Aug 3, 2012

Twit follows Twitter's guidelines for parsing tweets from the streaming API (https://dev.twitter.com/docs/streaming-apis/processing#Parsing_responses). The tweet is not 'lost': if you want to recover the information in that tweet, you have to listen on the 'error' event and attempt to parse the partial JSON string from Twitter. I have not implemented this yet. Is that something you would want as a user? Also, did you not find a valid version of that same tweet while listening on the 'tweet' event?

Regards,

Tolga

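The recovery described above (buffering the partial string and retrying the parse on the next chunk) was not implemented in twit at the time. A minimal standalone sketch of the idea might look like the following; makeChunkHandler and emit are made-up names for illustration, not twit APIs:

```javascript
// Hypothetical sketch: accumulate raw chunks in a buffer and only parse
// complete messages. Twitter's streaming API delimits messages with '\r\n',
// so anything after the last delimiter is kept for the next chunk.
function makeChunkHandler(emit) {
  var buffer = '';
  return function onChunk(chunk) {
    buffer += chunk;
    var pieces = buffer.split('\r\n');
    buffer = pieces.pop(); // last piece may be partial: keep it
    pieces.forEach(function (piece) {
      if (piece === '') return; // bare '\r\n' keep-alive
      emit(JSON.parse(piece));  // complete message: safe to parse
    });
  };
}
```

With this, a tweet split across two chunks is emitted once the second chunk completes it, instead of being dropped as a parse error.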

zeiban commented Aug 3, 2012

No, that specific tweet from Logitech was never emitted as a 'tweet' event. I've got other examples as well; it seems to happen about once every hour. Maybe it depends on the tweet contents? I never noticed it until I started logging the 'error' event for the other issue. Would it help to try to pull the entire tweet using the REST API? I'll have to wait for one that doesn't cut off the id_str so I can get the ID.

kai-koch commented Aug 5, 2012

I can confirm problems with the streaming API, too.
I connected to the sample endpoint for an hour.

The tweet throughput dropped from 23 tweets per second to 7.5 by the end.
If I run this test from home, I start with slower throughput and the rate declines faster (slower CPU and lower bandwidth).
Watching the Memory consumption of the node process, my guess is:

  • there is a memory leak somewhere in the code
  • garbage collection starts consuming more and more CPU time
  • until the process is at 100% CPU load

I will run a 24-hour test on the server to see if I get an out-of-memory error.
Just started:
[2012-08-05T09:28:22.291Z] Collector Server started up
[2012-08-05T09:29:04.354Z] Current rate: 24.035 Chunks/per second
Total count(1011) Total Error (0)
[2012-08-05T09:29:48.419Z] Current rate: 23.256 Chunks/per second
Total count(2003) Total Error (0)
[2012-08-05T09:30:41.490Z] Current rate: 21.581 Chunks/per second
Total count(3004) Total Error (0)
[2012-08-05T09:31:45.557Z] Current rate: 19.693 Chunks/per second
Total count(4003) Total Error (0)
[2012-08-05T09:32:59.624Z] Current rate: 18.047 Chunks/per second
Total count(5005) Total Error (0)
[2012-08-05T09:34:08.693Z] Current rate: 17.324 Chunks/per second
Total count(6001) Total Error (0)
[2012-08-05T09:35:38.020Z] Current rate: 16.072 Chunks/per second
Total count(7003) Total Error (0)
[2012-08-05T09:37:10.363Z] Current rate: 15.17 Chunks/per second
Total count(8011) Total Error (0)
[2012-08-05T09:38:46.570Z] Current rate: 14.434 Chunks/per second
Total count(9011) Total Error (0)

As you can see, the throughput is already declining in the first few minutes.
I will report back on this in 24 hours.
I will "beautify" and review the code in my own fork and see if I can spot a memory leak there, or in the oauth module.

On the "partial JSON string" issue: it is a bug in the parser!
benbarnett has a fork of twit where the parser problem is fixed.
See: https://github.com/benbarnett/twit/blob/master/lib/parser.js

kai-koch commented Aug 5, 2012

[2012-08-05T10:19:44.580Z] Current rate: 8.112 Chunks/per second
Total count(25004) Total Error (0)
[2012-08-05T10:23:51.197Z] Current rate: 7.811 Chunks/per second
Total count(26003) Total Error (0)
[2012-08-05T10:28:12.649Z] Current rate: 7.53 Chunks/per second
Total count(27036) Total Error (0)
[2012-08-05T10:32:45.391Z] Current rate: 7.249 Chunks/per second
Total count(28003) Total Error (0)
[2012-08-05T11:20:04.756Z] Current rate: 4.343 Chunks/per second
Total count(29109) Total Error (0)

I terminated the test; the rate declined too fast.

zeiban commented Aug 5, 2012

Great, I'm glad I was able to help identify the issue or at least point you in the right direction. I'll take a look at benbarnett's parser fix. Thanks!

kai-koch commented Aug 6, 2012

I wrote a test to track the stream performance.
I hate to admit it, but the memory leak I reported was caused by changes I made to my local codebase.

When running the test against a clean install of twit, it performs at about 55 chunks per second, with 5% CPU load and about 30 MB of RAM consumption, which seems about right on my old notebook.

To get the test, see:
https://github.com/kai-koch/twit/blob/master/tests/streamPerformance.js
It runs with the current version of twit. Don't mind the new handlers for warning, status_withheld, and user_withheld;
they do not fire in the unmodified version of twit.

kai-koch commented Aug 6, 2012

I fixed the parser in my fork; it seems that String.slice leaks memory in node.js, or blocks, or something.
The parser might be optimized by using regular expressions instead of String.split, but I am too dumb to get the regex right. :-/
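For what it's worth, one regex-based alternative to String.split is to repeatedly match everything up to the next '\r\n' delimiter. This is only an illustrative sketch (extractMessages is a made-up name, not code from either fork):

```javascript
// Pull complete '\r\n'-delimited messages out of a buffer with a regex.
// Returns the complete messages plus whatever partial tail is left over.
function extractMessages(buffer) {
  var re = /([^]*?)\r\n/g; // [^] matches any character, including newlines
  var messages = [];
  var lastIndex = 0;
  var m;
  while ((m = re.exec(buffer)) !== null) {
    if (m[1] !== '') messages.push(m[1]); // skip bare keep-alive '\r\n'
    lastIndex = re.lastIndex;
  }
  return { messages: messages, rest: buffer.slice(lastIndex) };
}
```

Whether this actually beats String.split would need benchmarking; it mainly avoids building array entries for empty keep-alive segments.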

Contributor

matteoagosti commented Aug 21, 2012

I tried a slightly different approach in my fork, but as you did, I limited my changes to parser.js:
https://github.com/matteoagosti/twit

Contributor

matteoagosti commented Aug 21, 2012

@kai-koch I ran your benchmarking test, and with the sample stream I got an average of 65 chunks per second, 0.4% CPU consumption, and 16 MB of memory allocated.

By changing the endpoint to "statuses/filter" and tracking "google", I got an average of 3.501 chunks per second, 0.5% CPU consumption, and 18 MB of memory allocated.

@matteoagosti
I am currently running your parser on WinXP on a single-core P4 at 2.60 GHz from home.
I get about 30 messages per second; memory is around 20 MB with peaks at 30 MB, and CPU usage is ~2% (plus or minus 2%).

Your parser does not check whether more than one tweet arrived in a single chunk, so I get errors when two tweets are present in one chunk.

[2012-08-21T08:23:22.584Z] Current rate: 30.063 Chunks/per second
Total(76659) Error(3) Tweets(70058) Deletes(6598) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)

Contributor

matteoagosti commented Aug 21, 2012

You are right, it actually fails when 2 tweets are in the same chunk. To prevent re-parsing of already parsed data, I was starting from the incoming chunk rather than from scratch; simply getting rid of this fixes the issue.

@matteoagosti: 2 short remarks on your code:

lines 7-13:

var Parser = module.exports = function () {
  this.message = '';

  EventEmitter.call(this);
};

util.inherits(Parser, EventEmitter);

Wouldn't it be better to declare this.message as

Parser.prototype.message = ''

after util.inherits(...)?
I ask because I do not know exactly how the call to the parent constructor and util.inherits(...) work.
Do they overwrite all prototype properties defined before they were called, or do they simply add the parent prototype properties?

On line 31:
the start variable is declared a second time.

Contributor

matteoagosti commented Aug 21, 2012

Declaring the variable as you suggested would lead to a variable shared across Parser instances (even though there is actually only one instance).

Regarding line 31 you are right, I forgot to update it as it is coming from the original code; I tried to minimize the number of changes to ease the pull request.

I also updated my fork again and increased the parse speed by avoiding a useless splice:
matteoagosti/twit@e907014

I created a test to check what really happens during each chunk parse, and I noticed that you may get 8 simultaneous tweets :P Very unpredictable.

@matteoagosti:
I am currently running my latest parser version, https://github.com/kai-koch/twit/blob/master/lib/parser.js, and got about 10% better performance than with your old parser, at about 13 MB memory usage (peak 27 MB).

When it's done, I'll try your new one.

[2012-08-21T10:19:22.721Z] Current rate: 33.691 Chunks/per second
Total(37396) Error(0) Tweets(34376) Deletes(3020) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)

Contributor

matteoagosti commented Aug 21, 2012

@kai-koch
I finished testing my new parser, these are the results
[2012-08-21T11:42:18.212Z] Current rate: 36.204 Chunks/per second
Total(260495) Error(0) Tweets(243673) Deletes(16822) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)
Server shutdown. Uptime: 119.92008333333334 min
Starttime was: 2012-08-21T09:42:23.007Z

I'll test your new one also.

p.s.
Node 0.8.5, MacBook Air 1.7GHz i5
Avg CPU 1.3%, Avg Mem 14MB

@matteoagosti:
I think Twitter is messing with us. :)
I get about 42 chunks per second currently with your new parser. :)

[2012-08-21T11:58:20.680Z] Current rate: 42.117 Chunks/per second
Total(82970) Error(0) Tweets(77552) Deletes(5418) Warnings(0) Limits(0) Scrub_geos(0) Status_withhelds(0) User_withhelds(0)

ps: node 0.8.6

Contributor

matteoagosti commented Aug 21, 2012

@kai-koch
LOL, yep :) The stream is not guaranteed to always run at the same pace; that's why we notice the differences. Still, I think that over a time span of 2 hours you get a more or less good overview. In general, the difference between our solutions is mainly in the way we parse:

  • I use the original approach, looking at each char and going on until the end of the string
  • You rely on the split function and cycle over the resulting array

We could benchmark both approaches. However, right now I'm fine with either of the two, as I finally got the stream working again :P

kai-koch commented Sep 6, 2012

@matteoagosti: Found a bug in my parser:
on low-traffic streams, Twitter sends '\r\n' to keep the connection alive, so a check for empty array elements is needed, or we get a SyntaxError ("Unexpected end of input") from JSON.parse.
Fix: kai-koch/twit@cbb7bfe

Your parser does that already.
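The failure mode is easy to reproduce in isolation: splitting a keep-alive sequence on '\r\n' yields empty strings, and JSON.parse('') throws. A tiny sketch of the guard (not the actual fix in either fork):

```javascript
// Two bare '\r\n' keep-alives follow a real message on the wire.
var raw = '{"x":1}\r\n\r\n\r\n';

// Splitting yields an empty string for every keep-alive delimiter:
// ['{"x":1}', '', '', '']
var segments = raw.split('\r\n');

// Filter the empties out before parsing; JSON.parse('') would throw
// SyntaxError: Unexpected end of input.
var parsed = segments
  .filter(function (s) { return s !== ''; })
  .map(function (s) { return JSON.parse(s); });
```

Only the single real message survives the filter, so the parser never sees an empty string.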

Owner

ttezel commented Sep 20, 2012

Closing this issue as I merged in updates to the parser from @matteoagosti.

@ttezel ttezel closed this Sep 20, 2012

zeiban commented Sep 20, 2012

Great!

nicola commented Oct 20, 2012

I still get nothing from Twitter :(

var Twit = require('twit')

var T = new Twit({
    consumer_key: 'xxx'
  , consumer_secret: 'xxx'
  , access_token: 'xxx'
  , access_token_secret: 'xxx'
})

var stream = T.stream('statuses/sample')

stream.on('tweet', function (tweet) {
  console.log(tweet)
})

Owner

ttezel commented Oct 21, 2012

hey @nicolagreco, your usage of twit seems fine. Most likely, you aren't getting tweets for one of two reasons:

  1. your OAuth credentials are invalid, or
  2. you are opening multiple streams to the Twitter API, and twit is closing the one you're currently running.

Can you try running the code below? What statusCode do you get? And are you getting any 'disconnect' messages?

Before you run the code though, make sure to update twit to the latest version (run npm update twit). You should be using v1.0.2.

var T = new Twit({
    consumer_key: 'xxx'
  , consumer_secret: 'xxx'
  , access_token: 'xxx'
  , access_token_secret: 'xxx'
})

var stream = T.stream('statuses/sample')

stream.on('tweet', function (tweet) {
  console.log(tweet)
})

stream.on('disconnect', function (disconn) {
  console.log('disconnect')
})

stream.on('connect', function (conn) {
  console.log('connecting')
})

stream.on('reconnect', function (reconn, res, interval) {
  console.log('reconnecting. statusCode:', res.statusCode)
})
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment