mdredze/twitter_stream_downloader

Twitter Stream Downloader

This tutorial provides an excellent introduction to collecting Twitter data. It is more recent than the latest updates to this library: http://socialmedia-class.org/twittertutorial.html

A simple Python script to download tweets from the Twitter streaming API. Works with API version 1.1.

The script to run is python/streaming_downloader.py.

The script requires a consumer_key, consumer_secret, access_token, and access_token_secret. To obtain these:

  • Go to dev.twitter.com.
  • Log in and create a new application.
  • Create an access token.
  • This gives you all four of the values above. Remember, do not share these with anyone.
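One way to keep the four values out of source control is to read them from environment variables. The sketch below is only an illustration: the TWITTER_* variable names and the load_credentials helper are hypothetical and are not part of streaming_downloader.py, which may expect the credentials in a different form (check --help).

```python
import os

# The four values named in this README.
REQUIRED_KEYS = (
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def load_credentials(environ=None):
    """Return the four credentials as a dict, read from environment variables.

    The TWITTER_* variable names are an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    credentials = {}
    missing = []
    for key in REQUIRED_KEYS:
        name = "TWITTER_" + key.upper()
        value = environ.get(name)
        if value:
            credentials[key] = value
        else:
            missing.append(name)
    if missing:
        # Fail early instead of starting a stream with partial credentials.
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return credentials
```

Calling load_credentials() with no argument reads from the real environment; passing a plain dict makes the function easy to check without touching it.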

Run the script with the --help flag to list the valid options.

The code writes output to files named year/month/timestamp.gz and starts a new file at least once every 24 hours. Changing this behavior isn't too hard but requires modifying the code.
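The year/month/timestamp.gz layout can be sketched as follows. This is not the code from streaming_downloader.py: the output_path and open_output helpers, the use of UTC, and the exact timestamp format are assumptions made for illustration.

```python
import gzip
import os
from datetime import datetime, timezone


def output_path(base_dir, now=None):
    """Build a path of the form base_dir/year/month/timestamp.gz.

    The timestamp format below is an assumption; the script may use another.
    """
    now = now or datetime.now(timezone.utc)
    directory = os.path.join(base_dir, now.strftime("%Y"), now.strftime("%m"))
    filename = now.strftime("%Y_%m_%d_%H_%M_%S") + ".gz"
    return os.path.join(directory, filename)


def open_output(base_dir, now=None):
    """Create the year/month directories if needed and open a gzip text file."""
    path = output_path(base_dir, now)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return gzip.open(path, "at", encoding="utf-8")
```

A downloader built this way would call open_output again whenever it wants to rotate, for example once 24 hours have passed since the current file was opened.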

The code requires tweepy. I am using version 3.8.0. https://github.com/tweepy/tweepy

The script also supports the "pid_file" flag, which creates a file containing the PID of the running job. This is useful if you want to create a cron job that watches the script to make sure it is still running.
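A watcher run from cron could check the PID file along these lines. This is a minimal sketch for POSIX systems, assuming the file contains only the PID as decimal text; the helper names are hypothetical and the script itself does not ship this watcher.

```python
import os


def read_pid(pid_file):
    """Read the PID, assuming the file holds just the number as text."""
    with open(pid_file) as handle:
        return int(handle.read().strip())


def is_running(pid):
    """Return True if a process with this PID exists (POSIX only).

    Signal 0 performs the existence and permission checks without
    actually delivering a signal.
    """
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def downloader_alive(pid_file):
    """Return False if the PID file is missing, malformed, or stale."""
    try:
        return is_running(read_pid(pid_file))
    except (FileNotFoundError, ValueError):
        return False
```

A cron job could call downloader_alive on the PID file and restart the downloader when it returns False. Note that a PID can be reused by an unrelated process, so this check is a heuristic.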

stream_type: There are three supported stream types: location, keyword, and sample. I didn't put in the username stream type, but it should be easy to add.

If you use a keyword file, the format should be: track=keyword1,keyword2,keyword3 ...

Location files are similar: locations=value1,value2,value3,value4

These files are provided using the "parameters-file" argument.
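To make the file format concrete, here is a small sketch of how a name=value1,value2 line can be split into a parameter name and a list of values. The parse_parameters function is hypothetical and is not the parser used by streaming_downloader.py.

```python
def parse_parameters(text):
    """Parse lines such as 'track=a,b,c' into {'track': ['a', 'b', 'c']}."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, separator, values = line.partition("=")
        if not separator:
            raise ValueError("expected name=value1,value2,... but got: " + line)
        # Split on commas and drop surrounding whitespace and empty entries.
        params[name.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return params
```

For example, parse_parameters("track=python,data science") yields a single "track" entry holding the two keywords.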
