evancasey/twitter-stream-pipeline

twitter-stream-pipeline

Collect tweets from the Twitter Streaming API and store them in a SQLite3 database.
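To illustrate the storage side, here is a minimal sketch of writing tweets to SQLite3 with Python's standard `sqlite3` module. The table name, the columns, and the `open_store` and `save_tweet` helpers are assumptions made for this example; they are not taken from this repository, whose actual schema may differ.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with a hypothetical `tweets` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        " id INTEGER PRIMARY KEY,"
        " created_at TEXT,"
        " text TEXT,"
        " raw_json TEXT)"
    )
    return conn


def save_tweet(conn, tweet):
    """Insert one tweet dict; a tweet whose id is already stored is skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO tweets (id, created_at, text, raw_json)"
        " VALUES (?, ?, ?, ?)",
        (tweet["id"], tweet.get("created_at"), tweet.get("text"), json.dumps(tweet)),
    )
    conn.commit()
```

Keeping the raw JSON next to a few extracted columns is one possible design: the columns stay easy to query, and fields that were not extracted can still be recovered later.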

Setup

Create a tokens.py file with your Twitter API tokens:

CONSUMER_KEY = ''
CONSUMER_SECRET = ''
ACCESS_TOKEN = ''
ACCESS_TOKEN_SECRET = ''
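Since the four values start out as empty strings, a quick check before starting a long collection run can save time. The sketch below is a hypothetical helper, not part of this repository; it only assumes the four variable names shown above.

```python
REQUIRED_TOKENS = (
    "CONSUMER_KEY",
    "CONSUMER_SECRET",
    "ACCESS_TOKEN",
    "ACCESS_TOKEN_SECRET",
)


def missing_tokens(namespace):
    """Return the names of required tokens that are absent or empty."""
    return [name for name in REQUIRED_TOKENS if not getattr(namespace, name, "")]
```

With a tokens.py file in place, `import tokens` followed by `missing_tokens(tokens)` should return an empty list once every value has been filled in.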

Create a keywords.txt file containing the words you'd like to query for, one per line:

word1
word2
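A file in this format can be read with a few lines of Python. The following is a sketch of one way to do it, not the repository's own parsing code; `parse_keywords` and `load_keywords` are hypothetical names, and skipping blank lines is an assumption.

```python
def parse_keywords(text):
    """Split text into keywords, one per line, dropping blanks and padding."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_keywords(path):
    """Read a keywords file such as keywords.txt and return its keywords."""
    with open(path, encoding="utf-8") as handle:
        return parse_keywords(handle.read())
```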

Running Locally

Start collecting tweets on your local machine:

$ python runner.py -k keywords.txt -e your_email@gmail.com
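For readers curious how the two flags could be handled, here is a minimal `argparse` sketch. Only the short flags `-k` and `-e` come from the command above. The long option names, the help strings, and the choice to make the email optional are assumptions; this README does not say what the email address is used for.

```python
import argparse


def build_parser():
    """Build a parser for the -k and -e flags (long names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Collect tweets matching a list of keywords."
    )
    parser.add_argument(
        "-k", "--keywords", required=True,
        help="path to a file with one keyword per line",
    )
    parser.add_argument(
        "-e", "--email", default=None,
        help="email address passed to the runner (purpose assumed, not documented)",
    )
    return parser
```

Calling `build_parser().parse_args()` inside a script would then expose the values as `args.keywords` and `args.email`.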

Running on AWS

If you'd like to collect tweets on a remote machine, set up a free-tier AWS EC2 instance.

On your EC2 instance, run:

$ sudo yum install git
$ sudo yum install python-pip
$ git clone https://github.com/evancasey/twitter-stream-pipeline.git
$ cd twitter-stream-pipeline && sudo pip install -r requirements.txt

Create your tokens.py and keywords.txt files, then start the runner, this time with the nohup command so that it keeps running after your terminal session is closed:

$ nohup python runner.py -k keywords.txt -e your_email@gmail.com
