Download random tweets in real time from Twitter.
Clojure
src
.gitignore
README.md
credentials.clj
project.clj
proxy-sample.clj


Twitter Sampler

I wanted to play around with a corpus of tweets, but none is openly distributed because of Twitter's terms of service. So I built this utility to create my own.

Usage

This program uses the Twitter streaming API, which can only be accessed by an authenticated user. To obtain credentials, you must create a Twitter application at https://dev.twitter.com/apps/new.

Once your application is created, download a sample configuration file and fill in the blanks. Then download twitter-sampler-1.0.0-standalone.jar and run it:

java -jar twitter-sampler-1.0.0-standalone.jar -c credentials.clj -n 1000 -t '#clippers' tweets.json

where credentials.clj contains your Twitter credentials, -n specifies the number of tweets to download, -t specifies an optional comma-separated list of keywords or hashtags to track, and tweets.json is the file where the tweets are saved. You can also specify a proxy configuration file with -p; see proxy-sample.clj for a sample configuration file.

For details about the structure of a tweet, see https://dev.twitter.com/docs/platform-objects/tweets.
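
To make the format of the -t value concrete, here is a minimal sketch of how a comma-separated list could be split into individual terms. The function name parse-track is hypothetical and is not taken from this project's source; it only illustrates the input format described above.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not part of twitter-sampler itself.
(defn parse-track
  "Split a comma-separated -t value into a vector of trimmed, non-empty terms.
  A nil or empty value yields an empty vector."
  [s]
  (->> (str/split (or s "") #",")
       (map str/trim)
       (remove str/blank?)
       vec))

(parse-track "#clippers, nba ,lakers")
;; => ["#clippers" "nba" "lakers"]
```

Trimming each term means that stray spaces around the commas do not end up in the tracked keywords.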

How does it work?

This application is built with Clojure and uses twitter-api to download tweets from the /statuses/sample or /statuses/filter endpoints of the Twitter API.
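
One plausible way to choose between the two endpoints is sketched below. It assumes that /statuses/filter is used when terms are given with -t and /statuses/sample otherwise; that rule is an assumption, not something confirmed by the project's source. The helper name stream-request and the shape of the returned map are hypothetical, and no call to twitter-api is made. The streaming API's track parameter takes a comma-separated list of phrases, so the terms are joined back together.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: describes the request, it does not send it.
(defn stream-request
  "Return the endpoint and query parameters for a collection of tracked terms.
  Assumption: filter when there are terms to track, sample otherwise."
  [terms]
  (if (seq terms)
    {:endpoint "/statuses/filter"
     :params   {:track (str/join "," terms)}}
    {:endpoint "/statuses/sample"
     :params   {}}))

(stream-request ["#clippers" "nba"])
;; => {:endpoint "/statuses/filter", :params {:track "#clippers,nba"}}

(stream-request [])
;; => {:endpoint "/statuses/sample", :params {}}
```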

License

Copyright (C) 2012-2014 Alexandre Patry

Distributed under the Eclipse Public License, the same as Clojure.