Download scripts for distributing Twitter data.

Semeval Twitter data download script

Downloads tweets that are distributed as IDs to protect privacy. Uses the format of the Semeval Twitter sentiment analysis dataset.


Requires the sixohsix/twitter and tqdm/tqdm Python packages:

easy_install twitter
easy_install tqdm


The first time you run the script, it should open a web browser, have you log in to Twitter, and show a PIN for you to enter at a prompt generated by the script.

  1. Log in to Twitter with your user name in your default browser.
  2. Run the script like this to download your credentials: python --dist=tweeti-a.dist.tsv
  3. Download tweets like so:
python --dist=tweeti-a.dist.tsv --output=downloaded.tsv

Note that it takes about 18 hours to download the Semeval sentiment analysis training dataset.

Restarting after a partial download:

If the script hangs partway through the download for whatever reason, use the --partial argument to specify the file containing the partially downloaded results.
This way you won't have to start from scratch:

python --dist=tweeti-a.dist.tsv --partial=downloaded.tsv --output=downloaded2.tsv
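
For illustration, resuming can be thought of as skipping every tweet ID that already appears in the partial file. The sketch below is not the script's actual code. It assumes both files are tab-separated with the tweet ID in the first column, and the helper names load_done_ids and remaining_rows are hypothetical.

```python
import csv
import io


def load_done_ids(partial_file):
    """Collect the tweet IDs already present in a partially downloaded file."""
    done = set()
    for row in csv.reader(partial_file, delimiter="\t", quoting=csv.QUOTE_NONE):
        if row:  # skip blank lines
            done.add(row[0])  # assumption: tweet ID is the first column
    return done


def remaining_rows(dist_file, done_ids):
    """Yield rows of the distribution file whose tweet ID is not downloaded yet."""
    for row in csv.reader(dist_file, delimiter="\t", quoting=csv.QUOTE_NONE):
        if row and row[0] not in done_ids:
            yield row


# Tiny in-memory stand-ins for the real files (made-up IDs and text).
dist = io.StringIO(
    "111\t1\t0\t2\tpositive\n"
    "222\t2\t1\t3\tnegative\n"
    "333\t3\t0\t0\tneutral\n"
)
partial = io.StringIO("111\t1\t0\t2\tpositive\tsome tweet text\n")

done = load_done_ids(partial)
todo = list(remaining_rows(dist, done))
print([row[0] for row in todo])  # IDs that still need to be downloaded
```

Only the rows returned by remaining_rows would then need to be requested from Twitter, which is why a restart does not repeat the work already done.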

Task A Mention Test Script

To print out the mentions and annotations from task A you can use the script like so:

python downloaded.tsv

This just prints out the mentions with sentiment annotations for easier inspection.
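
As a rough sketch of what such an inspection involves, the snippet below pulls the mention out of one downloaded line. The column layout is an assumption (tweet ID, user ID, start token, end token, sentiment, tweet text, tab-separated, with whitespace tokenization and an inclusive end index), and extract_mention is a hypothetical helper, not part of the script.

```python
def extract_mention(line):
    """Return (mention, sentiment) for one tab-separated downloaded line.

    Assumed columns: tweet ID, user ID, start token index, end token index,
    sentiment label, tweet text. The end index is treated as inclusive.
    """
    fields = line.rstrip("\n").split("\t")
    tweet_id, user_id, start, end, sentiment, text = fields[:6]
    tokens = text.split()  # assumption: tokens are whitespace-separated
    mention = " ".join(tokens[int(start):int(end) + 1])
    return mention, sentiment


# Made-up example line, not taken from the real dataset.
sample = "111\t1\t0\t1\tpositive\tgreat game last night"
print(extract_mention(sample))
```

If the real file uses a different column order or tokenization, the indices in extract_mention would need to be adjusted accordingly.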


  • You may need to manually change the link that is printed out for authorization to use https:// instead of http://
  • The time on your computer needs to be set accurately. Thanks to Canberk for noting this on the email list.