twitter_transcripts

Tools for processing Twitter transcripts. Warning: these scripts are awful and ugly, so if you can possibly use anything else, please do so.

Workflow:

For each transcript:

Load up the conversation in Hootsuite Dashboard, select all the tweets you want to archive, and then view the source of this selection. Save the HTML source to a file. These scripts assume the specific HTML format used by Hootsuite, so they won't work on HTML copied from any other Twitter viewer, such as twitter.com.

Then, pipe this saved HTML file to the process_hootsuite.py script like so:

cat tweets_2013-01-01.html | ./process_hootsuite.py > clean_tweets_reversed_2013-01-01.html

The resulting cleaned and reversed HTML is suitable for posting on the web, or pasting into an existing HTML document.

However, you will still want to visually inspect this file. It's possible that not all of the tweets were parsed correctly. In particular, look for any links that are missing their closing </a> tag.
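As a rough aid to the visual inspection, the sketch below counts opening and closing anchor tags and flags files where the two counts differ. It is not part of this repository: the function names `count_anchor_tags` and `has_unbalanced_links` are hypothetical, and a simple count like this can only hint that something is wrong, not show where.

```python
import re
import sys


def count_anchor_tags(html):
    """Return (opening, closing) counts of <a ...> and </a> tags in html."""
    # "<a" followed by a word boundary matches <a> and <a href=...>,
    # but not other tags that merely start with "a", such as <abbr>.
    opening = len(re.findall(r"<a\b", html, flags=re.IGNORECASE))
    closing = len(re.findall(r"</a\s*>", html, flags=re.IGNORECASE))
    return opening, closing


def has_unbalanced_links(html):
    """True when the number of opening and closing anchor tags differs."""
    opening, closing = count_anchor_tags(html)
    return opening != closing


if __name__ == "__main__":
    # Hypothetical usage: pass cleaned transcript files as arguments.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            opening, closing = count_anchor_tags(handle.read())
        if opening != closing:
            print(f"{path}: {opening} <a> tags but {closing} </a> tags")
```

Saved under a name of your choosing, it could be run against the cleaned files (for example, with clean_tweets_reversed_*.html as arguments). Any file it reports still needs to be fixed by hand.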

If that's all you want to do, you're done. But if you want to use the transcript analysis tools, continue on:

For all transcript files:

./tweet_stats.py clean_tweets_reversed_*.html | ./user_stats.py > output.tsv

Detailed explanation of steps:

1. Copy the HTML source of the tweets from Hootsuite Dashboard.
2. Feed this HTML into process_hootsuite.py, which cleans up the HTML and reverses the chronological order.
3. Feed all of the cleaned-up HTML files into tweet_stats.py, which extracts the username, date, and timestamp for each tweet.
4. Pipe the output of tweet_stats.py into user_stats.py, which counts the number of tweets for each date and user and outputs TSV.
5. Generate charts using transcript_charts.R and the TSV output of user_stats.py.
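To make the counting step concrete, here is a minimal sketch of tallying tweets per date and user. It is not the code in user_stats.py. It assumes, purely for illustration, that each input line holds tab-separated username, date, and timestamp fields in that order; the real column layout of the scripts is not documented here, and the function names `count_tweets` and `to_tsv` are hypothetical.

```python
from collections import Counter


def count_tweets(lines):
    """Count tweets per (date, user) from tab-separated lines.

    Assumed (not confirmed) input layout: username, date, timestamp.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        user, date = fields[0], fields[1]
        counts[(date, user)] += 1
    return counts


def to_tsv(counts):
    """Render the counts as date, user, count rows, sorted by date then user."""
    rows = [f"{date}\t{user}\t{n}" for (date, user), n in sorted(counts.items())]
    return "\n".join(rows)


if __name__ == "__main__":
    sample = [
        "alice\t2013-01-01\t12:00\n",
        "bob\t2013-01-01\t12:01\n",
        "alice\t2013-01-01\t12:05\n",
    ]
    print(to_tsv(count_tweets(sample)))
```

Keying the counter on a (date, user) pair keeps the output sorted by date first, which is a convenient shape for charting tweets over time.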

Example R output:
