TSM - Twitter Subgraph Manipulator



Twitter Subgraph Manipulator by Deen Freelon

In short, TSM is a Python module that contains a few functions for analyzing Twitter and Twitter-like (i.e., directed and very sparse with community structure) network data. I wrote it for my own research purposes but thought someone out there might find it useful.

Installation instructions

Now available on PyPI! Just run: pip3 install tsm

Alternatively, you can install TSM's dependencies manually and move tsm.py into a directory on your PYTHONPATH.


Here are some of the things TSM can do:

  • Support very large communities (millions of nodes/edges)--the only limit is your computer's memory
  • Extract retweets and @-mentions into edgelist format for network analysis and visualization
  • Partition networks into communities, isolate the N largest communities, and identify the most-connected users in each community
  • Measure the insularity of network communities (using EI indices) to determine the extent to which each looks like an echo chamber
  • Measure the overlap between network communities to determine which ones interact more and less often
  • Get the top retweets in a Twitter dataset and rank them by number of retweets and by community
  • Track Twitter (or other) communities over time: compute similarity scores (weighted or unweighted Jaccard coefficients) for partitioned network communities drawn from the same dataset at two different time slices
  • Discover which nodes intermediate between which communities
  • Find the most-used hashtags in each community (or dataset)
  • Find the most-used hyperlinks or web domains in each community (or dataset)
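
For readers unfamiliar with two of the measures named in the list above, here is a minimal standalone sketch of an EI index and of unweighted and weighted Jaccard coefficients. It does not call or reproduce TSM's own functions. The names ei_indices, jaccard, and weighted_jaccard are hypothetical, and the conventions used here (a cross-community edge counts as external for both communities it touches; the weighted coefficient is the sum of minima over the sum of maxima) are assumptions that may differ from what tsm.py does.

```python
from collections import defaultdict


def ei_indices(edges, community_of):
    """Return {community: EI index}, where EI = (E - I) / (E + I).

    E counts edges that cross the community boundary and I counts edges
    that stay inside it. -1 means fully insular; +1 means every edge
    leaves the community. (Assumed convention, not TSM's code.)
    """
    internal = defaultdict(int)
    external = defaultdict(int)
    for src, dst in edges:
        c_src, c_dst = community_of[src], community_of[dst]
        if c_src == c_dst:
            internal[c_src] += 1
        else:
            # A crossing edge is external for both communities it touches.
            external[c_src] += 1
            external[c_dst] += 1
    return {
        c: (external[c] - internal[c]) / (external[c] + internal[c])
        for c in set(internal) | set(external)
    }


def jaccard(a, b):
    """Unweighted Jaccard coefficient of two node collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_jaccard(w_a, w_b):
    """Weighted Jaccard: sum of per-node minima over sum of maxima."""
    keys = set(w_a) | set(w_b)
    numerator = sum(min(w_a.get(k, 0), w_b.get(k, 0)) for k in keys)
    denominator = sum(max(w_a.get(k, 0), w_b.get(k, 0)) for k in keys)
    return numerator / denominator if denominator else 0.0


# Hypothetical toy data: four users in two communities.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
community_of = {"a": 0, "b": 0, "c": 1, "d": 1}

print(ei_indices(edges, community_of))
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))
print(weighted_jaccard({"a": 3, "b": 1}, {"a": 1, "b": 1, "c": 2}))
```

In this toy example, community 0 has two internal edges and one external edge, so its EI index is negative (more insular), while community 1 has one of each and scores zero.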

See tsm.py for a full description of TSM's functions and how to use them. The module should work as long as NetworkX and python-louvain are installed.


Here's what you need to use TSM:

  • Python 3
  • NetworkX
  • python-louvain


The TSM demo files.zip file contains two IPython notebooks and a Twitter ID file that can be used to demo many of TSM's functions. Code and instructions are provided to hydrate the Twitter ID file. For testing purposes, here is a very brief (fabricated) sample demonstrating how input data for the t2e function should be formatted in a plain text file. This sample was created by Devin Gaffney (@DGaffney):

dgaff,"Twitter is pretty fun, isn't it, @dfreelon?"
dfreelon,Yes indeed @dgaff - @some_other_user weigh in?
some_other_user,Of course twitter is grand. Mostly because of @dril.
dril,Some weird tweet no one understands but everyone favorites @some_other_user
cnnbrk,Looks like @dril just tweeted
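
To show what turning this sample into an edgelist can look like, here is a minimal standalone sketch that reads the five lines above and emits one (author, mentioned user) pair per @-mention. It is not TSM's t2e function and makes no claim about t2e's signature or output. The name mention_edges, the regular expression, and the choice to lowercase usernames are all assumptions made for illustration.

```python
import csv
import io
import re

# The fabricated sample from above, embedded so the sketch is self-contained.
SAMPLE = '''dgaff,"Twitter is pretty fun, isn't it, @dfreelon?"
dfreelon,Yes indeed @dgaff - @some_other_user weigh in?
some_other_user,Of course twitter is grand. Mostly because of @dril.
dril,Some weird tweet no one understands but everyone favorites @some_other_user
cnnbrk,Looks like @dril just tweeted
'''

# Assumed pattern: "@" followed by letters, digits, or underscores.
MENTION = re.compile(r"@(\w+)")


def mention_edges(text):
    """Return a list of (author, mentioned_user) pairs, one per @-mention."""
    edges = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        author = row[0].strip().lower()
        # Rejoin in case an unquoted tweet contained commas.
        tweet = ",".join(row[1:])
        for name in MENTION.findall(tweet):
            edges.append((author, name.lower()))
    return edges


for source, target in mention_edges(SAMPLE):
    print(f"{source},{target}")
```

The csv module handles the quoted first line, whose tweet text contains commas. The resulting pairs are directed from the tweet's author to each user mentioned.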


I gratefully acknowledge funding support from the US Institute of Peace in creating this module.