Generate second order ego networks for Twitter users. Tweego builds egocentric graphs with the Twitter API and lets you resume data collection across multiple sessions.
pip install tweego
tweego [OPTIONS]
Options:
  -fo, --first-order            Flag: Collect first order ego
  -u, --users                   Flag: Collect user data
  -so, --second-order           Flag: Collect second order ego
  -g, --graph                   Flag: Generate graph file
  -d, --dir PATH                Directory to store data
  -k, --keys-file PATH          Location of the api keys JSON file
  -n, --screen-name TEXT        The screen name of the ego center user
  -f, --follower-limit INTEGER  Number of followers for the second order ego
  --version                     Show the version and exit.
  --help                        Show this message and exit.
Collect everything:
tweego -d "dataset" -k "keys.json" -n "github" -fo -u -so -g
Collect first order connections only:
tweego -d "dataset" -k "keys.json" -n "github" -fo
Collect users and second order connections only:
tweego -d "dataset" -k "keys.json" -n "github" -u -so
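If you drive Tweego from a script, the commands above can be assembled programmatically. The following is a minimal sketch, not part of Tweego itself: `build_tweego_command` is a hypothetical helper that only uses the options listed above and returns an argument list suitable for `subprocess.run`.

```python
def build_tweego_command(directory, keys_file, screen_name,
                         first_order=False, users=False,
                         second_order=False, graph=False,
                         follower_limit=None):
    """Return the tweego argument list for the requested collection steps.

    Hypothetical helper: it only mirrors the documented CLI options.
    """
    cmd = ["tweego", "-d", directory, "-k", keys_file, "-n", screen_name]
    if first_order:
        cmd.append("-fo")   # collect first order ego
    if users:
        cmd.append("-u")    # collect user data
    if second_order:
        cmd.append("-so")   # collect second order ego
    if graph:
        cmd.append("-g")    # generate graph file
    if follower_limit is not None:
        cmd += ["-f", str(follower_limit)]
    return cmd


# Equivalent of "Collect users and second order connections only".
command = build_tweego_command("dataset", "keys.json", "github",
                               users=True, second_order=True)
print(" ".join(command))
```

To actually start the collection, pass the list to `subprocess.run(command, check=True)`. Because Tweego can resume over multiple sessions, rerunning the same command after an interruption should continue from the data already stored in the directory.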
Tweego supports multiple API keys to speed up data collection. You can get the keys and tokens from the Twitter developer website by creating a standalone app and then generating them.
Store the keys in a JSON file with the following format:
[
    ...
    {
        "app_key": "<<app_key>>",
        "app_secret": "<<app_secret>>",
        "oauth_token": "<<oauth_token>>",
        "oauth_token_secret": "<<oauth_token_secret>>"
    },
    {
        "app_key": "<<app_key>>",
        "app_secret": "<<app_secret>>",
        "oauth_token": "<<oauth_token>>",
        "oauth_token_secret": "<<oauth_token_secret>>"
    },
    ...
]
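Before starting a long collection run, it can help to confirm that the keys file matches this format. The sketch below is an illustration only and is not part of Tweego: `check_keys` and `check_keys_file` are hypothetical helpers that verify the file is a JSON array of objects, each carrying the four fields shown above as non-empty strings.

```python
import json

REQUIRED_FIELDS = ("app_key", "app_secret", "oauth_token", "oauth_token_secret")


def check_keys(entries):
    """Return a list of problems found in already-parsed keys data."""
    if not isinstance(entries, list):
        return ["top-level value must be a JSON array"]
    if not entries:
        return ["no key sets found"]
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected a JSON object")
            continue
        for field in REQUIRED_FIELDS:
            value = entry.get(field)
            # Each field must be present and a non-empty string.
            if not isinstance(value, str) or not value:
                problems.append(f"entry {index}: missing or empty '{field}'")
    return problems


def check_keys_file(path):
    """Load a keys file from disk and return the problems found in it."""
    with open(path, encoding="utf-8") as handle:
        return check_keys(json.load(handle))
```

An empty list from `check_keys_file("keys.json")` means the structure looks right. It does not confirm that Twitter accepts the credentials.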
Once Tweego is done, the folder structure should look like this:
dir
├── users
│   ├── user_id_1.json
│   ├── user_id_2.json
│   └── ...
├── screen_name
│   ├── user_id_1
│   │   └── user_id_1.txt
│   ├── user_id_2
│   │   └── user_id_2.txt
│   └── ...
├── screen_name.txt
└── screen_name.gml
The users directory contains information about each user, and the screen_name directory contains the follower ids of users. The screen_name.gml file contains the ego network, which you can analyze with an application like Gephi.
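The follower id files can also be read directly, for example to inspect partial results before the graph file has been generated. The following is a rough sketch under assumptions: `load_follower_ids` is a hypothetical helper, and it assumes each `user_id.txt` file lists one follower id per line, which this document does not specify. Check a file by hand before relying on it.

```python
from pathlib import Path


def load_follower_ids(data_dir, screen_name):
    """Map each user id folder under dir/screen_name to its follower ids."""
    followers = {}
    root = Path(data_dir) / screen_name
    for user_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        id_file = user_dir / f"{user_dir.name}.txt"
        if not id_file.is_file():
            continue  # collection for this user may not have finished
        # ASSUMPTION: one follower id per line; blank lines are skipped.
        lines = id_file.read_text(encoding="utf-8").splitlines()
        followers[user_dir.name] = [line.strip() for line in lines if line.strip()]
    return followers
```

For example, `load_follower_ids("dataset", "github")` would return a dictionary keyed by user id, which makes it easy to count how many users have been collected so far.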