# Follower network
We start from the Twitter follower network constructed for the paper [Right and left, partisanship predicts (asymmetric) vulnerability to misinformation](http://doi.org/10.37016/mr-2020-55). The following data files are available at https://doi.org/10.7910/DVN/6CZHH5:
* `anonymized-friends.json` 
* `measures.tab`

Briefly, this network was constructed as follows:
* We collected all tweets containing links (URLs) from a 10% random sample of public posts between June 1 and June 30, 2017, through the Twitter Decahose. 
* We selected all accounts that shared at least ten links from a set of news sources with known political valence (Bakshy et al., 2015). 
* We further selected those who shared at least one link from a source labeled as low-quality (https://github.com/BigMcLargeHuge/opensources). 
* We excluded likely bot accounts according to the BotometerLite classifier (Yang et al., 2020).
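
The selection criteria above can be illustrated with a minimal sketch on toy data. Everything here is hypothetical: the domain names, the `shares` list, the `bot_scores` values, and the `BOT_THRESHOLD` cutoff are assumptions made for illustration, not the actual source lists, data, or BotometerLite settings used in the paper.

```python
from collections import Counter

# Hypothetical inputs: none of these names or values come from the original dataset.
valence_sources = {"left.example", "right.example"}  # sources with known political valence
low_quality_sources = {"lowq.example"}                # sources labeled as low-quality
bot_scores = {"a": 0.1, "b": 0.1, "c": 0.9}           # assumed precomputed bot scores
BOT_THRESHOLD = 0.5                                   # assumed cutoff, for illustration only

# One (account, domain) pair per shared link
shares = (
    [("a", "left.example")] * 10 + [("a", "lowq.example")]
    + [("b", "right.example")] * 12
    + [("c", "left.example")] * 10 + [("c", "lowq.example")]
)

# Count links per account for each source set
valence_counts = Counter(acc for acc, dom in shares if dom in valence_sources)
low_quality_counts = Counter(acc for acc, dom in shares if dom in low_quality_sources)

# Keep accounts with >= 10 valence links, >= 1 low-quality link, and a low bot score
selected = sorted(
    acc for acc in valence_counts
    if valence_counts[acc] >= 10
    and low_quality_counts[acc] >= 1
    and bot_scores[acc] < BOT_THRESHOLD
)
print(selected)
```

In this toy example only account `a` passes all three filters: `b` shared no low-quality link and `c` is above the assumed bot threshold.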

In [7]:
import networkx as nx
import csv
import json
import random  # needed for the edge sampling further down

In [4]:
path = "FollowerNetwork/"

In [19]:
# File has 3 columns: ID \t partisanship \t misinformation \n
partisanship = {}
misinformation = {}
with open(path + "measures.tab") as fd:
    rd = csv.reader(fd, delimiter="\t")
    next(rd) # skip header row
    for row in rd:
        partisanship[int(row[0])] = row[1]
        misinformation[int(row[0])] = row[2]

In [20]:
with open(path + 'anonymized-friends.json') as fp:
    adjlist = json.load(fp)

In [21]:
G = nx.DiGraph() 

In [26]:
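# Directed network: each edge points from a follower to a friend (an account it follows)
for s in adjlist:
    n = int(s)  # JSON keys are strings; node IDs are integers
    # Only accounts that have both measures get node attributes and outgoing edges
    if n in partisanship and n in misinformation:
        G.add_node(n, party=partisanship[n], misinfo=misinformation[n])
        for f in adjlist[s]:
            # Friends without measures are added here too; they are filtered out later
            G.add_edge(n, f)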

In [29]:
print("{} nodes and {} edges initially".format(G.number_of_nodes(), G.number_of_edges()))
# Keep only the accounts that have partisanship and misinformation measures
friends = nx.subgraph(G, partisanship.keys())
print("{} nodes and {} edges after filtering".format(friends.number_of_nodes(), friends.number_of_edges()))

58048 nodes and 10499218 edges initially
15056 nodes and 4327448 edges after filtering


In [30]:
nx.write_graphml(friends, path + 'follower_network.graphml')

In [None]:
# TODO: Take the k-core (k >= 2)? Take the largest connected component?
# Does the result look like the echo chambers in the paper?
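
The two options raised above could be tried along these lines. This is only a sketch on a hypothetical toy graph (`toy` is not the follower network), and it assumes that "connected component" means a weakly connected component, since the network is directed. For directed graphs, `nx.k_core` uses in-degree plus out-degree as the node degree.

```python
import networkx as nx

# Toy directed graph standing in for the follower network (hypothetical data)
toy = nx.DiGraph([(1, 2), (2, 3), (3, 1), (3, 4), (5, 6)])

# k-core with k = 2: the maximal subgraph in which every node has degree >= 2
core = nx.k_core(toy, k=2)

# Largest weakly connected component (edge direction is ignored)
giant_nodes = max(nx.weakly_connected_components(toy), key=len)
giant = toy.subgraph(giant_nodes).copy()

print(sorted(core.nodes()), sorted(giant.nodes()))
```

On the toy graph the 2-core keeps only the triangle (nodes 1, 2, 3), while the largest weakly connected component also keeps the pendant node 4. Whether either reduction reproduces the echo chambers in the paper remains an open question.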

In [44]:
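# Keep a random sample of the edges (taking the core and/or giant component is still an open option)
sample_p = 0.1  # fraction of edges to keep
friends_sample = friends.copy()
# random.sample needs a sequence, so materialize the edge view as a list first
deleted_edges = random.sample(list(friends_sample.edges()), int(friends_sample.number_of_edges() * (1 - sample_p)))
friends_sample.remove_edges_from(deleted_edges)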

In [45]:
print("{} nodes, {} edges".format(friends_sample.number_of_nodes(), friends_sample.number_of_edges()))

15056 nodes, 432745 edges


In [46]:
nx.write_graphml(friends_sample, path + 'follower_network_sample.graphml')