# Follower network

First, import the required modules, including NetworkX:

In [16]:
from pathlib import Path
import pandas as pd
import networkx as nx

## Data Loading
We will load the friend/follower relations between pairs of list members from friendships.csv into a pandas DataFrame.

In [17]:
in_path = Path("./business", "friendships.csv")
# create the DataFrame (the file is tab-separated)
df = pd.read_csv(in_path, sep='\t')
print("Read %d Twitter relations" % len(df))
# display a few rows
df.head(10)



Read 308 Twitter relations


| Unnamed: 0 | FRIEND | FOLLOWER |
|---|---|---|
| 0 | apbusiness | bbcbusiness |
| 1 | apbusiness | business |
| 2 | apbusiness | businessinsider |
| 3 | apbusiness | bw |
| 4 | apbusiness | cnnbusiness |
| 5 | apbusiness | fastcompany |
| 6 | apbusiness | financialtimes |
| 7 | apbusiness | forbes |
| 8 | apbusiness | foxbusiness |
| 9 | apbusiness | ft |
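The `Unnamed: 0` column in the output suggests that the file stores a row index as its first column. As a minimal sketch, assuming that layout, the index can be consumed at load time with `index_col=0`. The `sample` string below is hypothetical stand-in data, not the real file:

```python
import io

import pandas as pd

# Hypothetical tab-separated sample mimicking the assumed layout of
# friendships.csv: an unnamed index column, then FRIEND and FOLLOWER.
sample = "\tFRIEND\tFOLLOWER\n0\tapbusiness\tbbcbusiness\n1\tapbusiness\tbusiness\n"

# index_col=0 uses the first column as the row index instead of
# loading it as a regular "Unnamed: 0" column
sample_df = pd.read_csv(io.StringIO(sample), sep="\t", index_col=0)
print(list(sample_df.columns))
```

With the index consumed this way, only the FRIEND and FOLLOWER columns remain as data columns.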


## Creating a Directed Network

We will now construct a *directed unweighted network* such that:

- There is a node for each Twitter user.
- There is a directed edge from each friend to each of its followers, one per unique (friend, follower) pair.

First, get the set of all users:

In [18]:
# get set of all users
friends = set(df["FRIEND"].unique())
followers = set(df["FOLLOWER"].unique())
users = friends.union(followers)

# print the users to see how many there are and who they are
for index, element in enumerate(users, start=1):
    print(f"{index}: {element}")

1: foxbusiness
2: fastcompany
3: bw
4: businessdesk
5: ft
6: businessinsider
7: wsjecon
8: nasdaq
9: bbcworldbiz
10: financialtimes
11: cnnbusiness
12: nytimesbusiness
13: usatodaymoney
14: reutersbiz
15: ftfinancenews
16: harvardbiz
17: markets
18: forbes
19: marketwatch
20: apbusiness
21: entrepreneur
22: yahoofinance
23: irishtimesbiz
24: bbcbusiness
25: thestreet
26: indobusiness
27: reutersmoney
28: telebusiness
29: nbcnewsbusiness
30: yahoofinanceuk
31: wsjbusiness
32: business
33: wsj


Create a directed network, with a node for each user:

In [19]:
# here a DiGraph indicates a directed network
g = nx.DiGraph()
nodes = sorted(users)
for node in nodes:
    # add one node per user; no node attributes are needed here
    g.add_node(node)

Create a directed edge between each unique friend and follower pair, based on the relations read from friendships.csv:

In [20]:
for _, row in df.iterrows():
    node1 = row["FRIEND"]
    node2 = row["FOLLOWER"]
    # ignore self-loops, in case they exist
    if node1 == node2:
        continue
    g.add_edge(node1, node2)
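As an alternative to the explicit loop, the same kind of graph can be built in one call with `nx.from_pandas_edgelist`. This is only a sketch on hypothetical toy data (`toy_df`, `pairs` and `toy_g` are illustrative names). Unlike the node-first approach above, it only creates nodes that appear in at least one remaining edge:

```python
import networkx as nx
import pandas as pd

# Hypothetical toy relations, including one duplicate pair and one self-loop
toy_df = pd.DataFrame({
    "FRIEND": ["a", "a", "b", "a", "c"],
    "FOLLOWER": ["b", "c", "c", "b", "c"],
})

# drop self-loops before building the graph
pairs = toy_df[toy_df["FRIEND"] != toy_df["FOLLOWER"]]

# build a directed graph with an edge from each FRIEND to each FOLLOWER;
# duplicate pairs collapse into a single edge
toy_g = nx.from_pandas_edgelist(
    pairs, source="FRIEND", target="FOLLOWER", create_using=nx.DiGraph
)
print(sorted(toy_g.edges()))
```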

We can check the size of our new network:

In [21]:
print("Network has %d nodes and %d edges" % (g.number_of_nodes(), g.number_of_edges()))

Network has 33 nodes and 308 edges
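The edge count matches the 308 relations read from the file, which suggests the data contains no duplicate pairs or self-loops. A quick way to confirm this kind of consistency is to count the unique, non-self-loop pairs in the DataFrame and compare that with the graph. The sketch below does so on hypothetical toy data rather than the real file:

```python
import networkx as nx
import pandas as pd

# Hypothetical toy relations: 5 rows, one duplicate pair and one self-loop
toy_df = pd.DataFrame({
    "FRIEND": ["a", "a", "b", "a", "c"],
    "FOLLOWER": ["b", "c", "c", "b", "c"],
})

# expected number of edges: unique pairs, excluding self-loops
no_loops = toy_df[toy_df["FRIEND"] != toy_df["FOLLOWER"]]
expected_edges = len(no_loops.drop_duplicates(subset=["FRIEND", "FOLLOWER"]))

# build the graph from the same pairs and compare the counts
toy_g = nx.DiGraph()
toy_g.add_edges_from(zip(no_loops["FRIEND"], no_loops["FOLLOWER"]))
print(expected_edges, toy_g.number_of_edges())
```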


In [22]:
# export the network in GEXF format, which tools such as Gephi can open
nx.write_gexf(g, "twitter-directed.gexf")
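To check that an export of this kind can be read back, the file can be reloaded with `nx.read_gexf` and compared with the original graph. The following is a minimal sketch on a hypothetical toy graph, written to a temporary directory so that it does not touch the file created above:

```python
import tempfile
from pathlib import Path

import networkx as nx

# Hypothetical toy directed graph standing in for the Twitter network
toy_g = nx.DiGraph()
toy_g.add_edges_from([("a", "b"), ("b", "c")])

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = str(Path(tmp_dir) / "toy-directed.gexf")
    nx.write_gexf(toy_g, out_path)
    # read the file back; node ids are loaded as strings
    loaded = nx.read_gexf(out_path)

print(loaded.is_directed(), loaded.number_of_nodes(), loaded.number_of_edges())
```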