### **Network Analysis**

<font color="red">File access required:</font> In Colab, first upload the files **Friends.csv**, **Follows.csv**, and **Dolphins.csv** using the *Files* feature in the left toolbar. If running the notebook on a local computer, make sure these files are in the same directory as the notebook.

In [None]:
# Setup
import csv
import matplotlib.pyplot as plt
%matplotlib inline
import networkx as nx

**Look at the CSV files:** each row is a pair of nodes, representing one edge.
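
The files list node pairs with no header row. A minimal sketch of previewing such a file with the `csv` module follows; the sample rows and names are invented for illustration and stand in for the real file contents.

```python
import csv
import io

# Hypothetical sample standing in for a file such as Friends.csv;
# the names below are invented for illustration.
sample = io.StringIO("Ann,Bob\nBob,Cat\nAnn,Cat\n")

# Each row is one edge: a pair of node names
rows = list(csv.reader(sample))
for source, target in rows:
    print(source, '-', target)

# To preview a real file instead, replace the StringIO object with
# open('Friends.csv', newline='') inside a with-statement.
```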

### Undirected graph (Friends)

In [None]:
# Load graph from CSV file with no header
# Passing the filename lets NetworkX open and close the file itself
G = nx.read_edgelist('Friends.csv', delimiter=',', nodetype=str)
print(G)

In [None]:
# Display graph
# The first two lines set the figure size and margins for the Jupyter notebook
# Note: the layout differs each time the cell is run
plt.figure(figsize=(7,7))
plt.margins(x=0.1, y=0.1)
nx.draw(G, with_labels=True, node_size=1500, node_color='c')

In [None]:
# Density of graph
numnodes = G.number_of_nodes()
numedges = G.number_of_edges()
possedges = G.number_of_nodes() * (G.number_of_nodes()-1)
print('Number of nodes:', numnodes)
print('Number of edges:', numedges)
print('Possible edges:', possedges)
print('Density (edges divided by possible edges):', numedges/possedges)

In [None]:
# Did the previous cell get the right answer? Compare with the built-in density function
print('Using density function:', nx.density(G))

In [None]:
# Diameter and overall average shortest distance
print('Diameter:', nx.diameter(G))
print('Average shortest distance:', nx.average_shortest_path_length(G))
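
As a sanity check on these two measures, the sketch below uses a four-node path graph (an illustrative stand-in, not one of the notebook's data files), where the distances can be worked out by hand.

```python
import networkx as nx

# Path graph 0 - 1 - 2 - 3 (illustrative example, not the Friends data)
P = nx.path_graph(4)

# Diameter: the longest shortest path, here from node 0 to node 3
diameter = nx.diameter(P)

# Average shortest distance: the six pairwise distances are 1, 2, 3, 1, 2, 1,
# so the average is 10 / 6
avg_dist = nx.average_shortest_path_length(P)

print('Diameter:', diameter)
print('Average shortest distance:', avg_dist)
```

Both functions raise an error on a disconnected graph, since some distances would be infinite.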

In [None]:
# Maximal cliques
cliques = nx.find_cliques(G) # cliques is an iterator
for c in cliques:
    print(c)
# Try print(cliques)
# Modify the code to print only cliques with more than 2 nodes

In [None]:
# Iterating through nodes of the graph
for n in G:
    print(n)

In [None]:
# Number of friends - 'degree' is the number of edges incident on a node
numfriends = G.degree
print(numfriends)
# for n in numfriends:
#    print(n[0], 'has', n[1], 'friends')
# Or look up a node's degree directly, like a dictionary
# for n in G:
#    print(n, 'has', numfriends[n], 'friends')

In [None]:
# Friends lists
for n in G:
    print(n, 'has friends:')
    friends = G.neighbors(n) # friends is an iterator
    for friend in friends:
        print(' ', friend)

In [None]:
# Friends lists v2
for n in G:
    print(n, 'has friends:', list(G.neighbors(n)))

In [None]:
# Closeness centrality - inverse of the average shortest distance to other nodes, so values lie between 0 and 1 and higher means closer
cc = nx.closeness_centrality(G)
print(cc)
# sorted_keys = sorted(cc, key=cc.get, reverse=True)
# print(sorted_keys)
# for k in sorted_keys:
#    print(k, 'has closeness centrality', cc[k])
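
To see what the closeness score measures, the sketch below recomputes it by hand for one node of a four-node path graph (an illustrative example, not the Friends data) and compares the result with the built-in function.

```python
import networkx as nx

P = nx.path_graph(4)  # 0 - 1 - 2 - 3
node = 1

# Shortest distances from the chosen node to every node (itself included, at 0)
dist = nx.single_source_shortest_path_length(P, node)

# Closeness = (number of other nodes) / (sum of distances to them)
manual = (len(P) - 1) / sum(dist.values())

builtin = nx.closeness_centrality(P)[node]
print('Manual:', manual, 'Built-in:', builtin)
```

The two values agree here because the graph is connected; on a disconnected graph NetworkX applies an additional scaling that this hand calculation does not include.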

In [None]:
# Betweenness centrality - share of shortest paths between other node pairs that pass through the node, normalized to a 0-1 scale
bc = nx.betweenness_centrality(G)
sorted_keys = sorted(bc, key=bc.get, reverse=True)
for k in sorted_keys:
    print(k, 'has betweenness centrality', bc[k])
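
A small star graph makes the normalization easy to check by hand. In this sketch (an illustrative example, not the Friends data), every shortest path between two leaves runs through the hub, so the hub scores 1 and each leaf scores 0.

```python
import networkx as nx

# Star graph: hub 0 connected to leaves 1, 2, 3 (illustrative example)
S = nx.star_graph(3)
bc_star = nx.betweenness_centrality(S)

for node in sorted(bc_star, key=bc_star.get, reverse=True):
    print(node, 'has betweenness centrality', bc_star[node])
```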

### Directed graph (Followers)

In [None]:
# Load graph from CSV file with no header
# Passing the filename lets NetworkX open and close the file itself
D = nx.read_edgelist('Follows.csv', delimiter=',', nodetype=str, create_using=nx.DiGraph())
print(D)

In [None]:
# Display graph
plt.figure(figsize=(8,8))
plt.margins(x=0.1, y=0.1)
nx.draw(D, with_labels=True, node_size=1500, arrows=True, node_color='c')

In [None]:
# Number of follows and followers
follows = D.out_degree
followers = D.in_degree
print('Number of follows: ', follows)
print('Number of followers: ', followers)
# Degree views can be indexed by node, like a dictionary
# for n in D:
#    print(n, 'follows', follows[n], 'and has', followers[n], 'followers')
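
The sketch below shows the same in-degree and out-degree lookups on a tiny directed graph. The names and edges are invented for illustration; an edge (a, b) is read as "a follows b", matching the use of out-degree for follows above.

```python
import networkx as nx

# Tiny directed graph with invented names: an edge (a, b) means "a follows b"
T = nx.DiGraph([('Ann', 'Bob'), ('Cat', 'Bob'), ('Bob', 'Ann')])

for n in T:
    # out_degree counts edges leaving n, in_degree counts edges arriving at n
    print(n, 'follows', T.out_degree[n], 'and has', T.in_degree[n], 'followers')
```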

In [None]:
# Reciprocity - people that follow each other (see what's wrong and fix it)
for n in D:
    print(n, list(D.neighbors(n)))
#    for n2 in list(D.neighbors(n)):
#        if n in list(D.neighbors(n2)):
#            print(n, 'and', n2, 'follow each other')

In [None]:
# Cycles
cycles = nx.simple_cycles(D)
for c in cycles:
    print(c)
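
A simple cycle is a directed path that returns to its starting node without repeating any other node. The sketch below, on an invented five-edge graph, contains one cycle of length 2 and one of length 3; the order in which the cycles and their nodes are reported is not assumed here.

```python
import networkx as nx

# Invented example: A and B point at each other, and B -> C -> D -> B
D2 = nx.DiGraph([('A', 'B'), ('B', 'A'), ('B', 'C'), ('C', 'D'), ('D', 'B')])

# Convert each cycle to a set so the comparison ignores the starting node
found = [set(c) for c in nx.simple_cycles(D2)]
for c in found:
    print(len(c), 'nodes:', sorted(c))
```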

In [None]:
# Alternative reciprocity
cycles = nx.simple_cycles(D)
for c in cycles:
    if len(c) == 2:
        print(c[0], 'and', c[1], 'follow each other')

### <font color="green">**Your Turn: Dolphins Data**</font>

In [None]:
# Load the dolphin friends data and print the graph
# Passing the filename lets NetworkX open and close the file itself
G = nx.read_edgelist('Dolphins.csv', delimiter=',', nodetype=str)
print(G)
nx.draw(G, node_size=100)

<b>Friendliest dolphin:</b><br><i>Find the dolphin with the most friends. Don't worry about ties - there's one who's the friendliest. Print the dolphin's identifier and the number of friends the dolphin has.</i>

In [None]:
# Reminder: G.degree yields (node, degree) pairs, where degree is the number of edges incident on the node
# Hint: Iterate through G.degree, keeping track of the dolphin with the most
# friends and how many friends it has
# YOUR CODE HERE

<b>Dolphin friend recommendation, a type of "link prediction":</b><br>
<i>Find all pairs of dolphins who are not friends but have at least four friends in common. Print each pair only once, and include a list of the friends they have in common.</i>

In [None]:
# The following code finds the common friends of all pairs of dolphins in the graph
# (without eliminating self-pairs or reverse-pairs).
# You can use it as a starting point for solving the problem.
for n1 in G:
    for n2 in G:
        common = set(G.neighbors(n1)) & set(G.neighbors(n2)) # intersection of the two neighbor sets
        print('Dolphins', n1, 'and', n2, 'have friends', list(common), 'in common')