This notebook explores survey data in which an anonymised group of students answered a number of questions about:

- how well they know each of the other students (scale from 1 = don't know them, to 5 = we're best friends)
- their student experiences (scale from 1 = I didn't think about doing it, to 5 = I do it almost every day)
- their personality traits (scale from 1 = Not like me at all, to 7 = A lot like me)

The columns in the file are:
- unique name
  - What is your name?
- personality questions
  - "Anxious, easily upset"
  - "Calm, emotionally stable"
  - "Conventional, uncreative"
  - "Critical, quarrelsome"
  - "Dependable, self-disciplined"
  - "Open to new experiences, complex"
  - "Disorganized, careless"
  - "Extraverted, enthusiastic"
  - "Sympathetic, warm"
  - "Reserved, quiet"
- experience questions
  - Cooked a meal with others
  - Climbed Arthur's Seat
  - Cycled
  - Danced to the music with others
  - Go to Commonwealth Swimming pool
  - Go to the university gym (pleasance)
  - Learned some words in a completely new language
  - Made new friends for life
  - Performed in a team sports
  - Read a fiction book
  - Saw a long-haired ginger Scottish cow
  - Spoke to a person born in Edinburgh
  - Tried eating haggis
  - Visited Botanic Gardens
  - Visited Castle
  - Walk around the meadows
  - Went out for a meal with friends
  - Went to the cinema
  - Went to the Student Union
  - Went to the university library
- familiarity with other students
  - p0,p1,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p2,p20,p21,p22,p23,p24,p25,p26,p27,p28,p29,p3,p4,p5,p6,p7,p8,p9

Each row of data then looks like this:

```
p0,6,5,2,2,5,7,2,5,5,4,1,1,1,1,0,0,1,1,1,1,1,1,0,1,0,1,1,1,1,1,5,4,3,2,3,3,3,3,3,4,3,4,3,5,3,3,3,3,3,3,1,4,3,4,4,3,3,4,4,1
p1,5,2,5,5,7,7,5,7,7,1,1,1,0,1,0,1,1,1,1,0,0,1,0,0,1,1,1,0,0,1,5,1,5,2,1,2,2,4,5,5,1,4,5,5,5,2,5,1,3,5,5,5,5,5,2,4,4,4,5,5
```
We will draw some graphs and look at some simple connections between these answers.
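
The column list above implies that each row holds one name followed by 10 personality scores, 20 experience answers and 30 familiarity scores. As a quick sanity check, the sketch below splits the first sample row by position using only the standard library. `split_row` is a hypothetical helper, not part of the notebook, and the group sizes are assumptions read off the column list.

```python
def split_row(line, n_personality=10, n_experience=20):
    """Split one CSV data row into name, personality, experience, familiarity."""
    name, *values = line.strip().split(",")
    values = [int(v) for v in values]
    personality = values[:n_personality]
    experience = values[n_personality:n_personality + n_experience]
    familiarity = values[n_personality + n_experience:]
    return name, personality, experience, familiarity


# First sample row shown above
row = ("p0,6,5,2,2,5,7,2,5,5,4,"
       "1,1,1,1,0,0,1,1,1,1,1,1,0,1,0,1,1,1,1,1,"
       "5,4,3,2,3,3,3,3,3,4,3,4,3,5,3,3,3,3,3,3,1,4,3,4,4,3,3,4,4,1")

name, personality, experience, familiarity = split_row(row)
print(name, len(personality), len(experience), len(familiarity))
```

If the three lengths do not come out as 10, 20 and 30 for a row of the real file, the positional slicing used later in the notebook would need adjusting.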

In [None]:
# !pip install --upgrade decorator
# !pip install --upgrade networkx
# !pip install scipy==1.8.0 # please make sure your scipy version is 1.8.0 so that networkx can use scipy.sparse.coo_array
# !pip install --upgrade requests

If the cells below fail with import or version errors, uncomment and run the cell above.

In [None]:
import networkx as nx
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pprint as pp

# Collect network information from dataframe

In [None]:
df = pd.read_csv('data/graph_large.csv', index_col = 0)
print(df)

In [None]:
# Get all columns names
col_names = list(df.columns)
print(col_names)

row_names = df.index.values.tolist()
print(row_names)

In [None]:
# First 10 attributes are personality questions
personality_questions = col_names[:10]
print(personality_questions)

In [None]:
# Last 30 attributes are anonymised students
people = col_names[-30:]
print(people)

In [None]:
# The remaining parts are experience questions
experiences_questions = col_names[10:-30]
print(experiences_questions)

# Build the network with weights

Using `df` and the column groups identified above, build a directed network whose edge weights are the familiarity scores.

In [None]:
DG = nx.DiGraph()
for row, row_values in df.iterrows():
    print('\nNode: ', row)
    # Only the 'person' columns describe connections to other students
    for column, value in row_values[people].items():
        # Optional: only keep strong connections (familiarity > 3)
        if value > 3:
            print('Connected to ', column, ' with weight: ', value)
            DG.add_edge(row, column, weight=value)
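
The loop above depends on the real survey file, so it can be hard to see what it does to a single row. The following is a minimal sketch on an invented 3x3 familiarity table; `toy`, `build_graph`, `threshold` and `keep_self_loops` are assumptions made for illustration. Unlike the cell above, it adds every person as a node first, so people without strong ties still appear, and it can skip a person's rating of themselves.

```python
import networkx as nx
import pandas as pd

# Invented familiarity scores: rows rate columns on the 1-5 scale
toy = pd.DataFrame(
    [[5, 4, 1],
     [2, 5, 5],
     [1, 3, 5]],
    index=["a", "b", "c"],
    columns=["a", "b", "c"],
)


def build_graph(familiarity, threshold=3, keep_self_loops=False):
    """Directed graph with an edge row -> column for every score above threshold."""
    graph = nx.DiGraph()
    graph.add_nodes_from(familiarity.index)  # keep isolated people too
    for source, scores in familiarity.iterrows():
        for target, score in scores.items():
            if score > threshold and (keep_self_loops or source != target):
                graph.add_edge(source, target, weight=int(score))
    return graph


toy_graph = build_graph(toy)
print(sorted(toy_graph.edges(data="weight")))
```

Whether self-ratings should count as edges is a modelling choice; with `keep_self_loops=True` the sketch keeps them, which matches what the notebook's loop does.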

# Visualisation

In [None]:
plt.rcParams['figure.figsize'] = [15, 15]
# Note: we will reuse the same spring layout throughout the graphs
pos = nx.spring_layout(DG)

nx.draw(DG, pos, with_labels= True, node_size = 500)
plt.show()

# Styling plots with graph metrics

In [None]:
degree = nx.degree_centrality(DG)
betweenness = nx.betweenness_centrality(DG)
pagerank = nx.pagerank(DG)
hits = nx.hits(DG)  # tuple of (hub scores, authority scores)
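
For a directed graph, `nx.degree_centrality` counts incoming and outgoing edges together. If the question is who is well known rather than who knows many people, `nx.in_degree_centrality` and `nx.out_degree_centrality` separate the two. A small sketch on an invented chain a -> b -> c (the graph `chain` is an assumption for illustration, not part of the survey data):

```python
import networkx as nx

# Invented chain: a knows b, b knows c
chain = nx.DiGraph([("a", "b"), ("b", "c")])

total = nx.degree_centrality(chain)         # (in-degree + out-degree) / (n - 1)
incoming = nx.in_degree_centrality(chain)   # in-degree / (n - 1)
outgoing = nx.out_degree_centrality(chain)  # out-degree / (n - 1)

for node in chain.nodes():
    print(node, total[node], incoming[node], outgoing[node])
```

In this notebook an edge runs from the person answering to the person they rated, so a high in-degree would suggest someone many others say they know well.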

In [None]:
pp.pprint(degree)

# pos = nx.spring_layout(DG)

size = [value * 1000 for value in degree.values()]

nx.draw(DG, pos, with_labels= True, node_size = size)
plt.title("Size scaled to degree")
plt.show()

In [None]:
pp.pprint(betweenness)

# pos = nx.spring_layout(DG)

size = [value * 10000 for value in betweenness.values()]

nx.draw(DG, pos, with_labels= True, node_size = size)
plt.title("Size scaled to betweenness")
plt.show()

In [None]:
pp.pprint(pagerank)

# pos = nx.spring_layout(DG)

size = [value * 10000 for value in pagerank.values()]

nx.draw(DG, pos, with_labels= True, node_size = size)
plt.title("Size scaled to PageRank")
plt.show()

In [None]:
pp.pprint(hits[0])

# pos = nx.spring_layout(DG)

size = [value * 10000 for value in hits[0].values()]

nx.draw(DG, pos, with_labels= True, node_size = size)
plt.title("Size scaled to hub scores")
plt.show()

In [None]:
pp.pprint(hits[1])

# pos = nx.spring_layout(DG)

size = [value * 10000 for value in hits[1].values()]

nx.draw(DG, pos, with_labels= True, node_size = size)
plt.title("Size scaled to authority scores")
plt.show()

# Look at experience and personality

In [None]:
# This cell can take a long time to render, because it draws one plot per experience question.
# If your notebook freezes completely, use Kernel > Restart in the menu.

print("Edinburgh Experiences")

# pos = nx.spring_layout(DG)

# Node size shows degree centrality; node colour shows the answer to each experience question
size = [value * 500 for value in degree.values()]

# Please be patient: the loop below needs some time to draw all the plots
for experience in experiences_questions:
    # Look answers up per node: DG's node order need not match the row order of df
    colors = ['red' if df.loc[node, experience] == 1 else 'grey' for node in DG.nodes()]
    nx.draw(DG, pos, with_labels= True, node_size = size, node_color = colors)
    plt.title(f"Red where '{experience}' is 1 (size scaled to degree)")
    plt.show()
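
`nx.draw` matches the entries of `node_color` and `node_size` to nodes in the order of `graph.nodes()`, which is the order in which nodes were first added, not necessarily the row order of a DataFrame. A small sketch with invented data (`answers` and `graph` are assumptions for illustration) shows how the two orders can disagree and why a per-node lookup is the safer pattern:

```python
import networkx as nx
import pandas as pd

# Invented yes/no answers, indexed by person in row order a, b, c
answers = pd.Series({"a": 1, "b": 0, "c": 1}, name="Cycled")

graph = nx.DiGraph()
graph.add_edge("c", "a")  # nodes are stored in insertion order: c, a, ...
graph.add_edge("a", "b")  # ... then b

# Colours listed in row order (a, b, c)
row_order = ["red" if value == 1 else "grey" for value in answers]
# Colours listed in node order (c, a, b), looked up per node
node_order = ["red" if answers[node] == 1 else "grey" for node in graph.nodes()]

print(list(graph.nodes()))
print(row_order)
print(node_order)
```

Passing `row_order` to `nx.draw` here would colour `a` and `b` with each other's answers, while `node_order` keeps each colour attached to the right person.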

In [None]:
print("personality")

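# pos = nx.spring_layout(DG)

# Please be patient: the loop below needs some time to draw all the plots

for personality in personality_questions:
    # Look scores up per node: DG's node order need not match the row order of df
    size = [df.loc[node, personality] * 500 for node in DG.nodes()]
    # Note: cmap only has an effect when node_color is given as a list of numbers
    nx.draw(DG, pos, with_labels= True, node_size = size, cmap=plt.cm.Blues)
    plt.title(f"Size scaled to {personality}")
    plt.show()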
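
A colour map such as `plt.cm.Blues` only changes the plot when `node_color` is a list of numbers, one per node. One possible variation, sketched here with invented scores (`scores` and `graph` are assumptions for illustration), is to map each personality score to a shade instead of, or in addition to, a node size:

```python
import networkx as nx

scores = {"a": 2, "b": 7, "c": 4}  # invented scores on the 1-7 scale
graph = nx.DiGraph([("c", "a"), ("a", "b")])

# One number per node, in the graph's own node order
node_color = [scores[node] for node in graph.nodes()]

# These values could then be passed to nx.draw along with the scale limits:
# nx.draw(graph, pos, with_labels=True, node_color=node_color,
#         cmap=plt.cm.Blues, vmin=1, vmax=7)
print(node_color)
```

Fixing `vmin=1` and `vmax=7` to the ends of the answer scale would keep the shades comparable from one question's plot to the next.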

# Do you have any better visualisation ideas?