## Instructions

Execute each cell and write your answers in the spaces provided.

Make sure you update the `__author__` variable's value with your own name.

In [None]:
%matplotlib inline

import networkx as nx
import matplotlib.pyplot as plt
import seaborn as sns

import numpy as np
import pandas as pd

from datetime import datetime
import sys
sys.path.append('../../helper_libraries')

__author__ = "Wonjun Choi"
__completion_time__ = datetime.now()

print("Submission created by {} at {}".format(__author__, __completion_time__))

In [None]:
from utilities import read_UCINET_matrix, get_all_node_metrics, get_all_graph_metrics
from utilities import get_nodes_as_dataframe
from utilities import plot_network, run_all_simulations

In [None]:
sns.set_context("poster")
sns.set_style("ticks")

In [None]:
sheet_name="FRIENDSHIP"
G = read_UCINET_matrix(
    "../../data/krack-high-tec.xlsx",
    sheet_name=sheet_name,
    attribute_file="../../data/high-tec-attributes.xlsx",
    directed=False
)

## Node metrics
a) What are the top 3 nodes based on each of the centrality measures? How do these nodes differ from the ones identified in the directed network in the previous homework?

In [None]:
df_node_metrics = get_all_node_metrics(G)
df = get_nodes_as_dataframe(G)
df
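One way to read the top 3 nodes off each centrality measure is sketched below. It is a minimal illustration that uses plain `networkx` calls on a stand-in graph (`nx.karate_club_graph()`), because the course helpers `get_all_node_metrics` and `get_nodes_as_dataframe` are not shown here; the names `G_demo`, `centralities` and `top3` are hypothetical and not part of the notebook.

```python
import networkx as nx

# Stand-in undirected graph (assumption): in the notebook, use the G
# returned by read_UCINET_matrix(...) instead.
G_demo = nx.karate_club_graph()

# Each call returns a dict mapping node -> score.
centralities = {
    "degree": nx.degree_centrality(G_demo),
    "betweenness": nx.betweenness_centrality(G_demo),
    "closeness": nx.closeness_centrality(G_demo),
    "eigenvector": nx.eigenvector_centrality(G_demo, max_iter=1000),
}

# Sort nodes by score (highest first) and keep the first three per measure.
top3 = {
    name: sorted(scores, key=scores.get, reverse=True)[:3]
    for name, scores in centralities.items()
}

for name, nodes in top3.items():
    print(name, nodes)
```

If the metrics already sit in a DataFrame with one column per measure, `df[column].nlargest(3)` gives the same ranking, assuming the column names match the measures listed in this notebook.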

## Explore relation between node attributes and metrics

Try running the cell below by choosing different values for each of the variables (make sure you omit the quotes where mentioned):

* `x`: 'AGE', 'TENURE'
* `y`: 'degree', 'betweenness', 'closeness', 'eigenvector', 'clustering'
* `hue`: None, 'LEVEL', 'DEPT'
* `col`: None, 'LEVEL', 'DEPT'

Report the relationships observed in the plots below and explain your findings.

b) What patterns do you observe between the node attributes and the node metrics in the plots? Support your answer with screenshots of the plots (you can also save a plot as an image by right-clicking it and selecting “save image as…”).

In [None]:
x="AGE"
y="degree"
hue="LEVEL"
col="DEPT"
g = sns.lmplot(
    x=x,
    y=y,
    hue=hue,
    col=col,
    data=df,
    ci=None,
    truncate=True,
    height=5  # `size` was renamed to `height` in seaborn 0.9
)

## Plot the data based on various properties of the graph

c) Select one node metric (degree, betweenness, closeness, eigenvector, clustering) to size the nodes and one node attribute (DEPT, LEVEL) to color them, then plot the network diagram.

You can do this by setting the values of `size_attribute` and `color_attribute`.

In [None]:
size_attribute="AGE"
color_attribute="LEVEL"
plot_network(
    G,
    node_sizes=size_attribute,
    node_color_col=color_attribute,
    factor=10
)

## Network simulation for comparison with network topology

d) Run the network simulation and assess whether the observed network is likely to have been formed by each of the following: the small world model, the random graph model, and the preferential attachment model. Justify each model you accept and each model you reject.

In [None]:
run_all_simulations(G, iters=20)
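The internals of `run_all_simulations` are not shown in this notebook, so the sketch below is not its implementation. It only illustrates the general idea of such a comparison under stated assumptions: generate graphs from each model with roughly the same size and density as the observed network, then compare average clustering and average shortest path length. The stand-in graph, the rewiring probability `0.1`, and the names `summarize`, `models` and `results` are all hypothetical.

```python
import networkx as nx


def summarize(graph):
    """Return average clustering and average path length of the largest component."""
    giant = graph.subgraph(max(nx.connected_components(graph), key=len))
    return {
        "clustering": nx.average_clustering(graph),
        "path_length": nx.average_shortest_path_length(giant),
    }


# Stand-in for the observed network (assumption): replace with the notebook's G.
observed = nx.karate_club_graph()
n = observed.number_of_nodes()
m = observed.number_of_edges()
p = nx.density(observed)       # edge probability for the random graph model
k = max(2, round(2 * m / n))   # mean degree, used as ring-lattice degree
m_attach = max(1, round(m / n))  # edges added per new node in preferential attachment

models = {
    "random": lambda seed: nx.gnp_random_graph(n, p, seed=seed),
    "small world": lambda seed: nx.watts_strogatz_graph(n, k, 0.1, seed=seed),
    "preferential attachment": lambda seed: nx.barabasi_albert_graph(n, m_attach, seed=seed),
}

# Average each statistic over 20 simulated graphs per model.
results = {}
for name, make in models.items():
    runs = [summarize(make(seed)) for seed in range(20)]
    results[name] = {key: sum(r[key] for r in runs) / len(runs) for key in runs[0]}

print("observed", summarize(observed))
for name, stats in results.items():
    print(name, stats)
```

A model is a plausible match when the observed statistics fall close to the simulated averages; comparing degree distributions as well would give a fuller picture for the preferential attachment model.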

## Download the notebook and upload it to Moodle

e) Download the notebook by selecting “File > Download as > Notebook (.ipynb)”, then upload it to Moodle.