## Assignment 6
This assignment requires you to work with Facebook network data, data preprocessing and `networkx`. Note that this is real data from real people!

### Part 1: Preparing data

The dataset you will be working with is available here: https://snap.stanford.edu/data/egonets-Facebook.html

Your first job is to:

**1. Download the data**

In [1]:
import webget_v2

!python webget_v2.py https://snap.stanford.edu/data/facebook_combined.txt.gz

Downloading.. 


**2. Unpack the data**

In [2]:
import gzip
import shutil

with gzip.open('facebook_combined.txt.gz', 'rb') as f_in:
    with open('facebook_combined.txt', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)

**3. Import the data as an undirected graph in `networkx`**

In [3]:
import networkx as nx
import matplotlib.pyplot as plt

# read_edgelist already returns an undirected nx.Graph; node labels are read as strings
g = nx.read_edgelist('facebook_combined.txt')

### Part 2: Analyse the data

Now, let's take a look at the network you imported. 

By *node degree* we mean the *number of edges to and from a node*. In an undirected network there is no distinction between in-degree and out-degree, so each node has a single degree; in a directed network a node's in-degree and out-degree can differ.

By *graph degree* we mean the *number of edges in the entire network*.
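To make the two definitions concrete, here is a minimal sketch on a hypothetical four-node toy graph (the node names `a` to `d` are illustrative and not part of the Facebook data). It shows the node degree and edge count for an undirected graph, and how in-degree and out-degree separate once the same edge list is read as directed.

```python
import networkx as nx

# Hypothetical toy edge list: a triangle a-b-c plus node d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

# Undirected: every edge touching c counts once towards its degree.
undirected = nx.Graph(toy_edges)
node_degree_c = undirected.degree("c")
edge_count = undirected.number_of_edges()  # "graph degree" as defined above

# Directed: each pair is now read as source -> target,
# so incoming and outgoing edges are counted separately.
directed = nx.DiGraph(toy_edges)
in_degree_c = directed.in_degree("c")
out_degree_c = directed.out_degree("c")

print("undirected degree of c:", node_degree_c)
print("edges in the graph:", edge_count)
print("directed in/out degree of c:", in_degree_c, out_degree_c)
```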

Hand in code that displays:
* **The number of nodes in the network**
* **The number of edges in the network**
* **The average degree in the network**

In [4]:
# Note: nx.info() was removed in NetworkX 3.0, so this cell needs an older release
print(nx.info(g))




Name: 
Type: Graph
Number of nodes: 4039
Number of edges: 88234
Average degree:  43.6910
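
The same three figures can also be computed directly from the graph object, which does not depend on `nx.info`. The sketch below is one possible way to do it; the helper name `summarize` and the toy graph are assumptions made for illustration, and the toy graph stands in for `g`.

```python
import networkx as nx

def summarize(graph):
    """Return (node count, edge count, average degree) for an undirected graph."""
    n = graph.number_of_nodes()
    m = graph.number_of_edges()
    # Each undirected edge adds one to the degree of both endpoints, hence 2 * m.
    avg_degree = 2 * m / n if n else 0.0
    return n, m, avg_degree

# Hypothetical toy graph standing in for the Facebook graph `g`.
toy = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3)])
nodes, edges, avg_degree = summarize(toy)
print(f"Number of nodes: {nodes}")
print(f"Number of edges: {edges}")
print(f"Average degree: {avg_degree:.4f}")
```

For the figures reported above, 2 * 88234 / 4039 ≈ 43.6910, which agrees with the average degree in the output.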


* **A visualisation of the network inside your notebook**

In [None]:
# pygraphviz was not compatible with Windows, and adding it to the mybinder
# requirements did not work either, so the graph will not be the prettiest

def draw_graph(graph):
    nx.draw(graph, 
            node_size=30, width=.05, cmap=plt.cm.Blues, 
            with_labels=True, node_color=range(len(graph)))

draw_graph(g)
plt.show()

### Part 3: Find the most popular people

We're naturally interested in who has the most friends, so we want to extract the **top 10**, that is, the 10 most connected people.

Hand-in:
* __Code that extracts and reports the 10 people with the most connections in the network__

In [None]:
graph_dict = dict(g.degree())
sorted_dict = sorted(graph_dict.items(), key=lambda x: x[1], reverse=True)
top_10_dict = dict(sorted_dict[:10])
print(top_10_dict)