## Part 1.

Analyse your egocentric VK network.

1. Compute centrality measures for your egocentric VK network from HA-2: degree, closeness, betweenness, eigenvector; interpret the ranking results (does ranking make sense?).
2. Find network communities using either the `igraph` or the `python-louvain` package (imported as `community`) and analyse the results: can you identify clusters related to your egocentric network?
3. Plot your network using different layouts (you can use different node characteristics, including clustering labels, as node size/color).

https://python-louvain.readthedocs.io/en/latest/api.html
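
As a starting point for task 1, the four centrality measures can be computed with NetworkX and turned into rankings. This is only a minimal sketch: the Karate Club graph stands in for your own VK network, and `top_nodes` is a hypothetical helper introduced here for illustration.

```python
import networkx as nx

# Stand-in graph; replace it with your own egocentric VK network from HA-2.
G = nx.karate_club_graph()

# One dictionary {node: score} per centrality measure.
centralities = {
    "degree": nx.degree_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
}


def top_nodes(scores, n=5):
    """Return the n highest-scoring nodes, best first (hypothetical helper)."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


ranking = {name: top_nodes(scores) for name, scores in centralities.items()}
for name, nodes in ranking.items():
    print(f"{name:>12}: {nodes}")
```

Comparing the top-ranked nodes across the four measures is one way to judge whether the rankings make sense for your network.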

In [None]:
import networkx as nx
import igraph as ig
from community import community_louvain
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# !pip install python-louvain

In [None]:
G = nx.karate_club_graph()

See the NetworkX drawing routines: https://networkx.org/documentation/stable/reference/drawing.html

In [None]:
# Using spring layout
pos_spring = nx.spring_layout(G)
plt.figure(figsize=(6, 6))
nx.draw(G, pos=pos_spring, with_labels=True, node_color='lightblue', edge_color='gray', node_size=500)
plt.title("Spring Layout");
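
For task 2, community labels can be turned into node colours for the plots. The sketch below uses NetworkX's own Louvain implementation (`louvain_communities`, available in networkx 2.8 and later) as a stand-in for the `python-louvain` package, so treat it as one possible route rather than the required one. The names `node_label` and `node_colors` are introduced here for illustration.

```python
import networkx as nx
from networkx.algorithms import community as nx_comm

# Stand-in graph; replace it with your own network.
G = nx.karate_club_graph()

# Louvain communities as a list of node sets; the seed fixes the random order.
communities = nx_comm.louvain_communities(G, seed=0)

# Map every node to the index of its community.
node_label = {node: idx for idx, comm in enumerate(communities) for node in comm}

# One integer per node, in the order of G.nodes(), usable as a colour value.
node_colors = [node_label[node] for node in G.nodes()]

print("number of communities:", len(communities))
print("modularity:", round(nx_comm.modularity(G, communities), 3))
```

`node_colors` can then be passed as `node_color=node_colors` to `nx.draw`. With `python-louvain`, `community_louvain.best_partition(G)` returns a comparable node-to-community dictionary directly.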

### Community detection using Spectral Clustering

Given a graph $G$ with $n$ nodes, find non-overlapping node "communities": $k$ groups of nodes with many connections inside each group and few connections between groups.

- Compute square diagonal matrix of node degrees $D$. 
    $$D_{ii} = \sum_j A_{ij}, \quad D_{ij} = 0 \text{ for } i \neq j$$
- Construct graph Laplacian 
    $$L_{unnormed} = D - A$$
- Find the smallest eigenvalues $0 = \lambda_0 \leq \lambda_1 \leq \lambda_2 \leq \ldots \leq \lambda_m$ of $L$ and construct the matrix $X$ by stacking the $m$ corresponding eigenvectors $v_1, \ldots, v_m$ as columns. $X$ has size $n \times m$; its rows are the "spectral representation" of the graph nodes.
   
- Run the k-means algorithm on the rows of $X$ and assign each node the cluster label that k-means gives to its row.
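
The steps above can be sketched as follows. This is a minimal illustration under a few assumptions, not a reference solution: edge weights are ignored, the number of eigenvectors is set equal to the number of clusters `k` (the convention of the tutorial linked below), and the constant eigenvector is kept in $X$. Dropping it is an equally valid choice. The function name is hypothetical.

```python
import networkx as nx
import numpy as np
from scipy import linalg
from sklearn.cluster import KMeans


def unnormalized_spectral_clustering(G, k, seed=0):
    """Cluster the nodes of G into k groups using L = D - A."""
    # Binary adjacency matrix; any edge weights are ignored on purpose.
    A = (nx.to_numpy_array(G) > 0).astype(float)
    D = np.diag(A.sum(axis=1))  # diagonal matrix of node degrees
    L = D - A                   # unnormalized graph Laplacian
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = linalg.eigh(L)
    X = eigvecs[:, :k]          # assumption: use the k smallest eigenvectors
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    return labels, eigvals, X


G = nx.karate_club_graph()
labels, eigvals, X = unnormalized_spectral_clustering(G, k=2)
print("smallest eigenvalues:", np.round(eigvals[:3], 4))
print("cluster sizes:", np.bincount(labels))
```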

> You can use numpy/scipy eigenvector routines and sklearn KMeans
---

Implement 3 algorithms described in https://arxiv.org/abs/0711.0189 :
1. Unnormalized spectral clustering 
2. Normalized spectral clustering according to Shi and Malik (2000)
3. Normalized spectral clustering according to Ng, Jordan, and Weiss (2002)
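
The two normalized variants differ from the unnormalized one only in how the spectral embedding is built. The sketch below follows the descriptions in the linked tutorial as I read them: Shi and Malik solve the generalized problem $Lu = \lambda Du$, while Ng, Jordan, and Weiss use $L_{sym} = D^{-1/2} L D^{-1/2}$ and rescale each row of $X$ to unit length. The function name and the `variant` strings are hypothetical, and the code assumes there are no isolated nodes.

```python
import networkx as nx
import numpy as np
from scipy import linalg
from sklearn.cluster import KMeans


def normalized_embedding(A, k, variant):
    """Return the n x k spectral embedding for a normalized variant."""
    deg = A.sum(axis=1)  # assumed strictly positive (no isolated nodes)
    L = np.diag(deg) - A
    if variant == "shi_malik":
        # Generalized eigenproblem L u = lambda D u.
        _, vecs = linalg.eigh(L, np.diag(deg))
        return vecs[:, :k]
    if variant == "ng_jordan_weiss":
        d_inv_sqrt = 1.0 / np.sqrt(deg)
        # L_sym = D^{-1/2} L D^{-1/2}, computed by scaling rows and columns.
        L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
        _, vecs = linalg.eigh(L_sym)
        X = vecs[:, :k]
        # Rescale every row to unit Euclidean norm before k-means.
        return X / np.linalg.norm(X, axis=1, keepdims=True)
    raise ValueError(f"unknown variant: {variant}")


G = nx.karate_club_graph()
A = (nx.to_numpy_array(G) > 0).astype(float)  # binary adjacency matrix

embeddings = {v: normalized_embedding(A, 2, v) for v in ("shi_malik", "ng_jordan_weiss")}
labels = {
    v: KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    for v, X in embeddings.items()
}
for v, lab in labels.items():
    print(v, "cluster sizes:", np.bincount(lab))
```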

## Compare the 3 versions of spectral clustering on the Karate Club dataset using
1. `Adjusted Rand Index` (3 pairwise comparisons),
2. (implement) `Modularity` (3 numbers), where $m$ is the number of edges, $k_i$ is the degree of node $i$, and $c_i$ is the cluster label of node $i$:
$$ Q(A, c) = \frac{1}{2m} \sum_{i,j} \left(A_{ij} - \frac{k_i k_j}{2m}\right) [c_i = c_j] $$
3. Visually, by plotting the nodes in the corresponding 2-dimensional spaces (spanned by eigenvectors).
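
The modularity formula and the pairwise ARI comparison can be sketched as below. The three labelings used here (the recorded club split, the sign of the Fiedler vector, and a single all-in-one cluster) are illustrative placeholders only. Substitute the outputs of your three spectral clustering versions. `pairwise_ari` is a hypothetical helper.

```python
from itertools import combinations

import networkx as nx
import numpy as np
from sklearn.metrics import adjusted_rand_score


def modularity(A, c):
    """Q = 1/(2m) * sum_ij (A_ij - k_i k_j / (2m)) * [c_i == c_j]."""
    A = np.asarray(A, dtype=float)
    c = np.asarray(c)
    k = A.sum(axis=1)                 # node degrees
    two_m = k.sum()                   # 2m: every edge is counted twice
    same = c[:, None] == c[None, :]   # indicator [c_i == c_j]
    return float(((A - np.outer(k, k) / two_m) * same).sum() / two_m)


def pairwise_ari(labelings):
    """ARI for every unordered pair of named labelings."""
    return {
        (a, b): adjusted_rand_score(labelings[a], labelings[b])
        for a, b in combinations(labelings, 2)
    }


G = nx.karate_club_graph()
A = (nx.to_numpy_array(G) > 0).astype(float)  # binary adjacency matrix

# Placeholder labelings; replace them with your three clustering results.
club = np.array([G.nodes[n]["club"] == "Officer" for n in G.nodes()], dtype=int)
L = np.diag(A.sum(axis=1)) - A
fiedler = np.linalg.eigh(L)[1][:, 1]  # eigenvector of the 2nd smallest eigenvalue
labelings = {
    "club": club,
    "fiedler_sign": (fiedler > 0).astype(int),
    "single_cluster": np.zeros(len(club), dtype=int),
}

ari = pairwise_ari(labelings)
for name, lab in labelings.items():
    print(f"Q({name}) = {modularity(A, lab):.4f}")
for pair, score in ari.items():
    print(pair, round(score, 4))
```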

Theoretical questions:

1. Why is the smallest eigenvalue of the unnormalized Laplacian always equal to 0?
2. From a network point of view, what does symmetric normalization do?
3. Under what conditions do the symmetric and random-walk normalizations yield the same result?
  
    
Sources:
- Ng, Jordan, and Weiss paper on spectral clustering https://ai.stanford.edu/~ang/papers/nips01-spectral.pdf
- Tutorial on spectral clustering with multiple theoretical views on the problem https://arxiv.org/abs/0711.0189
- Amazing explanation from James R. Lee https://www.youtube.com/watch?v=8XJes6XFjxM

In [None]:
from sklearn.metrics import adjusted_rand_score, adjusted_mutual_info_score
from sklearn.cluster import KMeans
from scipy import linalg # https://docs.scipy.org/doc/scipy/reference/linalg.html#eigenvalue-problems