<a target="_blank" href="https://colab.research.google.com/github/skojaku/adv-net-sci/blob/main/notebooks/m04-friendship-paradox/exercise.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

In [None]:
# If you are using Google Colab, uncomment the following lines to install igraph
# !sudo apt install libcairo2-dev pkg-config python3-dev
# !pip install pycairo cairocffi
# !pip install igraph

In [None]:
import igraph
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Hands-on: Degree Distribution 

Let's create a degree-heterogeneous network, namely a Barabasi-Albert network.

In [None]:
g = igraph.Graph.Barabasi(n=3000, m=10)

Let's compute the degree of each node. You can use `Graph.degree` to get the degree of each node, or alternatively compute it from the adjacency matrix (via `Graph.get_adjacency`).
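As a minimal sketch of the adjacency-matrix route, the snippet below uses a small hand-written matrix rather than the Barabasi-Albert graph. The matrix `A_toy` and the name `degree_from_A` are hypothetical and exist only for illustration. For an undirected, unweighted graph, each row sum counts the neighbors of one node.

```python
import numpy as np

# Hypothetical toy graph (not the network used in this notebook):
# node 1 is connected to nodes 0, 2, and 3.
A_toy = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
])

# Row sums of a symmetric 0/1 adjacency matrix give the node degrees
degree_from_A = A_toy.sum(axis=1)
print(degree_from_A)  # [1 3 1 1]
```

The same row-sum idea should carry over to the adjacency matrix of `g` once it has been converted to a NumPy array.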

In [None]:
degree = g.degree()

# Alternatively, you can compute the degree from the adjacency matrix:
# A = np.array(g.get_adjacency().data)
# degree = A.sum(axis=1)

Let's plot the degree distribution using a simple histogram. To do that, we compute the *frequency* of each degree value.  
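One possible way to count frequencies is `np.bincount`, sketched here on a hypothetical toy degree list (`toy_degree` is not part of the notebook). The result is indexed by degree value, so entry `k` holds the fraction of nodes with degree `k`. This is only one option, and other approaches such as `np.unique` with `return_counts=True` would work as well.

```python
import numpy as np

# Hypothetical degrees of seven nodes, for illustration only
toy_degree = np.array([1, 2, 2, 3, 3, 3, 5])

# counts[k] = number of nodes whose degree equals k (k = 0, 1, ..., max degree)
counts = np.bincount(toy_degree)

# Normalize the counts so that the frequencies sum to one
toy_p_deg = counts / counts.sum()

print(counts)  # [0 1 2 3 0 1]
```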

In [None]:
# Compute the frequency of each degree value
p_deg = ...

In [None]:
# Plot
fig, ax = plt.subplots(figsize=(8, 5))

...

Let's plot it on a log-log scale.

Now let's plot the **complementary cumulative distribution function (CCDF)** of the degree.
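A minimal sketch of one way to obtain a CCDF indexed by degree value, again on a hypothetical toy degree list: take one minus the cumulative sum of the degree frequencies. The names `toy_degree`, `toy_p_deg`, and `toy_ccdf` are illustrative assumptions. The last lines cross-check the result against the definition, the fraction of nodes with degree strictly greater than `k`.

```python
import numpy as np

# Hypothetical degrees of seven nodes, for illustration only
toy_degree = np.array([1, 2, 2, 3, 3, 3, 5])

# Frequency of each degree value, indexed by degree
counts = np.bincount(toy_degree)
toy_p_deg = counts / counts.sum()

# toy_ccdf[k] = fraction of nodes with degree strictly greater than k
toy_ccdf = 1.0 - np.cumsum(toy_p_deg)

# Cross-check against the definition, one degree value at a time
direct = np.array([np.mean(toy_degree > k) for k in range(len(counts))])
print(np.allclose(toy_ccdf, direct))  # True
```

Note that the entry at the maximum degree is zero up to rounding, so it cannot be drawn on a logarithmic axis.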

In [None]:
ccdf_deg = ... # Compute the CCDF

In [None]:
fig, ax = plt.subplots(figsize=(8, 5))

ax = sns.lineplot(x=np.arange(len(ccdf_deg)), y=ccdf_deg, ax=ax)

ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('Degree')
ax.set_ylabel('CCDF')
ax.set_title('CCDF: Smooth Power-Law Visualization')

`seaborn` offers a convenient function to plot the CCDF.

In [None]:
fig, ax = plt.subplots(figsize=(8, 5))
ax = sns.ecdfplot(degree, complementary=True, log_scale=(True, True), ax=ax)

ax.set_xlabel('Degree')
ax.set_ylabel('CCDF')

## Exercise 

The Barabasi-Albert network is a scale-free network, which means that the degree distribution follows a power law, i.e., 

$$
P(k) \propto k^{-\gamma}
$$

where $\gamma$ is the power-law exponent. From the figure above, how can we estimate the power-law exponent?  Write your derivation in the markdown cell below, or hand-write it. Then, identify the power-law exponent from the plot.

**Hint**:

Derive the analytical form of the CCDF of the power-law distribution and fit it to the data.

The CCDF is given by $F(k) = P(k' > k)$, i.e., the fraction of data points whose value is greater than $k$. Treating $k$ as a continuous variable, it can be written as

$$
F(k) = \int_k^\infty P(k') dk' 
$$

## Discussion: What could be wrong? 

While the above method for identifying the power-law exponent is useful for understanding degree heterogeneity, reading the exponent off a plot is not good practice. Why?