<a href="https://colab.research.google.com/github/jjcrofts77/TMB-MATH34041/blob/main/content/notebooks/Chapter2/RandomGraphs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 2.1 Random Graphs

The most commonly employed null model is the Erd\H{o}s-R\'enyi (ER) random graph, which we denote by $G(n,p)$\footnote{Strictly speaking this is the Gilbert model but it is typically referred to as the ER model; the two are equivalent for large $n$.}. To construct an ER random graph we start with a set of $n$ isolated nodes and connect each pair of nodes independently with some prespecified probability $p>0$; in practice we assign to each pair of nodes a uniformly distributed random number $r\in[0,1]$ and place a link between the pair if $r<p$. See Figure \ref{fig:ERexamples} for an illustration of ER networks with $n=20$ nodes and a variety of $p$ values.

\begin{figure}
\centering{
\subfigure[$p=0.25$]{
\includegraphics[width=0.25\textwidth]{Figures/erdrey_p025.pdf}
}
\hfill
\subfigure[$p=0.5$]{
\includegraphics[width=0.25\textwidth]{Figures/erdrey_p050.pdf}
}
\hfill
\subfigure[$p=0.75$]{
\includegraphics[width=0.25\textwidth]{Figures/erdrey_p075.pdf}
}
\caption{Example ER random networks for $n=20$} \label{fig:ERexamples}
}
\end{figure}
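
The construction described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the function name `er_graph` and its `seed` argument are assumptions introduced here and are not part of the notes.

```python
import itertools
import random


def er_graph(n, p, seed=None):
    """Return the edge list of one G(n, p) realisation on nodes 0..n-1."""
    rng = random.Random(seed)
    edges = []
    # Visit each unordered pair of nodes exactly once.
    for i, j in itertools.combinations(range(n), 2):
        r = rng.random()  # uniform random number in [0, 1)
        if r < p:         # link the pair with probability p
            edges.append((i, j))
    return edges


# The three p values used in the figure, each with n = 20 nodes.
for p in (0.25, 0.5, 0.75):
    edges = er_graph(20, p, seed=1)
    print(f"p = {p}: {len(edges)} edges (expected {20 * 19 * p / 2})")
```

Each call produces a single member of the ensemble, so the edge counts fluctuate from one seed to the next around the expected value.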

We list some of the most important properties of ER random networks below:
\begin{enumerate}
 \item The expected number of edges of a network in the $G(n,p)$ ensemble is $\langle m\rangle = \frac{n(n-1)p}{2}$.
 \item The expected node degree is $\langle k\rangle = (n-1)p$.
 \item For large $n$ and fixed $\langle k\rangle$ the degree distribution is approximately Poisson, $p(k) = \frac{e^{-\langle k\rangle}\langle k\rangle^k}{k!}$, as illustrated in Figure \ref{fig:ERDD} for ER networks with $1000$ nodes and $4000$ links. The solid line represents the expected distribution and the dots represent the values averaged over 100 realisations.
 \item The characteristic path-length for large $n$ is
 \[
  \langle l \rangle = \frac{\ln{n}-\gamma}{\ln{(pn)}} +\frac{1}{2}
 \]
 where $\gamma\approx 0.577$ is the so-called Euler-Mascheroni constant.
 \item The average clustering coefficient is $\langle C\rangle = p$.
 \item The spectral density of an ER random network follows Wigner's semicircle law. That is, almost all the eigenvalues of the adjacency matrix of an ER network lie in the range $[-2r,2r]$, where $r=\sqrt{np(1-p)}$, and within this range the density function is given by
 \[
  \rho(\lambda) = \frac{\sqrt{4r^2-\lambda^2}}{2\pi r^2},
 \]
 which reduces to $\sqrt{4-\lambda^2}/(2\pi)$ on $[-2,2]$ once the eigenvalues are rescaled by $r$.
\end{enumerate} 
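
Properties 1, 2 and 5 can be checked numerically by averaging over a small sample from the ensemble. The sketch below is illustrative only: the helper names, the choice $n=200$, $p=0.05$ and 20 realisations, and the convention that nodes of degree less than two contribute zero clustering are all assumptions made here.

```python
import itertools
import random


def er_adjacency(n, p, rng):
    """Adjacency sets of one G(n, p) realisation."""
    adj = [set() for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj:
        k = len(nbrs)
        if k < 2:
            continue
        # Count the links that exist among the neighbours of this node.
        links = sum(1 for u, v in itertools.combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def ensemble_averages(n, p, runs, seed=0):
    """Average edge count, mean degree and clustering over several realisations."""
    rng = random.Random(seed)
    m_sum = k_sum = c_sum = 0.0
    for _ in range(runs):
        adj = er_adjacency(n, p, rng)
        m = sum(len(nbrs) for nbrs in adj) / 2  # each edge is counted twice
        m_sum += m
        k_sum += 2 * m / n
        c_sum += average_clustering(adj)
    return m_sum / runs, k_sum / runs, c_sum / runs


n, p = 200, 0.05
m_avg, k_avg, c_avg = ensemble_averages(n, p, runs=20)
print(f"<m> = {m_avg:.1f}   n(n-1)p/2 = {n * (n - 1) * p / 2:.1f}")
print(f"<k> = {k_avg:.2f}   (n-1)p    = {(n - 1) * p:.2f}")
print(f"<C> = {c_avg:.4f}   p         = {p}")
```

The sample averages should sit close to, but not exactly on, the theoretical values, with the gap shrinking as the number of realisations grows.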

\bigskip
\noindent\textbf{Exercise:}  Prove property 5 above.

\begin{figure}
\centering
 \includegraphics[scale=0.5]{Figures/ER_Poisson}
 \caption{Empirical Erd\H{o}s-R\'enyi degree distribution.}\label{fig:ERDD}
\end{figure}

\bigskip
In the following we consider property 3 in some detail; the remaining properties are considered further in the problem sheets, with the exception of property 4, which is slightly more complicated.

It is important to note that $G(n,p)$ does not represent a single network but rather an ensemble of networks, in which a particular network $G$ with $m$ edges occurs with probability
\[
P(G) = p^m(1-p)^{\frac{n(n-1)}{2}-m}.
\]
Since there are ${\frac{n(n-1)}{2}\choose m}$ distinct networks on $n$ labelled nodes with exactly $m$ edges, it follows that the probability of an ER random network having $m$ edges is given by
\[
P(m) = {\frac{n(n-1)}{2}\choose m}p^m(1-p)^{\frac{n(n-1)}{2}-m},
\]
i.e. it follows a binomial distribution, from which properties 1 and 2 follow (almost!) immediately. Using similar arguments to those given above we can show that the probability of a node having degree $k$ is given by
\[
P(k) = {n-1 \choose k}p^k(1-p)^{(n-1)-k}.
\]
The above gives us property 2 (again) for free (why?). 

Now, recall that for large $n$ and small $p$, with $\langle k\rangle = (n-1)p$ held fixed, the binomial distribution given above is well approximated by the Poisson distribution. To see this, note that
\begin{align*}
 {n-1\choose k}p^k(1-p)^{n-1-k} &= \frac{(n-1)(n-2)\dots(n-k)}{k!}\left(\frac{\langle k\rangle}{n-1}\right)^k\left(1-\frac{\langle k\rangle}{n-1}\right)^{n-1-k}\\
 &\approx \frac{\langle k\rangle^k}{k!}\left(1-\frac{\langle k\rangle}{n-1}\right)^{n-1} \quad\text{if $k$ is small relative to $n$}\\
 &\approx  \frac{\langle k\rangle^k}{k!}e^{-\langle k\rangle} \quad\text{if $n$ is large}.
\end{align*}
The exponential factor comes from
\begin{align*}
\ln\left(1-\frac{\langle k\rangle}{n-1}\right)^{n-1} &= (n-1)\ln\left(1-\frac{\langle k\rangle}{n-1}\right)\\ 
&= (n-1)\left(-\frac{\langle k\rangle}{n-1} - \frac{1}{2}\frac{\langle k\rangle^2}{(n-1)^2} - \cdots \right)\\
&\approx -\langle k\rangle \quad\text{for large $n$}.
\end{align*}
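
The quality of this approximation can be checked numerically. The following sketch compares the exact binomial degree distribution with its Poisson limit at fixed $\langle k\rangle$ and increasing $n$; the function names, the value $\langle k\rangle = 8$ and the cut-off `k_max` are illustrative assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """Exact P(k) for the degree of a node in G(n, p): Binomial(n - 1, p)."""
    if k > n - 1:
        return 0.0  # a node cannot have more than n - 1 neighbours
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_pmf(k, mean):
    """Poisson probability of k with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def max_abs_gap(n, mean, k_max=30):
    """Largest pointwise difference between the two distributions for k <= k_max."""
    p = mean / (n - 1)  # keep <k> = (n - 1) p fixed as n varies
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean))
        for k in range(k_max + 1)
    )


mean_degree = 8.0
for n in (20, 100, 1000, 10000):
    print(f"n = {n:>5}: max |binomial - Poisson| = {max_abs_gap(n, mean_degree):.2e}")
```

The printed gap should shrink as $n$ grows with $\langle k\rangle$ held fixed, which is the regime in which the two approximation steps above are justified.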

Note that for large random networks this says that the degrees are concentrated around the mean $\langle k\rangle$, with very large degrees being extremely unlikely, i.e. the degree distribution is highly homogeneous. Much of the early work on complex networks was motivated by the fact that real-world networks, biological networks in particular, were found to have highly heterogeneous degree distributions; that is, a much broader range of degrees was present than predicted by the ER model.
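
One way to quantify this homogeneity is through the spread of the degrees relative to their mean. The short calculation below is a sketch that relies only on the standard fact that a Poisson distribution has variance equal to its mean; the symbol $\sigma_k$ for the standard deviation of the degree is introduced here for illustration. The numerical values use the networks with $1000$ nodes and $4000$ links from property 3, for which $\langle k\rangle = 2m/n = 8$.
\begin{align*}
\sigma_k^2 = \langle k\rangle \quad &\Longrightarrow\quad \frac{\sigma_k}{\langle k\rangle} = \frac{1}{\sqrt{\langle k\rangle}},\\
\langle k\rangle = 8 \quad &\Longrightarrow\quad \sigma_k = \sqrt{8}\approx 2.83, \qquad \frac{\sigma_k}{\langle k\rangle}\approx 0.35.
\end{align*}
The relative spread therefore decreases as the mean degree increases, and the factor $1/k!$ in the Poisson distribution makes degrees far above $\langle k\rangle$ extremely rare.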