### Step 1: Reading Edge Data
This step reads the edges file, in which each line holds a pair of node IDs (node1 and node2) and represents an undirected, unweighted edge between those two nodes. An adjacency matrix is then created for the original network, where:

$$
A[i][j] =
\begin{cases}
1 & \text{if there is an edge between nodes } i \text{ and } j, \\
0 & \text{otherwise.}
\end{cases}
$$

The degree of node \(i\) is the sum of row \(i\) of the adjacency matrix:

$$
\text{Degree}(i) = \sum_{j=1}^{n} A[i][j].
$$
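
As a minimal sketch of this step, assuming a tiny hand-written edge list rather than the actual edges file, the adjacency matrix and the degrees can be built as follows. The names `build_adjacency` and `toy_edges` are illustrative and do not appear in the notebook code.

```python
import numpy as np

def build_adjacency(edges, n):
    """Return the symmetric 0/1 adjacency matrix of an undirected graph on n nodes."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = 1
        A[j, i] = 1  # undirected: mirror every edge
    return A

# Hypothetical toy graph: a path 0-1-2 plus an isolated node 3.
toy_edges = [(0, 1), (1, 2)]
A = build_adjacency(toy_edges, n=4)

# Degree(i) is the sum of row i of A.
degrees = A.sum(axis=1)
print(degrees.tolist())  # [1, 2, 1, 0]
```

The isolated node 3 shows how a node that never appears in the edge list ends up with degree 0.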

---

### Step 2: Gilbert Random Graph
A Gilbert random graph \(G(n, p)\) is generated by including each possible edge independently with probability \(p\). Here \(p\) is set to the edge density of the original network:

$$
p = \frac{2m}{n(n-1)},
$$

where:
- \(m\): Total number of edges in the original network.
- \(n\): Total number of nodes.

**A hardcoded probability can be used instead if needed; the run shown in the code below uses \(p = 0.1\).**
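
The following sketch illustrates the formula for \(p\) and one way to sample a Gilbert graph without an explicit double loop. It is an assumption-laden illustration, not the notebook's implementation: `edge_probability`, `gilbert_vectorized`, the toy values of `m` and `n`, and the fixed seed are all hypothetical.

```python
import numpy as np

def edge_probability(m, n):
    """Density of a simple undirected graph: m edges out of n(n-1)/2 possible pairs."""
    return 2 * m / (n * (n - 1))

def gilbert_vectorized(n, p, rng):
    """Sample a G(n, p) adjacency matrix, deciding each pair i < j exactly once."""
    upper = np.triu(rng.random((n, n)) < p, k=1)  # keep only the strict upper triangle
    return (upper | upper.T).astype(int)          # mirror it to make the matrix symmetric

p = edge_probability(m=3, n=4)   # 3 edges among 6 possible pairs
print(p)                         # 0.5

rng = np.random.default_rng(0)   # fixed seed, chosen only for repeatability
G = gilbert_vectorized(5, p, rng)
```

Drawing the whole matrix at once uses more memory than the loop, but it avoids millions of Python-level iterations when \(n\) is in the thousands.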

---

## Graph Representation

### Left Plot: Degree Distribution Comparison
- **Blue bars**: Show the normalized degree distribution of the original network. This reflects how real-world connections are distributed.
- **Red bars**: Show the degree distribution of a single Gilbert random graph. Random graphs typically have a Poisson-like distribution, where most nodes have a degree close to the average.

### Right Plot: Degree Distribution (100 Instances)
- **Blue bars**: Same as in the left plot, representing the original network.
- **Green bars**: Represent the average degree distribution of 100 random graphs. This helps smooth out fluctuations and better represents the typical behavior of a Gilbert random graph.
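
The "Poisson-like" shape mentioned above follows from the model itself: in \(G(n, p)\) each node's degree is Binomial\((n-1, p)\), which is close to a Poisson distribution with mean \((n-1)p\) when \(n\) is large and \(p\) is small. The sketch below compares the two; the values of `n` and `p` are arbitrary assumptions chosen for illustration, not the notebook's parameters.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(degree = k) for one node of G(n, p): k of the other n - 1 nodes are neighbours."""
    return comb(n - 1, k) * p**k * (1 - p)**(n - 1 - k)

def poisson_pmf(k, lam):
    """Poisson probability of k with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.005       # hypothetical sparse graph
lam = (n - 1) * p        # expected degree

# Print degree, exact binomial probability, and Poisson approximation.
for k in range(0, 11, 2):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```

Averaging the empirical distribution over many sampled graphs, as in the right plot, should approach the binomial curve.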

---

## Interpretation

### Left Plot
- The **original network** appears sparse or disconnected: its degree distribution shows that almost all nodes have very low degrees. This can also be seen in the network stats file from Question 1.
- The **Gilbert random graph** exhibits the expected random behavior and provides a basis for comparison. Its degree distribution is concentrated around its mean degree \((n-1)p\), which equals the average degree of the original graph when \(p\) is computed as in Step 2.
- To verify this further, check the input data and confirm that the adjacency matrix is constructed correctly.

### Right Plot
- As in the left plot, the original network's distribution is heavily skewed, with most nodes having degree 0.
- The average over 100 random graphs gives a smoother distribution that better reflects the typical behavior of a Gilbert random graph.

---

**The random graphs can have a different maximum degree from the original graph, which would produce degree-distribution arrays of different lengths and cause errors when they are combined. To avoid this, the maximum degree of the original graph is used as the common range when computing the degree distribution of each random graph.**
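
A small sketch of this fixed-range approach, using made-up degree lists, is shown below. The function name `degree_distribution` and the sample data are hypothetical; `np.bincount` with `minlength` is used here as one possible way to obtain arrays of equal length.

```python
import numpy as np

def degree_distribution(degrees, max_degree):
    """Normalized histogram over degrees 0..max_degree; larger degrees are dropped."""
    degrees = np.asarray(degrees, dtype=int)
    kept = degrees[degrees <= max_degree]
    counts = np.bincount(kept, minlength=max_degree + 1)
    return counts / len(degrees)

original = [0, 1, 1, 2]   # hypothetical degrees, maximum is 2
random_g = [1, 2, 3, 3]   # hypothetical degrees, maximum is 3

d_orig = degree_distribution(original, max_degree=2)
d_rand = degree_distribution(random_g, max_degree=2)

print(d_orig.tolist())    # [0.25, 0.5, 0.25]
print(d_rand.tolist())    # [0.0, 0.25, 0.25]
print(d_rand.sum())       # 0.5, because the two degree-3 nodes fall outside the range
```

Both arrays have the same length, so they can be stacked and averaged. The trade-off is that any random-graph node whose degree exceeds the original maximum is left out of the plotted distribution.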


**NOTE: In Question 5, I have visualized all three graphs used in Q1, Q2, and Q3 of Assignment 1, each with 3 different layouts. Each layout is saved as both a PNG and a PDF file; the PDF is a high-definition file that can be zoomed to see the nodes, edges, and weights clearly. The Cytoscape session file is also attached to the submission.**

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter

In [3]:
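def data(filename):
    """Read an edge list with one whitespace-separated 'node1 node2' pair per line."""
    edges = []
    with open(filename, 'r') as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines instead of failing on unpacking
            node1, node2 = map(int, line.split())
            edges.append((node1, node2))
    return edges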

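def allNodes(edges):
    """Return the sorted node IDs, padded with every ID in 0..3436."""
    nodes = set()
    for node1, node2 in edges:
        nodes.add(node1)
        nodes.add(node2)
    # IDs in this range that never appear in the edge file become degree-0 nodes.
    nodes |= set(range(3437))
    return sorted(nodes)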

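def gilbert(n, p):
    """Sample a Gilbert G(n, p) graph and return its symmetric adjacency matrix."""
    adj = np.zeros((n, n))
    for i in range(n):
        # Visit each unordered pair (i, j) once and add the edge with probability p.
        for j in range(i + 1, n):
            if np.random.random() < p:
                adj[i][j] = 1
                adj[j][i] = 1
    return adj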

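def degDis(adjMatrix, maxDegrees=None):
    """Normalized degree distribution over degrees 0..maxDegrees."""
    # Row sums are floats; cast to int so they serve as clean histogram keys.
    degrees = np.sum(adjMatrix, axis=1).astype(int)
    degree_counts = Counter(degrees.tolist())
    if maxDegrees is None:
        maxDegrees = int(max(degrees))
    # Degrees above maxDegrees are not counted, so the result can sum to less than 1.
    dist = [degree_counts.get(d, 0) for d in range(maxDegrees + 1)]
    return np.array(dist) / len(degrees)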

edges = data('1684.edges')
nodes = allNodes(edges)
n = len(nodes)
nodeIdx = {node: idx for idx, node in enumerate(nodes)}

nodeMat = np.zeros((n, n))
for edge in edges:
    i, j = nodeIdx[edge[0]], nodeIdx[edge[1]]
    nodeMat[i][j] = 1
    nodeMat[j][i] = 1

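m = len(edges)
# Hardcoded probability used for this run; the data-derived value from Step 2
# would be p = 2 * m / (n * (n - 1)).
p = 0.1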

orgDis = degDis(nodeMat)
maxDeg = len(orgDis) - 1

randDis = []
for _ in range(100):
    random_graph = gilbert(n, p)

    dist = degDis(random_graph, maxDeg)
    randDis.append(dist)


randDis = np.array(randDis)
avgRD = np.mean(randDis, axis=0)

plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
x = np.arange(len(orgDis))
plt.bar(x, orgDis, alpha=0.5,
        label=f'Original Network (n={n})', color='blue')
single_random_dist = degDis(
    gilbert(n, p), maxDeg)
plt.bar(x, single_random_dist, alpha=0.5, label='Random Graph', color='red')
plt.xlabel('Degree')
plt.ylabel('Normalized Frequency')
plt.title('Degree Distribution Comparison')
plt.legend()

plt.subplot(1, 2, 2)
plt.bar(x, orgDis, alpha=0.5,
        label=f'Original Network (n={n})', color='blue')
plt.bar(x, avgRD, alpha=0.5,
        label='Avg of 100 Random Graphs', color='green')
plt.xlabel('Degree')
plt.ylabel('Normalized Frequency')
plt.title('Degree Distribution (100 Instances)')
plt.legend()

plt.tight_layout()
plt.savefig('degree_distributions.png')
plt.close()

print(f"Number of nodes: {n}")
print(f"Number of edges: {m}")
print(f"Probability p for Gilbert model: {p:.4f}")

Number of nodes: 3437
Number of edges: 28048
Probability p for Gilbert model: 0.1000
