## **Implementation of Barabási-Albert (BA) Algorithm**

1. **Start with an initial fully connected network**
   - A small number (`nG`) of nodes are connected to each other.
2. **Add new nodes one by one** until we reach `n` nodes.
   - Each new node connects to `m` existing nodes; the probability of connecting to node `i` is proportional to its degree (`kᵢ`).
3. **Repeat until all nodes are added.**
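
To make the attachment rule in step 2 concrete, here is a minimal sketch that computes the probability `kᵢ / Σⱼ kⱼ` for every existing node. The helper name `attachment_probabilities` and the toy degree values are assumptions made for illustration; they are not part of the original code.

```python
def attachment_probabilities(degrees):
    """Return {node: k_i / sum_j k_j} for a dict mapping node -> degree."""
    total = sum(degrees.values())
    if total == 0:
        raise ValueError("at least one node must have a non-zero degree")
    return {node: k / total for node, k in degrees.items()}


# Toy example (assumed values): a triangle (degrees 2, 2, 2) plus a
# fourth node attached to node 0, which gives degrees 3, 2, 2, 1.
degrees = {0: 3, 1: 2, 2: 2, 3: 1}
probs = attachment_probabilities(degrees)
print(probs)  # {0: 0.375, 1: 0.25, 2: 0.25, 3: 0.125}
```

Node 0 holds 3 of the 8 edge endpoints, so it is three times as likely to be chosen as the newest node.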

---

## **Code Explanation**

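### **Function: `baGraph(m0, m, N)`**
This function **generates a BA network from scratch**, without calling a built-in BA generator. Here `m0` is the size of the initial network (`nG` above) and `N` is the final number of nodes (`n` above).
To make the selection proportional to degree, maintain a `degree_list` in which each node appears as many times as its degree. The code cell below obtains the same weighting by passing degree-based probabilities to `np.random.choice`.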
- For each new node:
  - Select `m` target nodes.
  - Connect the new node to these `m` target nodes.
  - Update the `degree_list` accordingly.
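
The `degree_list` idea described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `ba_edges_degree_list`, its parameter names, and the choice to redraw duplicate targets are all hypothetical, and redrawing duplicates is not exactly the same sampling scheme as `np.random.choice(..., replace=False)`.

```python
import random


def ba_edges_degree_list(n, n0, m, seed=None):
    """Sketch of a BA generator that uses a repeated-node list.

    n  : final number of nodes
    n0 : size of the initial fully connected network
    m  : edges added per new node
    """
    if not (1 <= m <= n0 <= n) or n0 < 2:
        raise ValueError("need 1 <= m <= n0 <= n and n0 >= 2")
    rng = random.Random(seed)

    # Initial complete graph on nodes 0 .. n0-1.
    edges = [(u, v) for u in range(n0) for v in range(u + 1, n0)]

    # Each node appears once per incident edge, i.e. `degree` times.
    degree_list = [node for edge in edges for node in edge]

    for new in range(n0, n):
        targets = set()
        # A uniform draw from degree_list is a degree-proportional draw
        # over nodes; duplicates are redrawn so the m targets are distinct.
        while len(targets) < m:
            targets.add(rng.choice(degree_list))
        for t in targets:
            edges.append((new, t))
            degree_list.extend((new, t))  # both endpoints gain one degree
    return edges


edges = ba_edges_degree_list(n=100, n0=5, m=2, seed=42)
print(len(edges))  # 10 initial edges + 95 * 2 new edges = 200
```

Because the new node is appended to `degree_list` only after its targets are drawn, it can never select itself.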

### **Main Function: `main()`**
- Runs the code with several hardcoded values of `n`, `nG`, and `m`, generating **100 instances** of the network for each configuration.
- Computes and prints **average clustering coefficient, characteristic path length, and degree statistics** for every instance generated.
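
As a rough illustration of what these three statistics measure, the sketch below computes them on a small hand-built graph using only the standard library. The adjacency-dict representation, the function names, and the toy graph are assumptions for this example; the notebook's own code relies on NetworkX for these metrics.

```python
from collections import deque


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Edges among the neighbours (each one is counted twice in the sum).
        links = sum(len(adj[u] & nbrs) for u in nbrs) / 2
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all node pairs; inf if disconnected."""
    n = len(adj)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < n:
            return float("inf")
        total += sum(dist.values())
    return total / (n * (n - 1))


# Toy graph (assumed): complete graph on nodes 0-3 plus node 4 attached to node 0.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}
degrees = [len(nbrs) for nbrs in adj.values()]

print("average clustering:", average_clustering(adj))
print("characteristic path length:", characteristic_path_length(adj))
print("degree min/mean/max:", min(degrees), sum(degrees) / len(degrees), max(degrees))
```

For this toy graph the values work out to roughly 0.7 for clustering and 1.3 for path length, with degrees ranging from 1 to 4 around a mean of 2.8.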

### **The code below runs 100 instances for each network configuration listed in the main function. The final plot for each configuration is the average over all 100 instances.**
- In other words, for a configuration (m0=5, m=1, N=100), a new network is generated from scratch using `baGraph(m0, m, N)`: it starts with an initial complete graph of m0 = 5 nodes and iteratively adds N - m0 = 95 new nodes, each forming m = 1 new connection based on preferential attachment. The final graph after adding the 95 nodes counts as one instance. 100 such instances are generated, and the average over all 100 is plotted.
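
The averaging step described above might look like the following sketch. The helper names and the toy numbers are made up for illustration: per-step metric curves are averaged element-wise across instances, and the degree distribution is estimated from the final degrees of all instances pooled together.

```python
from collections import Counter


def average_curves(curves):
    """Element-wise mean of equally long per-instance metric curves."""
    length = len(curves[0])
    if any(len(c) != length for c in curves):
        raise ValueError("all curves must have the same length")
    return [sum(col) / len(curves) for col in zip(*curves)]


def pooled_degree_distribution(degree_lists):
    """Estimate P(k) from the final degrees of all instances combined."""
    counts = Counter(k for degs in degree_lists for k in degs)
    total = sum(counts.values())
    return {k: c / total for k, c in sorted(counts.items())}


# Toy data (not real results): 3 instances, 4 growth steps each.
clustering_runs = [
    [1.0, 0.8, 0.6, 0.5],
    [1.0, 0.7, 0.6, 0.4],
    [1.0, 0.9, 0.6, 0.6],
]
avg = average_curves(clustering_runs)
pooled = pooled_degree_distribution([[1, 1, 2, 4], [1, 2, 2, 3]])
print(avg)
print(pooled)  # {1: 0.375, 2: 0.375, 3: 0.125, 4: 0.125}
```

Pooling the degrees before normalising weights every node equally, regardless of which instance it came from.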

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
import time


def baGraph(m0, m, N):
    G = nx.complete_graph(m0)
    metrics = {
        'clustering': [nx.average_clustering(G)],
        'path_length': [nx.average_shortest_path_length(G)],
        'degree_distribution': [dict(G.degree())]
    }

    for i in range(m0, N):
        # Attachment probability of each existing node: k_i / sum_j k_j.
        degrees = dict(G.degree())
        totalDegree = sum(degrees.values())
        probabilities = {node: degree / totalDegree for node, degree in degrees.items()}

        # Draw m distinct targets, weighted by degree.
        targets = np.random.choice(list(degrees.keys()), size=m, replace=False,
                                   p=list(probabilities.values()))

        G.add_node(i)
        for target in targets:
            G.add_edge(i, target)
        # Record each metric exactly once per step so all lists stay aligned.
        metrics['clustering'].append(nx.average_clustering(G))
        try:
            metrics['path_length'].append(nx.average_shortest_path_length(G))
        except nx.NetworkXError:
            # Raised for a disconnected graph: no finite path length exists.
            metrics['path_length'].append(float('inf'))
        metrics['degree_distribution'].append(dict(G.degree()))

    return G, metrics


def instance(m0, m, N, num=100):
    results = {}

    for a in m0:
        for b in m:
            for c in N:
                if b > a:
                    continue
                key = f"m0={a}, m={b}, N={c}"
                print(f"Running {key}...")

                run_results = {
                    'clustering': np.zeros(c - a + 1),
                    'path_length': np.zeros(c - a + 1),
                    'final_degree_dist': []
                }

                for i in range(num):
                    if i % 10 == 0:
                        print(f"  Instance {i}/{num}")
                    G, metrics = baGraph(a, b, c)
                    run_results['clustering'] += np.array(metrics['clustering'])
                    run_results['path_length'] += np.array(metrics['path_length'])
                    final_degrees = metrics['degree_distribution'][-1]
                    run_results['final_degree_dist'].append(final_degrees)

                run_results['clustering'] /= num
                run_results['path_length'] /= num

                all_degrees = []
                for dist in run_results['final_degree_dist']:
                    all_degrees.extend(list(dist.values()))

                degree_counts = Counter(all_degrees)
                totalNodes = sum(degree_counts.values())

                run_results['degree_dist'] = {k: count/totalNodes for k, count in degree_counts.items()}

                
                results[key] = run_results

    return results


def plot(results):
    for params, data in results.items():
        fig, axes = plt.subplots(1, 3, figsize=(15, 5))

        
        params_dict = dict(item.split('=') for item in params.split(', '))
        m0, m, N = int(params_dict['m0']), int(params_dict['m']), int(params_dict['N'])

        
        axes[0].plot(range(m0, N+1), data['clustering'])
        axes[0].set_title(f'Average Clustering Coefficient\n{params}')
        axes[0].set_xlabel('Network Size')
        axes[0].set_ylabel('Clustering Coefficient')
        axes[0].grid(True)

        
        # Keep only finite values, paired with their true network sizes.
        finite = np.isfinite(data['path_length'])
        pathLength = data['path_length'][finite]
        if len(pathLength) > 0:
            axes[1].plot(np.arange(m0, N + 1)[finite], pathLength)
            axes[1].set_title(f'Characteristic Path Length\n{params}')
            axes[1].set_xlabel('Network Size')
            axes[1].set_ylabel('Path Length')
            axes[1].grid(True)
        else:
            axes[1].text(0.5, 0.5, 'No valid path lengths', ha='center', va='center')

        
        degrees = sorted(data['degree_dist'].keys())
        probabilities = [data['degree_dist'][k] for k in degrees]

        axes[2].loglog(degrees, probabilities, 'o-')
        axes[2].set_title(f'Degree Distribution (log-log)\n{params}')
        axes[2].set_xlabel('Degree (k)')
        axes[2].set_ylabel('P(k)')
        axes[2].grid(True, which='both', linestyle='--', alpha=0.7)

        plt.tight_layout()
        plt.savefig(f'baNetwork{params.replace(", ", "_").replace("=", "")}.png')
        plt.close(fig)


def main():
    m0 = [5, 10]
    m = [1, 2, 3]
    N = [100, 200]
    start = time.time()
    results = instance(m0=m0, m=m, N=N, num=100)
    plot(results)
    end = time.time()
    print(f"Total execution time: {end - start:.2f} seconds")


if __name__ == "__main__":
    main()

Running m0=5, m=1, N=100...
  Instance 0/100
  Instance 10/100
  Instance 20/100
  Instance 30/100
  Instance 40/100
  Instance 50/100
  Instance 60/100
  Instance 70/100
  Instance 80/100
  Instance 90/100
Running m0=5, m=1, N=200...
  Instance 0/100
  Instance 10/100
  Instance 20/100
  Instance 30/100
  Instance 40/100
  Instance 50/100
  Instance 60/100
  Instance 70/100
  Instance 80/100
  Instance 90/100
Running m0=5, m=2, N=100...
  Instance 0/100
  Instance 10/100
  Instance 20/100
  Instance 30/100
  Instance 40/100
  Instance 50/100
  Instance 60/100
  Instance 70/100
  Instance 80/100
  Instance 90/100
Running m0=5, m=2, N=200...
  Instance 0/100
  Instance 10/100
  Instance 20/100
  Instance 30/100
  Instance 40/100
  Instance 50/100
  Instance 60/100
  Instance 70/100
  Instance 80/100
  Instance 90/100
Running m0=5, m=3, N=100...
  Instance 0/100
  Instance 10/100
  Instance 20/100
  Instance 30/100
  Instance 40/100
  Instance 50/100
  Instance 60/100
  Instance 70/100
 