unityID - hagrawa2


## 1 - Degree Distribution

For the graphs provided to you, test and report which graphs are scale-free,
namely those whose degree distribution follows a power law, at least asymptotically. That is, the fraction P(k) of
nodes in the network having k connections to other nodes goes for large values of k as

\begin{equation*}
P(k) \sim k^{-\gamma}
\end{equation*}

where γ is a parameter whose value is typically in the range 2 < γ < 3, although occasionally it
may lie outside these bounds.
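
As a sanity check on the definition, the sketch below estimates γ from a degree histogram by fitting a straight line to log P(k) against log k. This is the same log-log least-squares idea used in the analysis cell of this report. The helper name `estimate_gamma` and the synthetic counts are illustrative assumptions, not data from any of the provided graphs.

```python
import math

def estimate_gamma(degrees, counts):
    """Least-squares slope of log P(k) against log k, returned as gamma = -slope."""
    total = sum(counts)
    xs = [math.log(k) for k in degrees]
    ys = [math.log(c / total) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic counts proportional to k^(-2.5); hypothetical, not from the assignment data
degrees = list(range(1, 51))
counts = [1e6 * k ** -2.5 for k in degrees]
gamma = estimate_gamma(degrees, counts)
print(round(gamma, 3))
```

On an exact power law the fit recovers the exponent. On real degree data the estimate is noisier, especially in the sparse tail, so it is best read as a rough indicator.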

In [2]:
import glob
import os

import numpy as np
import pandas as pd
from IPython.display import display

files = glob.glob('degree-outputs/*')

result = {
    'filename': [],
    'alpha': [],
    'scalefree': []
}

for f in files:
    # basename works regardless of path separator or directory depth
    name = os.path.basename(f)
    result['filename'].append(name)

    df = pd.read_csv(f)
    # Show the degree table only for the randomly generated graphs (e.g. gnm1.csv)
    if len(name.split('.')) == 2:
        print(name)
        display(df)

    count = list(df['count'])
    degree = list(df['degree'])
    total_nodes = sum(count)
    # P(k): fraction of nodes that have degree k
    fraction = [float(c) / total_nodes for c in count]
    # Least-squares line through (log k, log P(k)); its slope estimates -gamma
    slope, intercept = np.polyfit(np.log(degree), np.log(fraction), 1)

    # Note: abs() drops the sign of the slope, and the accepted range
    # 1 < alpha < 3.5 is wider than the typical 2 < gamma < 3
    result['alpha'].append(abs(slope))
    result['scalefree'].append('True' if abs(slope) > 1 and abs(slope) < 3.5 else 'False')

df = pd.DataFrame(data=result)
display(df)

gnm2.csv


Unnamed: 0.1,Unnamed: 0,degree,count
0,0,200,36
1,1,201,41
2,2,202,42
3,3,203,31
4,4,204,29
5,5,205,30
6,6,206,21
7,7,207,22
8,8,208,26
9,9,209,21


gnm1.csv


Unnamed: 0.1,Unnamed: 0,degree,count
0,0,12,3
1,1,13,2
2,2,14,5
3,3,15,4
4,4,16,7
5,5,17,8
6,6,18,10
7,7,19,5
8,8,20,10
9,9,21,10


gnp2.csv


Unnamed: 0.1,Unnamed: 0,degree,count
0,0,6,1
1,1,7,2
2,2,8,2
3,3,9,5
4,4,10,7
5,5,11,19
6,6,12,48
7,7,13,60
8,8,14,73
9,9,15,101


gnp1.csv


Unnamed: 0.1,Unnamed: 0,degree,count
0,0,1,4
1,1,2,12
2,2,3,15
3,3,4,12
4,4,5,18
5,5,6,19
6,6,7,11
7,7,8,5
8,8,9,4


Unnamed: 0,filename,alpha,scalefree
0,dblp.graph.large.csv,2.67185,True
1,dblp.graph.small.csv,1.141093,True
2,gnm2.csv,0.864608,False
3,gnm1.csv,1.093619,True
4,youtube.graph.small.csv,2.604728,True
5,amazon.graph.large.csv,2.548099,True
6,gnp2.csv,0.448687,False
7,gnp1.csv,0.063797,False
8,youtube.graph.large.csv,1.73319,True
9,amazon.graph.small.csv,1.678353,True


Answer the following questions:
1. Generate a few random graphs. You can do this using networkx’s random graph generators or GTGraph. Do the random graphs you tested appear to be scale-free?


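| filename | alpha    | scalefree |
|----------|----------|-----------|
| gnm2.csv | 0.864608 | False     |
| gnm1.csv | 1.093619 | True      |
| gnp2.csv | 0.448687 | False     |
| gnp1.csv | 0.063797 | False     |

Ans: Mostly no. gnm2, gnp2, and gnp1 have fitted exponents below 1 and fail the test. gnm1 passes only marginally (alpha ≈ 1.09), well below the typical 2 < γ < 3 range, so it is weak evidence of a power law. Assuming these files come from G(n, m) and G(n, p) generators, as their names suggest, this is expected: such graphs have degrees concentrated around the mean rather than a heavy tail.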
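
For reference, a degree histogram for a random graph can be produced along these lines. This is a minimal standard-library sketch of the G(n, p) model, in which each possible edge is included independently with probability p. The function name `gnp_degree_counts` and the parameters n=200, p=0.05 are assumptions chosen for illustration. They are not the settings used for the gnp files above, and in practice networkx's random graph generators would be the usual choice.

```python
import random
from collections import Counter

def gnp_degree_counts(n, p, seed=None):
    """Sample an undirected G(n, p) graph and return {degree: number of nodes}."""
    rng = random.Random(seed)
    degree = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # each possible edge appears independently
                degree[u] += 1
                degree[v] += 1
    return dict(sorted(Counter(degree).items()))

# Hypothetical parameters; the expected degree is (n - 1) * p, about 10 here
counts = gnp_degree_counts(n=200, p=0.05, seed=1)
for k, c in counts.items():
    print(k, c)
```

The printed degree/count pairs have the same shape as the `degree` and `count` columns read by the analysis cell, so they could be written to a CSV and passed through the same fit.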


2. Do the Stanford graphs provided to you appear to be scale-free?


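| filename                | alpha    | scalefree |
|-------------------------|----------|-----------|
| dblp.graph.large.csv    | 2.671850 | True      |
| dblp.graph.small.csv    | 1.141093 | True      |
| youtube.graph.small.csv | 2.604728 | True      |
| amazon.graph.large.csv  | 2.548099 | True      |
| youtube.graph.large.csv | 1.733190 | True      |
| amazon.graph.small.csv  | 1.678353 | True      |

Ans: Yes. All six Stanford graphs pass the test, although only three of them (dblp.graph.large, youtube.graph.small, and amazon.graph.large) have a fitted exponent inside the typical 2 < γ < 3 range.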

## 2 - Centrality

Answer the following questions about the graph:

1. Rank the nodes from highest to lowest closeness centrality.

In [25]:
df = pd.read_csv('closeness.csv')
df

Unnamed: 0.1,Unnamed: 0,id,closeness
0,0,F,0.071429
1,1,C,0.071429
2,2,H,0.066667
3,3,D,0.066667
4,4,B,0.058824
5,5,E,0.058824
6,6,A,0.055556
7,7,G,0.055556
8,8,I,0.047619
9,9,J,0.034483


2. Suppose we had some centralized data that would sit on one machine but would be
shared with all computers on the network. Which two machines would be the best
candidates to hold this data based on other machines having few hops to access this
data?
   * Ans: F and C, since they have the highest closeness centrality (0.071429 each), so the other machines reach them in the fewest hops on average.
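
To make the ranking criterion concrete, here is a small sketch that computes closeness as the reciprocal of the summed hop distances from a node to every other node, using breadth-first search. Whether `closeness.csv` was produced with exactly this (unnormalized) definition is an assumption, and the 4-node graph `toy` is hypothetical, not the network from the assignment.

```python
from collections import deque

def closeness(graph):
    """Return {node: 1 / (sum of hop distances to all other reachable nodes)}."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search yields hop counts in an unweighted graph
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = 1.0 / total if total else 0.0
    return scores

# Hypothetical 4-node network given as adjacency lists
toy = {
    'A': ['B'],
    'B': ['A', 'C', 'D'],
    'C': ['B', 'D'],
    'D': ['B', 'C'],
}
ranking = sorted(closeness(toy).items(), key=lambda item: item[1], reverse=True)
print(ranking)
```

In this toy graph B is adjacent to every other node, so it ranks first, which is the same reasoning that selects the top-ranked machines as data hosts.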

## 3 - Articulation
Answer the following questions:
1. In this example, which members should have been targeted to best disrupt communication in the organization?

In [27]:
df = pd.read_csv('graphframe_false.csv')
display(df)

Unnamed: 0.1,Unnamed: 0,id,articulation
0,0,Mohamed Atta,1
1,1,Usman Bandukra,1
2,2,Mamoun Darkazanli,1
3,3,Essid Sami Ben Khemais,1
4,4,Djamal Beghal,1
5,5,Nawaf Alhazmi,1
6,6,Raed Hijazi,1


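The members listed above are the articulation points of the network: removing any one of them disconnects part of the graph. They should therefore have been targeted to best disrupt communication in the organization.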
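
The idea of an articulation point can be illustrated with a brute-force check: remove each node in turn and see whether the number of connected components increases. This is only a sketch on a hypothetical 5-node graph. The helper names are assumptions, and this is not necessarily how `graphframe_false.csv` was generated.

```python
def count_components(graph, removed=None):
    """Count connected components, ignoring the node `removed` if given."""
    seen = set()
    components = 0
    for start in graph:
        if start == removed or start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour != removed and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return components

def articulation_points(graph):
    """Nodes whose removal increases the number of connected components."""
    baseline = count_components(graph)
    return [v for v in graph if count_components(graph, removed=v) > baseline]

# Hypothetical network: a triangle A-B-C with a tail C-D-E
toy = {
    'A': ['B', 'C'],
    'B': ['A', 'C'],
    'C': ['A', 'B', 'D'],
    'D': ['C', 'E'],
    'E': ['D'],
}
print(articulation_points(toy))
```

The brute-force approach costs one traversal per node, which is fine for small networks. For larger graphs a single depth-first search with low-link values finds the same set in linear time.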