# Social Network Analysis - Class 5 - Triads, measures of competition, measures of brokerage

In [1]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

In [2]:
lazega_attr = pd.read_csv('Data/Lazega-Atts.csv')

advice_df = pd.read_csv('Data/Lazega-Advice-Net.csv', skiprows=1, names=list(np.arange(1, 72)))
advice_df.index = list(np.arange(1, 72))

In [76]:
import networkx as nx

# undirected graph; a directed version is built later for the status analysis
g = nx.from_pandas_adjacency(advice_df)

In [77]:
degrees = dict(nx.degree(g))
struc_holes = pd.DataFrame.from_dict(degrees, orient='index')
struc_holes = struc_holes.rename(columns={0: 'degree'})
struc_holes = struc_holes.sort_index()

The `networkx` module can calculate some of these structural-hole measures, but its results differ somewhat from those produced by Stata. Unlike Stata, `networkx` computes measures such as hierarchy, ego betweenness, and density for the entire graph rather than for each vertex. The following cells therefore calculate only the vertex-level measures: effective size, efficiency, and constraint.
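As a quick check on what effective size and efficiency measure, here is a minimal sketch on a hypothetical four-vertex graph (`toy` and `effective_size_by_hand` are illustrative names, not part of the class material). It assumes the unweighted, undirected case, where effective size reduces to Borgatti's shortcut n - 2t/n, with n the number of alters and t the number of ties among them.

```python
import networkx as nx

# Hypothetical toy graph: ego 0 has alters 1, 2, 3; only alters 1 and 2 are tied.
toy = nx.Graph([(0, 1), (0, 2), (0, 3), (1, 2)])

def effective_size_by_hand(G, ego):
    """Borgatti's shortcut for unweighted, undirected graphs: n - 2t / n."""
    alters = list(G[ego])
    n = len(alters)
    t = G.subgraph(alters).number_of_edges()  # ties among the alters
    return n - 2 * t / n

by_hand = effective_size_by_hand(toy, 0)
from_nx = nx.algorithms.structuralholes.effective_size(toy)[0]

# efficiency is effective size divided by the number of alters (the degree)
efficiency = by_hand / toy.degree(0)
print(by_hand, from_nx, efficiency)
```

The single tie between alters 1 and 2 makes part of ego's network redundant, so effective size falls below the degree of 3 and efficiency falls below 1.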

In [78]:
# slide 11 - calculating effective size and efficiency
effsize = nx.algorithms.structuralholes.effective_size(g)
struc_holes['effsize'] = struc_holes.index.map(effsize)

struc_holes['efficiency'] = struc_holes['effsize']/struc_holes['degree']

In [79]:
struc_holes.head()

,degree,effsize,efficiency
1,13,7.0,0.538462
2,23,13.608696,0.591682
3,12,6.0,0.5
4,30,16.666667,0.555556
5,9,6.111111,0.679012


In [83]:
lazega_attr.head()

,ID,status,gender,office,seniority,age,practice,lawschool
0,1,1,1,1,31,64,1,1
1,2,1,1,1,32,62,2,1
2,3,1,1,2,13,67,1,1
3,4,1,1,1,31,59,2,3
4,5,1,1,2,31,59,1,2


In [112]:
# slide 25 - calculate mean degrees by status

# create directed graph
dir_g = nx.from_pandas_adjacency(advice_df, create_using=nx.DiGraph())

# attach attributes to vertices
# index by ID without overwriting lazega_attr, which the R cell later needs with its ID column
attr_dict = lazega_attr.set_index('ID').to_dict('index')
nx.set_node_attributes(dir_g, attr_dict)

# create df for status starting with degree measurements
dir_degrees = dict(nx.degree(dir_g))
status_df = pd.DataFrame.from_dict(dir_degrees, orient='index')
status_df = status_df.rename(columns={0: 'degree'})
status_df = status_df.sort_index()

# attach status to vertices in df
status = nx.get_node_attributes(dir_g, 'status')
status_df['status'] = status_df.index.map(status)

# attach indegree and outdegree measurements to df
indegrees = dict(dir_g.in_degree())
outdegrees = dict(dir_g.out_degree())

status_df['indegree'] = status_df.index.map(indegrees)
status_df['outdegree'] = status_df.index.map(outdegrees)

In [114]:
status_df.groupby('status').mean()

status,degree,indegree,outdegree
1,21.666667,12.944444,8.722222
2,12.685714,4.171429,8.514286
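To make the direction convention concrete, here is a small sketch on a hypothetical three-vertex adjacency matrix (`toy_adj`, `toy_g`, and `toy_df` are illustrative names). With `create_using=nx.DiGraph()`, a nonzero entry in row i, column j becomes an edge from i to j, so row sums give outdegree and column sums give indegree.

```python
import networkx as nx
import pandas as pd

# Hypothetical 3-vertex matrix: a 1 in row i, column j encodes a tie from i to j.
toy_adj = pd.DataFrame(
    [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]],
    index=[1, 2, 3], columns=[1, 2, 3],
)
toy_g = nx.from_pandas_adjacency(toy_adj, create_using=nx.DiGraph())

# one row per vertex, with incoming and outgoing tie counts
toy_df = pd.DataFrame({
    'indegree': dict(toy_g.in_degree()),
    'outdegree': dict(toy_g.out_degree()),
})

# for a directed graph, nx.degree counts incoming plus outgoing ties
toy_df['degree'] = toy_df['indegree'] + toy_df['outdegree']
print(toy_df)
```

This is why the `degree` column in the grouped table equals the sum of the `indegree` and `outdegree` columns.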


The `networkx` module does not report separate incoming and outgoing versions of effective size and constraint for each vertex, so the values below are computed on the undirected graph `g`.

In [80]:
# slide 31
constraint = nx.algorithms.structuralholes.constraint(g)
constraint = dict(sorted(constraint.items()))  # order by vertex ID
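To see what the constraint score captures, here is a minimal sketch that recomputes it by hand on a hypothetical toy graph and compares the result with `networkx` (`toy` and `constraint_by_hand` are illustrative names). It assumes an unweighted, undirected graph, where the share of ego's ties invested in each alter is simply 1 divided by ego's degree.

```python
import networkx as nx

# Hypothetical toy graph: a triangle 0-1-2 plus a pendant vertex 3 attached to 0.
toy = nx.Graph([(0, 1), (0, 2), (1, 2), (0, 3)])

def constraint_by_hand(G, u):
    """Burt's constraint for an unweighted, undirected graph."""
    total = 0.0
    for v in G[u]:
        direct = 1 / G.degree(u)  # share of u's ties invested in v
        # indirect investment through every other alter w that is tied to v
        indirect = sum(
            (1 / G.degree(u)) * (1 / G.degree(w))
            for w in G[u] if w != v and G.has_edge(w, v)
        )
        total += (direct + indirect) ** 2
    return total

by_hand = {u: constraint_by_hand(toy, u) for u in toy}
from_nx = nx.algorithms.structuralholes.constraint(toy)
print(by_hand)
print(from_nx)
```

Vertex 3 has a single contact and is therefore fully constrained, while vertex 0, which bridges the pendant vertex and the triangle, has the lowest score.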

In [56]:
# slide 33
struc_holes['constraint'] = struc_holes.index.map(constraint)

smf.ols('constraint ~ degree', data=struc_holes).fit().summary()

Dep. Variable:,constraint,R-squared:,0.307
Model:,OLS,Adj. R-squared:,0.297
Method:,Least Squares,F-statistic:,30.57
Date:,"Tue, 08 Jan 2019",Prob (F-statistic):,5.36e-07
Time:,20:53:23,Log-Likelihood:,43.527
No. Observations:,71,AIC:,-83.05
Df Residuals:,69,BIC:,-78.53
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,0.3202,0.034,9.517,0.000,0.253,0.387
degree,-0.0104,0.002,-5.529,0.000,-0.014,-0.007

Omnibus:,93.761,Durbin-Watson:,2.275
Prob(Omnibus):,0.0,Jarque-Bera (JB):,1247.185
Skew:,4.134,Prob(JB):,1.5e-271
Kurtosis:,21.794,Cond. No.,38.1
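One way to read the fitted line is to plug a few degrees into it. The sketch below copies the intercept and slope from the regression table; the helper name `predicted_constraint` and the chosen degrees are illustrative only.

```python
# Coefficients copied from the OLS output (constraint ~ degree).
intercept = 0.3202
slope = -0.0104

def predicted_constraint(degree):
    """Fitted constraint for a vertex with the given degree."""
    return intercept + slope * degree

for d in (5, 10, 20):
    print(d, round(predicted_constraint(d), 4))
```

Each additional tie is associated with a drop of about 0.01 in constraint, so a vertex with degree 10 has a fitted constraint of roughly 0.22.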


## Brokerage analysis

slide 61

For this part, we will rely on the `brokerage` function from R's `sna` package, called through `rpy2`, as there is not yet a good way to conduct brokerage analysis in Python.
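For intuition about what a brokerage score counts, here is a rough Python sketch of the raw coordinator count only, which corresponds to the `w_I` role. It assumes the usual definition: vertex v brokers an ordered pair (a, b) when a -> v -> b exists, a -> b does not, and all three vertices belong to the same group. The function name `coordinator_counts` and the toy data are hypothetical, and the sketch does not reproduce the normalized scores that `sna` reports.

```python
import networkx as nx

def coordinator_counts(G, group):
    """Count open two-paths a -> v -> b (no a -> b edge) within a single group."""
    counts = {v: 0 for v in G}
    for v in G:
        for a in G.predecessors(v):
            for b in G.successors(v):
                if a == b or a == v or b == v:
                    continue  # need three distinct vertices
                if G.has_edge(a, b):
                    continue  # a reaches b directly, so v is not a broker
                if group[a] == group[v] == group[b]:
                    counts[v] += 1
    return counts

# Hypothetical toy network: 1 -> 2 -> 3 is an open two-path inside group "A";
# 4 -> 2 -> 3 is closed by the direct tie 4 -> 3, so it is not counted.
toy = nx.DiGraph([(1, 2), (2, 3), (4, 2), (4, 3)])
group = {1: "A", 2: "A", 3: "A", 4: "B"}
raw_w_I = coordinator_counts(toy, group)
print(raw_w_I)
```

Only vertex 2 sits on an open two-path between members of its own group, so it is the only vertex with a nonzero coordinator count.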

In [3]:
from rpy2.robjects.packages import importr

statnet = importr('statnet')
sna = importr('sna')

%load_ext rpy2.ipython

In [6]:
%%R -i lazega_attr,advice_df

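# coerce to a matrix so network() reads the input as an adjacency matrix
nrelations <- network(as.matrix(advice_df), directed=TRUE)

nrelations %v% "ID" <- lazega_attr$ID
nrelations %v% "status" <- lazega_attr$status

# brokerage roles with status as the grouping; z.nli holds the approximate z-scores
b <- brokerage(nrelations, lazega_attr$status)
bz <- cbind(lazega_attr, b$z.nli)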

bz

   ID status gender office seniority age practice lawschool        w_I
0   1      1      1      1        31  64        1         1 -0.4399068
1   2      1      1      1        32  62        2         1  3.7915196
2   3      1      1      2        13  67        1         1  0.7825053
3   4      1      1      1        31  59        2         3  6.0482804
4   5      1      1      2        31  59        1         2 -0.4399068
5   6      1      1      2        29  55        1         1 -1.4742555
6   7      1      1      2        29  63        2         3 -1.0981287
7   8      1      1      1        28  53        1         3 -0.8160336
8   9      1      1      1        25  53        2         1 -0.8160336
9  10      1      1      1        25  53        2         3 -0.7220019
10 11      1      1      1        23  50        1         1  1.3466955
11 12      1      1      1        24  52        2         2  9.9976117
12 13      1      1      1        22  57        1         2  2.0989491
13 14 