# Network Analysis with igraph

In recent years there has been growing interest in advancing and improving techniques for **topology analysis of graphs and complex networks**, with the aim of better understanding complex systems such as social networks, online communities, or biological systems.

In general, one of the main characteristics of graphs that represent real systems is the formation (implicit or explicit) of communities or clusters.

In this sense, the way vertices and edges are organized, and how they concentrate in particular regions, gives rise to independent zones of information activation. Each zone is defined by a set of particular characteristics that are either 1) static or explicit, or 2) dynamic or implicit.

Likewise, these characteristics can be determined through a partial analysis of a graph or through a global analysis.

In either case, segmenting graphs according to these characteristics is a complex problem. Multidisciplinary teams in different fields have studied it, but no complete solution has been found yet. For example, the well-known Clauset algorithm uses inter-community and intra-community measures to establish communities by removing or adding relationships.
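The Clauset, Newman, and Moore paper listed in the references scores a partition by its modularity, which compares the share of edges that fall inside communities with the share expected if edges were placed at random. As a minimal sketch of that measure (not the greedy algorithm itself), the following pure-Python function computes modularity for an undirected, unweighted graph. The function name, the toy graph, and the community labels are assumptions made for illustration.

```python
from collections import defaultdict

def modularity(edges, membership):
    """Modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs; membership: dict vertex -> community label.
    """
    m = len(edges)
    intra = defaultdict(int)    # edges with both endpoints in the same community
    degree = defaultdict(int)   # total degree of each community
    for u, v in edges:
        degree[membership[u]] += 1
        degree[membership[v]] += 1
        if membership[u] == membership[v]:
            intra[membership[u]] += 1
    # Observed intra-community edge share minus the share expected at random
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy graph: two triangles joined by a single edge
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
all_together = {v: "a" for v in range(6)}

q_split = modularity(edges, by_triangle)
q_single = modularity(edges, all_together)
print(q_split, q_single)
```

Splitting the toy graph along its two triangles scores 5/14, while putting every vertex in one community scores 0, so the measure rewards the partition that matches the visible structure.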

Common techniques for detecting and partitioning communities in graphs include hierarchical, partitional, and spectral clustering. There are also more recent approaches based on statistical inference or on the Girvan and Newman algorithm.

These techniques have been applied to: 1) the analysis of online communities, with a particular focus on social networks, to segment the social graph (made up of thousands or millions of nodes, edges, and features) according to predefined variables or to variables inferred from sentiment analysis or peer interaction, for example to detect hostile or violent communities; and 2) the analysis of how complex systems evolve.

* https://www.media.mit.edu/people/emoro/projects/

![alt text](https://pbs.twimg.com/media/Dut4zG1X4AAO_4y.jpg:large)


## Installing igraph

In [0]:
!pip install python-igraph

Collecting python-igraph
  Downloading https://files.pythonhosted.org/packages/0f/a0/4e7134f803737aa6eebb4e5250565ace0e2599659e22be7f7eba520ff017/python-igraph-0.7.1.post6.tar.gz (377kB)
Building wheels for collected packages: python-igraph
  Building wheel for python-igraph (setup.py) ... done
  Stored in directory: /root/.cache/pip/wheels/41/d6/02/34eebae97e25f5b87d60f4c0687e00523e3f244fa41bc3f4a7
Successfully built python-igraph
Installing collected packages: python-igraph
Successfully installed python-igraph-0.7.1.post6


## Loading and creating graphs

In [0]:
import igraph
import numpy as np
print(igraph.__version__)

0.7.1


In [0]:
# Build a graph manually: add vertices first, then edges
from igraph import *
g = Graph()
g.add_vertices(3)
g.add_edges([(0,1), (1,2)])
print(g)

IGRAPH U--- 3 2 --
+ edges:
0--1 1--2


In [0]:
# Build a graph manually: from an edge list
g = Graph([(0,1), (0,2), (2,3), (3,4), (4,2), (2,5), (5,0), (6,3), (5,6)])

In [0]:
# Create a random graph
g = Graph.Erdos_Renyi(n=100, m=20)
print(g)

IGRAPH U--- 100 20 --
+ edges:
7--11 2--27 8--38 8--42 21--42 12--47 14--54 24--54 49--59 52--61 59--62
62--63 47--64 61--64 35--65 38--69 9--70 37--76 58--99 98--99


## Extracting metrics

In [0]:
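def laplacian_centrality(graph, vs=None):
    """Return the Laplacian centrality of the vertices in vs (all vertices by default).

    For an unweighted graph this is deg(v)^2 + deg(v) + 2 * (sum of the
    degrees of v's neighbors).
    """
    if vs is None:
        vs = range(graph.vcount())
    degrees = graph.degree(mode="all")
    result = []
    for v in vs:
        neis = graph.neighbors(v, mode="all")
        result.append(degrees[v]**2 + degrees[v] + 2 * sum(degrees[i] for i in neis))
    return result


def betweenness_centralization(G):
    """Return a graph-level centralization score based on vertex betweenness."""
    vnum = G.vcount()
    if vnum < 3:
        raise ValueError("graph must have at least three vertices")
    # (n-1)(n-2)/2 is the largest betweenness a vertex can have in an undirected graph
    denom = (vnum-1)*(vnum-2)
    temparr = [2*i/denom for i in G.betweenness()]
    max_temparr = max(temparr)
    # Total gap to the most central vertex, divided by n-1
    return sum(max_temparr-i for i in temparr)/(vnum-1)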
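The closed-form expression used in `laplacian_centrality` can be checked against the definition it comes from, as we understand it: the drop in Laplacian energy when a vertex is removed. For an unweighted graph, the Laplacian energy is the sum of squared degrees plus twice the number of edges. The sketch below does this check in plain Python on a three-vertex path. The helper names and the toy graph are assumptions made for illustration.

```python
def laplacian_energy(adj):
    """Sum of squared degrees plus twice the edge count (unweighted graph)."""
    degrees = [len(neis) for neis in adj.values()]
    return sum(d * d for d in degrees) + sum(degrees)   # sum(degrees) == 2 * edges

def remove_vertex(adj, v):
    """Copy of the adjacency dict without vertex v and its edges."""
    return {u: neis - {v} for u, neis in adj.items() if u != v}

def laplacian_centrality_formula(adj, v):
    """Same closed form as laplacian_centrality, on an adjacency dict."""
    d = len(adj[v])
    return d * d + d + 2 * sum(len(adj[u]) for u in adj[v])

# Hypothetical toy graph: the path 0 - 1 - 2
path = {0: {1}, 1: {0, 2}, 2: {1}}

for v in path:
    drop = laplacian_energy(path) - laplacian_energy(remove_vertex(path, v))
    print(v, laplacian_centrality_formula(path, v), drop)
```

On this path the two columns agree: 6 for each endpoint and 10 for the middle vertex.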

In [0]:
def extract_graph_metrics(g):
   graph_features={}
   #Basic measures: http://igraph.org/python/doc/igraph.GraphBase-class.html
   graph_features[g.vcount.__name__] = g.vcount() 
   graph_features[g.ecount.__name__] = g.ecount()
   graph_features[g.omega.__name__] = g.omega()
   graph_features[g.alpha.__name__] = g.alpha()
   graph_features[g.diameter.__name__] = g.diameter()
   graph_features[g.average_path_length.__name__] = g.average_path_length()
   graph_features[g.radius.__name__] = g.radius()
       
   #Structural properties
   graph_features["max_"+g.degree.__name__] = max(g.degree())
   graph_features["min_"+g.degree.__name__] = min(g.degree())
   graph_features["mean_"+g.degree.__name__] = np.mean(g.degree())
   graph_features["max_"+g.count_multiple.__name__] = max(g.count_multiple())
   graph_features["min_"+g.count_multiple.__name__] = min(g.count_multiple())
   graph_features["mean_"+g.count_multiple.__name__] = np.mean(g.count_multiple())
   graph_features[g.has_multiple.__name__] = g.has_multiple()
   graph_features[g.density.__name__] = g.density()
   graph_features["max_"+g.diversity.__name__] = max(g.diversity())
   graph_features["min_"+g.diversity.__name__] = min(g.diversity())
   graph_features["mean_"+g.diversity.__name__] = np.mean(g.diversity())
   graph_features["len_"+g.articulation_points.__name__] = len(g.articulation_points())
   graph_features[g.assortativity_degree.__name__] = g.assortativity_degree()
       

   #Centrality
   laplacian = laplacian_centrality(g)  # compute once and reuse
   graph_features["max_laplacian_centrality"] = max(laplacian)
   graph_features["min_laplacian_centrality"] = min(laplacian)
   graph_features["mean_laplacian_centrality"] = np.mean(laplacian)
   graph_features["betweenness_centralization"] = betweenness_centralization(g)
   graph_features["max_"+g.edge_betweenness.__name__] = max(g.edge_betweenness())
   graph_features["min_"+g.edge_betweenness.__name__] = min(g.edge_betweenness())
   graph_features["mean_"+g.edge_betweenness.__name__] = np.mean(g.edge_betweenness())
   graph_features["max_"+g.closeness.__name__] = max(g.closeness())
   graph_features["min_"+g.closeness.__name__] = min(g.closeness())
   graph_features["mean_"+g.closeness.__name__] = np.mean(g.closeness())
   graph_features["max_in_"+g.closeness.__name__] = max(g.closeness(mode="in"))
   graph_features["min_in_"+g.closeness.__name__] = min(g.closeness(mode="in"))
   graph_features["mean_in_"+g.closeness.__name__] = np.mean(g.closeness(mode="in"))
   graph_features["max_out_"+g.closeness.__name__] = max(g.closeness(mode="out"))
   graph_features["min_out_"+g.closeness.__name__] = min(g.closeness(mode="out"))
   graph_features["mean_out_"+g.closeness.__name__] = np.mean(g.closeness(mode="out"))
   graph_features[g.canonical_permutation.__name__] = len(g.canonical_permutation())
   graph_features[g.clique_number.__name__] = g.clique_number()
   #graph_features[g.largest_cliques.__name__] = g.largest_cliques()

   #Clustering
   graph_features[g.count_automorphisms_vf2.__name__] = g.count_automorphisms_vf2()
   graph_features["len_"+g.cut_vertices.__name__] = len(g.cut_vertices())
   graph_features["len_"+g.knn.__name__] = len(g.knn())
   graph_features["len_"+g.biconnected_components.__name__] = len(g.biconnected_components())

   #clusters=g.clusters() #weak or strong
   return graph_features

In [0]:
print(extract_graph_metrics(g))

{'vcount': 7, 'ecount': 9, 'clique_number': 3, 'independence_number': 3, 'diameter': 3, 'average_path_length': 1.7142857142857142, 'radius': 2.0, 'max_degree': 4, 'min_degree': 1, 'mean_degree': 2.5714285714285716, 'max_count_multiple': 1, 'min_count_multiple': 1, 'mean_count_multiple': 1.0, 'has_multiple': False, 'density': 0.4285714285714286, 'max_diversity': 1.0, 'min_diversity': 1.0, 'mean_diversity': 1.0, 'len_articulation_points': 1, 'assortativity_degree': -0.18867924528301827, 'max_laplacian_centrality': 42, 'min_laplacian_centrality': 8, 'mean_laplacian_centrality': 24.857142857142858, 'betweenness_centralization': 0.26111111111111107, 'max_edge_betweenness': 6.0, 'min_edge_betweenness': 2.0, 'mean_edge_betweenness': 4.0, 'max_closeness': 0.75, 'min_closeness': 0.42857142857142855, 'mean_closeness': 0.6004019789734075, 'max_in_closeness': 0.75, 'min_in_closeness': 0.42857142857142855, 'mean_in_closeness': 0.6004019789734075, 'max_out_closeness': 0.75, 'min_out_closeness': 0.42

## Saving and displaying the graph

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [0]:
graph = Graph.Read_GML("/content/drive/My Drive/Colab Notebooks/data/karate.gml")
print(graph)

IGRAPH U--- 34 78 --
+ attr: id (v)
+ edges:
 0 --  1  2  3  4  5  6  7  8 10 11 12 13 17 19 21 31      25 -- 23 24 31
 1 --  0  2  3  7 13 17 19 21 30                           26 -- 29 33
 2 --  0  1  3  7  8  9 13 27 28 32                        27 --  2 23 24 33
 3 --  0  1  2  7 12 13                                    28 --  2 31 33
 4 --  0  6 10                                             29 -- 23 26 32 33
 5 --  0  6 10 16                                          30 --  1  8 32 33
 6 --  0  4  5 16                                          31 --  0 24 25 28
32 33
 7 --  0  1  2  3                                          32 --  2  8 14 15
18 20 22 23 29 30 31 33
 8 --  0  2 30 32 33                                       33 --  8  9 13 14
15 18 19 20 22 23 26 27 28 29 30 31 32
 9 --  2 33
10 --  0  4  5
11 --  0
12 --  0  3
13 --  0  1  2  3 33
14 -- 32 33
15 -- 32 33
16 --  5  6
17 --  0  1
18 -- 32 33
19 --  0  1 33
20 -- 32 33
21 --  0  1
22 -- 32 33
23 -- 25 27 29 32 33
24 -

In [0]:
print(extract_graph_metrics(graph))

In [0]:
# Plotting raises an error in the notebook
#layout = g.layout("kk")
#plot(g, layout = layout)

In [0]:
def read(fin):
   # Load the first graph (index 0) stored in a GraphML file
   g = Graph.Read_GraphML(fin, index=0)
   return g

In [0]:
def todot(g,fout):
   g.write_dot(fout)

In [0]:
todot(graph,"/content/drive/My Drive/Colab Notebooks/data/karate2.dot")

## Example: analyzing people on Twitter

 * https://drive.google.com/file/d/1Icx-8NidA50GMVx954iTVdiagUImru9N/view
 * https://drive.google.com/open?id=1NGZ_YMI7MC_Le_mQrJv_42KZn9bd_qYh
 * https://drive.google.com/open?id=1wB7drIGh1FwE5C5yqNe16jdpgsWBu5r2

# References

* http://snap.stanford.edu/data/index.html
* https://www.barabasilab.com/course 
* http://networksciencebook.com/
* S. Fortunato and A. Lancichinetti, “Community detection algorithms: a comparative analysis: invited presentation, extended abstract,” in VALUETOOLS, 2009, p. 27.
* M. Girvan and M. E. J. Newman, “Community structure in social and biological networks,” Proc. Natl. Acad. Sci., vol. 99, no. 12, pp. 7821–7826, Jun. 2002.
* A. Clauset, M. E. J. Newman, and C. Moore, “Finding community structure in very large networks,” Phys. Rev. E, vol. 70, no. 6, p. 066111, Dec. 2004.
* A. Mislove, B. Viswanath, P. K. Gummadi, and P. Druschel, “You are who you know: inferring user profiles in online social networks,” in WSDM, 2010, pp. 251–260.
