# Working with External Data

## Loading CSV files
We use the Marvel Universe Social Network dataset (https://www.kaggle.com/csanhueza/the-marvel-universe-social-network).

In [None]:
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import networkx as nx

In [1]:
h = pd.read_csv('../data/hero-network.csv')

Inspect the DataFrame's structure (columns, dtypes, and non-null counts)

In [2]:
h.info()

Convert the DataFrame into a graph

In [3]:
G = nx.from_pandas_edgelist(h, source="hero1", target="hero2")
# nx.info was removed in NetworkX 3.0; on newer versions use print(G) instead
print(nx.info(G))

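Name: Zachary's Karate Club
Type: Graph
Number of nodes: 34
Number of edges: 78
Average degree:   4.5882

Note: this saved output, like the other outputs in this notebook, describes Zachary's Karate Club graph (34 nodes, 78 edges) rather than the Marvel edge list loaded above.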
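
As a minimal sketch of what `from_pandas_edgelist` does, the example below builds a graph from a tiny hand-made DataFrame. The hero names and rows are made up for illustration; only the column names `hero1` and `hero2` come from the call above.

```python
import networkx as nx
import pandas as pd

# Hypothetical edge list: each row is one co-appearance between two heroes.
toy = pd.DataFrame({
    "hero1": ["A", "A", "B", "A"],
    "hero2": ["B", "C", "C", "B"],  # the last row repeats the A-B pair
})

toy_graph = nx.from_pandas_edgelist(toy, source="hero1", target="hero2")

# A plain Graph stores each pair of nodes once, so the repeated row adds no new edge.
print(toy_graph.number_of_nodes(), toy_graph.number_of_edges())  # 3 3
```

If repeated rows should count, one option is to aggregate them into a weight column first and pass it through the `edge_attr` parameter.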


Rebuild the graph, this time as a directed graph

In [4]:
G = nx.from_pandas_edgelist(h, source="hero1", target="hero2", create_using=nx.DiGraph())
# nx.info was removed in NetworkX 3.0; on newer versions use print(G) instead
print(nx.info(G))

Node 0 has the following properties:
Degree: 16
Neighbors: 1 2 3 4 5 6 7 8 10 11 12 13 17 19 21 31


Create the function get_top_nodes, which returns the entries with the highest values in a dictionary

In [16]:
def get_top_nodes(cdict, num=5):
    # Sort the (node, value) pairs by value, highest first, and keep the first `num`
    return dict(sorted(cdict.items(), key=lambda x: x[1], reverse=True)[:num])
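
A quick check of the intended behaviour on a made-up dictionary. The sketch restates the helper as a single expression so that it runs on its own; the scores are hypothetical.

```python
def get_top_nodes(cdict, num=5):
    # Sort (key, value) pairs by value, largest first, and keep the first `num`.
    return dict(sorted(cdict.items(), key=lambda item: item[1], reverse=True)[:num])

# Hypothetical scores, not taken from the dataset.
scores = {"a": 3, "b": 10, "c": 7, "d": 1}
top_two = get_top_nodes(scores, num=2)
print(top_two)  # {'b': 10, 'c': 7}
```

Because Python dictionaries keep insertion order, the returned dictionary lists the nodes from highest to lowest value.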

#### Degree

Store the degree of each node (G.degree() returns a DegreeView, which is converted to a dictionary in the next cell)

In [8]:
gdeg = G.degree()

In [17]:
get_top_nodes(dict(gdeg))

{33: 17, 0: 16, 32: 12, 2: 10, 1: 9}

#### In-Degree

In [None]:
indeg=G.in_degree()
get_top_nodes(dict(indeg))

#### Out-Degree

In [None]:
outdeg=G.out_degree()
get_top_nodes(dict(outdeg))
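
To make the difference between the three degree views concrete, here is a small sketch on a hypothetical three-node directed graph (the node names are invented for the example).

```python
import networkx as nx

# Hypothetical directed graph: a -> b, a -> c, b -> c
D = nx.DiGraph([("a", "b"), ("a", "c"), ("b", "c")])

in_deg = dict(D.in_degree())    # edges arriving at each node
out_deg = dict(D.out_degree())  # edges leaving each node
total = dict(D.degree())        # in-degree plus out-degree

print(in_deg)   # {'a': 0, 'b': 1, 'c': 2}
print(out_deg)  # {'a': 2, 'b': 1, 'c': 0}
print(total)    # {'a': 2, 'b': 2, 'c': 2}
```

All three nodes have the same total degree here, yet very different in- and out-degrees, which is why the directed views are reported separately.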

Draw the graph, sizing each node by its degree

In [13]:
nx.draw_networkx(G, node_size=[200 * val for (node, val) in gdeg])
# limits = plt.axis('off')
plt.show()

#### Degree Centrality

In [11]:
degree_centrality = nx.degree_centrality(G)
nx.set_node_attributes(G, degree_centrality, 'dc')
get_top_nodes(degree_centrality)

In [15]:
# Color and size each node by its degree centrality
dc_values = list(nx.get_node_attributes(G, 'dc').values())
nx.draw_networkx(G, node_color=dc_values, node_size=[3000 * v for v in dc_values])
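
Degree centrality is simply the degree divided by the largest degree a node could have, n - 1. The sketch below checks this on NetworkX's built-in karate club graph; the variable names are chosen for the example.

```python
import networkx as nx

club = nx.karate_club_graph()  # built-in 34-node example graph
n = club.number_of_nodes()

dc = nx.degree_centrality(club)

# Recompute by hand: degree / (n - 1)
manual = {node: deg / (n - 1) for node, deg in club.degree()}

# Largest gap between the library values and the manual ones (floating-point noise only)
max_diff = max(abs(dc[node] - manual[node]) for node in club)
print(max_diff)
```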

#### Betweenness

In [None]:
betweenness_centrality = nx.betweenness_centrality(G)
nx.set_node_attributes(G, betweenness_centrality, 'bc')
get_top_nodes(betweenness_centrality)

In [None]:
nx.draw_networkx(G, node_size=[4000 * v for v in nx.get_node_attributes(G, 'bc').values()])
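
Betweenness measures the share of shortest paths that pass through a node. A three-node path makes this easy to see: the only shortest path between the two ends goes through the middle node. This is a toy illustration, not part of the original analysis.

```python
import networkx as nx

P = nx.path_graph(3)  # 0 - 1 - 2

# Normalized by default, so values lie between 0 and 1
bc = nx.betweenness_centrality(P)
print(bc)  # {0: 0.0, 1: 1.0, 2: 0.0}
```

End nodes score exactly 0, so scaling `node_size` directly by betweenness, as in the cell above, draws such nodes with size 0; adding a small constant to the size is one way to keep them visible.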

#### Closeness

In [20]:
closeness_centrality = nx.closeness_centrality(G)
nx.set_node_attributes(G, closeness_centrality, 'cc')
get_top_nodes(closeness_centrality)

{0: 0.5689655172413793,
 2: 0.559322033898305,
 33: 0.55,
 31: 0.5409836065573771,
 8: 0.515625}

In [None]:
# Compute the positions once so that edges, nodes, and labels share the same layout
pos = nx.spring_layout(G)
cc_values = list(nx.get_node_attributes(G, 'cc').values())
ec = nx.draw_networkx_edges(G, pos=pos)
# Color and size each node by its closeness centrality
nc = nx.draw_networkx_nodes(G, pos=pos,
                            node_color=cc_values,
                            node_size=[1200 * v for v in cc_values])
lb = nx.draw_networkx_labels(G, pos=pos)
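
For a connected graph, closeness centrality is (n - 1) divided by the sum of shortest-path distances from the node to all others. The sketch below verifies this on a small star graph; `manual_closeness` is a helper written for this example, not a NetworkX function.

```python
import networkx as nx

S = nx.star_graph(3)  # center 0 joined to leaves 1, 2, 3
cc = nx.closeness_centrality(S)

def manual_closeness(graph, node):
    # Distances from `node` to every reachable node (the node itself is at distance 0)
    dist = nx.single_source_shortest_path_length(graph, node)
    return (len(graph) - 1) / sum(dist.values())

center = manual_closeness(S, 0)  # 3 / (1 + 1 + 1) = 1.0
leaf = manual_closeness(S, 1)    # 3 / (1 + 2 + 2) = 0.6
print(center, leaf)
```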

#### Eigenvector Centrality

In [22]:
eigenvector_centrality = nx.eigenvector_centrality(G)
nx.set_node_attributes(G, eigenvector_centrality,'ec')
get_top_nodes(eigenvector_centrality)

{33: 0.373371213013235,
 0: 0.3554834941851943,
 2: 0.31718938996844476,
 32: 0.3086510477336959,
 1: 0.2659538704545025}

In [None]:
nx.draw_networkx(G, node_size=[2400 * v for v in nx.get_node_attributes(G, 'ec').values()])
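
When reading eigenvector scores, keep in mind that `nx.eigenvector_centrality` returns a vector scaled to unit Euclidean length, so the absolute values shrink as graphs grow and the ranking is what carries the information. A small check on the built-in karate club graph, with variable names chosen for the example:

```python
import math
import networkx as nx

club = nx.karate_club_graph()
ec = nx.eigenvector_centrality(club)

# Euclidean norm of the score vector; expected to be very close to 1
norm = math.sqrt(sum(v * v for v in ec.values()))
print(norm)

# Node with the highest eigenvector centrality
top = max(ec, key=ec.get)
print(top)
```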

#### PageRank Centrality

In [24]:
pagerank_centrality = nx.pagerank(G)
nx.set_node_attributes(G, pagerank_centrality, 'pr')
get_top_nodes(pagerank_centrality)

{33: 0.1009179167487121,
 0: 0.09700181758983709,
 32: 0.07169213006588289,
 2: 0.057078423047636745,
 1: 0.05287839103742701}

In [None]:
nx.draw_networkx(G, node_size=[5000 * v for v in nx.get_node_attributes(G, 'pr').values()], pos=pos)

## Graph Metrics

#### All Shortest Path

In [36]:
list(nx.all_shortest_paths(G,0,4))

[[0, 4]]

In [46]:
list(nx.all_shortest_paths(G,0,33))

[[0, 8, 33], [0, 13, 33], [0, 19, 33], [0, 31, 33]]

In [47]:
nx.shortest_path_length(G,0,33)

2

In [None]:
list(nx.shortest_path_length(G))
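
The three calls above behave as follows on a four-node cycle, a toy graph chosen because it has two equally short routes between opposite nodes.

```python
import networkx as nx

C = nx.cycle_graph(4)  # 0 - 1 - 2 - 3 - 0

# Every shortest path between two nodes (sorted for a stable display order)
paths = sorted(nx.all_shortest_paths(C, 0, 2))
print(paths)  # [[0, 1, 2], [0, 3, 2]]

# Length of a shortest path between two nodes
length = nx.shortest_path_length(C, 0, 2)
print(length)  # 2

# Without source and target, the call yields (node, {target: distance}) pairs
all_lengths = dict(nx.shortest_path_length(C))
print(all_lengths[0][2])  # 2
```

Wrapping the last call in `dict(...)` instead of `list(...)` allows lookups such as `all_lengths[source][target]`.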

#### Average Path Length

In [49]:
nx.average_shortest_path_length(G)

2.408199643493761

#### Diameter

In [50]:
nx.diameter(G)

5
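
Both `average_shortest_path_length` and `diameter` need every node to be reachable from every other node; on a disconnected graph, or a directed graph that is not strongly connected, NetworkX raises an error. A common workaround, sketched here on a hypothetical toy graph, is to measure only the largest connected component.

```python
import networkx as nx

# Hypothetical graph with two components: the path 0-1-2-3 and a separate edge 10-11
T = nx.Graph([(0, 1), (1, 2), (2, 3), (10, 11)])

# Keep only the component with the most nodes
largest_nodes = max(nx.connected_components(T), key=len)
core = T.subgraph(largest_nodes)

diam = nx.diameter(core)
avg_len = nx.average_shortest_path_length(core)
print(diam)     # 3
print(avg_len)  # mean distance over the 6 node pairs of the path: 10 / 6
```

For a directed graph, `nx.strongly_connected_components` plays the same role as `nx.connected_components`; another option is to analyze `G.to_undirected()`.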

#### Density

In [51]:
nx.density(G)

0.13903743315508021
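
Density is the number of edges divided by the number of possible edges: 2m / (n(n - 1)) for an undirected graph and m / (n(n - 1)) for a directed one. With n = 34 and m = 78 the undirected formula gives 156 / 1122, which matches the value shown above. A short check on the built-in karate club graph:

```python
import networkx as nx

club = nx.karate_club_graph()
n, m = club.number_of_nodes(), club.number_of_edges()

# Undirected: each of the n * (n - 1) / 2 node pairs can hold one edge
undirected_density = 2 * m / (n * (n - 1))
print(undirected_density, nx.density(club))

# Same edge list read as directed: twice as many possible edges, so half the density
directed = nx.DiGraph(club.edges())
directed_density = nx.density(directed)
print(directed_density)
```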

#### Average Local Clustering Coefficient

In [52]:
nx.average_clustering(G)

0.5706384782076823
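
`nx.average_clustering` is the mean of the per-node local clustering coefficients, where a node's coefficient is the fraction of pairs of its neighbors that are themselves connected. A small hand-checkable sketch on a hypothetical four-node graph:

```python
import networkx as nx

# Hypothetical graph: a triangle 0-1-2 plus a pendant node 3 attached to node 2
T = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])

local = nx.clustering(T)
# Node 2 has neighbors 0, 1, 3: only the pair (0, 1) is connected, so 1 of 3 pairs
print(local)

average = nx.average_clustering(T)
print(average)  # (1 + 1 + 1/3 + 0) / 4
```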

## Layouts

Random

In [None]:
nx.draw_random(G)

Circular

In [None]:
nx.draw_circular(G)

Spectral

In [None]:
nx.draw_spectral(G)

Shell

In [None]:
nx.draw_shell(G)

Spiral

In [None]:
pos = nx.spiral_layout(G)
nx.draw_networkx(G, pos=pos)
plt.show()
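
Each `nx.draw_*` shortcut above computes a layout and draws in one step. The layout functions can also be called on their own: they return a dictionary mapping each node to an (x, y) position, which can then be passed to `nx.draw_networkx` through `pos`, as in the spiral example. The sketch below only computes positions, using the built-in karate club graph and an arbitrarily chosen seed.

```python
import networkx as nx

club = nx.karate_club_graph()

# A layout function returns {node: array([x, y])}
pos_circular = nx.circular_layout(club)

# spring_layout is randomized; fixing the seed makes the picture reproducible
pos_spring = nx.spring_layout(club, seed=42)

print(len(pos_circular), len(pos_spring))
print(pos_circular[0])
```

Reusing one `pos` dictionary across several plots keeps the nodes in the same place, which makes it easier to compare the centrality drawings with each other.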

Prepared by Luis Cajachahua under the MIT license (2021)