<img src="https://i.imgur.com/6U6q5jQ.png"/>







# Graphs

Let me show you a graph (from [Wikipedia](https://en.wikipedia.org/wiki/Graph_(discrete_mathematics))):

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/6n-graf.svg/440px-6n-graf.svg.png"/>

As you can see, it is simply a representation of two sets:

1. A set of **vertices** or **nodes**. In the image above you see the nodes _1_, _2_, _3_, _4_, _5_, and _6_.
2. A set of **edges** or **links**. In the image above, the links are connecting pairs of nodes. 

Altogether, a _graph_ reveals some _relationship_ among the _nodes_. The graph structure will allow us to explore and understand that relationship. 
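
As a minimal sketch of the two-set idea, the graph in the image can be written with plain Python sets, before using any graph library. The names `vertices`, `edges`, `edges_are_valid` and `neighbors_of_4` are only illustrative.

```python
# the two sets behind the graph in the image
vertices = {1, 2, 3, 4, 5, 6}

# each link is an unordered pair of nodes, so a frozenset fits well
edges = {frozenset(pair) for pair in
         [(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)]}

# every link must connect two known nodes
edges_are_valid = all(edge <= vertices for edge in edges)

# the neighbors of node 4 can be recovered from the edge set alone
neighbors_of_4 = sorted(v for edge in edges if 4 in edge
                        for v in edge if v != 4)
```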

## Creating Graphs

The graph above can be represented computationally in Python using **networkx**:

In [None]:
import networkx as nx

# create graph
G = nx.Graph()

# create nodes and edges
G.add_edges_from([(1, 2), (1, 5),(2,5),(2,3),(3,4),(4,5),(4,6)])
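
Notice that no node was added explicitly: a node is created the first time it appears in an edge. As a small sketch (the graph name `G_demo` and the isolated node _7_ are made up for illustration), a node without links has to be added on its own:

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_edges_from([(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)])

# nodes were created as a side effect of adding edges
nodes_from_edges = sorted(G_demo.nodes())

# a node with no links must be added explicitly (hypothetical node 7)
G_demo.add_node(7)
isolated = list(nx.isolates(G_demo))
```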

## Basic Elements

**G** is the graph:

In [None]:
#you don't see much...just what it is:
G

In [None]:
# You see nodes
G.nodes()

In [None]:
# You see edges
G.edges()
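
Beyond listing nodes and edges, the same object can answer simple questions about the structure. A short sketch, rebuilding the graph under the illustrative name `G_demo` so that it runs on its own:

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_edges_from([(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)])

n_nodes = G_demo.number_of_nodes()            # how many nodes
n_edges = G_demo.number_of_edges()            # how many links
neighbors_of_2 = sorted(G_demo.neighbors(2))  # who is linked to node 2
degrees = dict(G_demo.degree())               # number of links per node
```

The degrees always add up to twice the number of edges, since every link touches two nodes.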

## Drawing

As you can see, the graph is created by adding pairs of nodes. Once you complete that stage, you can draw the graph:

In [None]:
# draw
nx.draw(G,
        with_labels=True,
        node_color='white',
        edgecolors='black')

Notice that the position of the nodes will vary every time you redraw the graph. In fact, drawing a graph can become a challenge in itself when we need to find information through visualization.
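
If you need the same picture every time, one option is to compute the positions yourself with a fixed random seed and pass them to the drawing function. This is only a sketch: the seed value 42 is arbitrary, and `G_demo`, `pos_a` and `pos_b` are illustrative names.

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_edges_from([(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)])

# same seed, same layout (42 is an arbitrary choice)
pos_a = nx.spring_layout(G_demo, seed=42)
pos_b = nx.spring_layout(G_demo, seed=42)

# each position is a pair of coordinates; compare them node by node
same_layout = all((pos_a[n] == pos_b[n]).all() for n in G_demo)

# to draw with these fixed positions:
# nx.draw(G_demo, pos_a, with_labels=True, node_color='white', edgecolors='black')
```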

### Edge directionality

The graph we created and drew is an **undirected** graph, that is, the relationships between a pair of nodes are **symmetric**: they cannot represent direction because they are _inherently mutual_ between the nodes. For example, the relationship *to be a neighbor of* is symmetric.

The following graph is **directed**:

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/23/Directed_graph_no_background.svg/340px-Directed_graph_no_background.svg.png"/>


You can create this directed graph this way:

In [None]:
# create graph
dG = nx.DiGraph()

# create nodes and edges
dG.add_edges_from([(1, 2), (1, 3),(3,2),(3,4),(4,3)])

# drawing
nx.draw(dG,with_labels=True,node_color='white',edgecolors='black')

Directed links are also called **arcs**. Notice that the _DiGraph_ created represents an **asymmetric** relationship: the relationship a node has with another node does not need to be mutual, but it can be (see nodes _3_ and _4_). If the arcs represent **cares for someone**, the graph shows that the feeling is not reciprocated in most cases. If a relationship can never be mutual, so it only runs in one direction, it is called **antisymmetric**.
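
You can check this directly on the graph instead of reading it from the picture. A minimal sketch, rebuilding the directed graph under the illustrative name `dG_demo`:

```python
import networkx as nx

dG_demo = nx.DiGraph()
dG_demo.add_edges_from([(1, 2), (1, 3), (3, 2), (3, 4), (4, 3)])

# the arc 1 -> 2 exists, but 2 -> 1 does not
one_way = dG_demo.has_edge(1, 2) and not dG_demo.has_edge(2, 1)

# pairs of nodes linked in both directions (each pair reported once)
mutual_pairs = sorted((u, v) for u, v in dG_demo.edges()
                      if u < v and dG_demo.has_edge(v, u))

# share of arcs that are reciprocated
share_reciprocated = nx.reciprocity(dG_demo)
```

Here only the pair _3_ and _4_ is mutual, so two of the five arcs are reciprocated.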

## Attributes

Nodes can have attributes:

In [None]:
# adding attributes
dG.nodes[1]["sex"]='male'
dG.nodes[2]["sex"]='male'
dG.nodes[3]["sex"]='female'
dG.nodes[4]["sex"]='female'

In [None]:
# seeing attributes
dG.nodes.data()

Attributes can serve for some computational purposes, but also help to visually find structures.
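
As an example of a computational use, this sketch selects nodes by attribute and keeps only the links among them. It assumes the same attribute values as above; `dG_demo`, `females` and `female_net` are illustrative names.

```python
import networkx as nx

dG_demo = nx.DiGraph()
dG_demo.add_edges_from([(1, 2), (1, 3), (3, 2), (3, 4), (4, 3)])

# set one attribute for all nodes at once, from a dictionary
nx.set_node_attributes(dG_demo,
                       {1: 'male', 2: 'male', 3: 'female', 4: 'female'},
                       name='sex')

# keep the nodes whose attribute matches
females = [n for n, sex in dG_demo.nodes(data='sex') if sex == 'female']

# the subgraph keeps only the arcs among the selected nodes
female_net = dG_demo.subgraph(females)
female_arcs = sorted(female_net.edges())
```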

In [None]:
# using node attributes

colors_for_nodes=['green' if n[1]['sex']=='male' else 'red' for n in dG.nodes.data()]
nx.draw(dG,
        with_labels=True,
        node_color=colors_for_nodes)

Of course, edges can have attributes too:

In [None]:
dG.edges[(1, 2)]['weight']=1
dG.edges[(1, 3)]['weight']=3
dG.edges[(3, 2)]['weight']=5
dG.edges[(3, 4)]['weight']=10
dG.edges[(4, 3)]['weight']=0.5

In [None]:
# see them
dG.edges.data()
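
When the attribute is a weight, you can also create edges and weights in one step, and use the weights in computations. A small sketch with the same values as above (`dG_demo`, `out_strength` and `in_strength` are illustrative names):

```python
import networkx as nx

dG_demo = nx.DiGraph()
# each triple is (source, target, weight)
dG_demo.add_weighted_edges_from([(1, 2, 1), (1, 3, 3), (3, 2, 5),
                                 (3, 4, 10), (4, 3, 0.5)])

# sum of the weights of the arcs leaving / entering each node
out_strength = dict(dG_demo.out_degree(weight='weight'))
in_strength = dict(dG_demo.in_degree(weight='weight'))
```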

Let me add another attribute conditional on another attribute:

In [None]:
# all edges will have the attribute 'color', and every edge starts as 'grey'
nx.set_edge_attributes(dG,values='grey',name='color')

# updating 'color' attribute:
for x in nx.get_edge_attributes(dG,'weight').items(): 
    if  dG.edges[x[0]]['weight']<1:
        dG.edges[x[0]].update({'color': 'magenta'}) 

In [None]:
# see them again
dG.edges.data()
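
The same conditional update can be written by looping over the edges together with their attribute dictionaries. This is just an alternative sketch, not a replacement for the cell above; `dG_demo` and `colors` are illustrative names.

```python
import networkx as nx

dG_demo = nx.DiGraph()
dG_demo.add_weighted_edges_from([(1, 2, 1), (1, 3, 3), (3, 2, 5),
                                 (3, 4, 10), (4, 3, 0.5)])

# attrs is the actual attribute dictionary of the edge, so changes persist
for u, v, attrs in dG_demo.edges(data=True):
    attrs['color'] = 'magenta' if attrs['weight'] < 1 else 'grey'

colors = nx.get_edge_attributes(dG_demo, 'color')
```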

Let's use edge attributes:

In [None]:
#notice
#Getting some values
weight_values = nx.get_edge_attributes(dG,'weight') # recovering 'weights'
weight_values

In [None]:
# use the edge attributes
# add labels to edges
            
pos = nx.spring_layout(dG) # position of the nodes
nx.draw(dG,
        pos, # using "position"
        with_labels=True,
        node_color=colors_for_nodes)
# adding labels
final_dG=nx.draw_networkx_edge_labels(dG,pos,edge_labels=weight_values)

The color of edges:

In [None]:
nx.get_edge_attributes(dG,'color') # recovering 'color'

In [None]:
nx.get_edge_attributes(dG,'color').values()

In [None]:
# use the edge attributes
# add labels to edges
# add color to edges

edge_colors=list(nx.get_edge_attributes(dG,'color').values()) # a list, one color per edge

pos = nx.spring_layout(dG) # position of the nodes, computed from the graph we draw

# draw nodes first
nx.draw_networkx_nodes(dG,pos,
                       node_color=colors_for_nodes)
# draw edges
nx.draw_networkx_edges(dG, pos,
                       edge_color= edge_colors)
# draw node labels
nx.draw_networkx_labels(dG, pos)

final_dG=nx.draw_networkx_edge_labels(dG,pos,
                               edge_labels=weight_values)

In [None]:
# use the edge attributes
# add labels to edges
# add color to edges
# change style of edge
            
pos = nx.spring_layout(dG) # position of the nodes

# draw nodes first
nx.draw_networkx_nodes(dG,pos,
                       node_color=colors_for_nodes)
# draw edges
nx.draw_networkx_edges(dG, pos,
                       edge_color= edge_colors,
                       connectionstyle="arc3,rad=0.15" )
# draw node labels
nx.draw_networkx_labels(dG, pos)

final_dG=nx.draw_networkx_edge_labels(dG,pos,label_pos=0.40,
                                      edge_labels=weight_values)


Remember:

In [None]:
weight_values

Notice the use of enumerate:

In [None]:
[(i,e) for i,e in enumerate(dG.edges.data())]

In [None]:
{(e[0],e[1]):e[2]['weight'] for i,e in enumerate(dG.edges.data())}
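
If you only need one attribute, you can ask the edge view for it by name and unpack each edge directly. A short sketch (with illustrative names) showing that it builds the same dictionary as `nx.get_edge_attributes`:

```python
import networkx as nx

dG_demo = nx.DiGraph()
dG_demo.add_weighted_edges_from([(1, 2, 1), (1, 3, 3), (3, 2, 5),
                                 (3, 4, 10), (4, 3, 0.5)])

# each item is (source, target, weight)
weights_a = {(u, v): w for u, v, w in dG_demo.edges.data('weight')}

# the helper used earlier gives the same result
weights_b = nx.get_edge_attributes(dG_demo, 'weight')
```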

In [None]:
# use the edge attributes
# add labels to edges
# add color to edges
# change style of edge
# coloring edge labels (one at a time)
            
pos = nx.spring_layout(dG) # position of the nodes

# draw nodes first
nx.draw_networkx_nodes(dG,pos,
                       node_color=colors_for_nodes)
# draw edges
nx.draw_networkx_edges(dG, pos,
                       edge_color= edge_colors,
                       connectionstyle="arc3,rad=0.15" )
# draw node labels
nx.draw_networkx_labels(dG, pos)

[nx.draw_networkx_edge_labels(dG,pos,edge_labels={(e[0],e[1]):e[2]['weight']},
                              font_color=e[2]['color'],label_pos=0.40) for i,e in enumerate(dG.edges.data())]

In [None]:
# use the edge attributes
# edge labels kept, weight also used for thickness
# add color to edges
# change style of edge


pos = nx.spring_layout(dG) # position of the nodes

# draw nodes first
nx.draw_networkx_nodes(dG,pos,
                       node_color=colors_for_nodes)
# draw edges
nx.draw_networkx_edges(dG, pos,
                       width=list(weight_values.values()),
                       edge_color= edge_colors,
                       connectionstyle="arc3,rad=0.2" )
# draw node labels
nx.draw_networkx_labels(dG, pos)

[nx.draw_networkx_edge_labels(dG,pos,edge_labels={(e[0],e[1]):e[2]['weight']},
                              font_color=e[2]['color'],label_pos=0.40) for i,e in enumerate(dG.edges.data())]
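
Using raw weights as widths makes the lightest arc (0.5) very thin and the heaviest (10) very thick. One possible remedy is to rescale the weights to a range of widths first. This is only a sketch: the limits `min_width` and `max_width` are arbitrary choices, and all names are illustrative.

```python
# same weights as in the graph above
weight_values = {(1, 2): 1, (1, 3): 3, (3, 2): 5, (3, 4): 10, (4, 3): 0.5}

# arbitrary limits for the line width
min_width, max_width = 1.0, 6.0

w_min = min(weight_values.values())
w_max = max(weight_values.values())

# linear rescaling: the lightest arc gets min_width, the heaviest max_width
widths = [min_width + (w - w_min) / (w_max - w_min) * (max_width - min_width)
          for w in weight_values.values()]
```

The list `widths` keeps the order of the edges, so it can be passed as `width=widths` to `nx.draw_networkx_edges`.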

## Exporting

NetworkX does not recommend its use for complex visualization. So, you will often want to export your graph to be visualized in Gephi or something similar:

In [None]:
import os

# make sure the output folder exists before writing
os.makedirs("data", exist_ok=True)
nx.write_graphml(dG, os.path.join("data","dG.graphml"),encoding='utf-8')
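
It is a good idea to read the file back and check that nothing was lost. A minimal sketch, writing to a temporary folder so that it does not touch your _data_ folder (the names `dG_demo`, `as_text` and `as_int` are illustrative):

```python
import os
import tempfile
import networkx as nx

dG_demo = nx.DiGraph()
dG_demo.add_weighted_edges_from([(1, 2, 1.0), (1, 3, 3.0), (3, 2, 5.0),
                                 (3, 4, 10.0), (4, 3, 0.5)])
nx.set_node_attributes(dG_demo,
                       {1: 'male', 2: 'male', 3: 'female', 4: 'female'},
                       name='sex')

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "dG_demo.graphml")
    nx.write_graphml(dG_demo, path, encoding='utf-8')

    # by default, node names come back as text
    as_text = nx.read_graphml(path)
    # ask for integers to recover the original names
    as_int = nx.read_graphml(path, node_type=int)
```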

# Case: Elites in Peru

The network we are going to build is based on the relationships studied in this paper:
<img src='https://github.com/PythonVersusR/DataStructures_graphs/blob/main/images/paper.png?raw=true' width="900">


In that paper, Professor Figueroa shows this table, where 1 means that both nodes (families) appear together at least once on a top company board. Notice that the last column is an attribute:

<img src="https://github.com/PythonVersusR/DataStructures_graphs/blob/main/images/dataRed.png?raw=true" width="900">


This is a spreadsheet representing the information above:

<img src="https://github.com/PythonVersusR/DataStructures_graphs/blob/main/images/dataExcel.png?raw=true" width="900">


Let's use the data from the spreadsheet to prepare our table:

In [None]:
# reading in (install openpyxl before)

import pandas as pd

FigueData = pd.read_excel("https://github.com/PythonVersusR/DataStructures_graphs/raw/main/data/dataFigueroa.xlsx",
                          index_col=0) # notice: the first column (family) becomes the row index

We got this:

In [None]:
FigueData.head()

As intended, the family appears as the row index (not as the first column of data). The family names are also the column names:

In [None]:
FigueData.columns

The **adjacency matrix** does not need the attribute column, so we drop it:

In [None]:
varsToDrop=['Multinacional']
adjacency=FigueData.drop(varsToDrop,axis=1) 

#result
adjacency

It is easy to turn the adjacency matrix into a graph:

In [None]:
EliteNet = nx.from_pandas_adjacency(adjacency)
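
To see what that function does, here is a sketch with a tiny made-up adjacency matrix. The families _A_, _B_ and _C_ are hypothetical and are not part of Figueroa's data.

```python
import pandas as pd
import networkx as nx

# hypothetical families; rows and columns must carry the same names
families = ['A', 'B', 'C']
toy = pd.DataFrame([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]],
                   index=families, columns=families)

toy_net = nx.from_pandas_adjacency(toy)

n_nodes = len(toy_net)                     # one node per family
n_edges = toy_net.size()                   # every non-zero cell pair is an edge
n_loops = nx.number_of_selfloops(toy_net)  # the diagonal became self-loops
```

Each non-zero cell becomes a link, including the cells on the diagonal.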

Take a look at **EliteNet**:

In [None]:
# nodes:
len(EliteNet)

In [None]:
# edges:
EliteNet.size()

In [None]:
# plot
nx.draw_random(EliteNet,
                node_color='yellow',
                edge_color='lightblue',
                with_labels=True,
                font_size=8)

The adjacency matrix includes the self-relationships (the diagonal); we should take those away:

In [None]:
EliteNet.remove_edges_from(nx.selfloop_edges(EliteNet))
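
Self-loops are not only a visual nuisance: each one adds two to the degree of its node. A small sketch with hypothetical nodes _A_, _B_ and _C_ (wrapping the self-loops in `list()` is a cautious choice here, not a requirement stated by the original cell):

```python
import networkx as nx

toy_net = nx.Graph()
toy_net.add_edges_from([('A', 'A'), ('A', 'B'), ('B', 'B'),
                        ('B', 'C'), ('C', 'C')])

loops_before = nx.number_of_selfloops(toy_net)
degree_B_before = toy_net.degree['B']  # the self-loop counts twice

# take the self-loops away
toy_net.remove_edges_from(list(nx.selfloop_edges(toy_net)))

loops_after = nx.number_of_selfloops(toy_net)
degree_B_after = toy_net.degree['B']
remaining = sorted(toy_net.edges())
```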

In [None]:
# re plot
nx.draw_random(EliteNet,
                node_color='yellow',
                edge_color='lightblue',
                with_labels=True,
                font_size=8)

Let's add the attributes to the nodes:

In [None]:
FigueData['Multinacional'].head()

Currently:

In [None]:
# no attribute:
EliteNet.nodes.data()

Let me prepare a dictionary:

In [None]:
dict(zip(FigueData.index,FigueData['Multinacional']))

Then, I can use that to add an attribute to the node:

In [None]:
attributeToAdd=dict(zip(FigueData.index,FigueData['Multinacional']))
nx.set_node_attributes(EliteNet, attributeToAdd,'multi')

Updated nodes:

In [None]:
EliteNet.nodes.data()
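
Once the attribute is in the graph, you can summarize it without going back to the data frame. A sketch with hypothetical nodes and values (the attribute name `multi` is the one used above; everything else is illustrative):

```python
from collections import Counter
import networkx as nx

toy_net = nx.Graph()
toy_net.add_edges_from([('A', 'B'), ('B', 'C')])

# hypothetical values: 1 means multinational, 0 means not
nx.set_node_attributes(toy_net, {'A': 1, 'B': 0, 'C': 1}, 'multi')

# how many nodes per attribute value
counts = Counter(nx.get_node_attributes(toy_net, 'multi').values())

# which nodes have the attribute equal to 1
multinationals = sorted(n for n, flag in toy_net.nodes(data='multi')
                        if flag == 1)
```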

In [None]:
# using node attributes

colors_for_nodes=['green' if n[1]['multi']==1 else 'red' for n in EliteNet.nodes.data()]
nx.draw_random(EliteNet,
        with_labels=True,
        node_color=colors_for_nodes)

In [None]:
nx.draw_circular(EliteNet,
        with_labels=True,
        node_color=colors_for_nodes)

Let's export this graph:

In [None]:
import os
os.makedirs("data", exist_ok=True) # make sure the output folder exists
nx.write_graphml(EliteNet, os.path.join("data","EliteNet.graphml"),encoding='utf-8')