<img src="https://i.imgur.com/6U6q5jQ.png"/>






<a target="_blank" href="https://colab.research.google.com/github/PythonVersusR/DataStructures_graphs/blob/main/Graphs_Creation.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>


# Graphs

Let me show you a graph (from [Wikipedia](https://en.wikipedia.org/wiki/Graph_(discrete_mathematics))):

<img src="https://upload.wikimedia.org/wikipedia/commons/5/5b/6n-graf.svg"/>

As you can see, it is simply a representation of two sets:

1. A set of **vertices** or **nodes**. In the image above you see the nodes _1_, _2_, _3_, _4_, _5_, and _6_.
2. A set of **edges** or **links**. In the image above, each link connects a pair of nodes.

Altogether, a _graph_ reveals some _relationship_ among the _nodes_. The graph structure will allow us to explore and understand that relationship. 
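
As a minimal sketch of the two-set idea, independent of any graph library, the graph in the image can be written with plain Python sets. The names `nodes`, `edges`, and `neighbors_of_4` are only illustrative; each edge is stored as a `frozenset` on the assumption that the order of the pair does not matter.

```python
# The six vertices shown in the image
nodes = {1, 2, 3, 4, 5, 6}

# Each link is an unordered pair of vertices
edges = {frozenset(pair) for pair in
         [(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)]}

# Every edge must connect vertices that belong to the node set
all_valid = all(edge <= nodes for edge in edges)

# Neighbors of vertex 4: the other endpoint of every edge that touches it
neighbors_of_4 = sorted(v for edge in edges if 4 in edge for v in edge if v != 4)

print(all_valid)
print(neighbors_of_4)
```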

## Creating Graphs

The graph above can be represented computationally in Python using **networkx**:

In [None]:
import networkx as nx

# create graph
G = nx.Graph()

# create nodes and edges
G.add_edges_from([(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)])
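
Adding edges creates their endpoint nodes implicitly. A node that has no links cannot be created that way, so it has to be added on its own. The sketch below uses a separate graph `H` and a made-up extra node `7` (neither is part of the example above) to show the difference.

```python
import networkx as nx

# A separate copy of the example graph, so G itself is left untouched
H = nx.Graph()
H.add_edges_from([(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)])

# Nodes 1 to 6 were created implicitly by the edges above
print(sorted(H.nodes()))

# A node without links must be added explicitly (7 is a hypothetical node)
H.add_node(7)

# Nodes that do not take part in any edge
isolated = list(nx.isolates(H))
print(isolated)
```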

## Basic Elements

**G** is the graph:

In [None]:
# not much to see here, just the type of object:
G

In [None]:
# You see nodes
G.nodes()

In [None]:
# You see edges
G.edges()
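
Beyond listing nodes and edges, a few basic counts are often useful. This is a small sketch on the same graph; the variable names `n_nodes`, `n_edges`, and `degrees` are only illustrative.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)])

# How many nodes and edges the graph has
n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()

# Degree: the number of links attached to each node
degrees = dict(G.degree())

print(n_nodes, n_edges)
print(degrees)

# Check whether a particular link exists
print(G.has_edge(1, 5), G.has_edge(1, 6))
```

In an undirected graph every edge contributes to the degree of two nodes, so the degrees add up to twice the number of edges.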

## Drawing

As you can see, the graph was created by adding edges, that is, pairs of nodes. Once that stage is complete, you can draw the graph:

In [None]:
# draw
nx.draw(G,
        with_labels=True,
        node_color='white',
        edgecolors='black')

Notice that the position of the nodes will vary every time you redraw the graph. In fact, drawing a graph can become a challenge in itself when we need to find information via visualization.
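
The positions change because `nx.draw` computes a spring layout with a random start when no positions are given. One way to get a repeatable picture, sketched here under the assumption that a spring layout is acceptable, is to compute the positions yourself with a fixed `seed` (the value 42 is arbitrary).

```python
import networkx as nx
import numpy as np

G = nx.Graph()
G.add_edges_from([(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)])

# Two layouts computed with the same seed
pos_a = nx.spring_layout(G, seed=42)
pos_b = nx.spring_layout(G, seed=42)

# Compare the coordinates node by node
same_layout = all(np.allclose(pos_a[n], pos_b[n]) for n in G.nodes())
print(same_layout)
```

Passing `pos=pos_a` to `nx.draw` then reuses those coordinates on every redraw.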

### Edge directionality

The graph we created and drew is an **undirected** graph, that is, the relationship between a pair of nodes is **symmetric**: it cannot represent direction because it is _inherently mutual_ between the nodes. For example, the relationship *to be a neighbor of* is symmetric.

The following graph is **directed**:

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/23/Directed_graph_no_background.svg/340px-Directed_graph_no_background.svg.png"/>


You can create this directed graph this way:

In [None]:
# create graph
dG = nx.DiGraph()

# create nodes and edges
dG.add_edges_from([(1, 2), (1, 3),(3,2),(3,4),(4,3)])

# drawing
nx.draw(dG,with_labels=True,node_color='white',edgecolors='black')

Directed links are also called **arcs**. Notice that the _DiGraph_ we created represents an **asymmetric** relationship: the relationship a node has with another node does not need to be mutual, but it could be (see nodes _3_ and _4_). If the arcs represent **cares for someone**, the graph shows that the feeling is not reciprocal in most cases. If a relationship is never mutual, so that it can only go in one direction, it is called **antisymmetric**.
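
To check which arcs are reciprocated, one simple approach is to test whether the reverse arc also exists. This is only a sketch; the names `mutual`, `one_way`, and `share_mutual` are illustrative.

```python
import networkx as nx

dG = nx.DiGraph()
dG.add_edges_from([(1, 2), (1, 3), (3, 2), (3, 4), (4, 3)])

# Arcs whose reverse arc is also in the graph
mutual = [(u, v) for u, v in dG.edges() if dG.has_edge(v, u)]

# Arcs that go in one direction only
one_way = [(u, v) for u, v in dG.edges() if not dG.has_edge(v, u)]

# Share of arcs that are reciprocated
share_mutual = len(mutual) / dG.number_of_edges()

print(mutual)
print(one_way)
print(share_mutual)
```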

## Attributes

Nodes can have attributes:

In [None]:
# preparing attributes
dictOfAttr=dict(zip(dG.nodes,['male','male','female','female']))
# see
dictOfAttr


In [None]:
# adding attributes
nx.set_node_attributes(dG, dictOfAttr,'sex')

In [None]:
# seeing attributes
dG.nodes.data()

In [None]:
# see the values of one attribute across all nodes
nx.get_node_attributes(dG,'sex').values()
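
Attributes can also be read for a single node, or used to select nodes. The sketch below writes the attribute dictionary out explicitly instead of zipping it, and the names `females` and `female_arcs` are only illustrative.

```python
import networkx as nx

dG = nx.DiGraph()
dG.add_edges_from([(1, 2), (1, 3), (3, 2), (3, 4), (4, 3)])
nx.set_node_attributes(dG, {1: 'male', 2: 'male', 3: 'female', 4: 'female'}, 'sex')

# Attribute of one node
print(dG.nodes[1]['sex'])

# Select the nodes whose attribute has a given value
females = [n for n, attrs in dG.nodes(data=True) if attrs['sex'] == 'female']

# Arcs that stay inside that group of nodes
female_arcs = list(dG.subgraph(females).edges())

print(females)
print(female_arcs)
```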

Attributes can serve computational purposes, and they also help to find structures visually.

In [None]:
# using node attributes

colors_for_nodes=['green' if attr=='male' else 'red' for attr in nx.get_node_attributes(dG,'sex').values()]
nx.draw(dG,
        with_labels=True,
        node_color=colors_for_nodes)

Of course, edges can have attributes too:

In [None]:
#preparing attrs
dictOfAttr=dict(zip(dG.edges,[1,3,5,10,2]))

# adding attributes
nx.set_edge_attributes(dG, dictOfAttr,'weight')

In [None]:
# see them
dG.edges.data()
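
Pairing `dG.edges` with a list of values relies on the order in which the edges are iterated. A more explicit alternative, sketched here, is to key each weight by its arc. The dictionary `weights_by_arc` is my own spelling of the same five weights, assuming the edges come out in the order they were listed when the graph was built.

```python
import networkx as nx

dG = nx.DiGraph()
dG.add_edges_from([(1, 2), (1, 3), (3, 2), (3, 4), (4, 3)])

# Weight stated per arc, so nothing depends on iteration order
weights_by_arc = {(1, 2): 1, (1, 3): 3, (3, 2): 5, (3, 4): 10, (4, 3): 2}
nx.set_edge_attributes(dG, weights_by_arc, 'weight')

# Weight of a single arc
print(dG[3][4]['weight'])

# Weighted degree: sum of the weights leaving / entering each node
out_strength = dict(dG.out_degree(weight='weight'))
in_strength = dict(dG.in_degree(weight='weight'))

print(out_strength)
print(in_strength)
```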

Let's use edge attributes:

In [None]:
# recover the 'weight' values as a dictionary keyed by edge
weight_values = nx.get_edge_attributes(dG,'weight')
weight_values

In [None]:
list(weight_values.values())

In [None]:
pos = nx.random_layout(dG) # position of the nodes

weights = list(nx.get_edge_attributes(dG,'weight').values())
nx.draw(dG, pos=pos, # use the positions computed above
        with_labels=True,
        node_color=colors_for_nodes,
        width=weights)

## Exporting

The networkx documentation does not recommend the library for complex visualization. So, you will often want to export your graph to be visualized in [Gephi](https://gephi.org/) or something similar:

In [None]:
import os

# make sure the output folder exists before writing
os.makedirs("data", exist_ok=True)

where=os.path.join("data","dG_Py.graphml")
nx.write_graphml(dG, where,encoding='utf-8')
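
To confirm that the export keeps the attributes, the file can be read back. This is a sketch that writes to a temporary folder (the file name `dG_check.graphml` is made up) rather than to `data`. GraphML stores node ids as text, so `node_type=int` is passed to turn them back into integers.

```python
import os
import tempfile

import networkx as nx

dG = nx.DiGraph()
dG.add_edges_from([(1, 2), (1, 3), (3, 2), (3, 4), (4, 3)])
nx.set_node_attributes(dG, {1: 'male', 2: 'male', 3: 'female', 4: 'female'}, 'sex')
nx.set_edge_attributes(dG, {(1, 2): 1, (1, 3): 3, (3, 2): 5, (3, 4): 10, (4, 3): 2}, 'weight')

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dG_check.graphml")
    nx.write_graphml(dG, path, encoding='utf-8')
    # read the file back, converting node ids from text to int
    restored = nx.read_graphml(path, node_type=int)

print(restored.is_directed())
print(restored.nodes[1]['sex'])
print(restored[3][4]['weight'])
```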

# Case: Elites in Peru

The network we are going to build is based on the relationships studied in this paper:
<img src='https://github.com/PythonVersusR/DataStructures_graphs/blob/main/images/paper.png?raw=true' width="900">


In that paper, Professor Figueroa shows this table, where 1 represents that both nodes (families) appear together at least once on the board of a top company. Notice the last column is an attribute:

<img src="https://github.com/PythonVersusR/DataStructures_graphs/blob/main/images/dataRed.png?raw=true" width="900">


This is a spreadsheet representing the information above:

<img src="https://github.com/PythonVersusR/DataStructures_graphs/blob/main/images/dataExcel.png?raw=true" width="900">


Let's use the data from the spreadsheet to prepare our table:

In [None]:
# read the spreadsheet (openpyxl must be installed beforehand)

import pandas as pd

FigueData = pd.read_excel("https://github.com/PythonVersusR/DataStructures_graphs/raw/main/data/dataFigueroa.xlsx",
                          index_col=0) # use the first column (the family names) as the row index

We got this:

In [None]:
FigueData.head()

As intended, the families appear as the row index (not as the first column of data). The families are also the column names:

In [None]:
FigueData.columns

The **adjacency matrix** does not need the attribute column, so we drop it:

In [None]:
varsToDrop=['Multinacional']
adjacency=FigueData.drop(varsToDrop,axis=1) 

#result
adjacency

It is easy to turn the adjacency matrix into a graph:

In [None]:
EliteNet = nx.from_pandas_adjacency(adjacency)
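
To see what this conversion does, here is a minimal sketch on a toy table. The labels `A`, `B`, `C` and the values are invented and are not taken from the Figueroa data. The row and column labels must match, and for an undirected graph the matrix should be symmetric.

```python
import networkx as nx
import pandas as pd

names = ['A', 'B', 'C']  # hypothetical family labels
toy = pd.DataFrame([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]],
                   index=names, columns=names)

# An undirected graph expects the same value at [i, j] and [j, i]
is_symmetric = bool((toy.to_numpy() == toy.to_numpy().T).all())

# Every nonzero cell becomes an edge between its row and column labels
toy_net = nx.from_pandas_adjacency(toy)

print(is_symmetric)
print(list(toy_net.nodes()))
print(toy_net.number_of_edges())
```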

Take a look at **EliteNet**:

In [None]:
# nodes:
EliteNet.nodes()

In [None]:
# edges:
EliteNet.edges()

Let's see how this graph looks:

In [None]:
# plot
nx.draw_random(EliteNet,
                node_color='yellow',
                edge_color='lightblue',
                with_labels=True,
                font_size=8)

The adjacency matrix includes the self-relationships (the diagonal), which become self-loops in the graph. We should take those away:

In [None]:
EliteNet.remove_edges_from(nx.selfloop_edges(EliteNet))
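
An alternative, sketched below on an invented toy table (labels `A`, `B`, `C` are hypothetical), is to set the diagonal of the matrix to zero before building the graph. Both routes should give the same set of edges.

```python
import networkx as nx
import numpy as np
import pandas as pd

names = ['A', 'B', 'C']  # hypothetical family labels
toy = pd.DataFrame([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]],
                   index=names, columns=names)

# Option 1: build the graph, then remove the self-loops
net_a = nx.from_pandas_adjacency(toy)
loops_before = nx.number_of_selfloops(net_a)
net_a.remove_edges_from(list(nx.selfloop_edges(net_a)))

# Option 2: zero the diagonal on a copy of the matrix, then build the graph
values = toy.to_numpy(copy=True)
np.fill_diagonal(values, 0)
net_b = nx.from_pandas_adjacency(pd.DataFrame(values, index=names, columns=names))

# Compare the edges as unordered pairs
edges_a = {frozenset(e) for e in net_a.edges()}
edges_b = {frozenset(e) for e in net_b.edges()}

print(loops_before)
print(edges_a == edges_b)
```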

In [None]:
# replot
nx.draw_random(EliteNet,
                node_color='yellow',
                edge_color='lightblue',
                with_labels=True,
                font_size=8)

Let's add the attributes to the nodes:

In [None]:
FigueData['Multinacional'].head()

Currently:

In [None]:
# no attribute:
EliteNet.nodes.data()

Let me prepare a dictionary of attributes:

In [None]:
dictOfAttr=dict(zip(FigueData.index,FigueData['Multinacional']))
dictOfAttr

Then, I can use that to add an attribute to each node:

In [None]:
nx.set_node_attributes(EliteNet, dictOfAttr,'multi')
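
When attributes come from a separate table, it is worth checking that the labels match the node names. The sketch below uses an invented toy graph and dictionary (`toy_net`, `toy_attrs`, and the labels are hypothetical) to list nodes that received no attribute and labels that do not correspond to any node.

```python
import networkx as nx

toy_net = nx.Graph([('A', 'B'), ('B', 'C')])

# 'C' has no entry, and 'D' is not a node of the graph
toy_attrs = {'A': 1, 'B': 0, 'D': 1}

# Keep only the entries whose key is a node, then attach them
matching = {n: v for n, v in toy_attrs.items() if n in toy_net}
nx.set_node_attributes(toy_net, matching, 'multi')

# Nodes left without the attribute
missing = [n for n in toy_net.nodes() if 'multi' not in toy_net.nodes[n]]

# Labels in the table that match no node
unknown = [n for n in toy_attrs if n not in toy_net]

print(missing)
print(unknown)
```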

Updated nodes:

In [None]:
EliteNet.nodes.data()

In [None]:
# using node attributes

colors_for_nodes=['green' if attr else 'red' for attr in nx.get_node_attributes(EliteNet,'multi').values()]
nx.draw_random(EliteNet,
        with_labels=True,
        node_color=colors_for_nodes)

Trying a different layout:

In [None]:
nx.draw_circular(EliteNet,
        with_labels=True,
        node_color=colors_for_nodes)

Let's export this graph:

In [None]:
# make sure the output folder exists before writing
os.makedirs("data", exist_ok=True)
where=os.path.join("data","EliteNet_Py.graphml")
nx.write_graphml(EliteNet, where,encoding='utf-8')