# MultiLayerNetwork Tutorial

<span style="font-size:13pt">In this tutorial we show how the MultiLayerNetwork (MLN) class can be used to work with the dummy POPNET network. We go through the relevant methods and show simple examples of how to use them. Most functionality is shown on a subgraph of the network to keep processing times low; since the subgraph is also an MLN instance, everything works the same way on the entire network.</span>

<span style="font-size:13pt">To import the class:</span>

In [1]:
# might need to be altered depending on where mln.py is located
import sys
sys.path.append("../src")
from mln import MultiLayerNetwork

We are currently working in the /home/bokanyie/projects/popnet_mln directory.


<br><span style="font-size:13pt">To create an instance of the MLN class using all POPNET data:</span>

In [2]:
# loading from raw CSV files using the RawCSVPreparer class from src/preparation.py
popnet = MultiLayerNetwork(load_from_config=True, config_path='src/config.json')
# loading from saved files using the MultiLayerNetwork class from src/mln.py
popnet = MultiLayerNetwork(load_from_library=True, library_path='test_save')


            You are loading from raw CSV files.
            This is a slow process, and it is recommended to use the library mode.
            Use the export method to save time in the future.
                  
            E.g.:
            >>> mln.export("my_library")
            
Reading edge file test_data/dummy_popnet/popnet_edgelist.csv...
	Loading file...
	Done.
Trying to create enriched layer dataframe...
	Adding binary representation, groups and long labels...
Done.
Reading layers.csv...
Layer dataframe           label         layer  binary         group  \
0          work          work       1          work   
1        school        school       2        school   
2        cousin        cousin       4        cousin   
3         child         child       8         child   
4  aunt / uncle  aunt / uncle      16  aunt / uncle   

                   label_long  
0                  work: work  
1              school: school  
2              cousin: cousin  
3                child

In [3]:

# creating from in-memory objects using the MultiLayerNetwork class from src/mln.py
from scipy.sparse import load_npz
import pandas as pd

edges = load_npz('test_save/edges.npz')
nodes = pd.read_csv('test_save/nodes.csv.gz')
layers = pd.read_csv('test_save/layers.csv')

popnet = MultiLayerNetwork(nodes, edges, layers)

In [4]:
popnet.nodes

Unnamed: 0,id,label,generation,gender
0,0,50,1,0
1,1,51,1,1
2,2,52,1,1
3,3,53,1,1
4,4,54,1,1
...,...,...,...,...
141,141,195,3,0
142,142,196,3,0
143,143,197,3,1
144,144,198,3,0


In [5]:
popnet.A

<146x146 sparse matrix of type '<class 'numpy.int64'>'
	with 3680 stored elements in Compressed Sparse Row format>

In [6]:
popnet.layers

Unnamed: 0,label,layer,binary,group,label_long
0,work,work,1,work,work: work
1,school,school,2,school,school: school
2,cousin,cousin,4,family,cousin: cousin
3,child,child,8,family,child: child
4,aunt / uncle,aunt / uncle,16,family,aunt / uncle: aunt / uncle
5,household,household,32,family,household: household
6,parent,parent,64,family,parent: parent
7,neighbor,neighbor,128,family,neighbor: neighbor
8,niece / nephew,niece / nephew,256,family,niece / nephew: niece / nephew
9,mother / father-in-law,mother / father-in-law,512,family,mother / father-in-law: mother / father-in-law
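
The `binary` column gives every layer a distinct power of two, so any combination of layers can be packed into a single integer. The sketch below illustrates that encoding in plain Python. The helper names `encode_layers` and `decode_layers` are hypothetical and not part of the MLN class, and whether the class stores edge values in exactly this way is an assumption.

```python
# Layer codes copied from the `binary` column of the table above (a subset).
LAYER_CODES = {"work": 1, "school": 2, "cousin": 4, "child": 8, "household": 32}


def encode_layers(labels):
    """Pack a collection of layer labels into one integer bitmask (hypothetical helper)."""
    code = 0
    for label in labels:
        code |= LAYER_CODES[label]
    return code


def decode_layers(code):
    """List the layer labels whose bit is set in `code` (hypothetical helper)."""
    return [label for label, bit in LAYER_CODES.items() if code & bit]


both = encode_layers(["school", "household"])
print(both)                 # 34
print(decode_layers(both))  # ['school', 'household']
```

Because each code occupies its own bit, combining layers never produces an ambiguous value.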


---

## Filtering

<br><span style="font-size:13pt">To get an instance of the MLN class only containing **nodes with certain labels**</span>

In [7]:
# selecting females
selection = popnet.nodes[popnet.nodes["gender"]==0]["label"]

In [8]:
filtered = popnet.get_filtered_network(nodes_selected=selection)

print("Number of people:", filtered.N)
print("Number of connected pairs of people:", filtered.A.nnz)

Number of people: 76
Number of connected pairs of people: 914


<br><span style="font-size:13pt">To get an instance of the MLN class containing just **certain layers**, e.g. the household layer:</span>

In [9]:
filtered_h = filtered.get_filtered_network(layers_selected=["household"], layer_type="label")

print("Number of people:", filtered_h.N)
print("Number of connected pairs of people:", filtered_h.A.nnz)

Number of people: 76
Number of connected pairs of people: 50
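
In the output above the number of people stays at 76 while the number of connected pairs drops: filtering by layer removes edges, not nodes. A minimal sketch of that idea follows, assuming edge values are bitmasks of the layer codes. The function `keep_layers` is a hypothetical helper, not the implementation of `get_filtered_network`.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy 3-node network; each stored value is a bitmask of layers (assumption):
# 2 = school, 32 = household, 34 = school + household.
rows = np.array([0, 1, 0, 2])
cols = np.array([1, 0, 2, 0])
vals = np.array([34, 34, 2, 2])
A = csr_matrix((vals, (rows, cols)), shape=(3, 3))


def keep_layers(matrix, mask):
    """Return a copy of `matrix` keeping only the layer bits set in `mask`."""
    out = matrix.copy()
    out.data = out.data & mask  # clear the bits of all other layers
    out.eliminate_zeros()       # drop edges that no longer carry any layer
    return out


household_only = keep_layers(A, 32)
print(A.shape, household_only.shape)  # (3, 3) (3, 3)
print(A.nnz, household_only.nnz)      # 4 2
```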


To get an instance containing just certain groups of layers, e.g. the family edges:

In [10]:
filtered_g = filtered.get_filtered_network(groups_selected=["family"])

<br><span style="font-size:13pt">These filtering methods can also be used in **any combination**, e.g. all together:</span>

In [11]:
filtered = popnet.get_filtered_network(
    nodes_selected=selection,
    layers_selected=["household"],
    groups_selected=["family"]    
)

print("Number of people:", filtered.N)
print("Number of connected pairs of people:", filtered.A.nnz)

Number of people: 76
Number of connected pairs of people: 364


---

## Conversion to other formats

### igraph

<span style="font-size:13pt">Note that igraph automatically labels nodes from 0 through N-1 but these labels very likely do not line up with the mapping from IDs to labels as stored in `mln.nodemap_back`. However, we do always store the labels as node attributes.</span>
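
To make the mismatch concrete, the sketch below mimics the 0 to N-1 numbering with plain lists. The label values are taken from the first rows of `popnet.nodes`; the filtered list and the helper `index_maps` are illustrative assumptions, not part of the MLN class.

```python
def index_maps(labels):
    """Build index -> label and label -> index lookups for labels in vertex order."""
    label_of_index = dict(enumerate(labels))
    index_of_label = {label: index for index, label in label_of_index.items()}
    return label_of_index, index_of_label


full_labels = [50, 51, 52, 53, 54]  # vertex order in the full network
sub_labels = [50, 54]               # vertex order after some nodes are filtered out

_, full_index = index_maps(full_labels)
_, sub_index = index_maps(sub_labels)

# The same person (label 54) gets a different vertex index in each graph.
print(full_index[54], sub_index[54])  # 4 1
```

With an actual igraph graph, the list of labels in vertex order can be read from the `label` vertex attribute, for example `g.vs["label"]`.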

In [12]:
# prerequisites, also make sure to import the MLN class and create the
# `popnet` instance, as shown at the very start of this tutorial

import igraph as ig

filtered = popnet.get_filtered_network(nodes_selected=selection)

---

<span style="font-size:13pt">To obtain an **igraph Graph object** representing the MLN:</span>

In [13]:
g_igraph = filtered.to_igraph(node_attributes=True, edge_attributes=True)

ig.summary(g_igraph) # print the node count, edge count, and a list of the available attributes

IGRAPH D--- 76 672 -- 
+ attr: gender (v), generation (v), id (v), label (v), layer (e)


<br><span style="font-size:13pt">To obtain an **undirected** igraph Graph object of the MLN:</span>

In [14]:
g_igraph_u = filtered.to_igraph(directed=False)

ig.summary(g_igraph_u)
print("Directed:", g_igraph_u.is_directed())

IGRAPH U--- 76 336 -- 
+ attr: label (v), layer (e)
Directed: False
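
The undirected graph has 336 edges, exactly half of the 672 directed ones. That is what one would expect if every connection is stored in both directions, which is an assumption about the dummy data. A small sketch of the collapse:

```python
# Directed edge list in which every connection appears in both directions.
directed_edges = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1)]

# An undirected edge is an unordered pair, so (0, 1) and (1, 0) coincide.
undirected_edges = {tuple(sorted(edge)) for edge in directed_edges}

print(len(directed_edges), len(undirected_edges))  # 6 3
```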


<br><span style="font-size:13pt">To **omit link types and node attributes** (except for the node labels):</span>

In [15]:
g_igraph = filtered.to_igraph(edge_attributes=False, node_attributes=False)

ig.summary(g_igraph)

IGRAPH D--- 76 672 -- 
+ attr: label (v)


<br><span style="font-size:13pt">The igraph objects are also stored in the `igraph` attribute of the MLN class. Unless we specify otherwise, only the result of the first call to `.to_igraph()` is stored. To overwrite it:</span>

In [16]:
print("Before:")
ig.summary(filtered.igraph)

# overwrite using replace_igraph=True
filtered.to_igraph(replace_igraph=True, directed=False, node_attributes=True)

print("\nAfter:")
ig.summary(filtered.igraph)

Before:
IGRAPH D--- 76 672 -- 
+ attr: gender (v), generation (v), id (v), label (v), layer (e)

After:
IGRAPH U--- 76 336 -- 
+ attr: gender (v), generation (v), id (v), label (v), layer (e)


<br>

---

### NetworkX

<span style="font-size:13pt">Note that NetworkX automatically labels nodes from 0 through N-1 but these labels very likely do not line up with the mapping from IDs to labels as stored in the class. However, we do always store the labels as node attributes.</span>

In [17]:
# prerequisites, also make sure to import the MLN class and create the
# `popnet` instance, as shown at the very start of this tutorial

import networkx as nx

filtered = popnet.get_filtered_network(nodes_selected=selection)

---

<span style="font-size:13pt">We can also obtain a **NetworkX graph** representation of the MLN. Since NetworkX is a less efficient library, we recommend using it only for networks smaller than `mln.nx_limit`. To create a NetworkX graph for a network that is larger:</span>

In [18]:
g_nx = filtered.to_networkx(ignore_limit=True)

# nx.info() was removed in NetworkX 3.0; printing the graph gives the same summary
print(g_nx)

DiGraph with 76 nodes and 672 edges


<br><span style="font-size:13pt">To obtain an **undirected** version of the network:</span>

In [19]:
g_nx = filtered.to_networkx(node_attributes=True, directed=False, ignore_limit=True)

print(g_nx)  # replaces nx.info(), which was removed in NetworkX 3.0

Graph with 76 nodes and 336 edges


<br><span style="font-size:13pt">To **omit link types and node attributes** (except for the node labels):</span>

In [20]:
print(g_nx.nodes[0])
print(g_nx.edges[0, 15]) # inspect one link manually to show the link type

# new graph without node and edge attributes
g_nx = filtered.to_networkx(edge_attributes=False, node_attributes=False, ignore_limit=True)

print()
print(g_nx.nodes[0])
print(g_nx.edges[0,15])

{'id': 0, 'label': 50, 'generation': 1, 'gender': 0}
{'link_types': ['sister / brother-in-law']}

{'label': 50}
{}


<span style="font-size:13pt">The NetworkX graphs are not stored as an attribute of the MLN class, so there is no equivalent of the `replace_igraph` argument.</span>

---


## Exporting and importing

---

<span style="font-size:13pt">To **export all data** in an MLN object to a library:</span>

In [21]:
filtered = popnet.get_filtered_network(nodes_selected=selection)
filtered.save(path="filtered_full")

The folder "filtered_full/" already exists, call function with `overwrite = True` keyword argument if you're sure!


<span style="font-size:13pt">If a folder with the given name already exists, we can overwrite it with the argument `overwrite=True`.</span>

<br><span style="font-size:13pt">To **import** from a library:</span>

In [22]:
filtered2 = MultiLayerNetwork(
    load_from_library=True,
    library_path="filtered_full"
)

<br><span style="font-size:13pt">The **node attributes** can also be exported separately:</span>

In [23]:
filtered.export_nodes("filtered_nodes.csv.gz")

<span style="font-size:13pt">Uncompressed files and other file separators are also available.</span>
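
The `.csv.gz` extension suggests a gzip-compressed CSV file. The sketch below shows a round trip for such a file using only the standard library. The file name `nodes_demo.csv.gz`, the comma separator, and the two sample rows are assumptions for illustration; it does not call `export_nodes` and may not match its exact output layout.

```python
import csv
import gzip
import os
import tempfile

# Two sample rows with the column names of `popnet.nodes`.
rows = [
    {"id": 0, "label": 50, "generation": 1, "gender": 0},
    {"id": 1, "label": 51, "generation": 1, "gender": 1},
]

path = os.path.join(tempfile.mkdtemp(), "nodes_demo.csv.gz")

# Write a gzip-compressed CSV; `delimiter` is where another separator would go.
with gzip.open(path, "wt", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=list(rows[0]), delimiter=",")
    writer.writeheader()
    writer.writerows(rows)

# Read it back; the csv module returns every field as a string.
with gzip.open(path, "rt", newline="") as handle:
    loaded = list(csv.DictReader(handle, delimiter=","))

print(loaded[1]["label"])  # 51
```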

<br><span style="font-size:13pt">The **adjacency matrix** can also be exported separately; the file extension determines the output format:</span>

In [24]:
# TODO this should be export graphml
filtered.export_edges("filtered_graph.graphml")

<span style="font-size:13pt">Other file types and separators are also available.</span>

---

## Other functionalities / examples

In [25]:
# prerequisites, also make sure to import the MLN class and create the
# `popnet` instance, as shown at the very start of this tutorial

---

<span style="font-size:13pt">To obtain the ego network of a person at a certain depth (the returned object is also of the MLN class):</span>

In [26]:
ego = popnet.get_egonetwork(popnet.to_label(4), depth=3)
ego.A

<146x146 sparse matrix of type '<class 'numpy.int64'>'
	with 2628 stored elements in Compressed Sparse Row format>
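
Conceptually, an ego network at depth `k` contains every node that can be reached from the chosen person in at most `k` steps. The sketch below shows this with a breadth-first search on a plain adjacency dictionary. It is an illustration of the idea, not the implementation of `get_egonetwork`, and the function name `ego_nodes` is hypothetical.

```python
from collections import deque


def ego_nodes(adjacency, center, depth):
    """Return the set of nodes within `depth` steps of `center` (breadth-first search)."""
    seen = {center}
    queue = deque([(center, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return seen


# Toy path graph 0 - 1 - 2 - 3 - 4, stored as an adjacency dict.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(ego_nodes(path_graph, 0, depth=2)))  # [0, 1, 2]
```

In the output above the matrix keeps its 146x146 shape while storing fewer elements, which suggests that nodes outside the ego network remain as empty rows and columns; this reading is an assumption.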

<br><span style="font-size:13pt">To create an affiliation matrix between people and a certain attribute, e.g. gender:</span>

In [27]:
# first create an edgelist of (person, attribute value) pairs 
affiliation_edgelist = popnet.nodes.set_index("label")['gender'].dropna().to_dict().items()

# create the affiliation matrix, under key 'gender'
popnet.create_affiliation_matrix('gender', affiliation_edgelist)

<span style="font-size:13pt">This affiliation matrix is now stored in `.affiliation_matrix`, which is a dictionary that can store several affiliation matrices. We can access the one that was just made with key "gender" using:</span>

In [28]:
popnet.affiliation_matrix["gender"]["A"]

<146x2 sparse matrix of type '<class 'numpy.int64'>'
	with 146 stored elements in Compressed Sparse Row format>
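
The shape 146x2 with 146 stored elements matches one row per person, one column per gender value, and one entry per person. A minimal sketch of building such a matrix from (person, attribute value) pairs with SciPy is shown below. It uses the first four rows of `popnet.nodes` as data and is not the implementation of `create_affiliation_matrix`; in particular, the row and column ordering is an assumption.

```python
import numpy as np
from scipy.sparse import csr_matrix

# (person label, attribute value) pairs, analogous to `affiliation_edgelist`.
pairs = [(50, 0), (51, 1), (52, 1), (53, 1)]

people = sorted({person for person, _ in pairs})
values = sorted({value for _, value in pairs})
row_of = {person: i for i, person in enumerate(people)}
col_of = {value: j for j, value in enumerate(values)}

rows = np.array([row_of[person] for person, _ in pairs])
cols = np.array([col_of[value] for _, value in pairs])
data = np.ones(len(pairs), dtype=np.int64)

# One row per person, one column per attribute value, 1 where they are linked.
affiliation = csr_matrix((data, (rows, cols)), shape=(len(people), len(values)))

print(affiliation.shape, affiliation.nnz)  # (4, 2) 4
print(affiliation.toarray().sum(axis=0))   # [1 3]
```

The column sums give the number of people per attribute value.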

To get the adjacency matrix of a single layer, and optionally store it in the class with `store=True`:

In [29]:
popnet.get_layer_adjacency_matrix(layer = "parent")

<146x146 sparse matrix of type '<class 'numpy.int64'>'
	with 338 stored elements in Compressed Sparse Row format>

In [30]:
popnet.get_layer_adjacency_matrix(layer = "school", store = True)

<146x146 sparse matrix of type '<class 'numpy.int64'>'
	with 762 stored elements in Compressed Sparse Row format>

In [31]:
popnet.layer_adjacency_matrix

{'school': <146x146 sparse matrix of type '<class 'numpy.int64'>'
 	with 762 stored elements in Compressed Sparse Row format>}

In [32]:
popnet.get_edgelist(edge_attribute="layer")

Unnamed: 0,source,target,layer
0,50,51,sibling
1,50,52,sibling
2,50,82,sister / brother-in-law
3,50,97,sister / brother-in-law
4,50,117,aunt / uncle
...,...,...,...
3678,199,197,household
3678,199,197,sibling
3679,199,198,school
3679,199,198,household


In [33]:
# Get all methods and attributes
methods_and_attributes = dir(popnet)

# Filter out the methods
methods = [method for method in methods_and_attributes if callable(getattr(popnet, method)) and not method.startswith("__")]

print('\n'.join(methods))

convert_layer_binary_to_list
convert_layer_representation
create_affiliation_matrix
export_edges
export_layers
export_nodes
get_aggregated_network
get_clustering_coefficient
get_degrees
get_edgelist
get_egonetwork
get_filtered_network
get_layer_adjacency_matrix
get_supra_adj_matrix
init_codebook
init_layer_dict
load
report_time
save
to_binary_adjacency
to_id
to_igraph
to_label
to_networkx
verboseprint
