Permalink
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
263 lines (193 sloc) 6.71 KB
#%%
import markdown
from IPython.core.display import display, HTML
def md(str):
display(HTML(markdown.markdown(str + "<br />")))
#%%
md("""
# Infomap
Infomap is a stochastic network clustering algorithm based on minimizing the [Map equation](http://www.mapequation.org/publications.html#Rosvall-Axelsson-Bergstrom-2009-Map-equation).
### The Map Equation
\begin{equation*}
L(M) = q_\curvearrowright H(\mathcal{Q}) + \sum_{i = 1}^{m}{p_{\circlearrowright}^i H(\mathcal{P}^i)}
\end{equation*}
$L(M)$ measures the amount of information it takes to describe a random walk on a network given a partition of the network into modules $M$. It is a sum of the amount of information needed to describe the movements _between_ and _within_ the modules, which balances the goodness of fit with the complexity of the model. For more information, see [www.mapequation.org](http://www.mapequation.org).
### Features
Infomap supports
* Unweighted and weighted links
* Undirected and directed links
* Two-level and multi-level solutions
* First-order and second-order dynamics
* Hard partitions and overlapping partitions
* Single- and multi-layer networks
""")
#%%
md("""
## Getting started
See https://mapequation.github.io/infomap/ for a simple example and python API to get started with Infomap in python
""")
#%%
md("""
### Install Infomap
The v1.0 beta release is available on the PyPI, install it with
```
pip install infomap
```
or upgrade with
```
pip install --upgrade infomap
```
""")
#%%
md("""
**TODO:** Import infomap. Show the version by running `infomap.Infomap().version` if infomap is imported as `infomap` and check that it is at least `1.0.0-beta.14`.
""")
#%% In [1]
# TODO: Fill code here
#%%
md("""
The above should be at least `1.0.0-beta.11`.
""")
#%%
md("""
### Output
Write all files during the tutorials to the `../output` folder, where git will ignore them.
""")
#%%
md("""
## Basic command line use
The installation of the python package also installs a binary for command line use, exemplified below.
See http://www.mapequation.org/code.html#Options for available input flags to `Infomap`
""")
#%%
md("""
**TODO:** Try run the command line version of Infomap installed with the python package. Command line programs can be called directly from jupyter by adding `!` in front, like `!ls`. Run Infomap on the `ninetriangles.net` network in the `data` folder and direct output to the `output` folder. Run it with `5` trials to see the effect of the stochastic nature of Infomap.
""")
#%% In [2]
# TODO: Fill code here
#%%
md("""
### Input network
The input network above was formed as nine triangles clustered in three levels, which was also recovered with the Infomap clustering algorithm after some trials.
![triangle-network](http://www.mapequation.org/assets/img/triangle-network-levels_3.svg)
""")
#%%
md("""
**TODO:**
- Print the input network from above to see the standard input format for Infomap. (Hint: `pathlib.Path(dir)` has a method `read_text()` to give back the whole string.)
""")
#%% In [3]
# TODO: Fill code here
#%%
md("""
### Output format
By default on command line, Infomap writes an output file with the same name as the input by with the `.tree` extension. This file contains the multi-level modular structure of the input network.
**TODO:** Print the result from running Infomap above.
""")
#%% In [4]
# TODO: Fill code here
#%%
md("""
## From Python
The python API gives more flexibility, but we can still work with files in a similar way as the cli use above.
**TODO:**
- Cluster the same network as above but through the Infomap python interface.
- Print the number of levels found and the codelength
- Let Infomap write a [flow tree](http://www.mapequation.org/code.html#FTree-format) file to `ninetriangles.ftree` in the output folder and print the result.
""")
#%% In [5]
# TODO: Fill code here
#%%
md("""
## Basic programmatic use
**TODO:**
- Create an Infomap instance and add some links programmatically
- Run the clustering and iterate over the result and print the tree path to the node, the flow and, if a leaf node, the node id. (Hint: All leaf nodes has a unique `stateId` property and a `physicalId` property, that is identical on first-order networks)
""")
#%% In [6]
# TODO: Fill code here
#%%
md("""
## Infomap + NetworkX
Generate and draw a network with NetworkX, colored
according to the community structure found by Infomap.
**TODO:**
- Create a networkx graph (Hint: `karate_club_graph()` is available on `networkx`)
- Write a function that takes a networkx graph as input, runs a two-level Infomap clustering and sets clusters as node attributes on the networkx graph, mapped by the node id (physicalId).
- Render the networkx graph with nodes colored by the Infomap clustering
""")
#%% In [7]
# TODO: Fill code here
#%% In [8]
# TODO: Fill code here
#%%
md("""
## Higher-order networks
""")
#%%
md("""
### General state networks
The [state format](http://www.mapequation.org/code.html#State-format) describes the exact network used internally by Infomap. It can model both ordinary networks and memory networks (of variable order).
#### Example
```
*Vertices 4
1 "PRE"
2 "SCIENCE"
3 "PRL"
4 "BIO"
# *ngrams
# 1 2 3
# 1 2 2 3
# 4 2 4
*States
#stateId physicalId [name]
1 2 "1 2"
2 3 "2 3"
3 2 "1 2 2"
4 2 "4 2"
5 4 "2 4"
*Links
1 2
3 2
4 5
```
Here some ngrams are represented by ordinary links between a set of state nodes.
""")
#%%
md("""
#### Programmatically creating a state network
**TODO**:
- Create the above network programmatically with Infomap
- Run the clustering and iterate over the underlying state network to print out space separated list of `#stateId physicalId moduleIndex flow` for each leaf node.
- Iterate over the physical network and print `#physicalId moduleIndex flow` for each node
- What happened to state node 1 and 3?
""")
#%% In [9]
# TODO: Fill code here
#%%
md("""
### paths
Infomap can generate a higher-order state network from path data, specifying a certain markov order. Markov order 1 corresponds to an ordinary network where the memory is discarded.
**TODO:**
- Programmatically create a state network of certain order from path data
- Write the state network to file
- Partition the state network and print the codelength and the physical tree as above
- Did you get any overlapping modules on the physical nodes?
""")
#%% In [10]
# TODO: Fill code here
#%%
md("""
#### Running on the generated state network will give the same result
**TODO:**
- Run Infomap on the generated state network from above
- Check that the codelength is the same
""")
#%% In [11]
# TODO: Fill code here
#%%
md("""
### Visualising the multi-level modular network
**TODO:**
- Use the [Network Navigator](http://navigator.mapequation.org) to load the `.ftree` file generated earlier from the triangular network.
""")