# Detecting communities with *Infomap*
This notebook uses the *Infomap* [Python API](https://mapequation.github.io/infomap/python/) to detect communities in the constructed flow network.

In [2]:
import sys
from pathlib import Path

import infomap
import pandas as pd
from IPython.display import display  # importing display from IPython.core.display is deprecated

sys.path.insert(1, str(Path.cwd() / 'utils'))
from constants import NUMBER_OF_SEEDS  # noqa: E402
from classifications import *  # noqa: E402

## Loading network
*Infomap* reads the flow network from a file. A `DataFrame` records the codelength and the number of top-level modules for each uniquely seeded run of *Infomap*.

In [3]:
particle_type = Restricted
season = Season.fall
markov_time = 2

im = infomap.Infomap()
im.read_file(str(Path.cwd() / 'network' / particle_type.name / season.name / 'network.txt'))

seeds = pd.DataFrame(columns=['codelength', 'modules'])
seeds.index.name = 'seed'

## Running *Infomap*
Since *Infomap* is a stochastic and recursive heuristic algorithm, its results depend on the random seed. We therefore run it with a range of seeds and select the best result afterwards.

We give *Infomap* the following specifications:
* `-2` for a two-level solution.
* `-d` for a directed network.
* `-k` to include self-loops (edges connecting a vertex to itself).
* `-N 20` for 20 outer-most loops to be run before choosing the best network partition.
* `--markov-time markov_time` to set the Markov time, which controls the scale of the detected structures.
* `-s seed` to specify the random seed to be used.
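
The options listed above can also be assembled from named parameters instead of being written inline. The following is a minimal sketch only: `build_infomap_args` is a hypothetical helper introduced here for illustration and is not part of the *Infomap* API. It produces the same flags that are passed to `im.run` in this notebook.

```python
def build_infomap_args(markov_time, seed, num_trials=20):
    """Assemble the Infomap option string (hypothetical helper, not part of Infomap)."""
    flags = [
        '-k',                            # include self-loops
        '-2',                            # two-level solution
        '-d',                            # directed network
        f'--markov-time {markov_time}',  # scale of the detected structures
        f'-s {seed}',                    # random seed
        f'-N {num_trials}',              # number of outer-most loops
    ]
    return ' '.join(flags)


print(build_infomap_args(markov_time=2, seed=0))
# -k -2 -d --markov-time 2 -s 0 -N 20
```

Keeping each flag on its own line next to a comment makes it easier to see which option a given run used.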

In [4]:
for seed in range(NUMBER_OF_SEEDS):
    # Run 20 outer-most loops of Infomap to find the partition with the smallest codelength
    im.run(f'-k -2 -d --markov-time {markov_time} -s {seed} -N 20')
    seeds.loc[seed] = [im.codelength, im.num_top_modules]

    # Write best partition in .clu format
    clu = Path.cwd() / 'network' / particle_type.name / season.name / str(markov_time) / 'clus' / f'seed_{seed}.txt'
    clu.parent.mkdir(parents=True, exist_ok=True)  # make sure the output directory exists
    im.write_clu(str(clu))

## Solutions
The best solution is the one with the smallest codelength.

In [5]:
display(seeds)
best_seeds = seeds.index[seeds['codelength'] == seeds['codelength'].min()].tolist()
print(f'seed(s) of minimum codelength: {best_seeds}')

seed,codelength,modules
0,6.001675,13.0
1,6.021562,14.0
2,6.037439,12.0
3,6.002143,12.0
4,6.001675,13.0
...,...,...
95,6.023113,12.0
96,6.005448,12.0
97,6.045717,12.0
98,6.001375,12.0


seed(s) of minimum codelength: [27, 44]
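
Codelengths are floating-point values, so an exact equality test can miss seeds whose codelengths differ only by rounding error. As a hedged alternative, the sketch below selects every seed within a relative tolerance of the minimum. The function name `best_seeds_within` and the default tolerance are assumptions made for illustration. The sample values are the first five rows of the table above, so the result differs from the full run.

```python
import math


def best_seeds_within(codelengths, rel_tol=1e-9):
    """Return the seeds whose codelength equals the minimum within rel_tol."""
    minimum = min(codelengths.values())
    return sorted(
        seed for seed, value in codelengths.items()
        if math.isclose(value, minimum, rel_tol=rel_tol)
    )


# First five rows of the table above (illustrative subset, not the full run)
sample = {0: 6.001675, 1: 6.021562, 2: 6.037439, 3: 6.002143, 4: 6.001675}
print(best_seeds_within(sample))  # [0, 4]
```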
