# Mesoscale Structures Detection

This notebook provides some examples of how SurpriseMeMore can be used to
detect mesoscale structures (e.g. core-periphery) in binary and weighted
networks. The examples use the undirected *Les Misérables* character network,
but everything shown here generalizes to directed networks.

### Important

All the methods we present are heuristic, so we suggest running them more than once
to increase the chance of finding the best partitioning. In practice, the methods
usually find the best solution within a few runs.

In [4]:
from surprisememore import Undirected_Graph_Class as ug
import numpy as np
import networkx as nx
import pandas as pd

aux_path = 'out.moreno_lesmis_lesmis'
prova = np.loadtxt(aux_path,comments='%')
edgelist = pd.DataFrame(prova,columns=['source','target','weight'])
network = nx.from_pandas_edgelist(edgelist,source='source',target='target',edge_attr=True,create_using=nx.Graph)
adjacency_matrix = nx.to_numpy_array(network)

We initialize our SurpriseMeMore **UndirectedGraph** object with the adjacency
matrix. The graph can be built from either an adjacency matrix or an edgelist.
We suggest using a NumPy array in both cases.

## Basic Usage

In [5]:
graph = ug.UndirectedGraph(adjacency_matrix)

Now that our graph instance is initialized, we can run discrete detection of
mesoscale structures with the following command.

In [6]:
graph.run_discrete_cp_detection()

# The optimal partitioning is given by
print(graph.solution)
# This is a numpy array storing the membership of each node. Nodes with
# the same membership are in the same cluster.

# The corresponding log-surprise is
print(graph.log_surprise)
# and the associated p-value (surprise) is
print(graph.surprise)

100%|██████████| 254/254 [00:00<00:00, 2400.76it/s]
100%|██████████| 254/254 [00:00<00:00, 2619.18it/s]

[1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 1 1 1
 1 1 1]
217.99342901379407
1.0152452966619884e-218
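
To make the printed output easier to read, the sketch below groups node positions by their label and checks how the two printed scores relate. The helper `nodes_by_label` and the short `example` array are hypothetical and only illustrate the idea. The base-10 relation is inferred from the numbers printed above, not taken from the library documentation.

```python
import math

# Values copied from the output printed above.
log_surprise = 217.99342901379407
surprise = 1.0152452966619884e-218

# The two printed numbers are consistent with surprise = 10 ** (-log_surprise).
consistent = math.isclose(10 ** (-log_surprise), surprise, rel_tol=1e-6)
print(consistent)  # True


def nodes_by_label(membership):
    """Map each label to the list of node positions carrying it (hypothetical helper)."""
    groups = {}
    for node, label in enumerate(membership):
        groups.setdefault(int(label), []).append(node)
    return groups


# Hypothetical short membership array, in the same format as graph.solution.
example = [1, 1, 0, 1, 0]
print(nodes_by_label(example))  # {1: [0, 1, 3], 0: [2, 4]}
```

The positions are indices into the membership array, so they follow the row order of the adjacency matrix passed to the graph.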





### Important

Every time you run the algorithm, the values of *solution*, *log_surprise* and
*surprise* are overwritten.
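
Since the methods are heuristic and each run overwrites these attributes, keeping the best result over several runs requires copying the values after each run. The sketch below is one possible way to do it. The helper `keep_best` is hypothetical, and it assumes that a larger *log_surprise* (a smaller p-value) means a better partition.

```python
import copy


def keep_best(graph, n_runs=5, **detection_kwargs):
    """Run discrete detection n_runs times and return the best result seen.

    Hypothetical helper: it relies only on run_discrete_cp_detection and the
    solution / log_surprise / surprise attributes used in this notebook.
    Assumption: a larger log_surprise indicates a better partition.
    """
    best = None
    for _ in range(n_runs):
        graph.run_discrete_cp_detection(**detection_kwargs)
        if best is None or graph.log_surprise > best["log_surprise"]:
            # Copy the membership so that later runs cannot overwrite it.
            best = {
                "solution": copy.copy(graph.solution),
                "log_surprise": graph.log_surprise,
                "surprise": graph.surprise,
            }
    return best
```

With the `graph` object built earlier, a call such as `keep_best(graph, n_runs=5, initial_guess="random")` would return a dictionary holding the best of five runs.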

## Mesoscale Detection Arguments

SurpriseMeMore mesoscale detection methods allow the user to pass arguments
specifying some aspects of the optimization process. In what follows we
briefly discuss these options.

### initial_guess

The user can pass their own initial guess to the algorithm (taking care that
it is a proper initial guess) or use one of the implemented ones.

* *random*: membership is assigned to nodes randomly. If the *method* is agglomerative,
this option does not affect the initial guess;

* *ranked*: the 5% of nodes with the highest degree/strength are placed in the core.
 We suggest using this option if you are looking for *core-periphery* or
 *bow-tie structures*.

* *eigenvector*: the 5% of nodes with the highest eigenvector centrality are placed
 in the core. We suggest using this option if you are looking for *core-periphery* or
 *bow-tie structures*.

The default value is *ranked*.
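
To illustrate the idea behind the *ranked* option, the sketch below puts the top 5% of nodes by strength in the core. It is not the library's implementation: the function `ranked_guess`, its parameters, the rounding rule, and the convention that label 0 marks the core are all assumptions made for this example.

```python
import numpy as np


def ranked_guess(adjacency, core_fraction=0.05, core_label=0):
    """Build a two-block membership array from node strengths (illustrative only).

    Assumptions: core_label=0 marks the core, the other nodes get 1 - core_label,
    and at least one node is always placed in the core.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n_nodes = adjacency.shape[0]
    # Row sums give the strength (the degree, for a binary matrix).
    strength = adjacency.sum(axis=1)
    n_core = max(1, int(round(core_fraction * n_nodes)))
    # Stable sort on the negated strengths: ties are broken by node index.
    core_nodes = np.argsort(-strength, kind="stable")[:n_core]
    membership = np.full(n_nodes, 1 - core_label, dtype=int)
    membership[core_nodes] = core_label
    return membership


# Toy star graph: node 0 is linked to every other node.
star = np.zeros((5, 5))
star[0, 1:] = 1
star[1:, 0] = 1
print(ranked_guess(star))  # [0 1 1 1 1]
```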

### weighted

This argument is relevant when the UndirectedGraph (DirectedGraph) instance is
initialized with a weighted network. In that case, to run binary mesoscale detection
we must specify *weighted*=False.

```
    graph.run_discrete_cp_detection(weighted=False)
```

The snippet above runs binary mesoscale detection on the Les Misérables graph.

The *weighted* argument applies only to the discrete mesoscale detection methods;
the enhanced mesoscale detection has no binary version of the algorithm.

The default value is *None*, in which case the algorithm chooses the proper
method for the network:

* weighted network --> weighted surprise;

* binary network --> binary surprise;
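
If you want to know beforehand which variant applies to your data, you can inspect the adjacency matrix yourself. The sketch below shows one way to do so. The helpers `is_binary` and `binarize` are hypothetical, and this is not necessarily how the library makes its choice.

```python
import numpy as np


def is_binary(adjacency):
    """Return True if every entry of the matrix is 0 or 1 (hypothetical helper)."""
    adjacency = np.asarray(adjacency)
    return bool(np.isin(adjacency, (0, 1)).all())


def binarize(adjacency):
    """Replace every positive weight with 1 (hypothetical helper)."""
    return (np.asarray(adjacency) > 0).astype(int)


# Toy weighted matrix with a single edge of weight 3.
weighted_example = np.array([[0, 3], [3, 0]])
print(is_binary(weighted_example))            # False
print(is_binary(binarize(weighted_example)))  # True
```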

### num_sim

Number of times the algorithm sweeps over all the links trying to improve the
partitioning. If no improvement is detected 10 times in a row, the algorithm
stops.

The default value is 2.

## Discrete Mesoscale Detection


### Binary

In [7]:
# An example of how we can run it
graph.run_discrete_cp_detection(weighted=False,
                                initial_guess="random",
                                num_sim=3)
print("The solution is", graph.solution)

100%|██████████| 254/254 [00:01<00:00, 159.62it/s]
100%|██████████| 254/254 [00:00<00:00, 5079.40it/s]
100%|██████████| 254/254 [00:00<00:00, 5146.19it/s]

The solution is [1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 1 0 1 1 0
 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 1 1 0 1 1 1 0 0 1 1 0 0 0 0 0 0 1 1 0
 1 1 1]





### Weighted

In [8]:
# An example of how we can run it
graph.run_discrete_cp_detection(weighted=True,
                                initial_guess="random",
                                num_sim=3)
print("The solution is", graph.solution)

100%|██████████| 254/254 [00:00<00:00, 3496.26it/s]
100%|██████████| 254/254 [00:00<00:00, 3920.89it/s]
100%|██████████| 254/254 [00:00<00:00, 2977.28it/s]

The solution is [1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0
 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 1 1 0 1 1 1 0 0 1 1 0 0 0 0 0 0 1 1 1
 1 1 1]





## Enhanced Mesoscale Detection

For more details about the differences between the enhanced and the discrete methods,
see the related paper (you can find the link in the README).

In [10]:
# An example of how we can run it
graph.run_enhanced_cp_detection(initial_guess="eigenvector",
                                num_sim=4)
print("The solution is", graph.solution)

100%|██████████| 254/254 [00:01<00:00, 252.60it/s]
100%|██████████| 254/254 [00:00<00:00, 2662.42it/s]
100%|██████████| 254/254 [00:00<00:00, 2646.33it/s]
100%|██████████| 254/254 [00:00<00:00, 2712.63it/s]


The solution is [1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 1 1 1
 1 1 1]
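
The binary, weighted and enhanced runs above return membership arrays that differ for some nodes. A quick way to quantify this is the fraction of nodes that receive the same label in two solutions. The helper `agreement` below is hypothetical, and it assumes that the labels carry the same meaning in both runs.

```python
def agreement(solution_a, solution_b):
    """Fraction of nodes with the same label in two membership arrays.

    Hypothetical helper. Assumption: labels mean the same thing in both runs,
    so they can be compared position by position.
    """
    if len(solution_a) != len(solution_b):
        raise ValueError("the two solutions must have the same length")
    same = sum(int(a) == int(b) for a, b in zip(solution_a, solution_b))
    return same / len(solution_a)


# Hypothetical short arrays, in the same format as graph.solution.
print(agreement([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.75
```

Because each run overwrites `graph.solution`, copy the array after each run before comparing.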
