## Demo of the contribution submitted to ECML-PKDD 2019
This notebook makes it easy to test the method introduced in the article:

`Detecting Stable Communities in Link Streams at Multiple Temporal Scales`

### tnetwork library installation
The code is built on a custom library, which needs to be installed with pip first.
If you are running this code online, you probably need to uncomment and execute the command in the next cell.

In [2]:
#!pip install --upgrade git+https://github.com/Yquetzal/tnetwork.git

### We can now import a set of useful libraries

In [1]:
import tnetwork as tn
import networkx as nx
from bokeh.io import show, output_notebook, output_file, reset_output
from bokeh.layouts import row

Finally, we import the methods introduced in the article. Their code is integrated in the tnetwork library, and is also available as stand-alone files in the same git repository as this notebook.

In [None]:
#Stable community detection method
from tnetwork.DCD.pure_python.community_tracker import track_communities 
#Generator
from tnetwork.DCD.multi_temporal_scale import generate_multi_temporal_scale

## Generate a random graph with multiple temporal scale communities


In [66]:
T=5000
N= 50
SC=5
graph,original_coms = generate_multi_temporal_scale(nb_steps=T,nb_nodes=N,nb_com = SC)

## Plot the communities to discover
Each horizontal position corresponds to a node; colored lines represent communities.

In [67]:
output_file("ground_truth.html")
p = tn.plot_longitudinal(tn.DynGraphIG(),original_coms.to_DynCommunitiesIG(sn_duration=1),nodes=[str(x) for x in range(50)],height=400)
show(p)


## Detect stable communities using the proposed algorithm

In [69]:
# Start with a window of one third of the stream length (T), then halve it down to 1
window_size = T/3
time_periods = [int(window_size)]
while window_size>1:
    window_size = int(window_size/2)
    time_periods.append(window_size)

(persistant_coms,_,_) = track_communities(graph,time_periods)

[1666, 833, 416, 208, 104, 52, 26, 13, 6, 3, 1]
------- 1666
aggregating graph
computing communities
computing quality for each com
#nb seeds total 28
#nb good seeds 0
#nb different seeds 0
tracking
# persistent communities 0
------- 833
aggregating graph
computing communities
computing quality for each com
#nb seeds total 104
#nb good seeds 29
#nb different seeds 29
tracking
# persistent communities 0
------- 416
aggregating graph
computing communities
computing quality for each com
#nb seeds total 571
#nb good seeds 68
#nb different seeds 68
tracking
# persistent communities 0
------- 208
aggregating graph
computing communities
computing quality for each com
#nb seeds total 1542
#nb good seeds 97
#nb different seeds 97
tracking
# persistent communities 11
------- 104
aggregating graph
computing communities
computing quality for each com
#nb seeds total 3617
#nb good seeds 134
#nb different seeds 78
tracking
# persistent communities 26
------- 52
aggregating graph
computing communities
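
The first line of the log above is the list of window sizes handed to `track_communities`. As a quick sanity check, the halving scheme can be reproduced without tnetwork. This is only a minimal sketch: the helper name `halving_window_sizes` and its `first_fraction` parameter are hypothetical and not part of the library.

```python
def halving_window_sizes(nb_steps, first_fraction=3):
    """Return window sizes starting at nb_steps/first_fraction and halving down to 1.

    Hypothetical helper mirroring the loop used in this notebook.
    """
    window_size = nb_steps / first_fraction
    sizes = [int(window_size)]
    while window_size > 1:
        # Integer halving: each scale is roughly half the previous one
        window_size = int(window_size / 2)
        sizes.append(window_size)
    return sizes


sizes = halving_window_sizes(5000)
print(sizes)
# → [1666, 833, 416, 208, 104, 52, 26, 13, 6, 3, 1]
```

With `nb_steps=5000` this gives the same eleven scales as the log, from 1666 steps down to a single step.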

## Plot the discovered communities
We first define a function that selects the resulting communities according to their quality score SQ and/or their duration.

In [70]:
def most_stable_communities(persistant_coms,nb_coms=None,duration_min=0,duration_max=10000000):
    # Keep at most nb_coms communities whose duration lies strictly between the two bounds
    visu_blocks = tn.DynCommunitiesIG()
    if nb_coms is None:
        nb_coms = len(persistant_coms)
    for nodes,period,current_granularity,score in persistant_coms[:nb_coms]:
        if duration_min<period.duration()<duration_max:
            # Label each community with the window size it was found at and its nodes
            name = str(current_granularity)+" "+str(nodes)
            visu_blocks.add_affiliation(nodes,name,period)
    return visu_blocks
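
To see what this selection does without running the tracker, here is a minimal sketch on toy data. `ToyPeriod` is a hypothetical stand-in that only mimics the `duration()` call used above, and the tuples are made-up values that follow the `(nodes, period, granularity, score)` layout unpacked in the function. It does not use tnetwork.

```python
class ToyPeriod:
    """Hypothetical stand-in for a period object; only duration() is mimicked."""

    def __init__(self, start, end):
        self.start, self.end = start, end

    def duration(self):
        return self.end - self.start


# Made-up (nodes, period, granularity, score) tuples, for illustration only
toy_coms = [
    ({"a", "b", "c"}, ToyPeriod(0, 400), 208, 0.9),
    ({"d", "e"}, ToyPeriod(100, 150), 52, 0.8),
    ({"f", "g", "h"}, ToyPeriod(0, 3000), 833, 0.7),
    ({"i", "j"}, ToyPeriod(10, 500), 208, 0.6),
]


def select(coms, nb_coms=None, duration_min=0, duration_max=10_000_000):
    """Keep the first nb_coms entries whose duration is strictly within the bounds."""
    if nb_coms is None:
        nb_coms = len(coms)
    return [
        (granularity, sorted(nodes))
        for nodes, period, granularity, _score in coms[:nb_coms]
        if duration_min < period.duration() < duration_max
    ]


selected = select(toy_coms, nb_coms=3, duration_min=100, duration_max=1000)
print(selected)
# → [(208, ['a', 'b', 'c'])]
```

Only the first three entries are considered, and of those only the one lasting 400 steps falls between the two duration bounds.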


### Plot communities found, to compare them visually

In [None]:
com_to_plot = most_stable_communities(persistant_coms,nb_coms=40)
p = tn.plot_longitudinal(tn.DynGraphIG(),com_to_plot,nodes=[str(x) for x in range(50)],height=400)
output_file("found.html")
show(p)

## First evaluation: we compute the normalized mutual information (NMI) between communities in the ground truth (ignoring time) and:

 1) communities found on the cumulated graph
 
 2) communities found by our algorithm (ignoring time)
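
For readers unfamiliar with NMI, the sketch below computes the standard normalized mutual information between two flat, non-overlapping partitions in plain Python. It is an illustration only: the function name `partition_nmi` is hypothetical, and the `NMI` function from tnetwork used later in this notebook takes lists of node sets and may implement a different variant.

```python
from collections import Counter
from math import log


def partition_nmi(labels_a, labels_b):
    """NMI between two flat partitions given as equal-length label lists.

    Uses the normalization 2*I(A;B) / (H(A) + H(B)).
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    mutual_info = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    h_a, h_b = entropy(count_a), entropy(count_b)
    if h_a + h_b == 0:
        return 1.0  # both partitions consist of a single block
    return 2 * mutual_info / (h_a + h_b)


# Same grouping with swapped label names: NMI is 1
same = partition_nmi([0, 0, 1, 1], [1, 1, 0, 0])
# Groupings that share no information: NMI is 0
independent = partition_nmi([0, 0, 1, 1], [0, 1, 0, 1])
print(same, independent)
```

A score close to 1 means the two groupings agree up to a renaming of the communities; a score close to 0 means knowing one grouping says nothing about the other.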

### First, detect communities on the cumulated graph

In [71]:
# Aggregate all interactions of the link stream into a single static graph
cumulated_graph = graph.cumulated_graph()
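
The idea of cumulating a link stream can be sketched in plain Python: every interaction, whatever its time, is merged into one static edge set. The toy stream and the `cumulate` helper below are hypothetical. Counting repeated interactions as weights is an assumption made for illustration, not a statement about what tnetwork's `cumulated_graph()` stores.

```python
from collections import Counter

# Hypothetical toy link stream: (time, node_u, node_v) triples
toy_stream = [
    (0, "a", "b"),
    (1, "a", "b"),
    (1, "b", "c"),
    (2, "c", "d"),
    (3, "b", "a"),
]


def cumulate(stream):
    """Merge all timed interactions into undirected edges with occurrence counts."""
    weights = Counter()
    for _time, u, v in stream:
        edge = tuple(sorted((u, v)))  # undirected: (a, b) and (b, a) are the same edge
        weights[edge] += 1
    return weights


cumulated = cumulate(toy_stream)
print(dict(cumulated))
# → {('a', 'b'): 3, ('b', 'c'): 1, ('c', 'd'): 1}
```

Because time is discarded, communities that exist only during short periods are blended with everything else, which is why the cumulated graph serves as a baseline here.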

In [72]:
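# Louvain community detection; assumes best_partition comes from the python-louvain package
from community import best_partition
CUMULATED_coms = best_partition(cumulated_graph)
nodeSet = tn.utils.community_utils.affiliations2nodesets(CUMULATED_coms)
CUMULATED_coms = list(nodeSet.values())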

### Get communities in the ground truth

In [73]:
from tnetwork.DCD.analytics.NMIs import NMI
GT_coms = []
for a_com in original_coms.communities().values():
    GT_coms.append(set(a_com.keys()))


### Get communities in our solution

In [74]:
OUR_coms = []
for a_com in com_to_plot.communities().values():
    OUR_coms.append(set(a_com.keys()))

### Compute NMIs

In [75]:
print("NMI for cumulated: ",NMI(GT_coms,CUMULATED_coms))
print("NMI for proposed solution: ",NMI(GT_coms,OUR_coms))

NMI for cumulated:  0.21485577256207966
NMI for proposed solution:  0.6987209133347945
