<a href="https://colab.research.google.com/github/kicasta/Modeling_WUGS_WSBM/blob/master/example/example_modeling_wugs_wsbm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Prepare everything

We first need to install every dependency required to run this notebook in Colab.

In [None]:
!pip install pymc3 --upgrade

!echo "deb http://downloads.skewed.de/apt bionic main" >> /etc/apt/sources.list
!apt-key adv --keyserver keys.openpgp.org --recv-key 612DEFB798507F25
!apt-get update
!apt-get install -y python3-graph-tool python3-cairo python3-matplotlib

!apt install -y libgraphviz-dev
!pip install pygraphviz

!pip install pyvis

We also need to clone the repository into the `/content` directory so that every module is accessible from this notebook. Note that `/content` in Colab is purged every time your environment is restarted, so the repository has to be cloned again after a restart.

You might need to refresh the file browser to see the cloned repository.

In [None]:
!git clone https://github.com/kicasta/Modeling_WUGS_WSBM.git

We now add the repository's `src` directory to `sys.path` so that its modules can be imported directly.

In [None]:
import sys
sys.path.insert(0,'/content/Modeling_WUGS_WSBM/src/')
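
As a minimal sketch, a more defensive variant of the cell above could add the directory only if it exists and is not already on the path. The helper name `add_to_path` is hypothetical and not part of the repository.

```python
import os
import sys


def add_to_path(directory):
    """Prepend `directory` to sys.path if it exists and is not already there.

    Hypothetical helper for illustration. Returns True if the directory is
    on sys.path afterwards, False if the directory does not exist.
    """
    directory = os.path.abspath(directory)
    if not os.path.isdir(directory):
        return False
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return True


# In Colab this is the source directory of the cloned repository.
if not add_to_path("/content/Modeling_WUGS_WSBM/src/"):
    print("Source directory not found; was the repository cloned?")
```

Running the cell twice then leaves `sys.path` unchanged, and a missing clone shows up here rather than as an `ImportError` in the next cell.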

Import the modules we need.

In [None]:
import wsbm
import plot_utils as pltutil

import pickle

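Load the dictionaries holding the best fits found for the data presented in the paper. `states` maps each graph to a pair of the best-fitting distribution and the fitted state; `accuracies` maps each graph to the accuracy of that fit.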

In [None]:
output_path = "/content/Modeling_WUGS_WSBM/data/best_fit/"

with open(output_path + "g_dist_states", 'rb') as f:
  states = pickle.load(f)

with open(output_path + "g_accuracies", 'rb') as f:
  accuracies = pickle.load(f)
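
To make the structure of these dictionaries concrete, here is a small self-contained sketch that pickles and reloads toy data in the shape the later cells rely on: each value of `states` unpacks into a distribution label and a state, and `accuracies` presumably holds one number per graph. The graph names, labels, values, and placeholder state strings are all made up for illustration; the real states are objects produced by the repository's code.

```python
import os
import pickle
import tempfile

# Toy stand-ins (hypothetical): the real files hold fitted model states.
toy_states = {
    "graph_a": ("real-normal", "<state for graph_a>"),
    "graph_b": ("discrete-poisson", "<state for graph_b>"),
}
toy_accuracies = {"graph_a": 0.9, "graph_b": 0.75}

# Round trip through a temporary file, mirroring the load pattern above.
with tempfile.TemporaryDirectory() as tmp:
    states_file = os.path.join(tmp, "toy_states")
    with open(states_file, "wb") as f:
        pickle.dump(toy_states, f)
    with open(states_file, "rb") as f:
        loaded_states = pickle.load(f)

# Each value unpacks into a distribution label and a state.
for graph, (dist, state) in loaded_states.items():
    print(graph, dist, toy_accuracies[graph])
```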

In [None]:
#TODO: if possible to use graph file then show in this section how the best fit is found

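From these dictionaries we can plot several summary statistics: how many graphs each distribution fits best, the number of blocks per graph, and the accuracy of each fit.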

In [None]:
# Plot the number of graphs best fitted by each distribution
# Corresponds to plot_distributions.py
dists = [d for d,s in states.values()]
dist_count = {d:dists.count(d) for d in set(dists)}

dist_labels = dist_count.keys()
dist_values = [dist_count[d] for d in dist_labels]
dist_labels = [d.split("-")[1] for d in dist_labels]

pltutil.plot_dist_dist(dist_labels, dist_values)
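
The counting step above can also be written with `collections.Counter`. The sketch below runs on made-up labels in the `type-name` form that the `split("-")` call expects; the labels themselves are assumptions, not the contents of the real data.

```python
from collections import Counter

# Hypothetical best-fit labels, one per graph.
toy_dists = ["real-normal", "discrete-poisson", "real-normal", "real-exponential"]

# Count graphs per distribution, keeping only the part after the hyphen.
toy_count = Counter(d.split("-")[1] for d in toy_dists)

toy_labels = list(toy_count.keys())
toy_values = [toy_count[label] for label in toy_labels]
print(toy_labels, toy_values)
```

Unlike the original cell, this variant shortens the labels before counting, so two labels that share the part after the hyphen would be merged into one bar.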

In [None]:
# Plot the number of blocks of the best fit found for each graph
# Corresponds to plot_number_of_blocks.py
block_counts = {}
for g,v in states.items():
  state = v[1]
  block_counts[g] = len(set(wsbm.get_blocks(state)))

blocks_y = [block_counts[k] for k in block_counts.keys()]
pltutil.plot_single_stat(block_counts.keys(), blocks_y, "Number of Blocks", limy=False)
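
The cell above appears to rely on `wsbm.get_blocks(state)` returning one block label per node, so that the number of distinct labels is the number of blocks. Under that assumption, the counting can be sketched on toy data; the graph names and block labels below are hypothetical.

```python
# Hypothetical block assignments: one block label per node of each graph.
toy_blocks = {
    "graph_a": [0, 0, 1, 1, 2],
    "graph_b": [3, 3, 3, 3],
}

# Number of distinct blocks per graph.
toy_block_counts = {g: len(set(labels)) for g, labels in toy_blocks.items()}
print(toy_block_counts)
```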

In [None]:
# Plot the accuracy of the best fits 
# Corresponds to plot_accuracies.py
acc_y = [accuracies[k] for k in accuracies.keys()]
pltutil.plot_single_stat(accuracies.keys(), acc_y, "Accuracy")