# Community detection - Stochastic block model

Original data: from the `painter_networks.ipynb` notebook

## Imports, initial procedures

In [1]:
import json
import pandas as pd
import matplotlib.pyplot as plt
from graph_tool.all import *

In [None]:
# Load the graph from the GraphML file
g = load_graph("data/painters.graphml")
names = list(g.vertex_properties["artist"])

print("Vertex properties: ", list(g.vertex_properties.keys()))
print("\nExample artist:", g.vertex_properties["artist"][2])
for prop in g.vertex_properties:
    print(prop, g.vertex_properties[prop][g.vertex(2)])

Vertex properties:  ['Art500k_Movements', 'FirstYear', 'FriendsandCoworkers', 'Influencedby', 'Influencedon', 'LastYear', 'Nationality', 'PaintingSchool', 'PaintingsExhibitedAt', 'PaintingsExhibitedAtCount', 'Pupils', 'StylesCount', 'StylesYears', 'Teachers', '_graphml_vertex_id', 'artist', 'birth_place', 'birth_year', 'citizenship', 'death_place', 'death_year', 'gender', 'locations', 'locations_with_years', 'movement', 'name', 'occupations', 'styles', 'styles_extended', 'wikiart_pictures_count']
Example artist: Jean-Baptiste-Simeon Chardin
Art500k_Movements {Realism:1}
FirstYear 1728.0
FriendsandCoworkers 
Influencedby 
Influencedon 
LastYear 1753.0
Nationality 
PaintingSchool 
PaintingsExhibitedAt 
PaintingsExhibitedAtCount 
Pupils 
StylesCount 
StylesYears 
Teachers 
_graphml_vertex_id Jean-Baptiste-Simeon Chardin
artist Jean-Baptiste-Simeon Chardin
birth_place Paris
birth_year 1699.0
citizenship France
death_place Paris
death_year 1779.0
gender male
locations ['Paris']
locations_wi

## Community detection

To find communities, we fit a nested stochastic block model (SBM). The nested SBM is a hierarchical variant: besides grouping artists into communities, it groups those communities into higher-level blocks, which shows how the communities relate to each other (the plot of the hierarchy makes this visible). For the analysis we use only the first level of the hierarchy, `state.levels[0]`.
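To make the idea of hierarchy levels concrete, here is a minimal pure-Python sketch. It assumes the hierarchy is given as one label list per level; the labels and the helper `project_partition` are made up for illustration and are not taken from the painters graph or from graph-tool.

```python
# Toy illustration only: these labels are invented, not fitted to any data.
# levels[0][v] is the block of vertex v; levels[1][b] is the parent of block b, etc.
levels = [
    [0, 0, 1, 1, 2, 2],  # level 0: six vertices in three blocks
    [0, 0, 1],           # level 1: the three blocks merged into two
    [0, 0],              # level 2: the two blocks merged into one
]


def project_partition(levels, level):
    """Return, for every vertex, its block label at the given hierarchy level."""
    labels = list(levels[0])
    # Follow each block to its parent, one level at a time
    for parent in levels[1:level + 1]:
        labels = [parent[b] for b in labels]
    return labels


print(project_partition(levels, 0))  # [0, 0, 1, 1, 2, 2]
print(project_partition(levels, 1))  # [0, 0, 0, 0, 1, 1]
print(project_partition(levels, 2))  # [0, 0, 0, 0, 0, 0]
```

Using only the first level, as we do here, corresponds to `project_partition(levels, 0)`: the finest grouping of vertices, without the coarser groupings above it.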

In [20]:
# Fit the nested SBM; the edge weight enters as a real-valued covariate ("real-exponential")
state = minimize_nested_blockmodel_dl(g, state_args=dict(recs=[g.ep.weight], rec_types=["real-exponential"]))
state.levels[0]

<BlockState object with 2400 blocks (49 nonempty), degree-corrected, with 1 edge covariate, for graph <Graph object, undirected, with 2400 vertices and 18725 edges, 30 internal vertex properties, 2 internal edge properties, at 0x7f4bcc5fa750>, at 0x7f4bdc86f710>

The fit assigns the 2400 artists to 49 non-empty communities, which we will analyze further. Before that, it is useful to see how the communities are connected:

**NOTE**: The cell output was removed to save memory; the figure is saved as a .png in the `images` subfolder.

In [46]:
state.draw(edge_color=[0.6, 0.6, 0.6, 0.3],
           eorder=g.ep.weight,  # edge ordering is important!
           ecmap=plt.cm.Greys,
           output="images/painters_nested_blockmodel.png");

We save the communities to a JSON file and do the analysis in the main notebook (`painter_networks.ipynb`).

In [62]:
# Level-0 block (community) label of every vertex
b = state.levels[0].get_blocks()

# Group artist names by community
block_to_artist = {}
for v in g.vertices():
    block_id = int(b[v])  # cast to a plain int so the JSON keys stay simple
    artist = g.vertex_properties['artist'][v]
    block_to_artist.setdefault(block_id, []).append(artist)

with open("data/blocks.json", "w") as f:
    json.dump(block_to_artist, f)
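One detail matters when reading the file back: JSON object keys are always strings, so the integer block ids come back as `"3"` rather than `3`. The sketch below shows one way to restore them and list the community sizes. The helpers `load_blocks` and `block_sizes`, the toy data and the temporary file name are all hypothetical; with the real data you would pass `"data/blocks.json"` instead.

```python
import json
import os
import tempfile


def load_blocks(path):
    """Load a block -> artists mapping, converting the JSON string keys back to ints."""
    with open(path) as f:
        raw = json.load(f)
    return {int(block_id): artists for block_id, artists in raw.items()}


def block_sizes(blocks):
    """Return (block_id, size) pairs, largest block first, ties broken by block id."""
    pairs = ((block_id, len(artists)) for block_id, artists in blocks.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


# Toy round trip with made-up block ids and artist names
toy = {3: ["Artist A", "Artist B"], 17: ["Artist C"]}
path = os.path.join(tempfile.mkdtemp(), "blocks_demo.json")
with open(path, "w") as f:
    json.dump(toy, f)  # integer keys are written as strings

restored = load_blocks(path)
print(block_sizes(restored))  # [(3, 2), (17, 1)]
```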