### 1st Virtual Methods Seminar: Methods of Computational Social Science
## Introduction to Social Network Science with Python
# Community Detection - Exercise 4.1
Instructors: Haiko Lietz & Lisette Espín Noboa

Date: September 24, 2020
## Packages

In [None]:
import sys
libs_path = '../libs/'
sys.path.append(libs_path)
import compsoc as cs

In [None]:
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

In [None]:
import community as louvain

## Introduction
In this exercise, we will again look at the Copenhagen Networks Study collection to study community detection. Load the dataset using the `copenhagen_collection()` function:

In [None]:
users, genders, bluetooth, calls, sms, facebook_friends = cs.copenhagen_collection(path='../../data/copenhagen/')

We use the same networks as in the `32_cohesion_exercise` notebook. For the `bluetooth` data, we sum the signal strengths for each pair of users and keep only pairs with a positive total:

In [None]:
bluetooth = bluetooth[['user_id_from', 'user_id_to', 'strength']].groupby(['user_id_from', 'user_id_to']).sum().reset_index()
bluetooth = bluetooth[bluetooth['strength'] > 0]

The `facebook_friends` dataframe needs a `weight` column with a unit weight so that it meets the edge-list format expected when the graph is constructed:

In [None]:
facebook_friends['weight'] = 1

## Exercise 1
Study this **bluetooth** graph, which keeps only ties with a summed signal strength above 500:

In [None]:
# construct graph
G = cs.construct_graph(
    directed=False, 
    multiplex=False, 
    graph_name='co_proximity', 
    node_list=users, 
    edge_list=bluetooth[bluetooth['strength'] > 500]
)

# extract the largest connected component and create a layout
G = G.subgraph(max(nx.connected_components(G), key=len))
vp_node_pos_bluetooth = nx.spring_layout(G, seed=0)

# draw graph
cs.draw_graph(
    G, 
    node_pos=vp_node_pos_bluetooth, 
    edge_width_factor=.0001, 
    edge_transparency=.5, 
    figsize='large'
)

Explore how to call the different community detection methods and how they behave. Check the effect of edge weights and the resolution parameter if available.
## Solution 1
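One possible starting point, not the instructors' solution: the sketch below runs Louvain on a small hypothetical toy graph `T` (two cliques joined by one strong bridge) rather than on `G`, and varies both the `weight` and `resolution` arguments. It assumes networkx 2.8 or later, which ships its own `louvain_communities()`; the python-louvain package imported above as `louvain` offers `best_partition()` with `weight` and `resolution` parameters as an alternative.

```python
import networkx as nx
from networkx.algorithms import community as nx_comm

# Hypothetical toy graph: two 5-node cliques joined by one strong bridge
T = nx.disjoint_union(nx.complete_graph(5), nx.complete_graph(5))
nx.set_edge_attributes(T, 1.0, 'weight')
T.add_edge(4, 5, weight=10.0)

results = {}
for weight in (None, 'weight'):          # None ignores edge weights
    for resolution in (0.5, 1.0, 2.0):   # in networkx, values above 1 favor smaller communities
        communities = nx_comm.louvain_communities(
            T, weight=weight, resolution=resolution, seed=0
        )
        q = nx_comm.modularity(T, communities, weight=weight, resolution=resolution)
        sizes = sorted(len(c) for c in communities)
        results[(weight, resolution)] = (sizes, q)
        print(f'weight={weight}, resolution={resolution}: sizes={sizes}, Q={q:.3f}')
```

To apply the same loop to the exercise graph, replace `T` with `G`; comparing the community sizes across the six runs shows how sensitive the partition is to each setting.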

## Exercise 2
Study the **facebook-friends** graph:

In [None]:
# construct graph
H = cs.construct_graph(
    directed=False, 
    multiplex=False, 
    graph_name='facebook_friends', 
    node_list=users, 
    edge_list=facebook_friends, 
    node_label='user'
)

# remove self-loops, extract the largest connected component, and create a layout
H.remove_edges_from(nx.selfloop_edges(H))
H = H.subgraph(max(nx.connected_components(H), key=len))
vp_node_pos_facebook_friends = nx.spring_layout(H, seed=0)

# draw graph
cs.draw_graph(
    H, 
    node_pos=vp_node_pos_facebook_friends, 
    edge_width_factor=.25, 
    figsize='large'
)

We have already seen that this graph has a marked core/periphery structure. How does Louvain community detection behave on such a graph?
## Solution 2
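As a hedged sketch rather than the instructors' solution, the cell below builds a hypothetical toy core/periphery graph `P` (a clique as the core, with pendant nodes attached to it) and runs Louvain several times with different seeds. It again assumes networkx 2.8 or later for `louvain_communities()`; the sizes `core_size` and `periphery_size` are arbitrary illustration values.

```python
import networkx as nx
from networkx.algorithms import community as nx_comm

# Hypothetical toy core/periphery graph: a clique core plus pendant periphery nodes
core_size, periphery_size = 8, 16
P = nx.complete_graph(core_size)
for i in range(periphery_size):
    # attach each periphery node to exactly one core node
    P.add_edge(core_size + i, i % core_size)

partitions = {}
for seed in range(5):
    communities = nx_comm.louvain_communities(P, seed=seed)
    q = nx_comm.modularity(P, communities)
    partitions[seed] = communities
    # indices of the communities that contain at least one core node
    core_groups = {k for k, c in enumerate(communities) for n in c if n < core_size}
    print(f'seed={seed}: {len(communities)} communities, Q={q:.3f}, '
          f'core spread over {len(core_groups)} of them')
```

Checking whether the core stays in one community or is split across several, and whether this changes with the seed, is one way to judge how well a partition describes a core/periphery structure. The same loop can be run on `H` in place of `P`.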