# Louvain Community Detection

The Louvain method of community detection is a greedy algorithm that seeks to maximize the modularity of the partition as it progresses.

For a detailed description of the algorithm see: https://en.wikipedia.org/wiki/Louvain_Modularity

It takes a cugraph.Graph object as input and returns a 
cudf.DataFrame containing the id and assigned partition of each 
vertex, together with the final modularity score.
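
To make the optimized quantity concrete, here is a minimal pure-Python sketch of the standard modularity formula for an undirected, unweighted graph. It is not the cuGraph implementation; the function name `modularity` and the six-vertex toy graph are assumptions made only for illustration.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q of an undirected, unweighted graph for a given partition.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping vertex -> community id
    """
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in community c
    degree = defaultdict(int)    # summed degree of the vertices in community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Toy graph (hypothetical): two triangles joined by the single edge (2, 3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_triangles = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_block = {v: 0 for v in range(6)}

print(modularity(edges, two_triangles))  # 5/14, roughly 0.357
print(modularity(edges, one_block))      # 0.0
```

Splitting the toy graph into its two triangles scores higher than putting every vertex in one community, which is the kind of improvement a greedy modularity optimizer looks for.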

In [1]:
# Import needed libraries
import cugraph
import cudf
import numpy as np
from scipy.io import mmread
from collections import OrderedDict

In [3]:
# define a function to print the results
def print_parts(df, mod):
    
    # Print the modularity score
    print(f'Modularity was {mod}')
    # See which nodes are in partition 0:
    part = []
    for i in range(len(df)):
        if (df['partition'][i] == 0):
            part.append(df['vertex'][i])
    print(part)

## Reading a CSV file using cuDF

In [4]:
# Test file - using the classic Karate club dataset.
datafile='../data/networks/karate-data.csv'

In [5]:
# Read the data file
cols = ["src", "dst"]

dtypes = OrderedDict([
        ("src", "int32"), 
        ("dst", "int32")
        ])

gdf = cudf.read_csv(datafile, names=cols, delimiter='\t', dtype=list(dtypes.values()))

In [6]:
# create a Graph 
G1 = cugraph.Graph()
G1.add_edge_list(gdf["src"], gdf["dst"])

In [None]:
# Call Louvain on the graph
df1, mod1 = cugraph.nvLouvain(G1) 

In [None]:
print_parts(df1, mod1)

## Reading a Matrix Market file using SciPy

In [None]:
# Read in the data file into scipy matrix format
mmFile='/datasets/networks/karate.mtx'
M = mmread(mmFile).asfptype().tolil()
M = M.tocsr()

In [None]:
# Load the structure of the graph into GPU memory and create a cuGraph
# graph object:
row_offsets = cudf.Series(M.indptr)
col_indices = cudf.Series(M.indices)
values = cudf.Series(M.data)
G = cugraph.Graph()
G.add_adj_list(row_offsets, col_indices, values)
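
The `row_offsets` and `col_indices` arrays used above are the two index arrays of the compressed sparse row (CSR) layout. As a rough sketch of what they encode, the pure-Python helper below builds the same two arrays from a list of directed edges. The helper name `edges_to_csr` and the three-vertex path graph are hypothetical and are not part of cuGraph or SciPy.

```python
def edges_to_csr(num_vertices, edges):
    """Build CSR arrays (row_offsets, col_indices) from directed (src, dst) pairs."""
    # Count the outgoing edges of every vertex
    counts = [0] * num_vertices
    for src, _ in edges:
        counts[src] += 1

    # row_offsets[v] is the position in col_indices where v's neighbors start
    row_offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        row_offsets[v + 1] = row_offsets[v] + counts[v]

    # Write each destination into the next free slot of its source row
    col_indices = [0] * len(edges)
    next_slot = row_offsets[:-1].copy()
    for src, dst in edges:
        col_indices[next_slot[src]] = dst
        next_slot[src] += 1
    return row_offsets, col_indices

# Path graph 0 - 1 - 2, with each undirected edge stored in both directions
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
row_offsets, col_indices = edges_to_csr(3, edges)

# The neighbors of vertex v live in col_indices[row_offsets[v]:row_offsets[v + 1]]
neighbors_of_1 = col_indices[row_offsets[1]:row_offsets[2]]
print(row_offsets, col_indices, neighbors_of_1)
```

In the notebook itself SciPy does this conversion through `tocsr()`, so `M.indptr` plays the role of `row_offsets` and `M.indices` the role of `col_indices`.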

In [None]:
# Call Louvain on the graph
df, mod = cugraph.nvLouvain(G) 

In [None]:
# Check the modularity score
mod

In [None]:
# See which nodes are in partition 0:
part = []
for i in range(len(df)):
    if (df['partition'][i] == 0):
        part.append(df['vertex'][i])
print(part)
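
The loop above lists only partition 0. A small sketch of how every partition could be listed in one pass is shown below. It works on plain Python lists, which stand in for the `vertex` and `partition` columns once they have been copied to host memory. The helper name `group_by_partition` and the sample values are hypothetical, not output from the Karate club run.

```python
from collections import defaultdict

def group_by_partition(vertices, partitions):
    """Map each partition id to the list of vertices assigned to it."""
    groups = defaultdict(list)
    for vertex, partition in zip(vertices, partitions):
        groups[partition].append(vertex)
    return dict(groups)

# Hypothetical values standing in for the 'vertex' and 'partition' columns
vertices = [0, 1, 2, 3, 4, 5]
partitions = [0, 0, 1, 1, 0, 2]

groups = group_by_partition(vertices, partitions)
for partition_id in sorted(groups):
    print(partition_id, groups[partition_id])
```

A single pass over the two columns avoids repeating the filter loop once per partition id.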