# Community Detection in Graphs

We will explore the different community detection algorithms in graphs. 

Not all of them fit our problem requirements, so we will focus on the algorithms that support the following:
* Directed edges.
* Homogeneous nodes. 
* Possibly weighted relationships (in the future).

The algorithms which fit the requirements are:
* Conductance metric
* K-1 Coloring
* Label Propagation
* Louvain
* Modularity metric
* Modularity Optimization
* Strongly Connected Components
* Weakly Connected Components
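
Because the edges are directed, the last two entries in this list can give different answers on the same graph: weakly connected components ignore edge direction, while strongly connected components require every node to reach every other node along the edge direction. The following is a minimal pure-Python sketch of that difference on a made-up five-node graph. The function names and the toy data are hypothetical, and this is not meant to reflect how Neo4j GDS implements either algorithm.

```python
from collections import defaultdict


def weakly_connected_components(nodes, edges):
    """Group nodes that are connected once edge direction is ignored."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm: groups of mutually reachable nodes."""
    forward, backward = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)

    # Pass 1: record nodes in order of DFS completion on the original graph.
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph, latest-finished nodes first.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in backward[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


# Toy directed graph (hypothetical): a cycle a -> b -> c -> a plus two extra edges.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "d")]
wcc = weakly_connected_components(nodes, edges)
scc = strongly_connected_components(nodes, edges)
```

On this toy graph all five nodes fall into a single weak component, whereas only the cycle `a`, `b`, `c` forms a strong component with more than one node.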

We will also use an additional approach: after embedding the graph (we have used Node2Vec and DeepWalk), we cluster the embeddings with:
* K-Means
* DBSCAN
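
The two clustering methods differ in what they ask for up front: K-Means needs the number of clusters, while DBSCAN takes a neighborhood radius and a minimum neighbor count and marks isolated points as noise (label `-1`). As a rough illustration of DBSCAN on embedding-like data, here is a small sketch with scikit-learn. The synthetic `embeddings` array, the `eps` and `min_samples` values, and the variable names are all assumptions chosen for the toy data, not settings tuned for the real graph.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in for node embeddings: two tight groups plus one far-away point.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(50, 8))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(50, 8))
outlier = np.full((1, 8), 50.0)
embeddings = np.vstack([blob_a, blob_b, outlier])

# eps: neighborhood radius; min_samples: neighbors needed to form a dense region.
dbscan_model = DBSCAN(eps=1.0, min_samples=5)
dbscan_labels = dbscan_model.fit_predict(embeddings)

# Label -1 means "noise", so it is excluded from the community count.
n_communities = len(set(dbscan_labels.tolist()) - {-1})
n_noise = int(np.sum(dbscan_labels == -1))
print(n_communities, n_noise)
```

With real embeddings, `eps` usually needs tuning against the distance scale of the embedding space, since a value that is too small labels most nodes as noise.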

We will explore each of them: how they work, their advantages and disadvantages, and when to use them.

Check this documentation to understand and explore the Community Detection algorithms:
https://neo4j.com/docs/graph-data-science/current/algorithms/community/

### Conductance metric 

### K-Means Clustering for Community Detection 
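Read the graph embeddings and cluster them with K-Means. The cells below use scikit-learn's `KMeans`, which runs on the CPU; cuML (RAPIDS) is the option for running the clustering on a GPU.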

In [1]:
import torch

In [2]:
# Read embedding representation
node2vector_embeddings_filepath = 'node2vector_embeddings.pt'
node2vector_embeddings = torch.load(node2vector_embeddings_filepath, weights_only=True)
node2vector_embeddings = node2vector_embeddings.cpu().numpy()

In [3]:
# Init k-means clustering
from sklearn.cluster import KMeans

In [4]:
n_clusters = 100  # N Clusters

kmeans_model = KMeans(
    n_clusters=n_clusters,
    max_iter=300,
    tol=1e-4,
    random_state=42
)


kmeans_model.fit(node2vector_embeddings)

labels = kmeans_model.labels_
print(labels)

[78 64 54 ... 31 78 79]
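
To give a feel for what the fit above iterates on, here is a minimal pure-Python sketch of Lloyd's algorithm, the classic K-Means loop of assigning points to the nearest centroid and then moving each centroid to the mean of its members. It is a simplification and not scikit-learn's implementation: the function name `kmeans_sketch`, seeding with the first `k` points, and the plain "largest centroid shift" stopping rule are assumptions made for illustration.

```python
import math


def kmeans_sketch(points, k, max_iter=300, tol=1e-4):
    """Plain Lloyd's algorithm; seeds centroids with the first k points."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop once no centroid moved by more than tol (simplified rule).
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return assignments, centroids


# Toy 2-D points (hypothetical): one group near the origin, one near (10, 10).
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
sketch_labels, sketch_centroids = kmeans_sketch(points, k=2)
print(sketch_labels)
```

In this sketch `max_iter` caps the number of assign/update rounds and `tol` ends the loop early once the centroids have effectively stopped moving.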


In [6]:
list(labels)

[np.int32(78),
 np.int32(64),
 np.int32(54),
 np.int32(98),
 np.int32(69),
 np.int32(74),
 np.int32(53),
 np.int32(15),
 np.int32(11),
 np.int32(49),
 np.int32(65),
 np.int32(4),
 np.int32(85),
 np.int32(59),
 np.int32(78),
 np.int32(59),
 np.int32(54),
 np.int32(54),
 np.int32(49),
 np.int32(72),
 np.int32(15),
 np.int32(63),
 np.int32(30),
 np.int32(47),
 np.int32(62),
 np.int32(25),
 np.int32(99),
 np.int32(63),
 np.int32(34),
 np.int32(70),
 np.int32(49),
 np.int32(29),
 np.int32(30),
 np.int32(80),
 np.int32(25),
 np.int32(49),
 np.int32(60),
 np.int32(72),
 np.int32(36),
 np.int32(72),
 np.int32(15),
 np.int32(23),
 np.int32(65),
 np.int32(0),
 np.int32(65),
 np.int32(57),
 np.int32(36),
 np.int32(15),
 np.int32(85),
 np.int32(49),
 np.int32(23),
 np.int32(34),
 np.int32(4),
 np.int32(3),
 np.int32(54),
 np.int32(5),
 np.int32(15),
 np.int32(49),
 np.int32(73),
 np.int32(89),
 np.int32(63),
 np.int32(7),
 np.int32(7),
 np.int32(59),
 np.int32(73),
 np.int32(17),
 np.int32(27),
 ...

In [7]:
len(labels)

19129
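
Since each label is a community ID, a natural next check is how many nodes each community received. A small sketch with the standard library follows; `sample_labels` holds only the first twenty values from the output above as a stand-in, and on the real data the full `labels` array could be passed in its place.

```python
from collections import Counter

# Stand-in for `labels`: the first twenty values printed by list(labels) above.
sample_labels = [78, 64, 54, 98, 69, 74, 53, 15, 11, 49,
                 65, 4, 85, 59, 78, 59, 54, 54, 49, 72]

# Count nodes per community; int() also turns NumPy integers into plain ints.
community_sizes = Counter(int(label) for label in sample_labels)

# The largest communities first, as (community_id, node_count) pairs.
largest = community_sizes.most_common(3)
print(largest)
```

Very uneven sizes, or many communities holding a single node, can be a hint that `n_clusters` deserves another look.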