<img src="https://relevance.ai/wp-content/uploads/2021/11/logo.79f303e-1.svg" width="150" alt="Relevance AI" />
<h5> Developer-first vector platform for ML teams </h5>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/RelevanceAI/workflows/blob/main/community-detection/Community_Detection_with_Relevance_AI.ipynb)

# 🔰 Community Detection with Relevance AI

Community detection is a method used to cluster nodes in a graph. In deep learning, community detection is applied to data that a transformer has encoded into embeddings. Relevance AI uses the community detection algorithm provided by [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) (see example [here](https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/clustering/fast_clustering.py)). Relevance AI reduces the process of uploading data and applying the algorithm to just a few lines of code.
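
To make the idea concrete, here is a minimal pure-Python sketch of threshold-based community detection on cosine similarity. It is a simplified illustration, not the Sentence Transformers or Relevance AI implementation: the helper names `cosine` and `detect_communities`, the toy vectors, and the `min_community_size` value are all assumptions chosen for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def detect_communities(vectors, threshold=0.75, min_community_size=2):
    """Greedy, simplified community detection (illustrative only)."""
    # Candidate community for each point: every point at least `threshold` similar to it.
    candidates = []
    for v in vectors:
        members = [j for j, w in enumerate(vectors) if cosine(v, w) >= threshold]
        if len(members) >= min_community_size:
            candidates.append(members)

    # Prefer the largest candidates and keep only those that share no
    # member with a community that was already accepted.
    candidates.sort(key=len, reverse=True)
    assigned = set()
    communities = []
    for members in candidates:
        if assigned.isdisjoint(members):
            communities.append(members)
            assigned.update(members)
    return communities


# Toy 2-D "embeddings": two tight groups and one outlier.
vectors = [
    [1.0, 0.0], [0.9, 0.1], [0.95, 0.05],
    [0.0, 1.0], [0.1, 0.9],
    [-1.0, -1.0],
]
communities = detect_communities(vectors, threshold=0.75, min_community_size=2)
print(communities)
```

In this sketch the outlier at index 5 has no neighbour above the threshold, so it does not join any community.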


## ⏬ Install `relevanceai` and `sentence-transformers`

In [None]:
%%capture
!pip install -q RelevanceAI==2.3.2
!pip install -q sentence-transformers==2.2.0

## 🖥️ Connect to Client

In [None]:
from relevanceai import Client

client = Client()

## 📊 Data

For this example, let's use one of Relevance AI's cleaned datasets, namely the cleaned ecommerce dataset. To simplify this example, we'll only fetch the fields we need: `product_title`, the name of a product; `product_image`; and `product_image_clip_vector_`, the vector field over which we apply the algorithm.

In [None]:
from relevanceai.utils.datasets import get_ecommerce_dataset_encoded

documents = get_ecommerce_dataset_encoded(
    select_fields=['product_title', 'product_image', 'product_image_clip_vector_']
)
ds = client.Dataset('community-detection-test')
ds.insert_documents(documents)
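
Before inserting, it can help to confirm that every document carries the vector field the clustering step reads. The following is a minimal sketch, assuming each document is a plain Python dict keyed by field name. The helper `missing_vector` and the sample documents are hypothetical and are not part of the `relevanceai` API.

```python
def missing_vector(documents, vector_field):
    """Return indices of documents without a non-empty list under `vector_field`."""
    return [
        i
        for i, doc in enumerate(documents)
        if not isinstance(doc.get(vector_field), list) or not doc[vector_field]
    ]


# Hypothetical sample documents; real ones come from get_ecommerce_dataset_encoded.
sample_documents = [
    {"product_title": "Red mug", "product_image_clip_vector_": [0.1, 0.2, 0.3]},
    {"product_title": "Blue mug", "product_image_clip_vector_": []},
    {"product_title": "Green mug"},
]

bad = missing_vector(sample_documents, "product_image_clip_vector_")
print(bad)  # indices of documents that would have nothing to cluster on
```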

In [None]:
ds.schema

## 🔎 Community Detection

In [None]:
ds.cluster(
    model="communitydetection",
    vector_fields=['product_image_clip_vector_'],
    cluster_config={"threshold": 0.75},
)
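
To build intuition for `threshold`, the sketch below counts how many neighbours each vector keeps as the cutoff rises. It assumes the setting acts as a minimum cosine similarity, as the `threshold` parameter does in the Sentence Transformers example linked above. The helpers `cosine` and `neighbours_above` and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def neighbours_above(vectors, threshold):
    """For each vector, count the other vectors with cosine similarity >= threshold."""
    return [
        sum(1 for j, w in enumerate(vectors) if j != i and cosine(v, w) >= threshold)
        for i, v in enumerate(vectors)
    ]


# Toy unit vectors spread across a quarter circle.
vectors = [[1.0, 0.0], [0.8, 0.6], [0.6, 0.8], [0.0, 1.0]]

loose = neighbours_above(vectors, 0.5)
default_like = neighbours_above(vectors, 0.75)
strict = neighbours_above(vectors, 0.9)
print(loose, default_like, strict)
```

A higher threshold leaves each point with fewer eligible neighbours, which tends to produce smaller and tighter communities, while a lower one merges more points together.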

And that's it. The method above automatically creates the attribute `_cluster_.product_title.community-detection` in all relevant documents of the Dataset. To confirm this, check the schema below. While we didn't change any of the default values for this demonstration, be sure to check out the documentation to see how to fine-tune the algorithm.

In [None]:
ds.schema
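
Once documents carry a cluster attribute, a natural next step is to group them by label. The sketch below assumes the dotted attribute name corresponds to nested dictionaries inside each document and that labels are strings. The helper `get_nested`, the label values, and the sample documents are hypothetical, and retrieving documents from the dataset is left out.

```python
from collections import defaultdict


def get_nested(doc, dotted_path, default=None):
    """Follow a dotted path such as 'a.b.c' through nested dicts."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


CLUSTER_FIELD = "_cluster_.product_title.community-detection"

# Hypothetical documents shaped like the attribute described above.
sample = [
    {"product_title": "Red mug",
     "_cluster_": {"product_title": {"community-detection": "cluster-0"}}},
    {"product_title": "Blue mug",
     "_cluster_": {"product_title": {"community-detection": "cluster-0"}}},
    {"product_title": "Desk lamp",
     "_cluster_": {"product_title": {"community-detection": "cluster-1"}}},
    {"product_title": "Garden hose"},  # no cluster assigned
]

# Collect product titles under each cluster label.
groups = defaultdict(list)
for doc in sample:
    label = get_nested(doc, CLUSTER_FIELD, default="unclustered")
    groups[label].append(doc["product_title"])

print(dict(groups))
```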

## 👣 Next Steps

* Explore our platform at https://cloud.relevance.ai
* There are more in-depth tutorials and guides at https://docs.relevance.ai
* There are detailed library references at https://relevanceai.readthedocs.io/
* Join our Slack community at https://join.slack.com/t/relevance-ai/shared_invite/zt-11fo8oush-dHPd57wamhoQ7J5arNv1mg

## 📄 Documentation Link

* https://relevanceai.readthedocs.io/en/latest/dataset.html?highlight=community#relevanceai.dataset_api.dataset_operations.Operations.community_detection