# Unsupervised analysis with DeepOF
Welcome to this notebook, where we'll perform an unsupervised analysis with DeepOF on Google Colab! An unsupervised analysis identifies patterns in your data without labels or a pre-existing hypothesis, so you can use it to explore your data and discover relationships you might not otherwise have noticed.

## Importing packages and installing dependencies

In [None]:
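import os
# Clone DeepOF and install it in editable mode
!git clone -q https://github.com/mlfpm/deepof.git
!pip install -q -e deepof --progress-bar off
os.chdir("deepof")
# Download and unpack the tutorial files
!curl --output tutorial_files.zip https://datashare.mpcdf.mpg.de/s/knF7t78isQuIAr0/download
!unzip tutorial_files.zip
# Kill the current process so that Colab restarts the runtime with the new installation
os.kill(os.getpid(), 9)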

In [None]:
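import os, warnings
os.chdir("deepof")  # The runtime was restarted, so re-enter the repository directory
import deepof.data  # Needed for deepof.data.load_project
warnings.filterwarnings('ignore')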

In [None]:
from google.colab import drive
drive.mount('/content/drive')
data_dir = "/content/drive/MyDrive/MY_DATA_DIRECTORY"
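
Because `data_dir` has no trailing slash, appending a folder name with `+` would fuse the two names together. The snippet below is a minimal sketch of a safer way to build and check the project path. The helper `resolve_project_path` is a hypothetical name introduced here for illustration and is not part of DeepOF.

```python
from pathlib import Path, PurePosixPath


def resolve_project_path(data_dir: str, project_name: str) -> Path:
    """Join a data directory and a project folder name, and check that it exists.

    Hypothetical helper for illustration only; not part of DeepOF.
    """
    project_path = Path(data_dir) / project_name
    if not project_path.is_dir():
        raise FileNotFoundError(f"No project directory found at {project_path}")
    return project_path


# Plain concatenation silently produces a wrong path when the slash is missing
data_dir_example = "/content/drive/MyDrive/MY_DATA_DIRECTORY"
concatenated = data_dir_example + "deepof_tutorial_project"
joined = str(PurePosixPath(data_dir_example) / "deepof_tutorial_project")

print(concatenated)
print(joined)
```

If the loader expects a string, the returned `Path` can be converted with `str()`.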

## Running the unsupervised analysis
Run the following cells to load your project. Then, we'll preprocess our data: DeepOF will calculate centered and aligned coordinates, speeds, and distances between the animals' body parts.

In [None]:
# data_dir has no trailing slash, so join the path components explicitly
my_deepof_project = deepof.data.load_project(os.path.join(data_dir, "deepof_tutorial_project"))
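
To give an intuition for what "centered and aligned coordinates" means, here is a minimal NumPy sketch for a single frame. It assumes 2D coordinates, places one body part at the origin, and rotates the frame so that a second body part lies on the positive y axis. The function name `center_and_align` and the toy coordinates are illustrative assumptions; DeepOF's own implementation may differ in its conventions.

```python
import numpy as np


def center_and_align(frame: np.ndarray, center_idx: int, align_idx: int) -> np.ndarray:
    """Center one frame of (n_bodyparts, 2) coordinates and rotate it.

    After the transform, the body part at center_idx sits at the origin and
    the body part at align_idx lies on the positive y axis.
    """
    centered = frame - frame[center_idx]
    x, y = centered[align_idx]
    # Angle that rotates the reference vector onto the positive y axis
    theta = np.pi / 2 - np.arctan2(y, x)
    rotation = np.array([
        [np.cos(theta), -np.sin(theta)],
        [np.sin(theta), np.cos(theta)],
    ])
    # Coordinates are row vectors, so multiply by the transposed rotation matrix
    return centered @ rotation.T


# Toy frame with three body parts: a center, a spine point, and a third point
toy_frame = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 3.0]])
aligned = center_and_align(toy_frame, center_idx=0, align_idx=1)
print(np.round(aligned, 3))
```

Removing position and heading in this way lets downstream models focus on posture and movement instead of where the animal happens to be in the arena.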

In [None]:
# This code will generate a dataset using graph representations, as well as some auxiliary objects
def graph_dataset_function(my_deepof_project):
    graph_preprocessed_coords, adj_matrix, to_preprocess, global_scaler = my_deepof_project.get_graph_dataset(
        # animal_id="S1", # Uncomment to embed a single animal; leave commented out for multi-animal embeddings
        center="Center",
        align="Spine_1",
        window_size=25,
        window_step=1,
        test_videos=1,
        preprocess=True,
        scale="standard",
    )

    return graph_preprocessed_coords, adj_matrix, to_preprocess, global_scaler
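
The `window_size`, `window_step`, and `scale="standard"` arguments above suggest that the features are standardized and cut into overlapping time windows. The following is a rough NumPy sketch of that idea on random toy data. The helper names `standardize` and `sliding_windows` are assumptions made for illustration, and the exact order and details of DeepOF's own preprocessing may differ.

```python
import numpy as np


def standardize(features: np.ndarray) -> np.ndarray:
    """Scale each feature column to zero mean and unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # Avoid division by zero for constant features
    return (features - mean) / std


def sliding_windows(features: np.ndarray, window_size: int, window_step: int) -> np.ndarray:
    """Cut a (n_frames, n_features) array into overlapping windows.

    Returns an array of shape (n_windows, window_size, n_features).
    """
    windows = np.lib.stride_tricks.sliding_window_view(features, window_size, axis=0)
    # sliding_window_view puts the window axis last; move it next to the first axis
    windows = np.moveaxis(windows, -1, 1)
    return windows[::window_step]


# Toy data: 100 frames with 6 features each
toy_features = np.random.default_rng(0).normal(size=(100, 6))
toy_windows = sliding_windows(standardize(toy_features), window_size=25, window_step=1)
print(toy_windows.shape)  # (76, 25, 6)
```

With a step of 1, consecutive windows overlap by all but one frame, so each time point (apart from the edges) gets its own window of temporal context.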

Now, we will embed our data with deep clustering methods. The core idea of deep clustering is to embed the preprocessed data with a neural network and retrieve an embedding for each time point, with each embedding assigned to a cluster. If you have already trained a model, set **pre_trained=True** instead of *False*.

In [None]:
def train_model_function(my_deepof_project, graph_preprocessed_coords, adj_matrix, pre_trained):
    trained_model = my_deepof_project.deep_unsupervised_embedding(
        preprocessed_object=graph_preprocessed_coords, # Change to preprocessed_coords to use non-graph embeddings
        adjacency_matrix=adj_matrix,
        embedding_model="VaDE", # Can also be set to 'VQVAE' and 'Contrastive'
        epochs=10,
        encoder_type="recurrent", # Can also be set to 'TCN' and 'transformer'
        n_components=10,
        latent_dim=4,
        batch_size=1024,
        verbose=False, # Set to True to follow the training loop
        interaction_regularization=0.0, # Set to 0.5 if multi-animal training
        pretrained=pre_trained, # Set to False to train a new model!
    )

    return trained_model
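
To make the idea of "embeddings assigned to clusters" more concrete, here is a toy NumPy sketch. VaDE-style models place a Gaussian mixture over the latent space, so each embedding receives a probability of belonging to each cluster. The sketch below computes those probabilities for a diagonal Gaussian mixture with made-up parameters. The function name, the mixture parameters, and the toy embeddings are all illustrative assumptions; in a trained model, the encoder and the mixture are learned jointly, and DeepOF's implementation may differ.

```python
import numpy as np


def soft_cluster_assignments(embeddings, means, variances, weights):
    """Posterior cluster probabilities under a diagonal Gaussian mixture.

    embeddings: (n_samples, latent_dim)
    means, variances: (n_components, latent_dim)
    weights: (n_components,), summing to 1
    Returns an array of shape (n_samples, n_components) whose rows sum to 1.
    """
    diff = embeddings[:, None, :] - means[None, :, :]
    log_prob = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=-1)
    log_joint = log_prob + np.log(weights)
    # Subtract the row maximum before exponentiating for numerical stability
    log_joint = log_joint - log_joint.max(axis=1, keepdims=True)
    posterior = np.exp(log_joint)
    return posterior / posterior.sum(axis=1, keepdims=True)


# Toy setup mirroring n_components=10 and latent_dim=4 from the cell above
n_components, latent_dim = 10, 4
toy_means = 10.0 * np.arange(n_components)[:, None] * np.ones((1, latent_dim))
toy_variances = np.ones((n_components, latent_dim))
toy_weights = np.full(n_components, 1.0 / n_components)

rng = np.random.default_rng(0)
true_labels = rng.integers(0, n_components, size=200)
toy_embeddings = toy_means[true_labels] + 0.1 * rng.normal(size=(200, latent_dim))

soft = soft_cluster_assignments(toy_embeddings, toy_means, toy_variances, toy_weights)
hard = soft.argmax(axis=1)  # Most likely cluster per embedding
print(soft.shape, hard[:10])
```

Taking the `argmax` of the soft assignments yields one cluster label per time window, which is the kind of output later used to describe behavior over time.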