<a href="https://colab.research.google.com/github/mikislin/CNE25/blob/main/notebooks/CNE_Class3_QAB_Exercises_DCL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DeepLabCut on Single Mouse Data Demo

👋 This notebook is a modified copy from [GitHub](https://colab.research.google.com/github/DeepLabCut/DeepLabCut/blob/master/examples/COLAB/COLAB_DEMO_mouse_openfield.ipynb), originally written by Mackenzie Mathis and contributors.

⚠️ It has been edited for the 2025 CNE class!


This notebook illustrates how to use DeepLabCut and Colab to:


- load demo data
- create a training set
- train a network
- evaluate a network
- analyze a novel video



In [None]:
# Clone the entire deeplabcut repo so we can use the demo data:
!git clone -l -s https://github.com/DeepLabCut/DeepLabCut.git cloned-DLC-repo
%cd cloned-DLC-repo
!ls

In [None]:
# Install the latest DeepLabCut version (this will take a few minutes to install all the dependencies!)
%cd /content/cloned-DLC-repo/
%pip install "."

### An installation error is expected after installing DLC on Colab
Please click "Restart runtime" in the output above before proceeding!

In [None]:
import deeplabcut

In [None]:
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from ipywidgets import interact, IntSlider
import os
import yaml
import numpy as np
from IPython.display import HTML
from base64 import b64encode

In [None]:
# @title Preview the experimental data

# Replace with the actual filename or full path
video_path = '/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/videos/m3v1mp4.mp4'

# Read the video file in binary mode
with open(video_path, 'rb') as f:
    mp4_bytes = f.read()

# Encode the video bytes to base64
mp4_base64 = b64encode(mp4_bytes).decode()

# Create the HTML for embedding the video
html_code = f"""
<video width="640" height="480" controls>
    <source src="data:video/mp4;base64,{mp4_base64}" type="video/mp4">
    Your browser does not support the video tag.

</video>
"""

# Display the HTML
display(HTML(html_code))

In [None]:
# @title Create a path variable that links to the config file:
path_config_file = '/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/config.yaml'

# Loading example data set:
deeplabcut.load_demo_data(path_config_file)

# Automatically update some hyperparameters for training,
# here rotations to +/- 180 degrees. This can be helpful for optimizing performance.
# see Primer -- Mathis et al. Neuron 2020
from deeplabcut.core.config import read_config_as_dict
import deeplabcut.pose_estimation_pytorch as dlc_torch

loader = dlc_torch.DLCLoader(
    config=path_config_file,
    trainset_index=0,
    shuffle=1,
)

# Get the pytorch config path
pytorch_config_path = loader.model_folder / "pytorch_config.yaml"

model_cfg = read_config_as_dict(pytorch_config_path)
model_cfg['data']["train"]["affine"]["rotation"]=180

# Save the modified config
dlc_torch.config.write_config(pytorch_config_path,model_cfg)
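
To build intuition for the rotation setting above, here is a minimal sketch of what a rotation augmentation does to a labeled keypoint: the image and its labels are rotated together about the image center by a randomly drawn angle. This is not DeepLabCut's implementation. The helper `rotate_keypoint`, the 640x480 frame size, the example snout coordinate, and the uniform sampling of the angle are all assumptions made for illustration.

```python
import math
import random


def rotate_keypoint(x, y, angle_deg, center):
    """Rotate the point (x, y) about `center` by `angle_deg` degrees."""
    cx, cy = center
    theta = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (
        cx + dx * math.cos(theta) - dy * math.sin(theta),
        cy + dx * math.sin(theta) + dy * math.cos(theta),
    )


width, height = 640, 480            # assumed frame size in pixels
center = (width / 2, height / 2)
snout = (100.0, 50.0)               # hypothetical snout label (x, y)

# Assumption: rotation=180 means an angle is drawn from [-180, 180] degrees.
rng = random.Random(0)
angle = rng.uniform(-180, 180)
print(f"angle={angle:.1f} deg ->", rotate_keypoint(*snout, angle, center))

# A half turn maps the point to the opposite side of the image center.
half_turn = rotate_keypoint(*snout, 180, center)
print("180 deg ->", tuple(round(v, 6) for v in half_turn))
```

Because the labels move with the pixels, the network sees the mouse in many orientations without any extra labeling effort.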

# Problem 1: Why is label quality crucial for Computer Vision model training?

The DLC repo provides images with labels; review them before starting training.

In [None]:
# @title Interactive Labeled Frames Review

# Path to CSV and image folder (from cloned repo)
csv_path = '/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/labeled-data/m4s1/CollectedData_Pranav.csv'
img_folder = '/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/labeled-data/m4s1/'

# Parse CSV (skip header rows, focus on data)
try:
    df = pd.read_csv(csv_path, skiprows=2)
    df.columns = ['img_path', 'snout_x', 'snout_y', 'leftear_x', 'leftear_y', 'rightear_x', 'rightear_y', 'tailbase_x', 'tailbase_y']
except FileNotFoundError:
    print("Error: CSV file not found. Ensure the demo data is loaded correctly.")
    raise

# Define plotting function for use with interact
def plot_labeled_image(idx):
    try:
        row = df.iloc[idx]
        img_name = row['img_path'].split('/')[-1]  # e.g., 'img0000.png'
        img_path = os.path.join(img_folder, img_name)

        # Load and plot image
        img = mpimg.imread(img_path)
        fig, ax = plt.subplots(figsize=(8, 6))
        ax.imshow(img)

        # Overlay body parts as colored points
        ax.scatter(row['snout_x'], row['snout_y'], c='r', s=50, label='Snout')
        ax.scatter(row['leftear_x'], row['leftear_y'], c='g', s=50, label='Left Ear')
        ax.scatter(row['rightear_x'], row['rightear_y'], c='b', s=50, label='Right Ear')
        ax.scatter(row['tailbase_x'], row['tailbase_y'], c='y', s=50, label='Tail Base')

        ax.legend()
        ax.set_title(f"Labeled Image: {img_name}")
        plt.show()

        print("Review: Check if points align with body parts. Mislabels? Consider refining in full DLC GUI.")
    except IndexError:
        print(f"Error: Index {idx} out of range. Choose a number between 0 and {len(df) - 1}.")
    except FileNotFoundError:
        print(f"Error: Image {img_name} not found. Ensure the demo data is loaded correctly.")
    except Exception as e:
        print(f"Unexpected error: {e}. Try another index.")

# Create interactive slider
print(f"Interactive Label Review: Use the slider to select image indices (0-{len(df) - 1}) to review labeled data.")
interact(
    plot_labeled_image,
    idx=IntSlider(min=0, max=len(df)-1, step=1, value=0, description='Image Index')
)

reply here:

### Bonus Problem: Illustrating the Impact of Label Quality on DLC Model Performance

Description:

To demonstrate why label quality is crucial in computer vision, modify the training labels in CollectedData_Pranav.csv to simulate corruption or shuffling.

Train the DLC model on the modified data and compare its performance (e.g., mean likelihoods, accuracy) to that of the model trained on the original labels. This illustrates that models can "learn" (memorize) even random labels with high train accuracy but fail to generalize, so test performance drops to near zero.

In [None]:
# Load original CSV
gt_path = '/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/labeled-data/m4s1/CollectedData_Pranav.csv'
df = pd.read_csv(gt_path, skiprows=2)
df.columns = ['img_path', 'snout_x', 'snout_y', 'leftear_x', 'leftear_y', 'rightear_x', 'rightear_y', 'tailbase_x', 'tailbase_y']

# Shuffle version
df_shuffled = df.copy()
df_shuffled.iloc[:,1:] = np.random.permutation(df_shuffled.iloc[:,1:].values)
df_shuffled.to_csv('/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/labeled-data/m4s1/CollectedData_Pranav_shuffled.csv', index=False)

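# Corrupted version: add Gaussian noise (std 5 px) to every coordinate.
# Note: np.abs makes the noise non-negative, so labels shift toward larger x/y.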
df_corrupted = df.copy()
df_corrupted.iloc[:,1:] += np.abs(np.random.normal(0, 5, df_corrupted.iloc[:,1:].shape))
df_corrupted.to_csv('/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/labeled-data/m4s1/CollectedData_Pranav_corrupted.csv', index=False)

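# Then, replace the CSV in the DLC folder and run training/analyze_videos.
# Note: the files saved above have a single flattened header row (the original
# multi-row header was skipped on load), so restore it before swapping files.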
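
As a rough sketch of how label rows could be shuffled while keeping the original header rows intact, the example below uses only the standard library on a tiny inline CSV. The three-row header (scorer, bodyparts, coords), the single image-path column, and the helper name `shuffle_label_rows` are assumptions for illustration; check them against the actual file before reusing this.

```python
import csv
import io
import random


def shuffle_label_rows(csv_text, n_header_rows=3, seed=0):
    """Shuffle coordinate rows across images; keep header rows and image paths in place."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[:n_header_rows], rows[n_header_rows:]
    paths = [row[0] for row in body]
    coords = [row[1:] for row in body]
    random.Random(seed).shuffle(coords)  # coordinates may now sit on the wrong images

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(header)
    writer.writerows([path] + xy for path, xy in zip(paths, coords))
    return out.getvalue()


# Tiny made-up example with two body parts and three frames.
original = """scorer,Pranav,Pranav,Pranav,Pranav
bodyparts,snout,snout,tailbase,tailbase
coords,x,y,x,y
labeled-data/m4s1/img0000.png,10,11,50,51
labeled-data/m4s1/img0001.png,20,21,60,61
labeled-data/m4s1/img0002.png,30,31,70,71
"""

shuffled = shuffle_label_rows(original)
print(shuffled)
```

Keeping the image paths fixed and moving only the coordinates is what breaks the link between each frame and its labels.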

## Start training:
This function trains the network for a specific shuffle of the training dataset.

In [None]:
# Let's also change the display and save_epochs just in case Colab takes away
# the GPU... If that happens, you can reload from a saved point using the
# `snapshot_path` argument to `deeplabcut.train_network`:
#   deeplabcut.train_network(..., snapshot_path="/content/.../snapshot-050.pt")

# Typically, you want to train to ~200 epochs. We set the batch size to 8 to
# utilize the GPU's capabilities.

# More info and there are more things you can set:
#   https://deeplabcut.github.io/DeepLabCut/docs/standardDeepLabCut_UserGuide.html#g-train-the-network

deeplabcut.train_network(
    path_config_file,
    shuffle=1,
    save_epochs=5,
    epochs=20,
    batch_size=8,
)

# This will run until you stop it (CTRL+C), or hit "STOP" icon, or when it hits the end.

In [None]:
# @title Inspect the network training
# Adjust path to learning_stats.csv (from DLC model folder)
model_folder = '/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/dlc-models-pytorch/iteration-0/openfieldOct30-trainset95shuffle1/train/'
csv_path = os.path.join(model_folder, 'learning_stats.csv')

try:
    df = pd.read_csv(csv_path)
except FileNotFoundError:
    print("Error: learning_stats.csv not found. Ensure training completed successfully.")
    raise

# Figure with subplots for losses
fig, axs = plt.subplots(2, 1, figsize=(12, 10), sharex=True)

# Plot training losses
axs[0].plot(df['step'], df['losses/train.bodypart_heatmap'], label='Train Heatmap Loss', marker='o')
axs[0].plot(df['step'], df['losses/train.bodypart_locref'], label='Train Locref Loss', marker='o')
axs[0].plot(df['step'], df['losses/train.bodypart_total_loss'], label='Train Bodypart Total Loss', marker='o')
axs[0].plot(df['step'], df['losses/train.total_loss'], label='Train Total Loss', marker='o', linewidth=2)
axs[0].set_ylabel('Loss')
axs[0].set_title('Training Losses over Epochs')
axs[0].legend()
axs[0].grid(True)

# Plot evaluation losses (only where available)
eval_df = df.dropna(subset=['losses/eval.total_loss'])
axs[1].plot(eval_df['step'], eval_df['losses/eval.bodypart_heatmap'], label='Eval Heatmap Loss', marker='s')
axs[1].plot(eval_df['step'], eval_df['losses/eval.bodypart_locref'], label='Eval Locref Loss', marker='s')
axs[1].plot(eval_df['step'], eval_df['losses/eval.bodypart_total_loss'], label='Eval Bodypart Total Loss', marker='s')
axs[1].plot(eval_df['step'], eval_df['losses/eval.total_loss'], label='Eval Total Loss', marker='s', linewidth=2)
axs[1].set_xlabel('Epoch')
axs[1].set_ylabel('Loss')
axs[1].set_title('Evaluation Losses at Checkpoints')
axs[1].legend()
axs[1].grid(True)

plt.tight_layout()
plt.show()

# Separate plot for metrics
metric_fig, metric_ax = plt.subplots(figsize=(12, 6))
eval_metrics = df.dropna(subset=['metrics/test.mAP'])
metric_ax.plot(eval_metrics['step'], eval_metrics['metrics/test.mAP'], label='Test mAP (%)', marker='o')
metric_ax.plot(eval_metrics['step'], eval_metrics['metrics/test.mAR'], label='Test mAR (%)', marker='o')
# Twin axis for RMSE (different scale)
rmse_ax = metric_ax.twinx()
rmse_ax.plot(eval_metrics['step'], eval_metrics['metrics/test.rmse'], label='Test RMSE (pixels)', marker='o', color='r')
rmse_ax.plot(eval_metrics['step'], eval_metrics['metrics/test.rmse_pcutoff'], label='Test RMSE (pcutoff, pixels)', marker='o', color='m')
rmse_ax.set_ylabel('RMSE Value')
metric_ax.set_xlabel('Epoch')
metric_ax.set_ylabel('mAP/mAR Value')
metric_ax.set_title('Evaluation Metrics over Epochs')
metric_ax.legend(loc='upper left')
rmse_ax.legend(loc='upper right')
metric_ax.grid(True)
plt.show()

print("Review: Analyze if train/eval losses are decreasing without divergence (overfitting if eval rises while train drops). Metrics like mAP/mAR should increase, RMSE decrease. If metrics plateau, consider more epochs or hyperparameter tuning.")
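
The divergence check described in the review message can also be done numerically. The sketch below uses made-up loss values and a hypothetical helper, `is_diverging`, that flags a run when the evaluation loss has risen over the last few checkpoints while the training loss kept falling. The `patience` parameter and all numbers are illustrative assumptions, not DeepLabCut defaults.

```python
def is_diverging(train_losses, eval_losses, patience=2):
    """True if eval loss rose and train loss fell over the last `patience` steps."""
    if len(train_losses) != len(eval_losses):
        raise ValueError("expected one train and one eval loss per checkpoint")
    if len(eval_losses) <= patience:
        return False  # not enough checkpoints to judge
    recent_train = train_losses[-(patience + 1):]
    recent_eval = eval_losses[-(patience + 1):]
    eval_rising = all(b > a for a, b in zip(recent_eval, recent_eval[1:]))
    train_falling = all(b < a for a, b in zip(recent_train, recent_train[1:]))
    return eval_rising and train_falling


# Made-up losses at five checkpoints.
train = [1.00, 0.60, 0.40, 0.30, 0.20]
eval_overfit = [1.10, 0.70, 0.50, 0.55, 0.60]   # turns upward after index 2
eval_healthy = [1.10, 0.70, 0.50, 0.45, 0.40]   # keeps decreasing

best = min(range(len(eval_overfit)), key=eval_overfit.__getitem__)
print("lowest eval loss at checkpoint index", best)
print("overfit run diverging:", is_diverging(train, eval_overfit))
print("healthy run diverging:", is_diverging(train, eval_healthy))
```

In the overfit case, the snapshot closest to the lowest evaluation loss would usually be the better choice for video analysis.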

## Start evaluating:
This function evaluates a trained model for a specific shuffle (or shuffles), at a particular state or at all states, on the data set (images)
and stores the results as a .csv file in a subdirectory under **evaluation-results**.

In [None]:
deeplabcut.evaluate_network(path_config_file, plotting=True)

# Here you want to see a low pixel error! Of course, it can only be as
# good as the labeler, so be sure your labels are good!
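
To make "pixel error" concrete, here is a minimal sketch that compares predicted keypoints with human labels using the Euclidean distance in pixels. The coordinates are invented, the helper names are hypothetical, and DeepLabCut's reported numbers may be aggregated differently (for example, per body part or after applying a likelihood cutoff).

```python
import math


def pixel_errors(predicted, labeled):
    """Euclidean distance (in pixels) between matching predicted and labeled points."""
    return [math.dist(p, t) for p, t in zip(predicted, labeled)]


def mean_error(errors):
    """Average distance over all points."""
    return sum(errors) / len(errors)


def rmse(errors):
    """Root mean square of the distances; penalizes large misses more strongly."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


labeled = [(13.0, 14.0), (20.0, 20.0)]      # made-up human labels (x, y)
predicted = [(10.0, 10.0), (20.0, 20.0)]    # made-up network predictions (x, y)

errors = pixel_errors(predicted, labeled)
print("per-point error:", errors)
print("mean error:", mean_error(errors))
print("RMSE:", round(rmse(errors), 3))
```

Since the human labels are the reference, any noise in them sets a floor on the error a model can reach.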

# Problem 2: What metrics indicate good training?

reply here:

## Start Analyzing videos:
This function analyzes the new video. The user can choose the best model from the evaluation results and specify the correct snapshot index for the variable **snapshotindex** in the **config.yaml** file. Otherwise, by default the most recent snapshot is used to analyse the video.

The results are stored in an HDF5 (.h5) file in the same directory where the video resides.

**On the demo data, this should take around 90 seconds! (The demo frames are 640x480, which should run at around 25 FPS on the Google-provided T4 GPU.)**

In [None]:
# Enter the list of videos to analyze.
videofile_path = ["/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/videos/m3v1mp4.mp4"]
deeplabcut.analyze_videos(path_config_file, videofile_path, videotype=".mp4")

In [None]:
deeplabcut.create_labeled_video(path_config_file, videofile_path)

In [None]:
deeplabcut.create_labeled_video(
    path_config_file,
    videofile_path,
    videotype='mp4',
    filtered=False
)

In [None]:
deeplabcut.plot_trajectories(path_config_file, videofile_path)

# Problem 3: Evaluate predictions

1. Inspect the likelihood-over-time plot (plot-likelihood.png in the plot-poses folder).

2. Adapt the code to load the HDF5 output file generated by deeplabcut.analyze_videos.

3. Extract likelihood values for each body part, and visualize their distributions using histograms and kernel density estimates (KDEs).

Address the following questions:
- Which body part has the lowest mean likelihood?
- What percentage of frames show low likelihoods (for example, below the `pcutoff` value in config.yaml)?

Based on both the distributions and the likelihood-over-time plot, propose at least two targeted improvements to the DLC model (for example, adding specific viewpoints to the training set, refining labels for difficult frames, or increasing model iterations).

In [None]:
# Adjust path to predictions HDF5 file, see output of deeplabcut.analyze_videos line with "Saving results in "
h5_path = '/content/cloned-DLC-repo/examples/openfield-Pranav-2018-10-30/videos/m3v1mp4DLC_Resnet50_openfieldOct30shuffle1_snapshot_best-20.h5'

# Load predictions from HDF5
try:
    data = pd.read_hdf(h5_path)
    scorer = data.columns.levels[0][0]
    bodyparts = ['snout', 'leftear', 'rightear', 'tailbase']
    likelihood_columns = [(scorer, bp, 'likelihood') for bp in bodyparts]
    if not all(col in data.columns for col in likelihood_columns):
        print(f"Warning: Expected likelihood columns {likelihood_columns} not found. Available columns: {list(data.columns)}")
        raise ValueError("Missing expected columns in HDF5 file.")
except FileNotFoundError:
    print(f"Error: HDF5 file not found at {h5_path}. Ensure deeplabcut.analyze_videos ran successfully.")
    raise
except Exception as e:
    print(f"Unexpected error loading HDF5: {e}. Inspect with: h5py.File('{h5_path}', 'r').keys()")
    raise
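
As a hint for the summary statistics asked for in Problem 3, the sketch below computes the mean likelihood per body part and the percentage of frames below a cutoff, using plain Python on made-up numbers. The 0.6 cutoff and the toy values are assumptions; with real data, the per-body-part likelihoods would come from the `likelihood_columns` of `data` loaded above.

```python
from statistics import fmean

# Made-up likelihoods for four frames; real values come from the HDF5 file.
likelihoods = {
    "snout":    [0.99, 0.95, 0.20, 0.98],
    "leftear":  [0.90, 0.90, 0.90, 0.90],
    "rightear": [0.85, 0.90, 0.95, 0.80],
    "tailbase": [0.50, 0.30, 0.90, 0.40],
}
threshold = 0.6  # hypothetical cutoff for a "low" likelihood

# Mean likelihood per body part.
mean_likelihood = {bp: fmean(vals) for bp, vals in likelihoods.items()}

# Percentage of frames whose likelihood falls below the cutoff.
percent_low = {
    bp: 100 * sum(v < threshold for v in vals) / len(vals)
    for bp, vals in likelihoods.items()
}

lowest = min(mean_likelihood, key=mean_likelihood.get)

for bp in likelihoods:
    print(f"{bp:9s} mean={mean_likelihood[bp]:.3f}  below {threshold}: {percent_low[bp]:.0f}%")
print("lowest mean likelihood:", lowest)
```

The histograms and KDEs requested in the problem show the same information in more detail, for instance whether low values are spread evenly or concentrated in a few frames.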

In [None]:
# solution

### Bonus: DLC Network Architecture

In [None]:
# Optional: Graphical visualization with torchviz
print("\nGenerating graphical visualization of the model architecture...")
try:
    # Install dependencies
    %pip install --quiet torchviz graphviz
    # Install Graphviz binary for Colab (only if needed)
    !if ! command -v dot >/dev/null 2>&1; then apt-get update -qq && apt-get install -y graphviz; fi
    from torchviz import make_dot
    import os
    from IPython.display import Image, display
    import torch
    import deeplabcut.pose_estimation_pytorch as dlc_torch
    from deeplabcut.core.config import read_config_as_dict

    # Load model configuration
    pytorch_config_path = loader.model_folder / "pytorch_config.yaml"
    model_cfg = read_config_as_dict(pytorch_config_path)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Initialize the model using DLC's training loader
    try:
        # Attempt to load the model using DLC's internal utilities
        from deeplabcut.pose_estimation_pytorch.training import PoseTrainingRunner
        runner = PoseTrainingRunner(
            config=pytorch_config_path,
            trainset_index=0,
            shuffle=1,
        )
        model = runner.model.to(device)
    except (AttributeError, ImportError) as e:
        print(f"Error initializing model: {e}")
        print("Model loading failed. Falling back to configuration-based description.")
        model = None

    if model:
        # Create dummy input for forward pass
        x = torch.randn(1, 3, 448, 448).to(device)  # Input size: batch=1, channels=3, 448x448 (from train.txt)

        # Perform forward pass
        y = model(x)

        # Generate and save visualization
        output_dir = loader.model_folder
        os.makedirs(output_dir, exist_ok=True)  # Ensure output directory exists
        output_path = os.path.join(output_dir, "model_architecture")
        dot = make_dot(y, params=dict(model.named_parameters()), show_attrs=True, show_saved=True)
        dot.format = 'png'
        dot.render(output_path, view=False)  # Save as PNG without opening

        # Verify and display output
        output_file = f"{output_path}.png"
        if os.path.exists(output_file):
            print(f"Graphical model architecture saved as {output_file}")
            display(Image(filename=output_file))
            print("Download the PNG from Colab's file explorer (left sidebar) under the model folder.")
        else:
            print(f"Warning: Output file {output_file} not found. Rendering may have failed.")
            print("Fallback: Displaying graph as text representation...")
            print(dot)
    else:
        # Fallback: Describe architecture from configuration
        print("\nFallback: Model visualization failed. Describing architecture from configuration:")
        print(f"Network Type: {model_cfg.get('net_type', 'N/A')}")
        print(f"Backbone: {model_cfg['model']['backbone']['type']} ({model_cfg['model']['backbone']['model_name']})")
        print(f"Bodyparts: {model_cfg['metadata']['bodyparts']}")
        print(f"Number of Keypoints: {len(model_cfg['metadata']['bodyparts'])}")
        print(f"Input Size: 448x448 (from crop_sampling)")
        print(f"Output: Heatmaps ({model_cfg['model']['heads']['bodypart']['heatmap_config']['channels'][-1]} channels) and location refinements for {len(model_cfg['metadata']['bodyparts'])} keypoints")
        print(f"Location Refinement Std: {model_cfg['model']['heads']['bodypart']['locref_std']}")
        print("Note: The model uses a ResNet50 backbone for feature extraction, followed by a heatmap head (predicting keypoint probabilities) and a location refinement head (refining coordinates).")

    print("Review: If visualized, inspect the network graph to identify the ResNet50 backbone, heatmap head, and location refinement head. How do ResNet50's convolutional layers capture spatial features for mouse tracking? Why are separate heads used for heatmaps and location refinement in pose estimation for bodyparts (snout, leftear, rightear, tailbase)? If visualization failed, use the configuration details to infer the model structure. How does the 448x448 input size balance detail and computational cost? Consider checking the DLC version or re-running '%pip install deeplabcut --upgrade'.")

except ImportError as e:
    print(f"Error: Failed to import torchviz, graphviz, or DLC modules: {e}")
    print("Run '%pip install torchviz graphviz deeplabcut --upgrade' in a new cell.")
except NameError as e:
    print(f"Error: Variable not defined: {e}")
    print("Ensure 'loader' is defined. Check if deeplabcut.create_training_dataset was run successfully.")
except RuntimeError as e:
    print(f"Error during forward pass: {e}")
    print("Verify model initialization, CUDA availability, and input size (448x448). Try re-running the model setup.")
except OSError as e:
    print(f"Error with Graphviz binary: {e}")
    print("Run '!apt-get install graphviz' in a new cell. Verify 'dot' is in PATH with '!which dot'.")
except Exception as e:
    print(f"Unexpected error: {e}")
    print("Try re-running the cell or installing dependencies manually ('%pip install torchviz graphviz deeplabcut --upgrade' and '!apt-get install graphviz').")