Add support for mouse triplets and mabe22 by gchindemi · Pull Request #47 · BelloneLab/lisbet

gchindemi · 2025-05-23T19:46:12Z

This pull request introduces support for the MABe22 dataset and refines data preprocessing, scaling, and training configurations. Key changes include adding a new dataset, enhancing scaling logic, updating the input pipeline for feature selection, and improving test coverage.

New Dataset Support:

Added support for the MABe22 dataset, including its preprocessing and loading logic in a new module src/lisbet/datasets/mabe22.py. This includes methods to handle training and testing data, along with metadata and references.
Updated fetch_dataset in src/lisbet/datasets/core.py to include MABe22 dataset retrieval and preprocessing logic.
Added MABe22 to the dataset options in the CLI (src/lisbet/cli.py).

Data Preprocessing and Scaling:

Enhanced _load_posetracks in src/lisbet/datasets/core.py to support scaling based on explicit factors or the image_size_px attribute. This ensures more flexible and accurate coordinate normalization.
Added image_size_px attribute to CalMS21 preprocessing to enable consistent scaling.
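
The two scaling modes described above can be sketched as follows. This is a minimal illustration, not the code in `_load_posetracks`: the helper name `scale_coordinates`, its parameters, and the treatment of the explicit factor as an (x, y) divisor are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def scale_coordinates(
    points: Sequence[Point],
    explicit_scale: Optional[Tuple[float, float]] = None,
    image_size_px: Optional[Tuple[int, int]] = None,
) -> List[Point]:
    """Normalize (x, y) pixel coordinates to the [0, 1] range.

    Hypothetical helper: an explicit (x, y) divisor takes precedence,
    otherwise the (width, height) given by image_size_px is used.
    """
    if explicit_scale is not None:
        sx, sy = explicit_scale
    elif image_size_px is not None:
        sx, sy = image_size_px
    else:
        raise ValueError("Provide explicit_scale or image_size_px")

    scaled = [(x / sx, y / sy) for x, y in points]

    # Enforce the [0, 1] range regardless of which scaling mode was used.
    for x, y in scaled:
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            raise ValueError("Scaled coordinates fall outside [0, 1]")
    return scaled


# Illustrative values only: both modes should agree when the explicit
# factor equals the image size.
points = [(512.0, 285.0), (0.0, 570.0)]
by_attr = scale_coordinates(points, image_size_px=(1024, 570))
by_factor = scale_coordinates(points, explicit_scale=(1024.0, 570.0))
```

Checking the range after scaling, rather than before, means a wrong factor or a mislabeled image size surfaces as an error instead of silently producing coordinates outside the expected range.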

Input Pipeline Refinements:

Updated __init__ in src/lisbet/input_pipeline.py to compute and store individual-specific feature indices for efficient feature selection.
Modified __getitem__ in src/lisbet/input_pipeline.py to apply swaps only to the features of a specific individual, selected through the stored indices.
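
As a rough sketch of the idea, the indices can be computed once from the feature layout and then reused to swap a single individual's columns. The flat `(individual, keypoint, axis)` layout, the function names, and the list-of-frames window format are assumptions for illustration, not the structures used in `input_pipeline.py`.

```python
from collections import defaultdict

# Hypothetical flat feature layout: one (individual, keypoint, axis) per column.
features = [
    ("mouse1", "nose", "x"), ("mouse1", "nose", "y"),
    ("mouse2", "nose", "x"), ("mouse2", "nose", "y"),
]


def individual_feature_indices(feature_names):
    """Map each individual to the column indices of its features."""
    indices = defaultdict(list)
    for col, (individual, _keypoint, _axis) in enumerate(feature_names):
        indices[individual].append(col)
    return dict(indices)


def swap_individual(window, donor_window, columns):
    """Return a copy of window whose given columns come from donor_window."""
    swapped = [list(frame) for frame in window]
    for frame, donor_frame in zip(swapped, donor_window):
        for col in columns:
            frame[col] = donor_frame[col]
    return swapped


# Computed once (e.g. at construction time), reused for every sample.
idx = individual_feature_indices(features)

window = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
donor = [[9.1, 9.2, 9.3, 9.4], [9.5, 9.6, 9.7, 9.8]]

# Replace only the second individual's features; mouse1 is left untouched.
swapped = swap_individual(window, donor, idx["mouse2"])
```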

Training Configuration:

Added a pin_memory parameter to _configure_dataloaders in src/lisbet/training.py, dynamically set based on the device type (e.g., CUDA, MPS, or CPU).
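
One plausible rule for this choice is sketched below. The PR text does not state the exact mapping, so the helper `select_pin_memory` and the decision to pin only for CUDA are assumptions; the device names are plain strings so the example runs without PyTorch installed.

```python
def select_pin_memory(device_type: str) -> bool:
    """Pin host memory only when batches are copied to a CUDA device.

    Assumed rule: pinned (page-locked) memory mainly benefits
    host-to-GPU copies on CUDA, so it is disabled elsewhere.
    """
    return device_type == "cuda"


# Keyword arguments that could be forwarded to a DataLoader-like constructor.
loader_kwargs = {
    device: {"batch_size": 32, "pin_memory": select_pin_memory(device)}
    for device in ("cuda", "mps", "cpu")
}
```

Passing the flag as an explicit parameter, as the PR does, keeps the dataloader setup testable without access to a GPU.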

Test Enhancements:

Added tests to validate scaling logic, ensuring explicit scaling and image_size_px scaling produce consistent results and raise errors for invalid data.
Improved error message validation in inference tests.

Add logic to extract individuals and their feature indices from pose data. Update swapping and shifting to operate only on the second individual's features. Add tests for multi-individual handling and correct swapping/shifting behavior.
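
Shifting restricted to the second individual might look like the sketch below. The function name, the record format (a list of frames, each a flat list of features), and the meaning of `delay` as a frame offset are assumptions chosen for the example rather than details taken from the PR.

```python
def shift_individual(record, start, length, delay, columns):
    """Build a window where `columns` are read `delay` frames later.

    All other columns are taken from record[start:start + length].
    """
    shifted_start = start + delay
    if start < 0 or start + length > len(record):
        raise IndexError("Window falls outside the record")
    if shifted_start < 0 or shifted_start + length > len(record):
        raise IndexError("Shifted window falls outside the record")

    # Copy the reference window, then overwrite the selected columns
    # with values taken from the time-shifted window.
    window = [list(frame) for frame in record[start:start + length]]
    shifted = record[shifted_start:shifted_start + length]
    for frame, source_frame in zip(window, shifted):
        for col in columns:
            frame[col] = source_frame[col]
    return window


# Columns 0-1 belong to the first individual, columns 2-3 to the second.
record = [[t, t, 10 + t, 10 + t] for t in range(6)]
shifted_window = shift_individual(record, start=1, length=3, delay=2, columns=[2, 3])
```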

Copilot

Pull Request Overview

Add support for the MABe22 Mouse Triplets dataset, refine scaling and preprocessing logic, and update training and input pipeline configurations.

Introduce a new mabe22 module with loading and preprocessing for mouse triplets and register it in the CLI.
Enhance _load_posetracks to handle explicit scaling, image_size_px scaling, and auto-scaling with consistent error checks.
Compute per-individual feature indices in BaseDataset and update swap/delay datasets to target specific individual features; add dynamic pin_memory selection for dataloaders.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

File	Description
tests/test_training_integration.py	Updated sample dataset path in integration test to include the `datasets/` prefix
tests/test_training_helpers.py	Updated `_configure_dataloaders` call to include the new `pin_memory` argument
tests/test_input_pipeline.py	Added tests for multi-individual feature indexing, swap-only and delay-only behaviors
tests/test_inference.py	Adjusted error message regex to remove the exclamation mark
tests/test_datasets_core.py	Added tests to validate explicit vs. `image_size_px` scaling and out-of-range checks
src/lisbet/training.py	Added `pin_memory` parameter to `_configure_dataloaders` and set it dynamically based on device type
src/lisbet/input_pipeline.py	Compute per-individual feature indices in `BaseDataset`; update swap/delay classes to swap by index
src/lisbet/datasets/mabe22.py	New module implementing support for the MABe22 dataset
src/lisbet/datasets/core.py	Enhanced coordinate scaling logic and integrated MABe22 fetch path
src/lisbet/datasets/calms21.py	Added `image_size_px` attribute to CalMS21 preprocessing
src/lisbet/cli.py	Registered `MABe22_MouseTriplets` dataset in the CLI

- Changes annotation variable from 'label' to 'target_cls' throughout the codebase for consistency
- Removes obsolete label_cat conversion in core.py
- Updates evaluation and training logic to use target_cls
- Adds tests for evaluation module

Copilot

Pull Request Overview

This PR adds support for the new MABe22 dataset, updates data preprocessing (including scaling logic), refines the input pipeline, and improves test coverage for training, inference, and evaluation. Key changes include adding a new module for the MABe22 dataset, updating scaling/normalization routines in dataset loaders, and modifying the data loader configuration with a new pin_memory parameter.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

File	Description
tests/test_training_integration.py	Updated test path to reflect new directory structure for sample data
tests/test_training_helpers.py	Updated dataloader call to include pin_memory parameter
tests/test_inference.py	Adjusted expected error message for inference failure
tests/test_input_pipeline.py	Added tests verifying individual indices and swap/shift behavior
tests/test_datasets_core.py	Added tests for explicit scaling behavior and equivalence with image_size_px scaling
src/lisbet/training.py	Updated dataloader configuration and pin_memory assignment based on device type
src/lisbet/input_pipeline.py	Revised swapping/shifting to directly target the second individual’s features
src/lisbet/evaluation.py	Updated error label extraction to match new annotation naming convention
src/lisbet/datasets/mabe22.py	New module implementing data processing for the MABe22 dataset
src/lisbet/datasets/core.py	Extended scaling logic to handle explicit scale strings and image_size_px attributes
src/lisbet/datasets/calms21.py	Updated preprocessing to include image_size_px and new target_cls naming
src/lisbet/cli.py	Updated dataset options to add support for MABe22_MouseTriplets

Comments suppressed due to low confidence (1)

tests/test_inference.py:127

The updated error message no longer includes an exclamation mark. Please verify that this change is consistent with the intended exception messaging throughout the codebase.

with pytest.raises(ValueError, match="Incompatible input features"):
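
The `match` argument of `pytest.raises` is applied with `re.search` to the string form of the exception, so a pattern without the exclamation mark matches a message with or without it. The short check below illustrates this with `re` alone; the message variants are hypothetical and not copied from the inference code.

```python
import re

PATTERN = "Incompatible input features"

# Hypothetical message variants; the real text lives in the inference code.
messages = [
    "Incompatible input features",
    "Incompatible input features!",
    "Incompatible input features: expected 28, got 14",
]

# re.search finds the pattern anywhere in the message, so trailing
# punctuation or extra detail does not prevent a match.
matches = [bool(re.search(PATTERN, message)) for message in messages]
```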

gchindemi added 6 commits May 20, 2025 09:33

Add image_size_px scaling and improve coordinate normalization

3b12481

- Support scaling coordinates using image_size_px attribute
- Enforce [0, 1] range after all scaling modes
- Add tests for explicit and image_size_px scaling behavior

Add support for MABe22 MouseTriplets dataset (no annotations)

5022b0b

Update test sequence URL and hash in fetch_dataset

81802a1

Remove exclamation mark from ValueError match in test

b1f36a0

Make pin_memory configurable in dataloaders setup

140e95c

gchindemi linked an issue May 23, 2025 that may be closed by this pull request

Add support for MABe22 dataset and extend dataset heuristics #46

Closed

gchindemi requested a review from Copilot May 23, 2025 19:46

Copilot AI reviewed May 23, 2025

Comment thread src/lisbet/datasets/mabe22.py Outdated

Comment thread src/lisbet/input_pipeline.py Outdated

Comment thread src/lisbet/datasets/mabe22.py Outdated

Comment thread src/lisbet/input_pipeline.py Outdated

gchindemi added 5 commits May 25, 2025 18:18

Remove label_cat assignment and update label extraction logic

5c30345

Add annotation handling for Mouse Triplets dataset

668c07e

Include support for loading and structuring both classification and regression annotations using xarray Datasets for train and test splits.

Fix comment to refer to shifting instead of swapping

a5a363e

Refactor variable name for clarity in swap logic

8131f3e

gchindemi requested a review from Copilot May 27, 2025 08:14

Copilot AI reviewed May 27, 2025

Comment thread src/lisbet/input_pipeline.py

Comment thread src/lisbet/datasets/core.py

Expand help for --task_ids and raise for unimplemented lfc task

2fe1166

gchindemi merged commit da85b1e into main May 27, 2025

gchindemi deleted the feature/mabe22 branch May 27, 2025 12:03

gchindemi merged 12 commits into main from feature/mabe22

2 participants
