- Support scaling coordinates using image_size_px attribute
- Enforce [0, 1] range after all scaling modes
- Add tests for explicit and image_size_px scaling behavior
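The three scaling modes named in this commit can be sketched as follows. This is a minimal illustration, not the code in `_load_posetracks`: the helper name `scale_coords`, the tuple form of `scale`, the precedence of explicit scaling over `image_size_px`, and the min-max behavior of the auto mode are all assumptions.

```python
def scale_coords(points, scale=None, image_size_px=None):
    """Normalize (x, y) points to [0, 1].

    Hypothetical helper: the argument formats and the precedence of the
    modes are assumptions, not LISBET's actual implementation.
    """
    size = scale if scale is not None else image_size_px
    if size is not None:
        # Explicit or image_size_px mode: divide by (width, height).
        width, height = size
        scaled = [(x / width, y / height) for x, y in points]
    else:
        # Auto mode (assumed): min-max normalize each axis over the data.
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        x_span = (max(xs) - min(xs)) or 1.0  # avoid division by zero
        y_span = (max(ys) - min(ys)) or 1.0
        scaled = [((x - min(xs)) / x_span, (y - min(ys)) / y_span)
                  for x, y in points]

    # Enforce the [0, 1] range after every mode, as the commit describes.
    for x, y in scaled:
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            raise ValueError(f"Scaled coordinate out of range: ({x}, {y})")
    return scaled


points = [(0.0, 0.0), (512.0, 285.0), (1024.0, 570.0)]
by_attr = scale_coords(points, image_size_px=(1024, 570))
by_explicit = scale_coords(points, scale=(1024, 570))
```

Running the range check once, after the mode-specific branch, keeps the validation identical no matter how the coordinates were scaled.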
Add logic to extract individuals and their feature indices from pose data. Update swapping and shifting to operate only on the second individual's features. Add tests for multi-individual handling and correct swapping/shifting behavior.
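As a rough sketch of the idea in this commit, the snippet below builds a per-individual index map and then swaps only the second individual's features. The feature layout (tuples of individual, keypoint, axis) and both function names are hypothetical; the real `BaseDataset` may store features differently.

```python
def individual_feature_indices(features):
    """Map each individual to the positions of its features.

    `features` is assumed to be a sequence of
    (individual, keypoint, axis) tuples; insertion order is preserved.
    """
    indices = {}
    for pos, (individual, _keypoint, _axis) in enumerate(features):
        indices.setdefault(individual, []).append(pos)
    return indices


def swap_second_individual(frame_a, frame_b, indices):
    """Copy frame_a, taking the second individual's features from frame_b."""
    second = list(indices)[1]  # second individual in insertion order
    out = list(frame_a)
    for pos in indices[second]:
        out[pos] = frame_b[pos]
    return out


features = [
    ("mouse0", "nose", "x"), ("mouse0", "nose", "y"),
    ("mouse1", "nose", "x"), ("mouse1", "nose", "y"),
]
indices = individual_feature_indices(features)
swapped = swap_second_individual([0.1, 0.2, 0.3, 0.4],
                                 [0.5, 0.6, 0.7, 0.8], indices)
```

Computing the index map once lets both swapping and shifting reuse it, so the first individual's features stay untouched.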
Pull Request Overview
Add support for the MABe22 Mouse Triplets dataset, refine scaling and preprocessing logic, and update training and input pipeline configurations.
- Introduce a new `mabe22` module with loading and preprocessing for mouse triplets and register it in the CLI.
- Enhance `_load_posetracks` to handle explicit scaling, `image_size_px` scaling, and auto-scaling with consistent error checks.
- Compute per-individual feature indices in `BaseDataset` and update swap/delay datasets to target specific individual features; add dynamic `pin_memory` selection for dataloaders.
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/test_training_integration.py | Updated sample dataset path in integration test to include the datasets/ prefix |
| tests/test_training_helpers.py | Updated _configure_dataloaders call to include the new pin_memory argument |
| tests/test_input_pipeline.py | Added tests for multi-individual feature indexing, swap-only and delay-only behaviors |
| tests/test_inference.py | Adjusted error message regex to remove the exclamation mark |
| tests/test_datasets_core.py | Added tests to validate explicit vs. image_size_px scaling and out-of-range checks |
| src/lisbet/training.py | Added pin_memory parameter to _configure_dataloaders and set it dynamically based on device type |
| src/lisbet/input_pipeline.py | Compute per-individual feature indices in BaseDataset; update swap/delay classes to swap by index |
| src/lisbet/datasets/mabe22.py | New module implementing support for the MABe22 dataset |
| src/lisbet/datasets/core.py | Enhanced coordinate scaling logic and integrated MABe22 fetch path |
| src/lisbet/datasets/calms21.py | Added image_size_px attribute to CalMS21 preprocessing |
| src/lisbet/cli.py | Registered MABe22_MouseTriplets dataset in the CLI |
- Changes annotation variable from 'label' to 'target_cls' throughout the codebase for consistency
- Removes obsolete label_cat conversion in core.py
- Updates evaluation and training logic to use target_cls
- Adds tests for evaluation module
Include support for loading and structuring both classification and regression annotations using xarray Datasets for train and test splits.
Pull Request Overview
This PR adds support for the new MABe22 dataset, updates data preprocessing (including scaling logic), refines the input pipeline, and improves test coverage for training, inference, and evaluation. Key changes include adding a new module for the MABe22 dataset, updating scaling/normalization routines in dataset loaders, and modifying the data loader configuration with a new pin_memory parameter.
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_training_integration.py | Updated test path to reflect new directory structure for sample data |
| tests/test_training_helpers.py | Updated dataloader call to include pin_memory parameter |
| tests/test_inference.py | Adjusted expected error message for inference failure |
| tests/test_input_pipeline.py | Added tests verifying individual indices and swap/shift behavior |
| tests/test_datasets_core.py | Added tests for explicit scaling behavior and equivalence with image_size_px scaling |
| src/lisbet/training.py | Updated dataloader configuration and pin_memory assignment based on device type |
| src/lisbet/input_pipeline.py | Revised swapping/shifting to directly target the second individual’s features |
| src/lisbet/evaluation.py | Updated error label extraction to match new annotation naming convention |
| src/lisbet/datasets/mabe22.py | New module implementing data processing for the MABe22 dataset |
| src/lisbet/datasets/core.py | Extended scaling logic to handle explicit scale strings and image_size_px attributes |
| src/lisbet/datasets/calms21.py | Updated preprocessing to include image_size_px and new target_cls naming |
| src/lisbet/cli.py | Updated dataset options to add support for MABe22_MouseTriplets |
Comments suppressed due to low confidence (1)
tests/test_inference.py:127
- The updated error message no longer includes an exclamation mark. Please verify that this change is consistent with the intended exception messaging throughout the codebase.
with pytest.raises(ValueError, match="Incompatible input features"):
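For context on this suppressed comment: `pytest.raises(..., match=...)` tests the pattern against the exception text with `re.search`, so a pattern without the exclamation mark matches both the old and the new message. The sketch below reproduces that check with the standard library only; the helper `message_matches` is hypothetical, and the exact message strings are taken from the comment rather than from the codebase.

```python
import re


def message_matches(pattern, message):
    """Mimic the substring-style regex check that pytest.raises(match=...) performs."""
    return re.search(pattern, message) is not None


new_pattern = "Incompatible input features"
old_pattern = "Incompatible input features!"

# The shorter pattern accepts either wording of the error.
matches_old_message = message_matches(new_pattern, "Incompatible input features!")
matches_new_message = message_matches(new_pattern, "Incompatible input features")

# The old pattern would reject the new message, hence the test update.
old_pattern_on_new = message_matches(old_pattern, "Incompatible input features")
```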
This pull request introduces support for the MABe22 dataset and refines data preprocessing, scaling, and training configurations. Key changes include adding a new dataset, enhancing scaling logic, updating the input pipeline for feature selection, and improving test coverage.
New Dataset Support:
- Added a new module, `src/lisbet/datasets/mabe22.py`. This includes methods to handle training and testing data, along with metadata and references.
- Updated `fetch_dataset` in `src/lisbet/datasets/core.py` to include MABe22 dataset retrieval and preprocessing logic.
- Registered the dataset in the CLI (`src/lisbet/cli.py`).

Data Preprocessing and Scaling:
- Updated `_load_posetracks` in `src/lisbet/datasets/core.py` to support scaling based on explicit factors or the `image_size_px` attribute. This ensures more flexible and accurate coordinate normalization.
- Added the `image_size_px` attribute to CalMS21 preprocessing to enable consistent scaling.

Input Pipeline Refinements:
- Updated `__init__` in `src/lisbet/input_pipeline.py` to compute and store individual-specific feature indices for efficient feature selection.
- Updated `__getitem__` in `src/lisbet/input_pipeline.py` to apply swaps for specific individuals' features, improving clarity and precision.

Training Configuration:
- Added a `pin_memory` parameter to `_configure_dataloaders` in `src/lisbet/training.py`, dynamically set based on the device type (e.g., CUDA, MPS, or CPU).

Test Enhancements:
- Added tests to verify that explicit and `image_size_px` scaling produce consistent results and raise errors for invalid data.
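The device-dependent `pin_memory` choice mentioned under Training Configuration could look roughly like the sketch below. The description does not say which device types enable pinning, so the policy here (pin only for CUDA) is an assumption, and `select_pin_memory` is a hypothetical helper rather than a function in `src/lisbet/training.py`.

```python
def select_pin_memory(device_type):
    """Decide whether dataloaders should use pinned host memory.

    Assumed policy: enable pinning only for CUDA, where page-locked
    memory speeds up host-to-GPU copies; disable it for MPS and CPU.
    """
    return device_type == "cuda"


# The resulting flag would be forwarded to the dataloader constructor,
# e.g. torch.utils.data.DataLoader(dataset, pin_memory=...).
loader_kwargs = {
    device: {"pin_memory": select_pin_memory(device)}
    for device in ("cuda", "mps", "cpu")
}
```

Passing the flag in as a parameter, rather than reading the device inside the dataloader setup, keeps the helper easy to test without a GPU.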