Fix cumulative sum bugs in aggregate_data_by_bin_size()#123
Merged
jacobbeierle merged 7 commits intomainfrom Apr 9, 2026
Merged
Fix cumulative sum bugs in aggregate_data_by_bin_size()#123jacobbeierle merged 7 commits intomainfrom
jacobbeierle merged 7 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes incorrect cumulative aggregation in aggregate_data_by_bin_size() that was corrupting latency and average-bout-length output features by summing per-bin values that should not be summed.
Changes:
- Extract per-mouse first/last latency values from per-bin data before numeric
groupby().sum()so latency isn’t cumulatively summed. - Compute
avg_bout_lengthfrom the last bin’s value per mouse (with a sample-count guard) instead of a cross-mouse scalarnp.average()broadcast. - Add focused unit tests covering single/multi-bin and multi-mouse cases, including NaN preservation.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| support_code/behavior_summaries.py | Fixes latency and avg bout length aggregation by computing them pre-sum and aligning per MouseID. |
| tests/support_code/test_behavior_summaries.py | Adds regression tests for latency and avg bout length correctness, including multi-mouse alignment and NaN cases. |
| tests/support_code/init.py | Adds package marker/docstring for support_code tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
gbeane
approved these changes
Apr 2, 2026
michberger
approved these changes
Apr 7, 2026
michberger
left a comment
There was a problem hiding this comment.
I tested the changes in this branch with some data from the Rett project and found that it correctly generated the latency values and produced logical average bout length values.
bergsalex
approved these changes
Apr 9, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two output features in behavior_summaries.py are corrupted by the same root cause: groupby("MouseID").sum() on line 136/148 collapses all per-bin rows into one row per mouse by summing every numeric column — including columns where summation is meaningless. This produces silently wrong values that grow with window size.
Affected features:
Latency columns (
bin_first_XX.{behavior}_latency_first_prediction,bin_last_XX.{behavior}_latency_last_prediction) should report the frame number of the first/last prediction within a time window. Instead, per-bin latency values are summed together, producing frame numbers that exceed the window duration (e.g., 142.7 min latency in a 0–60 min window). A secondary issue:.head(1)/.tail(1)on the already-collapsed single-row-per-mouse result silently returns NaN for all mice except the first/last in multi-mouse runs.Average bout length (
bin_avg_XX.{behavior}_avg_bout_length) should report the mean bout duration for a given window. Instead, per-bin averages are summed, producing values that scale roughly linearly with bin count (e.g., bin_avg_5 ≈ 18.8, bin_avg_15 ≈ 55.1). A secondary issue:np.average()computes a single scalar across all mice and broadcasts it to every row, so all mice get an identical value.Origin
Both bugs live in
aggregate_data_by_bin_size()insupport_code/behavior_summaries.py. The function filters per-bin data to a time window, then runsgroupby("MouseID")[numeric_cols].sum(). This is correct for count-like columns (total frames, total bouts) but wrong for columns that represent point-in-time values (latency) or already-averaged quantities (mean bout duration). The downstream code that consumes the summed result then compounds the problem with additional logic errors (.head(1)/.tail(1)indexing for latency; scalar broadcast for bout length).Fix
The fix follows the same pattern for both features: extract the needed values from the per-bin
filtered_databefore thegroupby().sum(), so the summation never touches them.Latency: Use
.groupby("MouseID")[col].agg(lambda s: s.iloc[0])and.agg(lambda s: s.iloc[-1])to grab the first and last bin's latency per mouse. These preserve NaN when a bin has no behavior, preventing bleed from adjacent bins. Assign directly toaggregated, which aligns on the shared MouseID index.Average bout length: Use
.groupby("MouseID")[col].agg(lambda s: s.iloc[-1])to grab the last bin's average bout duration per mouse. Guard with.where(sample_count > 0, np.nan)so bins with zero bouts yield NaN rather than a stale or zero value. This replaces the brokennp.average()scalar broadcast entirely.Verification
New tests in
tests/support_code/test_behavior_summaries.pycover both fixes with synthetic multi-bin, multi-mouse DataFrames, asserting that values are not cumulative, NaN is preserved when a bin has no behavior, and each mouse receives its own independent values.