fix: off-by-one error in run_lengths #2093

Merged (2 commits) on Jan 9, 2023

agoose77 (Collaborator) commented Jan 9, 2023

ak.run_lengths([[3, 3, 3, 3], [3]]) does not correctly consider the inner list boundary. This PR fixes the causative off-by-one error.

The cause of this bug is that the logic that forcibly splits runs at list boundaries was ignoring length-one sublists at the end of the outer list. Because of this off-by-one error, a run was incorrectly computed across the boundary in such cases.
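To make the expected behavior concrete, here is a minimal pure-Python sketch. It is not the Awkward Array implementation: the helper names `run_lengths_per_list` and `run_lengths_flat` are hypothetical, and the second helper only illustrates the general kind of error (a run counted across a list boundary), not the exact output of the buggy code.

```python
from itertools import groupby


def run_lengths_per_list(lists):
    """Run lengths computed independently inside each sublist (intended behavior)."""
    return [[len(list(group)) for _, group in groupby(sub)] for sub in lists]


def run_lengths_flat(lists):
    """Run lengths over the flattened data, ignoring list boundaries (illustrative error)."""
    flat = [x for sub in lists for x in sub]
    return [len(list(group)) for _, group in groupby(flat)]


example = [[3, 3, 3, 3], [3]]
print(run_lengths_per_list(example))  # [[4], [1]]
print(run_lengths_flat(example))      # [5]
```

The trailing `[3]` is a length-one sublist at the end of the outer list, which is the case the description says was being ignored.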

agoose77 marked this pull request as draft January 9, 2023 17:47

agoose77 (Collaborator, Author) commented Jan 9, 2023

Hmm, this test isn't failing on main...

agoose77 marked this pull request as ready for review January 9, 2023 17:51
codecov bot commented Jan 9, 2023

Codecov Report

Merging #2093 (cf3abfb) into main (00da8e3) will decrease coverage by 0.08%.
The diff coverage is n/a.

Additional details and impacted files
| Impacted Files | Coverage Δ |
|---|---|
| src/awkward/operations/ak_run_lengths.py | 90.90% <ø> (ø) |
| src/awkward/operations/ak_sort.py | 60.00% <0.00%> (-40.00%) ⬇️ |
| src/awkward/operations/ak_argsort.py | 75.00% <0.00%> (-25.00%) ⬇️ |
| src/awkward/operations/ak_ones_like.py | 90.90% <0.00%> (-9.10%) ⬇️ |
| src/awkward/operations/ak_zeros_like.py | 92.85% <0.00%> (-7.15%) ⬇️ |
| src/awkward/operations/ak_ravel.py | 93.33% <0.00%> (-6.67%) ⬇️ |
| src/awkward/operations/ak_isclose.py | 94.44% <0.00%> (-5.56%) ⬇️ |
| src/awkward/operations/ak_count_nonzero.py | 77.27% <0.00%> (-2.73%) ⬇️ |
| src/awkward/operations/ak_nan_to_num.py | 98.03% <0.00%> (-1.97%) ⬇️ |
| src/awkward/operations/ak_full_like.py | 98.07% <0.00%> (-1.93%) ⬇️ |

... and 15 more

jpivarski (Member) left a comment

I was thinking that if this is something v1 got right and v2 (without this PR) gets wrong, which I just confirmed using your test example, then we should be able to look at the v1 code and understand why.

Here's the relevant section of the v1 code:

    diffs = data[1:] != data[:-1]
    if isinstance(diffs, ak.highlevel.Array):
        diffs = nplike.asarray(diffs)
    if offsets is not None:
        diffs[offsets[1:-1] - 1] = True

The code here is pretty different now because of is_interior, introduced in #1795. I guess that means this PR is a bug-fix on that PR, right?

Anyway, it looks right now and I think it should be merged!
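The indexing idea in the v1 excerpt quoted above can be sketched in plain NumPy. This is a rough illustration under stated assumptions, not the v1 or v2 Awkward code: the function name `split_runs_at_boundaries` is hypothetical, `offsets` is assumed to hold the start/stop positions of each sublist in the flattened data, and the guard that filters out boundaries at position 0 or at the end of the data is an addition that does not appear in the excerpt.

```python
import numpy as np


def split_runs_at_boundaries(data, offsets):
    """Run lengths over flat `data`, forcing a break at every interior list boundary."""
    data = np.asarray(data)
    offsets = np.asarray(offsets, dtype=np.int64)
    if len(data) == 0:
        return np.zeros(0, dtype=np.int64)

    # diffs[i] is True where data[i] differs from data[i + 1], i.e. a run ends at i.
    diffs = data[1:] != data[:-1]

    # Same idea as `diffs[offsets[1:-1] - 1] = True` in the quoted snippet:
    # mark the element just before each interior boundary as the end of a run.
    # Assumption (not in the excerpt): drop boundaries that sit at position 0
    # or at len(data), which arise from empty leading or trailing sublists.
    boundaries = offsets[1:-1]
    boundaries = boundaries[(boundaries > 0) & (boundaries < len(data))]
    diffs[boundaries - 1] = True

    # Convert run-end markers into run edges, then into run lengths.
    edges = np.concatenate(([0], np.nonzero(diffs)[0] + 1, [len(data)]))
    return np.diff(edges)


# Flattened form of [[3, 3, 3, 3], [3]]: five equal values, boundary at index 4.
print(split_runs_at_boundaries([3, 3, 3, 3, 3], [0, 4, 5]))  # [4 1]
```

The result here is flat; regrouping the run lengths back into one list per sublist is left out to keep the sketch short.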

agoose77 (Collaborator, Author) commented Jan 9, 2023

@jpivarski yes, don't run git blame ;)

agoose77 merged commit 49eabc2 into main Jan 9, 2023
agoose77 deleted the agoose77/feat-run-lengths-empty branch January 9, 2023 18:54