libidset: fix idset_last() at size=32 #2340

Merged
grondo merged 3 commits into flux-framework:master from garlick:idset_workaround on Aug 27, 2019

@garlick
Member

commented Aug 27, 2019

This turned out to be a veb bug, which I opened quixotically at mrdomino/libveb#1.

This PR adds a workaround and two tests: a unit test for veb.c (left as a TODO for now) and one for libidset, which demonstrates the fix and pokes around the edges of it.

I regret that I do not think I am the right person to have a go at the upstream bug, hence the workaround. Before settling on it, I satisfied myself that the bug must be localized to vebpred(size-1) where size=32.

garlick added 3 commits Aug 26, 2019
Before the workaround, the only failure I could make happen
was at size 32, but check 31, 32, 33.
Check that idset_last() returns correctly for all possible
"last" values within those sizes, to be sure the workaround
doesn't introduce any issues.

Found a bug: vebpred(M-1) at M=32 always returns M.
This adds a unit test for the bug, but leaves it a TODO
as we'll work around it rather than fix it right now.

The bug where this was discovered is #2236

Problem: idset_last() always fails when the idset has
a fixed size of 32.

This is a vebpred() bug, and appears to be peculiar to
size=32 and vebpred(size-1).  This commit adds a workaround:
if vebpred(size-1) fails, check vebsucc(size-1), then if
that fails check vebpred(size-2).

The workaround is enabled for all sizes > 1, just in case.
Overhead should be low (in fact no overhead for non-empty
idsets, when vebpred() works as advertised).

Fixes #2336 (using the term loosely)
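
For readers following along, the fallback order in the commit message above can be sketched roughly as follows. This is not the code from the PR, and it does not use libveb: `struct mockset`, `mock_succ()`, `mock_pred()`, `last_with_workaround()`, and `sweep_ok()` are hypothetical names invented for illustration. The mock assumes that both queries are inclusive and return M when nothing is found, and `mock_pred()` deliberately imitates the reported misbehavior at M=32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a veb tree: a bitmap holding ids 0..M-1.
 * As assumed above, queries return M to mean "not found". */
struct mockset {
    unsigned int M;   /* fixed size, 1..64 in this sketch */
    uint64_t bits;
};

static bool mock_has (const struct mockset *s, unsigned int x)
{
    return x < s->M && ((s->bits >> x) & 1);
}

/* Smallest member >= x, or M if none. */
static unsigned int mock_succ (const struct mockset *s, unsigned int x)
{
    for (; x < s->M; x++)
        if (mock_has (s, x))
            return x;
    return s->M;
}

/* Largest member <= x, or M if none.  Imitates the reported bug:
 * at M=32, a query for x=M-1 always returns M. */
static unsigned int mock_pred (const struct mockset *s, unsigned int x)
{
    if (s->M == 32 && x == s->M - 1)
        return s->M;                      /* simulated bug */
    if (x >= s->M)
        return s->M;
    for (unsigned int i = x + 1; i-- > 0; )
        if (mock_has (s, i))
            return i;
    return s->M;
}

/* Fallback order from the commit message: pred(M-1), then succ(M-1),
 * then pred(M-2).  Returns the largest member, or M if the set is empty. */
static unsigned int last_with_workaround (const struct mockset *s)
{
    unsigned int M = s->M;

    assert (M >= 1 && M <= 64);
    unsigned int last = mock_pred (s, M - 1);
    if (last == M)
        last = mock_succ (s, M - 1);      /* is M-1 itself a member? */
    if (last == M && M > 1)
        last = mock_pred (s, M - 2);      /* search below M-1 */
    return last;
}

/* Try every possible "last" id for a given size, plus the empty set.
 * Returns true if every case yields the expected result. */
static bool sweep_ok (unsigned int size)
{
    struct mockset s = { .M = size, .bits = 0 };

    if (last_with_workaround (&s) != size)
        return false;
    for (unsigned int id = 0; id < size; id++) {
        s.bits = ((uint64_t)1 << id) | 1; /* members: 0 and id */
        if (last_with_workaround (&s) != id)
            return false;
    }
    return true;
}
```

Calling `sweep_ok()` with 31, 32, and 33 mirrors the sizes the libidset test is described as covering. When the first query succeeds, the two fallback queries are never made, which matches the commit message's note that non-empty idsets pay no overhead when vebpred() behaves.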
@garlick garlick force-pushed the garlick:idset_workaround branch from 854ea47 to 0730693 Aug 27, 2019
@grondo

Contributor

commented Aug 27, 2019

Whoa! I was worried it was going to be a libveb bug. Impressive sleuthing.

It is really surprising, and a bit worrying, that we didn't even see this issue until now.

Hopefully no more gotchas living in veb.c!

@garlick

Member Author

commented Aug 27, 2019

We used to have a min size of 1K for nodeset_t, so we wouldn't have hit it then. I was tempted to do that again, but it seemed a bit heavy-handed...

@grondo

Contributor

commented Aug 27, 2019

Ah, sorry I didn't know that.

@grondo
grondo approved these changes Aug 27, 2019
Contributor

left a comment

Looks good, thanks. I especially appreciate the TODO test in the veb.c unit tests.
Thanks!

@codecov-io


commented Aug 27, 2019

Codecov Report

Merging #2340 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2340      +/-   ##
==========================================
- Coverage   80.83%   80.82%   -0.02%     
==========================================
  Files         215      215              
  Lines       34236    34240       +4     
==========================================
- Hits        27676    27674       -2     
- Misses       6560     6566       +6
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/common/libidset/idset.c | 96.45% <100%> (+0.1%) | ⬆️ |
| src/modules/connector-local/local.c | 73.26% <0%> (-1.16%) | ⬇️ |
| src/cmd/flux-module.c | 84.19% <0%> (+0.47%) | ⬆️ |

@dongahn

Contributor

commented Aug 27, 2019

Thank you for looking into this @garlick and @grondo!

@grondo grondo merged commit 802c2cc into flux-framework:master Aug 27, 2019
4 checks passed
codecov/patch 100% of diff hit (target 80.83%)
codecov/project Absolute coverage decreased by -0.01% but relative coverage increased by +19.16% compared to c9ef033
continuous-integration/travis-ci/pr The Travis CI build passed
@garlick

Member Author

commented Aug 27, 2019

You are welcome, good catch @dongahn!

@grondo

Contributor

commented Aug 27, 2019

Oops missed that 1367c46 references the wrong issue. Oh well.

@garlick garlick deleted the garlick:idset_workaround branch Aug 27, 2019
@garlick garlick referenced this pull request Sep 30, 2019