Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Master unbusy race #2986

Merged
merged 2 commits into from
Apr 26, 2019
Merged

Conversation

mbgrydeland
Copy link
Contributor

Resolve race on waitinglist rushing between OC_F_BUSY and boc->state …
When an object is ready for delivery, HSH_Unbusy was called before calling
ObjSetState([BOS_STREAM|BOS_FINISHED]). The HSH_Unbusy() call does the
waitinglist rushing, but HSH_Lookup() wanted to look at the boc->state and
if BOS_STREAM had been reached. This could cause requests woken to find
that the stream state still hadn't been reached (ObjSetState still hadn't
executed), and go back on the waitinglist.

To fix this, this patch reverts commit 0375791, and goes back to considering
OC_F_BUSY as the gate keeper for HSH_Lookup. This eliminates the race,
because HSH_Unbusy and HSH_Lookup then uses the same mutex.

That change opens up the possiblity that req code after HSH_Lookup() sees
an object that has not yet reached BOS_STREAM. In order to not have to add
new ObjWaitState() calls (with the additional locking cost that would
bring) to wait for BOS_STREAM, the order of events is changed throughout,
and calls ObjSetState([BOS_STREAM|BOS_FINISHED]) before HSH_Unbusy(). That
way, an object returned from HSH_Lookup() is guaranteed to be at least
BOS_STREAM.

When an object is ready for delivery, HSH_Unbusy was called before calling
ObjSetState([BOS_STREAM|BOS_FINISHED]). The HSH_Unbusy() call does the
waitinglist rushing, but HSH_Lookup() wanted to look at the boc->state and
if BOS_STREAM had been reached. This could cause requests woken to find
that the stream state still hadn't been reached (ObjSetState still hadn't
executed), and go back on the waitinglist.

To fix this, this patch reverts commit
0375791, and goes back to considering
OC_F_BUSY as the gate keeper for HSH_Lookup. This eliminates the race,
because HSH_Unbusy and HSH_Lookup then uses the same mutex.

That change opens up the possiblity that req code after HSH_Lookup() sees
an object that has not yet reached BOS_STREAM. In order to not have to add
new ObjWaitState() calls (with the additional locking cost that would
bring) to wait for BOS_STREAM, the order of events is changed throughout,
and calls ObjSetState([BOS_STREAM|BOS_FINISHED]) before HSH_Unbusy(). That
way, an object returned from HSH_Lookup() is guaranteed to be at least
BOS_STREAM.
This reverts some of the previous attempts to get these test cases stable,
as those attempts actually prevented the testing of the desired code
paths.

Also make the test cases wait until the required requests are on the
waitinglist before continuing.
@bsdphk
Copy link
Contributor

bsdphk commented Apr 26, 2019

Looks good to me and passes mild torture on 32&64 bit FreeBSD.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants