Fix object layout races #285

Merged
merged 23 commits into from Feb 5, 2019

@daumayr
Contributor

commented Jan 14, 2019

This PR fixes issue #85.

The Phaser used in ObjectTransitionSafepoint was a default Phaser, which terminates when onAdvance(...) returns true.
The default onAdvance implementation returns true, and so terminates the Phaser, when the number of registered threads drops to 0. Once the Phaser had terminated, the synchronisation methods had no effect.

By overriding the onAdvance method to return false, we can prevent the Phaser from terminating and preserve the synchronisation.
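
To illustrate the behaviour described above, here is a minimal, self-contained sketch using only `java.util.concurrent.Phaser`. The class names `NonTerminatingPhaserDemo` and `NonTerminatingPhaser` are made up for this example and are not the classes from this PR; it only shows that a default Phaser terminates once its last party deregisters, while one whose `onAdvance` returns false does not.

```java
import java.util.concurrent.Phaser;

public class NonTerminatingPhaserDemo {

  /** Hypothetical helper class: a Phaser that never terminates on advance. */
  static final class NonTerminatingPhaser extends Phaser {
    @Override
    protected boolean onAdvance(final int phase, final int registeredParties) {
      // Returning false keeps the phaser alive, even with zero registered parties.
      return false;
    }
  }

  /** Registers one party, deregisters it again, and reports whether the phaser terminated. */
  static boolean terminatesWhenEmpty(final Phaser phaser) {
    phaser.register();
    phaser.arriveAndDeregister();
    return phaser.isTerminated();
  }

  public static void main(final String[] args) {
    // Default onAdvance returns true when no parties remain, so the phaser terminates.
    System.out.println(terminatesWhenEmpty(new Phaser()));               // prints "true"

    // With the override, the phaser only advances its phase and stays usable.
    System.out.println(terminatesWhenEmpty(new NonTerminatingPhaser())); // prints "false"
  }
}
```

After termination, `register()` on the default Phaser returns a negative value and waiting methods return immediately, which is why threads were no longer held back.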

@daumayr daumayr force-pushed the daumayr:FixObjectLayoutRaces branch from c84b334 to 5f611db Jan 14, 2019

@daumayr

Contributor Author

commented Jan 15, 2019

While tests are running for this branch, we still have an issue with some benchmarks. The Vacation Benchmark still deadlocks, so ideally we don't merge this PR until we fix the Join Prim and similar things.

@smarr

Owner

commented Jan 15, 2019

Ok, will see what I can do about those. Will probably try to do them in a separate PR to see performance issues independently.

@daumayr

Contributor Author

commented Jan 15, 2019

I think we need to exclude those benchmarks (LeeTm and Vacation) until then.

Repository owner deleted a comment from codacy-bot Jan 16, 2019


daumayr added some commits Jan 9, 2019

Fix ObjectTransitionSafepoint
Phaser was able to terminate, and thread registration didn't prevent threads from making progress.

@daumayr daumayr force-pushed the daumayr:FixObjectLayoutRaces branch from f4f6d42 to 1ad49a2 Jan 16, 2019

@smarr
Owner

left a comment

A quick review to give at least some comments.
Sorry, didn't get to more.

smarr added some commits Jan 31, 2019

Adapt to be SafepointPhaser
This change introduces our formatting and names the class SafepointPhaser and sets the package name correctly.

No changes of functionality.

Signed-off-by: Stefan Marr <git@stefan-marr.de>
Fix checkstyle issues in SafepointPhaser
Signed-off-by: Stefan Marr <git@stefan-marr.de>

@smarr smarr force-pushed the daumayr:FixObjectLayoutRaces branch from 1ad49a2 to 439313e Feb 1, 2019

Repository owner deleted a comment from codacy-bot Feb 1, 2019

* if null, denotes noninterruptible wait
* @return current phase
*/
private int internalAwaitAdvance(final int phase, QNode node) {

smarr and others added some commits Jan 31, 2019

Use SafepointPhaser
Co-authored-by: Dominik Aumayr <daumayr162@gmail.com>
Signed-off-by: Stefan Marr <git@stefan-marr.de>
Remove termination by onAdvance from SafepointPhaser
The SafepointPhaser should not terminate, even if no parties are registered on it anymore.

This can happen with our safepoint design, but it is not safe to terminate the phaser. It needs to continue working and accept threads to join again.

Co-authored-by: Dominik Aumayr <daumayr162@gmail.com>
Signed-off-by: Stefan Marr <git@stefan-marr.de>
Simplify SafepointPhaser
- we don’t need support for a phaser tree
- we don’t need different constructors

Signed-off-by: Stefan Marr <git@stefan-marr.de>
Remove further unused elements from SafepointPhaser
Signed-off-by: Stefan Marr <git@stefan-marr.de>

daumayr and others added some commits Jan 14, 2019

Integrate SafepointPhaser closely with Safepoint to avoid races
This change uses the knowledge that the safepoint consists of two phases. This allows us to manage registrations without an extra race, and it allows us to renew the assumption as part of the arrival routine in the phaser.

Signed-off-by: Stefan Marr <git@stefan-marr.de>
Reduce SafepointPhaser documentation to the accurate bits
Signed-off-by: Stefan Marr <git@stefan-marr.de>

@smarr smarr force-pushed the daumayr:FixObjectLayoutRaces branch from 13c4a97 to 2c0f9ba Feb 2, 2019

@smarr smarr added this to the v0.7.0 milestone Feb 2, 2019

@smarr smarr added this to Open Issues in Completeness via automation Feb 2, 2019

@smarr smarr added the bug label Feb 2, 2019

Repository owner deleted a comment from codacy-bot Feb 2, 2019

@smarr

Owner

commented Feb 2, 2019

@daumayr I came up with what I think is a working solution. Could you please review and possibly test this?
I haven't yet thought of a good test to stress the safepoints.
Not sure I can induce repeated safepoints easily.
Would be good to discuss this in detail, to see whether there are any known races left, or any other issue that may make this approach problematic.

The most important bit to discuss is likely 0274546

@daumayr

Contributor Author

commented Feb 4, 2019

(phase & 1) == 0
I was thinking of using something similar when I did some work on the safepoints. When no safepoint is going on and all threads deregister, the phaser advances to the next phase. After that, the even/odd check would no longer work. You might want to change the doArrive method to not advance the phase when there are no threads left.

@smarr

Owner

commented Feb 4, 2019

@daumayr hmmmm, ok. Didn't think of that. Looking at the code: https://github.com/smarr/SOMns/pull/285/files#diff-2b3be932b680eb719ab91a02f3b6f064R137

I suppose, I could simply increment the phase in that case by 2, no?

@smarr

Owner

commented Feb 4, 2019

hmmm, well, possibly, depending on which concrete state we are in. Though, we should never unregister during a safepoint. So, probably increment by 2, plus an assertion that we are never in a phase that indicates a safepoint is going on.

@daumayr

Contributor Author

commented Feb 4, 2019

Increment by two sounds good, that way we still have a phase change and keep the odd/even.
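
A minimal sketch of the parity argument discussed here, not the actual SafepointPhaser code. It assumes, purely for illustration, that an odd phase means a safepoint is in progress and an even phase means none is; the class and method names (`PhaseParityDemo`, `safepointInProgress`, `advance`, `advanceOnEmpty`) are hypothetical.

```java
public final class PhaseParityDemo {

  /** Assumed convention for this sketch: odd phase = safepoint in progress. */
  static boolean safepointInProgress(final int phase) {
    return (phase & 1) != 0;
  }

  /** Regular advance: entering or leaving a safepoint flips the parity. */
  static int advance(final int phase) {
    return phase + 1;
  }

  /**
   * Advance triggered by the last thread deregistering outside a safepoint.
   * Stepping by 2 still changes the phase but preserves its parity.
   */
  static int advanceOnEmpty(final int phase) {
    if (safepointInProgress(phase)) {
      throw new IllegalStateException("threads must not unregister during a safepoint");
    }
    return phase + 2;
  }

  public static void main(final String[] args) {
    int phase = 0;

    phase = advanceOnEmpty(phase);                  // all threads left: 0 -> 2
    System.out.println(safepointInProgress(phase)); // prints "false"

    phase = advance(phase);                         // safepoint starts: 2 -> 3
    System.out.println(safepointInProgress(phase)); // prints "true"

    phase = advance(phase);                         // safepoint ends: 3 -> 4
    System.out.println(safepointInProgress(phase)); // prints "false"
  }
}
```

Had `advanceOnEmpty` stepped by 1 instead, the first check would report a safepoint that never started, which is the failure mode raised above.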

@smarr

Owner

commented Feb 4, 2019

ok, cool.

and one more todo:

  • add unit tests that test the basic logic for correctness, and have asserts for all the parts not covered

SafepointPhase: ensure odd/even for phases also after all threads are reregistered

Signed-off-by: Stefan Marr <git@stefan-marr.de>

@smarr smarr force-pushed the daumayr:FixObjectLayoutRaces branch from 3b34006 to 743481b Feb 4, 2019

Added basic SafepointPhaser unit test
Signed-off-by: Stefan Marr <git@stefan-marr.de>

@smarr smarr force-pushed the daumayr:FixObjectLayoutRaces branch from 743481b to 56e54f9 Feb 4, 2019

@smarr

Owner

commented Feb 4, 2019

@daumayr could you give this a spin with your snapshot tests?
Would also be great if you could review it and look at the added test. Any obvious test we should add?

Repository owner deleted a comment from codacy-bot Feb 4, 2019

@daumayr

Contributor Author

commented Feb 5, 2019

Snapshot Tests running, didn't see any errors.

@smarr smarr merged commit 29a2c9f into smarr:dev Feb 5, 2019

2 checks passed

Codacy/PR Quality Review: Up to standards. A positive pull request.
continuous-integration/travis-ci/pr: The Travis CI build passed

Completeness automation moved this from Open Issues to Completed Feb 5, 2019

@daumayr daumayr deleted the daumayr:FixObjectLayoutRaces branch Feb 5, 2019
