Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for tracking thread causality. #38

Merged
merged 1 commit into from Jul 2, 2021

Conversation

jorajeev
Copy link
Member

This change adds support for tracking causality among thread operations.
We do this in the standard way, by associating a vector clock with each thread.
The i'th element of a thread's vector clock denotes its knowledge of the clock
of thread i. Clocks are partially ordered using a pointwise ordering <.
The main property we want is that for any pair of events p, q:
(p causally precedes q) iff (clock at p < clock at q).

We update the code for thread spawn and join, as well as the various synchronization
objects (Atomics, Barriers, CondVars, Mutexes, RwLocks and mpsc channels) to track
causality by updating vector clocks appropriately.

This change does not currently properly track causality for async interactions;
those will be done in a subsequent PR.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Member

@jamesbornholt jamesbornholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super cool. A bunch of minor comments. I think my only high-level comment is whether there's a way we can make the API less clunky by doing the clock maintenance inside block/unblock?

src/runtime/task/mod.rs Outdated Show resolved Hide resolved
src/runtime/task/mod.rs Outdated Show resolved Hide resolved
src/runtime/execution.rs Show resolved Hide resolved
src/runtime/task/mod.rs Outdated Show resolved Hide resolved
src/runtime/task/mod.rs Outdated Show resolved Hide resolved
tests/basic/clocks.rs Show resolved Hide resolved
tests/basic/clocks.rs Outdated Show resolved Hide resolved
tests/basic/mutex.rs Outdated Show resolved Hide resolved
tests/basic/pct.rs Outdated Show resolved Hide resolved
tests/basic/thread.rs Outdated Show resolved Hide resolved
@jorajeev
Copy link
Member Author

jorajeev commented Jul 1, 2021

Super cool. A bunch of minor comments. I think my only high-level comment is whether there's a way we can make the API less clunky by doing the clock maintenance inside block/unblock?

I think hijacking block/unblock won't work for the atomic registers, where the reader has to inherit the clock from the last writer, but there is no blocking. We could try piggybacking clocks on yield's, but since we're likely to remove some yields in the future (since we know we have some redundant ones), I'm worried that change may break causality. So I propose we live with this clunkiness for now, and revisit this later (maybe when we add causality to async). I added a TODO in the spawn method to record this.

Copy link
Member

@jamesbornholt jamesbornholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more nits.

src/sync/atomic/mod.rs Outdated Show resolved Hide resolved
Comment on lines 178 to 179
self.inhale_clock();
self.exhale_clock();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these happen in the opposite order to in fetch_update?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. It should be the other way around. I also realized fetch_update was wrong -- it was not handling the error case properly. :-(

Fixed both and added comments to explain.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also added a new test clock_fetch_update to reveal the fetch_update bug. For the other one, I think the way I had it wasn't a bug, but reordering the steps is more consistent with translating the swap to load/store.

use crate::runtime::thread;
use crate::runtime::thread::continuation::{ContinuationPool, PooledContinuation};
use bitvec::prelude::*;
use bitvec::vec::BitVec;
use futures::{task::Waker, Future};
use std::cell::RefCell;
use std::cmp::PartialOrd;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Why doesn't clippy complain?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, because it's test code. Guess I should run it with --cfg test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, it's cargo clippy --tests. TIL.

src/runtime/task/mod.rs Outdated Show resolved Hide resolved
src/runtime/task/clock.rs Outdated Show resolved Hide resolved
src/runtime/task/clock.rs Outdated Show resolved Hide resolved
src/runtime/task/clock.rs Outdated Show resolved Hide resolved
src/thread.rs Outdated Show resolved Hide resolved
This change adds support for tracking causality among thread operations.
We do this in the standard way, by associating a vector clock with each thread.
The i'th element of a thread's vector clock denotes its knowledge of the clock
of thread i. Clocks are partially ordered using a pointwise ordering <.
The main property we want is that for any pair of events p, q:
    (p causally precedes q)  iff  (clock at p  <  clock at q).

We update the code for thread spawn and join, as well as the various synchronization
objects (Atomics, Barriers, CondVars, Mutexes, RwLocks and mpsc channels) to track
causality by updating vector clocks appropriately.

This change does not currently properly track causality for async interactions;
those will be done in a subsequent PR.
@jamesbornholt jamesbornholt merged commit 9ddbcd8 into awslabs:main Jul 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants