
rt: cleanup and simplify scheduler (scheduler v2.5) #2273

Merged 68 commits into master from scheduler-2-5 on Mar 5, 2020
Conversation

@carllerche (Member) commented Feb 25, 2020:

A refactor of the scheduler internals focusing on simplifying and
reducing unsafety. There are no fundamental logic changes. That said,
I recommend reading the new files themselves and not the diff.

  • The state transitions of the core task component are refined and
    reduced (see the illustrative sketch after this list).
  • basic_scheduler has (almost) all unsafety removed.
  • local_set has most unsafety removed.
  • threaded_scheduler limits unsafety to its queue implementation.
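
To make the first bullet concrete, the sketch below shows the general shape of an atomic-bitfield task state machine in which every transition is a single compare-exchange loop. The bit names and transition rules here are purely illustrative and are not tokio's actual definitions.

use std::sync::atomic::{AtomicUsize, Ordering::{AcqRel, Acquire}};

// Illustrative state bits; tokio's real task state uses its own layout.
const RUNNING: usize = 0b001;
const COMPLETE: usize = 0b010;
const NOTIFIED: usize = 0b100;

struct State(AtomicUsize);

impl State {
    /// Attempt the "notified, idle" -> "running" transition.
    /// Fails if the task is already being polled or has completed.
    fn transition_to_running(&self) -> bool {
        let mut curr = self.0.load(Acquire);
        loop {
            if curr & (RUNNING | COMPLETE) != 0 {
                return false;
            }
            let next = (curr | RUNNING) & !NOTIFIED;
            match self.0.compare_exchange(curr, next, AcqRel, Acquire) {
                Ok(_) => return true,
                Err(actual) => curr = actual,
            }
        }
    }
}

Fewer reachable states and single-CAS transitions of this kind are the sort of refinement that makes the remaining unsafe code easier to audit.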

@carllerche changed the title from "rt: cleanup and simplify scheduler" to "rt: cleanup and simplify scheduler (scheduler v2.5)" on Feb 25, 2020
@carllerche (Member, Author) commented:

cc @tmiasko, this is the cleanup I mentioned before. If you can help w/ CI configuration, I would be happy to set this up w/ TSAN as long as we can remove false positives. You mentioned you had a strategy to do so?

@carllerche (Member, Author) commented:

And I only ran with MAX_PREEMPTIONS=1 locally before the final push because it takes too long 😢
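
(For context: this presumably refers to loom's preemption bound, LOOM_MAX_PREEMPTIONS in current loom, which caps how many forced preemptions any single execution may contain; lowering it prunes the interleavings the model checker explores, trading exhaustiveness for runtime. Below is a minimal sketch assuming the loom crate's model::Builder API; the test body is illustrative, not an actual tokio test.)

// Run a loom model with a bounded number of preemptions.
#[cfg(loom)]
#[test]
fn bounded_exploration() {
    use loom::sync::atomic::{AtomicUsize, Ordering};
    use loom::sync::Arc;
    use loom::thread;

    let mut builder = loom::model::Builder::new();
    // Roughly what running with a max-preemptions setting of 1 does:
    // only schedules with at most one forced preemption are explored.
    builder.preemption_bound = Some(1);

    builder.check(|| {
        let v = Arc::new(AtomicUsize::new(0));
        let v2 = v.clone();

        let th = thread::spawn(move || {
            v2.fetch_add(1, Ordering::SeqCst);
        });
        v.fetch_add(1, Ordering::SeqCst);
        th.join().unwrap();

        assert_eq!(2, v.load(Ordering::SeqCst));
    });
}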

@tmiasko (Contributor) commented Feb 25, 2020:

Regarding the thread sanitizer, the first step would be to remove the uses of
atomic fences in Tokio. Now there is only one left so this shouldn't be an
issue. Then I use thread sanitizer as follows:

# Install the source code of the standard library.
$ rustup component add rust-src

# Replace fences with atomic loads in Arc implementation.
$ patch ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/sync.rs <<EOF
405c405
<         atomic::fence(Acquire);
---
>         this.inner().strong.load(Acquire);
742c742
<             atomic::fence(Acquire);
---
>             self.inner().weak.load(Acquire);
1246c1246
<         atomic::fence(Acquire);
---
>         self.inner().strong.load(Acquire);
1704c1704
<             atomic::fence(Acquire);
---
>             inner.weak.load(Acquire);
EOF

# Enable thread sanitizer.
$ export RUSTFLAGS=-Zsanitizer=thread RUSTDOCFLAGS=-Zsanitizer=thread

# Run tests with instrumented standard library.
$ cargo -Zbuild-std test --workspace --target x86_64-unknown-linux-gnu
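
For reference, a minimal sketch of the rewrite the patch above performs (not the actual liballoc code): ThreadSanitizer does not understand standalone fences, so the Acquire fence that pairs with the Release reference-count decrement is replaced by an Acquire load of the same atomic, which establishes the same happens-before edge for this pattern.

use std::sync::atomic::{AtomicUsize, Ordering::{Acquire, Release}};

// Illustrative refcount; the real code lives in Arc's Drop implementations.
struct RefCount {
    strong: AtomicUsize,
}

impl RefCount {
    /// Returns true when the last reference was just released and the
    /// payload may be dropped.
    fn release(&self) -> bool {
        if self.strong.fetch_sub(1, Release) != 1 {
            return false;
        }
        // Original: std::sync::atomic::fence(Acquire);
        // TSAN-friendly replacement, as in the patch above:
        let _ = self.strong.load(Acquire);
        true
    }
}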

BTW, it looks like there is a data race on the scheduler field: the task is
bound to the scheduler only after the future returns Pending, but by that time
it is already possible to use the associated waker, which reads the same field.

@carllerche (Member, Author) commented:

@tmiasko Yes re: race. I failed to run the full suite locally before pushing after making a tweak... working on a fix now.


/// Create a new local run-queue
pub(super) fn local<T: 'static>() -> (Steal<T>, Local<T>) {
debug_assert!(LOCAL_QUEUE_CAPACITY >= 2 && LOCAL_QUEUE_CAPACITY.is_power_of_two());
Sponsor Contributor:
Is there a way for us to make this a static assertion instead?

@carllerche (Member, Author):
do you have thoughts on how?

@carllerche (Member, Author):
Seems like hax 😆 you want me to lift that macro?
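
For reference, a hedged sketch of how the check could be expressed at compile time: assert! in const items was stabilized in Rust 1.57, well after this PR, so at the time a macro (e.g. from the static_assertions crate) would have been the usual route. The capacity value below is illustrative.

// Compile-time version of the debug_assert! above (requires Rust 1.57+).
const LOCAL_QUEUE_CAPACITY: usize = 256; // illustrative value

const _: () = assert!(
    LOCAL_QUEUE_CAPACITY >= 2 && LOCAL_QUEUE_CAPACITY.is_power_of_two(),
    "LOCAL_QUEUE_CAPACITY must be a power of two and at least 2",
);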

@hawkw (Member) left a comment:
i would still like to spend more time reading over this change, but i don't see anything that i want to block merging it over.

tokio/src/runtime/task/core.rs:
/// requires ensuring mutual exclusion between any concurrent thread that
/// might modify the future or output field.
///
/// The mutual exclusion is implemented by `Harness` and the `Lifecycle`
Member:
nit: do we still call this lifecycle, or is it called stage now?

@carllerche merged commit a78b1c6 into master on Mar 5, 2020
sthagen added a commit to sthagen/tokio-rs-tokio referencing this pull request on Mar 5, 2020: "rt: cleanup and simplify scheduler (scheduler v2.5) (tokio-rs#2273)"
@Darksonn deleted the scheduler-2-5 branch on April 12, 2020