Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task wakes too often in release mode when used with async_std #133

Open
timotheyca opened this issue Jun 13, 2024 · 11 comments
Open

Task wakes too often in release mode when used with async_std #133

timotheyca opened this issue Jun 13, 2024 · 11 comments

Comments

@timotheyca
Copy link
Contributor

This may cause a CPU core to be constantly at 100% after a client connects and sends nothing:

#[async_std::main]
async fn main() -> async_tungstenite::tungstenite::Result<()> {
    let listener = async_std::net::TcpListener::bind("127.0.0.1:8080").await?;
    let (stream, _) = listener.accept().await?;
    async_tungstenite::accept_async(stream).await?;
    Ok(())
}

(the same also happens after the handshake when the client is idle; using only accept_async as the example, because shorter)

It could be an issue with async-std but I haven't yet been able to replicate it without async-tungstenite (both codebases are still a bit hard for me to navigate).

[dependencies]
async-std = { version = "1.12.0", features = ["attributes"] }
async-tungstenite = { version = "0.26.1", features = ["async-std-runtime"] }
@sdroege
Copy link
Owner

sdroege commented Jun 14, 2024

If you do the same with tokio, does it behave correctly?

@timotheyca
Copy link
Contributor Author

timotheyca commented Jun 14, 2024

I've tested it with async-net (versions 1 and 2), and that worked okay;
will test with tokio now;

ed.: tokio is fine

@timotheyca
Copy link
Contributor Author

This, to me, looks a lot like some sort of soft race condition, because it goes away if I add some overhead (or run the thing in debug) or do anything that causes poll be called not as soon as it wakes, and happens more often if I add this at the very start of main:

async_std::task::spawn(futures_util::future::pending::<()>());

@timotheyca
Copy link
Contributor Author

Update: this seems to affect anything that uses async_io::Async::<std::net::TcpStream>::poll_read directly. Happens both on versions 1 and 2

@timotheyca
Copy link
Contributor Author

I found the reason (async_tungstenite::compat):

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};

and the "fix" (for now I can't figure out much better):

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
}
.clone();

why:

  • Waker returned by task::waker_ref changes its vtable on clone
  • async-io's Source, just like AtomicWaker, checks Waker equality by Waker::will_wake which compares both the pointer and the vtable
  • async-io's Source wakes the older Waker on change

as for why async-net works: I have no clue

@timotheyca
Copy link
Contributor Author

timotheyca commented Jun 14, 2024

Okay, so I've added a print:

let waker = match kind {
    ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
    ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),
};
eprintln!("{waker:?} {:?}", waker.clone());

In debug builds (some parts omitted):

... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ... data: 0x563a8f431fa0, vtable: 0x563a8e701368 ...

In release builds on 1.78/1.79:

... data: 0x563ac7b8cf30, vtable: 0x563ac5b80238 ... data: 0x563ac7b8cf30, vtable: 0x563ac5b80178 ...

???????

compiler bug?

Seems like it's making the same vtable be at different addresses.

ed.: it was indeed a compiler bug: rust-lang/futures-rs#2829 (comment)

@timotheyca
Copy link
Contributor Author

There's been a change in how Waker works. std::task::Wake has adapted to it and futures::task::ArcWake has not. I've checked with both std::task::Wake and with a patched version of futures::task::ArcWake, both work fine. So, ig, if there's a need to mitigate this without waiting for futures-task to be changed, switching over std::task::Wake is a possibility at the cost of some extra atomic operations (std::task doesn't provide an equivalent to waker_ref, as far as I see).

@sdroege
Copy link
Owner

sdroege commented Jun 15, 2024

I don't have time to look into this in more detail until some time next week, but this sounds like a bug in the code doing the comparison. If multiple codegen units are involved, it's not guaranteed that the vtable is always the same for the same object.

That's why e.g. https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#method.ptr_eq exists.

@timotheyca
Copy link
Contributor Author

I'll list all the parts of code I found relevant so far in one place, and with less guesses

async-tungstenite creating wakers that change their vtable on clone:

ContextWaker::Read => task::waker_ref(&self.read_waker_proxy),
ContextWaker::Write => task::waker_ref(&self.write_waker_proxy),

async-std directly uses async_io::Async:
https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L5
https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L50
https://github.com/async-rs/async-std/blob/8947d8e03c4c267fdd3828e07368cefc7d39b002/src/net/tcp/stream.rs#L308

async-io checking Wakers for equality and waking the old one on change:
https://github.com/smol-rs/async-io/blob/master/src/reactor.rs#L456-L461

futures-task creates vtables for original and cloned (note no #[inline(always)] on clone_arc_raw):
https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker_ref.rs#L64
https://github.com/rust-lang/futures-rs/blob/master/futures-task/src/waker.rs#L38-L42

Patch to alloc::task in Rust that uses #[inline(always)] to avoid the split across CGUs:
rust-lang/rust#121622

async-tungstenite can mitigate this by using std::task::Wake instead of futures_task::ArcWake or cloning the waker, both options involve extra Arc clones to work (&Arc<impl std::task::Wake> can't be directly converted into impl Deref<Target = Waker> like futures_task::waker_ref does).

futures-task can mitigate this by putting #[inline(always)] on clone_arc_raw.

@sdroege
Copy link
Owner

sdroege commented Jun 17, 2024

futures-task can mitigate this by putting #[inline(always)] on clone_arc_raw.

That seems like a solution that will just fail again some time in the future. This sounds to me like will_wake() should actually only check the pointer for equality and not the vtable.

async-tungstenite can mitigate this by using std::task::Wake instead of futures_task::ArcWake or cloning the waker

Doing this as a workaround for now seems acceptable. Do you want to provide a PR for that?

@sdroege
Copy link
Owner

sdroege commented Jun 17, 2024

Intentionally keeping this open here so there's a reminder to revert the change once that's safe to do again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants