Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a configuration option to skip the lifo_slot optimization #4051

Closed
wants to merge 3 commits into from

Conversation

guswynn
Copy link
Contributor

@guswynn guswynn commented Aug 20, 2021

Motivation

In our (my coworkers and I) RPC stack, our server overload detection relies on being able to detect when request processing throughput is stalled. Currently, we derive this metric interactively by injecting futures (via spawn() in our case) into the Runtime and measures how long they took to begin execution. (this is a vague overview and I can go into more detail if needed)

However, this flow relies on 'mostly-FIFO' ordering of request handling.

While tokio does not explicitly provide a FIFO execution guarantee* for spawn()'d futures, the lifo_slot optimization path results in pathological behavior for this workload.

*I stand to be corrected, but there are 2 parts of the runtime that are not strict-fifo, and they could be altered to be fifo to varying degrees (though, if this PR is accepted, I would be interested in exploring both more concretely):

  1. when a Local queue is filled, work is pulled off the front half of that queue and added to the global Inject queue. I can imagine a scheme where this is changed to the back half or something, but upon reading the queue module, I think this would be complex to implement
  2. tokio's runtime is work-stealing, which I think regardless of implementation can't be made loosely fifo. I was thinking rayon implemented this with spawn_fifo, but that appears to be per-thread fifo? I admittedly, don't know what I'm talking about here.

We also believe that the LIFO optimization creates surprising behavior in systems that have similar expectations of 'mostly-FIFO-ness'. Given that the LIFO optimization is an optimization and not key to the functioning of the runtime, we believe it to be reasonable to empower users who find this behavior pathological for their workflows to disable it at runtime.

Solution

Change the runtime to allow Core's to skip the lifo_slot optimization, and allow this to be configured in the runtime Builder

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime labels Aug 21, 2021
@Darksonn
Copy link
Contributor

It sounds like adding an on_thread_park option to the builder as described in #3975 would be a better way of solving your problem?

@guswynn
Copy link
Contributor Author

guswynn commented Aug 23, 2021

@Darksonn is the idea that the closure in on_thread_park would spawn a new task, and would be called each time a worker thread runs out of work? I am not sure I understand that issue

@Darksonn
Copy link
Contributor

The closure would be called every time the runtime is about to park. Whether or not you spawn a task in the closure is up to you. The reason I ask is because being able to know whether the runtime has recently parked could be one way to figure out how loaded the runtime is.

@Matthias247
Copy link
Contributor

Currently, we derive this metric interactively by injecting futures (via spawn() in our case) into the Runtime and measures how long they took to begin execution.

Won't the work on runtime metrics (#3845, #4043 , #4074 ) be a reasonable way to do this?

And even without it: A yield() in a task is guaranteed not to use LIFO and can be [mis]used to measure executor latency.

@Darksonn
Copy link
Contributor

I think we would rather avoid the complexity of making this configurable for now. It sounds like #4070 or use of tokio::task::yield_now() could achieve the same goal.

@Darksonn Darksonn closed this Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants