New scheduler resilient to blocking #631
Conversation
@stjepang CI seems to be failing due to missing imports. Wondering if something in a rebase maybe went wrong?
Out of curiosity: what are the thresholds/rules for when it adds a new thread or removes one, and when adding or removing a thread, how does it distribute the existing tasks and event sources (fds, etc.) among the threads?
let reactor = &REACTOR;
let mut events = mio::Events::with_capacity(1000);

loop {
    // Block on the poller until at least one new event comes in.
Or until timeout.
I'd love an overview of how "this executor detects blocking tasks" works. Do tasks need to identify themselves as containing blocking operations or is there something smarter going on?
static YIELD_NOW: Cell<bool> = Cell::new(false);
}

/// Scheduler state.
I would love a brief overview of the terminology here. Words like "machine" and "processor" mean specific things in this context, and understanding them is necessary to make sense of what's going on.
It seems like a "machine" is somewhere between a thread and a task. Is this like the old blocking pool? And a "processor" exists one per core, so I assume it's just the general worker queue?
If you use this scheduler with a blocking database library like diesel, how well does it hold up? The blog post's message is essentially "stop worrying about blocking".
@overdrivenpotato Because it still gives you evented IO towards the outside before you start querying the DB, at which point you usually no longer have a client (potentially slowly) sending data.
@skade I see, thanks for the response. However, this doesn't alleviate the one-thread-per-connection problem. I think an example will clear up what I'm getting at; I'm running into a panic with the following:

use async_std::task;
use std::{thread, time::Duration};

const THREADS: usize = 10_000;

async fn make_response() {
    // Some long-running synchronous task...
    thread::sleep(Duration::from_millis(5000));
}

async fn async_main() {
    // Simulate 10k *successful* requests - we only need a response.
    for _ in 0..THREADS {
        task::spawn(make_response());
    }

    // Wait for all tasks to spawn & finish.
    thread::sleep(Duration::from_millis(30000));
}

fn main() {
    task::block_on(async_main());
    println!("Done.");
}
However, when I change the blocking thread::sleep to a non-blocking task::sleep, the example runs fine.
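A minimal sketch of that non-blocking variant (not the commenter's exact code), assuming async_std::task::sleep, which suspends the task instead of parking an OS thread:

use async_std::task;
use std::time::Duration;

const TASKS: usize = 10_000;

async fn make_response() {
    // Non-blocking sleep: the task is suspended and the worker thread
    // stays free to poll other tasks in the meantime.
    task::sleep(Duration::from_millis(5000)).await;
}

fn main() {
    task::block_on(async {
        let handles: Vec<_> = (0..TASKS).map(|_| task::spawn(make_response())).collect();
        // Await every task instead of sleeping for a fixed amount of time.
        for handle in handles {
            handle.await;
        }
    });
    println!("Done.");
}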
@overdrivenpotato The default stack size for a thread on unix is 2 MB, but it is often set to 8 MB. You're forcing 10,000 threads to be spawned concurrently, which means that unless you have a very large amount of memory available, spawning will fail. Tasks on the other hand are stackless, which means that by default they only take up their own allocation's worth of memory, making it possible to spawn much higher numbers. That is likely why you're seeing tasks succeed but threads fail. However, conversation has been ongoing about introducing ways to place an upper limit on the number of threads spawned in scenarios such as these.
@yoshuawuyts So if the new runtime is expected to choke on a bare C10k-style example, why does the announcement blog post claim that blocking is no longer something to worry about?

You can see in my example that this PR combined with simple blocking calls does not scale to C10k. Specifically, you'll run into the same issue when using synchronous libraries like diesel. Is it not misleading to make such claims? FWIW, the PR as a standalone feature is great - it is a much better fallback than blocking the entire executor. However, I'm confused about the messaging and the ecosystem recommendations.
@overdrivenpotato But this is not the C10k problem. The C10k problem is handling 10,000 clients at the same time and keeping their sockets open. It is very usual to do async IO on the edges (because your client might be slow) and then spin up a thread for request handling. The library will not spawn 10,000 threads for the network sockets. It will even scale down to 1 thread for 10,000 sleeping sockets, which no other runtime does. Again: what you present is a "sleep in 10,000 threads" problem and has no relation to C10k. Forcibly exhausting a system's resource pool is not a very good example of anything; every practical system can be exhausted if you aim at the right spot internally.
@skade I am only using sleeping threads for simplicity of the example. With that said, it is entirely realistic to have 10,000 concurrent requests going to and from the database layer. Here is a pseudocode example:

async fn make_response() -> u64 {
    diesel::sql_query("SELECT COUNT(*) FROM users")
        .first(CONNECTION)
        .unwrap()
}

This is a realistic and concrete C10k example. As with the previous example, it could be expanded - this time with a real SQL query for a real application - but that is entirely orthogonal to the problem. The example could also call one of several blocking functions based on the HTTP request route; again, this is not the point of the example. The point is that this kind of blocking database call is exactly what real applications issue at scale. Help in understanding this situation is appreciated. I am sure others will also find this thread useful.
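For context, a minimal sketch of the explicit-offload alternative under discussion, assuming async_std::task::spawn_blocking (behind the unstable feature at the time) and a hypothetical blocking_count_users function standing in for the diesel query:

use async_std::task;
use std::time::Duration;

// Hypothetical stand-in for the blocking diesel query above.
fn blocking_count_users() -> u64 {
    std::thread::sleep(Duration::from_millis(5));
    42
}

async fn make_response() -> u64 {
    // Explicitly move the blocking work onto the blocking thread pool,
    // keeping the async worker threads free to poll other tasks.
    task::spawn_blocking(blocking_count_users).await
}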
@overdrivenpotato You are still oversimplifying (and I don't want to sound hostile here), and I think that makes this discussion hard. First of all, this problem would exist with other approaches as well. Particularly, with our system, you need to first hit the case where all 10,000 queries are active at the same time and don't return within a certain time frame. As a more complex illustration:

async fn handle_request() {
    // some socket
    socket.read_to_string(&mut my_request).await;
    let result = do_database_request(&my_request); // may block
    socket.write_bytes(&result).await;
}

This task mixes one potentially blocking and two non-blocking stretches. The non-blocking stretches lay themselves to sleep in the classic fashion. Assuming that the database request is local and takes ~5ms and the client is a mobile phone which takes ~100ms per request, the chance of 10,000 requests being in the database request phase at the same time is very low and needs vastly more than 10,000 clients. Plus: a local database access can regularly be so fast that it runs below the time that is considered a "blocked" task. So this is already a statistical process. Note that this kind of short blocking (our current threshold is 10ms) is very usual even in tasks, e.g. for some computation or even memory allocation/reclamation.

Now, consider this approach:

async fn handle_request() {
    // some socket
    socket.read_to_string(&mut my_request).await;
    let result = do_database_request_with_caching(&my_request); // does a db request, statistically, in 33% of all cases
    socket.write_bytes(&result).await;
}

The whole thing becomes even harder to predict. If you were to wrap every such call for offloading up front, you would pay that cost even in the cases that never block. And this is the statistical process that makes an adaptive runtime so appealing: whether a task spends too much time or not is runtime behaviour. The thing that many people overlook is that the adaptive behaviour goes both ways: it will not throw additional resources at the problem if a task does not take much time, be it blocking or not. The ceiling is really high there.

And this is what we mean by "you shouldn't worry about blocking": write your code; the runtime supports you and often does the right thing in a dynamic setting. Figuring out whether it does exactly the right thing requires tracing an actual system, not uniformly spawned tasks that operate on one edge of the spectrum. When you hit limits, start optimising and potentially use other approaches, e.g. a fixed-size threadpool with a request queue for database request handling (see the sketch after this comment).

This does not mean "you shouldn't think about the structure of your system"; in particular, you should think about capacities. But the hyperfocus on blocking clouds our view there, IMHO. As an example: we have tons of non-blocking, fully-async libraries that have no mitigations against slowloris.

I stand by that statement, but yes, we intend to write more documentation specifically around the construction of systems (and also provide the appropriate tracing features to figure out actual problems in the future, if possible). Given no system at hand, and the whole thing depending on hundreds of variables, there's not much we can do except build it and find out whether we have actual problems. I hope that helps your evaluation; I'd be happy to discuss any more complex case.

Addition: you can also think of more complex cases, where you read off a blocking socket (file) to a non-blocking socket (network) in a loop and it works fine, because the blocking call is most likely, by any measure, very fast.
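To make the fixed-size threadpool suggestion concrete, here is a rough, std-only sketch of a worker pool fed by a bounded request queue. Everything in it (run_query, Request, the pool and queue sizes) is an illustrative placeholder rather than code from this PR:

use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for a blocking database call.
fn run_query(query: String) -> u64 {
    thread::sleep(Duration::from_millis(5));
    query.len() as u64
}

struct Request {
    query: String,
    respond_to: mpsc::Sender<u64>,
}

fn main() {
    const POOL_SIZE: usize = 8;
    const QUEUE_DEPTH: usize = 256;

    // Bounded queue: once it is full, senders block, providing back-pressure
    // instead of spawning ever more threads.
    let (queue_tx, queue_rx) = mpsc::sync_channel::<Request>(QUEUE_DEPTH);
    let queue_rx = Arc::new(Mutex::new(queue_rx));

    // Fixed-size worker pool that owns all blocking database work.
    for _ in 0..POOL_SIZE {
        let queue_rx = Arc::clone(&queue_rx);
        thread::spawn(move || loop {
            // The lock guard around the receiver is dropped before the query runs.
            let request = match queue_rx.lock().unwrap().recv() {
                Ok(request) => request,
                Err(_) => break, // queue closed, shut this worker down
            };
            let result = run_query(request.query);
            let _ = request.respond_to.send(result);
        });
    }

    // Example usage from the request-handling side.
    let (reply_tx, reply_rx) = mpsc::channel();
    queue_tx
        .send(Request {
            query: "SELECT COUNT(*) FROM users".to_string(),
            respond_to: reply_tx,
        })
        .unwrap();
    println!("result = {}", reply_rx.recv().unwrap());
}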
@skade Thanks for the reply; it clarifies your intentions with the new runtime. Have you also considered drafting an RFC to relax the documented contract of Future::poll?

According to the API documentation, deliberately crafting a future that blocks goes against that contract. Personally, these two points are important. Rust is a niche language as it stands, and any source of confusion makes it much more difficult to onboard people. It is even harder if the general Rust community reads this feature as making blocking "a thing of the past" based on an official blog post.
@overdrivenpotato I consider the danger of async-std panicking to be on the level of the allocator failing (which panics or segfaults, depending on the platform). In particular, a library panicking according to its definition is correct - it has reached a state where it cannot continue. For completeness' sake (as everyone loves to quote just the first part, I wonder why), the full contract of Future::poll also says that work known to take a while should be offloaded to a thread pool so that poll can return quickly.

I have considered proposing a rephrasing, for a few reasons. For example, this:

async fn image_manipulation() {
    let image = load_image().await;
    transform_image(image)
}

breaks this contract, depending on what "quickly" means. 1ms? 2ms? 5ms? Below a certain frequency? So, this forces spawning. Taken to the extreme, it forces spawning to a thread pool around every piece of non-awaited code (see the sketch below)!
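To illustrate the "forces spawning" point, satisfying a strict reading of that contract would mean rewriting the example roughly like this (a sketch using async_std::task::spawn_blocking, with hypothetical stubs for load_image and transform_image):

use async_std::task;

// Hypothetical stand-ins for the functions in the example above.
struct Image;
async fn load_image() -> Image { Image }
fn transform_image(image: Image) -> Image { image }

async fn image_manipulation() -> Image {
    let image = load_image().await;
    // Even a short, purely CPU-bound call gets pushed onto the blocking
    // thread pool, just in case "quickly" is interpreted strictly.
    task::spawn_blocking(move || transform_image(image)).await
}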
And that's particularly one pattern we've seen a lot: except in very constrained environments where work packages are very predictable, this contract is hard to fulfil or quantify. We already have the case where Futures are reactor-bound, so I don't quite understand that point.

There's one more thing I want to address: there is a difference between application code and library code. If you write a non-blocking library, it should not block. That also makes it compatible with all executors. If you write application-level code, you should buy into your runtime's semantics, use one that fits your application, and not care about general applicability to multiple runtime systems. This is a problem similar to how database libraries should be abstract over the store, while particular applications may well buy into specific store features.
@skade Sure, panicking is correct, but this is a micro-optimization that can break down quickly. There is a real difference in messaging between the blog post's blanket "stop worrying about blocking" framing and the much more nuanced statements you are making in this thread. Why insist on such highly opinionated and generalized phrases when it really depends on the application requirements? Sure, panicking on too many threads is a correct thing to do, but it is not the same class of error as memory exhaustion.
@skade Upon further experimentation, I am running into complete event loop blocking with this branch. We have a custom networking library with an entry function signature that looks similar to this:

async fn drive<T: Source>(source: T) -> Result<(), Error> { ... }

We have a custom scheduler that multiplexes multiple sub-futures onto a single future. It is worth noting that this bypasses the task system.

EDIT: To be clear, I fully expect this behaviour. This is for the sake of a real example that breaks down.
@overdrivenpotato You can actually submit PRs to our blog, the link is at the bottom ;). It leads to https://github.com/async-rs/async-std, and the blog post source is https://github.com/async-rs/async.rs/blob/master/source/blog/2019-12-16-stop-worrying-about-blocking-the-new-async-std-runtime.html.md.

I'm trying to figure out why you multiplex on a single Future. Futures and tasks are units of concurrency, not of parallelism, so asking for parallelism there is hard. Using block_on there is definitely not the right approach, but that's true with or without the new scheduler.
@skade This is actually necessary to support structured concurrency. Freely spawning can lead to several exhaustion issues. We also have some additional requirements that are not worth getting into right now. Anyways, thanks for the help, but we will be sticking with our current approach.
Please consider this counter-post before deprecating spawn_blocking.
From what I can tell, the real answer is to use an asynchronous semaphore to limit the number of blocking requests made at any given time.
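A rough sketch of that idea, assuming the adaptive scheduler from this PR and using the async-channel crate's bounded channel as a makeshift counting semaphore (async-std has no semaphore type of its own; all names here are illustrative):

use async_std::task;
use std::time::Duration;

const MAX_IN_FLIGHT: usize = 100;

// Stand-in for a synchronous database call.
fn blocking_query() -> u64 {
    std::thread::sleep(Duration::from_millis(5));
    42
}

fn main() {
    // The bounded channel holds at most MAX_IN_FLIGHT permits, so acquiring
    // one (sending) suspends the task once the limit is reached.
    let (permit_tx, permit_rx) = async_channel::bounded::<()>(MAX_IN_FLIGHT);

    task::block_on(async {
        let mut handles = Vec::new();
        for _ in 0..10_000 {
            let permit_tx = permit_tx.clone();
            let permit_rx = permit_rx.clone();
            handles.push(task::spawn(async move {
                // Acquire a permit; suspends while MAX_IN_FLIGHT are taken.
                permit_tx.send(()).await.unwrap();
                let result = blocking_query();
                // Release the permit so another task may proceed.
                permit_rx.recv().await.unwrap();
                result
            }));
        }
        for handle in handles {
            handle.await;
        }
    });
}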
Sorry, I haven't had time to actually read the code, but perhaps to alleviate the issue @overdrivenpotato is bringing up, one could introduce some kind of "threadpool" concept where CPU-consuming tasks are automatically offloaded, with lower/upper bounds on the thread count, thereby eliminating the issue where we exhaust real OS threads?

Edit: Oh well, I've just realized that this might not be possible, because from userspace one cannot "preempt" a running task, so it can't be migrated to the threadpool; basically, this issue is inherent to the nature of detecting long-running/blocking computations entirely from userspace. Not sure how Go solves this, though (I've heard something about compiler-inserted preemption points / library-assisted preemption points / unix signals?).
Disclosure: I'm writing my own post related to blocking operations in the Rust async runtimes. One humble observation, forgive me, without having first-hand experience with this branch of async-std: could end-users use an async-compatible semaphore to limit concurrency around blocking operations themselves? That's an approach that has been suggested with tokio, at least. I don't know enough yet to know if that is a good or bad suggestion, or entirely how applicable it is here. Thanks.
@serzhiio could it be that you have a lot of expensive computation going on, which in turn requires a fair number of threads? Each thread usually comes with about an 8 MB static overhead (its own stack), so if a lot of threads are being spawned, it seems obvious that more memory would be consumed. This is the tradeoff for overall improved latency.
@yoshuawuyts
@dekellum Adding such a semaphore at the application level should do the trick.
I agree with the suggestion to use it at the application level, @niklasf. That was missing/missed here and in the "Stop worrying about blocking" blog post, though. Just to be clear, my empirical results (linked post above) support the notion of "stop worrying", at least for SSD blocking reads/writes, which I found surprising given the conventional wisdom.
Hmm, but look at the processes: I have only 3 threads running (timer + runtime + machine). I thought the main idea of the new scheduler is to spawn no more blocking threads than really needed. So who is allocating memory? And as I said, I have no memory issues with the current scheduler.
I'm a bit late to the party, but this is incorrect and a common misconception about thread stack memory. OS threads use virtual memory for their stacks, and only use physical memory when it is actually needed. So if an OS thread reserves 8 MB of virtual memory for its stack but only physically uses 1 MB, then only 1 MB of physical RAM is used, not 8 MB. The exact limit on virtual memory differs, but it is usually in the range of terabytes, and if I'm not mistaken, on Linux the limit is 256 TB. At 8 MB of virtual address space per thread, this allows you to spawn 33,554,432 OS threads before running out. Not that you should spawn that many OS threads, but it shows that memory is unlikely to be the problem for most users.
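A small experiment along these lines (a sketch, not from this PR): spawning many threads with large stack reservations succeeds because the reservation is only virtual address space, and resident memory grows only as the stacks are actually used:

use std::thread;
use std::time::Duration;

fn main() {
    // Reserve an 8 MB stack per thread. The reservation is virtual address
    // space; physical pages are only committed as each stack actually grows.
    let handles: Vec<_> = (0..1_000)
        .map(|i| {
            thread::Builder::new()
                .stack_size(8 * 1024 * 1024)
                .spawn(move || {
                    // Barely touch the stack, then idle for a moment.
                    let _buffer = [0u8; 128];
                    thread::sleep(Duration::from_secs(1));
                    i
                })
                .expect("failed to spawn thread")
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    println!("spawned and joined 1000 threads");
}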
}
}

sched.progress = false;
@stjepang
I have one question.
Why is sched.progress set to false here? I'm interested.
IMO, this could be removed.
Hey all, thank you for your insightful comments and help in testing out this PR! <3 As you've noticed, I haven't had much time to continue working on this PR. If somebody else wants to take over or help out, that would be great! I will now close this PR, but let me know if you'd like to pick this work up. One possible way of moving forward could be to keep a smaller version of these changes and land that first.
I think this would be a great step to make use of all of this work.
@stjepang I'll take over this PR!
I'll try this.
Now that the PR has been closed, it would be really nice to update the blog. As it stands, the last entry is the post promising a new scheduler that handles blocking automagically. I only learned that the scheduler was abandoned by reading Stjepan's personal post, which linked to this PR. The project blog being out of date leaves a bad impression, making it look like the project is unmaintained or like the maintainers don't care about communication. The previous sentence sounds harsh, but it's not meant as an attack; it's constructive criticism from someone who appreciates your work and is about to use it in production. I'm sure that the users who were expecting the new scheduler will welcome the update.
The smaller version has been merged to master, thanks to @k-nasa. We will be releasing a new version including it soon.
This PR replaces the current scheduler with a new one. Differences: the new scheduler is resilient to blocking, which means we no longer need spawn_blocking and can eventually remove it. Note that spawn_blocking now simply calls spawn.