
New scheduler resilient to blocking #631

Closed
wants to merge 2 commits

Conversation


@ghost ghost commented Dec 16, 2019

This PR replaces the current scheduler with a new one. Differences:

  • The executor and reactor are more closely coupled. We don't have a networking driver thread anymore. Instead, executor threads poll mio.
  • The executor creates new threads only when there is too much work to handle. This way the executor automatically becomes single-threaded if one thread can handle all the work.
  • The executor detects blocking tasks and hands off the event loop to a new thread. With this strategy, we don't need spawn_blocking and can eventually remove it. Note that spawn_blocking now simply calls spawn.
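For illustration, here is roughly what the last bullet means in code. This is only a sketch, not the exact code in this PR; spawn and JoinHandle stand for async-std's own task API:

use async_std::task::{spawn, JoinHandle};

// Sketch: once the scheduler itself detects and isolates blocking tasks,
// spawn_blocking no longer needs its own thread pool and can simply
// forward to spawn.
pub fn spawn_blocking<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    spawn(async move { f() })
}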

@yoshuawuyts
Contributor

@stjepang CI seems to be failing due to missing imports. Wondering if something in a rebase maybe went wrong?

@sdroege

sdroege commented Dec 16, 2019

  • The executor creates new threads only when there is too much work to handle. This way the executor automatically becomes single-threaded if one thread can handle all the work.

OOC, what are the thresholds/rules for when it adds a new thread or removes one, and when it does add or remove a thread, how does it distribute the existing tasks and event sources (fds, etc.) among the threads?

let reactor = &REACTOR;
let mut events = mio::Events::with_capacity(1000);

loop {
    // Block on the poller until at least one new event comes in.


Or until timeout.

@asonix

asonix commented Dec 16, 2019

I'd love an overview of how "this executor detects blocking tasks" works. Do tasks need to identify themselves as containing blocking operations or is there something smarter going on?

static YIELD_NOW: Cell<bool> = Cell::new(false);
}

/// Scheduler state.
Contributor


I would love a brief overview of the terminology here. Words like "machine" and "processor" mean specific things in this context, and understanding them is necessary to make sense of what's going on.

It seems like a "machine" is somewhere between a thread and a task. Is this like the old blocking pool? And there's one "processor" per core, so I assume it's just the general worker queue?

@overdrivenpotato

If you use this scheduler with a blocking database like diesel, won't you effectively end up with 1 thread per connection if latency is over 10ms? Why even bother using async-std instead of just writing a blocking server in that case?

From the blog post:

The new runtime enables you to fearlessly use synchronous libraries like diesel or rayon in any async-std application.

@skade
Collaborator

skade commented Dec 16, 2019

@overdrivenpotato Because it still gives you evented IO on the outside before you start querying the DB, at which point you usually no longer have a client (potentially slowly) sending data.

@overdrivenpotato

@skade I see, thanks for the response. However, this doesn't alleviate the 1-thread-per-connection problem - I think an example will clear up what I'm getting at with the stjepang:new-scheduler branch:

I'm running into a panic! when running the following code:

use async_std::task;
use std::{thread, time::Duration};

const THREADS: usize = 10_000;

async fn make_response() {
    // Some long-running synchronous task...
    thread::sleep(Duration::from_millis(5000));
}

async fn async_main() {
    // Simulate 10k *successful* requests - we only need a response.
    for _ in 0..THREADS {
        task::spawn(make_response());
    }

    // Wait for all tasks to spawn & finish.
    thread::sleep(Duration::from_millis(30000));
}

fn main() {
    task::block_on(async_main());
    println!("Done.");
}
thread 'async-std/runtime' panicked at 'cannot start a machine thread: Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" }', src/libcore/result.rs:1165:5
stack backtrace:
   0:        0x109d55495 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h66f38d4abb2e41c1
   1:        0x109d6ddc0 - core::fmt::write::h4f487e714088986d
   2:        0x109d5391b - std::io::Write::write_fmt::h7bd7b6fe8b47c5fb
   3:        0x109d5710a - std::panicking::default_hook::{{closure}}::heee79636c241547c
   4:        0x109d56e15 - std::panicking::default_hook::hfcbd07059d15441e
   5:        0x109d577d7 - std::panicking::rust_panic_with_hook::h0c4b67125f55410a
   6:        0x109d5734d - std::panicking::continue_panic_fmt::h0e74ab2b215a1401
   7:        0x109d572a9 - rust_begin_unwind
   8:        0x109d7208f - core::panicking::panic_fmt::h09741a3213dba543
   9:        0x109d72169 - core::result::unwrap_failed::hf47c31c429b02014
  10:        0x109d3cdf5 - async_std::rt::runtime::Runtime::run::{{closure}}::hc49ce7e9abd42f33
  11:        0x109d3b2f9 - crossbeam_utils::thread::scope::{{closure}}::h4beb1775084ec8f5
  12:        0x109d3b2e9 - <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h4023904186d71ec7
  13:        0x109d40183 - std::panicking::try::do_call::h02701ca8e38cb843
  14:        0x109d5963f - __rust_maybe_catch_panic
  15:        0x109d3d599 - crossbeam_utils::thread::scope::h9004e3edab86f4fb
  16:        0x109d3ed1e - std::sys_common::backtrace::__rust_begin_short_backtrace::h2f0a676e6aea99cd
  17:        0x109d5963f - __rust_maybe_catch_panic
  18:        0x109d40283 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h2f07a838a2089d5b
  19:        0x109d505fe - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::hd1afe4d791e3679f
  20:        0x109d58fee - std::sys::unix::thread::Thread::new::thread_start::h785fcf8d81bd0f52
  21:     0x7fff720d0d76 - _pthread_start

However, when I change the blocking thread::sleep(...) to task::sleep(...).await, there are no panics and the example always succeeds. Is this expected behaviour?

@yoshuawuyts
Contributor

@overdrivenpotato The default stack size for a thread on unix is 2 MB, but often set to 8 MB. You're forcing 10,000 threads to be spawned concurrently. This means that unless you have >=20 GB (or >=80 GB) of RAM, your RAM will likely be exhausted by spawning threads, making it impossible to spawn any more. This is expected behavior.

Tasks, on the other hand, are stackless, which means by default they only take up their own allocation's worth of memory, making it possible to spawn much higher numbers. That is likely why you're seeing tasks succeed but threads fail.

However, there has been ongoing conversation about introducing ways to place upper limits on the number of threads spawned in scenarios such as these.

@overdrivenpotato

@yoshuawuyts So if the new runtime is expected to choke on a bare C10k example, why does the async-std blog post advertise the following?

The new async-std runtime relieves programmers of the burden of isolating blocking code from async code. You simply don’t have to worry about it anymore.

The new runtime enables you to fearlessly use synchronous libraries like diesel or rayon in any async-std application.

The new runtime relieves you of these concerns and allows you to do the blocking operation directly inside the async function

This strategy eliminates the need for separating async and blocking code.

You can see in my example that this PR combined with simple blocking calls is not scalable to C10k. Specifically, you'll run into this same issue when using synchronous libraries like diesel if you don't isolate the blocking calls into something like a thread pool.

Is it not misleading to make the above claims? Is async-std not intended to cater to large scale servers?

FWIW the PR as a standalone feature is great - it is a much better fallback than blocking the entire executor. However I'm confused about the messaging going on and ecosystem recommendations.

@skade
Collaborator

skade commented Dec 17, 2019

@overdrivenpotato But this is not the c10k problem.

The c10k problem is handling 10000 clients at the same time and keeping their sockets open. It is very usual to do async IO on the edges (because your client might be slow, and so on) and then spin up a thread for request handling. The library will not spawn 10000 threads for the network sockets. It will even scale down to 1 thread for 10000 sleeping sockets, which no other runtime does.

Again: what you present is the "sleep in 10000 threads" problem and has no relation to c10k. Forcibly exhausting the resource pool of a system is not a very good example of anything; every practical system can be exhausted if you aim at the right spot internally.

@overdrivenpotato

@skade I am only using sleeping threads for simplicity of example; the sleep can be replaced with something like a diesel query to make a real C10k case. We are also on the same page regarding edge async I/O for slow clients; this is why my example does not include requests and only calls make_response. There is no disagreement on async for edge I/O.

With that said, it is entirely realistic to have 10,000 concurrent requests going to and from the database layer. Here is a pseudocode example:

async fn make_response() -> u64 {
    diesel::sql_query("SELECT COUNT(*) FROM users")
        .first(CONNECTION)
        .unwrap()
}

This is a realistic and concrete C10k example. As with the previous example, this example could be expanded - this time with something like a real SQL query for a real application, but this is entirely orthogonal to the problem. The example could also call one of several blocking functions based on HTTP request route - again this is not the point of the example. The point is that async-std with this blocking method will not be able to handle 10k simultaneous connections calling blocking functions, as it will exhaust the number of threads available.

I can see that the async-std team is frustrated by some of the attacks that the community is sending out - this is not intended to be one of them. I am evaluating this library for production purposes, and so I am genuinely curious, is this just an oversight in usage of the new runtime? Is there some sort of post planned to clarify that blocking is not worry free?

Help in understanding this situation is appreciated. I am sure others will also find this thread useful.

@skade
Collaborator

skade commented Dec 17, 2019

@overdrivenpotato You are still oversimplifying (and I don't want to sound hostile there), and I think that makes this discussion hard. First of all, this problem would exist if you used spawn_blocking as well (which lands you on a threadpool, here we go) - and all systems I have seen using spawn_blocking naively are subject to exactly the same case. So if you run along those edges, you need to isolate anyways. It is completely okay in a c10k setting to not handle 10000 clients in one go, but to stagger them a little in high-load scenarios.

Particularly, with our system, you need to first hit the case where all 10000 queries are active at the same time and don't return within a certain time frame.

As a more complex illustration:

async fn handle_request() {
    // some socket
    socket.read_to_string(&mut my_request).await;
    let result = do_database_request(&my_request); // may block
    socket.write_bytes(&result).await;
}

This task mixes one potentially blocking stretch and two non-blocking stretches. The non-blocking stretches lay themselves to sleep in the classic fashion. Assuming that the database request is local and takes ~5ms and the client is a mobile phone which takes ~100ms per request, the chance of 10000 requests being in the database-request phase at the same time is very low and would need vastly more than 10000 clients. Plus: database access can regularly be so fast that it runs below the time at which a task is considered "blocked". So this is already a statistical process.

Note that blocking for small amounts of time (our current threshold is 10ms) is very usual even in tasks, e.g. for some computation or even memory allocation/reclamation.

Now, consider this approach:

async fn handle_request() {
    // some socket
    socket.read_to_string(&mut my_request).await;
    let result = do_database_request_with_caching(&my_request); //does a db request, statistically, in 33% of all cases
    socket.write_bytes(&result).await;
}

The whole thing becomes even harder to predict. If you were to wrap this in spawn_blocking, you would force a background thread to exist for every concurrent request, even if the cache is hit, making your initial scenario (resource exhaustion) more likely. You could now go and implement do_database_request_with_caching in a scheduler-aware fashion and move the blocking functionality in there, but that gets messy quick.
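Roughly, a scheduler-aware version might look like the sketch below (my illustration only; run_query, the cache, and the use of spawn_blocking as the offload primitive are placeholders, not code from this PR). Note how the offload policy now leaks into the data-access layer:

use std::collections::HashMap;
use std::sync::Mutex;

use async_std::task;

// Placeholder for a blocking database call, e.g. a diesel query.
fn run_query(request: &str) -> String {
    format!("result for {}", request)
}

// Only a cache miss is pushed onto a background thread, so the decision of
// when to offload now lives inside the data-access code itself.
async fn do_database_request_with_caching(
    cache: &Mutex<HashMap<String, String>>,
    request: &str,
) -> String {
    if let Some(hit) = cache.lock().unwrap().get(request).cloned() {
        return hit;
    }
    let owned = request.to_owned();
    // `spawn_blocking` stands for whatever offload primitive is available
    // (async-std's unstable helper, a hand-rolled thread pool, etc.).
    let result = task::spawn_blocking(move || run_query(&owned)).await;
    cache.lock().unwrap().insert(request.to_owned(), result.clone());
    result
}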

And this is the statistical process that makes an adaptive runtime so appealing: whether a task spends too much time or not is runtime behaviour. The problem that many people overlook is that the adaptive behaviour goes both ways: it will not throw additional resources at the problem if a task does not take much time, be it blocking or not.

The ceiling is really high there.

And this is what we mean by "you shouldn't worry about blocking": write your code, the runtime supports you and often does a right thing in a dynamic setting. Figuring out whether it does exactly the right thing requires tracing an actual system, not uniformly spawned tasks that operate on one edge of the spectrum. When you hit limits, start optimising and potentially use other approaches, e.g. a fixed-size threadpool with a request queue for database request handling.

This does not mean "you shouldn't think about the structure of your system"; in particular, you should think about capacities. But the hyperfocus on blocking clouds our view there, IMHO. As an example: we have tons of non-blocking, fully-async libraries that have no mitigations against slowloris.

I stand by that statement, but yes, we intend to write more documentation specifically around the construction of systems (and also provide the appropriate tracing features to figure out actual problems in the future, if possible).

Given no system at hand and the whole thing being dependent on hundreds of variables, there's not much we can do except build it and find out whether we have actual problems.

I hope that helps your evaluation; I'd be happy to discuss any more complex case.

Addition: you can also think of more complex cases, where you read off a blocking socket (file) to a non-blocking socket (network) in a loop and it works fine, because the blocking call is most likely by any measure very fast.

@overdrivenpotato

overdrivenpotato commented Dec 17, 2019

@skade Thanks for the reply. This specific statement clarifies to me your intentions with the new runtime:

[...] so if you run along those edges, you need to isolate anyways.

I see now that async-std does not recommend blindly eschewing sync isolation in favor of blocking calls. Have you considered softening your message: "You simply don’t have to worry about [blocking] anymore"? This is a fairly literal statement and I have assumed from the start that it is meant to be true. I would imagine many others are confused by this, possibly explaining the backlash you have received over this PR. Perhaps it would be better to say that blocking is sometimes better and that this should be measured before assuming one way or the other. It is my experience that Rust developers are heavily concerned with correctness, more so than other communities, and potential pitfalls like spurious panics are not worth the potential minor speed gains. FWIW, when using tokio, the natural way for me to implement a diesel bridge was a fixed-size thread pool, which resulted in no danger of spurious panics.

Have you also considered drafting an RFC to relax the poll requirements on the Future trait? The runtime contract states:

An implementation of poll should strive to return quickly, and should not block.

According to the API, deliberately crafting async functions with blocking characteristics (not necessarily just database calls) breaks the non-blocking requirement. This does pose a coupling requirement as libraries specifically built for async-std may not work correctly for another executor like tokio or even in basic combinators such as those found in futures-rs.

Personally, these two points are important. Rust is a niche language as it stands, and any source of confusion makes it much more difficult to onboard for. It is even harder if the general Rust community is reading into this feature as making blocking "a thing of the past" based on an official blog post.

@skade
Collaborator

skade commented Dec 17, 2019

@overdrivenpotato I consider the danger of async-std panicking to be on the level of the allocator failing (which panics or segfaults, depending on platform). In particular, a library panicking according to its definition is correct - it has reached a state where it cannot continue.

For completeness' sake (as everyone loves to just quote the first part, I wonder why), the full contract of Future is:

An implementation of poll should strive to return quickly, and should not block. Returning quickly prevents unnecessarily clogging up threads or event loops. If it is known ahead of time that a call to poll may end up taking awhile, the work should be offloaded to a thread pool (or something similar) to ensure that poll can return quickly.

I have considered proposing a rephrasing, for 4 reasons:

  • I find the rule weak and useless. It does not define "quickly". I'd say that many async functions break this rule. For example:
async fn image_manipulation() {
    let image = load_image().await;
    transform_image(image)
}

Breaks this contract, depending on what "quickly" means. 1ms? 2ms? 5ms? Below a certain frequency? So, this forces spawning. Taken to the extreme, it forces spawning to a thread pool around every piece of non-awaited code!

  • async/.await was particularly written to allow the code style above
  • What do I do if I don't know ahead of time?
  • In the end quickly depends on how much leeway my scheduler gives me already, so that definition is already scheduler-bound. If my scheduling strategy insists on tasks running at a high frequency, 2ms is too much. If not, 10ms might be okay.

And that's particularly one pattern we've seen a lot: except in very constrained environments, where work packages are very predictable, this contract is hard to fulfil or quantify.

We already have the case where Futures are reactor-bound, so I don't quite understand that point. There's one more thing I want to address: there is a difference between application code and library code. If you write a non-blocking library, it should not block. That also makes it compatible with all executors. If you write application-level code, you should buy into your runtime semantics, use one that fits your application, and not care about general applicability to multiple runtime systems.

This is a problem similar to how database libraries should be abstract over the store, while particular applications may well buy into specific store features.

@overdrivenpotato

@skade Sure, panicking is correct, but this is a micro-optimization that can break down quickly. A tokio server can happily serve 10,000 clients on 1 GB of total VRAM and a fixed-size database pool; however, with this new runtime, async-std will choke somewhere around 119 clients accessing a blocking database and begin to fail spuriously.

There is really a difference in messaging between these quotes:

often does a right thing in a dynamic setting

[...] so if you run along those edges, you need to isolate anyways.

there is a difference between application code and library code. If you write a non-blocking library, it should not block

And:

The new async-std runtime relieves programmers of the burden of isolating blocking code from async code. You simply don’t have to worry about it anymore.

The new runtime enables you to fearlessly use synchronous libraries like diesel or rayon in any async-std application.

The new runtime relieves you of these concerns and allows you to do the blocking operation directly inside the async function

This strategy eliminates the need for separating async and blocking code.

As for the Future::poll trait, as long as you are specifically recommending blocking in only application code, this is not too relevant to the ecosystem at large, but (correct me if I'm wrong) I see no mention of this anywhere except this thread.

The biggest issue here is the disparity between the first selection of quotes and the second: why insist on such highly opinionated and generalized phrases when it really depends on the application requirements? Sure, panicking on too many threads is a correct thing to do, but this is not the same class of error as memory exhaustion. tokio is a real example that this does not need to happen, while sacrificing arguably minimal real-world performance. I would submit a PR to solve this issue, but that is not really possible when it comes to social media updates and marketing-based blog posts.

@overdrivenpotato

overdrivenpotato commented Dec 17, 2019

@skade Upon further experimentation, I am running into complete event loop blocking on async-std when utilizing simple blocking diesel calls with this new runtime. Here is the scenario:

We have a custom networking library with an entry function signature that looks similar to this:

async fn drive<T: Source>(source: T) -> Result<(), Error> { ... }

We have a custom scheduler that multiplexes multiple sub-Futures onto a single one, namely the one returned by drive. We also have an AsyncStdSource which implements the Source trait; this is how we drive a socket to completion (whether TCP, WebSocket, etc.). The future returned by drive is run via task::block_on(drive(...)).

It is worth noting that this bypasses the async_std::task::spawn utility completely. On tokio, with regular asynchronous code and fixed-size thread pools, this is a non-issue. However, because blocking in this new runtime blocks the entire task, the server grinds to a halt as individual client requests are handled completely sequentially. There seems to be no way around this if you require a sub-scheduler embedded within a single top-level Future.

EDIT: To be clear I fully expect this behaviour. This is for the sake of a real example that breaks down.

@skade
Collaborator

skade commented Dec 18, 2019

@overdrivenpotato You can actually submit PRs to our blog, the link is at the bottom ;). It leads to https://github.com/async-rs/async-std; the blog post source is https://github.com/async-rs/async.rs/blob/master/source/blog/2019-12-16-stop-worrying-about-blocking-the-new-async-std-runtime.html.md.

I'm trying to figure out why you multiplex on a single Future. Futures and tasks are units of concurrency, not of parallelism, so asking for parallel work there is hard. Using block_on there is definitely not the right approach, but that's true with or without the new scheduler.

@overdrivenpotato

@skade This is actually necessary to support structured concurrency. Freely spawning can lead to several exhaustion issues. We also have some additional requirements that are not worth getting into right now.

Anyways thanks for the help, but we will be sticking with tokio as this is causing too many headaches. I just found out they are also exploring this space currently.

@oblique

oblique commented Dec 19, 2019

Please consider this counter-post before deprecating spawn_blocking.

@Demi-Marie

From what I can tell, the real answer is to use an asynchronous semaphore to limit the number of blocking requests made at any given time.

@cynecx

cynecx commented Dec 20, 2019

Sorry, I haven’t had time to actually read the code, but perhaps, to alleviate the issue @overdrivenpotato is bringing up, one could introduce some kind of “threadpool” concept where cpu-consuming tasks are automatically offloaded, with some lower/upper bounds on the thread count, thereby eliminating the issue where we exhaust real userspace threads?

Edit: Oh well, I’ve just realized that this might not be possible, because from userspace one cannot “preempt” the running task, therefore it can’t be migrated to the threadpool. So basically this issue is inherent to the nature of detecting long-running/blocking computations completely from userspace. Not sure how Go solves this though (I heard something about compiler-inserted preemption points / library-assisted preemption points / unix signals?).

@ghost ghost mentioned this pull request Dec 24, 2019
@serzhiio

serzhiio commented Jan 10, 2020

Some real-life data here :)))
[screenshot: old-scheduler]
[screenshot: new-scheduler]
Await latency is a kind of indicator for the scheduler; all numbers are in microseconds.
The new scheduler is really better and looks much more efficient.
P.S.: Not the best screenshots, but believe me :)

@dekellum

dekellum commented Jan 10, 2020

Disclosure: I'm writing my own post related to blocking operations in the rust async runtimes, which I can link here later.

One humble observation, forgive me, without having first-hand experience with this branch of async-std: could end-users use an async-compatible Semaphore (such as the one in the futures-intrusive crate) with a fixed number of permits to help constrain the number of concurrent blocking ops, and therefore the number of concurrent runtime threads, here?

That's an approach that has been suggested with tokio, at least. I don't know enough yet to say whether that is a good or bad suggestion, or how applicable it is here. Thanks.
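To make that concrete, here is a rough sketch of what I mean (assuming I'm reading the futures-intrusive Semaphore API correctly - new(is_fair, permits) and acquire(n) returning a releaser guard - and with blocking_query as a stand-in for the real blocking call):

use std::sync::Arc;

use async_std::task;
use futures_intrusive::sync::Semaphore;

// Stand-in for a blocking operation such as a diesel query.
fn blocking_query(id: usize) -> String {
    format!("row {}", id)
}

async fn handle(sem: Arc<Semaphore>, id: usize) -> String {
    // At most `permits` tasks get past this point at once, which bounds how
    // many runtime threads can be tied up in blocking work at the same time.
    let _permit = sem.acquire(1).await; // permit is released when dropped
    blocking_query(id)
}

fn main() {
    task::block_on(async {
        let sem = Arc::new(Semaphore::new(true, 64)); // fair, 64 permits
        let handles: Vec<_> = (0..10_000)
            .map(|id| task::spawn(handle(sem.clone(), id)))
            .collect();
        for h in handles {
            h.await;
        }
    });
}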

@serzhiio

I've noticed an allocation problem. I have no such issue with the current scheduler.
[screenshot: new-scheduler-allocating]

@yoshuawuyts
Contributor

@serzhiio Could it be that you have a lot of expensive computation going on, which in turn requires a fair number of threads? Each thread usually comes with about an 8 MB static overhead (i.e. its own stack), so if a lot of threads are being spawned, it seems obvious that more memory would be consumed. This is the tradeoff for overall improved latency.

@dekellum

@yoshuawuyts Semaphore?

@niklasf

niklasf commented Jan 14, 2020

@dekellum Adding a Semaphore can potentially lead to deadlocks. It's a nice tool at the application level (for example, when you know you're not going to recursively spawn blocking tasks), but it seems tricky for a general-purpose scheduler.

@dekellum

dekellum commented Jan 14, 2020

I agree with the suggestion to use it at the application level, @niklasf. That was missing/missed here and in the "Stop worrying about blocking" blog post, though. Just to be clear, my empirical results (linked post above) support the notion of "stop worrying", at least for blocking SSD reads/writes, which I found surprising given the conventional wisdom.

@serzhiio

@serzhiio Could it be that you have a lot of expensive computation going on, which in turn requires a fair number of threads? Each thread usually comes with about an 8 MB static overhead (i.e. its own stack), so if a lot of threads are being spawned, it seems obvious that more memory would be consumed. This is the tradeoff for overall improved latency.

Hmm, but look at the processes: I have only 3 threads running (timer + runtime + machine). I thought the main idea of the new scheduler was to spawn no more blocking threads than really needed. So who is allocating memory? And as I said, I have no memory issues with the current scheduler.

@yorickpeterse

yorickpeterse commented Feb 3, 2020

The default stack size for a thread on unix is 2 MB, but often set to 8 MB. You're forcing 10,000 threads to be spawned concurrently. This means that unless you have >=20 GB (or >=80 GB) of RAM, your RAM will likely be exhausted by spawning threads, making it impossible to spawn any more. This is expected behavior.

I'm a bit late to the party, but this is incorrect and a common misconception about thread stack memory. OS threads use virtual memory for their stacks, and only use physical memory when actually needed. So if an OS thread reserves 8 MB of virtual memory for its stack but only physically uses 1 MB, then only 1 MB of physical RAM is used, not 8 MB.

The exact limit on virtual memory differs, but is usually in the range of terabytes; if I'm not mistaken, on Linux the limit is 256 TB. At 8 MB of virtual memory per thread, that allows you to spawn 33,554,432 OS threads before running out of address space. Not that you should spawn that many OS threads, but it shows that memory is unlikely to be the problem for most users.

@ghost ghost mentioned this pull request Feb 15, 2020
}
}

sched.progress = false;
Member


@stjepang I have one question: why is sched.progress set to false here? I'm curious.

@Sherlock-Holo

IMO, removing spawn_blocking is a bad idea. First, it will break a lot of code; second, we can't easily distinguish between async jobs and normal blocking jobs, which may cause problems when maintaining a big project.

@ghost ghost closed this Mar 15, 2020
@ghost
Author

ghost commented Mar 15, 2020

Hey all, thank you for your insightful comments and help in testing out this PR! <3

As you've noticed, I haven't had much time to continue working on this PR. If somebody else wants to take over or help out, that would be great! I will now close this PR, but let me know if you'd like to pick it up.

One possible way of moving forward could be to keep spawn_blocking(), delete the part that automatically detects blocking tasks, and merge everything else. That would bring us immediate benefits and should be pretty uncontroversial.

@dignifiedquire
Member

One possible way of moving forward could be to keep spawn_blocking(), delete the part that automatically detects blocking tasks, and merge everything else. That would bring us immediate benefits and should be pretty uncontroversial.

I think this would be a great step to make use of all of this work.

@k-nasa
Member

k-nasa commented Mar 16, 2020

@stjepang I'll take over this PR!

One possible way of moving forward could be to keep spawn_blocking(), delete the part that automatically detects blocking tasks, and merge everything else. That would bring us immediate benefits and should be pretty uncontroversial.

I'll try this.

@k-nasa k-nasa mentioned this pull request Mar 20, 2020
@hniksic

hniksic commented Apr 4, 2020

Now that the PR has been closed, it would be really nice to update the blog. As it stands, the last entry is the post promising a new scheduler that handles blocking automagically. I only learned that the scheduler was abandoned by reading Stjepan's personal post which linked to this PR.

The project blog being out of date leaves a bad impression, making it look like the project is unmaintained or the maintainers don't care about communication. The previous sentence sounds harsh, but it's really not meant as an attack, but as constructive criticism from someone who appreciates your work and is about to use it in production. I'm sure that the users who were expecting the new scheduler will welcome the update.

@dignifiedquire
Member

The smaller version has been merged to master, thanks to @k-nasa. We will be releasing a 1.6.0-alpha soon, so please start using it in your projects to give it a good test.

dignifiedquire added a commit that referenced this pull request Apr 9, 2020
This now matches more closely the logic as implemented in #631, and fixes the performance regression as far as I have observed.

Closes #746
This pull request was closed.