Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(fmt/lint): Fix error "Too many open files (os error 24)" #7274

Closed
wants to merge 8 commits into from

Conversation

dsherret
Copy link
Member

@dsherret dsherret commented Aug 30, 2020

This bug is caused because it seems task::spawn_blocking will actually create many threads instead of only using a few in the thread pool.

The docs on the task module page say:

The task::spawn_blocking function is similar to the task::spawn function discussed in the previous section, but rather than spawning an non-blocking future on the Tokio runtime, it instead spawns a blocking function on a dedicated thread pool for blocking tasks.

However, if you read further on the function's documentation page:

Tokio will spawn more blocking threads when they are requested through this function until the upper limit configured on the Builder is reached. This limit is very large by default, because spawn_blocking is often used for various kinds of IO operations that cannot be performed asynchronously. When you run CPU-bound code using spawn_blocking, you should keep this large upper limit in mind; to run your CPU-bound computations on only a few threads, you should use a separate thread pool such as rayon rather than configuring the number of blocking threads.

This PR should fix this issue by using Rayon instead as recommended by Tokio in the paragraph above. When originally implementing this, I wanted to use Rayon because the code is much simpler, but didn't want to introduce another dependency similar to Tokio... but with Tokio recommending Rayon I think this is ok to do?

  • Binary size: 28,116KB -> 28,213KB (on Windows) -- Edit: now after my final changes it's saying 28,125KB....
  • Speed: Seems to be the same as before.

Fixes #7257

cli/fmt.rs Show resolved Hide resolved
let has_error = has_error.load(Ordering::Relaxed);
if has_error {
std::process::exit(1);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just realizing that some of the code in here and in format_source_files could be extracted out and pushed down to run_parallelized (all the error handling inside the delegate too). Let me know if I should do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsherret I think it's fine for now

@@ -1,2 +1,2 @@
[WILDCARD]
error: Found 1 not formatted file
Error checking: [WILDCARD]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be a regression? Seems like there is a failure to format a file instead of reporting that a file is not formatted. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I meant to comment about this. There is a file with a syntax error in the folder and previously it wouldn't return a non-zero exit code, but now it does (to match deno lint).

@magurotuna
Copy link
Member

magurotuna commented Aug 31, 2020

I'm not really sure and I haven't yet found conclusive clues, but I'm concerned that rayon is unsuitable for I/O bound operations like fs::read_to_string.

to run your CPU-bound computations on only a few threads, you should use a separate thread pool such as rayon rather than configuring the number of blocking threads.

This description from tokio's document is saying that rayon should be used in case of CPU bound computations. It does not say about I/O bound operations.
Besides, in the rayon::join document, we can read:

The assumption is that the closures given to join() are CPU-bound tasks that do not perform I/O or other blocking operations. If you do perform I/O, and that I/O should block (e.g., waiting for a network request), the overall performance may be poor.

which claims that I/O operations should not be handled by rayon::join, but this does not necessarily conclude that all APIs of rayon are unsuitable for those operations.

Anyone knows if it's okay to use rayon in our use case?

@dsherret
Copy link
Member Author

@magurotuna I think it's ok here. The read & writes here will be very quick and I think the overhead of switching threads for those might make it slower (will also add a lot of complexity). It makes sense for something like doing a network request that it should use a dedicated thread pool for that though.

@@ -21,7 +21,7 @@ use std::io::Write;
use std::path::Path;
use std::path::PathBuf;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bartlomieju you may want to update the deno_lint example to remove the Arcs. I just remembered this isn't necessary with Rayon.

@magurotuna
Copy link
Member

@dsherret
I understand. Thanks for your explanation :)

@ry
Copy link
Member

ry commented Aug 31, 2020

Thanks for the patch David. It's nice that Rayon is so small, but I'm not super keen to introduce yet another thread pool. Tokio should not be creating hundreds of threads. I can't imagine how that would be useful - I guess for blocking network APIs maybe. I think we could just set tokio's thread pool limit to something reasonable (50?) and it would also solve the problem.

Is there a way to reliably reproduce #7257?

@dsherret
Copy link
Member Author

@ry I have not been able to reproduce #7257 on my Mac even when lowering the ulimit. I believe @caspervonb is able to reproduce it every time though.

Yeah, I understand the concern with having two thread pool libraries and as I said, this is the reason I didn't originally use Rayon, however I think the advantage of the easier to understand code and fixing this problem outweigh the negative of having another thread pool library. Also, as you said it's small and Tokio recommends using Rayon for this.

@caspervonb
Copy link
Contributor

caspervonb commented Aug 31, 2020

Is there a way to reliably reproduce #7257?

On a mac; running deno fmt in the root folder of the repository does it for me.
Tried with a limit of 32, 64, 128, 256, 512 and 1024 files.

@ry
Copy link
Member

ry commented Aug 31, 2020

I'd prefer to just decrease the size of the thread pool - there is no situation where it makes sense for deno to be using hundreds of threads.

Closing in favor of #7290

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deno fmt "error: Too many open files (os error 24)"
5 participants