
Compiler refactor #2380

Merged: 3 commits merged into denoland:master on May 29, 2019

Conversation

6 participants
@ry (Collaborator) commented May 20, 2019

  • Compiler no longer has its own Tokio runtime.
  • Compiler handles one message and then exits.
  • Avoids recompiling the same module by introducing a hacky but simple HashSet<String> that stores the module names that have already been compiled (a sketch of the idea follows after this list).
  • Compiler options no longer have their own op.
  • Removes a lot of the mocking stuff in compiler.ts (like this._ts); it is not useful as we don't even have tests.
  • Depends on #2416.
  • Turns off check_js because it causes fmt_test to die with OOM. fmt_test is modified to use a fresh DENO_DIR, so that it always downloads prettier.
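A minimal sketch of the "already compiled" guard described above, in TypeScript. The names here (compiledModules, compileIfNeeded, compileModule) are illustrative placeholders, not the PR's actual identifiers:

// Track which modules were already compiled in this process so the
// compiler is not invoked twice for the same specifier.
const compiledModules = new Set<string>();

function compileIfNeeded(
  moduleName: string,
  compileModule: (name: string) => void
): void {
  if (compiledModules.has(moduleName)) {
    return; // already compiled, skip
  }
  compileModule(moduleName);
  compiledModules.add(moduleName);
}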

@kitsonk @afinch7 reviews welcome.

@afinch7 (Contributor) left a comment

This might increase the time required to compile smaller source files, but it might speed things up in larger projects where multiple large source files can be compiled in parallel. I think the trade-off is worth it, considering that results are cached after the first run and this allows the removal of the second tokio runtime.

Resolved review comment on js/compiler.ts (outdated)
@ry (Collaborator, Author) commented May 20, 2019

I'm not really concerned about performance at this point. I'm still trying to get the compiler to be correct... That said, I don't think this will be noticeably slower.

@ry force-pushed the ry:compiler_refactor branch from 9ff259f to 0fc4b65 on May 21, 2019

@kitsonk (Contributor) commented May 21, 2019

Compiler handles one message and then exits.

The compiler caches in memory the whole dependency tree of all the modules that contribute to the module being transpiled. I believe TypeScript keeps these parsed modules in memory as well to make it efficient to transpile other modules. It is very stateful at runtime. I think Andy's comment is sort of pertinent. We already have a bit of overhead in spinning up the compiler that isn't represented in the snapshot, which we never resolved either. I suspect that with more than a handful of modules, the constant spin-up and tear-down of the compiler is really going to impact performance.

I guess I don't understand why the compiler handling one message and exiting is "more correct". If it is to get more parallelism for the compiler, I am not sure how effective that will be, because the state of the TypeScript compiler is really important and I don't know how the architecture would really lend itself to parallelism.

@ry (Collaborator, Author) commented May 23, 2019

I guess I don't understand why the compiler handling one message and exiting is "more correct"

In the abstract it's not necessarily more correct, but the current situation is buggy and difficult to reason about.

  1. The compiler never exits, so unnecessary resources are being used during runtime.
  2. The compiler isolate uses its own tokio runtime.
  3. When we introduce dynamic import, the compiler will need to be invoked again.

Having the compiler start up and shut down makes things a bit simpler to think about. But the consequence is that the compiler ought to emit all compiled artifacts at once... I'm still toying with this.

@kitsonk (Contributor) commented May 23, 2019

Having the compiler start up and shut down makes things a bit simpler to think about.

Yes, that makes sense: once you know you don't need the compiler, shut it down.

But the consequence is that the compiler ought to emit all compiled artifacts at once...

Yeah, and that is a potential challenge given the way we do it now. We could look again at creating and emitting a program instead of transpiling individual modules. When we did it last time, TypeScript was something like 20% slower, and we never really isolated what was causing it. We are a lot cleaner in our abstraction of the compiler than we were back then, so we might actually be able to deal with it and/or give the TypeScript team enough feedback to narrow down the problem.

When we introduce dynamic import, the compiler will need to be invoked again.

I believe in a lot of cases, if we migrated to emitting programs, TypeScript should do static analysis on all the code it can easily determine, including dynamic imports, so it should output those modules even if they don't get requested during program execution.

Don't bang your head on it too much if you don't want to right now; at least the TypeScript bit of emitting a program is something I can tackle after I finish the Rust handling of the compiler diagnostics.
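For reference, a minimal sketch of the "create and emit a program" approach being discussed here, using the public TypeScript API. The root name and compiler options are placeholders, not the PR's actual configuration:

import * as ts from "typescript";

// Placeholder inputs; in the real compiler these would come from the host.
const rootNames = ["main.ts"];
const options: ts.CompilerOptions = {
  target: ts.ScriptTarget.ESNext,
  module: ts.ModuleKind.ESNext,
};

// Build a program over the whole module graph and emit every output at
// once, instead of transpiling one module at a time.
const host = ts.createCompilerHost(options);
const program = ts.createProgram(rootNames, options, host);
const emitResult = program.emit();

// Diagnostics cover the entire graph.
const diagnostics = ts
  .getPreEmitDiagnostics(program)
  .concat(emitResult.diagnostics);
console.log(`emitted with ${diagnostics.length} diagnostics`);

The ~20% slowdown mentioned above presumably comes from checking and emitting the whole graph at once rather than transpiling files in isolation.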

Resolved review comment on cli/BUILD.gn (outdated)

@ry force-pushed the ry:compiler_refactor branch from 74dc640 to 826333c on May 23, 2019

@ry (Collaborator, Author) commented May 23, 2019

I believe in a lot of cases, if we migrated to emitting programs

I was surprised to find that ts.createProgram is undefined - I guess because TS doesn’t detect a node environment. I’m trying to use the LanguageService which has getProgram()...

@kitsonk (Contributor) commented May 23, 2019

I was surprised to find that ts.createProgram is undefined

don't forget to update the exports in rollup config when you use new methods, as rollup does not detect the exports from TypeScript.

@ry (Collaborator, Author) commented May 23, 2019

don't forget to update the exports in rollup config when you use new methods, as rollup does not detect the exports from TypeScript.

Oh wow - this saved me a lot of effort. Thanks!
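For context, "updating the exports in the rollup config" usually means extending the commonjs plugin's namedExports list so rollup knows which ts.* members exist. A hedged illustration; the module path and the surrounding plugin wiring are assumptions, not the repo's actual rollup config:

// rollup.config.js (illustrative fragment)
import commonjs from "rollup-plugin-commonjs";

export default {
  // ...
  plugins: [
    commonjs({
      namedExports: {
        // rollup cannot statically detect the CommonJS exports of the
        // typescript package, so each ts.* member used must be listed.
        "node_modules/typescript/lib/typescript.js": [
          "createProgram",
          "createLanguageService",
          "getPreEmitDiagnostics",
        ],
      },
    }),
  ],
};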

@bartlomieju (Contributor) commented May 24, 2019

@ry can you take a stab at the issue described in this comment in this PR?

@ry (Collaborator, Author) commented May 24, 2019

@bartlomieju I believe that issue is solved in this branch already

this._logDiagnostics(errors);
}
const ignoredOptions: string[] = [];
for (const key of Object.keys(options)) {

@zekth (Contributor) commented May 24, 2019

Use for in instead?

@ry (Collaborator, Author) commented May 25, 2019

Why?

@zekth (Contributor) commented May 25, 2019

It's the more ES way, but it works too. For example, `for (const key of Object.keys(options))` is roughly equivalent to `for (const key in options)` for a plain object.
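The two forms are close but not identical once the object has a prototype; a small illustration in plain TypeScript (not from the PR):

const base = { inherited: true };
const options = Object.create(base);
options.checkJs = false;

// for...of over Object.keys(): own enumerable keys only.
for (const key of Object.keys(options)) {
  console.log("own:", key); // logs "checkJs"
}

// for...in: own AND inherited enumerable keys.
for (const key in options) {
  console.log("in:", key); // logs "checkJs", then "inherited"
}

For a plain options object with no inherited enumerable properties, the two loops visit the same keys, which is presumably why either form works in compiler.ts.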

@ry changed the title from "WIP Compiler refactor" to "Compiler refactor" on May 25, 2019

@ry (Collaborator, Author) commented May 25, 2019

@kitsonk @afinch7 This is now passing tests for me locally. It certainly needs some clean ups still, but PTAL.

@afinch7 (Contributor) left a comment

On my phone right now; might take a closer look later. I'm in favor of how much simpler compiler.ts is. Just don't forget about op_fetch_module_metadata.

@@ -169,37 +168,30 @@ pub fn dispatch_all_legacy(
}
}

pub fn op_selector_compiler(inner_type: msg::Any) -> Option<OpCreator> {

@afinch7 (Contributor) commented May 25, 2019

What's your plan here? Add it back later?

@ry (Collaborator, Author) commented May 27, 2019

Yes. For various reasons, it's much easier for the compiler worker to share state with the calling worker. I do like the ability to limit the ops available to each worker, but this patch is quite complex and sharing the state is the easiest way to get it working now.

@kitsonk (Contributor) left a comment

I've done a skim of the TypeScript code. Only one comment, but if it is working, then anything else could be addressed iteratively down the road.

noColor: typeof os.noColor;
interface CompilerReq {
rootNames: string[];
// TODO(ry) add compiler config to this interface.

@kitsonk (Contributor) commented May 27, 2019

I think that is going to be hard, if not impossible, to do in Rust, mostly because TypeScript supports JSONC (JSON with comments) and uses its own JSON parser to accomplish this. serde does not support JSONC, so I don't think there is any solution but to pass a string/u8 buffer in the request. Technically there are some other things the parser would resolve (like extending tsconfigs, project references, etc.) which we don't have implemented, which also makes it impractical to parse the config on the Rust side.

@ry (Collaborator, Author) commented May 27, 2019

uses its own JSON parser to accomplish this. serde does not support JSONC

Ok good to know. I guess we'll cross that bridge when we get there.
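A minimal sketch of the approach described above: hand the raw config text across as a string and let TypeScript's own parser deal with JSONC and option conversion. The configText value and the base path are illustrative:

import * as ts from "typescript";

// Illustrative input; in practice the raw tsconfig text would be passed
// over from the Rust side as a string or u8 buffer.
const configText = `{
  // Comments are legal in a tsconfig (JSONC), which serde_json rejects.
  "compilerOptions": { "checkJs": false }
}`;

// TypeScript's own parser understands JSONC.
const { config, error } = ts.parseConfigFileTextToJson(
  "tsconfig.json",
  configText
);
if (error) {
  throw new Error(ts.flattenDiagnosticMessageText(error.messageText, "\n"));
}

// Convert the plain JSON into typed CompilerOptions, collecting diagnostics
// for unknown or invalid options.
const { options, errors } = ts.convertCompilerOptionsFromJson(
  config.compilerOptions,
  "/"
);
console.log(options, errors.length);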

@ry force-pushed the ry:compiler_refactor branch from 792b803 to bd4cfcb on May 27, 2019

@afinch7 (Contributor) commented May 27, 2019

Did some debugging, and found that anything less than 4 threads for the tokio pool will get stuck.
Here is a temp patch that seems to resolve the issue:

diff --git a/Cargo.lock b/Cargo.lock
index 5b1b9385..fd2530d9 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -226,6 +226,7 @@ dependencies = [
  "libc 0.2.54 (registry+https://github.com/rust-lang/crates.io-index)",
  "log 0.4.6 (registry+https://github.com/rust-lang/crates.io-index)",
  "nix 0.13.0 (registry+https://github.com/rust-lang/crates.io-index)",
+ "num_cpus 1.10.0 (registry+https://github.com/rust-lang/crates.io-index)",
  "rand 0.6.5 (registry+https://github.com/rust-lang/crates.io-index)",
  "regex 1.1.6 (registry+https://github.com/rust-lang/crates.io-index)",
  "remove_dir_all 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
diff --git a/cli/BUILD.gn b/cli/BUILD.gn
index 7887624e..aa7c56ab 100644
--- a/cli/BUILD.gn
+++ b/cli/BUILD.gn
@@ -32,6 +32,7 @@ main_extern_rlib = [
   "lazy_static",
   "libc",
   "log",
+  "num_cpus",
   "rand",
   "regex",
   "remove_dir_all",
diff --git a/cli/Cargo.toml b/cli/Cargo.toml
index c1b5387f..b3d40ab5 100644
--- a/cli/Cargo.toml
+++ b/cli/Cargo.toml
@@ -31,6 +31,7 @@ integer-atomics = "1.0.2"
 lazy_static = "1.3.0"
 libc = "0.2.54"
 log = "0.4.6"
+num_cpus = "1.10.0"
 rand = "0.6.5"
 regex = "1.1.6"
 remove_dir_all = "0.5.1"
diff --git a/cli/tokio_util.rs b/cli/tokio_util.rs
index 340574e4..b022b651 100644
--- a/cli/tokio_util.rs
+++ b/cli/tokio_util.rs
@@ -3,6 +3,7 @@ use crate::resources::Resource;
 use futures;
 use futures::Future;
 use futures::Poll;
+use num_cpus;
 use std::io;
 use std::mem;
 use std::net::SocketAddr;
@@ -23,6 +24,7 @@ pub fn create_threadpool_runtime() -> tokio::runtime::Runtime {
   #[allow(deprecated)]
   runtime::Builder::new()
     .threadpool_builder(threadpool_builder)
+    .core_threads(num_cpus::get().max(4))
     .build()
     .unwrap()
 }

@bartlomieju (Contributor) commented May 27, 2019

@afinch7 it looks like it's somehow related to #2124, but in that issue it got stuck if you had 1 core_thread.

@afinch7 (Contributor) commented May 27, 2019

@bartlomieju I'm almost certain it's related to that issue; at this point the last debug output I get is "found local source". My patch does seem to work even with one core, though. I guess OS scheduling will eventually end up giving time to another process, whereas tokio will not, because it has to complete the current poll operation before it can change contexts.

@bartlomieju (Contributor) commented May 27, 2019

@afinch7 thanks for the explanation. We should somehow address the problem with the number of core_threads, or it's going to come up again further down the road when we need more of them.

@afinch7 (Contributor) commented May 27, 2019

With tokio core_threads = 2 I get stuck with the first thread polling the closure in the lazy future here:
https://github.com/ry/deno/blob/bd4cfcbff2aa477e635774226f82c08d654809bd/cli/worker.rs#L349-L363
That closure then does a tokio spawn for the block_on call in execute_mod
https://github.com/ry/deno/blob/bd4cfcbff2aa477e635774226f82c08d654809bd/cli/worker.rs#L97-L103
The resulting RecursiveLoad will trigger an op_fetch_module_meta_data and
is unable to allocate a thread to poll DenoDir::fetch_module_meta_data:
https://github.com/ry/deno/blob/bd4cfcbff2aa477e635774226f82c08d654809bd/cli/ops.rs#L499

@ry (Collaborator, Author) commented May 28, 2019

@afinch7 OK, that's a good hint... I'm still not sure what's happening; I kinda think hyper isn't getting properly notified of updates. I have narrowed this problem down to a much smaller patch: #2416

Correct tokio_util::block_on() and op_fetch_module_meta_data
op_fetch_module_meta_data is an op that is used by the TypeScript
compiler. TypeScript requires this op to be sync. However the
implementation of the op does things on the event loop (like fetching
HTTP resources).

In certain situations this can lead to deadlocks. The runtime's thread
pool can be filled with ops waiting on the result of
op_fetch_module_meta_data. The runtime has a maximum number of
threads it can use (the number of logical CPUs on the system).

This patch changes tokio_util::block_on to launch a new Tokio runtime
for evaluating the future, thus bypassing the max-thread problem.

This is only an issue in op_fetch_module_meta_data. Other synchronous
ops are truly synchronous, not interacting with the event loop. TODO
comments are added to direct future development.

@ry force-pushed the ry:compiler_refactor branch from bd4cfcb to e24acec on May 28, 2019

@ry (Collaborator, Author) commented May 28, 2019

Rebased on top of #2416

We should land that one first.

@ry force-pushed the ry:compiler_refactor branch from 2d724ac to b597229 on May 28, 2019

@ry (Collaborator, Author) commented May 28, 2019

@piscisaureus @kitsonk @afinch7 PTAL - it is green now and I think it's ready for merging.

Change tools/fmt_test.py to always download prettier
This is to ensure a fairer test. Also, we were already downloading
from the internet since we changed the URL to use std@v0.5.0. This
change exposes an OOM bug, which is then fixed in the upcoming compiler
refactor by changing the checkJs compiler option to false.

@ry force-pushed the ry:compiler_refactor branch 2 times, most recently from 7a16b7f to f59e920 on May 28, 2019

TS compiler refactor
* Compiler no longer has its own Tokio runtime. Compiler handles one
  message and then exits.

* Uses the simpler ts.CompilerHost interface instead of
  ts.LanguageServiceHost (a rough sketch of this host shape follows
  below).

* Avoids recompiling the same module by introducing a hacky but simple
  `hashset<string>` that stores the module names that have already been
  compiled.

* Removes the CompilerConfig op.

* Removes a lot of the mocking stuff in compiler.ts like `this._ts`. It
  is not useful as we don't even have tests.

* Turns off checkJs because it causes fmt_test to die with OOM.
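A rough sketch of the ts.CompilerHost shape mentioned above, with Deno-style fetching stubbed out. The in-memory sources map and the hard-coded values are hypothetical stand-ins, not the PR's actual code:

import * as ts from "typescript";

// Hypothetical in-memory source cache; in the real compiler this would be
// backed by ops that fetch and cache modules on the Rust side.
const sources = new Map<string, string>([
  ["main.ts", `console.log("hello");`],
]);

const options: ts.CompilerOptions = { checkJs: false };

const host: ts.CompilerHost = {
  getSourceFile(fileName, languageVersion) {
    const text = sources.get(fileName);
    return text === undefined
      ? undefined
      : ts.createSourceFile(fileName, text, languageVersion);
  },
  getDefaultLibFileName: () => "lib.esnext.d.ts",
  writeFile: (fileName, data) => {
    // The real host would hand emitted JS and source maps back to Rust.
    console.log("emit", fileName, data.length);
  },
  getCurrentDirectory: () => "/",
  getCanonicalFileName: (fileName) => fileName,
  useCaseSensitiveFileNames: () => true,
  getNewLine: () => "\n",
  fileExists: (fileName) => sources.has(fileName),
  readFile: (fileName) => sources.get(fileName),
};

const program = ts.createProgram(["main.ts"], options, host);
program.emit();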

@ry force-pushed the ry:compiler_refactor branch from f59e920 to 9f68f85 on May 28, 2019

@afinch7 (Contributor) left a comment

Looks good.

Resolved review comments on cli/compiler.rs, cli/ops.rs, and js/compiler.ts

@ry merged commit 856c442 into denoland:master on May 29, 2019

3 checks passed:

  • Travis CI - Pull Request: Build Passed
  • continuous-integration/appveyor/pr: AppVeyor build succeeded
  • license/cla: Contributor License Agreement is signed.