Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deno test is slow (pref: Dynamic imports) #2789

Closed
bartlomieju opened this issue Aug 16, 2019 · 13 comments
Closed

deno test is slow (pref: Dynamic imports) #2789

bartlomieju opened this issue Aug 16, 2019 · 13 comments

Comments

@bartlomieju
Copy link
Member

During work on denoland/std#516 it became apparent that dynamic imports have performance problems. Those problems occur if you import TypeScript file that needs to be compiled.

Case study:
Let's focus on example of deno_std as it's a good testing subject. Test runner looks for all files matching certain pattern and imports them dynamically using import() syntax.
Let's look at flags module; it has around 10 test files.

//deno_std/flags/kv_short_test.ts
// Copyright 2018-2019 the Deno authors. All rights reserved. MIT license.
import { test } from "../testing/mod.ts";
import { assertEquals } from "../testing/asserts.ts";
import { parse } from "./mod.ts";

test(function short(): void {
  const argv = parse(["-b=123"]);
  assertEquals(argv, { b: 123, _: [] });
});

test(function multiShort(): void {
  const argv = parse(["-a=whatever", "-b=robots"]);
  assertEquals(argv, { a: "whatever", b: "robots", _: [] });
});

As you can see this file imports test and assertEquals from testing module and parse from flags. Every test file in flags module imports them. So when we encounter dynamic import of such file following steps occur:

  1. encountered dynamic import
  2. import is ".ts" file so TS compiler worker is spawned
  3. Deno requests kv_short_test.ts to be compiled
  4. TS compiler loads kv_short_test.ts and calls resolveModuleNames for imports
  5. for each import statement:
    a) imported module is requested by TS compiler
    b) Deno looks for that file and sends its contents to TS compiler
    c) TS compiler processes the file and does points 4-5 for it
  6. File and all it's deps and compiled and output is cached by Deno
  7. TS compiler worker exits
  8. Deno loads compiled version of kv_short_test.ts and dynamic import is resolved.

The problem is that after compiling and importing kv_short_test.ts dynamic import for next file will go through exactly same steps. That means that for each file TS compiler will request and compile files like testing/mod.ts, testing/asserts.ts and flags.mod.ts. So if we have 10 test files in the module then TS compiler would compile flags and testing modules 10 times.

Note that dynamic imports use the same compiler pipeline as static imports.

Possible solution:
I'd expect TS compiler to have some kind of cache that it can request files from to avoid recompiling the same files again and again. We would wire to that cache and add an op that would look for already compiled module (the action that TsCompiler.get_compiled_module already does).

CC @ry @kitsonk @piscisaureus

@kitsonk
Copy link
Contributor

kitsonk commented Aug 17, 2019

Requesting all potential imports makes sense when you are compiler a program with TypeScript, as it needs to ensure that all possible code would be valid, is transpiled, and doesn't have any type issues.

I'd expect TS compiler to have some kind of cache that it can request files from to avoid recompiling the same files again and again.

It kind of sort of does, but that is really up to the compiler host to implement. Previous versions of the compiler did have some sort of caching, but one of the problems is that TypeScript doesn't know what the "filename" of a module is until it calls resolveModules(), which in the case of Deno, returns the full module. TypeScript just keeps asking questions, and it is up for the host to figure out how to keep those questions efficient. If TypeScript knows that one module equals another and that is already in memory, it does cache the parse, etc... It just doesn't cache anything at the "host" level.

Also, when we moved from AMD to ESM in the compiler, we also moved from single file transpile to program transpile. This means that TypeScript now tries to resolve everything upfront. Going back to single file would mean that a module wouldn't need to be transpiled until it is requested from the runtime, but it does mean that lazily loading the compiler would need to be repeatedly done if there were lots of dynamic imports and if spun down, TypeScript would lose its state and likely have to request a whole new load of source files in order to do the compiling.

@bartlomieju
Copy link
Member Author

Does is even make sense to type check dynamic imports? It's strictly runtime thing to import module dynamically and if there's type mismatch on TS level then final effect would be the same (error from TS compiler vs runtime error from V8). Maybe in case of dynamic imports we could use ts.transpileModule instead of ts.compileProgram.

I can see potential problem in that scenario (please correct me if I'm wrong): output of transpileModule might be different than from createProgram. So if you first import that module dynamically and then import it statically on another run Deno would serve cached version of that file. If there's a difference in the output it might lead to unexpected behavior.

@kitsonk
Copy link
Contributor

kitsonk commented Aug 19, 2019

I don't think it is strictly a runtime thing. People would expect to consume modules in a type safe way, especially if they are TypeScript modules and enforce the checking. Just having V8 throw runtime errors, you would argue why have TypeScript at all then. Plus not every TypeScript error would be a JavaScript runtime error.

There should be no type direct emits in TypeScript. The emit should always be the same. The risk with transpileModule though is that it inherently doesn't catch all the errors that createProgram does.

@bartlomieju
Copy link
Member Author

Pasting conversation from Gitter so it won't get lost:

@bartlomieju: TypeScript 3.6 ships with a feature that may be a solution to our problems with dynamic import performance - API for incremental builds and ability to save the state of compiler https://devblogs.microsoft.com/typescript/announcing-typescript-3-6/#apis-to-support-build-and-incremental

@kitsonk:
looking a bit more at that, I had noticed the oldProgram parameter before but didn't know what was up with that, now those APIs are exposed... there might be something to it... we had wanted a way to "warm" the compiler (ideally for snapshotting) but it might be we generate a baseline empty "old program" to hydrate all the underlying runtime environment (the Deno lib and the other libs). Certainly something to play with.

@bartlomieju:

we had wanted a way to "warm" the compiler (ideally for snapshotting) but it might be we generate a baseline empty "old program" to hydrate all the underlying runtime environment (the Deno lib and the other libs).

We'd probably could go further and save incremental states of compiler for each run - that would ensure that dynamic imports only import files that are "new", mitigating current problem with recompilation of same files. But certainly having "base" state of TS compiler would improve compilation times anyway

@ry ry changed the title perf: Dynamic imports deno test is slow (pref: Dynamic imports) Oct 8, 2019
@ry ry mentioned this issue Oct 8, 2019
43 tasks
@ry
Copy link
Member

ry commented Oct 8, 2019

When merging deno_std into the main repo, I found that running "deno test" is prohibitively slow due to this dynamic import issue. As a consequence I could not simply add it as a normal integration test (cli/tests/integration_tests.rs) and had to instead add three new github action builds so that it could be run in parallel to the rest of the tests.

See #3093

@kitsonk
Copy link
Contributor

kitsonk commented Oct 12, 2019

I was just thinking about this. Specifically related to deno test, we should consider having Rust collect all the test modules from the file system and passing them in as the root names to the compiler. What I think is currently happening is the following:

  • We push std/testing/runner.ts onto the root modules stack.
  • We fetch std/testing/runner.ts which reads the target directory to glob for the test files.
  • We then await import(testModule), where the module isn't in the cache.
  • We therefore spin up the compiler for each module, send it a message to transpile the module, which then puts it in the cache.

While spinning up the compiler should be quicker, and we need to tackle that, we still have a challenge with dynamic imports that aren't statically analysable. So to work around specifically this issue with deno test I think we should do something like this.

  • We push std/testing/runner.ts onto the root modules stack.
  • We do the logic of findTestModules() from testing/runner.ts in Rust, pushing any matches to the root modules stack.
  • The compiler will then transpile std/testing/runner.ts and any test modules globbed, populating the cache.
  • When we load the transpiled std/testing/runner.js into the runtime isolate, it will reglob for the test files and dynamically import them, but they will already be transpiled in the cache, therefore the compiler won't be spun up.

Two other thoughts of things we might want to do here:

  • Pass the array of globbed test files into std/testing/runner.ts via the environment or some other mechanism, so it doesn't have to reglob them from the file system.
  • Don't serially import the test modules in std/testing/runner.ts. Collect the import(testModule) promises and then await Promise.all() the collected promises. The way the code is written at the moment, it is in an async iterator, coupled then with awaiting the dynamic import, there is A LOT of promises knocking about, and promises always get resolved out of turn, so there are lots of microtasks being scheduled and other stuff. While the code would also be a bit less readable, it maybe worth unwrapping findTestModules() too and collecting its promises into an array of strings which can be mapped to the dynamic import which can then be a chain to another Promise.all of the dynamic imports.

Anyways we can do a lot more before spinning up the compiler performance actually becomes a serious bottleneck again for this.

@bartlomieju
Copy link
Member Author

What I think is currently happening is the following:

  • We push std/testing/runner.ts onto the root modules stack.
  • We fetch std/testing/runner.ts which reads the target directory to glob for the test files.
  • We then await import(testModule), where the module isn't in the cache.
    We therefore spin up the compiler for each module, send it a message to transpile the module, which then puts it in the cache.

That's right, the problem is that building next files we spend most on analyzing the same modules over and over, see first comment describing this behavior.

While spinning up the compiler should be quicker, and we need to tackle that, we still have a challenge with dynamic imports that aren't statically analysable. So to work around specifically this issue with deno test I think we should do something like this.

  • We push std/testing/runner.ts onto the root modules stack.
  • We do the logic of findTestModules() from testing/runner.ts in Rust, pushing any matches to the root modules stack.
  • The compiler will then transpile std/testing/runner.ts and any test modules globbed, populating the cache.
  • When we load the transpiled std/testing/runner.js into the runtime isolate, it will reglob for the test files and dynamically import them, but they will already be transpiled in the cache, therefore the compiler won't be spun up.

Spinning up the compiler is actually pretty fast, the slowest part is actual compilation by TS (program.emit()). I really don't like moving logic to Rust, this whole situation can be easily (although hacky) solved in runner itself. Instead of importing all files one by one we can just create file with all the found modules and dynamically load only that file. I bet that would cut run time of runner by 75% (because all test deps will be loaded and compiled only once). That said I still a feeling that we could improve our TS compiler host. During lifecycle of the program besides compiling initial entry point we might encounter dynamic imports. I do believe that after initial compilation we should store the state of TS compiler in Rust (so serialize and send state of compiler before exiting compiler worker) and then "rehydrate" the compiler during dynamic imports. I'm not so good in TS internals but I might be thinking about createIncrementalProgram and tsBuildInfoFile setting for compiler. WDYT?

@kitsonk
Copy link
Contributor

kitsonk commented Oct 13, 2019

Yeah, the problem with incremental is there is no way to serialize the state. There is building, which is still worth investigating but all of these assume the compiler keeps running with its state. I think it is going to be complex.

The other option is that for nom statically analysable imports is that we do single module transpiles with no default lib checking. It increases the risks of unsound code being run, and throwing runtime errors, but if you are doing large amounts of dynamic imports it seems like an acceptable tradeoff. It just means TypeScript won't catch all potential type errors.

@bartlomieju
Copy link
Member Author

With compiler upgrade (🔥) #3644 I think we can close this issue for now

@nayeemrmn
Copy link
Collaborator

@bartlomieju Does #3344 still make a significant difference?

@bartlomieju
Copy link
Member Author

@nayeemrmn good question, feel free to remove it and let's see CI test times

@ghost
Copy link

ghost commented May 9, 2021

Dynamic imports are still extremly slow :/
Even if the script has just 10 lines and is local and is vanilla javascript it takes from 5 to 10 seconds.
Even when i pre import them like this:

import { cheerio } from 'https://deno.land/x/cheerio/mod.ts';

window.jQuery = window.$ = window.cheerio = (await import('https://deno.land/x/cheerio/mod.ts')).cheerio;

Thats something i observerd generally: workers and dynamic imports do not seem to be affected by deno cache --reload also the DENO_DIR=/local/path is ignored by the run command, so pretty useless. I always operate on the local cache and when i change something on the remote file have to go with rm -rf ~/.cache/deno/deps/https/example.org/ ~/.cache/deno/gen/https/example.org/.

@bartlomieju
Copy link
Member Author

@einicher please open a new issue with your problem's description

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants