Description
🔎 Search Terms
"processSourceFile cache"
🕗 Version & Regression Information
- I was unable to test this on prior versions because: I don't think this has changed since it was implemented
⏯ Playground Link
No response
💻 Code
We have a large monorepo with several very large React apps.
🙁 Actual behavior
Rebuilds wind up re-processing lots of files that did not change
🙂 Expected behavior
Files that did not change can reuse their parsing from earlier
Additional information about the issue
It does not appear that --build mode uses caches for file reads and getSourceFile computation between runs.
I've been investigating long delays between the time a user saves a file in their editor and when the tsc process finishes and displays success or errors that it found. In doing some profiling, nearly 70% of the total time was being spent in processSourceFile and its descendants. This was surprising to me because my mental model was that it would have already processed these files and cached the results, and that only the changed files would have to be reevaluated.
From looking at the code, there appear to be two fairly distinct code paths depending on whether --build has been specified. When --build is used, execution ends up in buildNextInvalidatedProjectWorker, which clears the cache of parsed source files after every iteration. This is in contrast with createWatchProgram, which appears to keep a persistent cache of the parsed files and takes care to evict entries from the cache when necessary.
As a naive test, I added caching to createGetSourceFile. It still performs the file read, compares the source file's text against the previous text for the same fileName, and skips the parse if they match (I assume this is flawed for many reasons, including leaking memory for files that have been deleted). This cut the time to finish type checking and print its status by roughly 44% (30.5s to 17s). I assume this can be further improved by avoiding the file reads or string compares using something like file mtime or hashes.
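To make the experiment concrete, here is a minimal sketch of the idea rather than the actual patch. It does not use the real compiler API: createCachingParser, readFile, and parse are hypothetical stand-ins for the wrapper around createGetSourceFile, the host's file read, and the source file parse. Unlike the naive test, it also evicts entries for files that can no longer be read.

```typescript
// Hypothetical sketch: none of these names come from the TypeScript codebase.
type ReadFile = (fileName: string) => string | undefined;
type Parse<T> = (fileName: string, text: string) => T;

interface CacheEntry<T> {
  text: string; // text the cached parse result was produced from
  parsed: T;    // the cached parse result
}

function createCachingParser<T>(readFile: ReadFile, parse: Parse<T>) {
  const cache = new Map<string, CacheEntry<T>>();
  let parseCount = 0;

  function getParsed(fileName: string): T | undefined {
    // The read still happens on every call; only the parse is skipped.
    const text = readFile(fileName);
    if (text === undefined) {
      // File is gone: drop the entry so the cache does not grow forever.
      cache.delete(fileName);
      return undefined;
    }
    const hit = cache.get(fileName);
    if (hit !== undefined && hit.text === text) {
      return hit.parsed; // unchanged text: reuse the earlier parse
    }
    const parsed = parse(fileName, text);
    parseCount++;
    cache.set(fileName, { text, parsed });
    return parsed;
  }

  return {
    getParsed,
    getParseCount: () => parseCount,
    size: () => cache.size,
  };
}
```

Swapping the text comparison for an mtime or content hash check would change only the key stored in the entry, at the cost of trusting that key to reflect every change.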
Would it be possible to improve caching of the parsed source files in --build mode between runs? There appears to be quite a bit of benefit available if it is technically feasible.