Add caching to improve performance of rebuilds in --build mode #60701

Closed as not planned
@mfedderly

Description

🔎 Search Terms

"processSourceFile cache"

🕗 Version & Regression Information

  • I was unable to test this on prior versions because: I don't think this has changed since it was implemented

⏯ Playground Link

No response

💻 Code

We have a large monorepo with several very large React apps.

πŸ™ Actual behavior

Rebuilds end up re-processing many files that did not change.

🙂 Expected behavior

Files that did not change reuse their parse results from the previous run.

Additional information about the issue

It does not appear that --build mode uses caches for file reads and getSourceFile computation between runs.

I've been investigating long delays between the time a user saves a file in their editor and the time the tsc process finishes and reports success or the errors it found. In profiling, nearly 70% of the total time was spent in processSourceFile and its descendants. This surprised me: my mental model was that these files had already been processed and cached, so only the changed files would need to be reevaluated.

From looking at the code, it appears that there are two fairly distinct codepaths depending on whether --build has been specified. When --build is used, it ends up in buildNextInvalidatedProjectWorker, which clears the cache of parsed source files after every iteration. This is in contrast with createWatchProgram, which appears to keep a persistent cache of the parsed files and takes care to evict entries from the cache when necessary.

As a naive test, I added caching to createGetSourceFile. It still performs the file read, compares the source file's text against the previous text for this fileName, and skips the parse if they match. (I assume this is flawed for many reasons, including leaking memory for files that have been deleted.) This cut the time to finish type checking and print its status by roughly 44% (30.5s to 17s). I assume this can be improved further by avoiding the file reads or string compares, using something like file mtime or hashes.
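For illustration, the naive experiment can be modelled in isolation roughly as follows. This is a minimal sketch, not the patch I applied and not the compiler's code: createCachedGetSourceFile, readFile, parse, and CacheEntry are hypothetical names, and parse stands in for whatever does the expensive work.

```typescript
// Hypothetical stand-ins: none of these names come from the TypeScript compiler.
type Parser<T> = (fileName: string, text: string) => T;

interface CacheEntry<T> {
  text: string; // file contents seen on the previous run
  parsed: T;    // parse result produced from those contents
}

function createCachedGetSourceFile<T>(
  readFile: (fileName: string) => string | undefined,
  parse: Parser<T>,
) {
  const cache = new Map<string, CacheEntry<T>>();

  function getSourceFile(fileName: string): T | undefined {
    // The file is still read on every call; only the parse is skipped.
    const text = readFile(fileName);
    if (text === undefined) {
      // The file is gone: drop its entry so it is not kept alive.
      cache.delete(fileName);
      return undefined;
    }
    const cached = cache.get(fileName);
    if (cached !== undefined && cached.text === text) {
      return cached.parsed; // unchanged text: reuse the earlier parse
    }
    const parsed = parse(fileName, text);
    cache.set(fileName, { text, parsed });
    return parsed;
  }

  return { getSourceFile, size: () => cache.size };
}
```

Note that this sketch only evicts a deleted file when it is requested again; a file that is deleted and never requested again would stay in the map, which is the leak mentioned above.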

Would it be possible to improve caching of the parsed source files in --build mode between runs? There appears to be quite a bit of benefit available if it is technically feasible.

Labels

Working as Intended: The behavior described is the intended behavior; this is not a bug