Skip to content

fix: Make caching work correctly in async context#320

Merged
styfle merged 4 commits into
vercel:mainfrom
danez-labs:cache-fix
Dec 14, 2022
Merged

fix: Make caching work correctly in async context#320
styfle merged 4 commits into
vercel:mainfrom
danez-labs:cache-fix

Conversation

@danez

@danez danez commented Nov 1, 2022

Copy link
Copy Markdown
Contributor

Right now the cache was only populated after the call to the fs API succeed, which in an async context was leading to multiple file IO or analyze tasks started for the same file at once. This is because both file IO are async and so until the fs call finished other calls for the same file can still come in.

To fix this I extracted all file IO calls into a separate class that wraps everything correctly and instead of caching the actual result, now a Promise is cached. So the cache now always needs to be awaited.

As the values in the cache change this is definitely a breaking change. Something like this:

const cache = {};
await nodeFileTrace([file], {
    cache,
);

-const cacheValue = cache.fileCache.get(file);
+const cacheValue = await cache.fileCache.get(file);

I know this is quite a big change, just let me know what you think and if I should change anything. :)

In my personal example the number of calls to analyze went from ~2400 to ~1600 and the runtime from ~3s to ~2.7s (which includes other work than nft)

@danez danez requested review from ijjk and styfle as code owners November 1, 2022 17:42
Comment thread src/node-file-trace.ts Outdated
@codecov

codecov Bot commented Nov 3, 2022

Copy link
Copy Markdown

Codecov Report

Merging #320 (fa4b781) into main (047f030) will increase coverage by 0.07%.
The diff coverage is 89.65%.

@@            Coverage Diff             @@
##             main     #320      +/-   ##
==========================================
+ Coverage   81.23%   81.31%   +0.07%     
==========================================
  Files          13       14       +1     
  Lines        1503     1509       +6     
  Branches      553      554       +1     
==========================================
+ Hits         1221     1227       +6     
  Misses        115      115              
  Partials      167      167              
Impacted Files Coverage Δ
src/fs.ts 88.46% <88.46%> (ø)
src/node-file-trace.ts 91.16% <100.00%> (+0.85%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Comment thread src/node-file-trace.ts
@styfle styfle changed the title fix!: Make caching work correctly in async context fix: Make caching work correctly in async context Nov 3, 2022

@styfle styfle left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to add a test to ensure this fixes the behavior?

@danez

danez commented Nov 7, 2022

Copy link
Copy Markdown
Contributor Author

I added tests, which fail on the main branch.
The tests ensure, that all the cached functions are called only once for each file.

On main:
Screenshot 2022-11-07 at 16 19 52

On this branch:
Screenshot 2022-11-07 at 16 20 07

Comment thread test/unit.test.js Outdated
Comment thread src/node-file-trace.ts Outdated
@danez danez requested review from styfle and removed request for ijjk November 23, 2022 10:51
Comment thread src/node-file-trace.ts Outdated
Comment thread src/node-file-trace.ts

@styfle styfle left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to make some changes but it looks like I can't push to your branch so I left a few comments

@danez danez force-pushed the cache-fix branch 2 times, most recently from 4b8eccb to 9b7fc6a Compare December 6, 2022 14:10
@danez

danez commented Dec 6, 2022

Copy link
Copy Markdown
Contributor Author

Okay, I revisited everything and instead of what I did before I extracted all file IO relevant functionality into a new class CachedFileSystem.
This makes the diff in node-file-trace a lot smaller, and also removes the need for the _internal variables there.

The change in analyze was also reverted as unnecessary because there is a check at the beginning of emitDependency that ensures it is only called once per file. The new tests also ensure it stays that way.

I think this is a lot cleaner now.

With this now the only breaking change is the values in the file IO caches changed to Promises instead of the actual results.

PS: I do not know why the github UI does not work on this branch. I can also not merge the base branch into this one via the UI.

@danez danez requested a review from styfle December 6, 2022 14:14
BREAKING CHANGE: The file-IO caches now contains promises instead of the actual result.

Previously the cache was only populated after the function call succeed which in a async
context was leading to multiple file IO or analyze tasks started for the same file
at once. Caching the Promise instead of the result makes this problem go away.
Comment thread src/fs.ts Outdated
Comment thread src/fs.ts Outdated
danez and others added 2 commits December 14, 2022 17:09
Co-authored-by: Steven <steven@ceriously.com>
Co-authored-by: Steven <steven@ceriously.com>
Comment thread src/node-file-trace.ts
this.fileCache = cache && cache.fileCache || new Map();
this.statCache = cache && cache.statCache || new Map();
this.symlinkCache = cache && cache.symlinkCache || new Map();
this.analysisCache = cache && cache.analysisCache || new Map();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should analysisCache also go in the CachedFileSystem class?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not used there and right now it is only used in the job class.

If we move the cache we would need to move the analysis functionality there too, which I'm not sure fits into CachedFileSystem

@styfle styfle left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, thanks 🎉

@styfle styfle enabled auto-merge (squash) December 14, 2022 16:30
@styfle styfle requested a review from ijjk December 14, 2022 18:59
@danez

danez commented Dec 14, 2022

Copy link
Copy Markdown
Contributor Author

It seems because the fork I create is in an organization, GitHub does not allow edits from maintainers. 😒https://github.com/orgs/community/discussions/5634

I merged the main branch, hopefully that allows merging the PR :)

@styfle styfle merged commit 1a7f083 into vercel:main Dec 14, 2022
@github-actions

Copy link
Copy Markdown

🎉 This PR is included in version 0.22.6 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants