Only restore Yarn caches on exact key hits #26133

Merged: eps1lon merged 1 commit into main from effective-yarn-cache on Feb 9, 2023

@eps1lon eps1lon (Collaborator) commented Feb 9, 2023

Summary

Current Yarn cache size: 555MB
Used Yarn cache size: 344MB

When we restore a global Yarn cache that isn't keyed to the exact lockfile (i.e. a fallback cache), we might restore packages that are no longer used. When we then run yarn install, Yarn may add new packages to that cache.
For example:

  1. we bump a package version
  2. lockfile changes
  3. cache restore misses for exact key
  4. cache restore hits a prefix (fallback) containing the older version
  5. yarn install adds the new version to the cache

Yarn does not remove unused packages from the global cache. So when we then save the cache, it retains both the old and the new version of the package, even though the old version is no longer used.
This means that the global cache grows indefinitely. Restoring the cache isn't free, so CI install times will degrade over time.
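
The growth described above can be illustrated with a toy model. This is only a sketch: the `simulate` function, the single package named `pkg`, and the set-based "cache" are made up for illustration and don't reflect how Yarn or CircleCI store anything.

```python
def simulate(bumps, restore_fallback):
    """Return the number of package versions in the saved cache after each bump."""
    lockfile = {"pkg": 0}          # hypothetical package -> version pinned in the lockfile
    saved = set(lockfile.items())  # cache saved by the previous CI run
    sizes = []
    for version in range(1, bumps + 1):
        # Steps 1-2: bumping the package changes the lockfile.
        lockfile["pkg"] = version
        # Steps 3-4: the exact key misses; a fallback key restores the previous cache.
        cache = set(saved) if restore_fallback else set()
        # Step 5: the install adds the new version to the cache.
        cache |= set(lockfile.items())
        # Nothing is pruned before the cache is saved again.
        saved = cache
        sizes.append(len(saved))
    return sizes


print(simulate(3, restore_fallback=True))   # [2, 3, 4]
print(simulate(3, restore_fallback=False))  # [1, 1, 1]
```

With the fallback restore, every bump leaves one more stale version behind; with exact hits only, the saved cache always matches the lockfile.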

We can either

  1. periodically prune the cache, or
  2. not restore anything unless we have an exact hit.

Which tradeoff is better depends on the ratio of commits that change deps to commits that don't.
In my experience, we change deps rarely, so I opted to only restore the cache on exact hits.
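
To make the tradeoff a bit more concrete, here is a rough sketch of the average install time with exact-hit-only restores. The function name and all parameters are hypothetical, and any timings plugged in would have to come from real CI runs; none are measured here.

```python
def mean_install_seconds_exact_only(dep_change_ratio, warm_seconds, cold_seconds):
    """Average install time when the cache is restored only on exact key hits.

    dep_change_ratio: fraction of commits that change the lockfile (0.0 to 1.0)
    warm_seconds: install time when the exact cache is restored
    cold_seconds: install time with no cache restored at all
    """
    # Commits that change the lockfile miss the cache and pay for a cold install;
    # all other commits restore a cache that matches the lockfile exactly.
    return dep_change_ratio * cold_seconds + (1 - dep_change_ratio) * warm_seconds
```

The lower `dep_change_ratio` is, the closer the average gets to `warm_seconds`, which is why exact-hit-only restores look attractive when deps change rarely. If deps changed on most commits, periodic pruning of a fallback cache would likely be the better fit.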

How did you test this change?

  - run on main has 555MB of Yarn cache: https://app.circleci.com/pipelines/github/facebook/react/38163/workflows/70d0149e-b0bc-44e8-b8c9-e5c744cab89b/jobs/625334?invite=true#step-102-2
  - run on this branch only has 334MB of Yarn cache: https://app.circleci.com/pipelines/github/facebook/react/38166/workflows/4825d444-1426-4321-b95b-c540e6cdc6d7/jobs/625354?invite=true#step-104-5

@eps1lon eps1lon requested a review from kassens February 9, 2023 16:39
@facebook-github-bot facebook-github-bot added the CLA Signed and React Core Team labels Feb 9, 2023
@kassens kassens (Member) left a comment

Makes sense, thanks!

@eps1lon eps1lon merged commit c0b0b3a into main Feb 9, 2023
@eps1lon eps1lon deleted the effective-yarn-cache branch February 9, 2023 17:00
github-actions bot pushed a commit that referenced this pull request Feb 9, 2023
## How did you test this change?

- run on `main` has 555MB of Yarn cache:
https://app.circleci.com/pipelines/github/facebook/react/38163/workflows/70d0149e-b0bc-44e8-b8c9-e5c744cab89b/jobs/625334?invite=true#step-102-2
- run on this branch only has 334MB of Yarn cache:
https://app.circleci.com/pipelines/github/facebook/react/38166/workflows/4825d444-1426-4321-b95b-c540e6cdc6d7/jobs/625354?invite=true#step-104-5

DiffTrain build for [c0b0b3a](c0b0b3a)
[View git log for this commit](https://github.com/facebook/react/commits/c0b0b3a9f80fd57e882859afd95c2f08599442ba)