Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - chore: switch cache to Lean FRO R2 bucket #9409

Closed
wants to merge 5 commits into from
Closed

Conversation

Kha
Copy link
Collaborator

@Kha Kha commented Jan 3, 2024

This will use a new, fresh cache from this commit onward. We are still working on reviving the old cache for old commits.

Comment on lines +174 to +175
-- TODO: reimplement using HEAD requests?
let _ := overwrite
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am relatively sure this is not critical, but do we want it back at all? There could be a slight benefit of not paying for more costly write operations, but the number of cache put calls should not matter compared to the many more cache get calls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To explain further, the S3 protocol as a distributed storage protocol will always override existing files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do want to avoid overwriting files if possible. The replacement files will never be the same as the ones that are there already because they contain metadata about the commit on which the CI was run, so this will repeatedly overwrite the files, meaning that there is less room for network caching effects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, where is that metadata? I'm only aware of the /c/{githash} files but those are conflict-free

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in the ltar files themselves, added via a command line option in the leantar invocation in cache. I added it because it's useful for debugging when you receive a cache file from somewhere which has something weird in it, to track down the CI run that produced it for anomalies.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Well, HEAD works as a quick existence check if you want to reimplement override that way:

$ curl --head https://mathlib4.lean-cache.cloud/f/be8332708541827a.ltar
HTTP/2 200
date: Wed, 03 Jan 2024 14:13:17 GMT
content-length: 5314
accept-ranges: bytes
etag: "d15505d8f91d3d9772835174e3f4546d"
last-modified: Wed, 03 Jan 2024 13:16:46 GMT
vary: Accept-Encoding
cf-cache-status: DYNAMIC
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=uq8k15i7F7ba5QhhQELrEAMFILVl5rhgpZIzpv9m2inclQUXjHaU%2FESt0EOCNZH5CiaJG3ubQRBhqs8ACwyA5o0FSGkPoG%2FRE7LUhpjROCMZFDe6y0B7XUNRcSFhyOtukYcv57DfTB9mFcYD5rzD1M86JY9%2Fe5Zp"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
server: cloudflare
cf-ray: 83fbd76c7d014d62-FRA
alt-svc: h3=":443"; ma=86400

Doing separate HEAD/PUT of course suffers from TOCTOU but I assume that's "good enough" in this case

Copy link
Member

@digama0 digama0 Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could also avoid a second request by logging this information when we do lake exe cache get at the beginning of the run. The only case where that would differ is when two CI runs race to produce a given file, and in this case it doesn't really matter who wins.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We might even be able to go with mtimes for the sake of simplicity - upload every .olean file newer than the start of the build

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we're uploading .ltar files, not .olean files... but I think cache put already has the information needed there, since it calls cache pack which does not replace existing cache files. The trouble is that there might be a cache file that exists locally on the CI machine, which was not cleared by cache clean because it is useful to skip part of the build, but does not exist in the cloud (somehow); in this case it would see that the file already existed previously and would not upload it, meaning that the file would still not exist in the cloud afterwards. Ideally the system should be able to recover from situations like that.

@digama0
Copy link
Member

digama0 commented Jan 3, 2024

It works! 🎉 I'm going to merge this now since it's time critical, but it would be good to follow up with the other things.

bors r+

@github-actions github-actions bot added the ready-to-merge This PR has been sent to bors. label Jan 3, 2024
@digama0 digama0 marked this pull request as ready for review January 3, 2024 13:42
@digama0
Copy link
Member

digama0 commented Jan 3, 2024

bors r+

mathlib-bors bot pushed a commit that referenced this pull request Jan 3, 2024
This will use a new, fresh cache from this commit onward. We are still working on reviving the old cache for old commits.

Co-authored-by: Mario Carneiro <di.gama@gmail.com>
Comment on lines 29 to 31
/--
info: Try this: rw [@List.length_append, @List.length_append, Nat.two_mul, Nat.add_assoc, Nat.add_left_comm, Nat.add_right_comm, Nat.add_assoc]
-/
Copy link
Member

@digama0 digama0 Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems odd... tracked in #9410

@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Jan 3, 2024

Canceled.

@digama0
Copy link
Member

digama0 commented Jan 3, 2024

bors r+

mathlib-bors bot pushed a commit that referenced this pull request Jan 3, 2024
This will use a new, fresh cache from this commit onward. We are still working on reviving the old cache for old commits.

Co-authored-by: Mario Carneiro <di.gama@gmail.com>
@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Jan 3, 2024

Canceled.

@digama0
Copy link
Member

digama0 commented Jan 3, 2024

bors r+

mathlib-bors bot pushed a commit that referenced this pull request Jan 3, 2024
This will use a new, fresh cache from this commit onward. We are still working on reviving the old cache for old commits.

Co-authored-by: Mario Carneiro <di.gama@gmail.com>
@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Jan 3, 2024

Build failed:

@digama0
Copy link
Member

digama0 commented Jan 3, 2024

bors r+

mathlib-bors bot pushed a commit that referenced this pull request Jan 3, 2024
This will use a new, fresh cache from this commit onward. We are still working on reviving the old cache for old commits.

Co-authored-by: Mario Carneiro <di.gama@gmail.com>
@mathlib-bors
Copy link
Contributor

mathlib-bors bot commented Jan 3, 2024

Pull request successfully merged into master.

Build succeeded:

@mathlib-bors mathlib-bors bot changed the title chore: switch cache to Lean FRO R2 bucket [Merged by Bors] - chore: switch cache to Lean FRO R2 bucket Jan 3, 2024
@mathlib-bors mathlib-bors bot closed this Jan 3, 2024
@mathlib-bors mathlib-bors bot deleted the r2 branch January 3, 2024 15:57
digama0 added a commit that referenced this pull request Jan 3, 2024
mathlib-bors bot pushed a commit that referenced this pull request Jan 4, 2024
As [reported on Zulip](https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/The.20cache.20doesn't.20work/near/411058849), we're still working out some issues in the new cache provider. Since Azure is back online, let's move back to them while we work out the issues with cloudflare. The code is now set up to be able to toggle between both.



Co-authored-by: Mario Carneiro <di.gama@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has been sent to bors.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants