Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split BinaryProvider into read vs write #998

Merged
merged 12 commits into from
May 10, 2023

Conversation

ras0219-msft
Copy link
Contributor

@ras0219-msft ras0219-msft commented Apr 1, 2023

This is a rewrite of the Binary Caching infrastructure to improve code sharing and reduce coupling.

This will make #908 more tractable because there will be significantly fewer lines to audit.

@autoantwort
Copy link
Contributor

In the past a provider was marked available if the provider pushed a specific package. This was removed here (expand diff required to see the line), resp. in #908 because the call was not thread safe and there is currently no situation were this is needed. It also could be kind of ambiguous because a provider can have multiple destinations and we don't know which destination has a package available.

With #802 the "push_success -> mark_restored flow" will become relevant for the first time because we build packages, will remove them and then maybe need them again. If we don't want to rebuild the package although it is in the cache, we have to mark it as available or remove the unavailable status. In #911 there is now one provider for every destination so that we don't lost the connection between the push and pull providers and are able to mark providers as available for a specific package.

@ras0219-msft
Copy link
Contributor Author

It's a really good point that we need to invalidate the precheck status of any packages that were pushed -- otherwise we would refuse to try to download them.

However, we SHOULD be running .precheck() on every plan -- which will then correctly re-populate status information. I don't think coupling readers to writers really does what we want, because nothing says that pushing to one reader can't enable pulling from another. I think the correct fix (which you observed) is to completely invalidate the status upon push. Then, when we begin to execute a new plan, we run a .precheck() which will reconsider all read providers (in performance order).

@autoantwort
Copy link
Contributor

However, we SHOULD be running .precheck() on every plan

This is currently already the case :)

I don't think coupling readers to writers really does what we want, because nothing says that pushing to one reader can't enable pulling from another.

That is true (but I think this only happens rarely), but at least for the destinations we pushed a package to, we know we can read it back from there. Invalidating also works and is needed for providers to which we don't have pushed a package, but for the other providers that means unnecessary work (but the overhead is probably negligible).

@ras0219-msft ras0219-msft changed the title Draft rewrite of binarycaching.cpp Split BinaryProvider into read vs write May 8, 2023
@ras0219-msft ras0219-msft marked this pull request as ready for review May 8, 2023 21:46
Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the CR comments are nits. A few non-nits:

  • I'm weakly opposed to RemoveFilesystem.
  • The comments describing the interface contracts.
  • Localizing vendor().

include/vcpkg/archives.h Outdated Show resolved Hide resolved
include/vcpkg/base/files.h Outdated Show resolved Hide resolved
include/vcpkg/base/files.h Outdated Show resolved Hide resolved
"StoredBinariesToDestinations": "Stored binaries in {count} destinations.",
"_StoredBinariesToDestinations.comment": "An example of {count} is 42.",
"StoredBinariesToDestinations": "Stored binaries in {count} destinations in {elapsed}.",
"_StoredBinariesToDestinations.comment": "An example of {count} is 42. An example of {elapsed} is 3.532 min.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, elapsed may be bogus when localized?

include/vcpkg/archives.h Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved

struct AssetSourcesParser : ConfigSegmentsParser
std::vector<CacheAvailability> ReadOnlyBinaryCache::precheck(View<InstallPlanAction> actions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's now possible to write tests for this, and that's probably worth it now that this function is doing complicated stuff to map between universes.

src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/base/files.cpp Outdated Show resolved Hide resolved
Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as long as the files.associations thing that looks unintended (?) is fixed.

@BillyONeal
Copy link
Member

I'm sorry, I tried https://github.com/microsoft/vcpkg-tool/assets/1544943/3b3f5fd8-254a-4991-ac1f-ab6bdca97eda

@ras0219-msft ras0219-msft merged commit 8d2923f into microsoft:main May 10, 2023
5 checks passed
@ras0219-msft ras0219-msft deleted the dev/roschuma/bincache branch May 10, 2023 16:39
@ras0219-msft
Copy link
Contributor Author

Items left unaddressed:

  • ReadOnlyBinaryCache::precheck should be more testable now -- we should add more tests
  • Elapsed times are suboptimally localized

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants