Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong cache directory #54

Closed
Milerius opened this issue Nov 15, 2020 · 12 comments
Closed

Wrong cache directory #54

Milerius opened this issue Nov 15, 2020 · 12 comments

Comments

@Milerius
Copy link

Milerius commented Nov 15, 2020

according to:

const vcpkgDir = path.normalize(core.getInput(runvcpkglib.vcpkgDirectory));

it's save a wrong cache (from vcpkg docs):

Currently, our file-based backend is enabled by passing the undocumented --binarycaching flag to any 
Vcpkg command or setting the undocumented environment variable VCPKG_FEATURE_FLAGS to binarycaching. 

We will replace this feature flag with an on-by-default user-wide behavior, plus command line and environment-based configurability.

The on-by-default configuration will specify the file-based archive protocol on either 
%LOCALAPPDATA%/vcpkg/archives (Windows) or $XDG_CACHE_HOME/vcpkg/archives (Unix). 
If XDG_CACHE_HOME is not defined on Unix, we will fall back to $HOME/.cache/vcpkg/archives based 
on the XDG Base Directory Specification. 
This can be redirected with a symlink, or completely overridden with the command line or environment. 
In the future we can also consider having a user-wide configuration file, 
however we do not believe this is important for any of our key scenarios.

https://vcpkg.readthedocs.io/en/latest/specifications/binarycaching/#solution-aspects

for example with your actions every run it's what happens:

  Could not locate cached archive: /home/runner/.cache/vcpkg/archives/b0/b0c723932ac57ff82b8bb08dad40b2b9c5ed146d.zip

  Stored binary cache: /home/runner/.cache/vcpkg/archives/b0/b0c723932ac57ff82b8bb08dad40b2b9c5ed146d.zip

Caching paths: '/home/runner/work/atomicDEX-Desktop/atomicDEX-Desktop/ci_tools_atomic_dex/vcpkg-repo,!/home/runner/work/atomicDEX-Desktop/atomicDEX-Desktop/ci_tools_atomic_dex/vcpkg-repo/packages,!/home/runner/work/atomicDEX-Desktop/atomicDEX-Desktop/ci_tools_atomic_dex/vcpkg-repo/buildtrees,!/home/runner/work/atomicDEX-Desktop/atomicDEX-Desktop/ci_tools_atomic_dex/vcpkg-repo/downloads'
  Running save-cache
  /usr/local/bin/tar --use-compress-program zstd -T0 -cf cache.tzst -P -C /home/runner/work/atomicDEX-Desktop/atomicDEX-Desktop --files-from manifest.txt
  Cache saved successfully

^ we can see that $HOME/.cache/vcpkg/archives or C:\Users\runneradmin\AppData\Local\vcpkg\archives\ is missing in the caching path

would be nice to have a v6 that cache the good directory

I never made typescript in my life but an equivalent in c++ would be :

std::vector<std::filesystem::path> get_cache_paths() 
{
     std::vector<std::filesystem::path> paths;
#if defined(_WIN32) || defined(WIN32)
        const char* localappdata = nullptr;
        
        if (localappdata = std::getenv("LOCALAPPDATA"); localappdata)
        {
              paths.push_back(std::filesystem::path(localappdata) / "vcpkg" / "archives");
        }
#else
       const char* home = nullptr;
       
        if (home = std::getenv("XDG_CACHE_HOME"); home)
        {
              paths.push_back(std::filesystem::path(home) / "vcpkg" / "archives");
        } else if (home = std::getenv("HOME"); home) {
              paths.push_back(std::filesystem::path(home) / ".cache" / "vcpkg" / "archives");
        }
#endif
   return paths;
}
@lukka
Copy link
Owner

lukka commented Nov 15, 2020

@Milerius the run-vcpkg action has been designed for vcpkg "classic", where all the installed ports are installed in $VCPKG_ROOT/installed directory.

The binary caching is a preview feature this action is not currently supporting. Before supporting "modern/future" vcpkg, I'd need to understand in detail how this newly created features (e.g., binary caching and manifest) are working and how to support them.
I'd be glad if the exact and complete specs would be provided in here before I can do it.
If I do it first, I'll share them in here before rushing implementing those, as feedback is always useful.

@strega-nil
Copy link

@lukka binary caching is no longer unstable, and it's the recommended way to do caching nowadays; see the docs here. The readthedocs are usually out of date, since someone has to manually run the script to update them.

@lukka
Copy link
Owner

lukka commented Nov 15, 2020

@strega-nil: good to know! I need time to write a design around it, i.e. how would the flowchart change.

TL;DR: I think it would be helpful to have in vcpkg a command that prints out the result of the vcpkg's cache directory computation logic (i.e. it prints the directory).

Details:
Is it needed to implement an high fidelity replica of the logic described here?
run-vcpkg needs to know whether binary caching is being used or not; and it needs to reliably know what is the directory where the cache needs to be restored into (before running vcpkg), and saved later from. I.e. whatever vcpkg thinks it is the right directory for caching, that is the one run-vcpkg must accurately know as well.

@strega-nil
Copy link

@ras0219 should be able to tell you more :)

@lukka
Copy link
Owner

lukka commented Nov 17, 2020

Please review the following and let me know if you are on the same page.

TL;DR
There is no reason to cache the binary cache directory content on the GitHub cache service, as it will just increase the restore from cache scenario. Continuing caching the $VCPKG_ROOT directory is the most performant solution.

Details
Please remember that run-vcpkg puts into the GitHub cache service basically all of $VCPKG_ROOT (excluding some directories to dramatically reduce the required space). So when the action installs sqlite3 package and the package has been restored from the GitHub cache, it takes <0.1 second.

OTOH, if run-vcpkg is changed to do as requested here, installing sqlite3 (i.e. vcpkg install sqlite3) with the $VCPKG_ROOT/installed directory empty, it would take >3 seconds. The times looks like spent mostly in detecting the "compiler hash for triplet..." (longer on Windows than on Linux, but in both cases it's >1 second), and expanding the archive and writing the new files onto $VCPKG_ROOT/installed directory.

@ras0219-msft
Copy link

In the cache hit scenario, that's the correct TL;DR -- as long as every package is in the installed\ folder, we won't consult the archives. The only way a package would be missing from the installed folder is if the cache didn't hit, in which case the archives would also be missing. There is potentially some improvement available via fallback keys on the archives directory, however I think this is relatively low value.

Instead, we recommend using GitHub Packages[1] to store the binaries via our NuGet backend. This will reuse packages as much as possible and complement the Cache Task to maximize performance and minimize rebuilds.

[1] https://github.com/microsoft/vcpkg/blob/master/docs/users/binarycaching.md#github-packages

@lukka
Copy link
Owner

lukka commented Nov 28, 2020

based on the conclusions, no changes are needed to this action regarding the cached directories.

@lukka lukka closed this as completed Nov 28, 2020
@lukka lukka mentioned this issue Jan 4, 2021
@Be-ing
Copy link

Be-ing commented Aug 6, 2021

There is no reason to cache the binary cache directory content on the GitHub cache service, as it will just increase the restore from cache scenario. Continuing caching the $VCPKG_ROOT directory is the most performant solution.

Caching VCPKG_ROOT is hardly helpful, at least for manifest mode. All it helps with is avoiding needing to bootstrap the vcpkg executable. Caching VCPKG_ROOT/packages, VCPKG_ROOT/buildtrees, VCPKG_ROOT/downloads does not help; vcpkg still does a full rebuild every time (in manifest mode at least).

So when the action installs sqlite3 package and the package has been restored from the GitHub cache, it takes <0.1 second.
OTOH, if run-vcpkg is changed to do as requested here, installing sqlite3 (i.e. vcpkg install sqlite3) with the $VCPKG_ROOT/installed directory empty, it would take >3 seconds. The times looks like spent mostly in detecting the "compiler hash for triplet..." (longer on Windows than on Linux, but in both cases it's >1 second), and expanding the archive and writing the new files onto $VCPKG_ROOT/installed directory.

For these use cases caching is hardly relevant anyway. For complex applications with large dependencies like FFmpeg (30 minute build), wxWidgets (10 minute build), or Qt5 (>2 hour build!), the cost of computing the hash for vcpkg's package cache key is trivial compared to the build time. Running vcpkg install for this dependency list for Tenacity:

  "dependencies": [
    "libflac",
    "lilv",
    "libmad",
    "libogg",
    "libsndfile",
    "soxr",
    "libvorbis",
    "soundtouch",
    "libsbsms",
    "portaudio",
    "portmidi",
    "portsmf",
    "mp3lame",
    "libogg",
    "libvorbis",
    "zlib",
    "expat",
    "sqlite3",
    "ffmpeg",

    "libpng",
    "tiff",
    "zlib"
  ]

takes 9 seconds, which is very fast from my perspective.

There is potentially some improvement available via fallback keys on the archives directory, however I think this is relatively low value.

The benefits of fallback restore keys are huge if an application has even one dependency that takes a long time to build.

I have fixed these issues in #94.

@lukka
Copy link
Owner

lukka commented Aug 6, 2021

@Be-ing Thanks for adding this comment! I'd like to put your comments on the right context first, as I think we are almost on the same page, only some clarifications are still needed.

Caching VCPKG_ROOT is hardly helpful, at least for manifest mode. All it helps with is avoiding needing to bootstrap the vcpkg executable. Caching VCPKG_ROOT/packages, VCPKG_ROOT/buildtrees, VCPKG_ROOT/downloads does not help; vcpkg still does a full rebuild every time (in manifest mode at least).

Caching $VCPKG_ROOT means to save its content except for what is not useful. That is, all but packages, buildtrees and downloads. And please note that when the vcpkg.json manifest mode is used (which is the mode the action and the samples are tailored with) a full rebuild does not take place.

I'd be curious to see what is the run-vcpkg run time in cache-hit scenario on the tenacity project (which is the "restore VCPKG_ROOT scenario except what is not needed"), compared to the time spent to leverage vcpkg's binary caching created archives (and restoring those from cache).

Also, I really appreciated your contribution on #94, which is something I was preparing already but I did not publish yet. So you beat me to it!

@Be-ing
Copy link

Be-ing commented Aug 6, 2021

I'd be curious to see what is the run-vcpkg run time in cache-hit scenario on the tenacity project (which is the "restore VCPKG_ROOT scenario except what is not needed"), compared to the time spent to leverage vcpkg's binary caching created archives (and restoring those from cache).

Without #94 every package is fully rebuilt for any cache miss. The action is hardly useful for my purposes like that.

@Be-ing
Copy link

Be-ing commented Aug 6, 2021

I just realized I got confused by the incorrect information in the README:

          # Ensure the vcpkg artifacts are cached, they are generated in the 'CMAKE_BINARY_DIR/vcpkg_installed' directory.
          additionalCachedPaths: ${{ env.buildDir }}/vcpkg_installed

Manifest mode installs to vcpkg_installed in the root of the repository being built (the same directory as the vcpkg.json file). This directory is outside the CMake build directory.

@Be-ing
Copy link

Be-ing commented Aug 6, 2021

Caching the vcpkg_installed directory in manifest mode is indeed slightly faster that caching the user local binary cache.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants