Avoid cloning cached registries #25

Merged
merged 13 commits into julia-actions:main from cv/skip-cached-registry
Jan 5, 2024

omus (Contributor) commented Jan 4, 2024

Depends on:

Fixes:

Moves away from cloning into a temporary directory in order to interact better with julia-actions/cache when registry cloning is enabled. The key is to use a consistent name for the registry so that we can tell the repo has already been downloaded and skip an additional clone. To do this, I've opted to clone the registry into the depot registry directory using the repo name. Since a registry's name can differ from its repo name, I've also added a symlink in the depot registry directory so that both names co-exist.
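As a rough illustration of the approach described above (not the code in this PR), the clone-once-then-symlink idea could be sketched in shell as follows. The `add_registry` helper, its arguments, and the `GIT_CLONE` override are hypothetical names introduced only for this sketch, and the registry name is passed in rather than read from `Registry.toml`.

```shell
# Hypothetical helper: clone a registry into the depot only if it is not
# already there, then expose it under its registry name via a symlink.
#   $1 = depot registries directory (e.g. "$JULIA_DEPOT_PATH/registries")
#   $2 = repository URL
#   $3 = registry name, which may differ from the repo name
add_registry() {
  local registries_dir="$1"
  local url="$2"
  local registry_name="$3"
  local repo_name
  repo_name=$(basename "$url" .git)
  local link="$registries_dir/$registry_name"

  mkdir -p "$registries_dir"
  if [ -d "$registries_dir/$repo_name" ]; then
    # A directory with the consistent (repo) name means a cache restored it.
    echo "skip: $repo_name already present"
  else
    # GIT_CLONE is an assumed override hook for dry runs; defaults to git.
    ${GIT_CLONE:-git clone} "$url" "$registries_dir/$repo_name"
    echo "cloned: $repo_name"
  fi

  # Make the registry reachable under its registry name as well.
  if [ "$registry_name" != "$repo_name" ] && [ ! -L "$link" ] && [ ! -e "$link" ]; then
    ln -s "$repo_name" "$link"
  fi
}

# Example call (not executed here):
#   add_registry "$JULIA_DEPOT_PATH/registries" https://github.com/JuliaRegistries/General General
```

Keying the check on the repo name is what lets a restored cache be recognized before anything inside the registry has been read.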

Luckily, Julia is smart enough to update only one of the two entries rather than both. For example:

❯ mkdir -p /tmp/julia-depot/registries

❯ cd /tmp/julia-depot/registries

❯ export JULIA_DEPOT_PATH=/tmp/julia-depot

❯ git clone https://github.com/JuliaRegistries/General
Cloning into 'General'...
remote: Enumerating objects: 792384, done.
remote: Counting objects: 100% (11484/11484), done.
remote: Compressing objects: 100% (769/769), done.
remote: Total 792384 (delta 10868), reused 11214 (delta 10711), pack-reused 780900
Receiving objects: 100% (792384/792384), 230.00 MiB | 35.60 MiB/s, done.
Resolving deltas: 100% (432024/432024), done.
Updating files: 100% (42228/42228), done.

❯ ln -s General GeneralSymlink

❯ julia -e 'using Pkg; Pkg.Registry.update()'
    Updating registry at `/tmp/julia-depot/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General`

❯ rm GeneralSymlink

❯ ln -s General 1-GeneralSymlink

❯ julia -e 'using Pkg; Pkg.Registry.update()'
    Updating registry at `/tmp/julia-depot/registries/1-GeneralSymlink`
    Updating git-repo `https://github.com/JuliaRegistries/General`

❯ julia-1.6 -e 'using Pkg; Pkg.Registry.update()'
    Updating registry at `/tmp/julia-depot/registries/1-GeneralSymlink`
    Updating git-repo `https://github.com/JuliaRegistries/General`

omus (Contributor, Author) commented Jan 4, 2024

Uncovered a performance issue while testing this against a real workflow. The getRegistryName function now parses the General/Registry.toml file, which is only 929K at the time of writing, but parsing it can take over 4 minutes :(. The toml package has a related issue, but it's old enough that I doubt we'll get a proper fix: BinaryMuse/toml-node#39
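One possible way around the slow parse, sketched here in shell and not necessarily what this PR ends up doing, is to read only the top-level `name` key instead of parsing the whole file. This assumes `name` is a single-line basic string that appears before the first table header; it is not a TOML parser, and `registry_name` is a hypothetical helper.

```shell
# Hypothetical helper: print the top-level `name` value of a Registry.toml
# without parsing the rest of the file.
#   $1 = path to Registry.toml
registry_name() {
  awk '
    /^\[/ { exit }                     # first table header: top-level keys are over
    /^name[ \t]*=/ {
      sub(/^name[ \t]*=[ \t]*"/, "")   # drop the key and the opening quote
      sub(/".*$/, "")                  # drop the closing quote and anything after it
      print
      exit                             # stop reading as soon as the name is found
    }
  ' "$1"
}
```

Because awk exits at the first match, the cost no longer depends on how many packages the registry lists.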

omus (Contributor, Author) commented Jan 4, 2024

Note that Julia seems to keep track of the registry URLs it has already updated; there isn't any symlink-specific logic:

❯ mkdir -p /tmp/julia-depot2/registries

❯ cd /tmp/julia-depot2/registries

❯ export JULIA_DEPOT_PATH=/tmp/julia-depot2

❯ git clone https://github.com/JuliaRegistries/General General1
Cloning into 'General1'...
remote: Enumerating objects: 792469, done.
remote: Counting objects: 100% (12926/12926), done.
remote: Compressing objects: 100% (492/492), done.
remote: Total 792469 (delta 12567), reused 12681 (delta 12433), pack-reused 779543
Receiving objects: 100% (792469/792469), 229.36 MiB | 29.74 MiB/s, done.
Resolving deltas: 100% (432253/432253), done.
Updating files: 100% (42240/42240), done.

❯ git clone https://github.com/JuliaRegistries/General General2
Cloning into 'General2'...
remote: Enumerating objects: 792469, done.
remote: Counting objects: 100% (13033/13033), done.
remote: Compressing objects: 100% (846/846), done.
remote: Total 792469 (delta 12363), reused 12761 (delta 12183), pack-reused 779436
Receiving objects: 100% (792469/792469), 230.05 MiB | 34.76 MiB/s, done.
Resolving deltas: 100% (432061/432061), done.
Updating files: 100% (42240/42240), done.

❯ julia -e 'using Pkg; Pkg.Registry.update()'
    Updating registry at `/tmp/julia-depot2/registries/General1`
    Updating git-repo `https://github.com/JuliaRegistries/General`
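
The two sessions in this thread are consistent with a simple model: registries are visited in name order, and any registry whose repo URL has already been updated is skipped. The sketch below only models that observed behavior. It is not Pkg's implementation, and both the sorted visiting order and the `plan_updates` helper are assumptions.

```shell
# Hypothetical model: read "registry_dir<TAB>repo_url" lines on stdin and
# print the registry directories that would actually be updated.
plan_updates() {
  # Visit entries in byte order (assumed), keeping only the first per URL.
  LC_ALL=C sort | awk -F '\t' '!seen[$2]++ { print $1 }'
}

# Example call (not executed here):
#   printf 'General1\thttps://github.com/JuliaRegistries/General\nGeneral2\thttps://github.com/JuliaRegistries/General\n' | plan_updates
```

Under this model a symlink and a second full clone behave the same way, since only the URL is compared.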

omus (Contributor, Author) commented Jan 4, 2024

Validated that these changes work in a private workflow.

omus marked this pull request as ready for review January 4, 2024 20:51
omus changed the title from "Avoid caching cloned repositories" to "Avoid cloning cached repositories" Jan 4, 2024
omus changed the title from "Avoid cloning cached repositories" to "Avoid cloning cached registries" Jan 4, 2024
omus (Contributor, Author) commented Jan 5, 2024

I've validated that the merge commit works in a private workflow.

christopher-dG merged commit 92feccc into julia-actions:main Jan 5, 2024
omus deleted the cv/skip-cached-registry branch January 9, 2024 01:25

3 participants