Update of local registry with new recipes #566

Closed
surak opened this issue Jul 18, 2022 · 16 comments

@surak
Contributor

surak commented Jul 18, 2022

The shpc update seems to update only existing recipes, but to have all the latest ones from you, one needs to get a newer version of shpc, correct?

What I mean is newer recipes, which would not show up in shpc show until you get a newer shpc.

Would it make sense to have an "update" setting which would bring in new recipes as well? What do you think?

@vsoch
Member

vsoch commented Jul 18, 2022

Would it not work to just pull the latest from the repository? Otherwise we would need to somehow unlink the two.

@surak
Contributor Author

surak commented Jul 18, 2022

That works when you are working on a git repo, not with the working release numbers. But if other sites have no problem with that, it’s fine.

@vsoch
Member

vsoch commented Jul 18, 2022

Oh that's a good point! So hmm - should we have another update command that serves to just update the main registry directory via a clone of some reference? Would you want to only have the new container.yaml or to overwrite existing ones too?

@surak
Contributor Author

surak commented Jul 19, 2022

To update existing ones, the command is there, right? Although there's a hack in the documentation about iterating over them, that seems less than ideal.

What do you think about the Debian model: "update" brings info about the latest stuff, "upgrade" does in fact touch your files.
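
To make the distinction concrete, here is a minimal sketch of the two behaviours, assuming a registry is simply a directory tree in which each recipe is a folder holding a container.yaml. The function names `list_recipes`, `update`, and `upgrade` are hypothetical illustrations, not shpc's actual API.

```python
import shutil
from pathlib import Path
from typing import List, Set


def list_recipes(registry: Path) -> Set[str]:
    """Return recipe names: directories (relative to the registry) holding a container.yaml."""
    return {
        p.parent.relative_to(registry).as_posix()
        for p in registry.rglob("container.yaml")
    }


def update(local: Path, upstream: Path) -> List[str]:
    """Debian-style 'update': report recipes that exist upstream but not locally.

    Nothing on disk is modified.
    """
    return sorted(list_recipes(upstream) - list_recipes(local))


def upgrade(local: Path, upstream: Path) -> List[str]:
    """Debian-style 'upgrade': copy the recipes reported by update() into the local registry."""
    added = update(local, upstream)
    for name in added:
        # dirs_exist_ok tolerates a pre-existing (empty) recipe folder
        shutil.copytree(upstream / name, local / name, dirs_exist_ok=True)
    return added
```

In this sketch existing recipes are never overwritten; only missing ones are added, which keeps `update` safe to run at any time.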

@vsoch
Member

vsoch commented Jul 19, 2022

This is a great idea @surak ! I'll work on a PR for you tonight!

@vsoch
Member

vsoch commented Jul 19, 2022

okay I have a WIP pull request:

#567

I still need to add proper testing - going out for a run but will try to do that tonight! This update is kind of neat because I represent a new class for a registry, and it can be local or remote. I'm using this new remote class -> GitHub provider to manage the new upgrade command. Since we only support local filesystem registries for shpc to directly use at the moment (e.g., the registry directory and registry list in settings) we don't actually need to check if a remote exists for the ones found in the settings.yaml. But...

There is this really cool idea that if we do want to support an entirely remote (on GitHub) registry, then (as long as we are able to represent some of the structure in the container.yaml, e.g., the template files) we could do all interactions over the web. And then we could actually gut out the registry from shpc entirely, and just update it at a different GitHub repository. E.g., it would be like:

  • shpc install
  • find https://github.com/singularityhub/shpc-registry in registries
  • load in remote provider
  • look for container.yaml in remote provider (defaulting to main branch)
  • If container.yaml is found, get metadata via requests.get, and any associated / needed files the same way (likely I'd need to update logic to find templates, etc., to be done via the registry class)
  • install as you would before!
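
The lookup steps in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard library's `urllib` rather than `requests.get`, it assumes each recipe lives at `<recipe>/container.yaml` relative to the repository root, and the names `raw_url` and `fetch_recipe` are hypothetical, not part of shpc.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

RAW_BASE = "https://raw.githubusercontent.com"


def raw_url(repo_url, recipe, filename="container.yaml", branch="main"):
    """Map a GitHub repository URL and a recipe name to a raw-file URL."""
    owner_repo = repo_url.rstrip("/").split("github.com/", 1)[1]
    if owner_repo.endswith(".git"):
        owner_repo = owner_repo[: -len(".git")]
    # Assumed layout: <recipe>/<filename> at the repository root
    return f"{RAW_BASE}/{owner_repo}/{branch}/{recipe}/{filename}"


def fetch_recipe(repo_url, recipe, branch="main", opener=urlopen):
    """Return the text of a recipe's container.yaml, or None if the remote has none.

    `opener` is injectable so the function can be exercised without network access.
    """
    url = raw_url(repo_url, recipe, branch=branch)
    try:
        with opener(url) as response:
            return response.read().decode("utf-8")
    except HTTPError as err:
        if err.code == 404:
            return None  # recipe not present in this registry
        raise
```

For example, `raw_url("https://github.com/singularityhub/shpc-registry", "python")` builds the address of the main-branch container.yaml for a recipe named `python`; template files could be fetched the same way by passing a different `filename`.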

And if a user uses an entirely remote registry, they wouldn't technically ever need to run update or upgrade - the remote would always be updated. And only if they decided to clone a local filesystem registry then they might have use for the command, and we'd want to refactor it so that it only looks for updates on any filesystem registries found in settings.yaml.

Ping @muffato and @marcodelapierre - I would be interested in your thoughts here! For a center that wants to just use the entire registry locally (no internet pings) they could just clone it and add the path to the registry list in their settings.

@vsoch
Member

vsoch commented Jul 20, 2022

OK - I have a design I think I like to start? I've removed the WIP from #567 - I have basic tests but to be fair I don't have a good way to fully test the upgrade from a remote (instead I test upgrade from a local). I'm thinking we might want a dummy remote repository to test from, or actually I could just use shpc here. I'm too tired tonight to do that, so it's on my TODO for tomorrow. Even without that, there should be enough to take a look!

Update: just kidding I added the final test - I have extreme will-power even when sleepy! 😪

@muffato
Contributor

muffato commented Jul 20, 2022

@vsoch . Here we have extracted the registry from https://github.com/singularityhub/singularity-hpc/ , pruned it to the software and versions (tags) we want to have on disk, and have copied it to an internal gitlab repository. This way, our registry represents the entire software stack we have installed through shpc. Our SOP for adding/updating a container.yaml and deploying it actually involves 1) updating a git clone of this repository, 2) rsync-ing it to another location that is configured as the shpc registry, and 3) the actual shpc install. I'm not entirely sure why the rsync is necessary. Without it we could as well make shpc read from the gitlab repository directly (because its visibility is "public", so no login is required). On that last point, if you want to support on-the-fly https access, you may have to deal with credentials, whereas for git clone, you could require people to use ssh + keys so that shpc wouldn't have to deal with login details itself.

@vsoch
Member

vsoch commented Jul 20, 2022

Oh interesting! That's a lot of steps - I wonder if we could reduce?

For the current upgrade implementation - we are doing a git clone (from the repository here) so are you saying that wouldn't be allowed from your other location? If you aren't interested in having new recipes (since you are using a curated set) you probably wouldn't be interested in this new upgrade command. But if upgrade had an option to say "Just get me the updates for the ones I do have installed, don't add new ones" would that work for your use case (e.g., and you'd be allowed to clone to /tmp?) Or more generally I can ask - does your final location not allow any kind of internet access?

How are you pruning it?

@marcodelapierre
Contributor

Here at Pawsey Centre I have done the following:

  1. pip install versioned SHPC
  2. git clone and prune SHPC registry from github; still doing it in a versioned way for reproducibility purposes
  3. add second, site-specific, version controlled, registry in SHPC settings, to allow add/edit recipes that are not (yet) in the main SHPC repo

That being said, I like the idea of also being able to refer to a remote registry!

@vsoch
Member

vsoch commented Jul 21, 2022

Ah - so it sounds like the "prune" operation is being done by both of you (@marcodelapierre and @muffato) - I wonder if we need a streamlined way to do that? Assuming a non-air-gapped system, I could see something along the lines of (for an existing install)

  • pip install versioned shpc
  • (the update case) be able to have a registry (separate) from shpc where you can say "update the containers I have here from the remote"
  • (the fresh install case) be able to init a registry (also separate) from shpc where you can say "only add the containers I am specifying in this set".

And @marcodelapierre - if we have an "update this local registry from remote" you could easily add your second site-specific registry. I would even say they shouldn't be required to be in the main shpc repo (but if they are share-able definitely welcome!)

Could you share what you do when you prune? Is it a manual clone and delete stuff?

@marcodelapierre
Contributor

@vsoch when you say "containers" above, do you mean the recipes or the actual installations?
would it be good for shpc to control both refreshing of the recipes and re-installations for refreshed recipes? and to control these two steps independently (i.e. only do the first, then do the second, or do both)

also, in the fresh install case point above, are you saying to only init a registry with a subset of a remote registry? I don't see the benefit in this case, the registry is small, no harm in always mirroring the full one? or maybe I misunderstood?

@marcodelapierre
Contributor

marcodelapierre commented Jul 21, 2022

my setup process is versioned on github, here we go:

https://github.com/PawseySC/pawsey-spack-config/blob/main/setonix/setup_scripts/setup_shpc.sh#L27-L37

when I said prune, I meant that I git clone the shpc github repo, and only retain the registry directory.

as regards my second site-specific registry, it is indeed maintained here (there are just a few specific things - and note I haven't updated it to use overrides yet!): https://github.com/PawseySC/pawsey-spack-config/tree/main/setonix/shpc_registry
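
For readers following along, the prune step described in this comment (clone the repository, keep only the registry directory) might look something like the sketch below. It is not a transcription of the linked setup script; the function names `retain_registry` and `clone_and_prune` and the `ref` parameter are hypothetical, and it assumes `git` is available on the PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def retain_registry(clone_dir, dest, subdir="registry"):
    """Copy only <clone_dir>/<subdir> to dest (which must not exist yet)."""
    source = Path(clone_dir) / subdir
    if not source.is_dir():
        raise FileNotFoundError(f"no '{subdir}' directory in {clone_dir}")
    return Path(shutil.copytree(source, dest))


def clone_and_prune(repo_url, dest, ref=None):
    """Shallow-clone repo_url, optionally at a tag or branch, and keep only the registry."""
    with tempfile.TemporaryDirectory() as tmp:
        cmd = ["git", "clone", "--depth", "1"]
        if ref:
            # Pinning a tag keeps the registry versioned for reproducibility
            cmd += ["--branch", ref]
        subprocess.run(cmd + [repo_url, tmp], check=True)
        return retain_registry(tmp, dest)
```

The resulting directory can then be listed as a filesystem registry in the shpc settings, alongside any site-specific registry.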

@vsoch
Member

vsoch commented Jul 21, 2022

Oh I mean just the recipes, sorry about that! The user / registry owner would need to run the installs again to choose actual updates (and maybe we could help with that?)

For the fresh case, it’s actually the same as before except we do whatever prune process the user wants. And no prune means just a clone of the remote.

@vsoch
Member

vsoch commented Aug 27, 2022

This was a little over a month of work - but so worth it! 🥳 We just merged the first version of shpc with an entirely remote registry - thanks to @muffato for the careful review and @marcodelapierre @surak here for the discussion! There are a few issues opened by @muffato with respect to private / ssh remotes that I'll likely work on next, although probably not tonight. But we can close the issue here, as sync was added too :)

@vsoch vsoch closed this as completed Aug 27, 2022
@vsoch
Member

vsoch commented Aug 27, 2022

@marcodelapierre I don't remember where you mentioned it, but monthly releases of shpc-registry are now set up! singularityhub/shpc-registry@902d48a


4 participants