Skip to content

Multiple package repos in config.yaml#294

Merged
bcumming merged 26 commits into
eth-cscs:mainfrom
msimberg:multiple-package-repos
Jun 9, 2026
Merged

Multiple package repos in config.yaml#294
bcumming merged 26 commits into
eth-cscs:mainfrom
msimberg:multiple-package-repos

Conversation

@msimberg

Copy link
Copy Markdown
Collaborator

Extends config.yaml to accept not only

...
    packages:
        repo: https://github.com/foo/spack-packages.git
        commit: abcdef

but also multiple repos:

...
    packages:
        myrepo:
            repo: https://github.com/bar/other-packages.git
            commit: fedcba
            path: path/to/repo
        builtin:
            repo: https://github.com/foo/spack-packages.git
            commit: abcdef

Order of repos is significant in the same way it's significant in spack config files (https://spack.readthedocs.io/en/latest/repositories.html#search-order-and-overriding-packages). Packages from repos higher up on the list take precedence.

I made a small change so that there are also two implicit repos (currently on main there's one: alps). A repo recipe exists if and only if there are packages in the recipe. alps still exists, but doesn't have the recipe packages. The example above would produce the following list of repos:

  • recipe
  • alps
  • myrepo
  • builtin

Currently there's no enforcement that the repo entry has the same name as the repo namespace (similar to spack itself, if I understand it correctly). Is this ok?

There's an optional path property that can be used to point to a non-default subdirectory within the cloned git repo. For spack-packages this is always in repos/spack_repo/builtin, so the default is repos/spack_repo/{name}. In spack-packages there is an index file https://github.com/spack/spack-packages/blob/develop/spack-repo-index.yaml which could be used to locate all the package repos within a git repo. Spack uses this when it clones package repos through git. For now I decided it's overkill to support that. Thoughts?

@msimberg msimberg requested review from albestro and bcumming May 20, 2026 07:44
@msimberg msimberg marked this pull request as ready for review May 20, 2026 07:56
@albestro

Copy link
Copy Markdown
Contributor

Currently there's no enforcement that the repo entry has the same name as the repo namespace (similar to spack itself, if I understand it correctly). Is this ok?

I quickly checked and I agree with you. The important part is the matching between namespace (defined in repos.yaml) and the folder name. Not sure what the name is there for, probably just for distinguish the different configurations.

There's an optional path property that can be used to point to a non-default subdirectory within the cloned git repo. For spack-packages this is always in repos/spack_repo/builtin, so the default is repos/spack_repo/{name}. In spack-packages there is an index file https://github.com/spack/spack-packages/blob/develop/spack-repo-index.yaml which could be used to locate all the package repos within a git repo. Spack uses this when it clones package repos through git. For now I decided it's overkill to support that. Thoughts?

I fully agree, unless there will be a reason to do so, I'd add as requirement that additional repos should follow the "standard"/implicit repo structure.

@albestro albestro left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice extensions, really nice done! just a few minor comments from my side.

Comment thread stackinator/builder.py Outdated
Comment thread stackinator/schema/config.json Outdated
Comment on lines +40 to +45
"commit": {
"oneOf": [
{"type" : "string"},
{"type" : "null"}
]
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the test, this is managed at python level with dict.get which returns None if not set, instead of "enforcing" it with the schema.

If it was on purpose, it's ok for me. Just noting this because it might have happened "by accident" and it is not what has been done elsewhere, where default values are used

Suggested change
"commit": {
"oneOf": [
{"type" : "string"},
{"type" : "null"}
]
}
"commit": {
"oneOf": [
{"type" : "string"},
{"type" : "null"}
],
default: null
}

In case, it applies also to the next more "advanced" schema.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semi-intentional, but you might have ideas on how to improve this since I know you've worked more with the schema+validator. The current validator seems to set defaults unconditionally for subobjects in oneOf. It'd e.g. take

...
  packages:
    myrepo:
      repo: https://...

and make it into

...
  packages:
    commit: null
    myrepo:
      repo: https://...

which doesn't pass validation anymore, because it should be one of those two.

It seems I can set a default for commit in the branch where there's a list of repos so I added that default now. But for packages:commit I can't add back the old default null without changing the validator.


But that raises a good question: should we even allow commit to be empty? I see now there's a fallback path for it when cloning repos and when commit isn't given it'll just use whatever is checked out by default after cloning a repo. I think this is probably not something we ever want. I couldn't see any recipes in alps-uenv that would set repo but not set the commit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that setting a default value of null is best, though I have to admit we have not been fully consistent in the schema (i.e. some Oneof fields do not provide a default value).

https://github.com/eth-cscs/stackinator/blob/main/stackinator/schema/environments.json#L78-L90

I would suggest two modifications:

  • define a repo_def type (see env_var_def for example), instead of repeating yourself.
  • make the list alternative a real list, not a dictionary, to make the ordering implicit (as opposed to relying on order being preserved in the Python dict)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a repo_def, thanks for pointing that out. I didn't know that was a thing.

Now the spack:commit (not spack:packages:commit) still allows null. This is unchanged, but should we require that one as well? since it's outside the scope of this pr I'd leave it out, but if you prefer to change it as well for consistency I don't mind changing it.

Comment thread stackinator/builder.py Outdated
@msimberg msimberg requested review from bcumming and removed request for bcumming May 29, 2026 15:13

@bcumming bcumming left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - looks good!

  • serious suggestions about the yaml specificaction
  • a minor nit about where code "belongs"

Comment thread .github/workflows/main.yaml Outdated
Comment thread stackinator/schema/config.json Outdated
Comment on lines +40 to +45
"commit": {
"oneOf": [
{"type" : "string"},
{"type" : "null"}
]
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that setting a default value of null is best, though I have to admit we have not been fully consistent in the schema (i.e. some Oneof fields do not provide a default value).

https://github.com/eth-cscs/stackinator/blob/main/stackinator/schema/environments.json#L78-L90

I would suggest two modifications:

  • define a repo_def type (see env_var_def for example), instead of repeating yourself.
  • make the list alternative a real list, not a dictionary, to make the ordering implicit (as opposed to relying on order being preserved in the Python dict)

Comment thread stackinator/builder.py Outdated
)
f.write("\n")

def _resolve_packages(self, packages):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor style nit.

As a general rule (that is not followed strictly), it is better to perform this sort of processing of inputs in the recipe, and save.

The base path might not be available (or easy to determine) in the recipe, but you could add a method to recipes:

packages = recipe.packages(base)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now a spack_package_repos property in Recipe. I also added a check that in the multi-repo case the repos are not named alps or recipe since those are used by stackinator itself.

@msimberg

msimberg commented Jun 5, 2026

Copy link
Copy Markdown
Collaborator Author

I've kept the dict of repos approach as discussed, but if you still change your mind the list of repos change is in https://github.com/msimberg/stackinator/commits/multiple-package-repos-list-of-repos/.

Otherwise I think I've responded or resolved the requested changes. I've not yet changed any documentation and I haven't re-run this on a real recipe yet, and I want to do both still before merging, but not before you're happy with the current implementation.

@msimberg

msimberg commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

I added a bit of documentation about the new config options.

One thing that stood out when I changed it is the reserved alps package repo name. That might be worth changing at some point if stackinator is to be used at other sites?

@msimberg msimberg requested a review from bcumming June 9, 2026 09:50
@bcumming bcumming merged commit f3cd36d into eth-cscs:main Jun 9, 2026
2 checks passed
@msimberg msimberg deleted the multiple-package-repos branch June 9, 2026 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants