Adding ability to compare git references to spack install #24639

Merged
merged 65 commits into from Sep 15, 2021

Conversation

vsoch
Member

@vsoch vsoch commented Jun 30, 2021

This will allow a user to refer to a git commit in lieu of a package version anywhere a Spec is parsed (including both name and version), and to compare that commit with releases in the history or with other commits. We do this by way of:

  • Adding a property, is_commit, to a version, so that code can always check whether a version is a commit and act accordingly.
  • Adding an attribute to the Version object that can look up commits in a git repo and find the last known version before that commit, and the distance from it.
  • Constructing new Version comparators, which are tuples. For normal versions, they are unchanged. For commits with a previous version x.y.z, d commits away, the comparator is (x, y, z, '', d). For commits with no previous version, the comparator is ('', d), where d is the distance from the first commit in the repo.
  • Caching metadata on git commits in the misc_cache, for quick lookup later.
  • Caching git repos as bare repos in ~/.spack/git_repos.
  • Turning git repo urls into file paths within the cache, in both caches.
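
As a rough illustration of the comparator tuples described in the list above, the sketch below builds them from a previous version string and a commit distance. The helper name `commit_comparator` and its arguments are hypothetical and are not taken from the Spack code base; the sketch also assumes purely numeric dotted versions.

```python
def commit_comparator(previous_version, distance):
    """Build a comparator tuple for a commit (hypothetical helper).

    previous_version: dotted numeric string such as "1.2.3", or None when
                      no previous version is known for the commit.
    distance:         number of commits from the previous version, or from
                      the first commit in the repo when there is none.
    """
    if previous_version is None:
        # No known previous version: ('', d)
        return ("", distance)
    parts = tuple(int(p) for p in previous_version.split("."))
    # Previous version x.y.z, d commits away: (x, y, z, '', d)
    return parts + ("", distance)


print(commit_comparator("1.2.3", 4))   # (1, 2, 3, '', 4)
print(commit_comparator(None, 7))      # ('', 7)
```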

If a commit cannot be found in the cached git repo, we fetch from the repo. If a commit is found in the cached metadata, we do not compare it again against newly downloaded tags (assuming the repo structure does not change). The cached metadata may be thrown out with the spack clean -m option if you know the repo structure has changed in a way that invalidates existing entries. Future work will include automatic updates.
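
The lookup order described above can be sketched as follows. This is only a minimal model of the flow, not Spack's implementation: the function name and the three callables passed in (`repo_has_commit`, `fetch`, `compute_metadata`) are hypothetical stand-ins for the real git and cache operations.

```python
def lookup_commit(commit, cache, repo_has_commit, fetch, compute_metadata):
    """Return metadata for a commit, consulting the cache first.

    cache:            dict-like mapping commit -> metadata (assumed)
    repo_has_commit:  callable(commit) -> bool, checks the cached git repo
    fetch:            callable() that fetches from the remote repo
    compute_metadata: callable(commit) -> metadata (previous version, distance)
    """
    # Cached metadata is trusted as-is; it is not recomputed against new tags.
    if commit in cache:
        return cache[commit]
    # Only fetch when the cached repo does not already contain the commit.
    if not repo_has_commit(commit):
        fetch()
    cache[commit] = compute_metadata(commit)
    return cache[commit]
```

Clearing the cache (as spack clean -m does for the real metadata) corresponds here to emptying the `cache` dict, after which every commit is recomputed on its next lookup.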

Finding previous versions

Spack will search the repo for any tags that match the string of a version given by the version directive. Spack will also search for any tags that match the string v followed by any such version string. Beyond that, Spack will search for tags that match a SEMVER regex (i.e., tags of the form x.y.z) and interpret those tags as valid versions as well. Future work will increase the breadth of tags understood by Spack.
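
A minimal sketch of the three matching rules just described, assuming a simplified regex for the x.y.z form. The function name `tag_to_version`, the `known_versions` argument, and the regex itself are illustrative assumptions, not the patterns Spack actually uses.

```python
import re

# Simplified stand-in for the SEMVER regex: exactly three numeric components.
SEMVER_LIKE = re.compile(r"\d+\.\d+\.\d+")


def tag_to_version(tag, known_versions):
    """Map a git tag to a version string, or None if it is not recognized.

    known_versions: set of version strings declared by the package
                    (hypothetical input for this sketch).
    """
    # Rule 1: the tag is exactly a declared version string.
    if tag in known_versions:
        return tag
    # Rule 2: the tag is "v" followed by a declared version string.
    if tag.startswith("v") and tag[1:] in known_versions:
        return tag[1:]
    # Rule 3: the tag has the x.y.z form, even if the package does not list it.
    if SEMVER_LIKE.fullmatch(tag):
        return tag
    return None
```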

For each tag, Spack queries git to determine whether the tag is an ancestor of the commit in question or not. Spack then sorts the tags that are ancestors of the commit by commit-distance in the repo, and takes the nearest ancestor. The version represented by that tag is listed as the previous version for the commit.
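
The ancestor query and distance sort can be sketched with two standard git commands: `git merge-base --is-ancestor` (exit status 0 when the first ref is an ancestor of the second) and `git rev-list --count A..B` (the number of commits reachable from B but not from A). The function names and the way they are wired together are assumptions made for illustration; the PR does not state which git commands Spack runs.

```python
import subprocess


def is_ancestor(repo, tag, commit):
    """True if `tag` is an ancestor of `commit` in the repo at path `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", tag, commit]
    )
    return result.returncode == 0


def commit_distance(repo, tag, commit):
    """Number of commits reachable from `commit` but not from `tag`."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--count", f"{tag}..{commit}"],
        check=True, capture_output=True, text=True,
    )
    return int(out.stdout.strip())


def nearest_ancestor(distances):
    """Pick the (tag, distance) pair with the smallest distance, or None."""
    if not distances:
        return None
    return min(distances.items(), key=lambda item: item[1])


def previous_version(repo, tags, commit):
    """Nearest ancestor tag of `commit` among `tags`, with its distance."""
    distances = {
        tag: commit_distance(repo, tag, commit)
        for tag in tags
        if is_ancestor(repo, tag, commit)
    }
    return nearest_ancestor(distances)
```

Keeping `nearest_ancestor` separate from the git calls makes the selection step easy to check without a repository, for example `nearest_ancestor({"v1.0": 5, "v1.1": 2})` picks `("v1.1", 2)`.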

Not all commits will find a previous version, depending on the package workflow. Future work may enable more tangential relationships between commits and versions to be discovered, but many commits in real world git repos require human knowledge to associate with a most recent previous version. Future work will also allow packages to specify commit/tag/version relationships manually for such situations.

Version comparisons

The empty string is a valid component of a Spack version tuple, and is in fact the lowest-valued component. It cannot be generated as part of any valid version. These two characteristics make it perfect for delineating previous versions from distances. For any version x.y.z, (x, y, z, '', _) will be less than any "real" version beginning x.y.z. This ensures that no distance from a release will cause the commit to be interpreted as "greater than" a version which is not an ancestor of it.
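
To make the ordering concrete, here is a small sketch in which the empty string sorts below every numeric component. Plain Python tuples cannot compare '' with an int directly, so the sketch maps each component to a sortable key first. The names `component_key` and `sort_key` are hypothetical, and the sketch only models integer components and the empty string, not Spack's full Version comparison.

```python
def component_key(component):
    """Sortable key for one version component (ints and '' only)."""
    if component == "":
        # The empty string is the lowest-valued component.
        return (0, 0)
    return (1, component)


def sort_key(version_tuple):
    """Sortable key for a whole comparator tuple."""
    return tuple(component_key(c) for c in version_tuple)


commit = (1, 2, 3, "", 500)   # 500 commits after 1.2.3

# However large the distance, the commit stays below 1.2.3.1 and 1.2.4.
print(sort_key(commit) < sort_key((1, 2, 3, 1)))   # True
print(sort_key(commit) < sort_key((1, 2, 4)))      # True

# Two commits after the same release are ordered by distance.
print(sort_key((1, 2, 3, "", 2)) < sort_key(commit))   # True
```

In this sketch the release (1, 2, 3) itself still sorts before the commit, because a tuple that is a strict prefix of another compares as smaller.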

Signed-off-by: vsoch vsoch@users.noreply.github.com

@vsoch
Member Author

vsoch commented Jun 30, 2021

okay updated to try and parse any semantic versions found in git tags.

@vsoch vsoch force-pushed the add/github-version-comparison branch 6 times, most recently from fb1b4f0 to 43967d5 Compare July 1, 2021 16:00
This works by way of adding a property, is_commit, to a version, and then given that a version
is a commit, we return a GitFetcherStrategy to use it. In addition, we instruct
the version command to generate a lookup of commits that keeps track of the order
(for comparing between commits) and the previous and next spack version (for comparison
of a commit with a spack version string). If the commit does not have known releases before
it, then the previous is None and we cannot determine the relationship and return False.
The same is true if the commit does not have any known releases after it, although it
is unlikely to hit this case unless the user is asking for a version that has been
released but not added to spack.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch vsoch force-pushed the add/github-version-comparison branch from 43967d5 to 090c798 Compare July 1, 2021 16:57
@vsoch
Member Author

vsoch commented Jul 1, 2021

@becker33 the last test is still finishing but we can probably discuss it now anyway. Let me know if this is in the right direction of what we had in mind!

@vsoch vsoch requested a review from becker33 July 6, 2021 19:29
vsoch and others added 2 commits July 9, 2021 10:12
@vsoch
Member Author

vsoch commented Jul 12, 2021

ping @becker33 ! I thought this was time sensitive (opened 12 days ago, and I think we only had under 2 months?) so would it be possible to get reviewed soon?

@vsoch
Member Author

vsoch commented Jul 14, 2021

@spackbot hello!

@spackbot-app

spackbot-app bot commented Jul 14, 2021

Hello!

@vsoch
Member Author

vsoch commented Jul 14, 2021

@spackbot re-run pipeline

@spackbot-app

spackbot-app bot commented Jul 14, 2021

I'm not able to re-run the pipeline now because I don't have authentication.

@tgamblin
Member

@spackbot hello

@spackbot-app

spackbot-app bot commented Jul 15, 2021

Hello!

@tgamblin
Member

@spackbot re-run pipeline

@spackbot-app

spackbot-app bot commented Jul 15, 2021

I've started that pipeline for you!

@vsoch
Member Author

vsoch commented Jul 15, 2021

@spackbot commands

@spackbot-app

spackbot-app bot commented Jul 15, 2021

You can interact with me in many ways!

  • @spackbot hello: say hello and get a friendly response back!
  • @spackbot help or @spackbot commands: see this message
  • @spackbot run pipeline or @spackbot re-run pipeline: to request a new run of the GitLab CI pipeline

I'll also help to label your pull request and assign reviewers!
If you need help or see there might be an issue with me, open an issue here

@spackbot-app

spackbot-app bot commented Jul 15, 2021

Sorry spack-bot[bot], I cannot do that for you. Only users with write can make this request!

@spackbot-app

spackbot-app bot commented Jul 15, 2021

Hola!

Member

@tgamblin tgamblin left a comment

This is looking great -- only final requests are:

  1. Use bare repositories and git fetch in the cache (instead of repos with a work tree and git pull) to avoid checking out the whole work tree (which is slow). We don't need the work tree -- just the history.
  2. Change the PR description to match what this ended up looking like.
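
For the first request, a bare cache that is refreshed with git fetch could look roughly like the sketch below. The function names, the explicit refspec, and the split into "build the commands" and "run them" are assumptions made for illustration; this is not the code that ended up in the PR.

```python
import os
import subprocess


def bare_cache_commands(url, cache_path):
    """Return the git commands (argv lists) to create or refresh a bare cache."""
    commands = []
    if not os.path.isdir(cache_path):
        # A bare repository holds only the history, with no work tree.
        commands.append(["git", "init", "--bare", cache_path])
    # Fetch branches and tags straight into the bare repo; nothing is checked out.
    commands.append(
        ["git", "-C", cache_path, "fetch", "--tags", url,
         "+refs/heads/*:refs/heads/*"]
    )
    return commands


def update_bare_cache(url, cache_path):
    """Run the commands; raises CalledProcessError if git fails."""
    for argv in bare_cache_commands(url, cache_path):
        subprocess.run(argv, check=True)
```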

And minor things below.

def mock_git_version_info(tmpdir, scope="function"):
    """Create a mock git repo with known structure

    The structure of commits in this repo is as follows:
Member

use two colons and indent so that text below is preformatted in docs. https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#literal-blocks
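
A small sketch of what that suggestion looks like in a docstring. The function name `example_fixture` and the placeholder line are hypothetical; the fixture's real commit diagram is not reproduced here.

```python
def example_fixture():
    """Create a mock git repo with known structure

    The structure of commits in this repo is as follows::

        (placeholder: the indented commit diagram goes here)

    Text that returns to the original indentation ends the literal block.
    """
```

The double colon at the end of the paragraph, followed by a blank line and an indented block, tells Sphinx to render the indented text as preformatted.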

@tgamblin
Member

Oh, and finally: file locking on the caches if that is not already done.

@haampie
Member

haampie commented Sep 14, 2021

This will allow a user to (from the command line) install a spack package at a particular commit

Is this still in the scope of this PR? Or is this about comparison only? Can we get a few lines in the docs about how version comparison works, whether it's necessary to add manual mappings from tags to spack versions or whether they're automatically mapped, what happens if a commit is not before / after a tagged commit (does the latest untagged commit on the release/1.x branch map to version 2.0, master, develop, infinity?)

@tgamblin tgamblin merged commit ef5ad4e into spack:develop Sep 15, 2021
@tgamblin tgamblin added this to In progress in Spack 0.17.0 Release via automation Sep 15, 2021
@tgamblin tgamblin moved this from In progress to Done in Spack 0.17.0 Release Sep 15, 2021
scheibelp added a commit that referenced this pull request Apr 12, 2022
Spack added support in #24639 for ad-hoc Git-commit-hash-based
versions: A user can install a package x@hash, where X is a package
that stores its source code in a Git repository, and the hash refers
to a commit in that repository which is not recorded as an explicit
version in the package.py file for X.

A couple issues were found relating to this:

* If an environment defines an alternative package repo (i.e. with
  repos.yaml), and spack.yaml contains user Specs with ad-hoc
  Git-commit-hash-based versions for packages in that repo,
  then as part of retrieving the data needed for version comparisons
  it will attempt to retrieve the package before the environment's
  configuration is instantiated.
* The bookkeeping information added to compare ad-hoc git versions was
  being stripped from Specs during concretization (such that user
  Specs which succeeded before concretizing would then fail after)

This addresses the issues:

* The first issue is resolved by deferring access to the associated
  Package until the versions are actually compared to one another.
* The second issue is resolved by ensuring that the Git bookkeeping
  information is explicitly applied to Specs after they are concretized.

This also:

* Resolves an ambiguity in the mock_git_version_info fixture used to
  create a tree of Git commits and provide a list where each index
  maps to a known commit.
* Isolates the cache used for Git repositories in tests using the
  mock_git_version_info fixture
* Adds a TODO which points out that if the remote Git repository
  overwrites tags, Spack will then fail when using
  ad-hoc Git-commit-hash-based versions
joequant pushed a commit to joequant/spack that referenced this pull request Apr 17, 2022
tgamblin pushed a commit that referenced this pull request Jun 28, 2022
Building on #24639, this allows versions to be prefixed by `git.`. If a version begins with `git.`, it is treated as a git ref and handled the same way git commits have been handled since the referenced PR.

An exception is made for the versions `git.develop`, `git.main`, `git.master`, `git.head`, and `git.trunk`. Those are assumed to be greater than all other versions, just as those names are in other contexts.
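
The prefix rule can be sketched as a small classifier. The function name `classify_version` and the three labels it returns are hypothetical and only illustrate the rule stated in the commit message; they are not names from Spack.

```python
# Names that are treated as greater than all other versions when prefixed.
INFINITY_NAMES = ("develop", "main", "master", "head", "trunk")


def classify_version(version_string):
    """Classify a version string as 'plain', 'git-ref', or 'infinity'."""
    prefix = "git."
    if not version_string.startswith(prefix):
        return "plain"
    ref = version_string[len(prefix):]
    if ref in INFINITY_NAMES:
        # git.develop, git.main, ...: greater than all other versions.
        return "infinity"
    # Any other git.<ref> is handled like a git commit.
    return "git-ref"
```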
bhatiaharsh pushed a commit to bhatiaharsh/spack that referenced this pull request Aug 8, 2022
@@ -276,10 +319,13 @@ def satisfies(self, other):
gcc@4.7 so that when a user asks to build with gcc@4.7, we can find
a suitable compiler.
"""
self_cmp = self._cmp(other.commit_lookup)
Member

Why was this mixing self and other?

5 participants