Fix AssertionError when cloning at annotated tag #10719

Open
jelmer wants to merge 1 commit into python-poetry:main from jelmer:fix-peeled-tags

@jelmer jelmer commented Feb 7, 2026

When cloning a Git repository at an annotated tag, if the peeled tag reference (refs/tags/v1.0.0^{}) is not available in the fetch result, Poetry would set HEAD to the tag object SHA instead of the commit SHA. This caused reset_index() to fail with:

  AssertionError: assert isinstance(obj, Commit)

The fix peels tag objects recursively to extract the underlying commit SHA before setting HEAD. This ensures HEAD always points to a Commit object, not a Tag object.
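
A minimal sketch of the peeling step described above. It uses hypothetical stand-in classes rather than dulwich's real `Tag` and `Commit` objects; the `target` attribute, the dict-based store, and the name `peel_to_commit` are assumptions made for illustration and are not taken from Poetry's code.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    """Stand-in for a commit object (hypothetical, not dulwich's class)."""
    sha: bytes


@dataclass
class Tag:
    """Stand-in for an annotated tag; `target` is the SHA it points at."""
    sha: bytes
    target: bytes


def peel_to_commit(store: dict, sha: bytes) -> bytes:
    """Follow annotated tags until a non-tag object is reached."""
    obj = store[sha]
    while isinstance(obj, Tag):
        sha = obj.target
        obj = store[sha]
    return sha


store = {
    b"c1": Commit(b"c1"),
    b"t1": Tag(b"t1", target=b"c1"),  # annotated tag -> commit
    b"t2": Tag(b"t2", target=b"t1"),  # nested tag -> annotated tag
}
print(peel_to_commit(store, b"t2"))  # prints b'c1'
```

A loop is used here instead of literal recursion; for a chain of tags the two are equivalent, and the loop avoids any recursion-depth concerns.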

Dulwich has already been updated to print clearer errors in this situation; that change should be in 1.0.1.

Pull Request Check List

Resolves: #10658

  • Added tests for changed code.
  • Updated documentation for changed code.

sourcery-ai bot commented Feb 7, 2026

Reviewer's Guide

Ensures Git clone operations at annotated (including nested) tags always set HEAD to the underlying commit object by peeling tag objects recursively, and adds regression tests for annotated and nested annotated tag cloning using real dulwich repos.

Sequence diagram for resolving HEAD at annotated tag

sequenceDiagram
    actor Developer
    participant GitBackend
    participant FetchPackResult
    participant Repo
    participant ObjectStore
    participant Tag
    participant Commit

    Developer->>GitBackend: resolve(remote_refs, repo)
    GitBackend->>GitBackend: _normalise(remote_refs, repo)
    GitBackend->>GitBackend: _set_head(remote_refs, repo)

    alt ref_is_symbolic_HEAD
        GitBackend->>FetchPackResult: read refs[ref]
        FetchPackResult-->>GitBackend: head
    else ref_is_specific_tag_or_ref
        GitBackend->>FetchPackResult: read refs[ref]
        FetchPackResult-->>GitBackend: head
    end

    GitBackend->>Repo: access object_store
    Repo-->>GitBackend: ObjectStore
    GitBackend->>ObjectStore: get(head)
    alt object_found
        ObjectStore-->>GitBackend: Tag or Commit
        loop peel_tag_until_commit
            GitBackend->>GitBackend: isinstance(obj, Tag)
            alt obj_is_Tag
                GitBackend->>Tag: read object
                Tag-->>GitBackend: object_type, sha
                GitBackend->>ObjectStore: get(sha)
                ObjectStore-->>GitBackend: Tag or Commit
            else obj_is_not_Tag
                GitBackend->>GitBackend: obj is Commit (stop peeling)
            end
        end
    else object_missing
        ObjectStore-->>GitBackend: KeyError
        GitBackend->>GitBackend: skip peeling (handled during fetch)
    end

    GitBackend->>FetchPackResult: set refs[ref] = head
    GitBackend->>FetchPackResult: set refs[HEAD] = head
    GitBackend-->>Developer: HEAD now points to Commit

Updated class diagram for Git backend tag peeling logic

classDiagram
    class GitBackend {
        - bytes ref
        - str revision
        + resolve(remote_refs FetchPackResult, repo Repo) void
        - _normalise(remote_refs FetchPackResult, repo Repo) void
        - _set_head(remote_refs FetchPackResult, repo Repo) void
    }

    class FetchPackResult {
        + dict~bytes, bytes~ refs
    }

    class Repo {
        + ObjectStore object_store
    }

    class ObjectStore {
        + __getitem__(sha bytes) GitObject
    }

    class GitObject {
    }

    class Tag {
        + tuple~str, bytes~ object
    }

    class Commit {
    }

    GitBackend --> FetchPackResult : uses
    GitBackend --> Repo : uses
    Repo --> ObjectStore : has
    ObjectStore ..> GitObject : returns
    GitObject <|-- Tag
    GitObject <|-- Commit
    GitBackend ..> Tag : peels
    GitBackend ..> Commit : ensures_HEAD_points_to

File-Level Changes

Change: Ensure HEAD is set to a commit when resolving refs, peeling annotated (and nested) tags to their underlying commit SHA.
Details:
  • Pass the dulwich Repo instance into _set_head so it can inspect objects when resolving the remote head.
  • Extend _set_head to look up the resolved ref object in the repo.object_store and, while it is a Tag, update the head SHA to the tag’s target object SHA.
  • Handle missing objects in the store gracefully by catching KeyError and leaving resolution to the fetch process.
  • Keep remote_refs.refs[ref] and remote_refs.refs[b'HEAD'] synchronized with the peeled commit SHA.
Files: src/poetry/vcs/git/backend.py

Change: Add regression tests covering cloning at annotated and nested annotated tags using the real dulwich backend (no mocks).
Details:
  • Add test_clone_annotated_tag to verify cloning at an annotated tag results in HEAD pointing to a Commit and that the working tree contents are correct.
  • Add test_clone_nested_annotated_tags to construct a commit, an annotated tag pointing at the commit, and a second annotated tag pointing at the first, then verify cloning at the outer tag peels to the underlying commit SHA.
  • Mark both new tests with pytest.mark.skip_git_mock to run only against the real Git backend and use dulwich Repo/porcelain APIs to construct the repositories.
Files: tests/vcs/git/test_backend.py
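
The backend change summarized above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual diff: `Tag`, `Commit`, and `set_head` are hypothetical stand-ins, and plain dicts take the place of `remote_refs.refs` and `repo.object_store`.

```python
class Tag:
    """Hypothetical stand-in: an annotated tag pointing at `target`."""

    def __init__(self, target: bytes) -> None:
        self.target = target


class Commit:
    """Hypothetical stand-in for a commit object."""


def set_head(refs: dict, object_store: dict, ref: bytes) -> None:
    """Peel refs[ref] through any tags, then point both ref and HEAD at it."""
    head = refs[ref]
    try:
        obj = object_store[head]
        while isinstance(obj, Tag):
            head = obj.target
            obj = object_store[head]
    except KeyError:
        # Object missing from the store: stop peeling and keep the SHA
        # reached so far (assumed to be resolved elsewhere).
        pass
    # Keep the named ref and HEAD in sync with the peeled SHA.
    refs[ref] = head
    refs[b"HEAD"] = head


refs = {b"refs/tags/v1.0.0": b"t1"}
object_store = {b"t1": Tag(b"c1"), b"c1": Commit()}
set_head(refs, object_store, b"refs/tags/v1.0.0")
print(refs[b"HEAD"])  # prints b'c1'
```

Note that when the first lookup fails, the sketch leaves the SHA unchanged, so HEAD can still end up on a tag object in that case.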

Assessment against linked issues

Issue: #10658
Objective: Fix the AssertionError raised by dulwich.repo.Repo.get_parents when updating a git+ssh VCS dependency pinned to a tag (in particular when the tag is an annotated tag).

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • The tag-peeling loop in _set_head has no safeguard against cyclic tag references; consider tracking visited SHAs or imposing an iteration limit to avoid potential infinite loops on malformed repositories.
  • Swallowing KeyError in the tag-peeling block leaves head potentially pointing at a tag object again; it may be safer to either log/raise in this case or ensure the object is fetched/available before proceeding so HEAD is guaranteed to resolve to a commit.
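
One way the two points above might be addressed is sketched below, assuming a visited-set guard and a hard failure on missing objects. The names `peel_strict`, `Tag`, and `Commit` are hypothetical stand-ins, and the choice of exception types is an assumption rather than anything proposed in the pull request.

```python
class Tag:
    """Hypothetical stand-in for an annotated tag pointing at `target`."""

    def __init__(self, target: bytes) -> None:
        self.target = target


class Commit:
    """Hypothetical stand-in for a commit object."""


def peel_strict(object_store: dict, sha: bytes) -> bytes:
    """Peel tags to a non-tag SHA, failing loudly instead of looping."""
    seen = set()
    while True:
        if sha in seen:
            # A SHA seen twice means the tag chain loops back on itself.
            raise ValueError(f"cyclic tag reference at {sha!r}")
        seen.add(sha)
        try:
            obj = object_store[sha]
        except KeyError:
            raise LookupError(f"object {sha!r} is missing; cannot peel") from None
        if not isinstance(obj, Tag):
            return sha
        sha = obj.target


store = {b"t1": Tag(b"c1"), b"c1": Commit()}
print(peel_strict(store, b"t1"))  # prints b'c1'
```

Raising instead of silently continuing trades leniency for a clearer error; whether that is appropriate depends on whether a later fetch step is expected to supply the missing object.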

## Individual Comments

### Comment 1
<location> `tests/vcs/git/test_backend.py:295-304` </location>
<code_context>
+@pytest.mark.skip_git_mock
</code_context>

<issue_to_address>
**suggestion (testing):** Strengthen the test by asserting that HEAD matches the expected commit SHA, not just that it is a Commit object.

In `test_clone_annotated_tag`, you currently only check that `HEAD` is a `Commit`, which guards against it being a `Tag`. To make the test stronger, also assert that this `Commit` is the one created in the source repo by capturing its SHA (e.g., from `porcelain.commit` or `repo.head()` before tagging) and comparing that value to `head_sha`. This verifies both the type and that the peeling logic resolves to the correct commit.

Suggested implementation:

```python
    from dulwich import porcelain
    from dulwich.objects import Commit

```

```python
    # Create a source repository with an annotated tag

```

```python
    repo = Repo.init(str(source_path))

```

```python
    # HEAD should be a commit (not a Tag) and it should be the same commit
    # that was created in the source repository before tagging.
    assert isinstance(head, Commit)
    assert head_sha == expected_head_sha

```

I only see the beginning of `test_clone_annotated_tag`, so you will need to wire in the expected SHA where the initial commit is created:

1. When you create the commit in the source repository (likely via `porcelain.commit` or `repo.do_commit`), capture its SHA, for example:
   - If using `porcelain.commit(str(source_path), message=b"Initial commit")`, assign the return value to `expected_head_sha`.
   - If using `repo.do_commit(...)`, use `expected_head_sha = repo.head()` (or `repo[repo.head()].id` depending on how you compute `head_sha`).
2. Ensure that `head_sha` in the assertion block refers to the SHA of `HEAD` in the cloned repo (whatever variable you currently derive from the peeled `HEAD` object).
3. Make sure `expected_head_sha` is defined in the same scope as the final assertions so that `assert head_sha == expected_head_sha` does not raise a `NameError`.
</issue_to_address>

Development

Successfully merging this pull request may close these issues.

AssertionError in get_parents when updating package from private repo
