fix files list on file rename
GitPython parses the output of `git diff --numstat` to get the
files changed in a commit.
This breaks when a commit contains a file rename, because the output
of `git diff` is different than expected.
This is the output of a normal commit:

    $ git diff --numstat 8f41a39^ 8f41a39
    30      5       test/test_repo.py

And this is a commit containing a rename:

    $ git diff --numstat 185d847^ 185d847
    3       1       .github/workflows/{test_pytest.yml => Future.yml}

This can be triggered by this code:

    for commit in repo.iter_commits():
        print(commit.hexsha)
        for file in commit.stats.files:
            print(file)

Which will print for the normal commit:

    8f41a39
    'test/test_repo.py'

And when there is a rename:

    185d847
    '.github/workflows/{test_pytest.yml => Future.yml}'

Additionally, when a path component is removed, the file list becomes
a list of strings, which breaks the caller even further. This example
is from the Linux kernel tree:

    $ git diff --numstat db401875f438^ db401875f438
    1       1       tools/testing/selftests/drivers/net/mlxsw/{spectrum-2 => }/devlink_trap_tunnel_ipip6.sh

and GitPython parses it as:

    db401875f438168c5804b295b93a28c7730bb57a
    ('tools/testing/selftests/drivers/net/mlxsw/{spectrum-2 => '
    '}/devlink_trap_tunnel_ipip6.sh')
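
The `{old => new}` notation in these examples packs two paths into one field. As a rough illustration of what every caller would otherwise have to do to recover them, here is a minimal sketch. `expand_rename` is a hypothetical helper, not part of GitPython or git, and it assumes the paths themselves contain no braces or ` => `:

```python
from typing import Tuple


def expand_rename(path: str) -> Tuple[str, str]:
    """Split git's compact rename notation into (old_path, new_path).

    Hypothetical helper for illustration only; not part of GitPython.
    A path without " => " is returned unchanged as (path, path).
    """
    if " => " not in path:
        return path, path
    if "{" in path and "}" in path:
        # Common prefix and suffix sit outside the braces.
        prefix, rest = path.split("{", 1)
        middle, suffix = rest.split("}", 1)
    else:
        # No common prefix or suffix: the whole field is "old => new".
        prefix, middle, suffix = "", path, ""
    old, new = middle.split(" => ", 1)
    # An empty side, as in "{spectrum-2 => }", would leave a double slash.
    return (
        (prefix + old + suffix).replace("//", "/"),
        (prefix + new + suffix).replace("//", "/"),
    )


old_path, new_path = expand_rename(
    ".github/workflows/{test_pytest.yml => Future.yml}"
)
print(old_path)
print(new_path)
```

This commit does not take that route; it is shown only to make the notation concrete.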

Fix this by passing the --no-renames option to `git diff`, which ignores
renames and prints the same output as if the file had been deleted from
the old path and created at the new one:

    $ git diff --numstat --no-renames 185d847^ 185d847
    57      0       .github/workflows/Future.yml
    0       55      .github/workflows/test_pytest.yml
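
The effect on a caller can be sketched without a repository by parsing the two outputs quoted above. `parse_numstat` below is a hypothetical, simplified stand-in for GitPython's own parsing, and the sample strings assume the tab-separated form that `git diff --numstat` emits:

```python
def parse_numstat(text: str) -> dict:
    """Map each path in `git diff --numstat` output to its line counts.

    Hypothetical, simplified parser for illustration; not GitPython's code.
    """
    files = {}
    for line in text.splitlines():
        insertions, deletions, filename = line.split("\t")
        files[filename] = {
            # Binary files are reported as "-"; count them as 0 here.
            "insertions": int(insertions) if insertions != "-" else 0,
            "deletions": int(deletions) if deletions != "-" else 0,
        }
    return files


# Output quoted in this message, with and without --no-renames.
with_renames = "3\t1\t.github/workflows/{test_pytest.yml => Future.yml}\n"
no_renames = (
    "57\t0\t.github/workflows/Future.yml\n"
    "0\t55\t.github/workflows/test_pytest.yml\n"
)

for label, text in (("default", with_renames), ("--no-renames", no_renames)):
    for path in parse_numstat(text):
        print(label, path)
```

With the default output the single key is the brace notation, which is not a path that exists in either tree. With `--no-renames` every key is a real path, at the cost of reporting the rename as one deletion and one addition.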
teknoraver committed Jan 12, 2023
1 parent 90c81a5 commit b5694ec
Showing 2 changed files with 16 additions and 2 deletions.
4 changes: 2 additions & 2 deletions git/objects/commit.py
@@ -324,14 +324,14 @@ def stats(self) -> Stats:
         :return: git.Stats"""
         if not self.parents:
-            text = self.repo.git.diff_tree(self.hexsha, "--", numstat=True, root=True)
+            text = self.repo.git.diff_tree(self.hexsha, "--", numstat=True, no_renames=True, root=True)
             text2 = ""
             for line in text.splitlines()[1:]:
                 (insertions, deletions, filename) = line.split("\t")
                 text2 += "%s\t%s\t%s\n" % (insertions, deletions, filename)
             text = text2
         else:
-            text = self.repo.git.diff(self.parents[0].hexsha, self.hexsha, "--", numstat=True)
+            text = self.repo.git.diff(self.parents[0].hexsha, self.hexsha, "--", numstat=True, no_renames=True)
         return Stats._list_from_string(self.repo, text)

     @property
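
The change above relies on the keyword argument `no_renames=True` reaching git as `--no-renames`. The sketch below illustrates that kind of mapping for boolean arguments only. `kwargs_to_flags` is a hypothetical function written for this note, not GitPython's implementation, which handles more cases (values, lists, option ordering):

```python
def kwargs_to_flags(**kwargs) -> list:
    """Turn boolean keyword arguments into git-style command-line flags.

    Hypothetical simplification for illustration; not GitPython's code.
    Only arguments whose value is True produce a flag.
    """
    flags = []
    for name, value in kwargs.items():
        if value is True:
            # One-letter names become short options, others long options.
            prefix = "-" if len(name) == 1 else "--"
            # Underscores in Python names correspond to dashes in git options.
            flags.append(prefix + name.replace("_", "-"))
    return flags


print(kwargs_to_flags(numstat=True, no_renames=True))
```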
14 changes: 14 additions & 0 deletions test/test_commit.py
@@ -159,6 +159,20 @@ def check_entries(d):
         self.assertEqual(commit.committer_tz_offset, 14400, commit.committer_tz_offset)
         self.assertEqual(commit.message, "initial project\n")

+    def test_renames(self):
+        commit = self.rorepo.commit("185d847ec7647fd2642a82d9205fb3d07ea71715")
+        files = commit.stats.files
+
+        # when a file is renamed, the output of git diff is like "dir/{old => new}"
+        # unless we disable rename with --no-renames, which produces two lines
+        # one with the old path deletes and another with the new added
+        self.assertEqual(len(files), 2)
+        for path, d in files.items():
+            self.assertNotIn("{", path)
+            self.assertNotIn("=>", path)
+            self.assertNotIn("}", path)
+        # END for each stated file
+
     def test_unicode_actor(self):
         # assure we can parse unicode actors correctly
         name = "Üäöß ÄußÉ"