Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diff::a_path is wrong for unicode characters #1099

Closed
marxin opened this issue Jan 4, 2021 · 8 comments · Fixed by #1102
Closed

Diff::a_path is wrong for unicode characters #1099

marxin opened this issue Jan 4, 2021 · 8 comments · Fixed by #1102

Comments

@marxin
Copy link
Contributor

marxin commented Jan 4, 2021

The problem is slightly mentioned in #865.

Let's consider the following example:

git init foo
cd ./foo
echo a > a
git add a
git commit -a -m 'Add a'
echo "foo" > špatně.txt
git add špatně.txt
git commit -a -m Add
In [4]: Repo('.').commit().diff('HEAD~')[0].a_path
Out[4]: '"\\305\\241patn\\304\\233.txt"'

Unless one uses git config --global core.quotepath false, the file name is quoted and I don't see a simple solution how to decode it to unicode? One can use -z option that can resolve the problem:

git diff HEAD~ --raw -z | cat
:000000 100644 0000000 257cc56 Ašpatně.txt
@Byron
Copy link
Member

Byron commented Jan 4, 2021

Thanks a lot, I could reproduce the issue and even managed to sneak in the -z flag like so:

 Repo('.').commit().diff('HEAD~', z=True)[0].a_path

This executes git diff-tree -z c69d9db65234b59d43f17be40667bdbf0a2106f8 HEAD~1 -r --abbrev=40 --full-index -M --raw --no-color and produces

:100644 000000 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 0000000000000000000000000000000000000000 Dšpatně.txt

whereas the version without -z produces

:100644 000000 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 0000000000000000000000000000000000000000 D"\305\241patn\304\233.txt"

exactly as expected.

However, the resulting commit object was empty for some reason, maybe due to a silent parsing failure - even though it doesn't look like it.

@marxin
Copy link
Contributor Author

marxin commented Jan 5, 2021

This executes git diff-tree -z c69d9db65234b59d43f17be40667bdbf0a2106f8 HEAD~1 -r --abbrev=40 --full-index -M --raw --no-color and produces

Note that separator characters is \x00 in this case. The following patch works for me:

diff --git a/git/diff.py b/git/diff.py
index 0fc30b9e..17ef15af 100644
--- a/git/diff.py
+++ b/git/diff.py
@@ -108,6 +108,7 @@ class Diffable(object):
             args.append("-p")
         else:
             args.append("--raw")
+            args.append("-z")
 
         # in any way, assure we don't see colored output,
         # fixes https://github.com/gitpython-developers/GitPython/issues/172
@@ -483,7 +484,7 @@ class Diff(object):
             if not line.startswith(":"):
                 return
 
-            meta, _, path = line[1:].partition('\t')
+            meta, _, path = line[1:].partition('\x00')
             old_mode, new_mode, a_blob_id, b_blob_id, _change_type = meta.split(None, 4)
             # Change type can be R100
             # R: status letter

Can you please upstream the patch?

@Byron
Copy link
Member

Byron commented Jan 5, 2021

@marxin could you have a look at the linked PR�? That's as far as I could get it, and it looks like there is more to it than the change above. Feel free to submit a new PR with your fixes and I will close mine. Thank you.

@marxin
Copy link
Contributor Author

marxin commented Jan 5, 2021

@marxin could you have a look at the linked PR�? That's as far as I could get it, and it looks like there is more to it than the change above. Feel free to submit a new PR with your fixes and I will close mine. Thank you.

Thank you for the start. I added one more commit and it seems it works. Please see PR #1102.

Byron added a commit that referenced this issue Jan 5, 2021
@Byron Byron added this to the v3.1.12 - Bugfixes milestone Jan 5, 2021
@Byron
Copy link
Member

Byron commented Jan 5, 2021

Awesome, thanks so much!

@marxin
Copy link
Contributor Author

marxin commented Jan 5, 2021

I thank you! See you.

@marxin
Copy link
Contributor Author

marxin commented Jan 5, 2021

@Byron When can one expect a release that will contain the fix?

@Byron
Copy link
Member

Byron commented Jan 6, 2021

v3.1.12 was just released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment