Closed
Description
When upgrading an application built on top of pygit2 from 0.24.2 to 0.27.2, our test suite caught a regression in the handling of commit messages with unspecified encoding, which I tracked down to commit bbf4b79. You can see the problem with the very commit to git itself that's referenced in comments in to_unicode_n:
$ python
Python 2.7.6 (default, Nov 13 2018, 12:45:42)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pygit2
>>> repo = pygit2.Repository('.')
>>> repo['c31820c2']
<_pygit2.Commit object at 0x7fa699661230>
>>> repo['c31820c2'].message
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf8 in position 126: invalid start byte
Would it perhaps make sense for Commit_message__get__ to call to_unicode(message, encoding, NULL) rather than to_unicode(message, encoding, "strict"), so that it continues to benefit from the fallback to "replace" even after this change?
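To illustrate the behaviour being asked for, here is a minimal Python sketch of a decoder that tries strict decoding first and falls back to "replace" when no error handler is given. The helper name decode_message and the sample bytes are hypothetical, chosen only for illustration; this is not pygit2's actual C implementation of to_unicode, just a model of the fallback described above.

```python
def decode_message(raw, encoding=None, errors=None):
    """Hypothetical helper modelling the fallback; not part of pygit2."""
    # Assumption: an unspecified encoding is treated as UTF-8.
    encoding = encoding or "utf-8"
    if errors is not None:
        # An explicit handler (e.g. "strict") is honoured as given.
        return raw.decode(encoding, errors)
    try:
        return raw.decode(encoding, "strict")
    except UnicodeDecodeError:
        # No handler given: fall back to "replace" instead of raising.
        return raw.decode(encoding, "replace")


# Made-up sample: Latin-1 bytes with no declared encoding (0xf8 is invalid UTF-8).
raw = b"S\xf8ren"
print(repr(decode_message(raw)))  # the bad byte becomes U+FFFD

try:
    decode_message(raw, errors="strict")
except UnicodeDecodeError as exc:
    print("strict decoding fails:", exc)
```

With the fallback, the message is still readable apart from the one undecodable byte, whereas passing "strict" reproduces the UnicodeDecodeError shown in the session above.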