Regression in 0.27.1: Commit.message no longer handles unspecified encodings correctly #839

While we were upgrading an application built on top of pygit2 from 0.24.2 to 0.27.2, our test suite caught a regression in the handling of commit messages with an unspecified encoding, which I tracked down to commit bbf4b79d86dd41847165d961bf6fb2ac3c9e0b2d. You can reproduce the problem with the very commit to git itself that is referenced in the comments in `to_unicode_n`:

```
$ python
Python 2.7.6 (default, Nov 13 2018, 12:45:42)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pygit2
>>> repo = pygit2.Repository('.')
>>> repo['c31820c2']
<_pygit2.Commit object at 0x7fa699661230>
>>> repo['c31820c2'].message
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf8 in position 126: invalid start byte
```

Would it perhaps make sense for `Commit_message__get__` to call `to_unicode(message, encoding, NULL)` rather than `to_unicode(message, encoding, "strict")`, so that it continues to benefit from the fallback to `"replace"` even after this change?
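
To illustrate the difference between the two calls, here is a rough Python model of the decoding behaviour I have in mind. It is only a sketch: `decode_message` is a hypothetical stand-in, not pygit2 API, and it assumes that the fallback to `"replace"` applies when neither an encoding nor an error mode is given, which is my reading of `to_unicode_n`.

```python
def decode_message(raw, encoding=None, errors=None):
    """Hypothetical model of the decoding step, not the actual pygit2 code.

    Assumption: with no encoding recorded and no explicit error mode,
    the bytes are decoded as UTF-8 with errors="replace".
    """
    if encoding is None:
        # The commit does not declare an encoding, so the bytes may not
        # be valid UTF-8 and strict decoding is not safe.
        encoding = "utf-8"
        if errors is None:
            errors = "replace"
    if errors is None:
        # An explicit encoding with no error mode decodes strictly.
        errors = "strict"
    return raw.decode(encoding, errors)


# A Latin-1 encoded name stored in a commit with no encoding header.
raw = b"Bj\xf8rn"

# Passing no error mode (NULL in the C code) degrades gracefully:
# the undecodable byte becomes U+FFFD.
message = decode_message(raw)

# Forcing "strict" reproduces the traceback above.
try:
    decode_message(raw, errors="strict")
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc)
```

With this model, the behaviour in 0.24.2 corresponds to the first call, while the behaviour after the change corresponds to the second.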
