-
-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DiffLine.content is represented as unicode #610
Comments
When you say "The field Python has weird issues with encodings; I wouldn't be surprised if it was the language itself. |
The docs do not mention the type at all. But (at least until 0.24.0, which was the last version I tested) the actual implementation of My file is a unicode file. But as git does not know anything about encodings, from git's point of view it is a stream of bytes. Thus I expect What do you mean with "weird issues with encodings"?!? This is definitely a bug in pygit2 (as it uses unicode instead of str/bytes). |
|
If the sourcecode contains a non-ascii characters (i.e. "Ö" or 'Ä') in a line, all of them will be replaced by '\xfffd'. This makes pygit2 diff unusable for all codebases that are using non-ascii text (which should be quite a lot). If pygit2 would return
|
Make a PR for bytes type, then; you can't convert non-ascii to string on the fly. |
This is not about "on-the-fly" convertion. This is about loosing all characters > 0x80. |
This is very much like #790 but concerning Besides Thanks @mrh1997 for raising the issue, thanks @arcsector for bringing it back in my radar. |
0.27.1 (2018-06-02) ------------------------- Breaking changes: - Now ``discover_repository`` returns ``None`` if repository not found, instead of raising ``KeyError`` `#531 <https://github.com/libgit2/pygit2/issues/531>`_ Other changes: - New ``DiffLine.raw_content`` `#610 <https://github.com/libgit2/pygit2/issues/610>`_ - Fix tests failing in some cases `#795 <https://github.com/libgit2/pygit2/issues/795>`_ - Automatize wheels upload to pypi `#563 <https://github.com/libgit2/pygit2/issues/563>`_
0.27.2 (2018-09-16) ------------------------- - Add support for Python 3.7 `#809 <https://github.com/libgit2/pygit2/issues/809>`_ - New ``Object.short_id`` `#799 <https://github.com/libgit2/pygit2/issues/799>`_ `#806 <https://github.com/libgit2/pygit2/pull/806>`_ `#807 <https://github.com/libgit2/pygit2/pull/807>`_ - New ``Repository.descendant_of`` and ``Repository.branches.with_commit`` `#815 <https://github.com/libgit2/pygit2/issues/815>`_ `#816 <https://github.com/libgit2/pygit2/pull/816>`_ - Fix repository initialization in ``clone_repository(...)`` `#818 <https://github.com/libgit2/pygit2/issues/818>`_ - Fix several warnings and errors, commits `cd896ddc <https://github.com/libgit2/pygit2/commit/cd896ddc>`_ and `dfa536a3 <https://github.com/libgit2/pygit2/commit/dfa536a3>`_ - Documentation fixes and improvements `#805 <https://github.com/libgit2/pygit2/pull/805>`_ `#808 <https://github.com/libgit2/pygit2/pull/808>`_ 0.27.1 (2018-06-02) ------------------------- Breaking changes: - Now ``discover_repository`` returns ``None`` if repository not found, instead of raising ``KeyError`` `#531 <https://github.com/libgit2/pygit2/issues/531>`_ Other changes: - New ``DiffLine.raw_content`` `#610 <https://github.com/libgit2/pygit2/issues/610>`_ - Fix tests failing in some cases `#795 <https://github.com/libgit2/pygit2/issues/795>`_ - Automatize wheels upload to pypi `#563 <https://github.com/libgit2/pygit2/issues/563>`_
The field
DiffLine.content
contains a unicode line. As git does not know anything about the encoding of the files to be diffed (they are blobs), I expect this object to be of typestr
in py2 andbytes
in py3.Even worse if a file is i.e. latin-1 encoded and contains latin-1 specific characters, all these characters are mapped to '\xfffd'. Thus is impossible to diff non-ascii encoded text files correctly.
I suppose this is a pygit2 bug, as the libgit2.h interface works correctly, as it exposes this field as
const char *
(see https://github.com/libgit2/libgit2/blob/HEAD/include/git2/diff.h#L555)The text was updated successfully, but these errors were encountered: