CatFileContentStream.execute() should probably safe_decode() stdout and stderr #470

Closed
warsaw opened this Issue Jun 13, 2016 · 0 comments

warsaw commented Jun 13, 2016

FTR, using Python 3.5 here.

In a Debian project, we essentially want to run git show <ref>:debian/changelog, but the changelog contains some bogus non-UTF-8 characters. Here's an excerpt (not sure if this will come through in the GH issue):

sbuild (0.24) unstable; urgency=low

  * remove -qq from apt-get call in the updatechroot script
  * fix upgradechroot output and add -u to -y
  * added oldstable to distribution options
  * fix for dependency calculation for --arch-all builds from
    Martin K<F6>gler (Closes: #180859)
  * libpng-dev => libpng12-0-dev in sbuild.conf
  * add dpkg-dev to package dependencies - thanks Michael Banck
    (Closes: #182234)
  * chroot building fix and waldi's patch still to come

 -- Rick Younie <younie@debian.org>  Sat, 19 Apr 2003 14:41:03 -0700

However, the command fails with a traceback (notice the stray <F6> byte in the changelog entry above):

Traceback (most recent call last):
  File "/home/barry/projects/ubuntu/uddgit/usd-importer/usd-import", line 144, in get_changelog_versions_from_treeish
    ref, self._local_repo.git.show('%s:debian/changelog' % ref)))
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 459, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 920, in _call_process
    return self.execute(make_call(), **_kwargs)
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 708, in execute
    stdout_value = stdout_value.decode(defenc)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 1341: invalid start byte

Here defenc is utf-8. Since git/compat.py already has a safe_decode() function, it should probably be used instead of the plain decode() call on stdout_value and stderr_value, so that bogus data doesn't raise an exception.
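To illustrate, the failure can be reproduced without git, using just the offending changelog line. The sketch below is not GitPython's code: `tolerant_decode` is a hypothetical helper, and its choice of `errors="replace"` is an assumption about what a "safe" decode might do, not a statement of what `safe_decode()` actually does.

```python
# Minimal sketch; tolerant_decode is a hypothetical stand-in, not GitPython's API.
DEFENC = "utf-8"  # the default encoding named in the issue


def tolerant_decode(data, encoding=DEFENC):
    """Decode bytes without raising on invalid sequences (assumed behavior)."""
    if isinstance(data, bytes):
        # Invalid bytes become U+FFFD instead of raising UnicodeDecodeError.
        return data.decode(encoding, errors="replace")
    return data  # already text (or None): pass through unchanged


# The changelog line from the issue, with the Latin-1 byte 0xf6 in the name.
raw = b"    Martin K\xf6gler (Closes: #180859)\n"

try:
    raw.decode(DEFENC)  # strict decoding, as in the traceback
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print("strict decode failed:", exc.reason)

text = tolerant_decode(raw)
print(text)
```

The trade-off is that replacement is lossy: the original byte cannot be recovered from the decoded string, so callers that need the exact content should work with the raw bytes instead.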

warsaw added a commit to warsaw/GitPython that referenced this issue Jun 15, 2016

warsaw referenced this issue Jun 15, 2016

Merged

Fix issue #470 #475

Byron closed this in #475 Jun 20, 2016

Byron added a commit that referenced this issue Jun 20, 2016

Byron added the acknowledged label Jun 20, 2016

Byron added this to the v2.0.6 - Bugfixes milestone Jun 20, 2016

yarikoptic pushed a commit to yarikoptic/GitPython that referenced this issue Sep 8, 2017

Description: Calling a git command (e.g. `show`) that contains invalid
 UTF-8 will cause a UnicodeDecodeError.
Author: Barry Warsaw <barry@ubuntu.com>
Bug: gitpython-developers#470

Patch-Name: issue470-safe-decode.patch