Skip to content

Commit

Permalink
Fix invalid UTF-8 encoding with Python 3
Browse files Browse the repository at this point in the history
This fix fixes tensorflow#4

- UTF-8 codec can't decode byte 0x83.
- In Python 2, the six.u(value) is decoded with the unicode-escape codec.
- Non-UTF-8 strings must be converted to unicode objects before being added.
  • Loading branch information
GitHub30 committed Nov 1, 2018
1 parent a842112 commit 491065e
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion adanet/core/report_accessor.py
Expand Up @@ -111,7 +111,8 @@ def _update_proto_map_from_dict(proto, field_name, dictionary):
field[key].bool_value = value
elif isinstance(value, (six.string_types, six.binary_type)):
# Proto requires unicode strings.
field[key].string_value = six.u(value)
unicode_string = six.u(value) if six.PY2 else value.decode("unicode_escape")
field[key].string_value = unicode_string
elif isinstance(value, int):
field[key].int_value = value
elif isinstance(value, float):
Expand Down

0 comments on commit 491065e

Please sign in to comment.