UnicodeDecodeError #37

Closed
reactormonk opened this Issue Nov 26, 2013 · 2 comments

@reactormonk

When iterating through a specific replay, I get the following traceback:

 Traceback (most recent call last):
  File "hooks.py", line 83, in <module>
    print(successful_hooks(game))
  File "hooks.py", line 15, in successful_hooks
    for tick, user_messages, game_events, world, modifiers in game.stream(tick=0):
  File "build/bdist.linux-x86_64/egg/skadi/demo.py", line 136, in __iter__
  File "build/bdist.linux-x86_64/egg/skadi/demo.py", line 186, in advance
  File "build/bdist.linux-x86_64/egg/skadi/engine/user_message.py", line 67, in parse
  File "/scratch/02213/sh37476/vis/evaluate/dota-analysis/lib/python2.7/site-packages/google/protobuf/message.py", line 182, in ParseFromString
    self.MergeFromString(serialized)
  File "/scratch/02213/sh37476/vis/evaluate/dota-analysis/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 795, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/scratch/02213/sh37476/vis/evaluate/dota-analysis/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 827, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/scratch/02213/sh37476/vis/evaluate/dota-analysis/lib/python2.7/site-packages/google/protobuf/internal/decoder.py", line 410, in DecodeField
    field_dict[key] = local_unicode(buffer[pos:new_pos], 'utf-8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd1 in position 30: unexpected end of data

http://s000.tinyupload.com/?file_id=91725447477505323662

Owner

Minimal reproduction of the problem:

>>> import protobuf.impl.usermessages_pb2 as um
>>> st = um.CUserMsg_SayText2()
>>> d = '\x08\x0b\x10\x01\x1a\rDOTA_Chat_All"\x1f\xd0\x9f\xd0\xb0\xd1\x88\xd0\xb0 \xd1\x87\xd0\xbb\xd0\xb5\xd0\xbd\xd0\xb0 \xd0\xbf\xd0\xbe\xd1\x82\xd1\x80\xd0\xbe\xd1*\x02+1'
>>> st.ParseFromString(d)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/google/protobuf/message.py", line 182, in ParseFromString
    self.MergeFromString(serialized)
  File "/usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 795, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 827, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/usr/lib/python2.7/site-packages/google/protobuf/internal/decoder.py", line 410, in DecodeField
    field_dict[key] = local_unicode(buffer[pos:new_pos], 'utf-8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd1 in position 30: unexpected end of data
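
The failure can be narrowed down further without protobuf at all. The sketch below takes only the 31 bytes that follow the `"\x1f` tag and length prefix in `d` and decodes them directly. It is written for Python 3 (the session above is Python 2.7), and the names `payload`, `strict_error` and `lenient` are illustrative, not part of skadi or protobuf. It suggests the string ends on a lone `0xd1`, which is the first byte of a two-byte UTF-8 sequence whose second byte is missing.

```python
# The length-delimited string payload copied from `d` above (31 bytes).
payload = (
    b"\xd0\x9f\xd0\xb0\xd1\x88\xd0\xb0 "
    b"\xd1\x87\xd0\xbb\xd0\xb5\xd0\xbd\xd0\xb0 "
    b"\xd0\xbf\xd0\xbe\xd1\x82\xd1\x80\xd0\xbe\xd1"
)

# Strict decoding fails on the dangling lead byte at the very end.
try:
    payload.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# Lenient decoding substitutes U+FFFD for the incomplete sequence instead of raising.
lenient = payload.decode("utf-8", errors="replace")

print(len(payload), strict_error.start, hex(payload[strict_error.start]))
print(lenient.endswith("\ufffd"))
```

Decoding with `errors="replace"` is shown only to illustrate the difference; the protobuf decoder in the traceback decodes strictly, which is why the exception propagates.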

Owner

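This is, unfortunately, a protobuf problem: the chat string in this message ends in the middle of a multi-byte UTF-8 character (the trailing 0xd1), and protobuf's Python decoder raises instead of tolerating it. I will make skadi 1.0 catch this and other similar exceptions so that a single bad message doesn't abort the whole parse. The affected user messages (in this case, chat messages) will not be passed to the consumer.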
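
As a rough illustration of what swallowing the exception could look like, here is a minimal Python 3 sketch. It is not skadi's actual code: `parse_messages`, `strict_utf8` and the sample `messages` list are hypothetical stand-ins, with `strict_utf8` playing the role of a parser that decodes strictly the way the protobuf decoder does.

```python
def parse_messages(raw_messages, parse):
    """Parse each raw message, skipping any that raise UnicodeDecodeError.

    Hypothetical helper: returns the parsed messages plus a count of the
    ones that were dropped, so the caller can tell that data is missing.
    """
    parsed = []
    skipped = 0
    for raw in raw_messages:
        try:
            parsed.append(parse(raw))
        except UnicodeDecodeError:
            # One malformed message should not stop the rest of the stream.
            skipped += 1
    return parsed, skipped


def strict_utf8(raw):
    # Stand-in parser: strict UTF-8 decoding, which raises on truncated input.
    return raw.decode("utf-8")


# The middle entry ends on a dangling 0xd1, like the chat message above.
messages = [b"+1", b"\xd0\xbe\xd1", b"gg"]
parsed, skipped = parse_messages(messages, strict_utf8)
print(parsed, skipped)
```

Catching only `UnicodeDecodeError` keeps unrelated failures visible, and returning the skip count is one way to let a consumer notice that some chat messages were dropped.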
