UnicodeDecodeError on my mp3 #3

Closed
ibeex opened this Issue · 5 comments

2 participants

@ibeex

Hi, when I run info = TinyTag.get('my.mp3') I get the error below.
I can provide the mp3.

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-2-015becc94593> in <module>()
----> 1 info = TinyTag.get('my.mp3')

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in get(cls, filename, tags, length)
     56         if filename.lower().endswith('.mp3'):
     57             with open(filename, 'rb') as af:
---> 58                 return ID3(af, tags=tags, length=length)
     59         elif filename.lower().endswith(('.oga', '.ogg')):
     60             with open(filename, 'rb') as af:

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in __init__(self, filehandler, tags, length)
    117     def __init__(self, filehandler, tags=True, length=True):
    118         TinyTag.__init__(self)
--> 119         self.load(filehandler, tags=tags, length=length)
    120
    121     def _determine_length(self, fh):

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in load(self, filehandler, tags, length)
     77         """
     78         if tags:
---> 79             self._parse_tag(filehandler)
     80             filehandler.seek(0)
     81         if length:

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_tag(self, fh)
    159
    160     def _parse_tag(self, fh):
--> 161         self._parse_id3v2(fh)
    162         if not self.has_all_tags():  # try to get more info using id3v1
    163             fh.seek(-128, 2)  # id3v1 occuppies the last 128 bytes

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_id3v2(self, fh)
    183             while parsed_size < size:
    184                 is_id3_v22 = major == 2
--> 185                 frame_size = self._parse_frame(fh, is_v22=is_id3_v22)
    186                 if frame_size == 0:
    187                     break

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_frame(self, fh, is_v22)
    219                     self._parse_track(content)
    220                 else:
--> 221                     self._set_field(fieldname, content, self._decode_string)
    222             return frame_size
    223         return 0

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _set_field(self, fieldname, bytestring, transfunc)
     88             return
     89         if transfunc:
---> 90             setattr(self, fieldname, transfunc(bytestring))
     91         else:
     92             setattr(self, fieldname, bytestring)

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _decode_string(self, b)
    228             return self._unpad(codecs.decode(b[1:], 'ISO-8859-1'))
    229         if b[0:3] == b'\x01\xff\xfe':
--> 230             return self._unpad(codecs.decode(b[3:], 'UTF-16'))
    231         return self._unpad(codecs.decode(b, 'ISO-8859-1'))
    232

/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_16.pyc in decode(input, errors)
     14
     15 def decode(input, errors='strict'):
---> 16     return codecs.utf_16_decode(input, errors, True)
     17
     18 class IncrementalEncoder(codecs.IncrementalEncoder):

UnicodeDecodeError: 'utf16' codec can't decode byte 0x00 in position 20: truncated data
@devsnd
Owner

Hi ibeex,
Thanks for your report!

I have got an idea why it doesn't work, so if you've got the time, please try to change line 230 from

return self._unpad(codecs.decode(b[3:], 'UTF-16'))

to

return codecs.decode(self._unpad(b[3:]), 'UTF-16')

Changing the code as shown above will first remove the zero-termination byte and then decode the string, instead of doing it the other way around.

UTF-16 expects its input in 2-byte code units, but the byte string may be zero-terminated with a single byte. That gives the byte string an odd length, so one byte is "missing" and the decoder reports truncated data.
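
The odd-length problem described above can be reproduced outside of tinytag. This is only a minimal sketch: the helper name `try_decode_utf16` and the sample bytes are made up for illustration, and it is written for Python 3 rather than the Python 2.7 shown in the traceback.

```python
import codecs


def try_decode_utf16(raw):
    """Return the decoded text, or the decoder's error reason on failure.

    Hypothetical helper for illustration only; it is not part of tinytag.
    """
    try:
        return codecs.decode(raw, 'UTF-16')
    except UnicodeDecodeError as exc:
        return 'error: ' + exc.reason


# BOM (ff fe) followed by "Hi" in little-endian UTF-16: even length.
even = b'\xff\xfeH\x00i\x00'
# The same bytes plus a single zero-termination byte: odd length.
odd = even + b'\x00'

result_even = try_decode_utf16(even)
result_odd = try_decode_utf16(odd)

print(len(even), repr(result_even))
print(len(odd), repr(result_odd))
```

The even-length input decodes cleanly, while the odd-length input fails because the last code unit is incomplete.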

@ibeex

Still the same. I am not sure that this mp3 is actually UTF-16; mediainfo can read the tags, but none of the Python libs I tried can read the artist and album tags.
Here is the new error:

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _decode_string(self, b)
    228             return self._unpad(codecs.decode(b[1:], 'ISO-8859-1'))
    229         if b[0:3] == b'\x01\xff\xfe':
--> 230             return codecs.decode(self._unpad(b[3:]), 'UTF-16')
    231         return self._unpad(codecs.decode(b, 'ISO-8859-1'))
    232

/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_16.pyc in decode(input, errors)
     14
     15 def decode(input, errors='strict'):
---> 16     return codecs.utf_16_decode(input, errors, True)
     17
     18 class IncrementalEncoder(codecs.IncrementalEncoder):

UnicodeDecodeError: 'utf16' codec can't decode byte 0x4c in position 0: truncated data
@devsnd
Owner

Ok, could you please upload the MP3 somewhere, so I can try to find out the problem on my own?

@ibeex ibeex closed this
@ibeex ibeex reopened this
@devsnd devsnd closed this in 33ee413
@devsnd
Owner

Hey @ibeex, I have fixed the problem and added the first few bytes of the mp3 you sent to the test suite. Please try it out again now, and feel free to reopen this issue, or open another one, if you find something :)
