[BUG] AIFF text should be treated as UTF-8 #123

Closed
cskau opened this issue Dec 1, 2021 · 2 comments

cskau commented Dec 1, 2021

While the AIFF 1.3 spec says that text fields "contain pure ASCII characters", it also says:

A char can contain more than just ASCII characters. It can contain any number from -128 to 127 (inclusive).

In practice, AIFF files are nowadays often tagged with UTF-8 encoded text rather than plain ASCII.

Trying to load such an AIFF file with TinyTag will currently result in an exception:

'ascii' codec can't decode byte 0xc3 in position 26: ordinal not in range(128)

The encoding is hardcoded here (as well as two more places lower down):
https://github.com/devsnd/tinytag/blob/23bd79d601484856c95b07fdad927191f9203949/tinytag/tinytag.py#L1282
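
The failure can be reproduced without an AIFF file. The following is a minimal sketch, not TinyTag code: the byte string is made up so that a non-ASCII character happens to start at offset 26, matching the message above.

```python
# 26 ASCII bytes followed by UTF-8 "é" (bytes 0xC3 0xA9).
# The content is invented purely to reproduce the error position.
raw = b"A" * 26 + "\u00e9".encode("utf-8")

try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# The same bytes decode cleanly as UTF-8.
decoded = raw.decode("utf-8")
```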

ASCII is, however, a subset of UTF-8 (any valid ASCII is also valid UTF-8 by design). Treating these fields as UTF-8 therefore gives exactly the same behaviour for ASCII encoded text, while adding support for the now much more common UTF-8.
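
The subset property is easy to check in plain Python. This sketch decodes every 7-bit value with both codecs and compares the results; the sample string is arbitrary.

```python
# Every byte value in the ASCII range, 0x00 through 0x7F.
all_ascii = bytes(range(128))

# Decoding as ASCII and as UTF-8 yields the same string.
same_for_all = all_ascii.decode("ascii") == all_ascii.decode("utf-8")

# An ordinary ASCII tag value is unaffected by the codec switch.
sample = b"Velvet"
same_for_sample = sample.decode("ascii") == sample.decode("utf-8")
```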

Adding support for UTF-8 encoded text fields in AIFF is as simple as changing the three text field loads to:

self.title = self._unpad(chunk.read().decode('UTF-8')) 
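
As a standalone illustration of that change, here is a rough sketch. `unpad` and `decode_text_chunk` are hypothetical helpers written for this example; `unpad` only assumes that padding means trailing NUL bytes and is not TinyTag's actual `_unpad`.

```python
def unpad(text: str) -> str:
    """Hypothetical stand-in for _unpad: drop trailing NUL padding."""
    return text.rstrip("\x00")


def decode_text_chunk(data: bytes) -> str:
    """Decode the raw bytes of a text chunk as UTF-8 and strip padding."""
    return unpad(data.decode("utf-8"))


# A UTF-8 title padded with one NUL byte, and a plain ASCII title.
title = decode_text_chunk("Caf\u00e9".encode("utf-8") + b"\x00")
plain = decode_text_chunk(b"Velvet")
```

Bytes that are not valid UTF-8 (for example Latin-1 encoded text) would still raise `UnicodeDecodeError` with this approach.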

Sample File
An example of an AIFF file with UTF-8 encoded text fields can be found here:
https://chillhop.bandcamp.com/track/velvet

You'll need to click Digital Track to download the track in various formats, including AIFF. (Note you can choose to pay 0 to download it for free.)

cskau added the bug label Dec 1, 2021
devsnd closed this as completed in 55b8971 Dec 13, 2021

devsnd commented Dec 13, 2021

Hey @cskau,

I made the changes you suggested, and UTF-8 in AIFF is now supported in the master branch. This will be part of the next release.

cskau commented Dec 13, 2021

Fantastic! Thanks a lot.
