Songs copied to Android device have odd ID3 tags? #1893

Closed
jackwilsdon opened this issue Feb 24, 2016 · 47 comments
Labels
needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." stale

Comments

@jackwilsdon
Member

Problem

I'm not entirely sure whether this is a beets bug or something completely different, but songs that contain a right single quotation mark (’) in the title are rendered with â€™ in its place by pretty much every Android music player I tried (Google Play Music, an ID3 editor and Pulsar Music Player all exhibit the issue).

I also have an album that contains a degree sign in its title, which is rendered as B° instead of °.

Here are some screenshots of the issue (from Google Play Music):

And the output from beet list (Unicode characters are printed correctly):

Caravan Palace - <|°_°|> - Lone Digger
Caravan Palace - <|°_°|> - Comics
Caravan Palace - <|°_°|> - Aftermath
<continued>
OK Go - Hungry Ghosts - The Writing’s on the Wall
<continued>
Two Door Cinema Club - Tourist History - Eat That Up, It’s Good for You

It seems like it's something to do with how Android's music API (I assume that's what these apps are using) handles unicode in ID3 tags? I'm not sure whether or not this is a bug in the Android API itself or how beets is storing the ID3 tags (possibly related to #1885?).

I am using beet convert -d Music and copying the converted files to my device, but I don't believe this is related: the issue happens with both FLAC files (which are converted) and MP3 files (which are not).

Setup

  • OS: Mac OS X Yosemite (10.10.5)
  • Python version: 2.7.11
  • beets version: 1.3.16
  • I tried turning off plugins, and that made the problem go away (yes/no): no

My configuration (output of beet config) is:

directory: ~/Music/Music
convert:
    max_bitrate: 320
    never_convert_lossy_files: yes
    copy_album_art: no
    format: mp3
    formats:
        mp3:
            command: ffmpeg -i $source -ab 320k -map_metadata 0 $dest
            extension: mp3
        alac:
            command: ffmpeg -i $source -y -vn -acodec alac $dest
            extension: m4a
        aac:
            command: ffmpeg -i $source -y -vn -acodec libfaac -aq 100 $dest
            extension: m4a
        opus: ffmpeg -i $source -y -vn -acodec libopus -ab 96k $dest
        flac: ffmpeg -i $source -y -vn -acodec flac $dest
        ogg: ffmpeg -i $source -y -vn -acodec libvorbis -aq 2 $dest
        wma: ffmpeg -i $source -y -vn -acodec wmav2 -vn $dest
    dest:
    auto: no
    threads: 8
    tmpdir:

    paths: {}
    pretend: no
    quiet: no
    embed: yes
fetchart:
    minWidth: 500
    maxWidth: 1024
    enforce_ratio: yes
    minwidth: 0
    sources:
    - coverart
    - itunes
    - amazon
    - albumart
    cautious: no
    maxwidth: 0
    auto: yes
    cover_names:
    - cover
    - front
    - art
    - album
    - folder
    remote_priority: no
embedart:
    maxwidth: 1024
    remove_art_file: yes
    ifempty: yes
    compare_threshold: 0
    auto: yes

plugins: info convert fetchart embedart missing lastgenre
lastgenre:
    count: 1
    source: album
    force: yes
    min_weight: 10
    auto: yes
    whitelist: yes
    separator: ', '
    fallback:
    canonical: no
missing:
    count: no
    total: no
@jackwilsdon
Member Author

So after running exiftool -v3 -l 09\ Eat\ That\ Up\,\ It’s\ Good\ for\ You.mp3, I can see that the file itself does contain the correct tags (as expected, as cmus and other music players on OS X see the name fine).

Here is a royalty-free mp3 with the tags from one of the songs applied to it (using beet import) that exhibits the issue: Kevin MacLeod - Pixelland

Here is the output from exiftool (note: it doesn't seem to be able to handle unicode tags, however the hex value shows that the right single quotation mark is e2 80 99 which is correct):

  ExifToolVersion = 10.08
  FileName = 09 Eat That Up, It...s Good for You.mp3
  Directory = .
  FileSize = 7735810
  FileModifyDate = 1456326549
  FileAccessDate = 1456326907
  FileInodeChangeDate = 1456326549
  FilePermissions = 33188
  FileType = MP3
  FileTypeExtension = MP3
  MIMEType = audio/mpeg
  MPEGAudioVersion = 3
  AudioLayer = 1
  AudioBitrate = 13
  SampleRate = 0
  ChannelMode = 0
  MSStereo = 0
  MPEG_Audio_Bit26-27 = 0
  IntensityStereo = 0
  CopyrightFlag = 0
  OriginalMedia = 0
  Emphasis = 0
  ID3Size = 254340
ID3v2.4.0:
  + [ID3v2_4 directory, 254202 bytes]
  | Title = Eat That Up, It...s Good for You
  | - Tag 'TIT2' (34 bytes):
  |     0014: 03 45 61 74 20 54 68 61 74 20 55 70 2c 20 49 74 [.Eat That Up, It]
  |     0024: e2 80 99 73 20 47 6f 6f 64 20 66 6f 72 20 59 6f [...s Good for Yo]
  |     0034: 75 00                                           [u.]
  | Artist = Two Door Cinema Club
  | - Tag 'TPE1' (22 bytes):
  |     0040: 03 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |     0050: 20 43 6c 75 62 00                               [ Club.]
  | Track = 9/10
  | - Tag 'TRCK' (6 bytes):
  |     0060: 03 39 2f 31 30 00                               [.9/10.]
  | Album = Tourist History
  | - Tag 'TALB' (17 bytes):
  |     0070: 03 54 6f 75 72 69 73 74 20 48 69 73 74 6f 72 79 [.Tourist History]
  |     0080: 00                                              [.]
  | PartOfSet = 1/1
  | - Tag 'TPOS' (5 bytes):
  |     008b: 03 31 2f 31 00                                  [.1/1.]
  | RecordingTime = 2010-02-24
  | - Tag 'TDRC' (12 bytes):
  |     009a: 03 32 30 31 30 2d 30 32 2d 32 34 00             [.2010-02-24.]
  | Genre = Indie Rock
  | - Tag 'TCON' (12 bytes):
  |     00b0: 03 49 6e 64 69 65 20 52 6f 63 6b 00             [.Indie Rock.]
  | PictureMIMEType = image/jpeg
  | PictureType = 3
  | PictureDescription = 
  | Picture = .....JFIF......C.........................................................[snip]
  | - Tag 'APIC' (244621 bytes):
  |     00c6: 03 69 6d 61 67 65 2f 6a 70 65 67 00 03 00 ff d8 [.image/jpeg.....]
  |     00d6: ff e0 00 10 4a 46 49 46 00 01 01 00 00 01 00 01 [....JFIF........]
  |     00e6: 00 00 ff db 00 43 00 01 01 01 01 01 01 01 01 01 [.....C..........]
  |     00f6: 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 [................]
  |     0106: 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 [................]
  |     [snip 244541 bytes]
  | BeatsPerMinute = 0
  | - Tag 'TBPM' (3 bytes):
  |    3bc5d: 03 30 00                                        [.0.]
  | Compilation = 0
  | - Tag 'TCMP' (3 bytes):
  |    3bc6a: 03 30 00                                        [.0.]
  | OriginalReleaseTime = 2010-02-17
  | - Tag 'TDOR' (12 bytes):
  |    3bc77: 03 32 30 31 30 2d 30 32 2d 31 37 00             [.2010-02-17.]
  | Language = eng
  | - Tag 'TLAN' (5 bytes):
  |    3bc8d: 03 65 6e 67 00                                  [.eng.]
  | Media = Digital Media
  | - Tag 'TMED' (15 bytes):
  |    3bc9c: 03 44 69 67 69 74 61 6c 20 4d 65 64 69 61 00    [.Digital Media.]
  | Band = Two Door Cinema Club
  | - Tag 'TPE2' (22 bytes):
  |    3bcb5: 03 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |    3bcc5: 20 43 6c 75 62 00                               [ Club.]
  | Publisher = Kitsun..
  | - Tag 'TPUB' (10 bytes):
  |    3bcd5: 03 4b 69 74 73 75 6e c3 a9 00                   [.Kitsun...]
  | PerformerSortOrder = Two Door Cinema Club
  | - Tag 'TSOP' (22 bytes):
  |    3bce9: 03 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |    3bcf9: 20 43 6c 75 62 00                               [ Club.]
  | UserDefinedText = (ALBUMARTISTSORT) Two Door Cinema Club
  | - Tag 'TXXX' (38 bytes):
  |    3bd09: 03 41 4c 42 55 4d 41 52 54 49 53 54 53 4f 52 54 [.ALBUMARTISTSORT]
  |    3bd19: 00 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |    3bd29: 20 43 6c 75 62 00                               [ Club.]
  | UserDefinedText = (Album Artist Credit) Two Door Cinema Club
  | - Tag 'TXXX' (42 bytes):
  |    3bd39: 03 41 6c 62 75 6d 20 41 72 74 69 73 74 20 43 72 [.Album Artist Cr]
  |    3bd49: 65 64 69 74 00 54 77 6f 20 44 6f 6f 72 20 43 69 [edit.Two Door Ci]
  |    3bd59: 6e 65 6d 61 20 43 6c 75 62 00                   [nema Club.]
  | UserDefinedText = (Artist Credit) Two Door Cinema Club
  | - Tag 'TXXX' (36 bytes):
  |    3bd6d: 03 41 72 74 69 73 74 20 43 72 65 64 69 74 00 54 [.Artist Credit.T]
  |    3bd7d: 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 20 43 [wo Door Cinema C]
  |    3bd8d: 6c 75 62 00                                     [lub.]
  | UserDefinedText = (CATALOGNUMBER) 355028890
  | - Tag 'TXXX' (25 bytes):
  |    3bd9b: 03 43 41 54 41 4c 4f 47 4e 55 4d 42 45 52 00 33 [.CATALOGNUMBER.3]
  |    3bdab: 35 35 30 32 38 38 39 30 00                      [55028890.]
  | UserDefinedText = (MusicBrainz Album Artist Id) 6f1de078-6684-4792-820d-2ffad64c15ed
  | - Tag 'TXXX' (66 bytes):
  |    3bdbe: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3bdce: 75 6d 20 41 72 74 69 73 74 20 49 64 00 36 66 31 [um Artist Id.6f1]
  |    3bdde: 64 65 30 37 38 2d 36 36 38 34 2d 34 37 39 32 2d [de078-6684-4792-]
  |    3bdee: 38 32 30 64 2d 32 66 66 61 64 36 34 63 31 35 65 [820d-2ffad64c15e]
  |    3bdfe: 64 00                                           [d.]
  | UserDefinedText = (MusicBrainz Album Id) 54d5c88c-7a5b-4502-93ab-a6b245611a94
  | - Tag 'TXXX' (59 bytes):
  |    3be0a: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3be1a: 75 6d 20 49 64 00 35 34 64 35 63 38 38 63 2d 37 [um Id.54d5c88c-7]
  |    3be2a: 61 35 62 2d 34 35 30 32 2d 39 33 61 62 2d 61 36 [a5b-4502-93ab-a6]
  |    3be3a: 62 32 34 35 36 31 31 61 39 34 00                [b245611a94.]
  | UserDefinedText = (MusicBrainz Album Release Country) XE
  | - Tag 'TXXX' (38 bytes):
  |    3be4f: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3be5f: 75 6d 20 52 65 6c 65 61 73 65 20 43 6f 75 6e 74 [um Release Count]
  |    3be6f: 72 79 00 58 45 00                               [ry.XE.]
  | UserDefinedText = (MusicBrainz Album Status) Official
  | - Tag 'TXXX' (35 bytes):
  |    3be7f: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3be8f: 75 6d 20 53 74 61 74 75 73 00 4f 66 66 69 63 69 [um Status.Offici]
  |    3be9f: 61 6c 00                                        [al.]
  | UserDefinedText = (MusicBrainz Album Type) album
  | - Tag 'TXXX' (30 bytes):
  |    3beac: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3bebc: 75 6d 20 54 79 70 65 00 61 6c 62 75 6d 00       [um Type.album.]
  | UserDefinedText = (MusicBrainz Artist Id) 6f1de078-6684-4792-820d-2ffad64c15ed
  | - Tag 'TXXX' (60 bytes):
  |    3bed4: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 72 74 [.MusicBrainz Art]
  |    3bee4: 69 73 74 20 49 64 00 36 66 31 64 65 30 37 38 2d [ist Id.6f1de078-]
  |    3bef4: 36 36 38 34 2d 34 37 39 32 2d 38 32 30 64 2d 32 [6684-4792-820d-2]
  |    3bf04: 66 66 61 64 36 34 63 31 35 65 64 00             [ffad64c15ed.]
  | UserDefinedText = (MusicBrainz Release Group Id) a3597f45-b9d9-4c8a-803e-0a7d0d4d4e9b
  | - Tag 'TXXX' (67 bytes):
  |    3bf1a: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 52 65 6c [.MusicBrainz Rel]
  |    3bf2a: 65 61 73 65 20 47 72 6f 75 70 20 49 64 00 61 33 [ease Group Id.a3]
  |    3bf3a: 35 39 37 66 34 35 2d 62 39 64 39 2d 34 63 38 61 [597f45-b9d9-4c8a]
  |    3bf4a: 2d 38 30 33 65 2d 30 61 37 64 30 64 34 64 34 65 [-803e-0a7d0d4d4e]
  |    3bf5a: 39 62 00                                        [9b.]
  | UserDefinedText = (Script) Latn
  | - Tag 'TXXX' (13 bytes):
  |    3bf67: 03 53 63 72 69 70 74 00 4c 61 74 6e 00          [.Script.Latn.]
  | ID3_UFID = http://musicbrainz.org5dabd513-6e5a-4d95-a534-35cb3a4cc976
  | - Tag 'UFID' (59 bytes):
  |    3bf7e: 68 74 74 70 3a 2f 2f 6d 75 73 69 63 62 72 61 69 [http://musicbrai]
  |    3bf8e: 6e 7a 2e 6f 72 67 00 35 64 61 62 64 35 31 33 2d [nz.org.5dabd513-]
  |    3bf9e: 36 65 35 61 2d 34 64 39 35 2d 61 35 33 34 2d 33 [6e5a-4d95-a534-3]
  |    3bfae: 35 63 62 33 61 34 63 63 39 37 36                [5cb3a4cc976]
  | Lyrics = 
  | - Tag 'USLT' (6 bytes):
  |    3bfc3: 03 00 00 00 00 00                               [......]
ID3v1:
  + [BinaryData directory, 128 bytes]
  | Title = Eat That Up, It?s Good for You
  | - Tag 0x0003 (30 bytes, string[30]):
  |   760985: 45 61 74 20 54 68 61 74 20 55 70 2c 20 49 74 3f [Eat That Up, It?]
  |   760995: 73 20 47 6f 6f 64 20 66 6f 72 20 59 6f 75       [s Good for You]
  | Artist = Two Door Cinema Club
  | - Tag 0x0021 (30 bytes, string[30]):
  |   7609a3: 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 20 [Two Door Cinema ]
  |   7609b3: 43 6c 75 62 00 00 00 00 00 00 00 00 00 00       [Club..........]
  | Album = Tourist History
  | - Tag 0x003f (30 bytes, string[30]):
  |   7609c1: 54 6f 75 72 69 73 74 20 48 69 73 74 6f 72 79 00 [Tourist History.]
  |   7609d1: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Year = 2010
  | - Tag 0x005d (4 bytes, string[4]):
  |   7609df: 32 30 31 30                                     [2010]
  | Comment = 
  | - Tag 0x0061 (30 bytes, string[30]):
  |   7609e3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  |   7609f3: 00 00 00 00 00 00 00 00 00 00 00 00 00 09       [..............]
  | Track = 0 9
  | - Tag 0x007d (2 bytes, int8u[2]):
  |   7609ff: 00 09                                           [..]
  | Genre = 187
  | - Tag 0x007f (1 bytes, int8u[1]):
  |   760a01: bb                                              [.]
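
For readers following the dump, the leading 03 byte of each ID3v2.4 text frame above is the text-encoding marker, and 03 means UTF-8. The following is a minimal sketch that decodes the TIT2 bytes copied from the dump to confirm that the stored title is valid UTF-8. It is not part of exiftool, beets or mutagen; the name `decode_text_frame` and the lookup table are made up for illustration.

```python
# Text-encoding markers defined by ID3v2.4, mapped to Python codec names.
ID3V24_ENCODINGS = {
    0: "latin-1",    # ISO-8859-1
    1: "utf-16",     # UTF-16 with BOM
    2: "utf-16-be",  # UTF-16BE without BOM
    3: "utf-8",
}


def decode_text_frame(body: bytes) -> str:
    """Decode the body of an ID3v2.4 text frame (hypothetical helper)."""
    codec = ID3V24_ENCODINGS[body[0]]
    # Drop the encoding byte, decode, then strip the trailing terminator.
    return body[1:].decode(codec).rstrip("\x00")


# The 34 bytes of the TIT2 frame, copied from the exiftool dump.
tit2 = bytes.fromhex(
    "03 45 61 74 20 54 68 61 74 20 55 70 2c 20 49 74 "
    "e2 80 99 73 20 47 6f 6f 64 20 66 6f 72 20 59 6f "
    "75 00"
)

print(decode_text_frame(tit2))
```

If the frame were mislabelled (for example marked 00 but containing UTF-8 bytes), this decode would produce mojibake instead of the intended title, which is one way to compare a "good" and a "bad" file.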

@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Feb 24, 2016
@sampsyo
Member

sampsyo commented Feb 24, 2016

Very strange! A couple of questions come to mind:

  • As a counterfactual, do you have any files with these characters that do display correctly on Android? Comparing them with the broken files might yield some insight.
  • Does this only come up for MP3/ID3, or have you tried other formats too?

@jackwilsdon
Member Author

  1. So it looks like any songs using an ASCII quote (which is 27 in hex) render fine. From my initial findings, it seems that Android reads the ID3 tags as ASCII instead of Unicode, which leads to the issues I am experiencing.
  2. It also happens with FLAC files, I suspect because Android is generally handling all music metadata as ASCII.

I've noticed something else unusual; I have another song in my library with a unicode character in the title that renders fine on Android, which furthers my belief that there is an issue with the ID3 tag itself.

The Unicode character in the other song is ❦ (a floral heart), which renders perfectly on Android.

Here is an exiftool dump of the file that renders the unicode correctly:

  ExifToolVersion = 10.08
  FileName = 02 ... (Ripe & Ruin).mp3
  Directory = .
  FileSize = 2270216
  FileModifyDate = 1456067754
  FileAccessDate = 1456341320
  FileInodeChangeDate = 1456067754
  FilePermissions = 33188
  FileType = MP3
  FileTypeExtension = MP3
  MIMEType = audio/mpeg
  MPEGAudioVersion = 3
  AudioLayer = 1
  AudioBitrate = 3
  SampleRate = 0
  ChannelMode = 1
  MSStereo = 0
  MPEG_Audio_Bit26-27 = 0
  IntensityStereo = 0
  CopyrightFlag = 0
  OriginalMedia = 1
  Emphasis = 0
  VBRFrames = 2760
  VBRBytes = 2100579
  ID3Size = 169637
ID3v2.4.0:
  + [ID3v2_4 directory, 169499 bytes]
  | Title = ... (Ripe & Ruin)
  | - Tag 'TIT2' (19 bytes):
  |     0014: 03 e2 9d a6 20 28 52 69 70 65 20 26 20 52 75 69 [.... (Ripe & Rui]
  |     0024: 6e 29 00                                        [n).]
  | Artist = alt-J
  | - Tag 'TPE1' (7 bytes):
  |     0031: 03 61 6c 74 2d 4a 00                            [.alt-J.]
  | Track = 2/13
  | - Tag 'TRCK' (6 bytes):
  |     0042: 03 32 2f 31 33 00                               [.2/13.]
  | Album = An Awesome Wave
  | - Tag 'TALB' (17 bytes):
  |     0052: 03 41 6e 20 41 77 65 73 6f 6d 65 20 57 61 76 65 [.An Awesome Wave]
  |     0062: 00                                              [.]
  | PartOfSet = 1/1
  | - Tag 'TPOS' (5 bytes):
  |     006d: 03 31 2f 31 00                                  [.1/1.]
  | RecordingTime = 2012
  | - Tag 'TDRC' (6 bytes):
  |     007c: 03 32 30 31 32 00                               [.2012.]
  | Genre = Electronic
  | - Tag 'TCON' (12 bytes):
  |     008c: 03 45 6c 65 63 74 72 6f 6e 69 63 00             [.Electronic.]
  | PictureMIMEType = image/jpeg
  | PictureType = 3
  | PictureDescription = 
  | Picture = .....JFIF...HH..C........................................................[snip]
  | - Tag 'APIC' (98167 bytes):
  |     00a2: 03 69 6d 61 67 65 2f 6a 70 65 67 00 03 00 ff d8 [.image/jpeg.....]
  |     00b2: ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 00 48 [....JFIF.....H.H]
  |     00c2: 00 00 ff db 00 43 00 03 02 02 03 02 02 03 03 03 [.....C..........]
  |     00d2: 03 04 03 03 04 05 08 05 05 04 04 05 0a 07 07 06 [................]
  |     00e2: 08 0c 0a 0c 0c 0b 0a 0b 0b 0d 0e 12 10 0d 0e 11 [................]
  |     [snip 98087 bytes]
  | Private (SubDirectory) -->
  | + [PRIV directory, 69368 bytes]
  | | TRAKTOR4 = DMRT....RDH 0.SKHC....DOMF.....NSRV..ATAD....BDNA....@WTRAq..}} 086/WW[snip]
  | | - Tag 'TRAKTOR4' (69359 bytes):
  | |     0009: 44 4d 52 54 e3 0c 01 00 02 00 00 00 52 44 48 20 [DMRT........RDH ]
  | |     0019: 30 00 00 00 03 00 00 00 53 4b 48 43 04 00 00 00 [0.......SKHC....]
  | |     0029: 00 00 00 00 e2 c2 8c 00 44 4f 4d 46 04 00 00 00 [........DOMF....]
  | |     0039: 00 00 00 00 1a 0b df 07 4e 53 52 56 04 00 00 00 [........NSRV....]
  | |     0049: 00 00 00 00 07 00 00 00 41 54 41 44 9b 0c 01 00 [........ATAD....]
  | |     [snip 69279 bytes]
  | BeatsPerMinute = 89
  | - Tag 'TBPM' (4 bytes):
  |    28f25: 03 38 39 00                                     [.89.]
  | Compilation = 0
  | - Tag 'TCMP' (3 bytes):
  |    28f33: 03 30 00                                        [.0.]
  | OriginalReleaseTime = 2012-05-28
  | - Tag 'TDOR' (12 bytes):
  |    28f40: 03 32 30 31 32 2d 30 35 2d 32 38 00             [.2012-05-28.]
  | InitialKey = 2m
  | - Tag 'TKEY' (4 bytes):
  |    28f56: 03 32 6d 00                                     [.2m.]
  | Language = eng
  | - Tag 'TLAN' (5 bytes):
  |    28f64: 03 65 6e 67 00                                  [.eng.]
  | Media = CD
  | - Tag 'TMED' (4 bytes):
  |    28f73: 03 43 44 00                                     [.CD.]
  | Band = alt-J
  | - Tag 'TPE2' (7 bytes):
  |    28f81: 03 61 6c 74 2d 4a 00                            [.alt-J.]
  | Publisher = Liberator Music
  | - Tag 'TPUB' (17 bytes):
  |    28f92: 03 4c 69 62 65 72 61 74 6f 72 20 4d 75 73 69 63 [.Liberator Music]
  |    28fa2: 00                                              [.]
  | PerformerSortOrder = alt-J
  | - Tag 'TSOP' (7 bytes):
  |    28fad: 03 61 6c 74 2d 4a 00                            [.alt-J.]
  | UserDefinedText = (ALBUMARTISTSORT) alt-J
  | - Tag 'TXXX' (23 bytes):
  |    28fbe: 03 41 4c 42 55 4d 41 52 54 49 53 54 53 4f 52 54 [.ALBUMARTISTSORT]
  |    28fce: 00 61 6c 74 2d 4a 00                            [.alt-J.]
  | UserDefinedText = (Album Artist Credit) alt-J
  | - Tag 'TXXX' (27 bytes):
  |    28fdf: 03 41 6c 62 75 6d 20 41 72 74 69 73 74 20 43 72 [.Album Artist Cr]
  |    28fef: 65 64 69 74 00 61 6c 74 2d 4a 00                [edit.alt-J.]
  | UserDefinedText = (Artist Credit) alt-J
  | - Tag 'TXXX' (21 bytes):
  |    29004: 03 41 72 74 69 73 74 20 43 72 65 64 69 74 00 61 [.Artist Credit.a]
  |    29014: 6c 74 2d 4a 00                                  [lt-J.]
  | UserDefinedText = (CATALOGNUMBER) LIB140CD
  | - Tag 'TXXX' (24 bytes):
  |    29023: 03 43 41 54 41 4c 4f 47 4e 55 4d 42 45 52 00 4c [.CATALOGNUMBER.L]
  |    29033: 49 42 31 34 30 43 44 00                         [IB140CD.]
  | UserDefinedText = (MusicBrainz Album Artist Id) fc7bbf00-fbaa-4736-986b-b3ac0266ca9b
  | - Tag 'TXXX' (66 bytes):
  |    29045: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    29055: 75 6d 20 41 72 74 69 73 74 20 49 64 00 66 63 37 [um Artist Id.fc7]
  |    29065: 62 62 66 30 30 2d 66 62 61 61 2d 34 37 33 36 2d [bbf00-fbaa-4736-]
  |    29075: 39 38 36 62 2d 62 33 61 63 30 32 36 36 63 61 39 [986b-b3ac0266ca9]
  |    29085: 62 00                                           [b.]
  | UserDefinedText = (MusicBrainz Album Id) 53042259-1287-4f47-9a99-5a7413df7b3f
  | - Tag 'TXXX' (59 bytes):
  |    29091: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    290a1: 75 6d 20 49 64 00 35 33 30 34 32 32 35 39 2d 31 [um Id.53042259-1]
  |    290b1: 32 38 37 2d 34 66 34 37 2d 39 61 39 39 2d 35 61 [287-4f47-9a99-5a]
  |    290c1: 37 34 31 33 64 66 37 62 33 66 00                [7413df7b3f.]
  | UserDefinedText = (MusicBrainz Album Release Country) AU
  | - Tag 'TXXX' (38 bytes):
  |    290d6: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    290e6: 75 6d 20 52 65 6c 65 61 73 65 20 43 6f 75 6e 74 [um Release Count]
  |    290f6: 72 79 00 41 55 00                               [ry.AU.]
  | UserDefinedText = (MusicBrainz Album Status) Official
  | - Tag 'TXXX' (35 bytes):
  |    29106: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    29116: 75 6d 20 53 74 61 74 75 73 00 4f 66 66 69 63 69 [um Status.Offici]
  |    29126: 61 6c 00                                        [al.]
  | UserDefinedText = (MusicBrainz Album Type) album
  | - Tag 'TXXX' (30 bytes):
  |    29133: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    29143: 75 6d 20 54 79 70 65 00 61 6c 62 75 6d 00       [um Type.album.]
  | UserDefinedText = (MusicBrainz Artist Id) fc7bbf00-fbaa-4736-986b-b3ac0266ca9b
  | - Tag 'TXXX' (60 bytes):
  |    2915b: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 72 74 [.MusicBrainz Art]
  |    2916b: 69 73 74 20 49 64 00 66 63 37 62 62 66 30 30 2d [ist Id.fc7bbf00-]
  |    2917b: 66 62 61 61 2d 34 37 33 36 2d 39 38 36 62 2d 62 [fbaa-4736-986b-b]
  |    2918b: 33 61 63 30 32 36 36 63 61 39 62 00             [3ac0266ca9b.]
  | UserDefinedText = (MusicBrainz Release Group Id) 0d8562eb-7f72-427b-8a0b-984cc5ee7766
  | - Tag 'TXXX' (67 bytes):
  |    291a1: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 52 65 6c [.MusicBrainz Rel]
  |    291b1: 65 61 73 65 20 47 72 6f 75 70 20 49 64 00 30 64 [ease Group Id.0d]
  |    291c1: 38 35 36 32 65 62 2d 37 66 37 32 2d 34 32 37 62 [8562eb-7f72-427b]
  |    291d1: 2d 38 61 30 62 2d 39 38 34 63 63 35 65 65 37 37 [-8a0b-984cc5ee77]
  |    291e1: 36 36 00                                        [66.]
  | UserDefinedText = (Script) Latn
  | - Tag 'TXXX' (13 bytes):
  |    291ee: 03 53 63 72 69 70 74 00 4c 61 74 6e 00          [.Script.Latn.]
  | ID3_UFID = http://musicbrainz.org875bad60-ef81-42aa-b719-b97455092e45
  | - Tag 'UFID' (59 bytes):
  |    29205: 68 74 74 70 3a 2f 2f 6d 75 73 69 63 62 72 61 69 [http://musicbrai]
  |    29215: 6e 7a 2e 6f 72 67 00 38 37 35 62 61 64 36 30 2d [nz.org.875bad60-]
  |    29225: 65 66 38 31 2d 34 32 61 61 2d 62 37 31 39 2d 62 [ef81-42aa-b719-b]
  |    29235: 39 37 34 35 35 30 39 32 65 34 35                [97455092e45]
  | Lyrics = 
  | - Tag 'USLT' (6 bytes):
  |    2924a: 03 00 00 00 00 00                               [......]
ID3v1:
  + [BinaryData directory, 128 bytes]
  | Title = ? (Ripe & Ruin)
  | - Tag 0x0003 (30 bytes, string[30]):
  |   22a38b: 3f 20 28 52 69 70 65 20 26 20 52 75 69 6e 29 00 [? (Ripe & Ruin).]
  |   22a39b: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Artist = alt-J
  | - Tag 0x0021 (30 bytes, string[30]):
  |   22a3a9: 61 6c 74 2d 4a 00 00 00 00 00 00 00 00 00 00 00 [alt-J...........]
  |   22a3b9: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Album = An Awesome Wave
  | - Tag 0x003f (30 bytes, string[30]):
  |   22a3c7: 41 6e 20 41 77 65 73 6f 6d 65 20 57 61 76 65 00 [An Awesome Wave.]
  |   22a3d7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Year = 2012
  | - Tag 0x005d (4 bytes, string[4]):
  |   22a3e5: 32 30 31 32                                     [2012]
  | Comment = 
  | - Tag 0x0061 (30 bytes, string[30]):
  |   22a3e9: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  |   22a3f9: 00 00 00 00 00 00 00 00 00 00 00 00 00 02       [..............]
  | Track = 0 2
  | - Tag 0x007d (2 bytes, int8u[2]):
  |   22a405: 00 02                                           [..]
  | Genre = 52
  | - Tag 0x007f (1 bytes, int8u[1]):
  |   22a407: 34                                              [4]

And here is a copy of the file (audio replaced with Kevin MacLeod - Pixelland again).

EDIT: After doing a preliminary diff of the exiftool outputs, I can't see an obvious difference between the tags.

@sampsyo
Member

sampsyo commented Feb 24, 2016

Truly mysterious.

So, we can hypothesize that beets is doing something wrong to files that is causing Android to read their tags using the wrong encoding. Following that lead, has the Alt-J track (which seems to work fine) been "touched" by beets? If so, we might want to reject that hypothesis.

A second hypothesis would be that some characters, but not all, are causing problems. Are there any tracks with degrees symbols or curly apostrophes that do work correctly?

@jackwilsdon
Member Author

Sadly I don't have a copy of the alt-J track that has not been "touched" by beets, as my entire music library is managed by it.

I don't have any tracks in my library that have degrees symbols or curly apostrophes that do work, so I can't prove/disprove your second hypothesis.

I find it odd that the degree symbol shows up but is prefixed with a "B", and I can't work out why. The UTF-8 encoding of the degree sign is C2 B0, and neither byte represents the letter "B" (which is 42 in hex).

@sampsyo
Member

sampsyo commented Feb 25, 2016

Yeah, that thing with the capital B is truly strange. Maybe this is some other non-ASCII, non-Unicode encoding we're seeing?

@Kernald
Contributor

Kernald commented Feb 28, 2016

I noticed a while ago the same issue with the <|°_°|> album from Caravan Palace, copied to my phone via beets alternatives. I'll try to rip the album from the CD again and copy it directly, without using beets, just to check.

@sampsyo
Member

sampsyo commented Feb 28, 2016

Wow; we have independent confirmation from another source for the same album! Thanks, @Kernald; keep us posted about what you find.

@Kernald
Contributor

Kernald commented Feb 29, 2016

So, I have the FLAC files, directly extracted from the CD via SoundJuicer. But as Google Play Music doesn't recognize those, I converted them to MP3 with ffmpeg:

find . -name "*.flac" -exec ffmpeg -i {} -ab 160k -map_metadata 0 -id3v2_version 3 {}.mp3 \;

Here, the album title was displayed correctly:

Input #0, flac, from './Disc 1 - 04 - Aftermath.flac':
  Metadata:
    TITLE           : Aftermath
    ARTIST          : Caravan Palace
    track           : 4
    TRACKTOTAL      : 11
    ALBUM           : <|°_°|>

[…]

Output #0, mp3, to './Disc 1 - 04 - Aftermath.flac.mp3':
  Metadata:
    TIT2            : Aftermath
    TPE1            : Caravan Palace
    TRCK            : 4
    TRACKTOTAL      : 11
    TALB            : <|°_°|>

And… Here's the result. I can't see that as an improvement ;-)

screenshot_20160229-191927

If you have any suggestion, I'll be happy to try!

@sampsyo
Member

sampsyo commented Feb 29, 2016

Wow. Truly fascinating. I guess we can conclude ffmpeg has the same problem getting the encoding right for what Android expects?

I tried googling for similar problems. This is a really old bug, and probably not relevant: https://code.google.com/p/android/issues/detail?id=2688

I tested the hypothesis anyway:

>>> print(u'<|°_°|>'.encode('utf8').decode('latin1'))
<|Â°_Â°|>

So that doesn't explain either wrong result. 😢

I can't seem to find any encoding/decoding mismatch that produces exactly these results… maybe it's time to write a script to try them all and see what happens??
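
The "try them all" script floated here could look roughly like the sketch below. It encodes a string as UTF-8 and decodes the bytes with each codec in a list, so the results can be compared by eye with what the phone shows. The candidate list is an arbitrary assumption, not something Android is known to use, and `mojibake_candidates` is a made-up name.

```python
# Arbitrary selection of legacy codecs to try; extend as needed.
CANDIDATE_CODECS = [
    "latin-1", "cp1250", "cp1251", "cp1252",
    "koi8_r", "gbk", "big5", "shift_jis", "euc_kr",
]


def mojibake_candidates(text, codecs=CANDIDATE_CODECS):
    """Return {codec: text} for UTF-8 bytes misread with each codec."""
    raw = text.encode("utf-8")
    results = {}
    for codec in codecs:
        # errors="replace" keeps going when a byte is invalid in a codec.
        results[codec] = raw.decode(codec, errors="replace")
    return results


if __name__ == "__main__":
    for sample in ("Writing\u2019s", "<|\u00b0_\u00b0|>"):
        for codec, garbled in mojibake_candidates(sample).items():
            print(f"{codec:>10}: {garbled}")
```

Matching one of the printed lines against a screenshot would identify the codec the device guessed for that particular tag.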

@sampsyo
Member

sampsyo commented Feb 29, 2016

Well, I gave ftfy a try. Here's what it found:

>>> ftfy.fixes.fix_encoding_and_explain('Writingâ€™s')
('Writing’s', [('encode', 'sloppy-windows-1252', 0), ('decode', 'utf-8', 0)])

And sure enough:

>>> 'Writing’s'.encode('utf8').decode('windows-1252')
'Writingâ€™s'

which indicates that beets is writing UTF-8 and Android is trying to interpret it as a weird Windows codepage. 😕

Still no leads on what's up with the B° mojibake though… and I don't know how to type those Chinese (?) characters to give those a try.

@sampsyo
Copy link
Member

sampsyo commented Feb 29, 2016

One other sad fact: ID3 doesn't even seem to specify Windows 1252 as one of the possible encodings: https://en.wikipedia.org/wiki/ID3#ID3v2

So it's a mystery as to why Android is using it.

@sampsyo
Copy link
Member

sampsyo commented Feb 29, 2016

OK, sorry for all the commenting, but I apparently can't let this go!

First: I looked at @jackwilsdon's files, and they correctly report the encoding as UTF-8. So that's a dead end.

Some googling revealed that this is probably a bug in Android: https://code.google.com/p/android/issues/detail?id=81428

People still seem to be complaining about it as recently as last month. Apparently, the Android frameworks—for all media formats—just ignore the specified encoding and guess, based on the data, which encoding it uses. Guessing encodings is notoriously difficult, so it frequently guesses wrong. Given that, I'm not sure I can see how to work around this. 😢

@jackwilsdon
Member Author

It sounds like there isn't much we can do sadly, just wait for Android to fix it I guess!

I've been thinking of a fix Beets-side but it feels a bit hacky;

A tagreplace plugin could be written to replace certain characters with others, configured in the user's configuration file. This wouldn't work for tags like <°_°> but it would work for the quotes I think.

@lazka
Contributor

lazka commented Mar 21, 2016

In quodlibet we use utf-8 for ascii text and utf-16 for everything else. If Android is really guessing, utf-16 shouldn't give it much choice.
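
As an illustration of that policy (not quodlibet's or mutagen's actual code), a text frame body could be built like this: ASCII-only text gets encoding byte 03 (UTF-8), anything else gets encoding byte 01 (UTF-16 with a byte-order mark). The function name `encode_text_frame` is hypothetical.

```python
def encode_text_frame(text: str) -> bytes:
    """Build an ID3v2.4 text frame body using the ASCII/UTF-16 split."""
    if text.isascii():
        # 0x03 = UTF-8; plain ASCII is byte-identical in every
        # ASCII-compatible codec, so a wrong guess is harmless.
        return b"\x03" + text.encode("utf-8") + b"\x00"
    # 0x01 = UTF-16 with BOM; Python's "utf-16" codec writes the BOM,
    # which leaves an encoding guesser far less room for error.
    return b"\x01" + text.encode("utf-16") + b"\x00\x00"


print(encode_text_frame("Tourist History").hex(" "))
print(encode_text_frame("<|\u00b0_\u00b0|>").hex(" "))
```

The idea is that a string of ASCII bytes cannot be misread by any ASCII-compatible guess, while a BOM-prefixed UTF-16 string does not look like a single-byte codepage at all.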

@jurf

jurf commented Dec 10, 2017

I have the same problem, except it only happens for some files. E.g. Led Zeppelin’s D’yer Mak’er appears as Dâ€™yer Makâ€™er, but The Beatles’ I’ve Got a Feeling appears correctly. Is there any way I can force the same encoding, or whatever it is that is used there, everywhere in beets?

@nerone-github

nerone-github commented Dec 24, 2017

Hi, I've stumbled across this issue as well, and yes, we know it's Android's fault, but there is an easy way to fool it so that the tags are displayed correctly.

If you have a FLAC file whose tags display incorrectly, just add a Russian (Cyrillic) UTF-8 character instead of using only ANSI (Latin) letters. The result is that the tag is recognized as UTF-8.

Suggestion: don't replace the first letter, so that alphabetical order is preserved.

You can replace the following letters, which look the same in Cyrillic and Latin but will cause UTF-8 recognition:

КОМЕТА ВНРСХ оеа рсх (You can use copy+paste here, these are the cyrillic variants)

The replaced letter can be anywhere within the tags e.g. artist, album, title etc. and it will recognize the file as UTF8
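
A rough sketch of that trick: swap the first eligible letter after position 0 for its Cyrillic look-alike. The mapping covers the letters listed above; the function name is hypothetical, and whether this actually changes Android's guess is only as reported in this comment. Note that it alters the stored text, so exact-match searches on the tag may no longer find it.

```python
# Latin letters and their visually identical Cyrillic counterparts.
LATIN_TO_CYRILLIC = {
    "A": "\u0410", "B": "\u0412", "C": "\u0421", "E": "\u0415",
    "H": "\u041d", "K": "\u041a", "M": "\u041c", "O": "\u041e",
    "P": "\u0420", "T": "\u0422", "X": "\u0425",
    "a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "x": "\u0445",
}


def add_cyrillic_lookalike(text: str) -> str:
    """Replace one letter (never the first) with a Cyrillic homoglyph."""
    for i, ch in enumerate(text):
        if i == 0:
            continue  # keep the first letter so sorting is unaffected
        if ch in LATIN_TO_CYRILLIC:
            return text[:i] + LATIN_TO_CYRILLIC[ch] + text[i + 1:]
    return text  # nothing eligible; return the text unchanged


print(add_cyrillic_lookalike("Tourist History"))
```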

@indivisible

Might be a slightly different Android bug, but I've had similar issues with the Android media scanner. (Note: since the default scanner is broken, different vendors might have slightly different "improved" versions with different bugs.)

I got things working by only using ID3 v2.4 tags, UTF-8 encoding and no extended headers.

The extended header thing makes things really confusing: if the tag has one, Android will fail to read any info from the tag and fall back to the v1 tag, most likely failing to decode any fancy characters.
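
For anyone who wants to check their own files, here is a minimal sketch that inspects the 10-byte ID3v2 header for the version and the extended-header flag. It is not the script mentioned in this comment; `inspect_id3v2_header` and `read_id3v2_header` are hypothetical names, and the sketch does not look at the text encoding of individual frames.

```python
def inspect_id3v2_header(header: bytes):
    """Return (major_version, has_extended_header), or None without a tag.

    Layout of the ID3v2 header: b"ID3", major version, revision, flags,
    then a 4-byte synchsafe size. Bit 6 (0x40) of the flags byte marks an
    extended header in v2.3 and v2.4; in v2.2 that bit means something
    else, so it is ignored there.
    """
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    major, flags = header[3], header[5]
    return major, major >= 3 and bool(flags & 0x40)


def read_id3v2_header(path: str):
    """Read the first 10 bytes of a file and inspect them."""
    with open(path, "rb") as handle:
        return inspect_id3v2_header(handle.read(10))
```

A result of `(4, False)` corresponds to the combination described above: an ID3 v2.4 tag without an extended header.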

I've created a script that I use to make my files "android-safe" (be sure to keep backups!)

@lazka
Contributor

lazka commented Mar 28, 2019

mutagen doesn't write extended headers

@jurf

jurf commented May 6, 2019

@lazka: but I guess they could already be present.

@lazka
Contributor

lazka commented May 6, 2019

> @lazka: but I guess they could already be present.

mutagen replaces the header always

@jurf

jurf commented May 6, 2019

Why is this happening then?

@lazka
Contributor

lazka commented May 6, 2019

@indivisible ^?

@tnyeanderson

See this issue. Vanilla seems to be the only open source android music player that has gotten around this bug. To do it, they had to build their own database instead of using the mediastore. There is nothing beets can do to fix this, EXCEPT if someone makes a plugin to find and replace characters in ID3 tags. Something like this in config.yaml:

plugins: replacer

replacer:
  fields: all
  replace:
    - ’:'
    - °:*

Should have the option to only replace certain fields (song title, etc)? Or maybe not. This could impact performance when using the autotagger as it will have to check every field. And even then it's not the best solution. The best solution would be for Android to stop playing guessing games and use the encoding that is set by the app.... good luck with that :)
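
The core of such a find-and-replace step could look roughly like the sketch below. It is plain Python and deliberately does not use the beets plugin API; `REPLACEMENTS`, `replace_chars` and `replace_fields` are hypothetical names, and the table mirrors the example configuration above.

```python
# Characters to replace, mirroring the example configuration above.
REPLACEMENTS = {"\u2019": "'", "\u00b0": "*"}


def replace_chars(value: str, table=None) -> str:
    """Replace every character found in the table with its substitute."""
    table = REPLACEMENTS if table is None else table
    return value.translate(str.maketrans(table))


def replace_fields(tags: dict, fields=None, table=None) -> dict:
    """Return a copy of `tags` with replacements applied to chosen fields.

    `fields=None` means all fields. Non-string values are left untouched.
    """
    result = {}
    for name, value in tags.items():
        selected = fields is None or name in fields
        if selected and isinstance(value, str):
            result[name] = replace_chars(value, table)
        else:
            result[name] = value
    return result
```

For example, `replace_fields({"title": "It\u2019s Good"}, fields=["title"])` returns `{"title": "It's Good"}`. Restricting `fields` keeps the work proportional to the tags that actually matter for display.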

@lazka
Contributor

lazka commented Jun 18, 2019

> There is nothing beets can do to fix this

#1893 (comment) should work

@tnyeanderson

Reading the Google issue, it doesn't look like UTF-16 will be a panacea. And according to this, support for UTF-16 may not be widespread enough to justify such a change.

I am by no means an expert on encoding. Can someone please tell me I'm wrong? :)

@tnyeanderson

https://stackoverflow.com/a/48270759/5057843

This contradicts info I have read elsewhere (saying UTF-16 DOES fix the issue). Not sure who to believe...

@stale

stale bot commented Jul 11, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jul 11, 2020
@Karcsii

Karcsii commented Jul 12, 2020

> Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
>
> This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

The issue is still relevant and unresolved. We are waiting either for somebody who knows what causes the problem and can suggest a way to avoid it, or for Google to fix it.

@stale stale bot removed the stale label Jul 12, 2020
@jtpavlock jtpavlock added feature features we would like to implement and removed needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." labels Jul 12, 2020
@sampsyo
Member

sampsyo commented Jul 12, 2020

Hello! For @jtpavlock, I'm not actually sure this is an issue we should keep open as tagged "feature," even though it's still clearly affecting people—because it's not yet clear whether there is anything we (on the beets side) can do about it. One criterion I like to use when transitioning from "needinfo" to a more specific tag is that we have enough information that the issue is now actionable: that is, someone with the time and energy can plausibly do something about it. For now, I think this issue still needs more information before anyone can actually fix it.

@jtpavlock
Contributor

@sampsyo makes sense, sorry about that. Since this one seems like an oddball in that it may be in extended limbo, I was just trying to think what should be done, if anything, to prevent the repeated stale-bot messages.

@lazka
Contributor

lazka commented Jul 12, 2020

I did a quick test with <|°_°|> and utf8/utf16/utf16be and Android guesses in all cases and always wrong, so ignore my suggestion (to use utf16) from before.

@jtpavlock jtpavlock added needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." and removed feature features we would like to implement labels Jul 12, 2020
@sampsyo
Member

sampsyo commented Jul 12, 2020

No worries, @jtpavlock, and thanks for taking a look, @lazka!

@stale

stale bot commented Sep 10, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 10, 2020
@tnyeanderson

The AOSP bug that causes this was first reported in 2009. In 2014, the root cause and a fix for it were suggested.

8 days ago (09.04.2020) this issue was closed as wontfix.

Apparently some manufacturers have implemented a fix in their own distros, but this is manufacturer/device dependent.

I can't tell if the issue is still relevant on my end, but it might be for some users. Not sure how to move forward...

@stale stale bot removed the stale label Sep 12, 2020
@jackwilsdon
Member Author

I'll see if I can still reproduce this issue if I get a second - it seems like it was just closed as it's an old issue, we're free to raise a new one if it's still a problem.

@stale

stale bot commented Nov 12, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 12, 2020
@jackwilsdon
Member Author

jackwilsdon commented Nov 16, 2020

Looks like this issue is still present with Android's MediaStore API:

[screenshot of corrupt text]

This is using Simple Music Player, which seems to just pull the text straight from the API.

@stale stale bot removed the stale label Nov 16, 2020
@tnyeanderson

Confirming this is still relevant. Not sure on the progress of the old android bugs, but the results are the same in the app--wrong character displays.

@tnyeanderson

Any update on this? If not, in the meantime is there any way to run a find/replace for all instances of a specific character or string for a given tag in the whole library? The alternative is using a separate tool to retag with the more compatible characters, then reimport all tracks to beets as-is (so beets has up to date info in its db).... then continue to check/retag/reimport when there's any problems with future imports.

I have a LOT of issues with a certain character (right single quotation mark). I can find all instances of the character using beet list $(printf "\xE2\x80\x99") but it seems like writing a beet modify script to replace them with a regular apostrophe is a dangerous game due to how beets outputs information to the terminal.

If someone has a solution or workaround, please let me know!

@stale

stale bot commented Mar 6, 2021

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 6, 2021
@stale stale bot closed this as completed Mar 14, 2021
@tnyeanderson

Still relevant

@kytta

kytta commented Mar 27, 2022

Chiming in to say I have the exact same problem.

It wouldn't be as bad if everything were treated as Windows-1252, but here some songs are and some aren't, which splits one album into multiple albums in my media library.

As an example, Oasis' album (What’s the Story) Morning Glory?

  • The song She’s Electric is being read correctly, and the album's name is being displayed as (What’s the Story) Morning Glory?
  • All other songs have their album read as (Whatâ€™s the Story) Morning Glory?; the fourth song is then displayed as Donâ€™t Look Back in Anger

Such a shame that Google says it's an "obsolete wontfix" when clearly lots of people have this issue to this day. Yet I am not sure we need to keep this issue open, as it's not beets' fault.

EDIT: the players I've tried were Auxio, Spotify, and YouTube Music (the last two obviously set to "local files" mode)

@makawity

Although this is not a problem in Beets, there is a plugin that seems to be able to fix it in Beets library if you want to go that route:
https://github.com/edgars-supe/beets-importreplace

13 participants