beets crashing with an utf-8 decoding error #2168

Closed
Armael opened this Issue Aug 13, 2016 · 5 comments

Armael commented Aug 13, 2016

Problem

When importing my music library (beet import mymusic/), beets crashed with the following traceback:

Traceback (most recent call last):
  File "/usr/bin/beet", line 9, in <module>
    load_entry_point('beets==1.3.19', 'console_scripts', 'beet')()
  File "/usr/lib/python2.7/site-packages/beets/ui/__init__.py", line 1266, in main
    _raw_main(args)
  File "/usr/lib/python2.7/site-packages/beets/ui/__init__.py", line 1253, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/usr/lib/python2.7/site-packages/beets/ui/commands.py", line 967, in import_func
    import_files(lib, paths, query)
  File "/usr/lib/python2.7/site-packages/beets/ui/commands.py", line 944, in import_files
    session.run()
  File "/usr/lib/python2.7/site-packages/beets/importer.py", line 320, in run
    pl.run_parallel(QUEUE_SIZE)
  File "/usr/lib/python2.7/site-packages/beets/util/pipeline.py", line 251, in run
    msg = next(self.coro)
  File "/usr/lib/python2.7/site-packages/beets/importer.py", line 1202, in read_tasks
    for t in task_factory.tasks():
  File "/usr/lib/python2.7/site-packages/beets/importer.py", line 1038, in tasks
    for dirs, paths in self.paths():
  File "/usr/lib/python2.7/site-packages/beets/importer.py", line 1090, in paths
    for dirs, paths in albums_in_dir(self.toppath):
  File "/usr/lib/python2.7/site-packages/beets/importer.py", line 1480, in albums_in_dir
    logger=log):
  File "/usr/lib/python2.7/site-packages/beets/util/__init__.py", line 205, in sorted_walk
    for res in sorted_walk(cur, ignore, ignore_hidden, logger):
  File "/usr/lib/python2.7/site-packages/beets/util/__init__.py", line 190, in sorted_walk
    if (ignore_hidden and not hidden.is_hidden(cur)) or not ignore_hidden:
  File "/usr/lib/python2.7/site-packages/beets/util/hidden.py", line 78, in is_hidden
    path = path.decode('utf-8')
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 76: invalid start byte

Setup

  • OS: Archlinux
  • Python version: 2.7.12
  • beets version: Archlinux package is 1.3.19, but beet --version says 1.3.18
  • locales:
$ locale
LANG=fr_FR.UTF8
LC_CTYPE="fr_FR.UTF8"
LC_NUMERIC="fr_FR.UTF8"
LC_TIME="fr_FR.UTF8"
LC_COLLATE="fr_FR.UTF8"
LC_MONETARY="fr_FR.UTF8"
LC_MESSAGES="fr_FR.UTF8"
LC_PAPER="fr_FR.UTF8"
LC_NAME="fr_FR.UTF8"
LC_ADDRESS="fr_FR.UTF8"
LC_TELEPHONE="fr_FR.UTF8"
LC_MEASUREMENT="fr_FR.UTF8"
LC_IDENTIFICATION="fr_FR.UTF8"
LC_ALL=

My configuration (output of beet config) is:

lastgenre:
    count: 5
    force: no
    source: album
    min_weight: 10
    auto: yes
    whitelist: yes
    separator: ', '
    fallback:
    canonical: no
embedart:
    ifempty: yes
    compare_threshold: 0
    auto: yes
    remove_art_file: no
    maxwidth: 0
fetchart:
    enforce_ratio: 5%
    auto: yes
    minwidth: 0
    sources:
    - filesystem
    - coverart
    - itunes
    - amazon
    - albumart
    google_engine: 001442825323518660753:hrh5ch1gjzm
    cautious: no
    maxwidth: 0
    store_source: no
    google_key: REDACTED
    fanarttv_key: REDACTED
    cover_names:
    - cover
    - front
    - art
    - album
    - folder
directory: /pool/musique

import:
    incremental: yes
    log: /pool/beets_import.log

plugins: fromfilename embedart fetchart lastgenre thumbnails
library: /pool/musiclibrary.blb
thumbnails:
    auto: yes
    dolphin: no
    force: no
Contributor

jrobeson commented Aug 13, 2016

@sampsyo: this is one of the issues I uncovered in #2158

Contributor

jrobeson commented Aug 13, 2016

@sampsyo: ignore that, since there's truly a UTF-8 locale here.

Armael commented Aug 13, 2016

It appears that the files on which beets is crashing just happen to have non-utf8 junk in their filenames (so I guess they can be considered as broken):

$ ls -lah /tmp/macaron/Liquid\ Tension\ Experiment\ 2\ 1999/
total 1018M
drwxr-x--- 1 armael 2006 4,0K 30 mai    2013 .
drwxr-x--- 1 armael 2006  64K 13 août  00:26 ..
-rw-r----- 1 armael 2006  51M 29 oct.   2012 '01. Liquid Tension Experiment '$'\226'' Acid Rain.flac'
-rw-r----- 1 armael 2006  51M 30 mai    2013 '01. Liquid Tension Experiment – Acid Rain.flac'
-rw-r----- 1 armael 2006  56M 29 oct.   2012 '02. Liquid Tension Experiment '$'\226'' Biaxident.flac'
-rw-r----- 1 armael 2006  56M 30 mai    2013 '02. Liquid Tension Experiment – Biaxident.flac'
-rw-r----- 1 armael 2006  32M 29 oct.   2012 '03. Liquid Tension Experiment '$'\226'' 914.flac'
-rw-r----- 1 armael 2006  32M 30 mai    2013 '03. Liquid Tension Experiment – 914.flac'
-rw-r----- 1 armael 2006  72M 29 oct.   2012 '04. Liquid Tension Experiment '$'\226'' Another Dimension.flac'
-rw-r----- 1 armael 2006  72M 30 mai    2013 '04. Liquid Tension Experiment – Another Dimension.flac'
-rw-r----- 1 armael 2006 126M 29 oct.   2012 '05. Liquid Tension Experiment '$'\226'' When The Water Breaks.flac'
-rw-r----- 1 armael 2006 126M 30 mai    2013 '05. Liquid Tension Experiment – When The Water Breaks.flac'
-rw-r----- 1 armael 2006  89M 29 oct.   2012 '06. Liquid Tension Experiment '$'\226'' Chewbacca.flac'
-rw-r----- 1 armael 2006  89M 30 mai    2013 '06. Liquid Tension Experiment – Chewbacca.flac'
-rw-r----- 1 armael 2006  64M 29 oct.   2012 '07. Liquid Tension Experiment '$'\226'' Liquid Dreams.flac'
-rw-r----- 1 armael 2006  64M 30 mai    2013 '07. Liquid Tension Experiment – Liquid Dreams.flac'
-rw-r----- 1 armael 2006  22M 29 oct.   2012 '08. Liquid Tension Experiment '$'\226'' Hourglass.flac'
-rw-r----- 1 armael 2006  22M 30 mai    2013 '08. Liquid Tension Experiment – Hourglass.flac'
-rw-r----- 1 armael 2006  77K 30 mai    2013 back.jpeg
-rw-r----- 1 armael 2006  67K 30 mai    2013 front.jpeg
-rw-r----- 1 armael 2006 1,6K 30 mai    2013 'Liquid Tension Experiment - Liquid Tension Experiment 2.cue'
-rw-r----- 1 armael 2006 4,3K 30 mai    2013 'Liquid Tension Experiment - Liquid Tension Experiment 2.log'
-rw-r----- 1 armael 2006  402 30 mai    2013 'Liquid Tension Experiment - Liquid Tension Experiment 2.m3u'
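
A note on the listing above: the `$'\226'` shown by `ls` is the octal escape for byte 0x96, the same byte named in the traceback. The following is a minimal Python 3 sketch (the report itself was on Python 2.7) showing that a strict UTF-8 decode rejects such a name. The helper `try_utf8` is hypothetical, and the Windows-1252 interpretation is only an assumption about where these filenames came from; nothing in the thread confirms it.

```python
# Python 3 sketch. One of the reported filenames, as raw bytes.
name = b"01. Liquid Tension Experiment \x96 Acid Rain.flac"

# ls prints the byte in octal; the traceback prints it in hex.
same_byte = (0o226 == 0x96)


def try_utf8(raw):
    """Hypothetical helper: return decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# 0x96 is a continuation byte, so it cannot start a UTF-8 sequence.
utf8_result = try_utf8(name)

# Assumption: the name was originally written in Windows-1252,
# where 0x96 is an en dash (U+2013).
cp1252_result = name.decode("cp1252")
```

Under that assumption, the old files and the correctly named copies next to them would be the same tracks, differing only in how the dash was encoded.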

sampsyo added the bug label Aug 13, 2016

Member

sampsyo commented Aug 13, 2016

Thanks for the report; this is definitely a bug.

Here are the technical details: we should never do path.decode('utf8'), or any other encoding, with default error handling. This line of code should have been a red flag. It's the sort of thing that a future abstraction for paths should prevent altogether.

I'll attempt a fix for this now, but it will be useful to have confirmation from our build bots…
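
To make the point about default error handling concrete, here is a small Python 3 sketch comparing the strict default with two explicit error handlers. It is an illustration only, not the change that went into beets; the sample bytes are made up from the reported filenames, and `surrogateescape` is available on Python 3 only.

```python
raw = b"Experiment \x96 Acid Rain"

# Default (strict) error handling raises on the first invalid byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' never raises, but it is lossy: the bad byte becomes U+FFFD.
display = raw.decode("utf-8", errors="replace")

# 'surrogateescape' never raises and keeps the original byte recoverable.
escaped = raw.decode("utf-8", errors="surrogateescape")
round_tripped = escaped.encode("utf-8", errors="surrogateescape")
```

The lossy form is fine for log messages and display, but it should not be used to build a path that is later handed back to the filesystem, because the original bytes can no longer be recovered from it.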

sampsyo closed this in 162bf6a Aug 13, 2016

Member

sampsyo commented Aug 13, 2016

Fixed (probably) by treating paths correctly "in the beets way": the internal representation is bytes, and we convert with syspath when invoking OS APIs.
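
The idea of keeping paths as bytes and converting only at the OS boundary can be sketched as follows. This is a Python 3 illustration under stated assumptions, not the code from the fix: `is_dotfile` and `to_os_path` are hypothetical names, `to_os_path` merely stands in for a boundary conversion and is not beets's `syspath`, and the round trip shown relies on POSIX behaviour of `os.fsdecode`.

```python
import os


def is_dotfile(path):
    """Hypothetical helper: hidden-file check on a bytes path, with no decoding."""
    return os.path.basename(path).startswith(b".")


def to_os_path(path):
    """Hypothetical boundary conversion (a stand-in, not beets's syspath).

    On POSIX, os.fsdecode uses the 'surrogateescape' error handler, so
    undecodable bytes survive the conversion instead of raising.
    """
    return os.fsdecode(path)


bad = b"/music/01. Liquid Tension Experiment \x96 Acid Rain.flac"

# The check works on the raw bytes, so the invalid byte is never decoded.
hidden = is_dotfile(bad)

# Converting for an OS call and back yields the original bytes again.
back = os.fsencode(to_os_path(bad))
```

Because the hidden-file test only needs to look at the first byte of the basename, it never has to interpret the rest of the name, which is what made the original `path.decode('utf-8')` call unnecessary as well as unsafe.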
