Beets is crashing due to ascii encoding issues #2393

Open
fschoell opened this Issue Jan 16, 2017 · 13 comments

fschoell commented Jan 16, 2017

Problem

beets crashes with an ASCII encoding error when it tries to display the MusicBrainz search result.

Running this command in verbose (-vv) mode:

$ beet -vv import System\ of\ a\ Down\ -\ Hypnotize/

Led to this problem:

user configuration: /home/user/.config/beets/config.yaml
data directory: /home/user/.config/beets
plugin paths: 
Sending event: pluginload
library database: /home/user/.config/beets/library.db
library directory: /home/user/Music
Sending event: library_opened
Sending event: import_begin
Sending event: import_task_created
Sending event: import_task_start
Looking up: /some/folder/System of a Down - Hypnotize
Tagging System of a Down - Hypnotize
No album ID found.
Search terms: System of a Down - Hypnotize
Album might be VA: False
Sending event: albuminfo_received
Candidate: System of a Down - Hypnotize
Success. Distance: 0.01
Sending event: albuminfo_received
Candidate: System of a Down - Hypnotize
Success. Distance: 0.00
Sending event: albuminfo_received
Candidate: System of a Down - Hypnotize
Success. Distance: 0.01
Sending event: albuminfo_received
Candidate: System of a Down - Hypnotize
Success. Distance: 0.01
Sending event: albuminfo_received
Candidate: System of a Down - Hypnotize
Success. Distance: 0.01
Sending event: albuminfo_received
Candidate: System Of A Down - Hypnotize
Success. Distance: 0.04
Sending event: albuminfo_received
Candidate: System Of A Down - Hypnotize
Success. Distance: 0.04
Sending event: albuminfo_received
Candidate: System Of A Down - Hypnotize
Success. Distance: 0.04
Sending event: albuminfo_received
Candidate: System Of A Down - Hypnotize
Success. Distance: 0.04
Sending event: albuminfo_received
Candidate: System Of A Down - Hypnotize
Success. Distance: 0.04
Evaluating 10 candidates.

/some/folder/System of a Down - Hypnotize (12 items)
Sending event: before_choose_candidate
Tagging:
    System of a Down - Hypnotize
URL:
    https://musicbrainz.org/release/8a4034a9-7834-3b7e-a6f0-d0791e3731fb
(Similarity: 100.0%) (Vinyl, 2005, US)
Traceback (most recent call last):
  File "/usr/bin/beet", line 11, in <module>
    load_entry_point('beets==1.4.3', 'console_scripts', 'beet')()
  File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 1209, in main
    _raw_main(args)
  File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 1196, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 930, in import_func
    import_files(lib, paths, query)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 907, in import_files
    session.run()
  File "/usr/lib/python3.6/site-packages/beets/importer.py", line 319, in run
    pl.run_parallel(QUEUE_SIZE)
  File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 445, in run_parallel
    six.reraise(exc_info[0], exc_info[1], exc_info[2])
  File "/usr/lib/python3.6/site-packages/six.py", line 686, in reraise
    raise value
  File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 312, in run
    out = self.coro.send(msg)
  File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 171, in coro
    task = func(*(args + (task,)))
  File "/usr/lib/python3.6/site-packages/beets/importer.py", line 1282, in user_query
    task.choose_match(session)
  File "/usr/lib/python3.6/site-packages/beets/importer.py", line 777, in choose_match
    choice = session.choose_match(self)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 698, in choose_match
    itemcount=len(task.items), choices=choices
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 601, in choose_candidate
    show_change(cur_artist, cur_album, match)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 365, in show_change
    print_(u'%s%s -> %s' % (lhs, ' ' * pad, rhs))
  File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 143, in print_
    sys.stdout.write(txt)
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 61: ordinal not in range(128)

Setup

  • OS: Arch Linux
  • Python version: 3.6.0
  • beets version: 1.4.3
  • plugins: embedart fetchart lastgenre convert

greenisagoodcolor commented Jan 16, 2017

Hello! I have a similar issue:

/home/msm/untagged music/flac_clk/Ambient/Rod Modell, Michael Mantra - 1998-2007/Rod Modell - 2007 - Plays Michael Mantra (2 items)
Tagging:
Rod Modell - Plays Michael Mantra
URL:
https://musicbrainz.org/release/9463c2d6-5361-43a8-9e93-01a49009f5b0
(Similarity: 100.0%) (CD, 2007, IT, Silentes)

/home/msm/untagged music/flac_clk/Ambient/Thomas Koner 1990-1996/Thomas Koner - 1990 - Nunatak Gongamur (11 items)
Traceback (most recent call last):
  File "/usr/bin/beet", line 11, in <module>
    load_entry_point('beets==1.4.3', 'console_scripts', 'beet')()
  File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 1209, in main
    _raw_main(args)
  File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 1196, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 930, in import_func
    import_files(lib, paths, query)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 907, in import_files
    session.run()
  File "/usr/lib/python3.6/site-packages/beets/importer.py", line 319, in run
    pl.run_parallel(QUEUE_SIZE)
  File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 445, in run_parallel
    six.reraise(exc_info[0], exc_info[1], exc_info[2])
  File "/usr/lib/python3.6/site-packages/six.py", line 686, in reraise
    raise value
  File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 312, in run
    out = self.coro.send(msg)
  File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 171, in coro
    task = func(*(args + (task,)))
  File "/usr/lib/python3.6/site-packages/beets/importer.py", line 1282, in user_query
    task.choose_match(session)
  File "/usr/lib/python3.6/site-packages/beets/importer.py", line 777, in choose_match
    choice = session.choose_match(self)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 698, in choose_match
    itemcount=len(task.items), choices=choices
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 601, in choose_candidate
    show_change(cur_artist, cur_album, match)
  File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 263, in show_change
    print_(u"Tagging:\n    {0.artist} - {0.album}".format(match.info))
  File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 143, in print_
    sys.stdout.write(txt)
UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 21: ordinal not in range(128)

Contributor

jrobeson commented Jan 17, 2017

what's the output of the locale command?

fschoell commented Jan 17, 2017

Gave me an error message, fixed my locales and now the error is gone =)

sampsyo added the needinfo label Jan 17, 2017

Member

sampsyo commented Jan 17, 2017

It looks like both of these are instances of The Fundamental Problem With Python 3: the user's locale configuration can make it impossible to print some strings to stdout.

@jrobeson already knows all this, but as a little background, the problem is that sys.stdout on Python 3 is a Unicode stream, so you're supposed to print Unicode strings (i.e., str objects) to it. That makes perfect sense, but Python 3 is also set up to automatically encode those str objects to bytes to send them to the OS—and that encoding can crash depending on the locale Python has gleaned from the OS settings.

On Python 2, we used to be able to bypass this either by avoiding unencodeable characters (i.e., using the replace policy) or by ignoring the encoding Python guessed and just using UTF-8 (we have an encoding override setting). On Python 3, we no longer have that option because we can't send encoded bytes directly to the OS. We're stuck waiting for Python itself to change its defaults.

Anyway, @greenisagoodcolor, seeing that locale information would still be useful.
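
For anyone who wants to see the failure mode in isolation, here is a minimal sketch that does not touch the real stdout. It assumes an in-memory text stream behaves like sys.stdout with respect to encoding; the helper name encode_via_text_stream and the sample string are made up for illustration and are not part of beets.

```python
import io


def encode_via_text_stream(text, encoding, errors="strict"):
    """Write text through a text stream (as sys.stdout does) and return the bytes produced."""
    raw = io.BytesIO()
    # sys.stdout on Python 3 is a TextIOWrapper like this one; its encoding
    # normally comes from the locale instead of being passed explicitly.
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Illustrative sample containing '\xf6', the character from the second traceback.
artist = "Thomas K\xf6ner"

# A UTF-8 stream encodes the string without complaint.
utf8_bytes = encode_via_text_stream(artist, "utf-8")

# An ASCII stream (what a C locale gives you) raises in strict mode.
try:
    encode_via_text_stream(artist, "ascii")
    crashed = False
except UnicodeEncodeError:
    crashed = True
```

Passing errors="replace" to the same helper yields a question mark in place of the unencodable character instead of an exception.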

Member

sampsyo commented Jan 17, 2017

One other thought for @jrobeson: should we consider using sys.stdout.buffer.write on Python 3 to restore the old behavior (i.e., _out_encoding is respected and bad characters are replaced)?
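
Roughly, the idea could look like the sketch below. This is only an illustration under assumptions, not the code in beets: print_replacing is a hypothetical name, and the encoding argument stands in for whatever the encoding override setting resolves to.

```python
import io
import sys


def print_replacing(text, encoding, stream=None):
    """Encode text ourselves and write bytes, replacing unencodable characters."""
    # Default to the binary layer underneath sys.stdout (Python 3 only).
    out = stream if stream is not None else sys.stdout.buffer
    # errors="replace" substitutes "?" for anything the codec cannot represent,
    # so the write can no longer raise UnicodeEncodeError.
    out.write(text.encode(encoding, errors="replace") + b"\n")
    out.flush()


# Demonstration against an in-memory byte stream instead of the real stdout.
demo = io.BytesIO()
print_replacing("Don\u2019t", "ascii", stream=demo)
print_replacing("Don\u2019t", "utf-8", stream=demo)
demo_output = demo.getvalue()
```

The trade-off is that output degrades to question marks on an ASCII-only terminal instead of crashing, which matches the old Python 2 behavior described above.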

Contributor

jrobeson commented Jan 17, 2017

We certainly could do that, but I was hoping to first see how prevalent non-UTF-8 setups are on *nix, and in what cases we'd run into them.

Member

sampsyo commented Jan 17, 2017

Good point. It looks like both of these cases are the most predictable scenario: broken locale settings that no one noticed were broken until this crash happened.

Contributor

jrobeson commented Jan 17, 2017

Yeah, but broken how? I'm very surprised to see non-UTF-8 setups on Arch. I'd like to know the reasoning behind them.

greenisagoodcolor commented Jan 18, 2017

@jrobeson output of locale command:

[msm@msm-x220 ~]$ locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=

Contributor

jrobeson commented Jan 18, 2017

@greenisagoodcolor: you probably shouldn't be using the C locale on a modern system. Any Python program that uses the click CLI library (not us yet) will warn you to switch to something else. I'd suggest a UTF-8 locale, i.e. one whose name ends in .UTF-8 or .utf8, such as en_US.UTF-8 or de_DE.UTF-8.

With the C locale, any character that isn't in the ASCII set will either be missing or be replaced with a question mark or some other similar character.

Please don't close the issue, though: we still want some sort of mitigation measure in place.
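
If you want to check what Python itself makes of your locale, a small diagnostic along these lines may help. It is a sketch with hypothetical helper names (can_encode, report); the sample characters are the two from the tracebacks in this thread.

```python
import locale
import sys


def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def report(sample="\u2019\xf6"):
    """Summarize the encodings Python picked up and whether stdout can print sample."""
    # Fall back to ASCII if stdout has been replaced by an object without an encoding.
    stdout_encoding = getattr(sys.stdout, "encoding", None) or "ascii"
    return {
        "stdout_encoding": stdout_encoding,
        "preferred_encoding": locale.getpreferredencoding(False),
        "stdout_can_encode_sample": can_encode(sample, stdout_encoding),
    }


if __name__ == "__main__":
    for key, value in report().items():
        print("{}: {}".format(key, value))
```

If stdout_can_encode_sample comes back False, printing those characters in strict mode will fail the same way as in the tracebacks above.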

sampsyo added a commit that referenced this issue Jan 19, 2017


Member

sampsyo commented Jan 22, 2017

@greenisagoodcolor Could you perhaps give #2398 a try? You should be able to check out the stdout-bytes branch from git. Let us know if that makes the problem go away.

hrehfeld commented Feb 27, 2017

Just to confirm, I run into the same issue with the same locale output as @greenisagoodcolor. It's an archlinuxarm box.

prg318 commented Jun 5, 2017

Potentially related: #2585
