Beets is crashing due to ascii encoding issues #2393
Comments
greenisagoodcolor
Jan 16, 2017
Hello! I have a similar bug issue:
/home/msm/untagged music/flac_clk/Ambient/Rod Modell, Michael Mantra - 1998-2007/Rod Modell - 2007 - Plays Michael Mantra (2 items)
Tagging:
Rod Modell - Plays Michael Mantra
URL:
https://musicbrainz.org/release/9463c2d6-5361-43a8-9e93-01a49009f5b0
(Similarity: 100.0%) (CD, 2007, IT, Silentes)
/home/msm/untagged music/flac_clk/Ambient/Thomas Koner 1990-1996/Thomas Koner - 1990 - Nunatak Gongamur (11 items)
Traceback (most recent call last):
File "/usr/bin/beet", line 11, in <module>
load_entry_point('beets==1.4.3', 'console_scripts', 'beet')()
File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 1209, in main
raw_main(args)
File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 1196, in raw_main
subcommand.func(lib, suboptions, subargs)
File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 930, in import_func
import_files(lib, paths, query)
File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 907, in import_files
session.run()
File "/usr/lib/python3.6/site-packages/beets/importer.py", line 319, in run
pl.run_parallel(QUEUE_SIZE)
File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 445, in run_parallel
six.reraise(exc_info[0], exc_info[1], exc_info[2])
File "/usr/lib/python3.6/site-packages/six.py", line 686, in reraise
raise value
File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 312, in run
out = self.coro.send(msg)
File "/usr/lib/python3.6/site-packages/beets/util/pipeline.py", line 171, in coro
task = func(*(args + (task,)))
File "/usr/lib/python3.6/site-packages/beets/importer.py", line 1282, in user_query
task.choose_match(session)
File "/usr/lib/python3.6/site-packages/beets/importer.py", line 777, in choose_match
choice = session.choose_match(self)
File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 698, in choose_match
itemcount=len(task.items), choices=choices
File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 601, in choose_candidate
show_change(cur_artist, cur_album, match)
File "/usr/lib/python3.6/site-packages/beets/ui/commands.py", line 263, in show_change
print(u"Tagging:\n {0.artist} - {0.album}".format(match.info))
File "/usr/lib/python3.6/site-packages/beets/ui/__init__.py", line 143, in print_
sys.stdout.write(txt)
UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 21: ordinal not in range(128)
what's the output of the |
fschoell
Jan 17, 2017
Gave me an error message, fixed my locales and now the error is gone =)
sampsyo added the needinfo label Jan 17, 2017
sampsyo
Jan 17, 2017
Member
It looks like both of these are instances of The Fundamental Problem With Python 3: the user's locale configuration can make it impossible to print some strings to stdout.
@jrobeson already knows all this, but as a little background, the problem is that sys.stdout on Python 3 is a Unicode stream, so you're supposed to print Unicode strings (i.e., str objects) to it. That makes perfect sense, but Python 3 is also set up to automatically encode those str objects to bytes to send them to the OS—and that encoding can crash depending on the locale Python has gleaned from the OS settings.
On Python 2, we used to be able to bypass this either by avoiding unencodeable characters (i.e., using the replace policy) or by ignoring the encoding Python guessed and just using UTF-8 (we have an encoding override setting). On Python 3, we no longer have that option because we can't send encoded bytes directly to the OS. We're stuck waiting for Python itself to change its defaults.
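For readers who want to see the failure in isolation, here is a minimal sketch that reproduces it without touching the real sys.stdout. It assumes that a C/POSIX locale makes Python pick the ascii codec for stdout, which matches the traceback above; the helper name `write_text` and the sample title are made up for illustration and are not part of beets.

```python
import io


def write_text(text, encoding, errors="strict"):
    """Push text through a text stream that encodes to `encoding`.

    This mimics what sys.stdout does on Python 3: it accepts str
    objects and encodes them to bytes before handing them to the OS.
    (Hypothetical helper, for illustration only.)
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Illustrative string containing '\xf6', the character from the traceback.
title = "Thomas K\xf6ner - Nunatak Gongamur"

# With a UTF-8 stream (a UTF-8 locale), encoding succeeds.
print(write_text(title, "utf-8"))

# With an ASCII stream (assumed here to model LANG=C), it raises.
try:
    write_text(title, "ascii")
except UnicodeEncodeError as exc:
    print("crash:", exc)

# The "replace" error policy trades the crash for a placeholder character.
print(write_text(title, "ascii", errors="replace"))
```

The last call shows the replace policy mentioned above: the output loses the character but the program keeps running.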
Anyway, @greenisagoodcolor, seeing that locale information would still be useful.
sampsyo
Jan 17, 2017
Member
One other thought for @jrobeson: should we consider using sys.stdout.buffer.write on Python 3 to restore the old behavior (i.e., _out_encoding is respected and bad characters are replaced)?
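A rough sketch of what that idea could look like, not necessarily what beets ended up doing: encode the text ourselves with a replacing error policy and write the bytes to the stream's underlying buffer. The function name `write_encoded` and its `encoding` parameter (standing in for whatever encoding override is configured) are assumptions for illustration.

```python
import sys


def write_encoded(text, encoding="utf-8", stream=None):
    """Write text as bytes, bypassing the stream's own encoder.

    Characters that `encoding` cannot represent are replaced instead
    of raising UnicodeEncodeError. (Hypothetical helper.)
    """
    stream = sys.stdout if stream is None else stream
    data = text.encode(encoding, errors="replace")
    if hasattr(stream, "buffer"):
        # Flush pending text first so output stays in order.
        stream.flush()
        stream.buffer.write(data)
        stream.buffer.flush()
    else:
        # Some stdout replacements (e.g. io.StringIO) have no byte
        # buffer; fall back to writing the already-sanitized text.
        stream.write(data.decode(encoding))


if __name__ == "__main__":
    # Works even when sys.stdout itself is an ASCII-only text stream.
    write_encoded("Thomas K\xf6ner\n", encoding="utf-8")
```

The fallback branch matters because sys.stdout is sometimes swapped for an object with no buffer attribute, for example when output is being captured.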
jrobeson
Jan 17, 2017
Contributor
We certainly could do that, but I was hoping to see how prevalent the non utf8 situation was on *nix first, and in what cases we'd run into them.
sampsyo
Jan 17, 2017
Member
Good point. It looks like both of these cases are the most predictable scenario: broken locale settings that no one noticed were broken until this crash happened.
jrobeson
Jan 17, 2017
Contributor
Yeah, but broken how? I'm very surprised to see non-UTF-8 setups on Arch. I'd like to know the reasoning behind them.
greenisagoodcolor
Jan 18, 2017
@jrobeson output of locale command:
[msm@msm-x220 ~]$ locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=
jrobeson
Jan 18, 2017
Contributor
@greenisagoodcolor: you probably shouldn't be using the C locale on a modern system. Any Python program that uses the click CLI library (not us yet) will warn you to switch to something else. I'd suggest you use a UTF-8 locale, such as en_US.UTF-8 or de_DE.UTF-8.
With the C locale, any character that isn't in the ASCII set will either be missing or be replaced with a question mark or some other similar character.
Don't close the issue, though, as we still want to have some sort of mitigation measure in place.
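As a quick way to check what Python itself makes of the locale, something like the following sketch may help. It only reports the encodings Python has picked up; the helper names `is_utf8` and `describe_output_encoding` are invented for this example.

```python
import codecs
import locale
import sys


def is_utf8(encoding_name):
    """Return True if the name is an alias of the UTF-8 codec."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False


def describe_output_encoding():
    """Report the encodings Python derived from the environment."""
    preferred = locale.getpreferredencoding(False)
    # sys.stdout may be replaced by an object without an encoding.
    stdout_encoding = getattr(sys.stdout, "encoding", None) or preferred
    return {
        "preferred": preferred,
        "stdout": stdout_encoding,
        "utf8": is_utf8(stdout_encoding),
    }


if __name__ == "__main__":
    info = describe_output_encoding()
    print(info)
    if not info["utf8"]:
        print("stdout is not UTF-8; non-ASCII output may fail or be replaced")
```

Running it once under the current locale and once with a UTF-8 locale exported should make the difference visible.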
added a commit that referenced this issue Jan 19, 2017
added a commit that referenced this issue Jan 19, 2017
sampsyo referenced this issue Jan 19, 2017
Merged: On Python 3, use more flexible encoding to send bytes to stdout #2398
sampsyo
Jan 22, 2017
Member
@greenisagoodcolor Could you perhaps give #2398 a try? You should be able to check out the stdout-bytes branch from git. Let us know if that makes the problem go away.
hrehfeld
Feb 27, 2017
Just to confirm, I run into the same issue with the same locale print as @greenisagoodcolor . It's an archlinuxarm box.
prg318
Jun 5, 2017
potentially related #2585
fschoell commented Jan 16, 2017
Problem
beets is crashing due to some ascii encoding issues when it should display the musicbrainz search result
Running this command in verbose (-vv) mode:
Led to this problem:
Setup