unicode tags are not written to stdout properly with python 3 #232

Closed
lazka opened this issue Sep 6, 2015 · 10 comments
Labels
bug

Comments

@lazka
Member

@lazka lazka commented Sep 6, 2015

Originally reported by: Brad Lanam (Bitbucket: bll123, GitHub: bll123)


A title with unicode characters such as:

TIT2=O.2.23 unicodeははは

is not written to the output correctly.

However, it does work with (ActiveState) python 2.7.5.


@lazka

Member Author

@lazka lazka commented Sep 6, 2015

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


I can't reproduce.

  • Which Python version (3.4 on Windows?)
  • Got any example code or files?
  • How do you know it's not written correctly?

thanks

@lazka

Member Author

@lazka lazka commented Sep 6, 2015

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


Ah, never mind, I was missing the "stdout" part. I'll have a look.

@lazka

Member Author

@lazka lazka commented Sep 6, 2015

Original comment by Brad Lanam (Bitbucket: bll123, GitHub: bll123):


Python 3.4 on windows 7-64.

You'll probably have to 'exec' mutagen-inspect from python and inspect the
output within python as cmd.exe is a disaster. I'm exec'ing
mutagen-inspect from tcl in my case -- so identical code, only the version
of python changes.

@lazka

Member Author

@lazka lazka commented Sep 6, 2015

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


OK, thanks

@lazka

Member Author

@lazka lazka commented Sep 9, 2015

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


We currently encode using sys.stdout.encoding because Python's print() can't handle Unicode on Windows or surrogates on Linux. If we used the standard print(), it would just fail.
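
To illustrate why encoding to the console's codepage loses the title from the report, here is a minimal sketch. It is not mutagen's actual code: `encode_for_console` is a hypothetical helper, and cp437 is only assumed as an example of a limited Windows console codepage.

```python
def encode_for_console(text, encoding, errors="replace"):
    """Hypothetical helper: encode text for a stream with a given encoding."""
    return text.encode(encoding, errors)


title = "TIT2=O.2.23 unicodeははは"

# A strict encode to a limited codepage raises, which is roughly what a
# plain print() would run into on such a console.
try:
    title.encode("cp437")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With errors="replace" nothing raises, but the Japanese characters are lost.
lossy = encode_for_console(title, "cp437")

# UTF-8 can represent every character, so the text round-trips.
utf8 = encode_for_console(title, "utf-8")
```

With a lossy error handler the call no longer fails, but the reader of stdout only sees placeholder characters, which matches the symptom described in this issue.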

What is your use case?

If you want to use it in cmd.exe we could look into using something like https://pypi.python.org/pypi/win_unicode_console

If you want to get the output you can set PYTHONIOENCODING to utf-8 and it will write utf-8 encoded text to stdout instead of trying to encode to the limited console codepage.
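
For anyone capturing the output from another program, the PYTHONIOENCODING approach might look roughly like the sketch below. It runs a small inline Python snippet as the child process instead of mutagen-inspect so that it is self-contained; `run_with_utf8_stdout` is a hypothetical name, not part of mutagen.

```python
import os
import subprocess
import sys


def run_with_utf8_stdout(code):
    """Run a Python snippet in a child process with stdout forced to UTF-8."""
    # Copy the current environment and add PYTHONIOENCODING for the child only.
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    raw = subprocess.check_output([sys.executable, "-c", code], env=env)
    # The child wrote UTF-8 bytes, so decode them as UTF-8 here.
    return raw.decode("utf-8")


# The escapes are interpreted by the child, which keeps the command line ASCII.
text = run_with_utf8_stdout(r"print('unicode\u306f\u306f\u306f')")
```

The same idea should apply when the caller is not Python: set PYTHONIOENCODING in the environment of the child process and decode its output as UTF-8.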

@lazka

Member Author

@lazka lazka commented Sep 9, 2015

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


I'll try to make mutagen's print work with win_unicode_console.

@lazka

Member Author

@lazka lazka commented Sep 9, 2015

Original comment by Brad Lanam (Bitbucket: bll123, GitHub: bll123):


I just want the output in its proper unicode form. Not sure how else to
state that.
I do not use cmd.exe.
My code works on linux and windows with python 2.7.
What's different between python 2.7 and 3?

The tcl code is like:

#!tcl

    set oenc [encoding system]
    encoding system utf-8
    set data [exec python mutagen-inspect $fn]
    encoding system $oenc

So PYTHONIOENCODING is necessary for python 3? Would it affect python 2.7?

@lazka

Member Author

@lazka lazka commented Sep 9, 2015

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


tools: force utf-8 codepage on Windows. See #232

@lazka

Member Author

@lazka lazka commented Sep 9, 2015

Original comment by Brad Lanam (Bitbucket: bll123, GitHub: bll123):


Tested with PYTHONIOENCODING and it seems to work.
I need to do some more testing with both versions of python.

It's interesting, if you do: python mutagen-inspect > out.txt , then run notepad on out.txt to see if the output is any good, python 3.4 does not work, even with PYTHONIOENCODING set. Windows is strange.

@lazka

Member Author

@lazka lazka commented Sep 9, 2015

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


I've changed things so that mutagen will force utf-8 now and ignore the active code page.

That means chcp and SetConsoleOutputCP will be ignored, but at least mutagen will output unicode.

(PYTHONIOENCODING should no longer be needed)
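
As a rough idea of what "force utf-8 and ignore the active code page" can mean on Python 3, consider the sketch below. This is an assumption-laden illustration, not the actual change made in mutagen: `print_utf8` is a hypothetical function that bypasses the text layer and writes UTF-8 bytes to the underlying binary stream.

```python
import io
import sys


def print_utf8(text, stream=None):
    """Hypothetical sketch: write text as UTF-8 bytes, ignoring stdout's encoding."""
    if stream is None:
        # sys.stdout.buffer is the binary stream underneath the text wrapper.
        stream = sys.stdout.buffer
    stream.write((text + "\n").encode("utf-8"))
    stream.flush()


# Demonstration against an in-memory binary stream instead of the real stdout.
buf = io.BytesIO()
print_utf8("TIT2=O.2.23 unicodeははは", buf)
```

Because the bytes no longer depend on sys.stdout.encoding, a consumer reading the pipe can always decode them as UTF-8, regardless of the console codepage.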

@lazka lazka added minor bug labels Apr 7, 2016
@lazka lazka closed this Apr 7, 2016