Strange output with non-ascii string #201

Closed
tkamenoko opened this issue Jan 16, 2019 · 7 comments
Comments

@tkamenoko

Platform: Windows 10 64bit
Python: 3.7
Shell: Powershell 6.1.1

Test code:

# test_green.py
from unittest import TestCase


class OutputTests(TestCase):
    def test_success(self):
        self.assertEqual("lang: 日本語", "lang: 日本語")

    def test_fail(self):
        self.assertEqual("lang: 日本語", "lang: English")

Output by Green:

test_green
OutputTests
F test_fail
. test_success

Failure in test_green.OutputTests.test_fail
// Traceback here
AssertionError: 'lang: Ri Ben Yu ' != 'lang: English'
- lang: Ri Ben Yu
+ lang: English

Ran 2 tests in 0.740s

FAILED (failures=1, passes=1)

Output by builtin unittest:

F.

======================================================================
FAIL: test_fail (test_green.OutputTests)
----------------------------------------------------------------------
//Traceback here
AssertionError: 'lang: 日本語' != 'lang: English'
- lang: 日本語
+ lang: English

----------------------------------------------------------------------
Ran 2 tests in 0.002s

FAILED (failures=1)

The tests finish as expected, but Green's output is wrong: 日本語 is printed as "Ri Ben Yu ".

@CleanCut
Owner

I can't reproduce that on my Mac. 🤷‍♂️

What happens if you try setting an encoding with a line like this at the top of your file? (If you are not using utf8, then replace that with your actual encoding):

# -*- coding: utf8 -*-

@MinchinWeb
Contributor

Also, what is your codepage set to?

@tkamenoko
Author

@CleanCut Setting the encoding comment has no effect.

@MinchinWeb The default is 932 (Shift-JIS), but I get the same result with 65001 (UTF-8).
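As a rough check that the code page itself is not the limiting factor, the sketch below tests whether the sample string is representable in each encoding. It assumes Python's `cp932` and `utf-8` codecs correspond to code pages 932 and 65001; `can_encode` is a hypothetical helper written for this illustration, not part of Green.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


sample = "lang: 日本語"

# cp932 and utf-8 stand in for Windows code pages 932 and 65001 (an assumption).
for name in ("cp932", "utf-8", "ascii"):
    print(name, can_encode(sample, name))
```

Both code pages can represent the string, so a lossy result under either one points at something other than the console encoding.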

I think this is caused by the unidecode package.

green/green/output.py

Lines 145 to 148 in ce9383b

if self._ascii_only_output:
    # Windows doesn't actually want unicode, so we get
    # the closest ASCII equivalent
    text = text_type(unidecode(text))

To reproduce, run this code:

from unidecode import unidecode
# unidecode transliterates each character to its closest ASCII equivalent
print(repr(unidecode("日本語")))  # -> 'Ri Ben Yu '

@MinchinWeb
Contributor

I think you figured out what's going on, and I can fill in the why: unidecode was brought in because Unicode output was causing a number of hard-to-debug, hard-to-reproduce issues on Windows. With unidecode, green would at least output something rather than crash.

So green is working as it was designed to. But do we need a command-line option or environment variable to turn this off?
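A minimal sketch of what such an opt-out could look like, under stated assumptions: the environment variable name `EXAMPLE_DISABLE_TRANSLITERATION`, the function `prepare_output`, and the stand-in `to_ascii_fallback` are all hypothetical and are not Green's actual API. The stand-in replaces non-ASCII characters with `?` so the example runs without the unidecode package.

```python
import os


def to_ascii_fallback(text):
    """Hypothetical stand-in for unidecode: replace non-ASCII characters with '?'."""
    return text.encode("ascii", "replace").decode("ascii")


def prepare_output(text, ascii_only, transliterate=to_ascii_fallback, env=None):
    """Return text ready for printing.

    Transliterates only when ASCII-only output is requested AND the
    (hypothetical) opt-out environment variable is not set to "1".
    """
    env = os.environ if env is None else env
    disabled = env.get("EXAMPLE_DISABLE_TRANSLITERATION", "") == "1"
    if ascii_only and not disabled:
        return transliterate(text)
    return text


# Default behavior: non-ASCII text is made ASCII-safe.
print(prepare_output("lang: 日本語", ascii_only=True, env={}))

# With the opt-out set, the original text passes through unchanged.
print(prepare_output("lang: 日本語", ascii_only=True,
                     env={"EXAMPLE_DISABLE_TRANSLITERATION": "1"}))
```

Keeping the safe behavior as the default and making the pass-through opt-in would preserve the original goal of never crashing on Windows consoles, while letting users whose terminals handle Unicode see the same text that the built-in unittest prints.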

@CleanCut
Owner

I agree. This is expected, but not necessarily ideal.

If we added an option to disable this behavior, would anyone use it? @tkamenoko @MinchinWeb

@tkamenoko
Author

Yes, I think so. The current behavior is not documented and may lead to confusion. The output should match that of the built-in unittest.

@CleanCut
Owner

Fix is in 2.14.0, just released.
