Loading failure caused by a Chinese user name on Windows #428

Closed
dyangrun opened this issue Feb 3, 2020 · 10 comments
Labels
bug We've confirmed that this is an BUG welcome contribution

Comments

dyangrun commented Feb 3, 2020

Describe the bug
On Windows with a Chinese user name, running the test case fails with an error saying the .bc file could not be loaded.

Log/Screenshots

C:\Users\杨子锐>py I:\源码资源\pythonTest\taichiTest\test.py
[Release mode]
[T 02/03/20 14:54:22.687] [logging.cpp:taichi::Logger::Logger@68] Taichi core started. Thread ID = 9768
[Taichi version 0.4.2, cpu only, commit 832915b0]
[I 02/03/20 14:54:22.725] [memory_pool.cpp:taichi::Tlang::MemoryPool::MemoryPool@14] Memory pool created. Default buffer size per allocator = 1024 MB
[I 02/03/20 14:54:22.726] [taichi_llvm_context.cpp:taichi::Tlang::TaichiLLVMContext::TaichiLLVMContext@57] Creating llvm context for arch: x86_64
[I 02/03/20 14:54:22.778] [C:\Users\鏉ㄥ瓙閿怽AppData\Local\Programs\Python\Python38\lib\site-packages\taichi\lang\impl.py:materialize@124] Materializing layout...
[D 02/03/20 14:54:22.779] [snode.cpp:taichi::Tlang::SNode::create_node@48] Non-power-of-two node size 640 promoted to 1024.
[D 02/03/20 14:54:22.780] [snode.cpp:taichi::Tlang::SNode::create_node@48] Non-power-of-two node size 320 promoted to 512.
[W 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@168] Bitcode loading error message:
Invalid bitcode signature
[E 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@170] Bitcode C:\Users\鏉ㄥ瓙閿怽AppData\Local\Programs\Python\Python38\Lib\site-packages\taichi\core\../lib/runtime_x86_64.bc load failure.
[E 02/03/20 14:54:22.782] Received signal 22 (SIGABRT)

To Reproduce

import taichi as ti
ti.cfg.debug = True
ti.cfg.arch = ti.x86_64 # Run on the CPU (x86_64) backend

n = 320
pixels = ti.var(dt=ti.f32, shape=(n * 2, n))

@ti.func
def complex_sqr(z):
  return ti.Vector([z[0] * z[0] - z[1] * z[1], z[1] * z[0] * 2])

@ti.kernel
def paint(t: ti.f32):
  for i, j in pixels: # Parallelized over all pixels
    c = ti.Vector([-0.8, ti.sin(t) * 0.2])
    z = ti.Vector([float(i) / n - 1, float(j) / n - 0.5]) * 2
    iterations = 0
    while z.norm() < 20 and iterations < 50:
      z = complex_sqr(z) + c
      iterations += 1
    pixels[i, j] = 1 - iterations * 0.02

gui = ti.GUI("Fractal", (n * 2, n))

for i in range(1000000):
  paint(i * 0.03)
  gui.set_image(pixels)
  gui.show()

If you have local commits (e.g. compile fixes before you reproduce the bug), please make sure you first make a PR to fix the build errors and then report the bug.

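Strictly speaking this is not a bug but an environment issue. I happened to run into it and others may hit it later, so I am filing this issue. Windows user names in other non-English languages may cause a similar problem.

Users with a non-English Windows user name can check whether loading fails for them.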

@dyangrun dyangrun added the potential bug Something that looks like a bug but not yet confirmed label Feb 3, 2020
@yuanming-hu
Member

Thanks so much for reporting! Yeah, a lot of Chinese users report Invalid bitcode signature on Windows yet I was struggling to reproduce it. I guess it's probably caused by the Chinese characters in paths.

Could anyone help confirm that without the Chinese characters, [Taichi version 0.4.2] works correctly on Windows? Thanks in advance.

@dyangrun
Author

dyangrun commented Feb 4, 2020

I tested it: with an English user name [Taichi version 0.4.2] works fine. Only a Chinese user name triggers this problem.
There was also a DLL loading error, which I resolved by following the method in #370.

@yuanming-hu
Member

Great, so that really was the cause. Thank you for the experiment. I did not consider this case in the early design. Based on your output

[W 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@168] Bitcode loading error message:
Invalid bitcode signature
[E 02/03/20 14:54:22.781] [taichi_llvm_context.cpp:taichi::Tlang::module_from_bitcode_file@170] Bitcode C:\Users\鏉ㄥ瓙閿怽AppData\Local\Programs\Python\Python38\Lib\site-packages\taichi\core\../lib/runtime_x86_64.bc load failure.

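it looks like C:\Users\杨子锐\ was converted to C:\Users\鏉ㄥ瓙閿怽A because of an encoding problem, so the runtime_x86_64.bc file was not found correctly, which led to Invalid bitcode signature. I am not very familiar with how Chinese paths are encoded on Windows. Do you have a good solution in mind?

Thanks!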

@yuanming-hu yuanming-hu added bug We've confirmed that this is an BUG and removed potential bug Something that looks like a bug but not yet confirmed labels Feb 4, 2020
@archibate
Collaborator

archibate commented Feb 5, 2020

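I wrote a small test program. It shows that
the UTF-8 encoding of "杨子锐", when decoded as GBK, becomes "鏉ㄥ瓙閿", and the last byte of "锐" merges with the following "\" to form "怽".

From what I found online:
Chinese-language Windows uses the GBK encoding by default,
whereas Python uses UTF-8 when passing strings to the C++ program.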
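
The mismatch described above can be reproduced in a few lines of Python. This is only a sketch of the encoding round trip, not the test program mentioned in this comment; the path is shortened from the log, and the variable names are made up for illustration.

```python
# The real path, as Python sees it (shortened from the log above).
original = "C:\\Users\\杨子锐\\AppData"

# The string is handed over as UTF-8 bytes ...
utf8_bytes = original.encode("utf-8")

# ... but a system whose code page is GBK (cp936) reads the same bytes as GBK.
garbled = utf8_bytes.decode("gbk")

# The backslash after the user name is swallowed into a two-byte GBK character,
# so the directory separator disappears and the path no longer exists.
print(garbled)

# The damage is reversible as long as the raw bytes are unchanged.
restored = garbled.encode("gbk").decode("utf-8")
```

The printed path should match the garbled path in the error log, which is why the runtime bitcode file could not be found.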

Consider changing python/taichi/core/util.py:39 to:

lib_dir = os.path.join(package_root(), 'lib')
if get_os_name() == 'win':
    lib_dir = lib_dir.encode('gbk') # Ideally the path encoding would be obtained through some Windows API
else:
    lib_dir = lib_dir.encode('utf-8')
core.set_lib_dir(lib_dir)

Hopefully this solves the problem.

@yuanming-hu
Member

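I see. Specifying GBK does solve the problem for Chinese characters on Windows. Is there a more systematic approach that would also support languages such as Russian?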

@archibate
Collaborator

archibate commented Feb 6, 2020

The locale module can detect the default encoding used by the current system, whether or not it is Windows:

>>> import locale
>>> locale.getdefaultlocale()
('zh_CN', 'cp936')

CP936 is the Windows code page for GBK, and for this purpose the two behave the same:

>>> '二三三'.encode('cp936')
b'\xb6\xfe\xc8\xfd\xc8\xfd'
>>> '二三三'.encode('gbk')
b'\xb6\xfe\xc8\xfd\xc8\xfd'

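cp stands for code page, the numbering Windows uses for the character encodings of different countries, regions, and languages, comparable to the LC_* environment variables on Linux. For example, 936 = GBK and 65001 = UTF-8. Windows users can view and change the code page of the current terminal with the cmd command chcp.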
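
To illustrate why the code page has to come from the system rather than being hard-coded, here is a small sketch using two of Python's built-in code page codecs. The sample strings and the helper `can_encode` are made up for illustration.

```python
# Sample text for two different Windows code pages.
samples = {
    "cp936": "杨子锐",    # Simplified Chinese (GBK)
    "cp1251": "Привет",   # Cyrillic
}

# Each string encodes cleanly under its own code page.
encoded = {cp: text.encode(cp) for cp, text in samples.items()}


def can_encode(text, code_page):
    """Return True if every character of text exists in code_page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# Chinese characters have no representation in the Cyrillic code page,
# so a hard-coded code page only works on matching systems.
chinese_in_cyrillic_page = can_encode("杨子锐", "cp1251")
```

One consequence is that even the locale encoding cannot represent a path whose characters fall outside the active code page.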

Consider writing it like this:

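import locale
...
def locale_encode(x):
    # Encode the str x with the system's default encoding, falling back to UTF-8.
    try:
        # getdefaultlocale() may return None for the encoding.
        encoding = locale.getdefaultlocale()[1] or 'utf-8'
    except Exception:
        encoding = 'utf-8'
    return x.encode(encoding)
...
core.set_lib_dir(locale_encode(lib_dir))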
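
A possible variant of the helper above, offered only as a sketch: it uses `locale.getpreferredencoding(False)` instead of `locale.getdefaultlocale()` (which is deprecated in recent Python versions) and falls back to UTF-8 when the path contains characters the locale encoding cannot represent. The `fallback` parameter is an assumption added for illustration and is not part of the proposal in this thread.

```python
import locale


def locale_encode(path, fallback="utf-8"):
    """Encode a str path with the locale's preferred encoding.

    Falls back to `fallback` if the locale encoding is unknown or cannot
    represent some character in `path`.
    """
    encoding = locale.getpreferredencoding(False) or fallback
    try:
        return path.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return path.encode(fallback)


# Usage: the result is bytes, ready to hand to code that expects the
# system's narrow-character encoding.
lib_dir_bytes = locale_encode("C:\\Users\\杨子锐\\lib")
```

Whether the UTF-8 fallback is the right choice depends on how the receiving C++ code opens files; it is a judgment call here, not something the thread settles.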

@yuanming-hu
Member

@archibate Cool! That sounds like a more systematic solution to the problem. Could you open a PR that adds it?

archibate added a commit to archibate/taichi that referenced this issue Feb 7, 2020
when a string is passed from python to C++, convert it into locale encoding, or it will fail if you try to pass it to OS API as file path.
yuanming-hu pushed a commit that referenced this issue Feb 7, 2020
when a string is passed from python to C++, convert it into locale encoding, or it will fail if you try to pass it to OS API as file path.
@dyangrun
Author

dyangrun commented Feb 7, 2020

It looks like this problem has been fixed.

@archibate
Collaborator

Sorry to dig up an old thread. Were you using Python 3.5 at the time?

@dyangrun
Author

Sorry, I only just saw the email. No, it was not 3.5. I was using 3.8 at the time.
