Can't import taichi (from 0.5.15) using Ubuntu 20.04 in VM #958

zhai-xiao · 2020-05-12T06:30:03Z

Taichi fails to load on Ubuntu 20.04 in a VM since 0.5.15. I tried both VMware Player and VirtualBox. Versions 0.5.14 and earlier are fine.

archibate · 2020-05-12T07:19:08Z

Thanks for your patience! This issue is caused by the OpenGL backend that was recently added in v0.5.15.
VMs are known to have poor OpenGL support. Could you run glxinfo | grep OpenGL to verify that? Also, do you have the same problem on your physical machine (not in a VM)?

zhai-xiao · 2020-05-12T08:10:40Z

Sure, VMs are not famous for their OpenGL compatibility. I got 3.3 on VMware but only 2.1 in VirtualBox. Sadly, I don't have a proper environment on my physical machine. Could you please not assume a GL version and make it work with CPU only, just like before? The glxinfo output from both VMs is attached.

archibate · 2020-05-12T11:02:30Z

I got 3.3 on VMware but only 2.1 in VirtualBox.

Taichi, however, requires OpenGL 4.3 to work.
We should detect the version before calling glfwCreateWindow and return false in that case:

taichi/taichi/backends/opengl/opengl_api.cpp

Lines 517 to 519 in e0ef399

    
    bool is_opengl_api_available() {
      return initialize_opengl(true);
    }

But here is the problem:
We need an OpenGL context to call glGetString(GL_VERSION).
We need glfwCreateWindow to get an OpenGL context.
We need glGetString(GL_VERSION) to determine whether to call glfwCreateWindow.
https://stackoverflow.com/questions/46510889/how-can-i-know-which-opengl-version-is-supported-by-my-system

Could you please not assume a GL version and make it work with CPU only, just like before?

Possible temporary solution: remove L455-L456 from ~/.local/lib/python3.8/site-packages/taichi/core/util.py:

    if ti_core.with_opengl():
        supported_archs.append('opengl')

A related issue: glfw/glfw#766

zhai-xiao · 2020-05-12T12:17:31Z

Thanks for the reply. I know the OpenGL backend probably needs compute shaders to really shine. However, to the best of my knowledge, OpenGL 4.3 is currently almost impossible to get in most VMs, so I'd like to fall back to x64 for now.
Removing L455-456 in util.py fixes the line 'import taichi as ti', but when I run 'ti.init(arch=ti.x64)', I still get pretty much the same error. In the call stack I can still see OpenGL being initialized. I'm not quite sure what's going on behind the scenes, but it seems that ti_core.with_opengl() is still true even if arch=ti.x64 is passed in.

archibate · 2020-05-12T13:30:58Z

Thanks for the information. I found another with_opengl call at L271-L272 of ~/.local/lib/python3.8/site-packages/taichi/lang/__init__.py:

    if ti_core.with_opengl():
        archs.append(opengl)

but it seems that ti_core.with_opengl() is still true even if arch=ti.x64 is passed in.

Note that with_opengl is not expected to return false when arch=ti.x64 is specified. It detects whether the OpenGL driver is available and returns false only when the driver is unavailable, regardless of any manually specified arch.
However, with_opengl crashes with a segmentation fault while detecting OpenGL availability...
It would be straightforward if we could catch that SIGSEGV and return false in that case.
Python pseudocode:

    def with_opengl():
        try:
            return initialize_opengl()
        except SegmentFault:  # pseudocode: Python has no such exception
            return False

taichi/taichi/core/logging.cpp

Lines 157 to 162 in e0ef399

    
    void signal_handler(int signo) {
      // It seems that there's no way to pass exception to Python in signal
      // handlers?
      auto sig_name = signal_name(signo);
      logger.error(fmt::format("Received signal {} ({})", signo, sig_name), false);
      exit(-1);

https://docs.python.org/3/library/faulthandler.html#module-faulthandler
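
The pseudocode above cannot run as written: a segmentation fault terminates the interpreter rather than raising an exception, and faulthandler only dumps a traceback. One possible workaround, shown here as a minimal sketch and not as what Taichi does, is to run the risky probe in a child process and inspect its exit status. The helper name probe_in_subprocess and the stand-in snippets are hypothetical; a real probe would import and initialize the backend instead.

```python
import subprocess
import sys


def probe_in_subprocess(snippet, timeout=30.0):
    """Run `snippet` in a child interpreter and report whether it exited cleanly.

    A crash in the child (for example a segfault) only kills the child, so the
    parent can treat any non-zero exit status as "backend unavailable".
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # A hung probe is treated the same way as a crashed one.
        return False
    return result.returncode == 0


# Stand-in snippets (hypothetical): the second one kills itself with SIGSEGV
# to imitate a crashing driver probe.
print(probe_in_subprocess("pass"))
print(probe_in_subprocess("import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"))
```

The cost of this design is one extra interpreter start-up per probe, which may matter if it runs on every import.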

This would also apply to the CUDA backend, which is commonly reported to crash on startup (@yuanming-hu). What do you think?

Btw, you can run TI_LOG_LEVEL=trace python test.py to print more details about the internal process.

k-ye · 2020-05-12T14:22:03Z

BTW, is it possible to detect OpenGL version inside with_opengl(), something like this? Report true only if version >= 4.3?
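
For illustration, the "version >= 4.3" comparison could be sketched as below, assuming a version string is already in hand (for example from glGetString(GL_VERSION) once a context exists, or from the glxinfo output). The function names and the sample strings are made up for this sketch and are not part of Taichi.

```python
import re

REQUIRED_GL_VERSION = (4, 3)  # minimum version stated in this thread


def parse_gl_version(version_string):
    """Return (major, minor) from the first 'X.Y' pattern, or None if absent."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_requirement(version_string, required=REQUIRED_GL_VERSION):
    """True only if the string parses and the version is at least `required`."""
    version = parse_gl_version(version_string)
    return version is not None and version >= required


# Illustrative strings only; real driver strings vary by vendor.
for sample in ("3.3 (Compatibility Profile) Mesa 20.0.4", "2.1 Mesa 20.0.4", "4.6.0 NVIDIA 440.64"):
    print(sample, "->", meets_requirement(sample))
```

Comparing (major, minor) tuples avoids the pitfall of comparing versions as floats or strings.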

archibate · 2020-05-12T15:03:43Z

BTW, is it possible to detect OpenGL version inside with_opengl(), something like this? Report true only if version >= 4.3?

Thanks for the suggestion. I hope so, but we can't call glGetIntegerv before glfwCreateWindow. I think we will stick to the catch-segmentation-fault approach, which is also helpful for CUDA.

We need an OpenGL context to call glGetString(GL_VERSION).
We need glfwCreateWindow to get an OpenGL context.
We need glGetString(GL_VERSION) to determine whether to call glfwCreateWindow (which causes the segfault).

yuanming-hu · 2020-05-12T15:54:42Z

A probably easier solution: can we simply have a .taichiconfig to disable OpenGL manually in certain environments?

archibate · 2020-05-12T15:56:15Z

Yes, we can. Do you mean that users without a suitable OpenGL environment would manually add TI_WITH_OPENGL=0 to their .bashrc?

yuanming-hu · 2020-05-12T15:59:47Z

Oh, making use of environment variables does sound like a good solution for *nix users! Let's use something like TI_ENABLE_OPENGL?

We should also consider how to make Taichi work out of the box without setting anything like an environment variable. On the other hand, we don't want to set TI_ENABLE_OPENGL=0 by default. Do you have any idea how to achieve both?
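
As a rough sketch of the opt-out idea, the check could default to enabled and turn off only when the variable is explicitly set to a false-like value. How Taichi actually parses TI_ENABLE_OPENGL is not shown in this thread, so the accepted values and the helper name opengl_enabled below are assumptions.

```python
import os


def opengl_enabled(environ=None):
    """Return False only when TI_ENABLE_OPENGL is explicitly set to a false-like value.

    The set of accepted "off" spellings is an assumption made for this sketch.
    """
    env = os.environ if environ is None else environ
    value = env.get("TI_ENABLE_OPENGL", "1").strip().lower()
    return value not in ("0", "false", "off", "no")


print(opengl_enabled({}))                           # variable unset: stays enabled
print(opengl_enabled({"TI_ENABLE_OPENGL": "0"}))    # explicit opt-out
```

Passing the environment as a parameter keeps the check easy to test without mutating os.environ.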

zhai-xiao · 2020-05-12T21:06:24Z

Thanks for all the timely replies. You guys are amazing!

archibate · 2020-05-13T13:05:49Z

You're welcome, and thank you for pointing out the bug and providing valuable information!

@yuanming-hu Can we release #962 with v0.6.4 tonight? So that @TroyZhai could try out TI_ENABLE_OPENGL=0 and see if it works.
Also note that this is a temporary solution, given that it's hard to figure out why the crash happens. We must find a permanent solution for this issue at some point.

yuanming-hu · 2020-05-13T15:46:32Z

Sure - I have meetings in the morning but I'll release v0.6.4 in a couple of hours.

yuanming-hu · 2020-05-13T23:16:04Z

@TroyZhai We just now released v0.6.4. When you get a chance, could you upgrade and run with TI_ENABLE_OPENGL=0? Please let us know if that works.

No rush on this at all. Thank you!

zhai-xiao · 2020-05-14T00:43:15Z

@TroyZhai We just now released v0.6.4. When you get a chance, could you upgrade and run with TI_ENABLE_OPENGL=0? Please let us know if that works.

No rush on this at all. Thank you!

Great news! I can confirm that it works as expected on my VMs when I set "export TI_ENABLE_OPENGL=0". Thanks all!
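
For anyone who prefers not to export the variable globally, a minimal sketch of setting it for a single run from Python follows. The helper name run_without_opengl is hypothetical, and the child command here is a stand-in that only echoes the variable; in practice it would be something like [sys.executable, "test.py"].

```python
import os
import subprocess
import sys


def run_without_opengl(argv):
    """Run `argv` with TI_ENABLE_OPENGL=0 set in the child's environment only."""
    env = dict(os.environ, TI_ENABLE_OPENGL="0")
    return subprocess.run(argv, env=env, capture_output=True, text=True)


# Stand-in child process that prints the variable it received.
result = run_without_opengl(
    [sys.executable, "-c", "import os; print(os.environ['TI_ENABLE_OPENGL'])"]
)
print(result.stdout.strip())
```

The parent's own environment is left untouched, so other programs started from the same shell are unaffected.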

yuanming-hu · 2020-05-14T01:28:32Z

Awesome!

I'm closing this thanks to the hard work by @archibate.

archibate · 2020-05-14T01:39:48Z

Cool! How about adding this usage to the docs? Perhaps a chapter called Troubleshooting that covers TI_ENABLE_OPENGL, TI_USE_UNIFIED_MEMORY, etc., so that it can solve more people's problems.

yuanming-hu · 2020-05-14T01:52:56Z

Sounds good! Should we move the following items in the README file there as well?

On Ubuntu 19.04+, please sudo apt install libtinfo5.

On Windows, please install Microsoft Visual C++ Redistributable if you haven't.

A chapter named Installation sounds good. We can address all compatibility issues there. Maybe we can put it before Hello world?

This text in Hello world should also be moved there:

First of all, let’s install Taichi via pip:
# Python 3.6+ needed
python3 -m pip install taichi

zhai-xiao added the potential bug (Something that looks like a bug but not yet confirmed) label May 12, 2020

archibate self-assigned this May 12, 2020

archibate added opengl OpenGL backend dependency linux Linux platform labels May 12, 2020

archibate mentioned this issue May 12, 2020

[opengl] add TI_ENABLE_OPENGL env var to disable OpenGL #962

Merged

yuanming-hu closed this as completed May 14, 2020

Eydcao mentioned this issue Jun 2, 2020

[Bug] Cannot load taichi with opengl 4.6 > 4.3 on Ubuntu 20 (AMD card) #1106

Closed

archibate mentioned this issue Jun 4, 2020

[opengl] [Bug] Fix crash with old GLX #1134

Closed

This was referenced Jun 25, 2020

[Bug] [OpenGL] with_opengl crashed with segfault after upgrading #1325

Closed

[Bug] [linux] [opengl] Use RTLD_LOCAL to prevent LLVM symbol conflict with GLX #1326

Merged

archibate mentioned this issue Jul 16, 2020

[Linux] [llvm] Fix LLVM symbol leakage to prevent conflict with other libs using LLVM like GLX #1508

Merged

archibate mentioned this issue Aug 10, 2020

[LLVM] Add TI_WITH_LLVM option to disable LLVM backends #1659

Closed
