RhiResult(-3) error when running NGP example on Windows with Vulkan #7674

Open
arrrmin opened this issue Mar 28, 2023 · 5 comments
Labels: bug, vulkan

Comments

arrrmin commented Mar 28, 2023

Describe the bug
I get a Dispatch error : RhiResult(-3) when trying to run taichi_ngp.py on Windows using Vulkan.

To Reproduce
Run the script taichi_ngp.py on Windows 11 with Python 3.10.9 and Taichi 1.5.0.
CPU: Ryzen 9 7900X
GPU: AMD 7900XT

Log/Screenshots

(taichi) C:\Users\armin\projects\external\taichi>C:/Users/armin/Programs/miniconda3/envs/taichi/python.exe c:/Users/armin/projects/external/taichi/python/taichi/examples/rendering/taichi_ngp.py
[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.10.9
[W 03/28/23 14:37:12.283 28600] [cuda_driver.cpp:taichi::lang::CUDADriverBase::load_lib@36] nvcuda.dll lib not found.
[Taichi] Starting on arch=vulkan
Loading model from ./npy_models/lego.npy
[E 03/28/23 14:37:12.541 28600] [runtime.cpp:taichi::lang::gfx::GfxRuntime::launch_kernel@576] Dispatch error : RhiResult(-3)


Traceback (most recent call last):
  File "c:\Users\armin\projects\external\taichi\python\taichi\examples\rendering\taichi_ngp.py", line 1111, in <module>
    main(cmd_args)
  File "c:\Users\armin\projects\external\taichi\python\taichi\examples\rendering\taichi_ngp.py", line 1081, in main
    ngp.load_model(npy_file)
  File "c:\Users\armin\projects\external\taichi\python\taichi\examples\rendering\taichi_ngp.py", line 332, in load_model
    self.hash_embedding.from_numpy(
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\util.py", line 311, in wrapped
    return func(*args, **kwargs)
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\field.py", line 357, in from_numpy
    self._from_external_arr(arr)
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\util.py", line 311, in wrapped
    return func(*args, **kwargs)
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\field.py", line 347, in _from_external_arr
    ext_arr_to_tensor(arr, self)
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\kernel_impl.py", line 1023, in wrapped
    return primal(*args, **kwargs)
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\kernel_impl.py", line 950, in __call__
    return self.runtime.compiled_functions[key](*args)
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\kernel_impl.py", line 853, in func__
    raise e from None
  File "C:\Users\armin\Programs\miniconda3\envs\taichi\lib\site-packages\taichi\lang\kernel_impl.py", line 850, in func__
    t_kernel(launch_ctx)
RuntimeError: [runtime.cpp:taichi::lang::gfx::GfxRuntime::launch_kernel@576] Dispatch error : RhiResult(-3)

Additional comments

(taichi) C:\Users\armin\projects\external\taichi>ti diagnose
[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.10.9

*******************************************
**      Taichi Programming Language      **
*******************************************

Docs:   https://docs.taichi-lang.org/
GitHub: https://github.com/taichi-dev/taichi/
Forum:  https://forum.taichi.graphics/

Taichi system diagnose:

python: 3.10.9 | packaged by conda-forge | (main, Jan 11 2023, 15:15:40) [MSC v.1916 64 bit (AMD64)]
system: win32
executable: C:\Users\armin\Programs\miniconda3\envs\taichi\python.exe
platform: Windows-10-10.0.22621-SP0
architecture: 64bit WindowsPE
uname: uname_result(system='Windows', node='Armin-PC', release='10', version='10.0.22621', machine='AMD64')
locale: en_AT.cp1252
PATH: C:\Users\armin\Programs\miniconda3\envs\taichi;C:\Users\armin\Programs\miniconda3\envs\taichi\Library\mingw-w64\bin;C:\Users\armin\Programs\miniconda3\envs\taichi\Library\usr\bin;C:\Users\armin\Programs\miniconda3\envs\taichi\Library\bin;C:\Users\armin\Programs\miniconda3\envs\taichi\Scripts;C:\Users\armin\Programs\miniconda3\envs\taichi\bin;C:\Users\armin\Programs\miniconda3\condabin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\Git\cmd;C:\Users\armin\AppData\Local\Microsoft\WindowsApps;C:\Users\armin\AppData\Local\gitkraken\bin;C:\Users\armin\AppData\Local\Programs\Microsoft VS Code\bin;C:\Users\armin\Programs\miniconda3;C:\Users\armin\Programs\miniconda3\Scripts;C:\Users\armin\Programs\miniconda3\Library\bin;.;C:\Users\armin\AppData\Local\Programs\Julia-1.8.5\bin;C:\Users\armin\Programs\miniconda3\envs\taichi\Lib\site-packages\taichi\_lib
PYTHONPATH: ['C:\\Users\\armin\\Programs\\miniconda3\\envs\\taichi\\Scripts\\ti.exe', 'C:\\Users\\armin\\Programs\\miniconda3\\envs\\taichi\\python310.zip', 'C:\\Users\\armin\\Programs\\miniconda3\\envs\\taichi\\DLLs', 'C:\\Users\\armin\\Programs\\miniconda3\\envs\\taichi\\lib', 'C:\\Users\\armin\\Programs\\miniconda3\\envs\\taichi', 'C:\\Users\\armin\\Programs\\miniconda3\\envs\\taichi\\lib\\site-packages']

`lsb_release` not available: [WinError 2] The system cannot find the file specified


import: <module 'taichi' from 'C:\\Users\\armin\\Programs\\miniconda3\\envs\\taichi\\lib\\site-packages\\taichi\\__init__.py'>

cc: False
cpu: True
metal: False
opengl: True
[W 03/28/23 14:28:12.986 29280] [cuda_driver.cpp:taichi::lang::CUDADriverBase::load_lib@36] nvcuda.dll lib not found.
cuda: False
vulkan: True

`glewinfo` not available: [WinError 2] The system cannot find the file specified

`nvidia-smi` not available: [WinError 2] The system cannot find the file specified
[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.10.9

[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.10.9
[Taichi] Starting on arch=x64

[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.10.9
[Taichi] Starting on arch=opengl

[W 03/28/23 14:28:15.777 32208] [cuda_driver.cpp:taichi::lang::CUDADriverBase::load_lib@36] nvcuda.dll lib not found.
[W 03/28/23 14:28:15.778 32208] [misc.py:adaptive_arch_select@772] Arch=[<Arch.cuda: 5>] is not supported, falling back to CPU
[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.10.9
[Taichi] Starting on arch=x64

[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.10.9

*******************************************
**      Taichi Programming Language      **
*******************************************

Docs:   https://docs.taichi-lang.org/
GitHub: https://github.com/taichi-dev/taichi/
Forum:  https://forum.taichi.graphics/

                                TAICHI EXAMPLES
+-----------------------------------------------------------------------------+
| 0: ad_gravity           | 25: laplace             | 50: physarum            |
| 1: circle_packing_image | 26: laplace_equation    | 51:                     |
|                         |                         | poisson_disk_sampling   |
| 2: comet                | 27: mandelbrot_zoom     | 52: print_offset        |
| 3: cornell_box          | 28: marching_squares    | 53: rasterizer          |
| 4: diff_sph             | 29: mass_spring_3d_ggui | 54: regression          |
| 5: euler                | 30: mass_spring_game    | 55: sdf_renderer        |
| 6: explicit_activation  | 31:                     | 56: simple_derivative   |
|                         | mass_spring_game_ggui   |                         |
| 7: export_mesh          | 32: mciso_advanced      | 57: simple_texture      |
| 8: export_ply           | 33: mgpcg               | 58: simple_uv           |
| 9: export_videos        | 34: mgpcg_advanced      | 59: snow_phaseField     |
| 10: fem128              | 35: minimal             | 60: stable_fluid        |
| 11: fem128_ggui         | 36: minimization        | 61: stable_fluid_ggui   |
| 12: fem99               | 37: mpm128              | 62: stable_fluid_graph  |
| 13: fractal             | 38: mpm128_ggui         | 63: taichi_bitmasked    |
| 14: fractal3d_ggui      | 39: mpm3d               | 64: taichi_dynamic      |
| 15: fullscreen          | 40: mpm3d_ggui          | 65: taichi_logo         |
| 16: game_of_life        | 41: mpm88               | 66: taichi_ngp          |
| 17: gui_image_io        | 42: mpm88_graph         | 67: taichi_sparse       |
| 18: gui_widgets         | 43: mpm99               | 68: texture_graph       |
| 19: implicit_fem        | 44:                     | 69: tutorial            |
|                         | mpm_lagrangian_forces   |                         |
| 20:                     | 45: nbody               | 70:                     |
| implicit_mass_spring    |                         | two_stream_instability  |
| 21:                     | 46: odop_solar          | 71: vortex_rings        |
| initial_value_problem   |                         |                         |
| 22: jacobian            | 47: oit_renderer        | 72: waterwave           |
| 23:                     | 48: patterns            |                         |
| karman_vortex_street    |                         |                         |
| 24: keyboard            | 49: pbf2d               |                         |
+-----------------------------------------------------------------------------+
42
Running example minimal ...
[Taichi] Starting on arch=x64
42.0
>>> Running time: 0.22s

Consider attaching this log when maintainers ask about system information.
>>> Running time: 6.75s

bobcao3 (Collaborator) commented Mar 28, 2023

-3 means "not supported", so I think we accidentally used an unsupported or not-activated feature.

We will look into it.
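
For anyone triaging similar logs, the result code can be pulled out of the message and labelled programmatically. This is only a minimal sketch: `describe_rhi_error` and `KNOWN_RHI_CODES` are hypothetical names, and the only mapping included is the one stated above (-3 means "not supported"). Other codes are left unlabelled rather than guessed.

```python
import re

# Only -3 is explained in this thread; extend once other codes are confirmed.
KNOWN_RHI_CODES = {-3: "not supported"}

_RHI_PATTERN = re.compile(r"RhiResult\((-?\d+)\)")


def describe_rhi_error(message):
    """Return (code, label) for the first RhiResult in message, or None."""
    match = _RHI_PATTERN.search(message)
    if match is None:
        return None
    code = int(match.group(1))
    return code, KNOWN_RHI_CODES.get(code, "unknown")


line = ("[runtime.cpp:taichi::lang::gfx::GfxRuntime::launch_kernel@576] "
        "Dispatch error : RhiResult(-3)")
print(describe_rhi_error(line))
```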

bobcao3 (Collaborator) commented Mar 30, 2023

We have located the bug: it appears to be an issue with the maxComputeWorkGroupCount values on AMD GPUs. We should be able to fix this quickly.
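
To see why that limit matters, here is a back-of-the-envelope check. It is a sketch, not Taichi's actual dispatch logic: the helper names are hypothetical, the block size of 128 is an assumed value, and 65535 is the minimum per-dimension `maxComputeWorkGroupCount` that the Vulkan specification guarantees (a given device may report more).

```python
# Minimum per-dimension maxComputeWorkGroupCount guaranteed by the Vulkan spec.
VULKAN_GUARANTEED_MAX_GROUP_COUNT = 65535


def workgroups_needed(num_threads, block_dim):
    """Workgroups required when each one runs block_dim threads (ceil division)."""
    return (num_threads + block_dim - 1) // block_dim


def dispatch_fits(num_threads, block_dim,
                  max_group_count=VULKAN_GUARANTEED_MAX_GROUP_COUNT):
    """True if a 1D dispatch stays within the workgroup-count limit."""
    return workgroups_needed(num_threads, block_dim) <= max_group_count


# One thread per element, with an assumed block size of 128.
for n in (1_000_000, 10_000_000):
    print(n, workgroups_needed(n, 128), dispatch_fits(n, 128))
```

Under these assumptions, a dispatch that launches one thread per element stops fitting once the element count exceeds the limit multiplied by the block size.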

chenzhekl commented

Any updates on this issue? It seems to affect Intel cards as well.

chenzhekl commented Sep 15, 2023

Here is a minimal script to reproduce the issue:

import numpy as np
import taichi as ti

ti.init(arch=ti.gpu)

# 100 million float32 values: long enough to trigger the error.
np_arr = np.ones((100000000,), dtype=np.float32)
ti_field = ti.field(dtype=ti.f32, shape=(np_arr.shape[0],))


@ti.kernel
def run(dst: ti.template(), src: ti.types.ndarray()):
    # Copy every element of the ndarray into the field.
    for I in dst:
        dst[I] = src[I]


run(ti_field, np_arr)

Once the length of np_arr exceeds a certain threshold, the Vulkan backend fails with RhiResult(-3).

If this issue is low on your priority list, could you please advise me on how I could contribute a PR for it? @bobcao3

Thanks
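
One way to pin down that threshold is to bisect on the array length. The sketch below is backend-agnostic: `first_failing_length` is a hypothetical helper, and `runs_ok` is a caller-supplied predicate that is assumed to be monotonic (every length below the threshold works, every length at or above it fails). With Taichi, the predicate would wrap the script above and catch `RuntimeError`. It is probably safest to run each probe in a fresh process, since it is unclear whether the runtime stays usable after a failed dispatch.

```python
def first_failing_length(runs_ok, lo, hi):
    """Smallest n in (lo, hi] for which runs_ok(n) is False.

    Requires runs_ok(lo) to be True and runs_ok(hi) to be False.
    """
    if not runs_ok(lo) or runs_ok(hi):
        raise ValueError("need runs_ok(lo) == True and runs_ok(hi) == False")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if runs_ok(mid):
            lo = mid  # the threshold lies above mid
        else:
            hi = mid  # the threshold lies at or below mid
    return hi


# Stand-in predicate with a made-up threshold, only to exercise the search.
FAKE_THRESHOLD = 5_000_000
print(first_failing_length(lambda n: n < FAKE_THRESHOLD, 1, 100_000_000))
```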

dme49 commented Jun 20, 2024

No PR, sorry, but if somebody wants to pick this up (e.g. as part of PR #7333), the following patch fixes this for me (Linux, Mesa RADV Vulkan, AMD). The problem is in the const-range case of the SPIR-V range-for codegen, which can currently ask for an unbounded number of workgroups. This patch just applies a fixed cap that matches the dynamic case. I'm assuming the performance implications don't matter, but you may know better.

diff --git a/taichi/codegen/spirv/spirv_codegen.cpp b/taichi/codegen/spirv/spirv_codegen.cpp
index e1e1124fd..b9167f6e0 100644
--- a/taichi/codegen/spirv/spirv_codegen.cpp
+++ b/taichi/codegen/spirv/spirv_codegen.cpp
@@ -2000,7 +2000,17 @@ class TaskCodegen : public IRVisitor {
           ir_->i32_type(), stmt->begin_value, false);  // Named Constant
       total_elems = ir_->int_immediate_number(ir_->i32_type(), num_elems,
                                               false);  // Named Constant
-      task_attribs_.advisory_total_num_threads = num_elems;
+      // To avoid exceeding device limits, we must cap total_num_threads so
+      // that the eventual num_workgroups = total_num_threads/block_dim is in
+      // range.  Use the same kMaxNumThreadsGridStrideLoop cap as the dynamic
+      // case; that's probably a bit conservative for typical (num_elems,
+      // block_dim) combinations, but there's little to be gained by adapting
+      // to the actual limit even if that were readily available.
+      task_attribs_.advisory_total_num_threads = std::min(
+          kMaxNumThreadsGridStrideLoop, num_elems);
+      TI_DEBUG("num_elems={} block_dim={} -> advisory_total_num_threads={}",
+               num_elems, stmt->block_dim,
+               task_attribs_.advisory_total_num_threads);
     } else {
       spirv::Value end_expr_value;
       if (stmt->end_stmt) {
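
The idea behind the cap can be illustrated outside of Taichi. In a grid-stride loop, a fixed number of threads each handle several elements, stepping by the total thread count, so the workgroup count no longer grows with the loop range. The following pure-Python simulation is only a sketch of that technique: `MAX_THREADS` stands in for `kMaxNumThreadsGridStrideLoop` (its real value is not quoted in this thread, so the number used here is an assumption), and the function names are hypothetical.

```python
MAX_THREADS = 131072  # assumed stand-in for kMaxNumThreadsGridStrideLoop


def capped_thread_count(num_elems, max_threads=MAX_THREADS):
    """Mirror of the patched line: never launch more than max_threads threads."""
    return min(max_threads, num_elems)


def elements_for_thread(thread_id, total_threads, begin, end):
    """Indices one thread visits in a grid-stride loop over [begin, end)."""
    return range(begin + thread_id, end, total_threads)


def simulate(begin, end, max_threads=MAX_THREADS):
    """Count how often each index is visited across all simulated threads."""
    num_elems = end - begin
    total = capped_thread_count(num_elems, max_threads)
    visits = [0] * num_elems
    for tid in range(total):
        for i in elements_for_thread(tid, total, begin, end):
            visits[i - begin] += 1
    return total, visits


total, visits = simulate(0, 1000, max_threads=64)
print(total, all(v == 1 for v in visits))
```

With the cap in place, the number of workgroups is bounded by the cap divided by the block size, regardless of how large the loop range is, while every index is still visited exactly once.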

Project status: In Progress
Linked branches or pull requests: none
Participants: 4