Cannot run the test shell script #12

Closed
unw9527 opened this issue Jul 21, 2021 · 6 comments

Comments

@unw9527

unw9527 commented Jul 21, 2021

Hello, and thanks for your work. When I run the test script for coseg-alien, I get the following error:

name: coseg-alien
0:   0%|                    | 0/37 [00:00<?, ?it/s]/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc(40): error: calling a constexpr __host__ function("floor") from a __global__ function("func_bc2f95c82b48131a_0") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

1 error detected in the compilation of "/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc".
[e 0721 14:21:37.874137 48:C15 parallel_compiler.cc:261] [Error] source file location: /home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc
[e 0721 14:21:37.874176 48:C15 parallel_compiler.cc:264] Compile fused operator(18/56) failed: [Op(0x5561911a3de0:0:0:1:i0:o1:s0,array->0x5561911a3e80),Op(0x5561911a2360:0:0:1:i1:o1:s0,broadcast_to->0x5561911a2410),Op(0x5561911a37f0:0:0:1:i2:o1:s0,binary.mod->0x5561911a30a0),] 

Reason: [f 0721 14:21:37.873882 48:C15 log.cc:387] Check failed ret(256) == 0(0) Run cmd failed: cd /home/xxx/.cache/jittor/default/g++ && /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc '/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc'     -std=c++14 -Xcompiler -fPIC  -Xcompiler -march=native  -Xcompiler -fdiagnostics-color=always  -I/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -DHAS_CUDA -I'/home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include' -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -lstdc++ -ldl -shared  -x cu --cudart=shared -ccbin='/usr/bin/g++' --use_fast_math  -w  -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -arch=compute_61  -code=sm_61  -o '/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.so'
0:   0%|                    | 0/37 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "train_seg.py", line 162, in <module>
    test(net, test_dataset, writer, 0, args)
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/__init__.py", line 257, in inner
    ret = func(*args, **kw)
  File "train_seg.py", line 64, in test
    preds = np.argmax(outputs.data, axis=1)
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.data)).

Types of your inputs are:
 self   = Var,

The function declarations are:
 inline DataView data()

Failed reason:[f 0721 14:21:38.068584 96 parallel_compiler.cc:316] Error happend during compilation, see error above.

Any idea why this happens? I downloaded the coseg-alien data with the provided shell script.
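
The RuntimeError at the bottom of the traceback is a downstream symptom: its "Failed reason" line points back to the compilation error printed earlier in the log. Here is a minimal sketch for pulling the nvcc diagnostics out of such a log. It assumes they can be recognized by the substrings "): error:" and "nvcc fatal", which is how they appear in this thread. The function name find_compiler_errors is hypothetical and not part of Jittor, and the sample log is shortened.

```python
def find_compiler_errors(log_text):
    """Return the log lines that look like nvcc diagnostics.

    Assumption: diagnostics contain "): error:" (file(line): error: ...)
    or "nvcc fatal", as in the logs quoted in this thread.
    """
    markers = ("): error:", "nvcc fatal")
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]


if __name__ == "__main__":
    # Shortened sample; real paths and function names are much longer.
    sample = "\n".join([
        "name: coseg-alien",
        'op.cc(40): error: calling a constexpr __host__ function("floor") '
        'from a __global__ function("func_0") is not allowed.',
        "RuntimeError: Wrong inputs arguments",
    ])
    for line in find_compiler_errors(sample):
        print(line)
```

Reading the first diagnostic this returns is usually more informative than the final Python exception.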

@lzhengning
Owner

Hi @unw9527 ,

Could you please upgrade Jittor and clear the cache with rm -r ~/.cache/jittor?

If there are still problems, please let me know.
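
For anyone who prefers to script the cache cleanup, here is a minimal Python sketch equivalent to the rm -r command above. It assumes the cache lives at ~/.cache/jittor, as in the paths shown in this thread. The helper name clear_jittor_cache is hypothetical and not part of Jittor.

```python
import shutil
from pathlib import Path


def clear_jittor_cache(cache_dir=None):
    """Delete the Jittor cache directory; return True if it was removed.

    Assumption: the cache lives at ~/.cache/jittor, as in the paths
    shown in this thread. Pass cache_dir to point somewhere else.
    """
    if cache_dir is not None:
        path = Path(cache_dir)
    else:
        path = Path.home() / ".cache" / "jittor"
    if not path.is_dir():
        return False  # nothing to clean
    shutil.rmtree(path)
    return True
```

Calling clear_jittor_cache() with no arguments removes the default location. Jittor then recompiles its operators on the next import, so the first run afterwards is slow.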

@unw9527
Author

unw9527 commented Jul 21, 2021

Hi @lzhengning ,
Thanks for your reply. I did as you suggested: Jittor is now at version 1.2.3.73 (previously 1.2.3.71), and I cleared the cache as well.
However, I now get a different error:

nvcc fatal   : Value 'c++14' is not defined for option 'std'
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor_utils/__init__.py", line 152, in do_compile
    return cc.cache_compile(cmd, cache_path, jittor_path)
RuntimeError: [f 0721 21:17:00.171152 12 log.cc:387] Check failed ret(256) == 0(0) Run cmd failed: cd /home/xxx/.cache/jittor/default/g++ && /usr/local/cuda/bin/nvcc /home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src/misc/nan_checker.cu      -std=c++14 -Xcompiler -fPIC  -Xcompiler -march=native  -Xcompiler -fdiagnostics-color=always  -I/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -DHAS_CUDA -I'/usr/local/cuda/include' -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -I/home/xxx/.cache/jittor/default/g++  -O2  -x cu --cudart=shared -ccbin='/usr/bin/g++'   -w  -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -c  -o /home/xxx/.cache/jittor/default/g++/obj_files/nan_checker.cu.o
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train_seg.py", line 10, in <module>
    import jittor as jt
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/__init__.py", line 18, in <module>
    from . import compiler
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/compiler.py", line 1106, in <module>
    compile(cc_path, cc_flags+opt_flags, files, 'jittor_core'+extension_suffix)
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/compiler.py", line 93, in compile
    jit_utils.run_cmds(cmds, cache_path, jittor_path, "Compiling "+base_output)
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor_utils/__init__.py", line 193, in run_cmds
    for i,_ in enumerate(p.imap_unordered(do_compile, cmds)):
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
RuntimeError: [f 0721 21:17:00.171152 12 log.cc:387] Check failed ret(256) == 0(0) Run cmd failed: cd /home/xxx/.cache/jittor/default/g++ && /usr/local/cuda/bin/nvcc /home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src/misc/nan_checker.cu      -std=c++14 -Xcompiler -fPIC  -Xcompiler -march=native  -Xcompiler -fdiagnostics-color=always  -I/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -DHAS_CUDA -I'/usr/local/cuda/include' -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -I/home/xxx/.cache/jittor/default/g++  -O2  -x cu --cudart=shared -ccbin='/usr/bin/g++'   -w  -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -c  -o /home/xxx/.cache/jittor/default/g++/obj_files/nan_checker.cu.o
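
The "nvcc fatal" line at the top of this log means the nvcc at /usr/local/cuda/bin/nvcc does not accept -std=c++14. Here is a minimal sketch for checking an nvcc release from the text printed by nvcc --version. It assumes that nvcc accepts -std=c++14 from CUDA 9.0 onward; that threshold is an assumption of this sketch and is not stated in the thread. Both function names are hypothetical.

```python
import re


def parse_nvcc_release(version_output):
    """Extract (major, minor) from the text printed by `nvcc --version`.

    Returns None if no "release X.Y" token is found.
    """
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_cpp14(release, minimum=(9, 0)):
    """Assumption: nvcc accepts -std=c++14 from CUDA 9.0 onward."""
    return release is not None and release >= minimum


if __name__ == "__main__":
    # Sample line modeled on the nvcc(11.2.152) reported later in this thread.
    sample = "Cuda compilation tools, release 11.2, V11.2.152"
    release = parse_nvcc_release(sample)
    print("nvcc release:", release, "| C++14 ok:", supports_cpp14(release))
```

To check a real installation, pass in the output of /usr/local/cuda/bin/nvcc --version instead of the sample string.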

I found that this might be caused by an outdated CUDA version. After I ran python3 -m jittor_utils.install_cuda, as suggested by the Jittor team, that error went away, but the original error came back:

[i 0722 13:02:24.137653 08 compiler.py:869] Jittor(1.2.3.73) src: /home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor
[i 0722 13:02:24.143181 08 compiler.py:870] g++ at /usr/bin/g++(5.4.0)
[i 0722 13:02:24.143247 08 compiler.py:871] cache_path: /home/xxx/.cache/jittor/default/g++
[i 0722 13:02:24.155516 08 install_cuda.py:37] cuda_driver_version: [11, 2]
[i 0722 13:02:24.161784 08 __init__.py:286] Found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc(11.2.152) at /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc.
[i 0722 13:02:24.214301 08 __init__.py:286] Found gdb(7.11.1) at /usr/bin/gdb.
[i 0722 13:02:24.221481 08 __init__.py:286] Found addr2line(2.26.1) at /usr/bin/addr2line.
[i 0722 13:02:24.239643 08 compiler.py:958] py_include: -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m
[i 0722 13:02:24.258129 08 compiler.py:960] extension_suffix: .cpython-37m-x86_64-linux-gnu.so
[i 0722 13:02:24.422251 08 compiler.py:1098] OS type:ubuntu OS key:ubuntu
[i 0722 13:02:24.423282 08 __init__.py:178] Total mem: 62.83GB, using 16 procs for compiling.
[i 0722 13:02:24.563519 08 jit_compiler.cc:22] Load cc_path: /usr/bin/g++
[i 0722 13:02:24.652271 08 init.cc:55] Found cuda archs: [61,]
[i 0722 13:02:24.666418 08 __init__.py:286] Found mpicc(1.10.2) at /usr/bin/mpicc.
[i 0722 13:02:24.704353 08 compiler.py:667] handle pyjt_include/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/mpi/inc/mpi_warper.h
[i 0722 13:02:24.724936 08 compile_extern.py:347] Downloading nccl...
[i 0722 13:02:24.785298 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/cublas.h
[i 0722 13:02:24.797011 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcublas.so
[i 0722 13:02:24.797106 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcublasLt.so.11
[i 0722 13:02:25.036328 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/cudnn.h
[i 0722 13:02:25.056544 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn.so.8
[i 0722 13:02:25.056619 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_ops_infer.so.8
[i 0722 13:02:25.059087 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_ops_train.so.8
[i 0722 13:02:25.059688 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_cnn_infer.so.8
[i 0722 13:02:25.083144 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_cnn_train.so.8
[i 0722 13:02:25.096104 08 compiler.py:667] handle pyjt_include/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/cudnn/inc/cudnn_warper.h
[i 0722 13:02:25.351712 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/curand.h
[i 0722 13:02:25.374880 08 compile_extern.py:20] found /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcurand.so
[i 0722 13:02:25.400630 08 cuda_flags.cc:26] CUDA enabled.
name: coseg-alien
0:   0%|                         | 0/37 [00:00<?, ?it/s]/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc(40): error: calling a constexpr __host__ function("floor") from a __global__ function("func_bc2f95c82b48131a_0") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

1 error detected in the compilation of "/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc".
[e 0722 13:02:28.692090 60:C8 parallel_compiler.cc:261] [Error] source file location: /home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc
[e 0722 13:02:28.692354 60:C8 parallel_compiler.cc:264] Compile fused operator(18/56) failed: [Op(0x55ab6354f100:0:0:1:i0:o1:s0,array->0x55ab6354e9a0),Op(0x55ab6354dcf0:0:0:1:i1:o1:s0,broadcast_to->0x55ab6354d5c0),Op(0x55ab6354e300:0:0:1:i2:o1:s0,binary.mod->0x55ab6354e390),] 

Reason: [f 0722 13:02:28.691857 60:C8 log.cc:387] Check failed ret(256) == 0(0) Run cmd failed: cd /home/xxx/.cache/jittor/default/g++ && /home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc '/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.cc'     -std=c++14 -Xcompiler -fPIC  -Xcompiler -march=native  -Xcompiler -fdiagnostics-color=always  -I/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/src -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -I/home/xxx/anaconda3/envs/subdivnet/include/python3.7m -DHAS_CUDA -I'/home/xxx/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include' -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -lstdc++ -ldl -shared  -x cu --cudart=shared -ccbin='/usr/bin/g++' --use_fast_math  -w  -I'/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/extern/cuda/inc'  -arch=compute_61  -code=sm_61  -o '/home/xxx/.cache/jittor/default/g++/jit/_opkey0:array_T:int32__JIT:1__JIT_cuda:1__index_t:int32___opkey1:broadcast_to_Tx:int32__DI...hash:bc2f95c82b48131a_op.so'
0:   0%|                         | 0/37 [00:08<?, ?it/s]
Traceback (most recent call last):
  File "train_seg.py", line 162, in <module>
    test(net, test_dataset, writer, 0, args)
  File "/home/xxx/anaconda3/envs/subdivnet/lib/python3.7/site-packages/jittor/__init__.py", line 257, in inner
    ret = func(*args, **kw)
  File "train_seg.py", line 64, in test
    preds = np.argmax(outputs.data, axis=1)
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.data)).

Types of your inputs are:
 self   = Var,

The function declarations are:
 inline DataView data()

Failed reason:[f 0722 13:02:34.479193 08 parallel_compiler.cc:316] Error happend during compilation, see error above.

@lzhengning
Owner

I reproduced this error with the latest Jittor. It appears to be a recently introduced bug, and it will be fixed soon.

Could you try installing Jittor 1.2.3.48 with python3.7 -m pip install jittor==1.2.3.48? I have tested this version and it works.
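
To confirm which version a run actually used, the startup log can be checked against the pinned release. This is a minimal sketch, assuming the startup line has the form "Jittor(<version>) src: ..." as in the logs above. The helper names and the KNOWN_GOOD constant are hypothetical, and the sample path is shortened.

```python
import re

KNOWN_GOOD = "1.2.3.48"  # the release reported as tested in this thread


def extract_jittor_version(log_text):
    """Return the version from a line like "... Jittor(1.2.3.73) src: ...",
    or None if no such line is present."""
    match = re.search(r"Jittor\((\d+(?:\.\d+)*)\)", log_text)
    return match.group(1) if match else None


def version_tuple(version):
    """Turn "1.2.3.48" into (1, 2, 3, 48) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    line = "[i 0722 13:02:24.137653 08 compiler.py:869] Jittor(1.2.3.73) src: /home/xxx/..."
    found = extract_jittor_version(line)
    is_pinned = version_tuple(found) == version_tuple(KNOWN_GOOD)
    print(found, "matches known-good:", is_pinned)
```

Comparing tuples rather than raw strings avoids mis-ordering versions such as 1.2.3.9 and 1.2.3.48.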

@unw9527
Author

unw9527 commented Jul 23, 2021

It works. Thanks.

@lzhengning
Owner

Closing, since the latest Jittor release fixes these bugs.

@1170300814

No! They have not fixed this bug.

@1170300814 1170300814 mentioned this issue Nov 8, 2023