Compile the problem encountered by Pytorch3D on Windows #1127

Closed
RobinLiuZX opened this issue Mar 16, 2022 · 9 comments
Labels
installation Installation questions or issues

Comments

@RobinLiuZX


🐛 Bugs / Unexpected behaviors

C:\Users\11750.conda\envs\ROBIN\lib\site-packages\torch\include\pybind11\cast.h(1429): error: too few arguments for template template parameter "Tuple"
detected during instantiation of class "pybind11::detail::tuple_caster<Tuple, Ts...> [with Tuple=std::pair, Ts=<T1, T2>]"
(1507): here

C:\Users\11750.conda\envs\ROBIN\lib\site-packages\torch\include\pybind11\cast.h(1503): error: too few arguments for template template parameter "Tuple"
detected during instantiation of class "pybind11::detail::tuple_caster<Tuple, Ts...> [with Tuple=std::pair, Ts=<T1, T2>]"
(1507): here

2 errors detected in the compilation of "D:/HEU/pytorch3d-0.6.1/pytorch3d/csrc/ball_query/ball_query.cu".
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe' failed with exit status 1
The errors I get are shown above. How can I solve this problem? Thank you in advance to the authors!

@bottler bottler self-assigned this Mar 16, 2022
@bottler bottler added the installation Installation questions or issues label Mar 16, 2022
@bottler
Contributor

bottler commented Mar 16, 2022

See #1024. This is a known problem we have with newer versions of Visual Studio.

@bottler
Contributor

bottler commented Mar 16, 2022

Actually, I think one or both of the following two changes will fix it, in both iou_box3d.h and ball_query.h:

  • remove inline
  • change at::Tensor to torch::Tensor

@dalton-omens

I'm having the same issue. I downgraded Visual Studio to 16.7.26 (my cl.exe is now 14.27.29110), made the changes you mentioned to those headers, and removed "-std=c++14" from nvcc_args, but the error persists. My setup is CUDA 11.6 with torch==1.10.2+cu113. I'm trying to downgrade CUDA to 11.3 because of the minor version mismatch, but it doesn't seem like that would be the problem.
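The minor version mismatch mentioned above (a torch build tagged cu113 next to a CUDA 11.6 toolkit) can be spotted with a small check. This is only a sketch: the function names are made up for illustration, and it assumes the usual wheel tag convention in which the last digit of the `cuNNN` suffix is the minor version. It says nothing about whether the mismatch causes the Tuple error.

```python
import re


def cuda_from_torch_version(version):
    """Return e.g. '11.3' for 'torch==1.10.2+cu113', or None if there is no +cu tag."""
    match = re.search(r"\+cu(\d+)", version)
    if match is None:
        return None
    digits = match.group(1)
    # Assumption: the last digit is the minor version (cu113 -> 11.3, cu102 -> 10.2).
    return f"{digits[:-1]}.{digits[-1]}"


def same_cuda_minor(torch_version, toolkit_version):
    """Compare the CUDA version torch was built for with the installed toolkit.

    Returns True or False, or None when the torch version has no +cu tag.
    """
    built_for = cuda_from_torch_version(torch_version)
    if built_for is None:
        return None
    toolkit_major_minor = ".".join(toolkit_version.split(".")[:2])
    return built_for == toolkit_major_minor
```

For the setup reported here, `same_cuda_minor("torch==1.10.2+cu113", "11.6")` returns `False`, which is the mismatch being described.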

@RobinLiuZX
Author

Thank you for your prompt and patient replies. I will work carefully through this problem. @bottler

@bottler
Contributor

bottler commented Mar 17, 2022

I was expecting the VS downgrade to fix it even without changing the code, because that has worked for others.

(I don't think CUDA version is relevant. On one of the other issues I mused that it might be, but several reports have the same problem with very different CUDA versions.)

@dalton-omens

I looked through the other linked issues; in #1024 OP never confirmed that downgrading the compiler is what fixed this specific issue (Tuple template). Another user in that thread mentioned that the issue persisted regardless of compiler version. In #920 the user seemed to fix the tuple issue by messing around with torch/CUDA versions, no confirmation of downgrading VS. #876 does not have any mention of this Tuple error, but it is where the original downgrading VS solution comes from. There might be private conversations I can't see, but it seems like downgrading VS was solving other problems unless I'm missing something.

@bottler
Contributor

bottler commented Mar 17, 2022

Sorry, yes, I may be confused about versions. This error looks like a bad interaction between nvcc and pybind11. (We only use pybind11 in ext.cpp and no other translation units.) Another idea: several times in the past, we have had to be careful with headers to make sure that torch/extension.h is not included in any .cu files, and I think one of the reasons for that has been to avoid nvcc being presented with pybind11.

There are three .cu files which include utils/pytorch3d_cutils.h which includes torch/extension.h, and they are all from the last few months. I wonder if avoiding these solves the problem.
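One way to look for .cu files that pull in torch/extension.h through another header is to follow the #include lines under the source tree. The script below is a rough sketch, not a PyTorch3D tool: the function names are hypothetical, it only follows headers it can find under the given root (or next to the including file), and it ignores preprocessor conditionals.

```python
import re
from pathlib import Path

# Matches both #include <x.h> and #include "x.h" and captures the header name.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def direct_includes(path):
    """Return the header names that appear in #include lines of one file."""
    return INCLUDE_RE.findall(path.read_text(encoding="utf-8", errors="ignore"))


def reaches(path, root, target, seen=None):
    """True if `path` includes `target` directly or via headers found under `root`."""
    seen = set() if seen is None else seen
    if path in seen:  # already explored in this search, also guards against cycles
        return False
    seen.add(path)
    for name in direct_includes(path):
        if name == target:
            return True
        # Try the header relative to the source root, then to the including file.
        for candidate in (root / name, path.parent / name):
            if candidate.is_file() and reaches(candidate, root, target, seen):
                return True
    return False


def cu_files_reaching(root, target="torch/extension.h"):
    """List .cu files under `root` (as relative POSIX paths) that reach `target`."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.cu")
        if reaches(p, root, target)
    )
```

Run from a source checkout, `cu_files_reaching("pytorch3d/csrc")` would list the candidate files. Headers outside the root, such as the CUDA or torch headers themselves, are not followed.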

@bottler
Contributor

bottler commented Mar 17, 2022

Specifically:

  • remove #include <utils/pytorch3d_cutils.h> from ball_query/ball_query.cu, iou_box3d/iou_box3d.cu and sample_farthest_points/sample_farthest_points.cu
  • replace MAX_THREADS_PER_BLOCK with 1024 in sample_farthest_points/sample_farthest_points.cu.
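The two edits listed above could also be applied with a short script instead of by hand. This is a sketch under assumptions: the helper names are invented, the directory passed in is assumed to be the `pytorch3d/csrc` folder of a source checkout, and it assumes the three files need nothing else from pytorch3d_cutils.h. Back up or use version control before running anything like it.

```python
import re
from pathlib import Path

# The three files named in the comment above, relative to the csrc directory.
TARGETS = [
    "ball_query/ball_query.cu",
    "iou_box3d/iou_box3d.cu",
    "sample_farthest_points/sample_farthest_points.cu",
]

# A whole #include line for utils/pytorch3d_cutils.h, in either <> or "" form.
CUTILS_INCLUDE = re.compile(
    r'^[ \t]*#[ \t]*include[ \t]*[<"]utils/pytorch3d_cutils\.h[>"][^\n]*\n?',
    re.MULTILINE,
)


def strip_cutils_include(text):
    """Remove the #include of utils/pytorch3d_cutils.h from source text."""
    return CUTILS_INCLUDE.sub("", text)


def replace_max_threads(text, value=1024):
    """Replace the MAX_THREADS_PER_BLOCK identifier with a literal value."""
    return re.sub(r"\bMAX_THREADS_PER_BLOCK\b", str(value), text)


def patch_sources(csrc_dir):
    """Apply both edits in place and return the relative paths that changed."""
    csrc = Path(csrc_dir)
    changed = []
    for rel in TARGETS:
        path = csrc / rel
        if not path.is_file():
            continue  # skip files missing from this checkout
        original = path.read_text(encoding="utf-8")
        text = strip_cutils_include(original)
        if rel.startswith("sample_farthest_points/"):
            text = replace_max_threads(text)
        if text != original:
            path.write_text(text, encoding="utf-8")
            changed.append(rel)
    return changed
```

After patching, PyTorch3D still has to be rebuilt from source for the change to take effect.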

@dalton-omens

I can confirm that those changes to the .cu files have fixed this issue. Thanks for the help.
