Regarding CPU implementation of correlation function. #39
Comments
To make inference work on CPUs, you will have to convert the following CUDA code to something that runs on CPUs instead:
pytorch-pwc/correlation/correlation.py, lines 8 to 33 in cf0d2f2
pytorch-pwc/correlation/correlation.py, lines 35 to 103 in cf0d2f2
There is really nothing you need to be familiar with in terms of CuPy, the …
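One possible direction for such a conversion is to skip the CUDA kernels entirely and express the correlation with ordinary PyTorch tensor operations, which run on the CPU. The following is only a minimal sketch, not the repository's implementation. The function name `correlation_cpu`, the maximum displacement of 4 (giving 9 x 9 = 81 output channels), the averaging over channels, and the channel ordering are all assumptions that should be checked against the CUDA kernels referenced above before relying on the output.

```python
import torch
import torch.nn.functional as F


def correlation_cpu(first, second, max_disp=4):
    """Hypothetical CPU cost volume between two (B, C, H, W) feature maps.

    Returns a tensor of shape (B, (2 * max_disp + 1) ** 2, H, W).
    The displacement range and the mean over channels are assumptions.
    """
    height, width = first.shape[2], first.shape[3]
    # Zero-pad the second feature map on the left, right, top, and bottom.
    padded = F.pad(second, (max_disp, max_disp, max_disp, max_disp))
    costs = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            # Window of the padded map shifted by (dy - max_disp, dx - max_disp).
            shifted = padded[:, :, dy:dy + height, dx:dx + width]
            costs.append((first * shifted).mean(dim=1, keepdim=True))
    return torch.cat(costs, dim=1)


features = torch.randn(1, 8, 5, 6)
volume = correlation_cpu(features, features)
```

Because this sketch uses only differentiable tensor operations, autograd would provide the backward pass, so no hand-written gradient functions are needed. It is expected to be slower than the CUDA kernels.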
I run into a problem with the code above when tracing the model from PyTorch to TorchScript.
This fixes issue sniklaus#39 and allows TensorBoard's SummaryWriter.add_graph to work, although there are some TracerWarnings: `TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!`
I would like to ask whether there is a PyTorch implementation of this CUDA C code, because I have not managed to convert it to Python myself. If you have done this work before, I hope you can help me solve this problem.
Hi, thanks for the implementation. In my use case, I need to perform inference on the CPU. Inspecting your code in the file correlation.py, I gather that for this we need to call the extern C functions ourselves instead of invoking CuPy functions to do it for us. In your code, each C function is called like this:
cupy_launch('kernel_Correlation_rearrange', cupy_kernel('kernel_Correlation_rearrange', { 'input': second, 'output': rbot1 }))( grid=tuple([ int((n + 16 - 1) / 16), second.shape[1], second.shape[0] ]), block=tuple([ 16, 1, 1 ]), args=[ n, second.data_ptr(), rbot1.data_ptr() ] )
I am not familiar with CuPy code, so it would be helpful if you could explain these function calls a bit and give any clue about how to do the equivalent on the CPU. I understand that the args in each call are the arguments passed to the C function, but I am not sure what grid and block signify here. Probably they are not needed when CuPy is not used. As I only need to run on the CPU at test time, I guess I don't need to care about the updateGrad functions.
I will appreciate your help/suggestion regarding this.
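On the question of what grid and block signify: in CUDA, grid gives the number of thread blocks along each dimension and block gives the number of threads inside each block, so the kernel body runs once per (block, thread) pair. On a CPU the same work can be expressed as plain loops. The sketch below emulates such a launch sequentially in pure Python. The helper `launch_on_cpu` and the toy `copy_kernel` are hypothetical names made up for illustration and do not reproduce the actual rearrange kernel; `n` is assumed to be the number of pixels per channel.

```python
import itertools


def launch_on_cpu(kernel, grid, block, args):
    """Call `kernel` once per (block index, thread index) pair, sequentially."""
    for bz, by, bx in itertools.product(range(grid[2]), range(grid[1]), range(grid[0])):
        for tz, ty, tx in itertools.product(range(block[2]), range(block[1]), range(block[0])):
            kernel((bx, by, bz), (tx, ty, tz), block, *args)


def copy_kernel(block_idx, thread_idx, block_dim, n, channels, src, dst):
    # Same index arithmetic a CUDA kernel would use with a one-dimensional block.
    i = block_idx[0] * block_dim[0] + thread_idx[0]
    if i >= n:
        # Guard: the last block along x may contain surplus threads.
        return
    channel, sample = block_idx[1], block_idx[2]
    offset = (sample * channels + channel) * n + i
    dst[offset] = src[offset]


samples, channels, n = 2, 3, 20  # n assumed to be height * width of one channel
src = [float(v) for v in range(samples * channels * n)]
dst = [0.0] * len(src)

# int((n + 16 - 1) / 16) in the call above is a ceiling division: enough
# 16-thread blocks along x to cover all n elements.
grid = ((n + 16 - 1) // 16, channels, samples)
block = (16, 1, 1)
launch_on_cpu(copy_kernel, grid, block, (n, channels, src, dst))
```

Here the grid is (2, 3, 2), so 12 blocks of 16 threads cover 20 elements for each of the 3 channels and 2 samples, and the guard discards the 12 surplus threads in each second block. In a real CPU port the loops would normally be replaced by vectorized tensor operations rather than executed one element at a time.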