[WIP] Tracing / Scripting #138
Conversation
As this is very much work in progress, here are some quick notes:
- demo/trace_model.py has the current state

Lots of todos:
- there are warnings to be investigated, in particular for loops and boxlist sizes
- it needs some PyTorch JIT fixes
- clean up, including reverting unneeded changes
- round off the displaying
- do a C++ app
- try a different model/config
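For anyone skimming, the core mechanism the demo builds on can be sketched on a toy module. TinyHead below is a made-up stand-in; the real trace_model.py traces the full Mask R-CNN built from a config:

```python
import io
import torch
import torch.nn as nn

class TinyHead(nn.Module):
    """Stand-in for the real model; trace_model.py traces the full Mask R-CNN."""
    def forward(self, x):
        return torch.relu(x) * 2

model = TinyHead().eval()
example = torch.randn(1, 3, 4, 4)

# torch.jit.trace runs the module once and records the executed ops.
traced = torch.jit.trace(model, example)
out = traced(example)

# The traced module serializes, like the end_to_end_model.pt in the demo.
buf = io.BytesIO()
torch.jit.save(traced, buf)
```

The traced module behaves like the original for inputs of the same shape and dtype as the example input.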
I think the indentation in here is off.
Thanks, yes!
@t-vi Can you tell me why this error shows when I am running your demo trace_model.py? Thanks. The error is: OSError: /home/eric/Disk100G/githubProject/maskrcnn-benchmark/maskrcnn_benchmark/csrc/custom_ops/libmaskrcnn_benchmark_customops.so: undefined symbol: _ZN5torch3jit8ListType9ofTensorsEv
@Eric-Zhang1990 that symbol (aka torch::jit::ListType::ofTensors()) being undefined suggests that the libtorch you link against does not match the PyTorch build the custom ops were compiled for.
@t-vi Wonderful work you have done! Can you give a tutorial to reproduce it? I met nearly the same error as @Eric-Zhang1990 did. Thanks.
@xxradon I think it's premature to expect this to work without bumps, unfortunately. But of course, I appreciate that you are trying, and I'll be very happy for suggestions on how to improve the build process. One key aspect is that you have to make 100% sure that your libtorch exactly matches the one you use to build and in Python, and the headers need to be the right ones, too. Try pointing the build to it explicitly.
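As a rough way to check the "exactly matches" point, you can print where the Python-side torch lives and which version it is, and compare that against what the C++ build links. A minimal sketch; paths will differ per install:

```python
import os
import torch

# The shared libraries the Python extension uses live under torch/lib;
# the C++ side must link the very same ones (or an identical build).
torch_lib = os.path.join(os.path.dirname(torch.__file__), "lib")
print("torch version:", torch.__version__)
print("torch libs at:", torch_lib)
```

Comparing this path and version against the libtorch your CMake or g++ invocation picks up is a quick way to spot mismatches.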
@t-vi Thanks, your suggestion is right: I can run trace_model.py without mistakes and get the end_to_end_model.pt file. When I tested trace_model.cpp, the build was OK, but when I ran the demo there were some mistakes. I know why this error happens: libmaskrcnn_benchmark_customops.so is not added to the link step. But I added LINK_DIRECTORIES and target_link_libraries in CMakeLists.txt, and it did not work either.
Oh dear. I forgot to include that. I didn't actually use cmake, just a direct g++ invocation, and then I called it with the library path set accordingly. It's less than fancy, sorry about that.
Thanks for your reply, but I got the same mistake... Can you try using a CMakeLists.txt instead, or give a more specific tutorial?
Can you actually read the .pt from torch?
@t-vi Now I can trace and save the model, but when I run your trace_model.cpp file, it shows an error about nms, which is: Input types: Dynamic, Dynamic, float
@Eric-Zhang1990 Did you find and link to the libmaskrcnn_benchmark_customops.so (see the g++ invocation above; I'm looking to provide cmake)?
With the latest set of changes you should get the custom ops built for you. The C++ demo now has a CMakeLists.txt, but it isn't built for now (for lack of a good place to put the binary).
With recent JIT improvements, we can use the op directly
Thank you, Francisco, for the hint!
        self.spatial_scale = spatial_scale
        self.sampling_ratio = sampling_ratio

    def forward(self, input, rois):
        if torch._C._get_tracing_state():  # we cannot currently trace through the autograd function
            return torch.ops.maskrcnn_benchmark.roi_align_forward(
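The guard in the snippet above can be illustrated in isolation. This sketch uses torch.neg as a stand-in for the custom roi_align_forward op, so both branches compute the same value:

```python
import torch

def forward_compat(x):
    # While tracing, dispatch to an op the JIT can record directly;
    # in eager mode, fall back to the ordinary Python path.
    # torch.neg here stands in for the custom op.
    if torch._C._get_tracing_state():
        return torch.neg(x)
    return -x

x = torch.ones(3)
eager_out = forward_compat(x)                    # Python branch
traced = torch.jit.trace(forward_compat, (x,))   # op branch is recorded
```

Since the condition depends only on the tracing state, not on tensor data, the trace deterministically records the op branch.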
fmassa
Nov 13, 2018
Contributor
question: do we need to register the backwards for the op in C++ then so that we can perform training properly using the same codepath?
t-vi
Nov 13, 2018
Author
The current obstacle here is that we cannot trace through the Python autograd function.
I'm not aware of a way to register the derivative so we could go through the op directly during training as well.
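For reference, the kind of Python autograd function being discussed looks roughly like this. Scale2 is a toy stand-in (the real one wraps the C++/CUDA ROIAlign kernels); it trains fine in eager mode, but per the discussion above, the tracer cannot currently trace through such a function:

```python
import torch
from torch.autograd import Function

class Scale2(Function):
    """Toy stand-in for the ROIAlign autograd function: forward and
    backward are hand-written in Python."""
    @staticmethod
    def forward(ctx, x):
        return x * 2

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out * 2

x = torch.ones(3, requires_grad=True)
y = Scale2.apply(x).sum()
y.backward()  # gradients flow through the hand-written backward
```

This is why training keeps the Python codepath while tracing takes the custom-op branch.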
t-vi
Nov 14, 2018
Author
Peter responded elsewhere, and "not quite yet", but it is a to-do on his list.
@t-vi And for anyone wanting to use this on GPU: I managed to export an end-to-end model, so here are the instructions. To install, follow the instructions in INSTALL.md but do not build the project.
Now go to the demo folder and put two images, test1.jpg and test2.jpg, in this folder. Create a new file or replace trace_model.py with this code:
This will trace the model with CUDA enabled. Two outputs are provided; the first one is just the tracing of the model. If you want to output an image in OpenCV to plot the results, use the example already provided in the cpp folder. The speed is 0.28 s on average on a 2080 Ti, much faster than the CPU version, which runs at about 2 s. If you want to only use the model, you can access the contents mask, scores, bboxes and labels by importing the "traced.pt" model and using:
Hope this helps someone!
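In general terms, loading a saved trace back and unpacking its outputs follows this pattern (shown on a toy module with a tuple output; the exact output structure of the exported Mask R-CNN depends on the tracing script):

```python
import io
import torch

class Toy(torch.nn.Module):
    def forward(self, x):
        # Multiple outputs come back from a traced module as a tuple,
        # analogous to mask/scores/bboxes/labels from the exported model.
        return x + 1, x * 2

traced = torch.jit.trace(Toy().eval(), torch.zeros(2))
buf = io.BytesIO()
torch.jit.save(traced, buf)
buf.seek(0)

loaded = torch.jit.load(buf)  # the same call works with a "traced.pt" path
a, b = loaded(torch.ones(2))
```

torch.jit.load accepts either a file path or a file-like object, so the in-memory buffer here is interchangeable with the saved .pt file.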
Thank you for sharing! If possible, could you share your C++ inference code with us? Anyway, thank you so much again for your excellent work.
@sukisleep I actually work on a C++ framework, so it will have some differences when you want to implement it in your own code. Here is what I did: First, download the appropriate version of the C++ PyTorch. Since the Python side of things only works on PyTorch 1.0.1, and since I'm using CUDA 10, that would be "https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-1.0.1.zip". Once you have that, you need to choose your version of C++. My framework is in C++14, so it won't work with the downloaded package out of the box. To fix the error, go to libtorch/share/cmake, and in each of those files, if you find something like: or
First you will need CUDA and OpenCV of course, so you need these lines:
You also need to import libtorch, so use this line. Now let's include the relevant directories of our libs:
And finally, let's link the libraries. Now you can compile your code with pytorch libraries. So in your File.h you want to import the following:
in File.h you want to define some variables:
Now for the cpp, File.cpp
And once that is done, the rest is very straightforward.
And that is all, you have your output in C++.
@nicolasCruzW21 Thank you so much for the very detailed description. Thanks.
@nicolasCruzW21 Glad you got it worked out. Can you elaborate on how to get the libmask_rcnn_customops.so? Do you have a full project with CMakeLists.txt and the C++ inference files?
Any update on the JIT tracing of BoxList?
Thanks for your work! I am a newcomer. But when I installed the environment according to INSTALL.md, I hit an error when compiling the
My settings are:
Thanks for the discussion section; I can get the result with libtorch, but I have a problem when I test an image which has no objects (my work is object detection). Is this a code bug or my fault?
I solved it by making a new dir named
Hi @t-vi, I also met the problem:
I met the same problem for two reasons: first, my
It seems that BoxList does not support tracing. @imranparuk
Hi @nicolasCruzW21,
To solve it, you can set the following. Hope it can help you.
@jojojo29 I have the same problem with nms on CUDA; for some reason, I must build it with CMake. Have you already got nms working on CUDA? Could you share the way to build nms for both CPU and GPU with CMake?
@xxradon Hi, I just came across this. Can you elaborate a little bit more on how you fixed the libtorch problem? Many thanks.
So given that torchvision is the place to go for things like those explored here, I'm closing this PR.
Thanks a lot for all your work and help, Thomas!
@t-vi, what is your PyTorch version? Torch 1.3.1
How did you get Mask R-CNN running on Android? It is amazing. Can you share the key steps for this or a repo? Thanks.
For anyone still interested in this, I made a repo compatible with PyTorch 1.5 to export the CUDA-enabled versions of the models rather than the CPU versions. It also fixes the issue of crashing when no objects are in the image. The code is ugly since this is a port of the original 1.0.1 version. Thanks to @t-vi again for all the help and for developing the vast majority of the code. You can find the code here: https://github.com/nicolasCruzW21/maskrcnn-Tracing.git I hope it's useful.
@t-vi were you able to run the Mask R-CNN conversion as hinted at on https://lernapparat.de/pytorch-android/? If so, could you kindly give us a hint as to what would be the way to do this?
Yes, that was fun in 2018. I'd recommend using the TorchVision-provided code or @nicolasCruzW21's branch over anything in this PR.
Great work.
With this patch you can get a traced/scripted Mask R-CNN model. To facilitate discussion, and maybe also to have a test case for the remaining features wanted in the JIT, I am putting this out in its current state rather than working in private towards a mergeable patch.
I appreciate your feedback but note that it's not quite ready yet.
We have:
So all of this is very raw, and there are the classes of hacks described in issue #27, in particular
Lots of Todos:
As for my use case: It also would work on Android if PyTorch was there yet.