
Upgrade pytorch version #1

Open: jimjam-slam wants to merge 11 commits into master

Conversation

jimjam-slam
Owner

No description provided.

@cyrillkuettel

@jimjam-slam Looks promising; I'd like to try it out myself. If you can, provide a download link to the model and its properties (input size, expected output, etc.).

@jimjam-slam
Owner Author

> @jimjam-slam Looks promising; I'd like to try it out myself. If you can, provide a download link to the model and its properties (input size, expected output, etc.).

Thanks! I'll try and get it to you tonight :)

@jimjam-slam
Owner Author

jimjam-slam commented Feb 13, 2023

(I might also try to move the plugin back to PyTorch 1.10.0 and re-train my model in the next few days to see if that ameliorates the inference-time issues I'm having. As you can see in commits 3a8f309..c9bcb1a, I experimented with following the instructions in the warning, but I'm beginning to get out of my depth modifying the Objective-C in TorchModule.mm.)

@cyrillkuettel

You're using an object detection model, right? So the structure of the output tensor is slightly different; I think this is the root cause of the issue.

- (NSArray<NSNumber*>*)detectImage:(void*)imageBuffer {
    try {
        // Wrap the raw image buffer as a 1 x 3 x H x W float tensor.
        at::Tensor tensor = torch::from_blob(imageBuffer, { 1, 3, input_height, input_width }, at::kFloat);
        // InferenceMode skips autograd bookkeeping for inference-only work.
        c10::InferenceMode guard;
        auto outputTuple = _impl.forward({ tensor }).toTuple();
        auto outputTensor = outputTuple->elements()[0].toTensor();
        float* floatBuffer = outputTensor.data_ptr<float>();
        // ... copy floatBuffer into an NSArray<NSNumber*> and return it ...
    } catch (const std::exception& exception) {
        NSLog(@"%s", exception.what());
    }
    return nil;
}

Notice also c10::InferenceMode guard instead of the AutoNonVariableTypeMode stuff, which I think is what we need. It enables some performance optimizations, because "only" inference is needed.

I'm looking forward to trying it out myself.

@jimjam-slam
Owner Author

@cyrillkuettel Thanks! I was so close! 😆

Here's a link to the model: https://drive.google.com/file/d/1q1Kd-tWAtO24um-fMPeLyjY4oVDxPK8f/view?usp=sharing

It is indeed an object detection model: it detects four-sided (tetrahedral) dice in roughly overhead images and classifies them according to the vertex facing up (toward the camera). It has four classes corresponding to the faces of the die.

I'm training it on 1440x1080 images, although I don't recall the training notebook actually asking for image dimensions (perhaps they're inferred, or perhaps the PyTorch training scripts I'm using have them set somewhere and I'm accidentally overriding them).

Here's a sample prediction from one of my images in the notebook:

[{'boxes': tensor([[ 679.0196, 1102.9960,  839.3671, 1259.9280],
          [ 455.1194,  517.3795,  686.8600,  760.6394],
          [ 303.9699,  637.7855,  579.9921,  898.0324],
          [ 270.4033, 1128.6268,  470.5105, 1362.5807],
          [ 686.2780,  646.7851,  881.2689,  833.4600],
          [ 451.2747,  202.4965,  756.0739,  535.2960],
          [ 275.8917, 1133.9680,  466.4841, 1364.4468],
          [ 405.4605,   30.4028,  628.8203,  283.3058],
          [ 449.3770,  527.9981,  681.7755,  771.1302],
          [ 400.0968,   30.0571,  628.3748,  289.8716],
          [ 304.8271,  637.5167,  587.3419,  903.3002],
          [ 449.9048,  202.5068,  760.8394,  544.9302],
          [ 299.5206,  619.2365,  613.8334,  894.6346],
          [ 685.5987,  647.9404,  882.2284,  828.3013],
          [ 677.8384, 1100.2793,  842.8384, 1260.5930],
          [ 684.3727,  644.7953,  881.7263,  830.9077],
          [ 403.6640,   25.6778,  627.4812,  287.7680],
          [ 440.2120,  205.8120,  761.3144,  539.5913],
          [ 445.1892,  197.7191,  757.5585,  569.7745],
          [ 308.2182,  625.9140,  588.1372,  914.2650],
          [ 452.2595,  522.1450,  683.6755,  775.7518],
          [ 677.2131, 1100.3662,  842.1466, 1259.1844],
          [ 436.5467,  537.7557,  683.0253,  813.1878]], device='cuda:0'),
  'labels': tensor([4, 1, 3, 1, 2, 2, 4, 2, 4, 3, 2, 3, 4, 4, 2, 3, 4, 4, 1, 1, 3, 3, 2],
         device='cuda:0'),
  'scores': tensor([0.8661, 0.7896, 0.7790, 0.7025, 0.6438, 0.6340, 0.5694, 0.5649, 0.5066,
          0.4572, 0.3635, 0.3559, 0.3555, 0.3157, 0.3099, 0.2483, 0.2300, 0.1740,
          0.1014, 0.0967, 0.0613, 0.0606, 0.0530], device='cuda:0')}]
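
In case it's useful, here's a rough sketch of how I read that structure in Python (`prediction` stands for the list above; the 0.5 cut-off is arbitrary):

# `prediction` is the list printed above: one dict per input image.
pred = prediction[0]
keep = pred["scores"] > 0.5           # arbitrary confidence cut-off
boxes  = pred["boxes"][keep]          # (N, 4) boxes in xyxy pixel coordinates
labels = pred["labels"][keep]         # die-face classes 1-4
scores = pred["scores"][keep]

Most of the overlapping low-score boxes drop out once you threshold, which is why the raw list above looks so crowded.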

@cyrillkuettel

cyrillkuettel commented Feb 14, 2023

@jimjam-slam Alright, thanks! I'll give it a try tomorrow. I'm guessing dn-set2-d4-test-full-CPU-scripted.pt is the one I need for mobile.

  • input width/height: most commonly it's 224x224 or 512x512, but not always. It might be inferred; I'm sure there's a way to get this from Python (see the sketch after this list).
  • I forgot to ask: did you use any normalization? If so, I'll have to adapt the mean/RGB values.
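
For the input size, here's a sketch of how to read it off the saved model (assuming a standard torchvision detection model such as Mask R-CNN, where resizing and normalization live in model.transform; this should answer the normalization question too):

import torch

# Load the full (non-scripted) checkpoint on CPU. This assumes the model's
# class definitions are importable in the current environment.
model = torch.load("dn-set2-d4-test-full.pt", map_location="cpu")
model.eval()

# GeneralizedRCNNTransform prints its Normalize(mean=..., std=...) and
# Resize(min_size=..., max_size=...) settings; torchvision's detection
# defaults are min_size=800 and max_size=1333.
print(model.transform)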

@jimjam-slam
Owner Author

@cyrillkuettel Yeah, dn-set2-d4-test-full-CPU-scripted.pt is the one I would expect to use with this Flutter plugin. The full export code is:

import torch

# `model` is already trained on the GPU
torch.save(model, "./d4/dn-set2-d4-test-full.pt")

# remap to cpu
cpu_device = torch.device("cpu")
saved_model = torch.load("./d4/dn-set2-d4-test-full.pt",
    map_location = cpu_device)

ts_model = torch.jit.script(saved_model)
torch.jit.save(ts_model, "./d4/dn-set2-d4-test-full-CPU-scripted.pt")

from torch.utils.mobile_optimizer import optimize_for_mobile

# script and optimize (but still for torch, not torch lite? not sure)
scripted_module = torch.jit.script(saved_model)
optimized_model = optimize_for_mobile(scripted_module)
optimized_model.save("./d4/dn-set2-d4-test-full-CPU-scripted-optimized.pt")

# saving for the PyTorch Lite interpreter produces an error
# optimized_model._save_for_lite_interpreter("./d4/dn-set2-d4-test-lite.ptl")
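
For completeness, a quick smoke test of the scripted file looks roughly like this (a sketch; the dummy input shape is arbitrary):

import torch

model = torch.jit.load("./d4/dn-set2-d4-test-full-CPU-scripted.pt")
model.eval()

with torch.inference_mode():
    # A scripted torchvision detection model takes a list of 3xHxW images
    # and always returns a (losses, detections) tuple.
    losses, detections = model([torch.rand(3, 1080, 1440)])

print(detections[0]["boxes"].shape)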

I didn't add any image re-scaling or normalisation code, but I did find this comment in the original notebook:

> Note that we do not need to add a mean/std normalization nor image rescaling in the data transforms, as those are handled internally by the Mask R-CNN model.

I'll dig out the link to the original notebook!

@cyrillkuettel

cyrillkuettel commented Feb 14, 2023

I tried to load it, but unfortunately I encountered various stubborn build issues on Android.
I'm specifically stuck on this one, which happens at build time if I include torchvision:

https://github.com/cyrillkuettel/flutter_pytorch_mobile

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':app:mergeDebugNativeLibs'.
> A failure occurred while executing com.android.build.gradle.internal.tasks.MergeNativeLibsTask$MergeNativeLibsTaskWorkAction
   > 2 files found with path 'lib/arm64-v8a/libfbjni.so' from inputs:
      - /home/cyrill/.gradle/caches/transforms-3/dd0f51b3138949e6b07ed71e59ea7dbd/transformed/jetified-pytorch_android-1.11/jni/arm64-v8a/libfbjni.so
      - /home/cyrill/.gradle/caches/transforms-3/7df84e79f791405b8ae94b4909f0f50f/transformed/jetified-torchvision_ops-0.13.0/jni/arm64-v8a/libfbjni.so
     If you are using jniLibs and CMake IMPORTED targets, see
     https://developer.android.com/r/tools/jniLibs-vs-imported-targets
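
The usual workaround for two AARs shipping the same .so seems to be telling Gradle which copy to keep via packagingOptions in the app's build.gradle (untested sketch; only safe if the two libfbjni.so builds are binary-compatible):

android {
    packagingOptions {
        // Keep the first libfbjni.so found across ABIs; drop the duplicate.
        pickFirst 'lib/**/libfbjni.so'
    }
}

It might also be worth checking the dependency versions: the log shows pytorch_android-1.11 next to torchvision_ops-0.13.0, and torchvision 0.13 normally pairs with PyTorch 1.12, so the mismatch itself could be the real problem.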
     

@jimjam-slam
Owner Author

I haven't tried an Android build yet, unfortunately, although I'd like to be able to do both for my app. I'll try to dive into it, but it might not be until the weekend! Thanks for pushing through this, though!

@cyrillkuettel

Actually, forget what I said previously about the output tensor. I looked at it again: the model output is a dictionary. (That follows from the sample prediction you've provided above.)

Since we know it's a dictionary with the keys 'boxes', 'labels' and 'scores', we can extract the values by key.
Here is the code:

https://github.com/pytorch/ios-demo-app/blob/0086dc22468205ca5a841c89e625ebc1e286aa2c/D2Go/D2Go/Inference/InferenceModule.mm#L47-L51
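
Adapted to this model, the extraction would look roughly like this (a sketch, not the verbatim linked code; when scripted, a torchvision detection model returns a (losses, detections) tuple, where detections holds one dict per image):

// Unpack boxes / labels / scores from the scripted model's output.
auto outputTuple = _impl.forward({ tensor }).toTuple();
auto detections  = outputTuple->elements()[1].toList().get(0).toGenericDict();
auto boxes  = detections.at("boxes").toTensor();   // (N, 4) xyxy coordinates
auto labels = detections.at("labels").toTensor();  // (N,) class ids
auto scores = detections.at("scores").toTensor();  // (N,) confidences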

@jimjam-slam
Owner Author

jimjam-slam commented Feb 15, 2023

Mmm, absolutely! If using c10::InferenceMode works and I can get it producing some inference output on mobile (currently it's just null, but I haven't had time to implement that fix yet), I'm feeling pretty optimistic! 🤞🏻

@jimjam-slam
Owner Author

(I incidentally only heard about D2Go a few days ago, so I'm keen to give that a whirl too!)

@akheelnazim

Good luck @jimjam-slam @cyrillkuettel. I'm a beginner at all this, trying to use my YOLOv8 model in my Flutter application. I've encountered some issues with the plugin, so I'm just following along with this deep conversation in hopes of a working plugin 😆

@jimjam-slam
Owner Author

Good luck, @bigbaliboy! I'm still hoping to have some success, but man, PyTorch versioning sucks. Just hard to get the bandwidth for a side project! 😅 I'm also hoping to give Detectron a go (or, if all else fails, make native apps).

@akheelnazim

akheelnazim commented Apr 3, 2023

@jimjam-slam I have my YOLOv8 model running with the flutter_vision package as a TFLite model, and my YOLOv5 model running with the flutter_pytorch package as a TorchScript file. Neither has iOS support yet, but both work fine on Android. The latter is waiting on someone to contribute the iOS part in a PR. Good luck with Detectron, and keep us updated on how it goes 🙌🏻
