VideoPose3D on Android #160
Comments
I was quietly following this issue. Did you find a solution?
I haven't had much time to look into it. I was playing around with TFLite Pose Estimation (https://www.tensorflow.org/lite/models/pose_estimation/overview); it runs faster than Detectron/Detectron2, which makes it a better fit for mobile, and you can tune it to trade accuracy for speed. I believe it'll require some post-processing to get its output into the right format for input into VideoPose3D (this repo might provide some insight: https://github.com/darkAlert/VideoPose3d_with_Detectron2, or maybe the inference scripts in this repo). As a next step, I was thinking of following the process described in the PyTorch Mobile documentation with the TemporalModelBase in model.py, but I haven't gotten around to it.
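For the post-processing step, one piece is normalizing the 2D keypoints into the coordinate convention VideoPose3D expects. A minimal sketch — the function mirrors `normalize_screen_coordinates` from VideoPose3D's `common/camera.py`, while the Posenet output array here is a made-up placeholder:

```python
import numpy as np

def normalize_screen_coordinates(X, w, h):
    # Same convention as VideoPose3D's common/camera.py: map pixel
    # coordinates so x spans [-1, 1] while preserving aspect ratio.
    assert X.shape[-1] == 2
    return X / w * 2 - [1, h / w]

# Hypothetical Posenet output: 17 COCO keypoints in pixel coordinates
# for one frame of a 640x480 video (all at the image center here).
keypoints = np.array([[320.0, 240.0]] * 17, dtype=np.float32)
normed = normalize_screen_coordinates(keypoints, w=640, h=480)
print(normed.shape)  # (17, 2)
```

Keypoint ordering also matters: the `pretrained_h36m_detectron_coco.bin` checkpoint was trained on COCO-format 2D detections, so Posenet's COCO joint order should line up without remapping.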
I've been playing with the TFLite 2D model for a while and am going to try plugging it into VideoPose3D. I'll post updates here.

UPDATE: I've loaded VideoPose3D into Android as a TorchScript module, along with the Posenet library. I'll be drawing some visualizations and trying to improve runtime.

UPDATE: So far I've swapped out Detectron for Posenet and got the visualization working on my computer. See my repository here. You need to download the Posenet lite model. From the root directory of my repo, you can run:

```
python3 run_3d_vis.py -d custom -k myvideos -arc 3,3,3,3,3 -c checkpoint \
    --evaluate pretrained_h36m_detectron_coco.bin --render \
    --viz-subject (relative path to video) --viz-action custom --viz-camera 0 \
    --viz-video (relative path to video) \
    --viz-output (relative path to desired output location) --viz-size 6
```

to get the visualization. You can also now do live real-time inference (still with Posenet as the 2D tracker) using a webcam on a PC (see instructions in the repo). I think the accuracy of Posenet can be tuned up; right now it's a little glitchy.
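For anyone curious about the TorchScript step, the general flow is trace-then-save. A rough sketch with a stand-in network — the real model and its input shape come from VideoPose3D's `common/model.py`, so treat the layer sizes below as placeholders:

```python
import torch

# Stand-in for VideoPose3D's temporal model (the real one is built from
# common/model.py). Input layout here: (batch, channels, frames).
model = torch.nn.Sequential(
    torch.nn.Conv1d(34, 128, kernel_size=3),  # 17 joints * 2 coords in
    torch.nn.ReLU(),
    torch.nn.Conv1d(128, 51, kernel_size=1),  # 17 joints * 3 coords out
)
model.eval()

# Trace with a dummy 243-frame window of 2D keypoints.
example = torch.randn(1, 34, 243)
with torch.no_grad():
    traced = torch.jit.trace(model, example)

traced.save("videopose3d_mobile.pt")  # load on Android via Module.load(...)
```

Tracing records the ops for one concrete input shape, which suits a fixed-length sliding window; if the frame count varies at runtime, `torch.jit.script` is the safer route.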
I made the app, but the 3D inference still runs too slowly. Does anyone have ideas to speed it up? I will try (a) only running it every couple of frames, (b) figuring out how to run it on the GPU, and (c) figuring out how to fuse some of the convolutional/batchnorm layers of the model using torch.quantization. Also, is PyTorch on Android guaranteed to use the maximum number of threads? I saw something online about org.pytorch.Module.setNumThreads, but I can't find that method.
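On (c), `torch.quantization.fuse_modules` can fold a BatchNorm (and a following ReLU) into the preceding conv. A toy sketch — Conv2d is used here because Conv1d fusion support varies by PyTorch version, and the `"0"`, `"1"`, `"2"` names are just the `Sequential` submodule indices, not anything from VideoPose3D:

```python
import torch

# Toy conv/batchnorm/relu stack; VideoPose3D's real blocks use Conv1d.
m = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3),
    torch.nn.BatchNorm2d(16),
    torch.nn.ReLU(),
)
m.eval()  # fusion requires eval mode

# Fold the BatchNorm and ReLU into the conv; the fused copy should
# produce (numerically) the same outputs with fewer kernel launches.
fused = torch.quantization.fuse_modules(m, [["0", "1", "2"]])

x = torch.randn(1, 3, 8, 8)
print(torch.allclose(m(x), fused(x), atol=1e-4))  # True
```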
Great job. Testing it out right now; it looks very promising. Performance-wise, the documented approach for better performance on mobile is close to option (c): https://pytorch.org/tutorials/recipes/mobile_perf.html?highlight=mobile.
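The key call in that recipe is `optimize_for_mobile`, which runs mobile-oriented passes (conv/bn folding, dropout removal, etc.) over a scripted module. A minimal sketch with a placeholder network, not the actual VideoPose3D model:

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

# Placeholder network; substitute the TorchScript-ed VideoPose3D model.
net = torch.nn.Sequential(
    torch.nn.Conv2d(2, 8, kernel_size=3),
    torch.nn.BatchNorm2d(8),
    torch.nn.ReLU(),
)
net.eval()

scripted = torch.jit.script(net)
optimized = optimize_for_mobile(scripted)  # fuses ops, strips training-only bits
optimized.save("model_mobile.pt")          # ship this file in the APK assets
```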
Does anyone know whether this model would generally run faster or slower than the model outlined in the 3d-pose-baseline paper by Martinez et al.? I'm wondering if I should try 3d-pose-baseline to speed things up, but I don't have enough background knowledge to know if it's lighter. All I know is that the paper claims 3d-pose-baseline runs inference in 2 ms on a Titan Xp.
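One rough proxy is parameter count. A back-of-the-envelope sketch of a Martinez-style network — widths from the paper (1024-unit linear layers, two residual blocks of two layers each), with the layer list simplified and residual adds omitted since they carry no parameters:

```python
import torch

def count_params(m):
    return sum(p.numel() for p in m.parameters())

# Simplified 3d-pose-baseline-like stack (Martinez et al.): an input
# projection from 16 2D joints, two residual blocks of two 1024-wide
# linear layers, and an output layer predicting 16 3D joints.
baseline = torch.nn.Sequential(
    torch.nn.Linear(16 * 2, 1024),
    torch.nn.Linear(1024, 1024), torch.nn.Linear(1024, 1024),
    torch.nn.Linear(1024, 1024), torch.nn.Linear(1024, 1024),
    torch.nn.Linear(1024, 16 * 3),
)
print(f"~{count_params(baseline) / 1e6:.1f}M parameters")  # ~4.3M
```

Counting the parameters of this repo's dilated temporal-convolution model the same way would answer the "is it lighter" question directly, though parameter count is only a proxy for runtime: conv layers over a 243-frame receptive field do more work per parameter than a per-frame MLP.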
How would one go about running this model for in-the-wild use on an Android device?
I imagine it would be a process similar to the one described here: https://pytorch.org/mobile/android/.
I just wanted to know how feasible this would be, or if it has already been done. I imagine I would have to use a more lightweight 2D pose estimator than Detectron or Detectron2 to keep the runtime reasonable.
Thanks.