-
Notifications
You must be signed in to change notification settings - Fork 7.2k
Add example for optical flow visualizaition and RAFT #5316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💊 CI failures summary and remediationsAs of commit 0352a47 (more details on the Dr. CI page): ✅ None of the CI failures appear to be your fault 💚
🚧 1 ongoing upstream failure:These were probably caused by upstream breakages that are not fixed yet.
This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @oke-aditya this looks great so far! Made a quick review as discussed offline
gallery/plot_optical_flow.py
Outdated
from torchvision.utils import flow_to_image | ||
img = flow_to_image(flow) | ||
|
||
img = img.squeeze(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to show the flow image and frame_1 side by side :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly wanted to do that :( Faced some errors, will try again.
gallery/plot_optical_flow.py
Outdated
|
||
""" | ||
|
||
# sphinx_gallery_thumbnail_path = "../../gallery/assets/optical_flow_thumbnail.png" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might need to remove the thumbnail, I'm not sure we have the rights of the optical_flow_thumbnail.png
image that was uploaded. But I'm sure we can use on of the images that this example will generate!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's pending. Haven't worked much on this yet, will try to complete over this week.
Not much well over the weekend 😢 will try to finish this in few days |
@oke-aditya thank you so much for your work so far. I took care of polishing the example a bit, here are the latest rendered docs: https://1177465-73328905-gh.circle-artifacts.com/0/docs/auto_examples/plot_optical_flow.html#sphx-glr-auto-examples-plot-optical-flow-py LMK what you think of it! |
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fantastic, awesome. Few comments ❤️
# GIF using ffmpeg with e.g.: | ||
# | ||
# ` | ||
# ffmpeg -f image2 -framerate 30 -i predicted_flow_%d.jpg -loop -1 flow.gif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can put the rendered GIF here? (If sphinx can render it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure sphinx can. It's a good idea though, I'll do it once I find a way to reduce the size of the gif to a decent size (the ones i have right now are 300MB...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we can.
https://gifs-as-documentation.readthedocs.io/en/latest/
We need to compress the GIF though :( Maybe possible by running in very small video size say 120x60? For 3 seconds?
|
||
|
||
plt.rcParams["savefig.bbox"] = "tight" | ||
# sphinx_gallery_thumbnail_number = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find the thumbnail. You, sure it's there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this tells sphinx-gallery to use the second image as the thumbnail instead of the first one: https://1177465-73328905-gh.circle-artifacts.com/0/docs/generated/torchvision.models.optical_flow.raft_large.html#torchvision.models.optical_flow.raft_large
# [-1, 1]. The frames we got from :func:`~torchvision.io.read_video` are int | ||
# images with values in [0, 255], so we will have to pre-process them. We also | ||
# reduce the image sizes for the example to run faster. Image dimension must be | ||
# divisible by 8. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do the image dimensions need to be divisble by 8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a hardcoded constraint within the model, the feature extractor downsamples the images by 8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My doubt is it hard necessicity that image sizes should be divisible by 8. Or its adjusted by the model.
The text description above says. Image sizes must be divisible by 8. Meaning that the model does not adjust.
Clearer way can be. Image sizes divisible by 8 are processed faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really a hardcoded constraint and the model cannot accept images that aren't divisible by 8 (even if it wanted to):
torch._assert((h % 8 == 0) and (w % 8 == 0), "input image H and W should be divisible by 8") |
It has to be exactly divisible by 8 because we first downsample the inputs by 8, predict a downsampled flow, and then upsample the predicted flow by a factor of 8:
vision/torchvision/models/optical_flow/_utils.py
Lines 26 to 32 in 74a1efc
def upsample_flow(flow, up_mask: Optional[Tensor] = None): | |
"""Upsample flow by a factor of 8. | |
If up_mask is None we just interpolate. | |
If up_mask is specified, we upsample using a convex combination of its weights. See paper page 8 and appendix B. | |
Note that in appendix B the picture assumes a downsample factor of 4 instead of 8. | |
""" |
If the image wasn't a multiple of 8 to begin with, we wouldn't be able to upsample the flow to the right dimensions.
The fact that it's 8
is somewhat arbitrary (we could downsample by 4 and upsample by 4) but a) this would not follow the paper and b) we would still require images to be divisible by an integer N in general.
Co-authored-by: Aditya Oke <47158509+oke-aditya@users.noreply.github.com>
This reverts commit 2c7a468.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your help @oke-aditya , LGTM, I'll merge when the docs are built!
Lessssss goooooooooooooooo. Yay |
Summary: * Start adding example * Add thumbnail and text * Replace video * Improve * Change default weights of RAFT model builders * WIP * WIP * update handle_legacy_interface input * lots of stuff * Oops, wrong default * Typo * NITs * Reduce image size * Update gallery/plot_optical_flow.py * Remove link to profile * Update gallery/plot_optical_flow.py * Address comments * Nits * Revert "Remove link to profile" This reverts commit 2c7a468. Reviewed By: NicolasHug Differential Revision: D34140253 fbshipit-source-id: e3a129d641335a38ac5c5e3299824e1794c7bb52 Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com> Co-authored-by: Aditya Oke <47158509+oke-aditya@users.noreply.github.com> Co-authored-by: Nicolas Hug <nicolashug@fb.com> Co-authored-by: Nicolas Hug <contact@nicolas-hug.com> Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
Closes #5309
Thanks a lot to @NicolasHug for taking it over.