Recommendation: Multiple segmentation runs for each frame #65
We have basically three fields where we could apply techniques to allow for multiple passes:
Let's start with the third topic: downsample the original input image until it fits into the net completely. With that coarse detection mask you'd then run higher-resolution passes to refine the areas where the mask reports detections. Priorities could be ordered so that branches of the detection tree containing detections from a previous run are favoured when choosing the next section to analyze. Thus you could potentially process the whole image in full detail, but shortcut at any point where there are no more regions in which something was previously detected. Every now and then you'd let processing run until all areas are completed, and if any new area pops up you could use this to update the processing order, thus including the newly found sections. (See the sketch below.)

The second is my suggestion from my comment in #58: sweep over the whole image so that you cover all of it, but also ensure some overlap (~30-50%) between sections, to handle situations where using only non-overlapping areas would cause a misdetection.

The first point from above could finally be used to enhance the resolution of the final mask.

All in all this would require the segmentation step to be parallelized and to allow for "save points" in processing, so that further processing can be cancelled at any point. Additionally we'd probably need some sort of task queueing for this?
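A rough sketch of that coarse-to-fine idea, assuming a hypothetical `segment()` call that maps an image to a float mask at a fixed net input size (stubbed out here so the snippet runs standalone; `NET_SIZE` and `max_tiles` are assumptions, not project parameters):

```python
import heapq
import cv2
import numpy as np

NET_SIZE = 256  # assumed fixed input resolution of the segmentation net

def segment(img):
    # Stand-in for the real network call (hypothetical): returns a float mask
    # of shape (NET_SIZE, NET_SIZE) in [0, 1].
    return np.zeros((NET_SIZE, NET_SIZE), dtype=np.float32)

def coarse_to_fine(frame, max_tiles=16):
    h, w = frame.shape[:2]

    # Pass 1: downsample the whole frame so it fits into the net completely.
    coarse_mask = segment(cv2.resize(frame, (NET_SIZE, NET_SIZE)))
    full_mask = cv2.resize(coarse_mask, (w, h), interpolation=cv2.INTER_LINEAR)

    # Queue full-resolution tiles, prioritised by how much the coarse mask
    # already detected inside each tile (most "interesting" tiles first).
    queue = []
    for ty in range(0, h, NET_SIZE):
        for tx in range(0, w, NET_SIZE):
            score = float(full_mask[ty:ty + NET_SIZE, tx:tx + NET_SIZE].mean())
            heapq.heappush(queue, (-score, ty, tx))

    # Pass 2: refine tiles until the budget is spent; this is the point where
    # a real implementation could pause at a "save point" or be cancelled.
    processed = 0
    while queue and processed < max_tiles:
        _, ty, tx = heapq.heappop(queue)
        tile = frame[ty:ty + NET_SIZE, tx:tx + NET_SIZE]
        th, tw = tile.shape[:2]
        refined = segment(cv2.resize(tile, (NET_SIZE, NET_SIZE)))
        full_mask[ty:ty + th, tx:tx + tw] = cv2.resize(refined, (tw, th))
        processed += 1

    return full_mask
```

Each tile popped from the heap is independent of the others, so the refinement pass maps naturally onto a worker pool or task queue.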
Yes. Scaling could also be great, especially with high-res inputs. This should be something that scales with the input resolution. To me this all sounds like something we can do with affine transforms.
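For illustration, scaling and per-pass shifting can both be expressed as a single 2x3 affine matrix applied with OpenCV's `warpAffine` (a minimal sketch; the function name and parameters are placeholders, not project API):

```python
import cv2
import numpy as np

def shift_and_scale(img, dx, dy, scale=1.0):
    # One affine matrix covers a uniform scale plus a (possibly sub-pixel)
    # translation, so the same code path handles both use cases.
    M = np.float32([[scale, 0.0, dx],
                    [0.0, scale, dy]])
    h, w = img.shape[:2]
    return cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_LINEAR)
```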
Calculating multiple slightly shifted segmentation masks would let us build a higher-resolution combined mask. The segmentation input would be shifted by a few pixels up or down each time it is executed (usually on a square grid). The masks are then upscaled, shifted back by the same amount, and averaged. This is a form of image super-resolution. It could all be done in parallel.
This same technique can also be extended to include separate, larger sideways jumps to increase the span of the mask. Each larger jump would then be followed by the small super-resolution-related jumps.
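A sketch of that super-resolution averaging, assuming a hypothetical `segment(img)` that returns a float mask the same size as its input (the larger sideways jumps from the previous paragraph would simply wrap this in an outer loop over tile offsets):

```python
import cv2
import numpy as np

def superres_mask(frame, segment, factor=2):
    """Average several sub-pixel-shifted masks into one higher-res mask.

    `segment` is assumed to map an image to a float mask of the same size
    (hypothetical); `factor` is the upscaling factor of the combined mask.
    Each shifted pass is independent, so they can run in parallel.
    """
    h, w = frame.shape[:2]
    acc = np.zeros((h * factor, w * factor), dtype=np.float32)
    count = 0
    # Square grid of sub-pixel offsets, e.g. {0, 0.5} x {0, 0.5} for factor 2.
    for dy in range(factor):
        for dx in range(factor):
            sx, sy = dx / factor, dy / factor
            # Shift the input by a fraction of a pixel...
            M = np.float32([[1, 0, -sx], [0, 1, -sy]])
            shifted = cv2.warpAffine(frame, M, (w, h), flags=cv2.INTER_LINEAR)
            mask = segment(shifted)
            # ...then upscale the mask and shift it back by the same amount.
            up = cv2.resize(mask, (w * factor, h * factor),
                            interpolation=cv2.INTER_LINEAR)
            M_back = np.float32([[1, 0, sx * factor], [0, 1, sy * factor]])
            acc += cv2.warpAffine(up, M_back, (w * factor, h * factor),
                                  flags=cv2.INTER_LINEAR)
            count += 1
    return acc / count

if __name__ == "__main__":
    frame = np.zeros((120, 160, 3), dtype=np.uint8)
    dummy_segment = lambda img: img[..., 0].astype(np.float32) / 255.0
    print(superres_mask(frame, dummy_segment, factor=2).shape)  # (240, 320)
```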
The idea to cover the whole image was previously mentioned by @BenBE in #58 (comment)
This can also be extended to slightly change the sampling positions between frames, and then average two or more frames.
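Averaging across frames could then be as simple as a running (exponential) average of the per-frame masks; a rough sketch, with `alpha` as an assumed tuning knob:

```python
import numpy as np

class TemporalMaskAverager:
    """Running average of masks from consecutive frames (sketch only).

    Assumes each frame was segmented with a slightly different sampling
    offset, so averaging over time further smooths and refines the mask.
    """
    def __init__(self, alpha=0.5):
        self.alpha = alpha   # weight given to the newest frame's mask
        self.state = None

    def update(self, mask):
        mask = mask.astype(np.float32)
        if self.state is None:
            self.state = mask
        else:
            self.state = self.alpha * mask + (1 - self.alpha) * self.state
        return self.state
```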