
new stereo image generation technique (polylines) #56

Merged
merged 1 commit into thygate:main from stereo on Dec 28, 2022

Conversation

semjon00
Collaborator

@semjon00 semjon00 commented Dec 25, 2022

Hello!

This MR adds two new over-engineered techniques for gap filling (for creating stereo images). They solve the problem of "ghost" pixels that sometimes appear from beneath objects. The performance cost is quite high, so maybe it would not be a good default?

@semjon00
Collaborator Author

semjon00 commented Dec 25, 2022

Some discussion points

(1)
@thygate I think it would be great to showcase stereo and anaglyph images in the repository readme examples (I'd like to finish this technique first so the images look better). What do you think?

(2)
My VR device is a Google Cardboard-like headset (VRB01), to which I stream the SBS (stereo) image directly from my computer. This is quite convenient (once the setup is done), especially given the price. I think it would be great to mention Cardboard (and similar devices) as a viable 3D rig for taking advantage of this repository.

(3)
@sina-masoud-ansari I would like to rework calculate_total_divergence:

def calculate_total_divergence(ipd, monitor_w, image_width):
    divergence_cm = ipd * 0.12
    divergence = divergence_cm * monitor_w * (image_width / 1920)
    print(f'divergence: {divergence} px')
    return divergence

I'd like to know the meaning of the constant 0.12. Also, it seems odd to have 1920 (resolution width?) here. For my rig and eyes, I eyeballed that ipd=7.5 and screen width = 88 work best for me, but I am pretty sure my phone is not 88 cm wide. Do you think we could simply have a setting called "divergence" that measures the maximum deviation of a point as a percentage of the image width? Or something more complicated, but still general?
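
For illustration, a minimal sketch of what I mean (the function name and the example values are made up, not existing code):

def divergence_to_pixels(divergence_pct, image_width):
    # Hypothetical setting: "divergence" is the maximum horizontal shift
    # of a point, as a percentage of the image width. No ipd, monitor_w
    # or hardcoded 1920 involved.
    return divergence_pct / 100.0 * image_width

# e.g. 2.5% divergence on a 1920 px wide image -> 48 px maximum shift
print(divergence_to_pixels(2.5, 1920))  # 48.0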

By the way, thank you for suggesting the stereo code for this repo, I probably wouldn't be here if I didn't see that.

(4)
FYI: @BlueSkyDefender has an impressive Depth3D repository, where he solves a very similar task - to create an image with divergence from the original image and its depthmap.

@thygate
Owner

thygate commented Dec 25, 2022

I'm not on my desktop right now, so I'll reply in more detail later, but I just wanted to mention that @sina-masoud-ansari is not the author of the stereo code; he found the repo where it came from. It's listed in the readme.

check : #45

And yes, looking great. We can certainly add your showcase images and a link to your viewer to the readme once you're done, nice =)

I will certainly check out that BlueSkyDefender repo as soon as I get back, and I will be implementing support for MiDaS 3.1 ASAP.

@sina-masoud-ansari

sina-masoud-ansari commented Dec 27, 2022

@semjon00 thanks for taking the time to make these changes, I look forward to trying them out :) Unfortunately I don't know the details of the stereo algorithm, but something dynamic like you suggested makes sense.

old divergence settings replaced with a new one
@semjon00 semjon00 marked this pull request as ready for review December 28, 2022 16:09
@semjon00
Collaborator Author

Alright, this took a while - much more than anticipated, as usual, but this time especially so. Anyway, I no longer have sane ideas for optimizing this, so it is ready for review, I guess?

@thygate
Owner

thygate commented Dec 28, 2022

Excited to try it out, I will review and merge ASAP.

@semjon00
Collaborator Author

Not sure about the techniques' names though - do you think it may be reasonable to rename them so they hint at their speed/quality ratio? Maybe end users would not benefit much from knowing how these algorithms work...

@thygate
Owner

thygate commented Dec 28, 2022

I liked your previous naming scheme for the naive_ methods (hard_horizontal, soft_horizontal), but this is fine too; users will have to read the instructions or experiment to find the best option for them. It makes sense to have the techniques in the dropdown in order of increasing quality and processing time, so I'm certainly OK with the current selection: ['none', 'naive', 'naive_interpolating', 'polylines_soft', 'polylines_sharp']

Very good idea again, using polylines, I'm very excited to try it out..

I see you improved the numba optimizations too - you have clearly put a lot of time into this, thank you!

@thygate
Owner

thygate commented Dec 28, 2022

I also like the divergence setting instead of the ipd and monitor width. I briefly thought of replacing them, but eventually left them in place as I could not come up with a good replacement. It felt a bit massaged anyway, magic numbers and all, so I'm happy with this setting.

Merging now.

@thygate thygate merged commit a79d487 into thygate:main Dec 28, 2022
@thygate
Owner

thygate commented Dec 28, 2022

It really is a big improvement in quality. Really nice work, thanks again !

I think we could make polylines_sharp the default, as it runs fast enough even on my old CPU.

Do you have any specific software to link for streaming the sbs images to the google cardboard ?

@thygate
Owner

thygate commented Dec 28, 2022

The current images in the readme are not really a showcase for the stereo images; I've not been able to generate better showcases that I'm satisfied with so far.

So if anyone has good showcases demonstrating the stereo image generation in SBS and anaglyph, feel free to post them.

@semjon00 semjon00 deleted the stereo branch December 28, 2022 23:17
@semjon00 semjon00 changed the title [WIP] new stereo image generation technique (polylines) new stereo image generation technique (polylines) Dec 28, 2022
@semjon00
Collaborator Author

semjon00 commented Dec 29, 2022

Sorry for late response, thanks for merging and changing the README. This may be added:

  • SBS stereo images can easily be viewed in 3D on VR devices, even cheap ones that use a smartphone - like Google Cardboard. To view an SBS image, you may simply display it on the phone screen and then insert the phone into the headset. A more convenient option may be to stream the picture from the computer screen to the phone using Sunshine. You may want to change the resolution to match the phone's aspect ratio. If you decide to buy a headset, pay attention to the lens size - usually headsets with larger lenses work best.

Somewhat verbose, but I think this would be helpful; feel free to edit/trim.

Huh, maybe polylines are not that slow after all. But I still have this itch at the back of my head that there may be a more accurate way of sampling polylines that I haven't thought of. I think I just proved a new assumption: the polyline in polylines_soft can never intersect itself. Oh no, I must give that problem another shot.

@semjon00
Collaborator Author

semjon00 commented Dec 29, 2022

@thygate
Here are some images that I generated from CC0 photos I found on the internet:
[image: anaglyph-sbs-demo]

[image: anaglyph-sbs-demo-1st-half]
[image: o06]
[image: o03]

@thygate
Owner

thygate commented Jan 11, 2023

Added your demo images to the readme, also included the extra info on viewing sbs on smartphone.

@MavenDE

MavenDE commented Mar 23, 2023

The right eye should see more of the right side of the view and the left eye more of the left side. It is the opposite here in the generated right image.

@semjon00
Collaborator Author

I performed a test to see if the images are positioned correctly (that is, the left image should be on the left). For this, I created an image where the left side and the right side were swapped. When I looked at the reversed image, it surely looked wrong. The depth effect looks correct on the original image.

Please note that this mode does not perfectly recreate a 3D image, as there are blurry edges. Pixels for which no information is known are interpolated from their known neighbors. It would be cool to have these edges generated in a more intelligent way.
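
To illustrate what "interpolated from known neighbors" means, here is a simplified sketch (not the actual code from this repo):

import numpy as np

def fill_gaps_row(row, known):
    # Simplified illustration: unknown pixels in a scanline are linearly
    # interpolated from the nearest known neighbors on each side.
    xs = np.arange(len(row))
    return np.interp(xs, xs[known], row[known])

row = np.array([10.0, 0.0, 0.0, 40.0])        # 0.0 marks unknown pixels
known = np.array([True, False, False, True])
print(fill_gaps_row(row, known))              # [10. 20. 30. 40.]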

Here is a reversed image of the mushrooms.
[image: rev]

@semjon00
Collaborator Author

@MavenDE sorry, forgot to add a mention

@WorldofDepth

@semjon00, these inpainting / gap filling methods look great! Is there a standalone implementation of them, for simple generation of a stereo pair from a 2D image + depth map, with adjustable deviation?

@semjon00
Collaborator Author

semjon00 commented Mar 31, 2023

@WorldofDepth great question! Right now there is not, but I think I will prepare an MR to refactor these methods into a separate file. This way it will be easy to copy this functionality into other projects.

@semjon00 semjon00 mentioned this pull request Apr 3, 2023
@semjon00
Collaborator Author

@WorldofDepth the code that generates the stereo images now lives in its own separate file and has some documentation. It should now be easy to use it in other projects.
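
Roughly like this (a usage sketch only - the module path, function name and parameters below are assumptions for illustration; check the actual file for the exact signature):

# Hypothetical usage sketch: names and parameters are assumptions,
# not the verified API of this repo.
import numpy as np
from PIL import Image
from stereoimage_generation import create_stereoimages  # assumed import

image = np.asarray(Image.open('photo.png'))
depth = np.asarray(Image.open('photo_depth.png'))  # matching depthmap

# assumed parameters: divergence in % of image width, fill technique name
stereo = create_stereoimages(image, depth, divergence=2.5,
                             fill_technique='polylines_sharp')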

@VanyVa

VanyVa commented Aug 16, 2023

@semjon00, is there a similar tool for video? That is, video + depth in, stereoscopic video out.

@semjon00
Collaborator Author

@VanyVa Yes, please see the newest version.

@ansj11

ansj11 commented Jan 2, 2024

I am using this code to generate stereoscopic images, but the generated image has some stretching and distortion - what is the reason? Is my code version out of date? By the way, how can I use the newest version to get a stereoscopic video?
[image]

@semjon00
Collaborator Author

semjon00 commented Jan 3, 2024

@ansj11
Some distortion is impossible to avoid: any stereo image generation algorithm has to generate the occluded regions for each half virtually out of thin air. As much as it hurts, my baby (the stereo image code in this repo) is relatively dumb compared with what is actually possible.
Please do check the version though. As of 2024-01-03, it should be v0.4.4.

@ansj11

ansj11 commented Jan 4, 2024

@ansj11 Some distortion is impossible to avoid: any stereo image generation algorithm has to generate the occluded regions for each half virtually out of thin air. As much as it hurts, my baby (the stereo image code in this repo) is relatively dumb compared with what is actually possible. Please do check the version though. As of 2024-01-03, it should be v0.4.4.

I am using latest thygate:main, but code is quite different with this MR.

@semjon00
Collaborator Author

semjon00 commented Jan 4, 2024

@ansj11 Yes, it is ¯\_(ツ)_/¯

@ijingo

ijingo commented Apr 18, 2024

Hello!

This MR adds two new over-engineered techniques for gap filling (for creating stereo images). They solve the problem of "ghost" pixels that sometimes appear from beneath objects. The performance cost is quite high, so maybe it would not be a good default?

Hi @semjon00, I was curious about the origin of the algorithm. Is it from a specific paper or another resource? I'd like to understand its background better.

Thanks for your great work!

@semjon00
Collaborator Author

@ijingo Hello! No, just an ad-hoc thing that I did, using no papers and no rigorous approach. It was better than "the old one", but it is by no means good.
