some questions about MSI feature and training #2

Closed
visonpon opened this issue Nov 14, 2020 · 1 comment

Comments

@visonpon

Hi @jamestompkin, thanks for generously sharing this work. I have a few questions:
First, I noticed that another paper, Immersive Light Field Video with a Layered Mesh Representation, also trains with MSI features. Are they the same representation?

Second, you train on ODS images, while the paper above uses almost 50 images from different angles. Could one instead capture a dynamic scene with a few fisheye cameras (e.g., four placed at the vertices of a square) and train with your network, since fisheye cameras can cover the same scene with fewer cameras than the paper above?

Third, four fisheye cameras can capture more detail than your two ODS images, so could these inputs give better results?

@jamestompkin
Contributor

Hi visonpon, thanks for your questions, and sorry for the long delay.

  1. MSI representations: Yes, they are very similar. Broxton et al. then go on to extract an additional 'layered mesh' after inference. (A toy rendering sketch follows this list.)

  2. Fisheye cameras: Yes, so long as you can calibrate your cameras and work out their projection into the ODS views (see the projection sketch further below). For dynamic scenes, our network was trained only on moving cameras and not moving objects, but you might be able to augment the training set there.

  3. One way to consider ODS is that it is really a transmission and viewing format for stereo 360, rather than a capture format. There is no real-world camera that directly captures ODS; ODS must be created somehow by warping and stitching input camera views. Our two ODS images at inference time are typically constructed from multiple input cameras, possibly including fisheye cameras if we wish. At training time, we render ODS synthetically; these are 'perfect' in their representation of the ODS ray space, which isn't always possible with a real multi-camera system due to the physical geometry of the camera layout and scene.
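
For intuition on item 1, here is a toy sketch (untested, written just for this reply) of how an MSI is rendered: concentric RGBA sphere layers, alpha-composited back to front along each viewing ray. The `sample_rgba` hook is hypothetical and stands in for whatever learned per-layer texture is predicted.

```python
import numpy as np

def render_msi_ray(ray_o, ray_d, radii, sample_rgba):
    """Render one ray through a multi-sphere image (MSI): concentric RGBA
    sphere layers, alpha-composited back to front.  sample_rgba(i, p) is a
    hypothetical sampler returning (rgb, alpha) stored on layer i at the 3D
    point p; a real MSI looks this up in a predicted texture."""
    color = np.zeros(3)
    for i in np.argsort(radii)[::-1]:            # farthest layer first
        r = radii[i]
        # Intersect the ray with a sphere of radius r centred at the origin:
        # ||o + t*d||^2 = r^2; with unit-length d the quadratic has a = 1.
        b = 2.0 * np.dot(ray_o, ray_d)
        c = np.dot(ray_o, ray_o) - r * r
        disc = b * b - 4.0 * c
        if disc < 0.0:
            continue                             # ray misses this layer
        t = (-b + np.sqrt(disc)) / 2.0           # exit point of the sphere
        if t <= 0.0:
            continue
        rgb, alpha = sample_rgba(i, ray_o + t * ray_d)
        color = alpha * np.asarray(rgb) + (1.0 - alpha) * color  # "over" blend
    return color
```

And on item 3, a minimal sketch of the ODS ray space that any rig has to approximate. The axis convention (y up, z forward), the eye sign, and the default IPD are assumptions for illustration, not our exact code:

```python
import numpy as np

def ods_rays(width, height, ipd=0.064, eye=+1):
    """Per-pixel ray origins and directions for one eye of an ODS panorama
    in equirectangular layout.  eye = +1 for right, -1 for left."""
    theta = (np.arange(width) + 0.5) / width * 2.0 * np.pi - np.pi    # azimuth
    phi = np.pi / 2.0 - (np.arange(height) + 0.5) / height * np.pi    # elevation
    theta, phi = np.meshgrid(theta, phi)
    # Direction: the ordinary equirectangular viewing direction.
    d = np.stack([np.cos(phi) * np.sin(theta),
                  np.sin(phi),
                  np.cos(phi) * np.cos(theta)], axis=-1)
    # Origin: on the viewing circle of diameter ipd, offset perpendicular to
    # the horizontal part of d, so every ray leaves the circle tangentially.
    # No single physical camera realises this; it must be stitched/warped.
    o = (eye * ipd / 2.0) * np.stack([np.cos(theta),
                                      np.zeros_like(theta),
                                      -np.sin(theta)], axis=-1)
    return o, d
```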

We can try to make ODS images from four fisheye input camera views if you'd like, and the overall quality will relate to how well those four input cameras can capture the ODS ray space (or, through computation like warping, be processed to represent the ODS ray space with as little error as possible).
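
For the warping step, here is a minimal sketch of the per-camera projection under the ideal equidistant fisheye model (r = f * theta); the parameter names are illustrative, and a real rig would use the full calibrated distortion model (e.g. OpenCV's cv2.fisheye):

```python
import numpy as np

def project_equidistant_fisheye(p_world, R, t, f, cx, cy):
    """Project a 3D world point into a calibrated fisheye camera under the
    ideal equidistant model r = f * theta.  R, t are world-to-camera
    extrinsics from calibration; f, cx, cy are the intrinsics."""
    p = R @ p_world + t                             # world -> camera frame
    theta = np.arctan2(np.hypot(p[0], p[1]), p[2])  # angle off optical axis
    psi = np.arctan2(p[1], p[0])                    # azimuth about the axis
    r = f * theta                                   # equidistant mapping
    return np.array([cx + r * np.cos(psi), cy + r * np.sin(psi)])
```

To resample one eye of an ODS image from such a rig: for each ODS pixel, take its ray (as in ods_rays above), intersect it with proxy geometry (e.g. a sphere at an assumed scene depth), project that point into each fisheye camera, and sample from the camera with the best view. The residual error shrinks as the rig's optical centres approach the ODS viewing circle.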

Hope that helps,
James
