Hi @jamestompkin, thanks for your generous sharing. I have a few questions:
First, I noticed that another paper, "Immersive Light Field Video with a Layered Mesh Representation", also trains with MSI features. Are these the same representation?
Second, since you train on ODS images while the paper above trains on roughly 50 images from different angles, could fisheye cameras (e.g., four placed at the vertices of a square) be used to capture a dynamic scene and train your networks? Fisheye cameras can cover the same scene with fewer cameras than the setup in that paper.
Third, since four fisheye cameras can capture more detail than your two ODS images, could these inputs produce better results?
Hi visonpon, thanks for your questions, and sorry for the long delay.
MSI representations: Yes, they are very similar. Broxton et al. then go on to extract an additional 'layered mesh' after inference.
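For concreteness, rendering a multi-sphere image along a view ray amounts to alpha-compositing the per-sphere RGBA samples back to front with the standard "over" operator. A minimal numpy sketch (the function name and layer layout are illustrative, not taken from either paper's code):

```python
import numpy as np

def composite_msi(rgba_layers):
    """Back-to-front 'over' compositing of MSI layers.

    rgba_layers: (L, H, W, 4) float array with straight (un-premultiplied)
    alpha, where layer 0 is the farthest sphere and layer L-1 the nearest.
    Returns an (H, W, 3) RGB image.
    """
    out = np.zeros(rgba_layers.shape[1:3] + (3,))
    for layer in rgba_layers:  # iterate far to near
        rgb, a = layer[..., :3], layer[..., 3:4]
        out = rgb * a + out * (1.0 - a)  # 'over' operator
    return out
```

The layered-mesh step in Broxton et al. can be thought of as baking a representation like this onto geometry after inference, rather than changing the compositing model itself.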
Fisheye cameras: Yes, so long as you can calibrate your cameras and work out their projection into the ODS views. For dynamic scenes, our network was trained only on moving cameras and not moving objects, but you might be able to augment the training set there.
One way to consider ODS is that it is really a transmission and viewing format for stereo 360, rather than a capture format. There is no real-world camera that directly captures ODS; ODS must be created somehow by warping and stitching input camera views. Our two ODS images at inference time are typically constructed from multiple input cameras, possibly including fisheye cameras if we wish. At training time, we render ODS synthetically; these are 'perfect' in their representation of the ODS ray space, which isn't always possible with a real multi-camera system due to the physical geometry of the camera layout and scene.
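As a concrete picture of that ray space, here is a small sketch of the standard ODS parameterization: each output pixel (azimuth θ, elevation φ) maps to a ray whose origin lies on a viewing circle of radius IPD/2 and whose horizontal direction is tangent to that circle. The variable names and sign convention for the two eyes are my assumptions, not this repo's code:

```python
import numpy as np

def ods_ray(theta, phi, ipd=0.064, eye=+1):
    """Ray (origin, direction) for one ODS panorama pixel.

    theta: azimuth in [0, 2*pi); phi: elevation in [-pi/2, pi/2].
    eye: +1 or -1 selects the eye (which sign is left vs. right
    is a convention choice). Origins lie on the viewing circle of
    radius ipd/2; the ray's horizontal component is tangent to it.
    """
    r = ipd / 2.0
    # Unit viewing direction for (theta, phi).
    d = np.array([np.cos(phi) * np.sin(theta),
                  np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    # Origin on the viewing circle, perpendicular to d's horizontal part.
    o = eye * r * np.array([np.cos(theta), 0.0, -np.sin(theta)])
    return o, d
```

Stitching real cameras into ODS then means finding, for each such ray, the input camera pixel that best approximates it, which is why no physical rig reproduces this ray space exactly.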
We can try to make ODS images from four fisheye input camera views if you'd like, and the overall quality will depend on how well those four input cameras can capture the ODS ray space (or, through computation like warping, can be processed to represent the ODS ray space with as little error as possible).