Supporting No Depth Input - RGB SLAM #7
Hi, thanks for trying out the code! SplaTAM requires depth input for running SLAM & reconstruction. Our dataloaders, by default, expect both an rgb and a depth folder. We haven't tested the offline setting in the NeRFCapture app; we have only tested our scripts, which interface with NeRFCapture in online mode. I just checked capturing an offline dataset with NeRFCapture. It looks like both the RGB and depth PNGs are saved under the images folder. You would need to move the
I think the offline mode in the NeRFCapture app is broken, as pointed out by the app's developer here: jc211/NeRFCapture#10 (comment). I tried renaming the files yesterday, but that does not seem to cut it: the tensor dimensions were off, and the depth PNG files themselves looked wrong: they do not seem to store the full depth range. This is probably a bit different from how the online mode works. I'm actually trying to get this to work using our own iOS data collection app (not related to NeRFCapture or SplaTAM), see here for details, but I'm not sure if we got the depth conversion correct yet. If I understood this comment correctly:
So to summarize: if an app/script exports RGBD data where the depth PNGs have a depth scale of 6553.5, and the camera intrinsics are correctly set in transforms.json, it should probably work?
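For illustration, here is a minimal sketch of writing such a transforms.json from Python. The field names (`fl_x`, `fl_y`, `cx`, `cy`, `w`, `h`, `frames`, `transform_matrix`) follow the common instant-ngp-style convention that NeRFCapture uses; the exact keys SplaTAM's iPhone dataloader reads, and all the numeric values, are assumptions here:

```python
import json

# Hypothetical intrinsics for a 1920x1440 capture -- illustrative values only.
transforms = {
    "fl_x": 1500.0,  # focal length in pixels (x)
    "fl_y": 1500.0,  # focal length in pixels (y)
    "cx": 960.0,     # principal point x
    "cy": 720.0,     # principal point y
    "w": 1920,
    "h": 1440,
    "frames": [
        {
            "file_path": "rgb/0.png",
            "depth_path": "depth/0.png",
            # Identity pose as a placeholder; real poses come from the capture app.
            "transform_matrix": [
                [1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0],
            ],
        }
    ],
}

with open("transforms.json", "w") as f:
    json.dump(transforms, f, indent=2)
```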
Hello, I am running the program using WSL and I don't know how to keep WSL and my phone on the same network segment. Therefore, I used NeRFCapture for offline data collection. However, after the collection was completed, I found that the folder on my phone only contains color images and transforms.json, and does not include depth maps. My phone is an iPhone 14; is it unable to capture depth maps?
@Nik-V9 I did check the images dir, and there are only RGB images. Does the data need to be collected with a LiDAR-equipped iPhone? Maybe this could be improved by using something like MiDaS or MVSNet to estimate the depth?
Yes, you need a LiDAR-equipped iPhone for the demo. Using a depth estimation network would make the method up-to-scale (not metric), since monocular depth has no absolute scale. Your camera tracking performance would also be influenced by the accuracy and multi-view consistency of the depth estimation network. An RGB-only SLAM method using 3D Gaussians is currently future research and is something we might consider.
Yes, this is correct. Looks like the offline mode is broken. So far, we have only used the online mode.
This looks like a cool app.
Yes! We need depth and intrinsics in addition to RGB for SLAM. The depth scale doesn't specifically have to be 6553.5 (as long as the pixel-intensity-to-meter scaling is known); that's just what our iPhone dataloader is currently hardcoded to.
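To make the hardcoded scale concrete, here is an illustrative sketch (not SplaTAM's actual config file; the key names around `png_depth_scale` are assumptions) of how such a scale turns stored pixel values into meters:

```python
# Illustrative dataset-config sketch. png_depth_scale converts stored
# uint16 depth pixels to meters: depth_m = pixel_value / png_depth_scale.
dataset_config = {
    "dataset_name": "iphone",    # hypothetical key names
    "png_depth_scale": 6553.5,   # 65535 / 10 -> max representable depth is 10 m
}

def pixel_to_meters(pixel_value, cfg=dataset_config):
    """Convert a raw uint16 depth pixel value to metric depth in meters."""
    return pixel_value / cfg["png_depth_scale"]

print(pixel_to_meters(65535))  # the uint16 maximum maps to 10.0 m
print(pixel_to_meters(6553))   # roughly 1 m (6553.5 would be exactly 1 m)
```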
Where does the 6553.5 number come from? I'm also trying to get this working. I see you use a depth_scale of 10 and this magic number of 6553.5, but I don't fully understand. What is encoded in the depth image? To get the actual metric value, would I need to divide by 6553.5 and multiply by 10?
Hi @pablovela5620, 6553.5 is the scaling factor for the depth PNG image. When you load the depth image, you need to divide the pixel values by this number to get metric depth. By default, the iPhone depth image has a pixel intensity of 65535 corresponding to 1 meter. When we save the depth image, we divide this by 10 and save it.
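In code, the conversion described above looks like this (a sketch of my own, assuming the depth PNG is a single-channel 16-bit image; the file name is hypothetical):

```python
import numpy as np
from PIL import Image

PNG_DEPTH_SCALE = 6553.5  # pixel value 6553.5 corresponds to 1 m; 65535 -> 10 m

def load_depth_meters(path):
    """Load a 16-bit depth PNG and convert pixel intensities to meters."""
    depth_png = np.asarray(Image.open(path), dtype=np.float64)
    return depth_png / PNG_DEPTH_SCALE

# Round-trip sanity check on a synthetic depth image:
true_depth = np.array([[0.5, 1.0], [2.0, 10.0]])  # meters
Image.fromarray((true_depth * PNG_DEPTH_SCALE).astype(np.uint16)).save("depth_0.png")
recovered = load_depth_meters("depth_0.png")
# Recovery is exact up to uint16 quantization (one pixel step ~ 0.15 mm).
assert np.allclose(recovered, true_depth, atol=1 / PNG_DEPTH_SCALE)
```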
Because they try to save the depth array like this:

```python
def save_depth_as_png(depth, filename, png_depth_scale):
    depth = depth * png_depth_scale
    depth = depth.astype(np.uint16)
    depth = Image.fromarray(depth)
    depth.save(filename)
```

When doing it like this, you need to consider that the range of uint16 is 0 to 65535 (2^16 - 1). So, I guess what they did must be to first clamp the actual depth value to [0, 10.0], then multiply it by 6553.5, then convert it to uint16 without the risk of overflowing (but losing some accuracy going from float to int).

```python
def _preprocess_depth(self, depth: np.ndarray):
    r"""Preprocesses the depth image by resizing, adding a channel dimension, and scaling
    values to meters. Optionally converts depth from channels-last :math:`(H, W, 1)` to
    channels-first :math:`(1, H, W)` representation.

    Args:
        depth (np.ndarray): Raw depth image

    Returns:
        np.ndarray: Preprocessed depth

    Shape:
        - depth: :math:`(H_\text{old}, W_\text{old})`
        - Output: :math:`(H, W, 1)` if `self.channels_first == False`, else :math:`(1, H, W)`.
    """
    depth = cv2.resize(
        depth.astype(float),
        (self.desired_width, self.desired_height),
        interpolation=cv2.INTER_NEAREST,
    )
    if len(depth.shape) == 2:
        depth = np.expand_dims(depth, -1)
    if self.channels_first:
        depth = datautils.channels_first(depth)
    return depth / self.png_depth_scale
```
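A quick sanity check of that reading (my own sketch, not SplaTAM code): clamping to [0, 10] before scaling by 6553.5 guarantees the uint16 never overflows, and out-of-range depths saturate instead of wrapping around:

```python
import numpy as np

PNG_DEPTH_SCALE = 6553.5               # maps 10 m to the uint16 maximum, 65535
MAX_DEPTH_M = 65535 / PNG_DEPTH_SCALE  # = 10.0 m

def encode_depth(depth_m):
    """Clamp to the representable range, then scale into uint16."""
    clamped = np.clip(depth_m, 0.0, MAX_DEPTH_M)
    return (clamped * PNG_DEPTH_SCALE).astype(np.uint16)

def decode_depth(depth_png):
    """Inverse of encode_depth, up to uint16 quantization error."""
    return depth_png.astype(np.float64) / PNG_DEPTH_SCALE

depth = np.array([0.25, 1.0, 9.99, 42.0])  # 42 m is out of range
png = encode_depth(depth)
assert png.max() == 65535                  # out-of-range value saturates, no wraparound
assert np.allclose(decode_depth(png)[:3], depth[:3], atol=1 / PNG_DEPTH_SCALE)
```

Without the clamp, `(42.0 * 6553.5).astype(np.uint16)` would silently wrap around to a small value, which would show up as bogus near-range depth in reconstruction.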
Hi there, first of all thanks for your great work. I installed the env and here is my situation:
I cannot use my iPhone to connect to the server because it's not on the same WiFi, only on the same local network.
So I decided to capture the dataset offline with NeRFCapture; I get a directory of images and a transforms.json.
Then I tried to convert the dataset to fit
`python scripts/splatam.py configs/iphone/splatam.py`
For instance, I renamed images/0 to rgb/0.png and kept everything else the same. But when I directly run `python scripts/splatam.py configs/iphone/splatam.py`, I got an error:
This should not happen, because from what I saw in the dataset conversion Python script, the depth is optional, not required.
I also tried to run other code, and I get:
So, any idea?