SUN RGB-D is not in millimeters #12

Open
jbrownkramer opened this issue Mar 9, 2024 · 4 comments

Comments

@jbrownkramer

I was trying to apply this model to my own data and not getting good results. I ran the NYUv2 dataset through my code, and the results seem to be in line with those reported in the ViT-Lens paper.

Digging into it, the issue is, at least partly, that the NYUv2 data is not in millimeters. Here is the MATLAB code from the SUNRGBDtoolbox (https://rgbd.cs.princeton.edu/) for converting the PNG files to mm:

depthVis = imread(data.depthpath);  % 16-bit depth PNG as stored in SUN RGB-D
imsize = size(depthVis);
depthInpaint = bitor(bitshift(depthVis,-3), bitshift(depthVis,16-3));  % circular shift right by 3 bits -> depth in mm

In other words, the data in the PNG files is the depth in mm circularly shifted left by 3 bits (which, for depths under 2^13 mm ≈ 8.2 m, is just multiplication by 8).
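
For reference, here is a minimal Python equivalent of that decode (my own sketch, not from the toolbox; it assumes a single-channel 16-bit PNG):

import numpy as np
from PIL import Image

def sunrgbd_depth_to_mm(png_path):
    # 16-bit depth PNG as stored in SUN RGB-D / NYUv2.
    depth_vis = np.array(Image.open(png_path), dtype=np.uint16)
    # Undo the circular left shift by 3: rotate right by 3 within 16 bits.
    # Bits shifted past position 16 are dropped by the uint16 dtype.
    return ((depth_vis >> 3) | (depth_vis << 13)).astype(np.uint16)  # depth in mm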

I mention this because the code in #9 seems to assume that the data is in mm. That could matter if other datasets that really are in mm, rather than in the SUN RGB-D format, get used.

@StanLei52
Collaborator

StanLei52 commented Mar 9, 2024

Thank you for pointing this out -- it is important to get this right for a more general depth model. Could you please also check LanguageBind and the NYU-D data they uploaded? If their preprocessing works on your own data, I will look into their pipeline instead of following ImageBind's.

@jbrownkramer
Author

I will look into LanguageBind.

I will say this: I updated my pipeline to match the circular shift, quantization, and camera intrinsics of the NYU data. The results on our data are still not very good. My suspicion is that SUN RGB-D has no people in it, and the text labels I am trying to match describe the locations of people in the scene.

@jbrownkramer
Author

Below is the transformation pipeline in LanguageBind. The starting format is depth in mm (NOT DISPARITY). I ran their inference example from the git homepage, and max_depth is configured to 10. So, in summary: read in the data in mm, convert to meters, clamp between 0.01 and 10 meters, divide by 10 meters, resize and center-crop to 224, and normalize by OPENAI_DATASET_MEAN and OPENAI_DATASET_STD.

I tried running LanguageBind on the SUN RGB-D versions of the NYUv2 data directly and it gave bad outputs. When I did a circular shift (to put the depth back into mm) it gave good results, so they must be doing some preprocessing to convert the NYU data to mm first.

import numpy as np
import torch
from torch import nn
from torchvision import transforms

# CLIP's normalization constants, as defined in open_clip.
OPENAI_DATASET_MEAN = (0.48145466, 0.4578275, 0.40821073)
OPENAI_DATASET_STD = (0.26862954, 0.26130258, 0.27577711)

class DepthNorm(nn.Module):
    def __init__(
        self,
        max_depth=0,
        min_depth=0.01,
    ):
        super().__init__()
        self.max_depth = max_depth
        self.min_depth = min_depth
        self.scale = 1000.0  # nyuv2 abs.depth: stored values are mm, so /1000 gives meters

    def forward(self, image):
        # image = np.array(image)  -- expects a NumPy array of depth in mm
        depth_img = image / self.scale  # (H, W) in meters
        depth_img = depth_img.clip(min=self.min_depth)
        if self.max_depth != 0:
            depth_img = depth_img.clip(max=self.max_depth)
            depth_img /= self.max_depth  # 0-1
        else:
            depth_img /= depth_img.max()
        # Replicate the single depth channel to 3 channels so it can be treated as an image.
        depth_img = torch.from_numpy(depth_img).unsqueeze(0).repeat(3, 1, 1)
        return depth_img.to(torch.get_default_dtype())

def get_depth_transform(config):
    config = config.vision_config
    transform = transforms.Compose(
        [
            DepthNorm(max_depth=config.max_depth),
            transforms.Resize(224, interpolation=transforms.InterpolationMode.BICUBIC),
            transforms.CenterCrop(224),
            transforms.Normalize(OPENAI_DATASET_MEAN, OPENAI_DATASET_STD),  # assume image
            # transforms.Normalize((0.5, ), (0.5, ))  # 0-1 to norm distribution
            # transforms.Normalize((0.0418, ), (0.0295, ))  # sun rgb-d  imagebind
            # transforms.Normalize((0.02, ), (0.00295, ))  # nyuv2
        ]
    )
    return transform
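
And a hypothetical end-to-end usage, combining the decode sketch above with this transform (the config object and file path below are stand-ins I made up, mirroring the max_depth=10 setting from their inference example):

from types import SimpleNamespace

config = SimpleNamespace(vision_config=SimpleNamespace(max_depth=10))  # assumed stand-in for LanguageBind's config
transform = get_depth_transform(config)

depth_mm = sunrgbd_depth_to_mm("some_nyu_depth.png").astype(np.float32)  # back to true mm; hypothetical path
depth_tensor = transform(depth_mm)  # (3, 224, 224), normalized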

@StanLei52
Collaborator

Got it, thanks @jbrownkramer! I will look into this.
