
Tracking loss in code seems inconsistent with the paper. #90

Closed
identxxy opened this issue May 9, 2024 · 2 comments

Comments

@identxxy

identxxy commented May 9, 2024

To my understanding, the tracking loss in the code is opacity * L1 loss * a complicated mask (a silhouette/edge mask), but in the paper it is simply the L1 loss.

In the code:

def get_loss_tracking_rgb(config, image, depth, opacity, viewpoint):
    gt_image = viewpoint.original_image.cuda()
    _, h, w = gt_image.shape
    mask_shape = (1, h, w)
    rgb_boundary_threshold = config["Training"]["rgb_boundary_threshold"]
    # keep only pixels whose summed RGB value exceeds the boundary threshold
    rgb_pixel_mask = (gt_image.sum(dim=0) > rgb_boundary_threshold).view(*mask_shape)
    # restrict further to high-gradient (edge) pixels
    rgb_pixel_mask = rgb_pixel_mask * viewpoint.grad_mask
    # opacity-weighted, masked per-pixel L1 residual
    l1 = opacity * torch.abs(image * rgb_pixel_mask - gt_image * rgb_pixel_mask)
    return l1.mean()
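
To see what the opacity weighting does in isolation, here is a small CPU-only toy example (my own, not repository code): two pixels with the same photometric error, one covered by high-opacity splats and one by low-opacity (immature) splats.

import torch

# two pixels with identical |rendered - gt| error of 0.5
err = torch.tensor([0.5, 0.5])
opacity = torch.tensor([0.9, 0.1])  # well-observed vs. immature region

weighted = opacity * err
print(weighted)  # tensor([0.4500, 0.0500])
# the low-opacity pixel contributes 9x less to the tracking residual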

where viewpoint.grad_mask is computed here:

    def compute_grad_mask(self, config):
        edge_threshold = config["Training"]["edge_threshold"]

        # per-pixel gradient intensity of the grayscale image
        gray_img = self.original_image.mean(dim=0, keepdim=True)
        gray_grad_v, gray_grad_h = image_gradient(gray_img)
        mask_v, mask_h = image_gradient_mask(gray_img)
        gray_grad_v = gray_grad_v * mask_v
        gray_grad_h = gray_grad_h * mask_h
        img_grad_intensity = torch.sqrt(gray_grad_v**2 + gray_grad_h**2)

        if config["Dataset"]["type"] == "replica":
            # binarise block-by-block: split the image into a 32x32 grid and,
            # within each block, set pixels above (block median * edge_threshold)
            # to 1 and the rest to 0
            row, col = 32, 32
            multiplier = edge_threshold
            _, h, w = self.original_image.shape
            for r in range(row):
                for c in range(col):
                    block = img_grad_intensity[
                        :,
                        r * int(h / row) : (r + 1) * int(h / row),
                        c * int(w / col) : (c + 1) * int(w / col),
                    ]
                    th_median = block.median()
                    block[block > (th_median * multiplier)] = 1
                    block[block <= (th_median * multiplier)] = 0
            self.grad_mask = img_grad_intensity
        else:
            # global threshold: keep pixels above (image median * edge_threshold)
            median_img_grad_intensity = img_grad_intensity.median()
            self.grad_mask = (
                img_grad_intensity > median_img_grad_intensity * edge_threshold
            )
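
image_gradient and image_gradient_mask are helpers defined elsewhere in the repository. As a rough sketch of what a Sobel-style gradient helper can look like (an assumption for illustration; the actual implementation may differ in kernels, normalisation, and padding):

import torch
import torch.nn.functional as F

def sobel_gradients(gray_img):
    # hypothetical helper, not the repository's image_gradient
    # gray_img: (1, H, W) grayscale tensor
    # returns (vertical, horizontal) gradient maps of the same shape
    kx = torch.tensor([[-1.0, 0.0, 1.0],
                       [-2.0, 0.0, 2.0],
                       [-1.0, 0.0, 1.0]]) / 8.0
    ky = kx.t()
    img = gray_img.unsqueeze(0)  # (1, 1, H, W) for conv2d
    grad_h = F.conv2d(img, kx.view(1, 1, 3, 3), padding=1)
    grad_v = F.conv2d(img, ky.view(1, 1, 3, 3), padding=1)
    return grad_v.squeeze(0), grad_h.squeeze(0)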

But in the paper, the tracking loss is simply:

[image: the photometric tracking loss equation from the paper]
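
For readers without the image, the photometric tracking loss in the paper is, to my recollection (a hedged reconstruction, not copied from the figure):

E_{\text{pho}} = \left\lVert I(\mathcal{G}, T_{CW}) - \bar{I} \right\rVert_1

where I(\mathcal{G}, T_{CW}) is the image rendered from the Gaussians \mathcal{G} at camera pose T_{CW} and \bar{I} is the observed image.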

I would like to ask: is my understanding correct, or am I missing something?

@WFram

WFram commented May 23, 2024

I would like to ask: is my understanding correct, or am I missing something?

In my opinion, they use the opacity and gradient masks to reduce the impact of immature splats on the pose estimate: opacity depends on the covariance of the ellipsoids that define each Gaussian (which describes their uncertainty), while the gradient mask captures high-contrast parts of the image (e.g., edges).

The paper says:

We further <...> penalise non-edge or low-opacity pixels

In this sense, the L1 loss provides the residual, and the opacity and gradient masks are then applied on top of it, as is done in the code.
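
As a toy illustration of the difference (again my own example, not repository code): the paper's plain L1 averages over every pixel, while the code's version zeroes out non-edge pixels and down-weights low-opacity ones.

import torch

rendered = torch.tensor([0.8, 0.3, 0.6, 0.1])
gt       = torch.tensor([1.0, 0.3, 0.4, 0.5])
opacity  = torch.tensor([1.0, 1.0, 0.2, 0.1])  # last two: immature splats
edge     = torch.tensor([1.0, 0.0, 1.0, 1.0])  # second pixel: non-edge

plain_l1  = torch.abs(rendered - gt).mean()
masked_l1 = (opacity * torch.abs(rendered * edge - gt * edge)).mean()
print(plain_l1, masked_l1)  # tensor(0.2000) tensor(0.0700)
# the masked, opacity-weighted loss largely ignores the unreliable pixels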

@identxxy
Author

Thanks so much for your explanation. I just realized that I was reading the old version of the paper, which does not mention the penalization. My bad...
