
[ASE] The colors of the edge pixels are darker than the center pixels. How can I rectify this problem? #99

Open
thucz opened this issue May 20, 2024 · 17 comments

Comments


thucz commented May 20, 2024

Your dataset is really good! I'm trying to use ASE as training data for a novel view synthesis task.
But I found a problem:
Within the circle, the edge pixels (I'm not talking about the black border) are much darker than the center pixels (I have undistorted the images), so the corresponding edge pixels are not fully view-consistent between neighboring views.
Do you know how to rectify this brightness inconsistency?

[image: 0000026]

[image: 0000034]


thucz commented May 20, 2024

This problem makes it difficult to produce good results for the novel view synthesis task on this dataset. As shown in the rightmost column, existing Gaussian Splatting methods can easily produce strange dark borders on this data.

[image: 4774]

@captain-sysadmin

Hello!

ASE is designed to produce accurate simulations of Aria output. The RGB camera has a very small fisheye lens on it, which means we also simulate the vignette of the camera. As you might be aware, most fisheye lenses produce a pronounced variation in brightness; more information can be found here (non-affiliated link!).

The good news is that the variance in brightness is static. It could be reduced by creating a gradient (by inverting an image similar to this one) and multiplying it with the ASE image.
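A minimal sketch of that idea (the file names are only examples; use whatever plain vignette image you have):

import cv2
import numpy as np

# "vignette.png" stands in for a plain vignette image (bright centre, dark edges).
vignette = cv2.imread("vignette.png", cv2.IMREAD_GRAYSCALE)

# Inverting it gives a gradient that is dark in the centre and bright at the
# edges, i.e. the opposite of the lens roll-off, ready to be multiplied in.
anti_vignette = 255 - vignette
cv2.imwrite("anti_vignette.png", anti_vignette)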

I hope that helps!


thucz commented May 21, 2024

Thanks for your reply! However, I still don't know how to compute this gradient accurately for the ASE data.


thucz commented May 22, 2024

Hi! Is there a specific method to compute this relative illumination value for each pixel? I'm not familiar with fisheye cameras.

@captain-sysadmin

Let me see if I can generate a gradient, standby!


thucz commented May 27, 2024

Hi! Do you have any clue about relative illumination computation?

@captain-sysadmin

We calculate the distortion and then apply a vignette, so the relative illumination is a function of combining a "normal" but distorted image with a vignette image.

[image: invert-vignette]

This should re-flatten the lens-based roll-off.

The top right of the image is "up", so depending on how it's applied you might need to rotate it to line it up.
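If the orientation looks off, a quick way to check is to write out the rotated variants and compare them against a frame (just a sketch; the file name is an assumption):

import cv2

# "anti_vignette.png" stands in for the gradient image attached above.
anti = cv2.imread("anti_vignette.png")

# Try the 90-degree rotations until the bright corners of the gradient
# line up with the dark corners of an ASE frame.
for name, code in [("cw90", cv2.ROTATE_90_CLOCKWISE),
                   ("ccw90", cv2.ROTATE_90_COUNTERCLOCKWISE),
                   ("180", cv2.ROTATE_180)]:
    cv2.imwrite(f"anti_vignette_{name}.png", cv2.rotate(anti, code))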


thucz commented May 28, 2024

Thanks for your help. I previously wrote the code below to preprocess the ASE fisheye data, including undistortion and rotation. Could you tell me which line I should revise to correct the relative illumination?

import matplotlib.colors as colors
import matplotlib.pyplot as plt
import numpy as np
import plotly.graph_objects as go
from pathlib import Path
import os
from PIL import Image
from scipy.spatial.transform import Rotation as R
from projectaria_tools.projects import ase
from projectaria_tools.core import data_provider, calibration
from projectaria_tools.core.image import InterpolationMethod
from readers import read_points_file, read_trajectory_file, read_language_file
import cv2
from tqdm import tqdm
import os, sys, json
from multiprocessing import Pool

def distance_to_depth(K, dist, uv=None):
    if uv is None and len(dist.shape) >= 2:
        # create mesh grid according to d
        uv = np.stack(np.meshgrid(np.arange(dist.shape[1]), np.arange(dist.shape[0])), -1)
        uv = uv.reshape(-1, 2)
        dist = dist.reshape(-1)
        if not isinstance(dist, np.ndarray):
            import torch
            uv = torch.from_numpy(uv).to(dist)
    if isinstance(dist, np.ndarray):
        # z * np.sqrt(x_temp**2+y_temp**2+z_temp**2) = dist
        uvh = np.concatenate([uv, np.ones((len(uv), 1))], -1)
        uvh = uvh.T # 3, N
        temp_point = np.linalg.inv(K) @ uvh # 3, N  
        temp_point = temp_point.T # N, 3
        z = dist / np.linalg.norm(temp_point, axis=1)
    else:
        uvh = torch.cat([uv, torch.ones(len(uv), 1).to(uv)], -1)
        temp_point = torch.inverse(K) @ uvh
        z = dist / torch.linalg.norm(temp_point, dim=1)
    return z

def transform_3d_points(transform, points):
    N = len(points)
    points_h = np.concatenate([points, np.ones((N, 1))], axis=1)
    transformed_points_h = (transform @ points_h.T).T
    transformed_points = transformed_points_h[:, :-1]
    return transformed_points


def aria_export_to_scannet(scene_id):
    src_folder = Path("/group/40033/public_datasets/3d_datasets/aria/ase_data/"+str(scene_id))
    trgt_folder = Path("/group/40033/public_datasets/3d_datasets/aria/ase_preprocessed_data/"+str(scene_id))
    trgt_folder.mkdir(parents=True, exist_ok=True)
    SCENE_ID = src_folder.stem
    print("SCENE_ID:", SCENE_ID)

    scene_max_depth = 0
    scene_min_depth = np.inf
    Path(trgt_folder, "intrinsic").mkdir(exist_ok=True)
    Path(trgt_folder, "pose").mkdir(exist_ok=True)
    Path(trgt_folder, "depth").mkdir(exist_ok=True)
    Path(trgt_folder, "color").mkdir(exist_ok=True)

    rgb_dir = src_folder / "rgb"
    depth_dir = src_folder / "depth"
    # Load camera calibration
    device = ase.get_ase_rgb_calibration()
    # Load the trajectory using read_trajectory_file() 
    trajectory_path = src_folder / "trajectory.csv"
    trajectory = read_trajectory_file(trajectory_path)

    num_frames = len(list(rgb_dir.glob("*.jpg")))
    Path('./debug').mkdir(exist_ok=True)
    for frame_idx in range(num_frames):   
        frame_id = str(frame_idx).zfill(7)
        rgb_path = rgb_dir / f"vignette{frame_id}.jpg"
        depth_path = depth_dir / f"depth{frame_id}.png"
        depth = Image.open(depth_path) # uint16        
        rgb = cv2.imread(str(rgb_path), cv2.IMREAD_UNCHANGED)
        depth = np.array(depth)
        scene_min_depth = min(depth.min(), scene_min_depth)
        inf_value = np.iinfo(np.array(depth).dtype).max
        depth[depth == inf_value] = 0 # consider it as invalid, inplace with 0
        T_world_from_device = trajectory["Ts_world_from_device"][frame_idx] # camera-to-world
        assert device.get_image_size()[0] == 704
        # https://facebookresearch.github.io/projectaria_tools/docs/data_utilities/advanced_code_snippets/image_utilities
        pinhole = calibration.get_linear_camera_calibration(
            # device.get_image_size()[0],
            # device.get_image_size()[1],
            # device.get_focal_lengths()[0],
            512,
            512,
            150,
            "camera-rgb",
            device.get_transform_device_camera() # important to get correct transformation matrix in pinhole_cw90
            )
        # rectify (undistort) the fisheye image to the linear pinhole model
        rectified_rgb = calibration.distort_by_calibration(np.array(rgb), pinhole, device, InterpolationMethod.BILINEAR)
        depth = np.array(depth).astype(np.float32)  # uint16 input does not work here; float32 does
        rectified_depth = calibration.distort_by_calibration(depth, pinhole, device)
        
        rotated_image = np.rot90(rectified_rgb, k=3)
        rotated_depth = np.rot90(rectified_depth, k=3)
        increase_light = True
        if increase_light:
            # crude brightness hack: lift the V channel by a constant offset
            # (uniform, so it does not undo the spatially varying vignette)
            rotated_image = cv2.cvtColor(rotated_image, cv2.COLOR_BGR2HSV)
            h, s, v = cv2.split(rotated_image)
            v1 = np.clip(cv2.add(v, 30), 0, 255)
            rotated_image = np.uint8(cv2.merge((h, s, v1)))
            rotated_image = cv2.cvtColor(rotated_image, cv2.COLOR_HSV2BGR)

        cv2.imwrite(str(Path(trgt_folder, "color", f"{frame_id}.jpg")), rotated_image)
        # TODO: check this
        plt.imsave(Path(f"./debug/debug_undistort_{frame_id}.png"), np.uint16(rotated_depth), cmap="plasma")
        # Get rotated image calibration
        pinhole_cw90 = calibration.rotate_camera_calib_cw90deg(pinhole)
        principal = pinhole_cw90.get_principal_point()
        cx, cy = principal[0], principal[1]
        focal_lengths = pinhole_cw90.get_focal_lengths()
        fx, fy = focal_lengths 
        K = np.array([ # camera-to-pixel
            [fx, 0, cx],
            [0, fy, cy],
            [0, 0, 1.0]])

        c2w = T_world_from_device 
        c2w_rotation = pinhole_cw90.get_transform_device_camera().to_matrix()
        c2w_final = c2w @ c2w_rotation   # right-matmul!
        cam2world = c2w_final
        # distance-to-depth
        rotated_depth = distance_to_depth(K, rotated_depth).reshape((rotated_depth.shape[0], rotated_depth.shape[1]))
        rotated_depth = np.uint16(rotated_depth)

        cv2.imwrite(str(Path(trgt_folder, "depth", f"{frame_id}.png")), rotated_depth) # cmap="gray", vmin=0, vmax=255
        scene_max_depth = max(scene_max_depth, float(depth.max()))
        Path(trgt_folder, "min_depth.txt").write_text(f"{scene_min_depth * 1.0 / 1000}")                
        Path(trgt_folder, "max_depth.txt").write_text(f"{scene_max_depth * 1.0 / 1000}")
        Path(trgt_folder, "intrinsic", "intrinsic_color.txt").write_text(f"""{K[0][0]} {K[0][1]} {K[0][2]} 0.00\n{K[1][0]} {K[1][1]} {K[1][2]} 0.00\n{K[2][0]} {K[2][1]} {K[2][2]} 0.00\n0.00 0.00 0.00 1.00""")
        Path(trgt_folder, "pose", f"{frame_id}.txt").write_text(f"""{cam2world[0, 0]} {cam2world[0, 1]} {cam2world[0, 2]} {cam2world[0, 3]}\n{cam2world[1, 0]} {cam2world[1, 1]} {cam2world[1, 2]} {cam2world[1, 3]}\n{cam2world[2, 0]} {cam2world[2, 1]} {cam2world[2, 2]} {cam2world[2, 3]}\n0.00 0.00 0.00 1.00""")

if __name__ == "__main__":    
    aria_export_to_scannet(scene_id=0)



captain-sysadmin commented May 29, 2024

Multiply it with the RGB image just as it's loaded:

        rgb = cv2.imread(str(rgb_path), cv2.IMREAD_UNCHANGED)
        anti_vignette = cv2.imread('path_to_anti_vignette.jpg')
        rgb = cv2.multiply(rgb, anti_vignette, scale=1.0)

That should flatten it out. (Again, I'm not sure of the rotation, so you might need to rotate the anti-vignette image left by 90 degrees for it to line up properly.)

You might end up with a white border instead of a black border, but that shouldn't be too hard to remove if needed (you can either crop, or change the anti-vignette image I provided).
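A rough end-to-end sketch of the idea (the file names, the rotation, and the way the gradient is turned into a gain are all assumptions, since the attached image is only a relative 0-255 gradient):

import cv2
import numpy as np

rgb = cv2.imread("rgb/vignette0000000.jpg").astype(np.float32) / 255.0
anti = cv2.imread("anti_vignette.png", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0

# anti = cv2.rotate(anti, cv2.ROTATE_90_COUNTERCLOCKWISE)  # uncomment if the orientation is off

# Treat the gradient as extra gain: ~1.0 in the centre, up to ~2.0 at the edges.
gain = 1.0 + anti
flattened = np.clip(rgb * gain[..., None], 0.0, 1.0)
cv2.imwrite("flattened.jpg", (flattened * 255).astype(np.uint8))

Working in float and clipping at the end avoids the uint8 saturation you would get from multiplying 8-bit values directly.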


thucz commented May 29, 2024

Many thanks!

thucz closed this as completed May 29, 2024

thucz commented May 30, 2024


Hi! It seems that the anti-vignette you provided is normalized (min value 0, max value 255, dtype np.uint8). When I use it to modify the RGB images, the colors overflow. Could you tell me how to recover the true gain values from it?

[image: debug]

import matplotlib.colors as colors
import matplotlib.pyplot as plt
import numpy as np
# import plotly.graph_objects as go
from pathlib import Path
import os
from PIL import Image
import cv2
import os, sys, json
scene_id = 0
vignette_path = Path("/group/40033/public_datasets/3d_datasets/aria/data/anti_vignette.png")
anti_vignette = cv2.imread(str(vignette_path)) # , cv2.IMREAD_UNCHANGED

src_folder = Path("/group/40033/public_datasets/3d_datasets/aria/ase_data/"+str(scene_id))
rgb_dir = src_folder / "rgb"

frame_idx = 0
frame_id = str(frame_idx).zfill(7)
rgb_path = rgb_dir / f"vignette{frame_id}.jpg"
rgb = cv2.imread(str(rgb_path), cv2.IMREAD_UNCHANGED)

rgb = cv2.multiply(rgb, anti_vignette, scale=1.0)
cv2.imwrite("./debug.jpg", rgb)

thucz reopened this May 30, 2024

thucz commented Jun 1, 2024

Hi @captain-sysadmin! It seems that the given anti-vignette is normalized (min value 0, max value 255, dtype np.uint8). If I multiply the RGB image by this anti-vignette, the values overflow (they exceed the [0, 255] range). Do you know how to resolve this?


thucz commented Jun 3, 2024

Hi! Sorry to bother you again. Do you have any clue about this problem? It means a lot to me.


captain-sysadmin commented Jun 3, 2024

Hello!

As you can see, because we are multiplying white (or very near white) with another colour other than black, we quickly overflow and clip.

You can try using cv2.addWeighted instead of multiply. So:

rgb = cv2.multiply(rgb, anti_vignette, scale=1.0)

becomes

alpha = 1.0
beta = 1.0
gamma = 1.0
rgb = cv2.addWeighted(rgb, alpha, anti_vignette, beta, gamma)

Changing alpha and beta lets you alter the mix between the two images; gamma only adds a constant offset, so lowering beta is the more effective way to control clipping.
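For reference, addWeighted computes rgb * alpha + anti_vignette * beta + gamma per pixel (saturating for uint8 inputs), so a possible starting point, with the values only a guess, would be:

alpha = 1.0   # keep the original image at full weight
beta = 0.4    # blend in only part of the gradient so the edges don't clip
gamma = 0.0   # no constant offset
rgb = cv2.addWeighted(rgb, alpha, anti_vignette, beta, gamma)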


thucz commented Jun 3, 2024

Thanks for your reply!
I can now get a normal-looking image.

[image: debug (1)]

But I still have a question: the code above re-weights all three channels (RGB) with the anti-vignette image, so the result looks too bright.

I tried re-weighting the image in HSV space, revising only the V channel. The brightness becomes normal, but I get a checkerboard artifact near the edge. Do you know how to alleviate this problem?

[image: debug (2)]

rgb = rgb.astype(np.float32) / 255            # go to 32-bit float on 0..1
anti_vignette = anti_vignette.astype(np.float32) / 255
new_rgb = cv2.cvtColor(rgb, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(new_rgb)
new_v = cv2.addWeighted(v, alpha, anti_vignette[:, :, 0], beta, gamma)  # alpha, beta, gamma as above
new_rgb = cv2.merge((h, s, new_v))
new_rgb = cv2.cvtColor(new_rgb, cv2.COLOR_HSV2BGR)
new_rgb = np.uint8(np.clip(new_rgb * 255, 0, 255))
rgb = new_rgb

@captain-sysadmin


Off the top of my head, you might be able to lower the V of the vignette before adding it to the RGB image?
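Something along these lines, as a drop-in change to your HSV snippet above (the 0.5 factor is only a guess to start from):

# Scale down the gradient's contribution to V before adding it, then clip so
# the merged image stays in [0, 1] before converting back to BGR.
new_v = cv2.addWeighted(v, alpha, anti_vignette[:, :, 0], 0.5 * beta, gamma)
new_v = np.clip(new_v, 0.0, 1.0)
new_rgb = cv2.cvtColor(cv2.merge((h, s, new_v)), cv2.COLOR_HSV2BGR)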


thucz commented Jun 4, 2024

Each channel of anti_vignette is equal (R = G = B), so the V channel (max of R, G, B) of the vignette is just anti_vignette[:, :, 0].

anti_vignette = np.uint8(anti_vignette.astype(np.float32) / 3.0)
rgb = cv2.addWeighted(rgb, alpha, anti_vignette, beta, gamma)

The result is still a little too white.

[image: debug (3)]
