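First, there are three coordinate systems: the world coordinate system (3D), the camera coordinate system (3D), and the image coordinate system (2D).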

$\mathbf{P}$: the projection from the camera coordinate system to the image coordinate system
$$
\mathbf{P}=\left[\begin{array}{ccc}f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1\end{array}\right]\left[\begin{array}{ccc|c}1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0\end{array}\right]
$$
This can also be written as
$$\mathbf{P}=\mathbf{K}[\mathbf{I} \mid \mathbf{0}]$$
where $\mathbf{K}$ is the camera intrinsic matrix, also called the calibration matrix, with focal length $f$ and principal point $(p_x, p_y)$:
$$
\mathbf{K}=\left[\begin{array}{ccc}f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1\end{array}\right]
$$
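As a minimal sketch of the projection $\mathbf{P}=\mathbf{K}[\mathbf{I} \mid \mathbf{0}]$, the snippet below projects one camera-frame point to pixel coordinates. The intrinsics (`f`, `p_x`, `p_y`) and the point `X_cam` are hypothetical values chosen only for illustration.

```python
import torch

# Hypothetical intrinsics: focal length f and principal point (p_x, p_y), in pixels
f, p_x, p_y = 1000.0, 320.0, 240.0
K = torch.tensor([[f, 0.0, p_x],
                  [0.0, f, p_y],
                  [0.0, 0.0, 1.0]])

# P = K [I | 0] is a 3x4 matrix
P = K @ torch.cat([torch.eye(3), torch.zeros(3, 1)], dim=1)

# Hypothetical point in camera coordinates, in homogeneous form (X, Y, Z, 1)
X_cam = torch.tensor([0.5, -0.2, 2.0, 1.0])

# Project, then divide by the third component to obtain pixel coordinates
x_h = P @ X_cam
u, v = (x_h[:2] / x_h[2]).tolist()
print(u, v)  # u = f*X/Z + p_x, v = f*Y/Z + p_y
```

The division by the third component is what turns the homogeneous result into 2D image coordinates.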

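The transformation from the world coordinate system to the camera coordinate system
If the camera coordinate system coincides with the world coordinate system, the transformation is the identity matrix. In general, a world point $\mathbf{X}_w$ maps to camera coordinates as
$\mathbf{X}_c = \mathbf{R} \cdot (\mathbf{X}_w - \mathbf{C})$
where $\mathbf{R}$ is the rotation matrix and $\mathbf{C}$ is the camera center expressed in world coordinates.
In homogeneous coordinates this becomes: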
$$
\left[\begin{array}{c}X_c \\ Y_c \\ Z_c \\ 1\end{array}\right]=\left[\begin{array}{cc}\mathbf{R} & -\mathbf{R C} \\ \mathbf{0} & 1\end{array}\right]\left[\begin{array}{c}X_w \\ Y_w \\ Z_w \\ 1\end{array}\right]
$$
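The homogeneous form can be checked numerically. The sketch below builds the 4x4 world-to-camera matrix from a hypothetical rotation `R` and camera center `C`, compares it against the direct formula $\mathbf{R}(\mathbf{X}_w - \mathbf{C})$, and also forms the inverse (camera-to-world) matrix, whose last column is the camera center. All numeric values and the names `w2c` and `c2w` are assumptions for illustration.

```python
import math
import torch

# Hypothetical pose: 90-degree rotation about the z-axis, camera center C in world coordinates
theta = math.pi / 2
R = torch.tensor([[math.cos(theta), -math.sin(theta), 0.0],
                  [math.sin(theta),  math.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
C = torch.tensor([10.0, 20.0, 30.0])

# World-to-camera matrix [[R, -RC], [0, 1]]
w2c = torch.eye(4)
w2c[:3, :3] = R
w2c[:3, 3] = -R @ C

# Hypothetical world point in homogeneous coordinates
X_w = torch.tensor([11.0, 22.0, 33.0, 1.0])
X_c = w2c @ X_w

# Same result from the non-homogeneous formula R (X_w - C)
direct = R @ (X_w[:3] - C)
print(torch.allclose(X_c[:3], direct, atol=1e-4))

# The inverse (camera-to-world) matrix is [[R^T, C], [0, 1]]
c2w = torch.eye(4)
c2w[:3, :3] = R.T
c2w[:3, 3] = C
print(torch.allclose(c2w @ w2c, torch.eye(4), atol=1e-4))
```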


In [13]:
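# Build a pixel-coordinate grid for a small W x H image
import torch

W, H = 6, 4
# indexing='ij' makes the (W, H) output layout explicit and avoids the deprecation warning
i, j = torch.meshgrid(torch.linspace(0, W-1, W), torch.linspace(0, H-1, H), indexing='ij')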


In [17]:
i

tensor([[0., 0., 0., 0.],
        [1., 1., 1., 1.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.]])

In [12]:
i.t()
j.t()

tensor([[0., 0., 0., 0., 0., 0.],
        [1., 1., 1., 1., 1., 1.],
        [2., 2., 2., 2., 2., 2.],
        [3., 3., 3., 3., 3., 3.]])
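
The transposes above turn the (W, H) grids into the (H, W) image layout. As a small sketch, assuming a PyTorch version whose `torch.meshgrid` accepts the `indexing` argument (1.10 or later), the same layout can be obtained directly with `indexing="xy"`. The names `xs`, `ys`, `same_i`, and `same_j` are illustrative.

```python
import torch

W, H = 6, 4
xs = torch.linspace(0, W - 1, W)  # column (x) coordinates
ys = torch.linspace(0, H - 1, H)  # row (y) coordinates

# 'ij' indexing gives shape (W, H); transposing yields the (H, W) image layout
i_ij, j_ij = torch.meshgrid(xs, ys, indexing="ij")

# 'xy' indexing gives shape (H, W) directly
i_xy, j_xy = torch.meshgrid(xs, ys, indexing="xy")

same_i = torch.equal(i_ij.t(), i_xy)
same_j = torch.equal(j_ij.t(), j_xy)
print(same_i, same_j)
```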

In [None]:
import torch
import numpy as np

def get_rays(H, W, K, c2w):
    # Create a meshgrid of pixel coordinates
    # indexing='ij' returns (W, H) grids; they are transposed to (H, W) below
    i, j = torch.meshgrid(torch.linspace(0, W-1, W), torch.linspace(0, H-1, H), indexing='ij')
    i = i.t()
    j = j.t()

    # Compute the ray directions in camera coordinates
    # Here the camera looks along -z with +y up, hence the sign flips on y and z
    dirs = torch.stack([(i - K[0, 2]) / K[0, 0], -(j - K[1, 2]) / K[1, 1], -torch.ones_like(i)], -1)

    # Rotate ray directions from camera frame to the world frame
    # dirs[..., np.newaxis, :] has shape (H, W, 1, 3); multiplying by the 3x3 rotation
    # and summing over the last axis computes R @ d for every pixel
    rays_d = torch.sum(dirs[..., np.newaxis, :] * c2w[:3, :3], -1)

    # Translate camera frame's origin to the world frame
    # This is the origin of all rays
    rays_o = c2w[:3, -1].expand(rays_d.shape)

    return rays_o, rays_d

# Example usage:
# Define hypothetical values for H, W, K, and c2w
H, W = 480, 640  # Image height and width
K = torch.tensor([[1000, 0, 320], [0, 1000, 240], [0, 0, 1]], dtype=torch.float32)  # 3x3 intrinsic camera matrix
# Camera-to-world matrix: identity rotation, camera center (10, 20, 30) in the last column
c2w = torch.tensor([[1, 0, 0, 10], [0, 1, 0, 20], [0, 0, 1, 30], [0, 0, 0, 1]], dtype=torch.float32)

# Get the rays
rays_o, rays_d = get_rays(H, W, K, c2w)

# Print the results
print("Rays origin (rays_o):")
print(rays_o)
print("\nRays direction (rays_d):")
print(rays_d)
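
The rotation step in `get_rays` multiplies every per-pixel direction by the 3x3 rotation block of `c2w`. The sketch below, using random stand-in data rather than real camera values, checks that the broadcast-and-sum form gives the same result as a plain matrix product and as `torch.einsum`. The names `broadcast`, `matmul`, and `einsum` are illustrative.

```python
import torch

torch.manual_seed(0)
H, W = 4, 6
dirs = torch.randn(H, W, 3)  # stand-in for per-pixel camera-frame directions
R = torch.randn(3, 3)        # stand-in for the 3x3 block c2w[:3, :3]

# Broadcast form: (H, W, 1, 3) * (3, 3) -> (H, W, 3, 3), then sum over the last axis
broadcast = torch.sum(dirs[..., None, :] * R, dim=-1)

# Equivalent matrix product: each row vector d becomes (R @ d) via d @ R^T
matmul = dirs @ R.T

# Equivalent einsum: out[h, w, i] = sum_j R[i, j] * dirs[h, w, j]
einsum = torch.einsum("ij,hwj->hwi", R, dirs)

print(torch.allclose(broadcast, matmul, atol=1e-5))
print(torch.allclose(broadcast, einsum, atol=1e-5))
```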