correct 4x4 projection matrix? #5

iperov · 2024-05-31T12:04:46Z

currently projection matrix is 3x3

np.array([ [1015.0, 0, 0], 
           [0, 1015.0, 0],
           [112.0, 112.0, 1] ], dtype=np.float32)

and when the point is transformed, Z is discarded

pts = pts[..., :2] / pts[..., 2:3]

which is not suitable for standard graphics transformations like in OpenGL

can you provide the correct 4x4 projection matrix in order to transform homogeneous 3D points (x,y,z,1.0)?
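For reference, here is a minimal NumPy sketch of how the 3x3 matrix quoted above would be applied. The principal point (112, 112) sits in the last row, which suggests a row-vector convention (`pts @ proj`); that convention, the helper name `project_3x3`, and the sample points are assumptions for illustration, not code taken from the repo.

```python
import numpy as np

# 3x3 matrix as quoted in this issue.
persc_proj = np.array([[1015.0, 0, 0],
                       [0, 1015.0, 0],
                       [112.0, 112.0, 1]], dtype=np.float32)


def project_3x3(pts_cam):
    """Project (..., 3) camera-space points with positive depth z to pixels.

    ASSUMPTION: points are row vectors, so the matrix is applied as pts @ proj.
    """
    pts = pts_cam @ persc_proj            # (1015*x + 112*z, 1015*y + 112*z, z)
    return pts[..., :2] / pts[..., 2:3]   # divide by z; depth is discarded


# Hypothetical sample points at depth 10 in front of the camera.
pts_cam = np.array([[0.0, 0.0, 10.0],
                    [1.0, -0.5, 10.0]], dtype=np.float32)
uv = project_3x3(pts_cam)
print(uv)
```

A point on the optical axis lands on the image center (112, 112), which is a quick way to check the convention.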

wang-zidu · 2024-06-03T02:41:28Z

Thank you for your support, this issue is valuable.

The parameters of the perspective projection camera we use are: focal=1015, znear=5, zfar=15. The camera is located at (0, 0, 10) and faces the negative direction of the z-axis. The rendered image size is 224×224.
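As a small sketch of how these parameters relate to a field of view, the standard pinhole relation tan(fov/2) = (size/2) / focal can be evaluated directly. This assumes the focal length is expressed in pixels and the field of view spans the full 224-pixel image; the variable names are illustrative.

```python
import math

FOCAL = 1015.0  # focal length in pixels
SIZE = 224      # rendered image width and height in pixels

# Pinhole relation: tan(fov / 2) = (SIZE / 2) / FOCAL
fov_rad = 2.0 * math.atan((SIZE / 2.0) / FOCAL)
fov_deg = math.degrees(fov_rad)

# Scale factor that appears in OpenGL-style projection matrices.
inv_tan_half_fov = 1.0 / math.tan(fov_rad / 2.0)

print(f"fov = {fov_deg:.2f} degrees")
print(f"1 / tan(fov / 2) = {inv_tan_half_fov:.4f}")
```

With these values the field of view comes out at roughly 12.6 degrees, and 1 / tan(fov/2) equals 1015/112.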

I completely agree with what you said about the 4x4 projection matrix being fundamental to some rendering calculations. I believe you can find the process you mentioned in /util/nv_diffrast.py, which is the calculation method for transforming homogeneous 3D points (x, y, z, 1).

However, in model/recon.py, self.persc_proj is only used to obtain the x and y coordinates, without involving the rendering process, so z is not needed (computing it would be redundant). You can verify that the v2d obtained using self.persc_proj is consistent with the first two dimensions of the screen coordinates obtained by transforming vertex_ndc in /util/nv_diffrast.py. In short, self.persc_proj is just a way to slightly reduce unnecessary calculations; a similar approach is common in HRN, Deep3D, etc. Hope this helps.

iperov · 2024-06-03T04:34:38Z

I tried various 4x4 matrices from these values (focal=1015, znear=5, zfar=15), but none of them match the same result as your code.

> reduce unnecessary calculations

for the processor it's nanoseconds, for the programmer and those who will use the repo it's a headache.

wang-zidu · 2024-06-03T09:43:03Z

I believe you can directly refer to /util/nv_diffrast.py to get the result you want. The camera parameters have also been verified by using PyTorch3D. Let me explain further:

Starting from line 442 of model/recon.py, we assume the homogeneous 3D point v = (x,y,z,1) is one of the points in v3d. First, we calculate the perspective projection matrix:

$$ \text{Projection Matrix} = \begin{bmatrix} \frac{1}{\tan(fov/2)} & 0 & 0 & 0 \\ 0 & \frac{1}{\tan(fov/2)} & 0 & 0 \\ 0 & 0 & \frac{znear + zfar}{znear - zfar} & \frac{2 \cdot znear \cdot zfar}{znear - zfar} \\ 0 & 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1015}{112} & 0 & 0 & 0 \\ 0 & \frac{1015}{112} & 0 & 0 \\ 0 & 0 & -2 & -15 \\ 0 & 0 & -1 & 0 \end{bmatrix} $$
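The matrix above can be built in NumPy as follows. This is a sketch under the stated parameters (focal=1015, image size 224, znear=5, zfar=15); the function name `perspective_matrix` is hypothetical and is not an identifier from the repo.

```python
import math
import numpy as np


def perspective_matrix(fov_rad, znear, zfar):
    """OpenGL-style perspective matrix for column vectors (square image)."""
    s = 1.0 / math.tan(fov_rad / 2.0)
    return np.array([
        [s,   0.0, 0.0, 0.0],
        [0.0, s,   0.0, 0.0],
        [0.0, 0.0, (znear + zfar) / (znear - zfar),
                   2.0 * znear * zfar / (znear - zfar)],
        [0.0, 0.0, -1.0, 0.0],
    ], dtype=np.float64)


# fov derived from focal=1015 px and a 224 px image: tan(fov/2) = 112 / 1015
fov_rad = 2.0 * math.atan(112.0 / 1015.0)
P = perspective_matrix(fov_rad, znear=5.0, zfar=15.0)
print(P)
```

The resulting entries match the numeric matrix on the right-hand side of the equation: 1015/112 on the diagonal, then -2 and -15 in the third row.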

Invert the z direction of v to correspond to the camera's coordinate system. (In /util/nv_diffrast.py, you might also find that the y direction of v is inverted as well, which is done to adapt to the renderer.) So we get (x, y, -z, 1), and then perform the perspective projection to obtain:

$$ v' = \begin{bmatrix} {\frac{{1015}}{{112}}} & 0 & 0 & 0 \\ 0 & {\frac{{1015}}{{112}}} & 0 & 0 \\ 0 & 0 & -2 & -15 \\ 0 & 0 & -1 & 0 \end{bmatrix} \cdot \begin{bmatrix} x\\ y\\ - z\\ 1 \end{bmatrix}=\begin{bmatrix} {\frac{{1015}}{{112}}}x\\ {\frac{{1015}}{{112}}}y\\ 2z -15\\ z \end{bmatrix} $$

Homogenize the coordinates (divide every component by the fourth component, which here equals z):

$$v'' =\begin{bmatrix} {\frac{{1015x}}{{112z}}}\\ {\frac{{1015y}}{{112z}}}\\ {2 - \frac{{15}}{z}}\\ 1 \end{bmatrix} $$
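As a quick sanity check on the third component (a worked example added for illustration), substituting the near and far plane distances z = znear = 5 and z = zfar = 15 gives the two ends of the depth range:

$$ v''_z \big|_{z = 5} = 2 - \frac{15}{5} = -1, \qquad v''_z \big|_{z = 15} = 2 - \frac{15}{15} = 1 $$

So depths between the two clipping planes map to [-1, 1], as expected for an OpenGL-style projection.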

Finally, the coordinate in NDC space is converted to the image plane:

$$ v_{image} = \begin{bmatrix} \frac{v''_x + 1}{2} \cdot 224 \\ \frac{v''_y + 1}{2} \cdot 224 \end{bmatrix} = \begin{bmatrix} 1015\frac{x}{z} + 112 \\ 1015\frac{y}{z} + 112 \end{bmatrix} $$

You will find that this result is consistent with the result obtained using self.persc_proj and homogenizing in model/recon.py.
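The equivalence can also be checked numerically. The sketch below follows the steps described in this reply (negate z, multiply by the 4x4 matrix, divide by w, map NDC to a 224x224 image) and compares against the 3x3 shortcut. The helper names, the random test points, and the row-vector convention for the 3x3 matrix are assumptions; this is not code from model/recon.py or /util/nv_diffrast.py.

```python
import numpy as np

FOCAL, SIZE = 1015.0, 224

# 4x4 matrix from the derivation (column-vector convention).
P = np.array([[FOCAL / (SIZE / 2), 0, 0, 0],
              [0, FOCAL / (SIZE / 2), 0, 0],
              [0, 0, -2.0, -15.0],
              [0, 0, -1.0, 0]])

# 3x3 matrix from the issue (ASSUMPTION: row-vector convention, pts @ K).
K = np.array([[FOCAL, 0, 0],
              [0, FOCAL, 0],
              [SIZE / 2, SIZE / 2, 1.0]])


def project_4x4(pts_cam):
    x, y, z = pts_cam[..., 0], pts_cam[..., 1], pts_cam[..., 2]
    v = np.stack([x, y, -z, np.ones_like(z)], axis=-1)  # invert z
    clip = v @ P.T                                      # same as P @ v per point
    ndc = clip[..., :3] / clip[..., 3:4]                # perspective divide
    return (ndc[..., :2] + 1.0) / 2.0 * SIZE            # NDC -> image plane


def project_3x3(pts_cam):
    pts = pts_cam @ K
    return pts[..., :2] / pts[..., 2:3]


# Random camera-space points with depth between znear=5 and zfar=15.
rng = np.random.default_rng(0)
pts_cam = np.concatenate([rng.uniform(-1, 1, (100, 2)),
                          rng.uniform(5, 15, (100, 1))], axis=-1)

uv_4x4 = project_4x4(pts_cam)
uv_3x3 = project_3x3(pts_cam)
print(np.abs(uv_4x4 - uv_3x3).max())
```

If a 4x4 pipeline disagrees with the 3x3 result, the z negation and the row-versus-column vector convention are the first things to check.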

I believe the above process is detailed enough and hope it helps you. Your incorrect results may be caused by axis inversions (such as the common y-flip in images), which arise because different rendering methods define their coordinate systems differently. Usually, you just need to visualize the results and check the steps where an inversion is needed.

ElliotQi · 2024-06-26T05:48:18Z

@wang-zidu Hi, thanks for your excellent work. I wonder if this work supports CPU inference. I'm trying to use cpu device but get an error with nvdiffrast (RasterizeCudaContext could not use cpu device)

wang-zidu · 2024-06-26T08:14:06Z

Thank you for your feedback. nvdiffrast can be replaced with a simpler renderer such as face3d. You can try replacing it, or if you only need the mesh results, you can simply remove the corresponding nvdiffrast content. If I have time, I will address this request as soon as possible and let you know.

wang-zidu · 2024-08-01T08:43:14Z

Thank you for your support and patience. We have added a new fast CPU renderer (based on face3d). The entire work now supports CPU inference (using RetinaFace for the face box).

emlcpfx · 2024-08-06T22:42:08Z

> I tried various 4x4 matrices from these values (focal=1015, znear=5, zfar=15), but none of them match the same result as your code.
>
> > reduce unnecessary calculations
>
> for the processor it's nanoseconds, for the programmer and those who will use the repo it's a headache.

Did you figure out how to ‘export’ the camera in a way that can be imported to a 3D DCC?
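One possible starting point, sketched from the camera parameters stated earlier in this thread (focal=1015 px, 224x224 image, camera at (0, 0, 10) facing the negative z-axis, znear=5, zfar=15): most DCC tools want a focal length in millimetres for a given sensor size, which follows from the pinhole relation. The 36 mm sensor width and the dictionary layout are assumptions for illustration, not an export format of any particular tool.

```python
import math

FOCAL_PX = 1015.0                   # from this thread
IMAGE_SIZE_PX = 224                 # from this thread
CAMERA_POSITION = (0.0, 0.0, 10.0)  # from this thread, looking down -z
ZNEAR, ZFAR = 5.0, 15.0             # from this thread

SENSOR_WIDTH_MM = 36.0  # ASSUMPTION: full-frame sensor width; adjust per tool

# Same field of view expressed as a focal length on the assumed sensor.
focal_mm = FOCAL_PX * SENSOR_WIDTH_MM / IMAGE_SIZE_PX
fov_deg = math.degrees(2.0 * math.atan(IMAGE_SIZE_PX / (2.0 * FOCAL_PX)))

# Hypothetical, tool-agnostic description of the camera.
camera = {
    "position": CAMERA_POSITION,
    "look_direction": (0.0, 0.0, -1.0),
    "sensor_width_mm": SENSOR_WIDTH_MM,
    "focal_length_mm": focal_mm,
    "fov_degrees": fov_deg,
    "clip_start": ZNEAR,
    "clip_end": ZFAR,
    "resolution": (IMAGE_SIZE_PX, IMAGE_SIZE_PX),
}
print(camera)
```

Axis conventions (y-up versus z-up, and the image y-flip mentioned above) differ between tools, so the position and look direction may need to be remapped on import.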

jimydavis mentioned this issue Aug 19, 2024

Question on Camera Position / Focal Length #15

Closed
