read_sensor_poses function recorder_console.py sample #126

Open
serhan-gul opened this issue Dec 11, 2019 · 1 comment

@serhan-gul

Hi, I have some doubts about the camera_to_image matrix in the read_sensor_poses function of the sample app:
https://github.com/microsoft/HoloLensForCV/blob/master/Samples/py/recorder_console.py

Does camera_to_image here represent the projective transformation to the image domain? Why are there two ways of setting camera_to_image, one as an identity matrix and the other with two -1's on the diagonal? Specifically, I mean the following part:

            if identity_camera_to_image:
                camera_to_image = np.eye(4)
            else:
                camera_to_image = np.array(
                    [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])
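
To make the difference between the two branches concrete: the non-identity matrix is diag(1, -1, -1, 1), so it negates the y- and z-coordinates of a homogeneous point. The sketch below only shows that effect. The reading that it switches between a camera frame with y up and the view along -z and one with y down and the view along +z is an assumption, not something the sample confirms, and the point values are made up for illustration.

```python
import numpy as np

# The two candidate matrices from read_sensor_poses.
identity = np.eye(4)
flip_yz = np.array(
    [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])

# Hypothetical homogeneous point 2 m in front of a camera that looks
# along -z with y pointing up (an assumed convention, not from the sample).
point = np.array([0.5, 0.25, -2.0, 1.0])

flipped = flip_yz @ point  # y and z change sign, x and w are untouched
print(flipped)

# Applying the flip twice gives back the identity: it is its own inverse.
assert np.allclose(flip_yz @ flip_yz, identity)
```

With the identity branch the point would be left unchanged, so the flag appears to decide only whether this axis flip is applied.
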
@LisaVelten

Hi Serhan,

I do not understand the part you are wondering about either. I took a closer look at the Camera Intrinsics and I am confused about the following aspects:

The CameraProjectionTransform has the following parameters:
CamProjT = [ 2.43247     0           0   0
             0           4.31968     0   0
             0.0701278  -0.0997288  -1  -1
             0           0           0   0 ]

As far as I understand: these are the Camera Intrinsics for a mapping onto a Unit Plane ranging from -1 to 1 in X- and Y-direction and with a Z-coordinate (imaginary focal length) of -1.
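
One way to test this reading is to project a point with the matrix and rescale the result from the [-1, 1] unit plane to pixels. The sketch below assumes a row-vector convention (point times matrix), a camera looking along -z, and a 1280x720 image with the y-axis pointing down. These conventions and the helper names `project_to_unit_plane` and `unit_plane_to_pixels` are assumptions for illustration, not something confirmed by the sample or the documentation.

```python
# CameraProjectionTransform as quoted above, one row per line.
CAM_PROJ = [
    [2.43247,    0.0,        0.0,  0.0],
    [0.0,        4.31968,    0.0,  0.0],
    [0.0701278, -0.0997288, -1.0, -1.0],
    [0.0,        0.0,        0.0,  0.0],
]


def project_to_unit_plane(point, matrix=CAM_PROJ):
    """Row vector [x, y, z, 1] times matrix, then divide by w (assumed convention)."""
    x, y, z = point
    row = [x, y, z, 1.0]
    out = [sum(row[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return out[0] / out[3], out[1] / out[3]


def unit_plane_to_pixels(u, v, width=1280, height=720):
    """Map [-1, 1] x [-1, 1] to pixels, with the image y-axis pointing down."""
    return (u + 1.0) * width / 2.0, (1.0 - v) * height / 2.0


# A point on the optical axis, 1 m in front of the camera, should land
# on the principal point.
u, v = project_to_unit_plane((0.0, 0.0, -1.0))
cx, cy = unit_plane_to_pixels(u, v)
print(cx, cy)
```

Under these assumptions the result is about (595.12, 324.10), within a fraction of a pixel of the principal point (595.1027, 324.073578) reported by CameraIntrinsics. The focal lengths scale similarly: 2.43247 * 640 is about 1556.8 and 4.31968 * 360 is about 1555.1.
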

If I query the CameraIntrinsics Property of the VideoMediaFrame, I get the Intrinsic Parameters shown in the attached image:
[Image: HololensIntrinsics_1280x720]

It seems that the CameraProjectionTransform is the Camera To Image projection for a mapping onto a unit plane, and that the "UndistortedProjectionTransform", which is highlighted in blue in the attached image, is the equivalent of the CameraProjectionTransform for a mapping onto an Image Plane of size 1280x720. Here I get confused about the Z-coordinate, which should correspond to the focal length. The parameter fy is negative (-1555.67334). What is the reason for that?

According to https://docs.microsoft.com/de-de/windows/mixed-reality/locatable-camera, section "Distortion Error", the frames should already be undistorted, since the frames saved by the recorder tool are not "preview" frames, right? Microsoft says: "Because only the CameraIntrinsics are made available, applications must assume image frames represent a perfect pinhole camera." This does not make sense to me, because the attached picture shows that the RadialDistortion is also made available. Or do they mean something else by this?

Well, if we use the images recorded by the Recorder Tool, we need to use the UndistortedProjectionTransform and thus, we do not need to consider the radial distortion. Is that correct?

I wanted to compare the values of the CameraProjectionTransform and UndistortedProjectionTransform to try to make sense of the values. I did it as follows:

For the Unit Plane Mapping the x-coordinate of the Principal Point lies at:

  • cx: 0.0701278
    If the Principal Point lay exactly in the center of the image Plane, the x-coordinate would be at
  • cx_exact: 0
    Thus, there is an offset of 0.0701278, let's call it Delta_x_UnitPlane.

For the Mapping onto the image plane of size 1280x720 the x-coordinate of the Principal Point lies at:

  • cx: 595.1027
    If the Principal Point lay exactly in the center of the image Plane, the x-coordinate would be at
  • cx_exact: 640
    Thus, there is an offset of 44.8973, let's call it Delta_x_Regular.

Formula 1: Delta_x_UnitPlane/2.43247 = 0.02882987252
Formula 2: Delta_x_Regular/1557.021 = 0.02883538501

The same can be done for the y-parameters:
Formula 3: -0.0997288/4.31968 = -0.02308708052
Formula 4: 35.926422/-1555.67334 = -0.02309385851

So, the ratios of offset to focal length are approximately equal (Formulas 1 and 2, and Formulas 3 and 4), which is evidence that both mappings correspond to the same camera: one transform maps onto a unit plane and the other onto a plane of size 1280x720.
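
The four ratios can be recomputed directly from the quoted values. This is a minimal sketch; the variable names are illustrative, and the y-offset 35.926422 is taken here to be 395.926422 - 360, which is an inference from the numbers rather than something stated explicitly.

```python
# Values quoted in this comment.
unit_fx, unit_fy = 2.43247, 4.31968
unit_cx, unit_cy = 0.0701278, -0.0997288
pix_fx, pix_fy = 1557.021, -1555.67334

delta_x_regular = 640 - 595.1027    # offset from the image centre in x
delta_y_regular = 395.926422 - 360  # assumed origin of the 35.926422 offset

ratios = {
    "Formula 1": unit_cx / unit_fx,
    "Formula 2": delta_x_regular / pix_fx,
    "Formula 3": unit_cy / unit_fy,
    "Formula 4": delta_y_regular / pix_fy,
}
for name, value in ratios.items():
    print(f"{name}: {value:.6f}")
```

Each pair agrees to within about 1e-5, which supports the reading that both transforms describe the same camera at different scales.
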

What is unclear to me are the signs of the UndistortedProjectionTransform. It does not make sense to me that fy is negative.

Also I do not understand the relation between the "normal Principal Point", which lies at (595.1027, 324.073578) and the Value in the UndistortedProjectionTransform, where the Principal Point is at (595.1027, 395.926422).
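
One arithmetic observation about these two points: the y-values add up to the image height, so the second is the first measured from the opposite edge of a 720-pixel-high image. Whether this means the UndistortedProjectionTransform uses a flipped y-axis (which would also fit the negative fy) is an assumption, not something the quoted documentation confirms.

```python
height = 720
cy_intrinsics = 324.073578   # principal point y from CameraIntrinsics
cy_undistorted = 395.926422  # value in UndistortedProjectionTransform

# Measure the same image row from the opposite edge.
cy_flipped = height - cy_intrinsics
print(cy_flipped)
assert abs(cy_flipped - cy_undistorted) < 1e-6
```
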

The values for the Focal Length stay the same except for the minus sign.

I assume that all this has something to do with the fact that these parameters are for the undistorted images, while the other parameters are for distorted images. However, I cannot make sense of it.

Can someone help with this?

Thanks a lot,
Lisa
