question about camera intrinsics #7
Hello! For the camera intrinsics, yes - the camera intrinsics differ between the data in the old and new versions because I used different cameras. The old version uses a sequence of depth frames I captured with a RealSense F200, while the new version uses a sequence of depth frames from the Microsoft 7-Scenes dataset, which was captured with a Kinect. Camera intrinsics are usually unique to each camera. However, the default Kinect intrinsics - principal point (320,240) and focal lengths (585,585) - are usually a good enough approximation, so you can start with those. But if you care about achieving maximum accuracy for your 3D data (which is important in some applications), you can run this calibration procedure with OpenNI and a checkerboard to estimate more accurate intrinsics for your camera.

The camera pose (extrinsics) is a rigid transformation (consisting of a rotation matrix and a translation vector) that describes the camera's location in the world. Here is a pretty solid introduction to extrinsic camera matrices. Poses are usually estimated by SLAM, SfM, or other camera localization and reconstruction algorithms.

When changing the number of frames from 50 to 1, the voxel volume is still generated. The reason you're not seeing a point cloud is the thresholds in the function call:

```cpp
SaveVoxelGrid2SurfacePointCloud("tsdf.ply", voxel_grid_dim_x, voxel_grid_dim_y, voxel_grid_dim_z,
                                voxel_size, voxel_grid_origin_x, voxel_grid_origin_y, voxel_grid_origin_z,
                                voxel_grid_TSDF, voxel_grid_weight, 0.2f, 0.0f);
```

With the weight threshold (the last argument) lowered to 0.0f, voxels observed by only a single frame are kept, and the code will produce a point cloud visualization of the voxel volume from just one depth frame.
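As a concrete reference, the default Kinect intrinsics mentioned above can be written as a row-major 3x3 matrix. This is only a sketch: the array name `cam_K` is illustrative, and the values are the approximate Kinect defaults, not a calibration of your specific device.

```cpp
// Approximate default intrinsics for a Kinect depth camera (640x480 frames):
// focal lengths fx = fy = 585, principal point (cx, cy) = (320, 240).
// Pinhole model, row-major layout:
//   [ fx   0  cx ]
//   [  0  fy  cy ]
//   [  0   0   1 ]
float cam_K[3 * 3] = {585.0f,   0.0f, 320.0f,
                        0.0f, 585.0f, 240.0f,
                        0.0f,   0.0f,   1.0f};
```

Running the OpenNI checkerboard calibration mentioned above would replace these defaults with values measured for your particular camera.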
Dear @andyzeng, thank you for your answer. In my own project, the depth image looks like this: I looked through the information you told me and finally got camera intrinsic and extrinsic matrices like these: Thank you
The format of the depth image is likely the problem. You will have to modify the function
Dear @andyzeng, thank you for your kind answer. Thank you
Re voxel grid origin: if you imagine your TSDF volume as a 3D box that is axis-aligned in 3D camera space, the voxel grid origin defines the location of the origin corner of the volume. By setting the voxel grid origin to (0,0,0) in camera coordinates, you're translating the 3D box so that the origin corner lies on the camera location, hence giving you different volumetric data. Moving the voxel grid origin will move your 3D box.

Re trunc_margin: a regular distance field would have values from 0 (close to the surface) all the way to infinity (far from the surface). trunc_margin defines where to cut off the distance field (hence the term "truncated") so that you don't integrate distance values too far away from the surface. For more information on volumetric integration, I would recommend taking a look at this.

Re TSDF volume resolution: for your hand example, project the depth data of the hand into camera coordinates, and find a reasonable location to define the voxel grid origin. You can change
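The truncation and per-voxel weighted averaging described above can be sketched as follows. This is a minimal illustration of the standard volumetric-fusion update (Curless & Levoy style), with illustrative function names, not the exact code in this repository:

```cpp
#include <algorithm>

// Truncated signed distance: sdf is the signed distance from a voxel to the
// nearest observed surface (positive in front of it, negative behind it).
// trunc_margin cuts the field off so that values far from the surface are
// clamped; the result is normalized to [-1, 1].
float TruncatedSDF(float sdf, float trunc_margin) {
  return std::max(-1.0f, std::min(1.0f, sdf / trunc_margin));
}

// Fold a new observation into a voxel as a weighted running average, which is
// what "integrating" a depth frame into the TSDF volume means.
void IntegrateVoxel(float& tsdf, float& weight,
                    float new_tsdf, float obs_weight = 1.0f) {
  tsdf = (tsdf * weight + new_tsdf * obs_weight) / (weight + obs_weight);
  weight += obs_weight;
}
```

With only one frame integrated, every observed voxel has weight 1, which is why a weight threshold above 1 would filter out the entire single-frame volume.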
Dear @andyzeng, thank you for your kindness. Thank you
Re camera coordinates: in order to create a 3D point cloud from a depth image (like this piece of code), you typically use the camera intrinsics to project the depth values into a 3D coordinate space. This 3D coordinate space is called the camera coordinate space.

Now back to the voxel grid origin: keep in mind that the voxel grid origin is not the center of the voxel volume. If your TSDF voxel grid is 60x60x60 voxels (where voxel coordinates range from (0,0,0) to (60,60,60)), the voxel grid origin is the 3D location of voxel (0,0,0) in camera coordinate space. In other words, it is a "corner" of the voxel grid, not the "middle" of it.

For your problem: yes, you will need to set the right voxel grid origin. After using the camera intrinsics to project the depth data of the hand into camera coordinates, you should have a 3D camera coordinate location for each pixel. Your voxel grid will be a 3D bounding box around the 3D locations of the pixels that represent the hand. Find the smallest 3D location among all of those 3D points, and that will be your desired voxel grid origin.

With that said, I highly recommend searching online for some academic resources that can introduce you to the basic concepts of 3D vision (such as this or this). The computer vision course that I TA'd at Princeton also has some nice introductory content for 3D vision (see course slides here). You will need a good understanding of these topics before understanding what the code in this repository does.
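The procedure above - back-project each depth pixel with the intrinsics, then take the minimum corner of the resulting points - can be sketched as below. Helper names are illustrative, and the sketch assumes depth in meters with 0 marking invalid pixels:

```cpp
#include <algorithm>
#include <vector>

struct Point3 { float x, y, z; };

// Back-project a row-major depth image into 3D camera coordinates using the
// pinhole model:  X = (u - cx) * d / fx,  Y = (v - cy) * d / fy,  Z = d.
std::vector<Point3> DepthToPointCloud(const std::vector<float>& depth,
                                      int width, int height,
                                      float fx, float fy, float cx, float cy) {
  std::vector<Point3> points;
  for (int v = 0; v < height; ++v)
    for (int u = 0; u < width; ++u) {
      float d = depth[v * width + u];
      if (d <= 0.0f) continue;  // skip invalid readings
      points.push_back({(u - cx) * d / fx, (v - cy) * d / fy, d});
    }
  return points;
}

// The voxel grid origin is the minimum corner of the axis-aligned bounding
// box around the points you want inside the volume.
Point3 VoxelGridOrigin(const std::vector<Point3>& points) {
  Point3 o = points.front();
  for (const Point3& p : points) {
    o.x = std::min(o.x, p.x);
    o.y = std::min(o.y, p.y);
    o.z = std::min(o.z, p.z);
  }
  return o;
}
```

In practice you would run `DepthToPointCloud` only on the pixels belonging to the hand, then use the returned minimum corner (perhaps padded by a small margin) as the voxel grid origin.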
Dear @andyzeng
Thank you for sharing your work.
I want to generate my own 3D volume data by changing some of your source code. However, I am confused by some of the parameters in this work. It would be appreciated if you could give me some help.
First is the camera intrinsics. I did some research and found that it is a 3x3 matrix. Then I looked through your work and found that the old and new versions use different camera intrinsic matrices. Is that because you used different cameras for the two versions? In my own project, I'm using a Kinect as the depth camera. Can you tell me how to set the camera intrinsics in my work?
The second is about the camera pose. It seems to be a 4x4 matrix that changes with every single depth frame. Can you tell me what this parameter is and how to set it?
The last is the input number. In your demo, you set the input number to 50. I changed the number to regenerate a voxel volume. However, when I changed the number to 1, the voxel volume turned into nothing. I am wondering whether I can generate a voxel volume from a single depth image, and whether changing the input number to 1 is the right way to do that.
Thank you
Sincerely yours
Tony