Projecting 3d cuboids to camera images #24
Hi, Pei |
Hi, However, this is not how I would like to draw cuboids. Assuming the top and bottom lines are diagonals of the top/bottom surfaces could make the visualizations wrong. Projected cuboids on the camera images should match the corners of the cuboids. In other words, I would like to use I attached images of what I have now (red boxes are Beunguk |
It looks like this is because your projection algorithm is not that accurate? Your projection should cover the same area as projected_lidar_labels. |
** We are planning to release a projection lib that takes rolling shutter effect into account depending on the community interest. No ETA yet. |
One more note: the existing dataset provides all the parameters a user needs to implement their own projection algorithm that takes the rolling shutter effect into account. |
Yes, projections should cover the same area as the projected lidar labels. I tried to figure out the problem, but I couldn't... In the document, What am I missing? What should I consider more to get the right results? Are there any sample codes or pseudocode for this? Thanks, |
As I mentioned above, one possibility is rolling shutter effect which might not be the problem in your case as the SDC seems to be moving slowly based on the camera images. One thing to note is the camera frame definition. The camera frame is placed in the center of the camera lens. The x-axis points down the lens barrel out of the lens. The z-axis points up. The y/z plane is parallel to the sensor plane. The coordinate system is right handed. So y/z plane of the camera sensor frame is parallel with image. When you do the intrinsic transform, do something like: |
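The code snippet referred to above ("do something like:") did not survive in this copy of the thread. As a minimal sketch of the first step it describes, transforming a point from the vehicle frame into the camera frame defined above, assuming `extrinsic_4x4` is the 4x4 row-major camera-to-vehicle transform from `CameraCalibration.extrinsic.transform`:

```python
import numpy as np

def vehicle_to_camera(point_vehicle, extrinsic_4x4):
    """Transform a 3D point from the vehicle frame into the camera frame.

    Assumes `extrinsic_4x4` is the camera-to-vehicle transform
    (CameraCalibration.extrinsic.transform reshaped to 4x4, row-major),
    so the vehicle-to-camera transform is its inverse.
    """
    vehicle_from_camera = np.asarray(extrinsic_4x4, dtype=float).reshape(4, 4)
    camera_from_vehicle = np.linalg.inv(vehicle_from_camera)
    p = np.append(np.asarray(point_vehicle, dtype=float), 1.0)  # homogeneous
    x, y, z, _ = camera_from_vehicle @ p
    # In this camera frame: +x points out of the lens, +z up, and the
    # y/z plane is parallel to the sensor, so a point in front of the
    # camera has x > 0.
    return np.array([x, y, z])
```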
@beunguk Cheers, |
@peisun1115 However, I get extremely large x and y values. Is there anything wrong with my method? Cheers |
Perhaps the cuboids ( That said, the differences in images above look to be a bit more than 100ms or so. Looks more like the camera-lidar matching in the actual @peisun1115 What would honestly be really helpful would be nanosecond timestamps for the lidar scans, camera images, and all labels (e.g. perhaps just add a timestamp member to It would also be nice if motorcycles were broken out of the vehicle class, which would be on par with other datasets. |
@pwais Our camera and lidar are well synchronized. We have statistics for all the data released. The maximum error, calculated from the timestamps at which lidar/camera scan the same physical point, is bounded at [-6ms, 7ms] with >99.999% confidence, which is super good (I doubt nuScenes or Lyft have anything close). This is calculated from real data in this dataset (you can calculate it too). Nanosecond timestamps are not the problem here. Our projected_lidar_labels are computed from the data available in this dataset. We did not use any other information. So it is likely that something is wrong in the projection code used. We are planning to release rolling shutter projection code but unfortunately we are still going through legal. If you have the projection code, I can help take a look. |
Can you copy-paste your code? I can help take a look. Very likely, you did not use the coordinates in the camera frame correctly. When you multiply with the intrinsic matrix, it needs to be something like: given a point in camera frame (x, y, z), // apply distortion model on u_d, v_d |
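The pseudocode referenced in the comment above was lost in this copy of the thread. A minimal sketch of the intrinsic step it describes might look like the following, with the distortion model left as a placeholder comment; the parameter names f_u, f_v, c_u, c_v follow the `CameraCalibration.intrinsic` naming:

```python
def camera_to_image(x, y, z, f_u, f_v, c_u, c_v):
    """Project a point in the camera frame to pixel coordinates (a sketch).

    Camera frame convention: +x out of the lens, +y left, +z up, so the
    normalized image coordinates are -y/x (right) and -z/x (down).
    """
    if x <= 0:
        return None  # point is behind the camera
    u_d = -y / x
    v_d = -z / x
    # apply distortion model on (u_d, v_d) here if needed
    u = f_u * u_d + c_u
    v = f_v * v_d + c_v
    return u, v
```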
Wow, the synchronization is really good! Is the camera/lidar frame delta time ([-6ms, 7ms]) recorded directly on the car, or somehow processed (e.g. manually aligned after the raw data is recorded)? |
@peisun1115 I have no doubt that your lidar-camera sync could be the best on the planet, but then what happened in @beunguk 's examples? I'm not hypothesizing a problem with lidar-camera sync, but rather than the @peisun1115 Today, how does one recover the timestamp of a camera image? Is it In order to avoid simple projection problems and other errors as demonstrated in this Github Issue, it sure would be helpful to have a means for exporting the Waymo data to a more well-established format like Kitti (see e.g. in nuscenes https://github.com/nutonomy/nuscenes-devkit/blob/master/python-sdk/nuscenes/scripts/export_kitti.py ). There's probably no expectation that might be made available any time soon, but even the tensorflow/models and TPU teams have made an effort to support MSCOCO format (despite certain drawbacks of that format). |
```python
FILENAME = '/content/waymo-od/tutorial/frames'
calibrations = sorted(frame.context.camera_calibrations, key=lambda c: c.name)
c = calibrations[0]  # only need the front camera
laser_labels = frame.laser_labels
```

@peisun1115 This is my code; I've kept only the slice that does the 3D coordinate projection. I do have many questions. Could you please help me check the code and give me a hint? Cheers |
@YanShuo1992 I think the problem is that your code interprets the camera intrinsics incorrectly. The documentation is unfortunately confusing here.
For the demo ( https://colab.research.google.com/github/waymo-research/waymo-open-dataset/blob/master/tutorial/tutorial.ipynb ), I see a
Be mindful of their comment: "Note that this intrinsic corresponds to the images after scaling" -- it appears the units of the parameters they provide are not in pixels. I'm not sure where to look up the image size. While the Waymo authors don't specify the distortion model exactly, I guess we're supposed to assume the one documented on OpenCV's website: https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html Be careful, though: some of the OpenCV code in the calibration module has slight differences between versions. Perhaps we'll get some unambiguous symbol grounding for Waymo's data format when they provide example code for projecting cuboid labels into the camera frame. I might be wrong in my own interpretation, but it's clear that |
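Under the assumption stated above (the OpenCV model), applying the radial/tangential coefficients to normalized image coordinates would look roughly like this. The coefficient ordering [k1, k2, p1, p2, k3] after [f_u, f_v, c_u, c_v] is my reading of the intrinsic vector in dataset.proto and should be verified:

```python
def distort(u_n, v_n, k1, k2, p1, p2, k3):
    """Apply the OpenCV radial/tangential distortion model to normalized
    image coordinates (u_n, v_n). Sketch based on the OpenCV docs linked
    above; verify the coefficient ordering against dataset.proto."""
    r2 = u_n * u_n + v_n * v_n
    radial = 1.0 + k1 * r2 + k2 * r2 * r2 + k3 * r2 ** 3
    u_d = u_n * radial + 2.0 * p1 * u_n * v_n + p2 * (r2 + 2.0 * u_n * u_n)
    v_d = v_n * radial + p1 * (r2 + 2.0 * v_n * v_n) + 2.0 * p2 * u_n * v_n
    return u_d, v_d
```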
Here is an example code snippet that projects a point to image without taking rolling shutter and distortion into account:
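The snippet itself did not survive in this copy of the thread. Below is a sketch of such a projection, composing the extrinsic and intrinsic steps discussed earlier in the thread, assuming `extrinsic_4x4` is the row-major camera-to-vehicle transform and `intrinsic_9` is [f_u, f_v, c_u, c_v, k1, k2, p1, p2, k3]:

```python
import numpy as np

def project_vehicle_point_to_image(point_vehicle, extrinsic_4x4, intrinsic_9):
    """Project a vehicle-frame 3D point to pixels, ignoring rolling
    shutter and lens distortion (a sketch, not the official library).

    Camera frame convention: +x out of the lens, +y left, +z up, so
    pixel coordinates come from -y/x (right) and -z/x (down).
    """
    f_u, f_v, c_u, c_v = intrinsic_9[:4]
    camera_from_vehicle = np.linalg.inv(np.reshape(extrinsic_4x4, (4, 4)))
    x, y, z, _ = camera_from_vehicle @ np.append(point_vehicle, 1.0)
    if x <= 0:
        return None  # point is behind the camera
    u = f_u * (-y / x) + c_u
    v = f_v * (-z / x) + c_v
    return u, v
```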
I have tested this code on the Waymo Open Dataset and it worked for the example flagged by @beunguk. We have documented the data format (including the distortion model) here: |
Thank you @peisun1115 . A few questions:
|
@peisun1115 @pwais |
@YanShuo1992
|
@YanShuo1992 Without including distortion, projecting points outside of the camera FOV is very likely to work poorly. That might be the reason. It would be helpful if you could copy/paste your projection results (only for objects inside the camera image's FOV). |
@peisun1115 The red boxes are the projected_lidar_labels in the frame. I find a margin between the objects and the projected_lidar_labels. |
You don't have a heading when getting points from the box? Is it 0? How fast is the SDC moving in the scene you selected (check the pose difference between frames)? If you do not want to worry about rolling shutter, focus on the front camera first, then worry about the side cameras. |
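One way to "check the pose difference", sketched under the assumption that `Frame.pose.transform` is the 4x4 row-major vehicle-to-global transform and frames arrive at roughly 10 Hz:

```python
import numpy as np

def sdc_speed_mps(pose_t0, pose_t1, dt_seconds):
    """Estimate the self-driving car's (SDC) speed from two consecutive
    frame poses, as suggested above.

    Assumes each pose is the 4x4 row-major vehicle-to-global transform
    (Frame.pose.transform) and dt_seconds is the time between frames
    (~0.1 s at 10 Hz).
    """
    t0 = np.reshape(pose_t0, (4, 4))[:3, 3]  # translation of earlier frame
    t1 = np.reshape(pose_t1, (4, 4))[:3, 3]  # translation of later frame
    return float(np.linalg.norm(t1 - t0) / dt_seconds)
```

A fast-moving SDC means more motion during the rolling-shutter readout, and hence larger projection error if rolling shutter is ignored.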
@peisun1115 Sorry, I don't know what SDC is either. I read the comments in dataset.proto. I assume it is a parameter related to velocity. Do larger values cause more rolling shutter, is that correct? How will the SDC affect the projection? |
You can try this function to get box corners. |
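The function itself did not survive in this copy of the thread. A sketch of a corner helper built from the `label.box` fields (center, dimensions, heading about +z) could look like:

```python
import numpy as np

def box_corners_3d(cx, cy, cz, length, width, height, heading):
    """Return the 8 corners (8x3 array) of an upright 3D box in the
    vehicle frame. A sketch of the kind of helper referred to above.

    Uses the label.box convention: length along x, width along y,
    height along z, heading is the rotation about +z.
    """
    x, y, z = length / 2.0, width / 2.0, height / 2.0
    # Corner offsets in the box frame: bottom face first, then top face.
    corners = np.array([[ x,  y, -z], [ x, -y, -z], [-x, -y, -z], [-x,  y, -z],
                        [ x,  y,  z], [ x, -y,  z], [-x, -y,  z], [-x,  y,  z]])
    c, s = np.cos(heading), np.sin(heading)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])  # yaw about +z
    return corners @ rot.T + np.array([cx, cy, cz])
```

Each corner can then be fed through the lidar-to-camera projection to draw the cuboid edges on the image.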
@peisun1115 Do only the files named xxx_with_camera_labels.tfrecord contain the corresponding image frames? |
All files contain images (if that is what you meant by 'image frame'). Files with suffix '_with_camera_labels.tfrecord' contain '2D' image labels labeled by humans. All files contain 2D labels projected from lidar (see projected_lidar_labels). |
@peisun1115 Thanks for your quick reply. |
@peisun1115 I think there might be an axis transform / extrinsic rotation missing from your demo code (or implicit and opaque), and perhaps that led to some confusion in @YanShuo1992 's images. In particular, it appears that the camera extrinsics do not account for e.g. the x-z axis swap and the y-axis inversion that are rolled into the extrinsics published in at least three other open datasets. If one wants to compute pixel-frame 2d point
BUT the above won't work because the extrinsic transform
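A sketch of the fixed axis permutation being described: Waymo's camera frame has +x out of the lens, +y left, +z up (+y left follows from the right-handed convention stated earlier in the thread, so treat it as an assumption), while the "conventional" CV camera frame has +x right, +y down, +z out of the lens:

```python
import numpy as np

# Change of basis from Waymo's camera axes to conventional CV camera axes:
#   x_cv = -y_waymo (right),  y_cv = -z_waymo (down),  z_cv = x_waymo (forward)
AXES_WAYMO_TO_CV = np.array([
    [0.0, -1.0, 0.0],
    [0.0,  0.0, -1.0],
    [1.0,  0.0, 0.0],
])

def waymo_cam_to_cv_cam(point_waymo_cam):
    """Re-express a camera-frame point in conventional CV camera axes."""
    return AXES_WAYMO_TO_CV @ np.asarray(point_waymo_cam, dtype=float)
```

This rotation is exactly what the extrinsics of datasets like KITTI fold into their projection matrices, which is why naively reusing a KITTI-style pipeline on the Waymo extrinsics fails.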
@peisun1115 It would really be helpful if the documentation of the calibration data in |
I can copy this to our code (dataset.proto) to clarify.
|
I missed this earlier, but the repo I previously linked to has solid support for projecting 3d cuboids to images:
Here's the repo: https://github.com/gdlg/simple-waymo-open-dataset-reader Additional notable features:
Examples using the code in that repo below. Note a couple of things:
|
I am closing this issue as I think we have clarified the lidar->camera projection and you guys are able to make it roughly work. As we mentioned in the thread, we are planning to release a projection lib but we don't have an ETA yet. Please stay tuned. |
Note: the Simple Waymo Open Dataset Reader doesn't check CRC codes (though that might be irrelevant given noise in the Waymo labels). However, if you do need a TFRecord reader that checks CRC codes, you might check out Apache Beam's reader here: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/tfrecordio.py#L67 The cited Beam code was authored by a Google engineer and Google Cloud Sales Engineers are pushing Beam pretty hard onto customers (e.g. Google Dataflow), so it's likely to stay updated. The Tensorflow-free options cited above allow:
|
Sorry, I have a question: when you visualize projected_lidar_labels, how do you confirm which camera image they correspond to? |
Hi,
I'm trying to draw 3D cuboids on the 2D camera images, so that all the corners appear in the camera images. I can see there is projected_lidar_labels, which gives 2D bounding boxes for the cuboids, but that is not what I want to draw. For example, https://www.nuscenes.org/public/images/road.jpg is the kind of projected image I would like to make. I tried to use CameraCalibration and laser_labels in Context to draw the cuboids, but I still couldn't figure it out. It seems like the cuboids don't align well with the objects in the camera images. Thanks.