# [Feature-Request] Automatically set color cam lens position at RGB-depth alignment #463
Comments
I get the pain/why you describe. It's a real thing; it points to a gap in scripter/dev knowledge, and you are trying to bridge that gap with an API. My casual feedback: the change you describe is a substantial break in existing functionality/behavior. It could break apps that already use the alignment and focusing APIs. It also presumes that everyone wants the focus at the same value as during calibration. I suspect most do (as you describe), but there are others that need the better auto focus and will accept a slight decline in alignment/depth accuracy. The OP change would need to handle the fallbacks you describe plus special cases like...
To me it feels fragile, full of special cases, and it breaks existing apps. I believe this is a good candidate for the "DepthAI SDK": a Python-only wrapper that hides the details/complexity of camera alignment. As a C++ dev, I'm happy with the SDK as it is today on this topic. Nothing in the OP helps me; instead it introduces behavior changes which cascade back to me as additional code changes and testing. If there really must be a new C++ API, I prefer something new like...
### Auto-scale RGB image with each focus value before RGB-D alignment

I'm working on integrating the OAK-D Pro camera into a robotic application, so it demands a lot in the accuracy domain, and I feel the pain of lacking accuracy when the auto-focus feature is in use. I have done some small experiments on the alignment and got better accuracy, at least in my test cases. I will explain my procedure up to the point where I figured out the RGB alignment error and a solution for it.

For my accuracy evaluation I made a laser-cut plate with locating holes for placing 3D-printed ArUco targets, and I mounted the camera on a mechanical stand so that I could take the ground truth from the 3D assembly file. I then wrote a script to run the camera, detect the ArUco targets, calculate their positions, and compare them to the ground truth. The image below shows the data I've collected: the red dots are target centers and the green dots are projected ground truth. The optimized data is derived from this, and the results are summarized in the charts below.

I used three approaches to calculate positions in this experiment: spatial calculation on the camera, and two spatial calculations on the host. The first host method uses the HFOV formula provided by the DepthAI examples, and the second uses re-projection through the camera intrinsic matrix (a sketch of both host methods follows this comment).

This is the stage at which I realized the RGB alignment error. I was skeptical about it, so I ran a further test to evaluate my theory about focus scaling: I captured targets at multiple focus values, distances, and positions in the image frame, with focus ranging from 100 to 255, and wrote a script to track the 2D center of the red dot across those focus values.
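For reference, here is a minimal host-side sketch of the two calculation methods, assuming depth values in millimeters; the intrinsic values below are placeholders, as real ones come from the device calibration:

```python
import numpy as np

# Placeholder intrinsics; on a real device they come from
# calibData.getCameraIntrinsics(...) for the aligned stream.
K = np.array([[860.0,   0.0, 640.0],
              [  0.0, 860.0, 360.0],
              [  0.0,   0.0,   1.0]])

def xyz_from_intrinsics(u, v, z_mm, K):
    """Back-project pixel (u, v) at depth z via the pinhole model."""
    x = (u - K[0, 2]) * z_mm / K[0, 0]
    y = (v - K[1, 2]) * z_mm / K[1, 1]
    return np.array([x, y, z_mm])

def xyz_from_hfov(u, v, z_mm, hfov_deg, width, height):
    """HFOV-based variant, in the spirit of the DepthAI host-side examples."""
    f = width / (2.0 * np.tan(np.deg2rad(hfov_deg) / 2.0))  # focal length, px
    x = (u - width / 2.0) * z_mm / f
    y = (v - height / 2.0) * z_mm / f
    return np.array([x, y, z_mm])
```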
After all of this, I re-ran my accuracy evaluation procedure, and here is the result. This is my first time commenting on GitHub, so please let me know if I said something wrong. I hope you will consider this and find something useful to apply to the camera firmware.
Great post! Thank you for all the excellent data here. We will digest and circle back tomorrow. Thanks again!
@tachiuhy thanks for your data on this topic. :-) I don't understand your first 5 charts about the extrinsic matrix. Are you exploring where the 3 OAK-D lenses are located (mono left, mono right, RGB)? How do these charts represent those three lenses? How does chart 1 lead to chart 2? What are charts 3, 4, and 5 showing for the three lenses? I remember a brief post from Lux, I think on Discord, about an idea of adjusting/scaling based on RGB focus values. There were open questions on consistency across lenses of the same model, across models, etc. It was interesting to see the loss of data on all sides when the scale+crop is applied in the last animated GIF. My instinct tells me that needs some thinking regarding side effects and repercussions.
Hi @diablodale, thanks for your attention to the post :) I was already thinking about explaining the charts, since they lack descriptions and use non-unique colors.

The 1st and 2nd charts are 3D scatter plots of the 3D positions of the ArUco targets used in my test, with axes for x, y, and z in millimeters. The blue dots are the ground truth and the orange dots are the data I collected from the RGB-depth alignment. (Note that I assumed the camera coordinate system sits at the front glass, on the optical axis of the RGB lens.) The 1st chart is the raw data, and you can see it is very far off from the ground truth, so I moved the whole dataset with a transformation matrix containing both translation and rotation parameters; that led to the 2nd chart (a sketch of fitting such a transform follows this comment).

For the next 3 charts, I subtracted the captured x, y, and z positions from the ground truth to get the deltas; these charts are labeled deltaX, deltaY, and deltaZ (millimeters). Each contains two lines: "delta before" is the offset before applying the transformation, and "delta after" is the offset after it.

About the post you mentioned: yes, I think so too, but there may be ways to calibrate this scale+crop factor at the factory-calibration level for each camera, like what you already do for distortion and intrinsic parameters. By the way, this scale+crop is not my final conclusion. I realized a focusing lens might not follow the pinhole intrinsic model; it should follow the thin-lens equation model instead, and I'm trying to run some tests on that. So the scaling I've done may only approximate the true correction and is not yet fully accurate.
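For anyone curious, a rigid translation+rotation like the one applied between chart 1 and chart 2 can be fitted with the Kabsch algorithm. This is my own minimal sketch, not the exact script used for the charts:

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t is closest to dst[i].

    src, dst: (N, 3) arrays of matched 3D points
    (captured positions vs. ground truth).
    """
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)   # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # avoid a reflection solution
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t
```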
Hi both, sorry about the delay. One of the engineers who works on this is on vacation this week, and the other is down with COVID (and a nasty fever). We'll hopefully circle back next week. Thanks, and sorry again about the delay.
Hello @tachiuhy, on the Chart 1 and Chart 2 tests.
On the focusing issue: in the image below, the curve becomes roughly flat around 120-150. That is because, at the distance you are holding the board, that is the point at which it is actually in focus; you only see drift when you set the focus to the extremes. Thoughts?
Hi @saching13, thanks for your advice! :)
Actually, the horizontal axis of the chart you snipped should run from 100 to 255, since that is the range I captured images over. So the region you saw being flat from 120-150 is actually 230-255, and during my test 255 is the value at which the lens stays focused on the object at the nearest distance.

With that said, it is very important to pay attention to the optical properties of the RGB lens, in this case its optical magnification; I have some images you can see below. As you said, if the object is really far from the camera (and near the center of the frame), the shift is small when changing focus. But if you move the object toward the edge of the frame, you will see a rapid change as you change the focus. I don't mean we have to use multiple focus values all the time, but some applications require different working ranges, and the current RGB alignment cannot handle that, I think. Later I will try to capture something at a further distance to prove the point :)

Take my robotics application as an example. In some tasks I have to switch the camera to "extended disparity" mode to get depth data at a closer range, and since I also have to detect objects in the RGB images, I have to change the focus nearer as well. Once the focus moves to a nearer distance (say around 200), the optical magnification of the RGB lens system increases and no longer matches the pre-defined intrinsic matrix, so it no longer matches the depth array either.
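The thin-lens model illustrates the size of this effect; the focal length and distances below are placeholders for illustration, not OAK-D specifications:

```python
# Thin lens: 1/f = 1/d_o + 1/d_i. Refocusing on a nearer object increases
# the image distance d_i, i.e. the effective focal length in pixels, which
# slightly magnifies the image.

def image_distance(f_mm, d_o_mm):
    # Solve 1/f = 1/d_o + 1/d_i for the image distance d_i
    return f_mm * d_o_mm / (d_o_mm - f_mm)

f = 4.8                                  # placeholder focal length, mm
near = image_distance(f, 300.0)          # focused at 30 cm
far = image_distance(f, 10_000.0)        # focused at 10 m
print(f"magnification ratio: {near / far:.4f}")  # about 1.016, a ~1.6% zoom
```

A 1-2% scale change is barely visible at the image center but shifts features near the frame edges by many pixels, which matches the edge-of-frame drift described above.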
I wonder if we could just apply a scale factor to the image in firmware to account for this. Thoughts, @saching13?
Thanks @tachiuhy. Yeah, for alignment to work properly at that close distance, we need to figure out a way to change the intrinsics as the lens position changes and then apply that to the warp engine so the depth changes accordingly.
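On the host, a first-order approximation of that idea is to scale the calibrated focal lengths by a per-lens-position factor before re-projecting. The linear scale model below is a hypothetical placeholder; the actual factor would have to be measured per camera, as in the experiments above:

```python
import numpy as np

def scaled_intrinsics(K, lens_pos, calib_lens_pos, scale_per_step=2e-4):
    """Approximate intrinsics at a new lens position.

    K: 3x3 pinhole matrix calibrated at calib_lens_pos.
    scale_per_step: hypothetical magnification change per lens-position step.
    """
    s = 1.0 + (lens_pos - calib_lens_pos) * scale_per_step
    K_new = K.copy()
    K_new[0, 0] *= s   # fx
    K_new[1, 1] *= s   # fy
    # Principal point is left unchanged: the zoom is about the optical axis.
    return K_new

# e.g. K_near = scaled_intrinsics(K, lens_pos=200, calib_lens_pos=130)
```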
Thanks for your help. I will keep using my method for my application for now. I hope my idea helps you somehow. P.S.: I love this new OAK-D Pro camera :) I hope to see it get more updates and go even further!
Start with Why: Many users report that object/feature localization isn't accurate and that it takes the background into account as well. One reason for this is the depth not being aligned with the color stream. We have recently added an optimization for RGB-depth alignment, so it should be enabled in all localization demos/examples.
Move to What: In firmware, automatically determine and set the lens position for the AF color camera whenever RGB-depth alignment is enabled. Currently, the majority of code examples just contain the code below, which isn't accurate for every device; for OAK-D-Lite it should IIRC be ~80, as the camera is different.
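The referenced snippet did not survive here; in the depthai-python RGB-depth alignment examples it is the hard-coded manual-focus line, along the lines of:

```python
# Lens position matching the OAK-D calibration, but not e.g. OAK-D-Lite
camRgb.initialControl.setManualFocus(130)
```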
How: We could use `getLensPosition`, but not all cameras have this info in EEPROM, so we would need to add some fallback logic, roughly 15 LoC per example. IMO we should do this in firmware: whenever RGB-depth alignment is enabled, we query `getLensPosition`, and if it's `None` we find out whether the device is an OAK-D (`lensPos=130`) or an OAK-D-Lite (`lensPos=~80`) and set the focus of the color camera to the determined value. This would remove quite some complexity (~15 LoC) from the user, which is what we are trying to achieve, so the API stays as easy to understand as possible.
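Until that lands in firmware, the host-side fallback could look roughly like this. This is a sketch only: treating a non-positive lens position as "unset" and checking the EEPROM board name are assumptions, not a confirmed contract:

```python
import depthai as dai

OAK_D_LENS_POS = 130       # calibration lens position per this issue
OAK_D_LITE_LENS_POS = 80   # approximate value per this issue

with dai.Device() as device:                  # connect without a pipeline
    calib = device.readCalibration()
    lens_pos = calib.getLensPosition(dai.CameraBoardSocket.RGB)
    if lens_pos <= 0:                         # assume non-positive == unset
        board = calib.getEepromData().boardName
        lens_pos = OAK_D_LITE_LENS_POS if "LITE" in board.upper() else OAK_D_LENS_POS

    pipeline = dai.Pipeline()
    camRgb = pipeline.create(dai.node.ColorCamera)
    camRgb.initialControl.setManualFocus(lens_pos)
    # ... stereo nodes with stereo.setDepthAlign(dai.CameraBoardSocket.RGB) ...
    device.startPipeline(pipeline)
```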