
[Feature-Request] Automatically set color cam lens position at RGB-depth alignment #463

Open
Erol444 opened this issue Apr 26, 2022 · 12 comments

@Erol444
Member

Erol444 commented Apr 26, 2022

Start with Why: Many users report that object/feature localization isn't accurate and that it takes the background into account as well. One reason for this is that the depth isn't aligned with the color stream. We have recently added an optimization for RGB-depth alignment, so it should be enabled in all localization demos/examples.

Move to What: In firmware, automatically determine and set the lens position for the AF color camera when RGB-depth alignment is enabled. Currently, we just have the code below in the majority of code examples, which isn't accurate. For OAK-D-Lite it should be ~80 IIRC, as the camera module is different.

# For now, RGB needs fixed focus to properly align with depth.
# This value was used during calibration
camRgb.initialControl.setManualFocus(130)

How: We could use getLensPosition, but not all cameras have this info in EEPROM, so we would need to add some fallback logic, which is ~15 LoC for each example. IMO we should do this in firmware: whenever RGB-depth alignment is enabled, we query getLensPosition, and if it's None we determine whether the device is an OAK-D (lensPos=130) or an OAK-D-Lite (lensPos=~80) and set the focus of the color camera to that value.
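For illustration, roughly the kind of host-side fallback each example would otherwise need (a sketch only; the exact "no value stored" sentinel and the availability of getDeviceName() depend on the depthai version, and the OAK-D-Lite name check is an assumption):

```python
import depthai as dai

# Sketch of the host-side fallback described above, not the proposed firmware change.
with dai.Device() as device:
    calib = device.readCalibration()
    lens_pos = calib.getLensPosition(dai.CameraBoardSocket.RGB)
    if not lens_pos or lens_pos <= 0:      # nothing usable stored in EEPROM
        name = device.getDeviceName()      # assumed to contain e.g. "LITE" for OAK-D-Lite
        lens_pos = 80 if "LITE" in name.upper() else 130

    pipeline = dai.Pipeline()
    camRgb = pipeline.create(dai.node.ColorCamera)
    camRgb.initialControl.setManualFocus(lens_pos)
    # ... build the rest of the pipeline, then device.startPipeline(pipeline)
```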

This would remove quite a bit of complexity (~15 LoC) from the user's code, which is what we are trying to achieve: an API that is as easy to understand as possible.

@diablodale
Contributor

I get the pain/why you describe. It's a real thing and points to a lack of scripter/dev knowledge, and you are trying to bridge that knowledge gap with an API.

My casual feedback... the change you describe is a substantial break in existing functionality/behavior. It could break apps that already use the alignment and focusing APIs. It also presumes that everyone wants the focus set to the same value as during calibration. I suspect most do (as you describe), but there are others that need the better/auto focus and accept a slight decline in alignment/depth accuracy.

The OP change would need to handle the fallbacks you describe, plus special cases like...

rgb.setAutoFocusMode(dai::CameraControl::AutoFocusMode::OFF);
rgb.setManualFocus(45);
stereoDepth.setDepthAlign(dai::CameraBoardSocket::RGB);
// What is the focus value here? Is it 45? Or is it the value you propose from EEPROM/fallback?

To me it feels fragile, full of special cases, and it breaks existing apps.

I believe this is a good candidate for the "DepthAI SDK". A python-only wrapper of functionality hiding the details/complexity of camera alignment.

As a C++ dev, I'm happy with the SDK as it is today on this topic. Nothing in the OP helps me and instead introduces behavior changes which cascade back to me as additional code changes and testing.

If there really must be a new C++ API, I prefer something new like setDepthAlignSetCalibrationFocus(CameraBoardSocket camera).
Technically, we could overload the existing setDepthAlign() with setDepthAlign(CameraBoardSocket camera, bool calibrationFocus = false), but this multiplies because there are two setDepthAlign() signatures, causing 4 APIs (binary compatibility) or 2 APIs (API compatibility).

@tachiuhy

### Auto-scale RGB image with each focus value before RGB-D alignment
Hi everyone,

I'm working on integrating the OAK-D Pro camera into a robotics application, so it demands a lot in terms of accuracy. I feel the pain of the accuracy loss that comes with the auto-focus feature. I have done some small experiments on the alignment and got better accuracy, at least in my test cases. I will explain my procedure up to the point where I figured out the RGB alignment error and a solution for it.

In my setup for accuracy evaluation, I made a laser-cut plate with locating holes to place 3D-printed ArUco targets, and I mounted the camera on a mechanical stand so that I could get the ground truth from the 3D assembly file. After that, I wrote a script to run the camera, detect the ArUco targets, calculate their positions, and compare them to the ground truth. The image below shows the data I've collected. The red dots are target centers, and the green dots are the projected ground truth.
[screenshot]
My first approach was to optimize the extrinsic matrix of the camera with respect to its ideal mounting position.

The optimization takes the data from this...
[image]

to this...
[image]

The results are summarized in the charts below.
[image]

Also, I used three approaches to calculate positions in this experiment: spatial calculation on the camera, and spatial calculation on the host. For the on-host method I used two techniques: the first uses the HFOV formula provided by the DepthAI examples, and the second uses re-projection with the camera intrinsic matrix.
[image]
In my experiments, the best way to calculate the 3D position is re-projection with the intrinsic matrix.
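For reference, the two on-host back-projection methods mentioned above look roughly like this (a sketch: the first follows the HFOV formula used in the DepthAI host-side spatials example, the second uses the intrinsic matrix; the sign convention on y is an assumption):

```python
import numpy as np

def xyz_from_hfov(u, v, z_mm, width, height, hfov_deg):
    """HFOV-based back-projection of pixel (u, v) with depth z_mm (sketch)."""
    cx, cy = width / 2.0, height / 2.0
    angle_x = np.arctan(np.tan(np.radians(hfov_deg) / 2.0) * (u - cx) / cx)
    angle_y = np.arctan(np.tan(np.radians(hfov_deg) / 2.0) * (v - cy) / cx)
    return np.array([z_mm * np.tan(angle_x), -z_mm * np.tan(angle_y), z_mm])

def xyz_from_intrinsics(u, v, z_mm, K):
    """Re-projection with the intrinsic matrix K = [[fx,0,cx],[0,fy,cy],[0,0,1]]."""
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    return np.array([(u - cx) * z_mm / fx, (v - cy) * z_mm / fy, z_mm])
```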

Finally, here is the stage where I realized the RGB alignment error. I was skeptical about it, so I did the following test.
I captured images with my OAK-D Pro at two focus values, and this is the overlaid image. You can see the shift in the texture.
[image]

So I moved on to the next stage to evaluate my theory about focus scaling. I captured targets at multiple focus values, distances, and positions in the image frame, with focus ranging from 100 to 255.
[image]

I wrote a script to track the center of the red dot in 2D across multiple focus values.
[images]
After this, I concluded:

  • The auto-focus function causes a scaling effect on the images.
  • The further the object is from the image center, the larger its offset.
  • The scaling effect stays the same regardless of the distance to the object, as long as the object keeps the same 2D position in the image.

So I captured the whole range of focus values and wrote a script to match the scale across all of them.
[animated GIF]
After getting all the scale values, I plotted them on a chart (blue dots) and used a fit function to get the relationship between focus value and RGB scale value. Then I used that function to predict the scale value (orange dots).
[image]
I re-applied the scale values to a new set of images, and here is the result.
[animated GIF]
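A rough sketch of that fit-and-reapply step (the focus/scale numbers below are purely illustrative placeholders, not my measured values):

```python
import cv2
import numpy as np

# Illustrative per-focus scale factors obtained by matching images (assumption).
focus_values = np.array([100, 130, 160, 190, 220, 255], dtype=float)
scales = np.array([1.000, 1.004, 1.009, 1.013, 1.018, 1.024])

# Fit scale = a * focus + b, then predict the scale for any focus value.
a, b = np.polyfit(focus_values, scales, deg=1)
predict_scale = lambda focus: a * focus + b

def rescale_about_center(img, scale):
    """Scale the image about its center and keep the original frame size."""
    h, w = img.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), 0, scale)
    return cv2.warpAffine(img, M, (w, h))

# Example: undo the extra magnification of an image captured at focus 200
# by scaling it back toward the reference focus (100 here, an assumption).
# corrected = rescale_about_center(img_focus200, predict_scale(100) / predict_scale(200))
```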

After all of this, I re-ran my accuracy evaluation procedure, and here is the result:
OAK-D pro evaluate (with correctMat and scale value).xlsx

This is my first time commenting on GitHub, so please let me know if I said something wrong. I hope you will consider this and find something useful to apply to the camera firmware.

@Luxonis-Brandon
Contributor

Great post! Thank you for all the excellent data here. We will digest and circle back tomorrow. Thanks again!

@diablodale
Contributor

@tachiuhy thanks for your data on this topic. :-)
A few questions, please.

I don't understand your first 5 charts about the extrinsic matrix. Are you exploring where the 3 OAK-D lenses are located (mono left, mono right, RGB)? How do these charts represent the three lenses? How does chart 1 lead to chart 2? What are charts 3, 4, and 5 showing for these three lenses?

I remember a brief post from Luxonis, I think on Discord, about an idea of adjusting/scaling based on RGB focus values. There were open questions on consistency across lenses of the same model, across models, etc. It was interesting to see the loss of data on all sides when the scale+crop is applied in the last animated GIF. My instinct tells me this needs some thinking regarding side effects and repercussions.

@tachiuhy

Hi @diablodale, thanks for your attention to the post :)

I was planning to explain the charts anyway, since they lack descriptions and distinct colors. The 1st and 2nd charts are 3D scatter plots that represent the 3D positions of the ArUco targets used in my test. Each chart has three axes, x, y, and z, in millimeters; the blue dots are the ground truth and the orange dots are the data I collected from the RGB-depth alignment. (Note that I assumed the camera coordinate system is at the front glass, on the optical axis of the RGB lens.) The 1st chart is the raw data, and you can see it is very far off from the ground truth. So I decided to move the whole dataset with a transformation matrix containing both translation and rotation parameters, which led to the second chart.

For the next three charts, I took the difference between the ground-truth and captured x, y, and z positions to get the deltas. Those charts are labeled deltaX, deltaY, and deltaZ (millimeters) and contain two lines each: "delta before" is the offset before applying the transformation, and "delta after" is the offset after the transformation.
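For readers who want to reproduce this kind of correction, the standard least-squares way to fit such a rotation+translation is the Kabsch/SVD solution; a minimal sketch (not the author's actual script):

```python
import numpy as np

def fit_rigid_transform(captured, ground_truth):
    """Fit R, t so that R @ captured + t best matches ground_truth (both Nx3, in mm)."""
    c_mean, g_mean = captured.mean(axis=0), ground_truth.mean(axis=0)
    H = (captured - c_mean).T @ (ground_truth - g_mean)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = g_mean - R @ c_mean
    return R, t

# R, t = fit_rigid_transform(captured, ground_truth)
# corrected     = captured @ R.T + t
# delta_before  = ground_truth - captured     # "delta before" in the charts
# delta_after   = ground_truth - corrected    # "delta after"
```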

About the post you mentioned: yes, I think so too. But there may be ways to calibrate this scale+crop factor at the factory-calibration level for each camera, like what you are already doing for the distortion and intrinsic parameters.

By the way, this scale+crop is not my final conclusion. I realized that a focusing lens might not fundamentally follow the pinhole intrinsic model; it should follow the thin-lens equation model instead. I'm trying to run some tests on this. So the scaling I've done may only approximate the true correction, and is not yet fully accurate.
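For reference, the thin-lens relation referred to here (standard optics, not taken from the measurements above) is

$$\frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i}, \qquad |m| = \frac{d_i}{d_o},$$

so as the lens element moves to focus on a nearer object, the image distance d_i grows and so does the magnification, which is consistent with the scaling effect observed in the images.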

@Luxonis-Brandon
Contributor

Hi both,

Sorry about the delay. One of the engineers who works on this is on vacation this week. And the other is down with COVID (and a nasty fever).

We'll hopefully circle back next week.

Thanks and sorry again about the delay,
-Brandon

@saching13
Contributor

Hello @tachiuhy ,
Thanks a lot for the detailed tests.

On the Chart 1 and Chart 2 tests:

  1. Was the ArUco targets' depth taken on the RGB camera? (If yes, did you set a manual focus for it?)
  2. Was setDepthAlign(dai::CameraBoardSocket::RGB) used here? If yes, the origin of the camera will be at the RGB camera's optical center; otherwise it will be the rectified right camera.
  3. Also, if subpixel mode is not enabled, please do enable it. It provides more granular depth, which might help (see the sketch below).
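A minimal sketch of those two settings in a DepthAI Python pipeline (node setup abbreviated; adapt to your own pipeline):

```python
import depthai as dai

pipeline = dai.Pipeline()
stereo = pipeline.create(dai.node.StereoDepth)
stereo.setDepthAlign(dai.CameraBoardSocket.RGB)  # depth origin = RGB optical center
stereo.setSubpixel(True)                         # more granular depth values
```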

On the focusing issue: from the image below, the measured position becomes roughly flat around 120-150. That is because, at the distance you are holding the board, that is the range where it is actually in focus; when you set the focus to the extremes, you will see drift.
However, if you use a further object and do the same test, you will see less change in that pixel.
That said, we have noticed that most of the actual device working range falls roughly between 100 and 150 (which is around +/-20 around the lens position used for calibration at 1 m). So the scaling might not generally need to be applied, and you can set a manual focus position for your working range.
This has been my understanding based on what I have noticed.

[image]

Thoughts ?

@tachiuhy

tachiuhy commented Jul 7, 2022

Hi @saching13,

Thanks for your advice! :)

  1. Yes, I set the RGB lens to fixed focus and then captured images for the accuracy evaluation.
  2. I also set the depth alignment to the RGB sensor, and I placed the camera's coordinate system in the 3D setup at the RGB optical center.
  3. Yes, I turned on subpixel mode when capturing.

Actually, the horizontal axis of the chart you snipped should run from 100 to 255, since I captured images over that range. So the data you saw as flat from 120-150 is actually from 230-255. During my test, 255 is the value at which the lens focuses on the object at the nearest distance. With that said, it is very important to pay attention to the optical properties of the RGB lens, in this case the optical magnification. I have some images you can see below.

So, as you said, if we put the object really far away from the camera (and around the center of the frame), the shift when changing focus would not be large. But if you move the object toward the edge of the frame, you will see a rapid change when you change the focus. I'm not saying we have to use multiple focus values all the time, but some applications require different working ranges, and I think the current RGB alignment cannot handle that. Later I will try to capture something at a further distance to prove the point :)

Take my robotics application, for example. In some tasks I have to switch the camera to "extended disparity" mode to get depth data at a closer distance, and since I have to detect things in RGB images, I also have to move the focus nearer. Once I change the focus to a nearer distance (let's say around 200), the optical magnification of the RGB lens system changes (larger magnification) and no longer matches the pre-defined intrinsic matrix, so it no longer matches the depth array either.
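For concreteness, a minimal sketch of that close-range configuration in the DepthAI Python API (the focus value is illustrative, and the pipeline is abbreviated):

```python
import depthai as dai

pipeline = dai.Pipeline()
camRgb = pipeline.create(dai.node.ColorCamera)
stereo = pipeline.create(dai.node.StereoDepth)

stereo.setExtendedDisparity(True)          # allow depth at closer distances
camRgb.initialControl.setManualFocus(200)  # focus nearer; this changes the RGB magnification
                                           # and breaks alignment with the calibrated intrinsics
```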

focus 100: [image]

focus 120: [image]

focus 150: [image]

focus 200: [image]

focus 255: [image]

overlap of 100 and 255: [image]

@tachiuhy

tachiuhy commented Jul 7, 2022

Hi, I have a few images capturing a more distant scene at specific focus values. You can see the scene shift dramatically.

focus 100: [image]

focus 150: [image]

focus 200: [image]

focus 255: [image]

overlap of 100 and 150: [image]

@Luxonis-Brandon
Contributor

I wonder if we can just apply a scale factor to the image in firmware to account for this. Thoughts @saching13?

@saching13
Contributor

Thanks @tachiuhy .

Yeah, for that close distance to align properly, we need to figure out a way to change the intrinsics as the lens position changes and then apply them to the warp engine so the depth changes accordingly.
The quickest approach would be to use a fixed-focus (FF) module with a working range at a closer distance.

@tachiuhy

tachiuhy commented Jul 8, 2022

Thanks for your help,

I will keep using my method for my application for now. I hope my idea helps you somehow.

P.S.: I love this new OAK-D Pro camera :) and hope to see it get more updates and go even further!
