diff --git a/docs/source/_static/images/components/disp_to_depth.jpg b/docs/source/_static/images/components/disp_to_depth.jpg new file mode 100644 index 000000000..c5bda25f5 Binary files /dev/null and b/docs/source/_static/images/components/disp_to_depth.jpg differ diff --git a/docs/source/_static/images/components/disparity_confidence.jpg b/docs/source/_static/images/components/disparity_confidence.jpg new file mode 100644 index 000000000..3d8d4b3b4 Binary files /dev/null and b/docs/source/_static/images/components/disparity_confidence.jpg differ diff --git a/docs/source/_static/images/components/layered-pointcloud.png b/docs/source/_static/images/components/layered-pointcloud.png new file mode 100644 index 000000000..20ee29129 Binary files /dev/null and b/docs/source/_static/images/components/layered-pointcloud.png differ diff --git a/docs/source/_static/images/components/theoretical_error.jpg b/docs/source/_static/images/components/theoretical_error.jpg new file mode 100644 index 000000000..9b71d37a5 Binary files /dev/null and b/docs/source/_static/images/components/theoretical_error.jpg differ diff --git a/docs/source/components/bootloader.rst b/docs/source/components/bootloader.rst index 1cee9e809..44ffb43fe 100644 --- a/docs/source/components/bootloader.rst +++ b/docs/source/components/bootloader.rst @@ -97,10 +97,10 @@ to the pipeline. Instead of transferring the whole package, only Pipeline descri Depthai application package (**.dap**) consists of: - SBR (512B header which describes sections of data) -- Depthai device firmware (section “__firmware”) -- Pipeline description (section “pipeline”) -- Assets structure (section “assets”) -- Asset storage (section “asset_storage”) +- Depthai device firmware (section "__firmware") +- Pipeline description (section "pipeline") +- Assets structure (section "assets") +- Asset storage (section "asset_storage") MAC address ########### diff --git a/docs/source/components/nodes/stereo_depth.rst b/docs/source/components/nodes/stereo_depth.rst index 5b6750415..49a66c975 100644 --- a/docs/source/components/nodes/stereo_depth.rst +++ b/docs/source/components/nodes/stereo_depth.rst @@ -1,7 +1,7 @@ StereoDepth ########### -StereoDepth node calculates the disparity/depth from the stereo camera pair (2x :ref:`MonoCamera `). +StereoDepth node calculates the disparity/depth from the stereo camera pair (2x :ref:`MonoCamera `/:ref:`ColorCamera`). How to place it =============== @@ -47,28 +47,29 @@ Inputs and Outputs .. tab:: **Inputs** - - :code:`left` - :ref:`ImgFrame` from the left :ref:`MonoCamera` - - :code:`right` - :ref:`ImgFrame` from the right :ref:`MonoCamera` - - :code:`inputConfig` - :ref:`StereoDepthConfig` + - ``left`` - :ref:`ImgFrame` from the left stereo camera + - ``right`` - :ref:`ImgFrame` from the right stereo camera + - ``inputConfig`` - :ref:`StereoDepthConfig` .. 
tab:: **Outputs** - - :code:`confidenceMap` - :ref:`ImgFrame` - - :code:`rectifiedLeft` - :ref:`ImgFrame` - - :code:`syncedLeft` - :ref:`ImgFrame` - - :code:`depth` - :ref:`ImgFrame`: UINT16 values - depth in depth units (millimeter by default) - - :code:`disparity` - :ref:`ImgFrame`: UINT8 or UINT16 if Subpixel mode - - :code:`rectifiedRight` - :ref:`ImgFrame` - - :code:`syncedRight` - :ref:`ImgFrame` - - :code:`outConfig` - :ref:`StereoDepthConfig` + - ``confidenceMap`` - :ref:`ImgFrame` + - ``rectifiedLeft`` - :ref:`ImgFrame` + - ``syncedLeft`` - :ref:`ImgFrame` + - ``depth`` - :ref:`ImgFrame`: UINT16 values - depth in depth units (millimeter by default) + - ``disparity`` - :ref:`ImgFrame`: UINT8 or UINT16 if Subpixel mode + - ``rectifiedRight`` - :ref:`ImgFrame` + - ``syncedRight`` - :ref:`ImgFrame` + - ``outConfig`` - :ref:`StereoDepthConfig` .. tab:: **Debug outputs** - - :code:`debugDispLrCheckIt1` - :ref:`ImgFrame` - - :code:`debugDispLrCheckIt2` - :ref:`ImgFrame` - - :code:`debugExtDispLrCheckIt1` - :ref:`ImgFrame` - - :code:`debugExtDispLrCheckIt2` - :ref:`ImgFrame` - - :code:`debugDispCostDump` - :ref:`ImgFrame` + - ``debugDispLrCheckIt1`` - :ref:`ImgFrame` + - ``debugDispLrCheckIt2`` - :ref:`ImgFrame` + - ``debugExtDispLrCheckIt1`` - :ref:`ImgFrame` + - ``debugExtDispLrCheckIt2`` - :ref:`ImgFrame` + - ``debugDispCostDump`` - :ref:`ImgFrame` + - ``confidenceMap`` - :ref:`ImgFrame` Internal block diagram of StereoDepth node ========================================== @@ -285,221 +286,6 @@ as: For the final disparity map, a filtering is applied based on the confidence threshold value: the pixels that have their confidence score larger than the threshold get invalidated, i.e. their disparity value is set to zero. You can set the confidence threshold with :code:`stereo.initialConfig.setConfidenceThreshold()`. -Calculate depth using disparity map -=================================== - -Disparity and depth are inversely related. As disparity decreases, depth increases exponentially depending on baseline and focal length. Meaning, if the disparity value is close to zero, then a small change in disparity generates a large change in depth. Similarly, if the disparity value is big, then large changes in disparity do not lead to a large change in depth. - -By considering this fact, depth can be calculated using this formula: - -.. code-block:: python - - depth = focal_length_in_pixels * baseline / disparity_in_pixels - -Where baseline is the distance between two mono cameras. Note the unit used for baseline and depth is the same. - -To get focal length in pixels, you can :ref:`read camera calibration `, as focal length in pixels is -written in camera intrinsics (``intrinsics[0][0]``): - -.. code-block:: python - - import depthai as dai - - with dai.Device() as device: - calibData = device.readCalibration() - intrinsics = calibData.getCameraIntrinsics(dai.CameraBoardSocket.RIGHT) - print('Right mono camera focal length in pixels:', intrinsics[0][0]) - -Here's theoretical calculation of the focal length in pixels: - -.. code-block:: python - - focal_length_in_pixels = image_width_in_pixels * 0.5 / tan(HFOV * 0.5 * PI/180) - - # With 400P mono camera resolution where HFOV=71.9 degrees - focal_length_in_pixels = 640 * 0.5 / tan(71.9 * 0.5 * PI / 180) = 441.25 - - # With 800P mono camera resolution where HFOV=71.9 degrees - focal_length_in_pixels = 1280 * 0.5 / tan(71.9 * 0.5 * PI / 180) = 882.5 - -Examples for calculating the depth value, using the OAK-D (7.5cm baseline): - -.. 
code-block:: python - - # For OAK-D @ 400P mono cameras and disparity of eg. 50 pixels - depth = 441.25 * 7.5 / 50 = 66.19 # cm - - # For OAK-D @ 800P mono cameras and disparity of eg. 10 pixels - depth = 882.5 * 7.5 / 10 = 661.88 # cm - -Note the value of disparity depth data is stored in :code:`uint16`, where 0 is a special value, meaning that distance is unknown. - -Min stereo depth distance -========================= - -If the depth results for close-in objects look weird, this is likely because they are below the minimum depth-perception distance of the device. - -To calculate this minimum distance, use the :ref:`depth formula ` and choose the maximum value for disparity_in_pixels parameter (keep in mind it is inveresly related, so maximum value will yield the smallest result). - -For example OAK-D has a baseline of **7.5cm**, focal_length_in_pixels of **882.5 pixels** and the default maximum value for disparity_in_pixels is **95**. By using the :ref:`depth formula ` we get: - -.. code-block:: python - - min_distance = focal_length_in_pixels * baseline / disparity_in_pixels = 882.5 * 7.5cm / 95 = 69.67cm - -or roughly 70cm. - -However this distance can be cut in 1/2 (to around 35cm for the OAK-D) with the following options: - -1. Changing the resolution to 640x400, instead of the standard 1280x800. - -2. Enabling Extended Disparity. - -Extended Disparity mode increases the levels of disparity to 191 from the standard 96 pixels, thereby 1/2-ing the minimum depth. It does so by computing the 96-pixel disparities on the original 1280x720 and on the downscaled 640x360 image, which are then merged to a 191-level disparity. For more information see the Extended Disparity tab in :ref:`this table `. - -Using the previous OAK-D example, disparity_in_pixels now becomes **190** and the minimum distance is: - -.. code-block:: python - - min_distance = focal_length_in_pixels * baseline / disparity_in_pixels = 882.5 * 7.5cm / 190 = 34.84cm - -or roughly 35cm. - -.. note:: - - Applying both of those options is possible, which would set the minimum depth to 1/4 of the standard settings, but at such short distances the minimum depth is limited by focal length, which is 19.6cm, since OAK-D mono cameras have fixed focus distance: 19.6cm - infinity. - -See `these examples `__ for how to enable Extended Disparity. - -Disparity shift to lower min depth perception ---------------------------------------------- - -Another option to perceive closer depth range is to use disparity shift. Disparity shift will shift the starting point -of the disparity search, which will significantly decrease max depth (MazZ) perception, but it will also decrease min depth (MinZ) perception. -Disparity shift can be combined with extended/subpixel/LR-check modes. - -.. image:: https://user-images.githubusercontent.com/18037362/189375017-2fa137d2-ad6b-46de-8899-6304bbc6c9d7.png - -**Left graph** shows min and max disparity and depth for OAK-D (7.5cm baseline, 800P resolution, ~70° HFOV) by default (disparity shift=0). See :ref:`Calculate depth using disparity map`. -Since hardware (stereo block) has a fixed 95 pixel disparity search, DepthAI will search from 0 pixels (depth=INF) to 95 pixels (depth=71cm). - -**Right graph** shows the same, but at disparity shift set to 30 pixels. This means that disparity search will be from 30 pixels (depth=2.2m) to 125 pixels (depth=50cm). -This also means that depth will be very accurate at the short range (**theoretically** below 5mm depth error). 
- -**Limitations**: - -- Because of the inverse relationship between disparity and depth, MaxZ will decrease much faster than MinZ as the disparity shift is increased. Therefore, it is **advised not to use a larger than necessary disparity shift**. -- Tradeoff in reducing the MinZ this way is that objects at **distances farther away than MaxZ will not be seen**. -- Because of the point above, **we only recommend using disparity shift when MaxZ is known**, such as having a depth camera mounted above a table pointing down at the table surface. -- Output disparity map is not expanded, only the depth map. So if disparity shift is set to 50, and disparity value obtained is 90, the real disparity is 140. - -**Compared to Extended disparity**, disparity shift: - -- **(+)** Is faster, as it doesn't require an extra computation, which means there's also no extra latency -- **(-)** Reduces the MaxZ (significantly), while extended disparity only reduces MinZ. - -Disparity shift can be combined with extended disparity. - -.. doxygenfunction:: dai::StereoDepthConfig::setDisparityShift - :project: depthai-core - :no-link: - -Max stereo depth distance -========================= - -The maximum depth perception distance depends on the :ref:`accuracy of the depth perception `. The formula used to calculate this distance is an approximation, but is as follows: - -.. code-block:: python - - Dm = (baseline/2) * tan((90 - HFOV / HPixels)*pi/180) - -So using this formula for existing models the *theoretical* max distance is: - -.. code-block:: python - - # For OAK-D (7.5cm baseline) - Dm = (7.5/2) * tan((90 - 71.9/1280)*pi/180) = 3825.03cm = 38.25 meters - - # For OAK-D-CM4 (9cm baseline) - Dm = (9/2) * tan((90 - 71.9/1280)*pi/180) = 4590.04cm = 45.9 meters - -If greater precision for long range measurements is required, consider enabling Subpixel Disparity or using a larger baseline distance between mono cameras. For a custom baseline, you could consider using `OAK-FFC `__ device or design your own baseboard PCB with required baseline. For more information see Subpixel Disparity under the Stereo Mode tab in :ref:`this table `. - -Depth perception accuracy -========================= - -Disparity depth works by matching features from one image to the other and its accuracy is based on multiple parameters: - -* Texture of objects / backgrounds - -Backgrounds may interfere with the object detection, since backgrounds are objects too, which will make depth perception less accurate. So disparity depth works very well outdoors as there are very rarely perfectly-clean/blank surfaces there - but these are relatively commonplace indoors (in clean buildings at least). - -* Lighting - -If the illumination is low, the diparity map will be of low confidence, which will result in a noisy depth map. - -* Baseline / distance to objects - -Lower baseline enables us to detect the depth at a closer distance as long as the object is visible in both the frames. However, this reduces the accuracy for large distances due to less pixels representing the object and disparity decreasing towards 0 much faster. -So the common norm is to adjust the baseline according to how far/close we want to be able to detect objects. - -Limitation -========== - -Since depth is calculated from disparity, which requires the pixels to overlap, there is inherently a vertical -band on the left side of the left mono camera and on the right side of the right mono camera, where depth -cannot be calculated, since it is seen by only 1 camera. 
That band is marked with :code:`B` -on the following picture. - -.. image:: https://user-images.githubusercontent.com/59799831/135310921-67726c28-07e7-4ffa-bc8d-74861049517e.png - -Meaning of variables on the picture: - -- ``BL [cm]`` - Baseline of stereo cameras. -- ``Dv [cm]`` - Minimum distace where both cameras see an object (thus where depth can be calculated). -- ``B [pixels]`` - Width of the band where depth cannot be calculated. -- ``W [pixels]`` - Width of mono in pixels camera or amount of horizontal pixels, also noted as :code:`HPixels` in other formulas. -- ``D [cm]`` - Distance from the **camera plane** to an object (see image :ref:`here `). -- ``F [cm]`` - Width of image at the distance ``D``. - -.. image:: https://user-images.githubusercontent.com/59799831/135310972-c37ba40b-20ad-4967-92a7-c71078bcef99.png - -With the use of the :code:`tan` function, the following formulas can be obtained: - -- :code:`F = 2 * D * tan(HFOV/2)` -- :code:`Dv = (BL/2) * tan(90 - HFOV/2)` - -In order to obtain :code:`B`, we can use :code:`tan` function again (same as for :code:`F`), but this time -we must also multiply it by the ratio between :code:`W` and :code:`F` in order to convert units to pixels. -That gives the following formula: - -.. code-block:: python - - B = 2 * Dv * tan(HFOV/2) * W / F - B = 2 * Dv * tan(HFOV/2) * W / (2 * D * tan(HFOV/2)) - B = W * Dv / D # pixels - -Example: If we are using OAK-D, which has a :code:`HFOV` of 72°, a baseline (:code:`BL`) of 7.5 cm and -:code:`640x400 (400P)` resolution is used, therefore :code:`W = 640` and an object is :code:`D = 100` cm away, we can -calculate :code:`B` in the following way: - -.. code-block:: - - Dv = 7.5 / 2 * tan(90 - 72/2) = 3.75 * tan(54°) = 5.16 cm - B = 640 * 5.16 / 100 = 33 # pixels - -Credit for calculations and images goes to our community member gregflurry, which he made on -`this `__ -forum post. - -.. note:: - - OAK-D-PRO will include both IR dot projector and IR LED, which will enable operation in no light. - IR LED is used to illuminate the whole area (for mono/color frames), while IR dot projector is mostly - for accurate disparity matching - to have good quality depth maps on blank surfaces as well. For outdoors, - the IR laser dot projector is only relevant at night. For more information see the development progress - `here `__. - Measuring real-world object dimensions ====================================== diff --git a/docs/source/index.rst b/docs/source/index.rst index 3dc8e8fb2..cb2651049 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -71,9 +71,9 @@ node functionalities are presented with code. :hidden: :caption: Tutorials: - tutorials/hello_world.rst tutorials/standalone_mode.rst tutorials/message_syncing.rst + tutorials/configuring-stereo-depth.rst tutorials/multiple.rst tutorials/maximize_fov.rst tutorials/debugging.rst @@ -81,6 +81,7 @@ node functionalities are presented with code. tutorials/dispaying_detections.rst tutorials/image_quality.rst tutorials/low-latency.rst + tutorials/hello_world.rst .. toctree:: :maxdepth: 1 diff --git a/docs/source/samples/calibration/calibration_reader.rst b/docs/source/samples/calibration/calibration_reader.rst index b238cecea..199922bf6 100644 --- a/docs/source/samples/calibration/calibration_reader.rst +++ b/docs/source/samples/calibration/calibration_reader.rst @@ -9,6 +9,43 @@ This example shows how to read calibration data stored on device over XLink. 
Thi
 - :ref:`Calibration Flash`
 - :ref:`Calibration Load`

Camera intrinsics
~~~~~~~~~~~~~~~~~

Calibration also contains the camera intrinsic and extrinsic parameters.

.. code-block:: python

    import depthai as dai

    with dai.Device() as device:
        calibData = device.readCalibration()
        intrinsics = calibData.getCameraIntrinsics(dai.CameraBoardSocket.RIGHT)
        print('Right mono camera focal length in pixels:', intrinsics[0][0])

Here's the theoretical calculation of the focal length in pixels:

.. code-block:: python

    focal_length_in_pixels = image_width_in_pixels * 0.5 / tan(HFOV * 0.5 * PI/180)

    # With 400P mono camera resolution where HFOV=71.9 degrees
    focal_length_in_pixels = 640 * 0.5 / tan(71.9 * 0.5 * PI / 180) = 441.25

    # With 800P mono camera resolution where HFOV=71.9 degrees
    focal_length_in_pixels = 1280 * 0.5 / tan(71.9 * 0.5 * PI / 180) = 882.5

Examples for calculating the depth value, using the OAK-D (7.5cm baseline):

.. code-block:: python

    # For OAK-D @ 400P resolution and disparity of eg. 50 pixels
    depth = 441.25 * 7.5 / 50 = 66.19 # cm

    # For OAK-D @ 800P resolution and disparity of eg. 10 pixels
    depth = 882.5 * 7.5 / 10 = 661.88 # cm

 Setup
 #####

diff --git a/docs/source/tutorials/configuring-stereo-depth.rst b/docs/source/tutorials/configuring-stereo-depth.rst
new file mode 100644
index 000000000..76da3da9c
--- /dev/null
+++ b/docs/source/tutorials/configuring-stereo-depth.rst
@@ -0,0 +1,507 @@

Configuring Stereo Depth
########################

Our :ref:`StereoDepth node ` is very configurable, and in this tutorial we will go over some **configuration and troubleshooting** steps you can take to get the best results.

This documentation is divided into 6 chapters:

- :ref:`1. Stereo Depth Basics`
- :ref:`2. Fixing noisy depth`
- :ref:`3. Improving depth accuracy`
- :ref:`4. Short range stereo depth`
- :ref:`5. Long range stereo depth`
- :ref:`6. Fixing noisy pointcloud`

1. Stereo Depth Basics
**********************

`Stereo depth vision `__ works by calculating the disparity between two images taken from slightly different viewpoints.

Stereo vision works a lot like our eyes. Our brains (subconsciously) estimate the depth of objects and scenes based on the difference between what our left eye sees versus what our right eye sees. On OAK-D cameras, it's exactly the same; we have left and right cameras (of the stereo camera pair) and the OAK does on-device disparity matching to estimate the depth of objects and scenes.

Disparity refers to the distance between two corresponding points in the left and right image of a stereo pair.

.. image:: /_static/images/components/disparity_explanation.jpeg

Depth from disparity
--------------------

Let's first look at how the depth is calculated:

.. math::

  depth [mm] = focalLength [pix] * baseline [mm] / disparity [pix]

`RVC2 `__-based cameras have a **0..95 disparity search** range, which limits the minimal depth perception. Baseline is the distance between the two cameras of the stereo camera pair. You can read the camera's focal length (in pixels) from calibration, see the :ref:`tutorial here `.

Disparity and depth are inversely related. As disparity decreases, the depth increases exponentially depending on the baseline and focal length. Meaning, if the disparity value is close to zero, then a small change in disparity generates a large change in depth. Similarly, if the disparity value is big, then a large change in disparity doesn't lead to a large change in depth (better accuracy).
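To make the relation concrete, below is a minimal host-side sketch (plain NumPy, not tied to any DepthAI API) that applies the formula to a disparity map. The focal length and baseline are example OAK-D 800P values; in practice you would read the focal length from the device calibration and use your camera's real baseline.

.. code-block:: python

    import numpy as np

    # Example values for an OAK-D at 800P; read the real focal length from calibration
    FOCAL_LENGTH_PX = 882.5
    BASELINE_MM = 75.0

    def disparity_to_depth(disparity_px: np.ndarray) -> np.ndarray:
        """Convert a disparity map [pixels] into a depth map [mm]; 0 marks invalid pixels."""
        depth_mm = np.zeros_like(disparity_px, dtype=np.float32)
        valid = disparity_px > 0  # disparity of 0 means the distance is unknown
        depth_mm[valid] = FOCAL_LENGTH_PX * BASELINE_MM / disparity_px[valid]
        return depth_mm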
Here's a graph showing disparity vs depth for OAK-D (7.5cm baseline distance) at 800P:

.. figure:: /_static/images/components/disp_to_depth.jpg

  `Full chart here `__

Note that the depth value is stored in *uint16*, where 0 means that the distance is invalid/unknown.

How baseline distance and focal length affect depth
---------------------------------------------------

Looking at the depth formula above, we can see that either a larger baseline distance or a larger focal length will result in a larger depth value at the same disparity, which means that the depth accuracy will be better.

Focal length is the distance between the camera lens and the image sensor. The larger the focal length, the narrower the FOV.

So to get **long-range depth** perception, you can **increase the baseline distance and/or decrease the FOV**.

.. note::

  A wider FOV will result in worse depth accuracy, even at shorter ranges (where the accuracy drop isn't as noticeable).

2. Fixing noisy depth
*********************

A few topics we have noticed to be relevant for stereo depth quality are:

- :ref:`Scene Texture`
- :ref:`Stereo depth confidence threshold`
- :ref:`Stereo camera pair noise`
- :ref:`Stereo postprocessing filters`

Scene Texture
-------------

Due to the way the stereo matching algorithm works, **passive stereo depth requires** a **good texture** in the scene, otherwise the depth will be noisy/invalid on low-visual-interest surfaces (blank surfaces with little to no texture), such as a wall or floor.

**Solution:** Our `Pro version `__ of OAK cameras has an onboard **IR laser dot projector**, which projects thousands of little dots onto the scene and helps the stereo matching algorithm, as it provides more texture to the scene.

.. image:: https://user-images.githubusercontent.com/18037362/222730554-a6c8d4d3-cb0b-422e-8474-6a979e73727a.gif

The technique that we use is called ASV (`Conventional Active Stereo Vision `__), as stereo matching is performed on the device the same way as on a passive stereo OAK-D.

Stereo depth confidence threshold
---------------------------------

When calculating the disparity, each pixel in the disparity map gets assigned a confidence value :code:`0..255` by the stereo matching algorithm. Note that this confidence score is inverted (compared to, say, NN confidences):

- **0** - maximum confidence that it holds a valid value
- **255** - minimum confidence, so there is a higher chance that the value is incorrect

For the final disparity map, filtering is applied based on the confidence threshold value: the pixels that have their confidence score larger than the threshold get invalidated, i.e. their disparity value is set to zero. You can set the confidence threshold via the API below.

This means that with the confidence threshold, users can prioritize **fill-rate or accuracy**.
.. tabs::

    .. tab:: Python

        .. code-block:: python

            # Create the StereoDepth node
            stereo_depth = pipeline.create(dai.node.StereoDepth)
            stereo_depth.initialConfig.setConfidenceThreshold(threshold)

            # Or, alternatively, set the Stereo Preset Mode:
            # Prioritize fill-rate, sets Confidence threshold to 245
            stereo_depth.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
            # Prioritize accuracy, sets Confidence threshold to 200
            stereo_depth.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_ACCURACY)

    .. tab:: C++

        .. code-block:: cpp

            // Create the StereoDepth node
            auto stereo_depth = pipeline.create<dai::node::StereoDepth>();
            stereo_depth->initialConfig.setConfidenceThreshold(threshold);

            // Or, alternatively, set the Stereo Preset Mode:
            // Prioritize fill-rate, sets Confidence threshold to 245
            stereo_depth->setDefaultProfilePreset(dai::node::StereoDepth::PresetMode::HIGH_DENSITY);
            // Prioritize accuracy, sets Confidence threshold to 200
            stereo_depth->setDefaultProfilePreset(dai::node::StereoDepth::PresetMode::HIGH_ACCURACY);

..
    .. image:: gif of changing threshold, and how fill-rate/accuracy changes

Stereo camera pair noise
------------------------

If the input left/right images are noisy, the disparity map will be noisy as well. So the prerequisite for good depth is high-IQ (see :ref:`Image Quality ` docs) left/right stereo images. Active stereo (`Pro version `__ of OAK cameras) mostly alleviates this issue, but for passive stereo cameras, there are a few things you can do to improve the quality of the stereo camera pair.

It is preferred to use mono (grayscale) cameras for the stereo camera pair, as they have better quantum efficiency (QE) since they don't have a color (Bayer) filter. Higher QE means more signal will be generated for the same amount of light (photons), which leads to better SNR (signal-to-noise ratio).

For better low-light performance, it's advised to use longer exposure times instead of higher gain (ISO), as this improves SNR. Sometimes this means lowering the camera FPS - at 30 FPS you can use a 1/30s exposure time, at 15 FPS a 1/15s exposure time, etc. For more information, see :ref:`Low-light increased sensitivity`.

Another potential improvement is to tweak the sensor's ISP settings, like chroma & luma denoise, and sharpness. For more information, see the :ref:`Color camera ISP configuration`.

Stereo postprocessing filters
-----------------------------

The :ref:`StereoDepth` node has a few postprocessing filters that **run on-device**, which can be enabled to improve the quality of the disparity map. For **implementation (API) details**, see :ref:`StereoDepth configurable blocks `. For an example, see the :ref:`Depth Post-Processing` example (a combined configuration sketch is also shown below).

As these filters run on the device, they have some **performance cost**, which means that at high-resolution frames (1MP) they might bottleneck the FPS. To reduce this cost, one should consider using lower-resolution frames (eg. 400P) and/or using the :ref:`Decimation filter`. Due to the additional processing, these filters also introduce :ref:`additional latency `.

Median filter
~~~~~~~~~~~~~

This is a non-edge preserving Median filter, which can be used to reduce noise and smoothen the depth map. The Median filter is implemented in hardware, so it's the fastest filter.

Speckle filter
~~~~~~~~~~~~~~

The Speckle filter is used to reduce speckle noise. Speckle noise is a region with huge variance between neighboring disparity/depth pixels, and the speckle filter tries to filter this region.
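As a rough reference for the filters in this section (including the ones described further below), here is a configuration sketch. It assumes the ``initialConfig.get()``/``set()`` round-trip used in the :ref:`Depth Post-Processing` example; exact field names can differ between DepthAI releases, and all values are illustrative only.

.. code-block:: python

    import depthai as dai

    pipeline = dai.Pipeline()
    stereo = pipeline.create(dai.node.StereoDepth)

    # Median filter is set directly on the initial config
    stereo.initialConfig.setMedianFilter(dai.MedianFilter.KERNEL_7x7)

    # The other post-processing filters are fields of the config structure
    cfg = stereo.initialConfig.get()
    cfg.postProcessing.speckleFilter.enable = True
    cfg.postProcessing.speckleFilter.speckleRange = 50
    cfg.postProcessing.temporalFilter.enable = True      # best suited for static scenes
    cfg.postProcessing.spatialFilter.enable = True
    cfg.postProcessing.spatialFilter.holeFillingRadius = 2
    cfg.postProcessing.spatialFilter.numIterations = 1
    cfg.postProcessing.thresholdFilter.minRange = 300    # mm
    cfg.postProcessing.thresholdFilter.maxRange = 2000   # mm
    cfg.postProcessing.decimationFilter.decimationFactor = 2
    stereo.initialConfig.set(cfg)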
Temporal filter
~~~~~~~~~~~~~~~

The Temporal filter is intended to improve the depth data persistency by manipulating per-pixel values based on previous frames. The filter performs a single pass on the data, adjusting the depth values while also updating the tracking history.

In cases where the pixel data is missing or invalid, the filter uses a user-defined persistency mode to decide whether the missing value should be improved with stored data. Note that due to its reliance on historic data, the filter may introduce visible motion blurring/smearing artifacts, and is therefore best-suited for **static scenes**.

Spatial filter
~~~~~~~~~~~~~~

The Spatial Edge-Preserving filter will fill invalid depth pixels with valid neighboring depth pixels. It performs a series of 1D horizontal and vertical passes, or iterations, to enhance the smoothness of the reconstructed data. It is based on `this research paper `__.

Brightness filter
~~~~~~~~~~~~~~~~~

The Brightness filter will filter out (invalidate, by setting to 0) all depth pixels for which the input stereo camera image pixels are outside the configured min/max brightness threshold values. This filter is useful when you have a high dynamic range scene, like outside on a bright day, or in general whenever the stereo camera pair can directly see a light source:

.. figure:: https://user-images.githubusercontent.com/18037362/216110871-fe807fc0-858d-4c4d-bbae-3a8eff35645d.png

  Direct light source (ceiling light) - depth pixels are invalid

It also helps with rectification "artifacts", especially when you have wide-FOV lenses and apply the alpha parameter. When there's no available pixel, the StereoDepth node will set that area to 0 (black) by default, but this can be changed with ``stereoDepth.setRectifyEdgeFillColor(int8)``. This black area can then be invalidated with the brightness filter, as seen below:

.. figure:: https://user-images.githubusercontent.com/18037362/223171135-734babe6-72b4-4aa1-9741-9fd8b4552555.jpeg

  Invalidating depth where we have rectification "artifacts"

Threshold filter
~~~~~~~~~~~~~~~~

The Threshold filter will filter out all depth pixels outside the configured min/max threshold values. In a controlled environment, where you know exactly how far the scene can be (eg. 30cm - 2m), it's advised to use this filter.

Decimation filter
~~~~~~~~~~~~~~~~~

The Decimation filter will sub-sample the depth map, which means it reduces the depth scene complexity and allows other filters to run faster. Setting *decimationFactor* to 2 will downscale a 1280x800 depth map to 640x400. We can select either pixel skipping, median, or mean decimation mode, and the latter two modes help with filtering as well.

It's also very useful :ref:`for pointclouds `.

3. Improving depth accuracy
***************************

The above chapter focused on noise, which isn't necessarily the only reason for inaccurate depth.

There are a few ways to improve depth accuracy:

- (mentioned above) :ref:`Fixing noisy depth <2. 
Fixing noisy depth>` - depth should be high quality in order to be accurate
- (mentioned above) :ref:`Stereo depth confidence threshold` should be low(er) in order to get the best accuracy
- :ref:`Move the camera closer to the object` for the best depth accuracy
- Enable :ref:`Stereo Subpixel mode`, especially if the object/scene isn't close to MinZ of the camera

Move the camera closer to the object
------------------------------------

Looking at the :ref:`Depth from disparity` section, from the graph it's clear that at 95 disparity pixels (close distance), the depth change between disparity pixels (eg. 95->94) is the lowest, so the **depth accuracy is the best**.

.. image:: /_static/images/components/theoretical_error.jpg

Depth accuracy decreases exponentially with the distance from the camera. Note that with :ref:`Stereo Subpixel mode` enabled you can have better depth accuracy (even at a longer distance), but it only works to some extent.

So to conclude, the **object/scene you are measuring** should be **as close as possible to MinZ** (minimal depth perception) of the camera for the **best depth accuracy**. You can find the MinZ specification for each device in the `hardware documentation `__.

Stereo Subpixel mode
--------------------

Let's first start with what Stereo Subpixel mode is and how it works. For an image subpixel explanation, see `What's subpixel? `__.

.. note::

  The stereo depth pipeline is very complex (see :ref:`Internal block diagram of StereoDepth node`), and we will simplify it here for better understanding. It actually doesn't use confidence (eg. ``stereoDepth.confidenceMap`` output), but the cost dump, which is what is used to calculate confidence values.

When calculating disparity depth, the stereo matching algorithm assigns a "confidence" to each disparity candidate, which means each pixel of the depth image carries 96 bytes of confidence values. If you are interested in all these cost values, you can use the ``stereoDepth.debugDispCostDump`` output, just note it's a very large output (eg. 1280*800*96 => 98MB for each frame).

.. image:: /_static/images/components/disparity_confidence.jpg

Stereo Subpixel mode will calculate the subpixel disparity by looking at the confidence values of the 2 neighboring disparity pixels in each direction. In the above example graph, in normal mode, StereoDepth would just get the max disparity = 34 pixels, but in Subpixel mode, it will return a bit more, eg. 37.375 pixels, as the confidences for pixels 35 and 36 are quite high as well.

**TL;DR:** Stereo Subpixel mode should always provide more accurate depth, but will consume additional HW resources (see :ref:`Stereo depth FPS` for the impact).

Stereo subpixel effect on layering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The default stereo depth output has 0..95 disparity pixels, which produces 96 unique depth values. This can especially be seen when using a pointcloud representation, where there are discrete "layers" of points instead of a smooth transition:

.. image:: /_static/images/components/layered-pointcloud.png

This layering can especially be seen at longer distances, where these layers are exponentially further apart.

But with Stereo Subpixel mode enabled, there are many more unique values possible, which produces more granular depth steps, and thus a smoother pointcloud.

.. math::
  94 * 2^3 [subpixel bits] + 2 [min/max value] = 754 unique values

.. math::
  94 * 2^4 [subpixel bits] + 2 [min/max value] = 1506 unique values

.. math::
  94 * 2^5 [subpixel bits] + 2 [min/max value] = 3010 unique values

One can change the number of subpixel bits by setting the ``stereoDepth.setSubpixelFractionalBits(int)`` parameter to 3, 4 or 5 bits, as shown in the sketch below.
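A minimal sketch of enabling Subpixel mode - assuming the fractional-bits setter is exposed on ``initialConfig`` (a :ref:`StereoDepthConfig`), as with the other stereo config options:

.. code-block:: python

    import depthai as dai

    pipeline = dai.Pipeline()
    stereo = pipeline.create(dai.node.StereoDepth)

    stereo.setSubpixel(True)                           # enable subpixel interpolation
    stereo.initialConfig.setSubpixelFractionalBits(5)  # 3, 4 or 5 fractional bits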
4. Short range stereo depth
***************************

To get accurate short-range depth, you'd first need to follow the :ref:`3. Improving depth accuracy` steps. For most normal-FOV, OV9282 OAK-D* cameras, you'd want to have the object/scene about 70cm away from the camera, where you'd get below 2% error (with good :ref:`Scene Texture`), so ± 1.5cm error.

But how do you get even better depth accuracy, eg. **sub-cm stereo depth accuracy**? As we have learned in :ref:`How baseline distance and focal length affect depth`, we would want to have a shorter baseline distance and/or narrower-FOV lenses.

That's why for short-range depth perception **we suggest using** `OAK-D SR `__, which has a 2 cm baseline distance, 800P resolution, and is ideal for depth sensing of up to 1 meter.

Going back to :ref:`Depth from disparity`, the minimal depth perception (**MinZ**) is defined by the following formula, where the disparity is 95 pixels (the maximum number of pixels for the disparity search):

.. math::
  depth = focalLength * baseline / disparity

.. math::
  MinZ = focalLength * baseline / 95

How to get lower MinZ
---------------------

If the depth results for close-in objects look weird, this is likely because they are below the MinZ distance of the OAK camera. You can get a lower MinZ for OAK cameras by either:

- :ref:`Lowering resolution `
- Enabling :ref:`Stereo Extended Disparity mode`
- Using :ref:`Disparity shift` - suggested in a controlled environment, where MaxZ is known

The first two options can be combined, which would set the minimum depth to 1/4 of the standard settings, but at such short distances the MinZ could be limited by the focal length.

Lowering resolution to decrease MinZ
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Above we have the formula for MinZ, and by lowering the resolution, we are also lowering the focal length (in pixels), so let's look at the formula again:

.. math::
  MinZ = focalLength * baseline / 95

.. math::
  MinZ [800P OAK-D] = 882.5 * 7.5 / 95 = 70 cm

.. math::
  MinZ [400P OAK-D] = 441 * 7.5 / 95 = 35 cm

As you can see, by lowering the resolution by 2, we are also lowering MinZ by 2. Note that because you have fewer pixels, you will also have lower depth accuracy (in cm).

Stereo Extended Disparity mode
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Very similar to :ref:`Lowering resolution to decrease MinZ`, Extended mode runs the stereo depth pipeline twice (thus consuming more HW resources); once with the resolution of the frame that was passed to the :ref:`StereoDepth` node, and once with the resolution downscaled by 2, then combines the 2 output disparity maps. A short sketch combining both MinZ-lowering options is shown below.
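This is a minimal pipeline fragment, assuming an OAK camera with OV9282 mono sensors on the LEFT/RIGHT sockets:

.. code-block:: python

    import depthai as dai

    pipeline = dai.Pipeline()

    # Lower resolution -> lower focal length in pixels -> lower MinZ
    monoLeft = pipeline.create(dai.node.MonoCamera)
    monoRight = pipeline.create(dai.node.MonoCamera)
    monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
    monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
    monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
    monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)

    stereo = pipeline.create(dai.node.StereoDepth)
    stereo.setExtendedDisparity(True)  # roughly halves MinZ again

    monoLeft.out.link(stereo.left)
    monoRight.out.link(stereo.right)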
Disparity shift
~~~~~~~~~~~~~~~

In a controlled environment, where MaxZ is known in advance, it's advised to use disparity shift to perceive a closer depth range, as it doesn't decrease depth accuracy the way the other 2 methods above do.

Disparity shift will shift the starting point of the disparity search, which will significantly decrease MaxZ, but it will also decrease MinZ. Disparity shift can be combined with extended/subpixel/LR-check modes.

.. image:: https://user-images.githubusercontent.com/18037362/189375017-2fa137d2-ad6b-46de-8899-6304bbc6c9d7.png

The **Left graph** shows min and max disparity and depth for OAK-D (7.5cm baseline, 800P resolution, ~70° HFOV) by default (disparity shift=0). See :ref:`Depth from disparity`. Since the hardware (stereo block) has a fixed 95-pixel disparity search, DepthAI will search from 0 pixels (depth=INF) to 95 pixels (depth=71cm).

The **Right graph** shows the same, but with the disparity shift set to 30 pixels. This means that the disparity search will be from 30 pixels (depth=2.2m) to 125 pixels (depth=50cm). This also means that depth will be very accurate at short range (**theoretically** below 5mm depth error).

**Limitations**:

- Because of the inverse relationship between disparity and depth, MaxZ will decrease much faster than MinZ as the disparity shift is increased. Therefore, it is **advised not to use a larger-than-necessary disparity shift**.
- The tradeoff in reducing the MinZ this way is that objects at **distances farther away than MaxZ will not be seen**.
- Because of the point above, **we only recommend using disparity shift when MaxZ is known**, such as having a depth camera mounted above a table pointing down at the table surface.
- The output disparity map is not expanded, only the depth map. So if the disparity shift is set to 50 and the disparity value obtained is 90, the real disparity is 140.

**Compared to Extended disparity**, disparity shift:

- **(+)** Is faster, as it doesn't require an extra computation, which means there's also no extra latency
- **(-)** Reduces the MaxZ (significantly), while extended disparity only reduces MinZ.

Disparity shift can be combined with extended disparity.

.. doxygenfunction:: dai::StereoDepthConfig::setDisparityShift
  :project: depthai-core
  :no-link:
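A minimal usage sketch - the shift value below is only an example and should be chosen based on the known MaxZ of your setup:

.. code-block:: python

    import depthai as dai

    pipeline = dai.Pipeline()
    stereo = pipeline.create(dai.node.StereoDepth)

    # Shift the disparity search starting point by 30 pixels (example value)
    stereo.initialConfig.setDisparityShift(30)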
Close range depth limitations
-----------------------------

Since depth is calculated from disparity, which requires the pixels to overlap, there is inherently a vertical band on the left side of the left mono camera and on the right side of the right mono camera, where depth cannot be calculated, since it is seen by only 1 stereo camera.

At a very close distance, even when enabling :ref:`Stereo Extended Disparity mode` and :ref:`Lowering resolution `, you will notice this vertical band of invalid depth pixels.

.. image:: https://user-images.githubusercontent.com/59799831/135310921-67726c28-07e7-4ffa-bc8d-74861049517e.png

Meaning of the variables on the picture:

- ``BL [cm]`` - Baseline of stereo cameras.
- ``Dv [cm]`` - Minimum distance where both cameras see an object (thus where depth can be calculated).
- ``B [pixels]`` - Width of the band where depth cannot be calculated.
- ``W [pixels]`` - Width of the mono camera in pixels (amount of horizontal pixels), also noted as :code:`HPixels` in other formulas.
- ``D [cm]`` - Distance from the **camera plane** to an object (see image :ref:`here `).
- ``F [cm]`` - Width of the image at the distance ``D``.

.. image:: https://user-images.githubusercontent.com/59799831/135310972-c37ba40b-20ad-4967-92a7-c71078bcef99.png

With the use of the :code:`tan` function, the following formulas can be obtained:

.. math::
  F = 2 * D * tan(HFOV/2)

.. math::
  Dv = (BL/2) * tan(90 - HFOV/2)

In order to obtain :code:`B`, we can use the :code:`tan` function again (same as for :code:`F`), but this time we must also multiply it by the ratio between :code:`W` and :code:`F` in order to convert the units to pixels. That gives the following formula:

.. math::
  B = 2 * Dv * tan(HFOV/2) * W / F

.. math::
  B = 2 * Dv * tan(HFOV/2) * W / (2 * D * tan(HFOV/2))

.. math::
  B = W * Dv / D [pixels]

Example: If we are using an OAK-D, which has an HFOV of 72° and a baseline (:code:`BL`) of 7.5 cm, at 640x400 (400P) resolution (therefore :code:`W = 640`), and an object is :code:`D = 100` cm away, we can calculate :code:`B` in the following way:

.. math::

  Dv = 7.5 / 2 * tan(90 - 72/2) = 3.75 * tan(54°) = 5.16 cm

  B = 640 * 5.16 / 100 = 33 [pixels]

Credit for the calculations and images goes to our community member gregflurry, who shared them in `this `__ forum post.

5. Long range stereo depth
**************************

To get accurate long-range depth, we should first go through the :ref:`3. Improving depth accuracy` steps, as they are especially applicable to long-range depth.

For long-range depth, we should also consider the following:

- Narrow FOV lenses
- Wide baseline distance between the stereo cameras

That's why for long range **we suggest using** `OAK-D LR `__, which has a (larger) baseline distance of 15cm and a default FOV of 60°. It has `M12 mount lenses `__, so users can replace these with even narrower (or wider) FOV lenses.

6. Fixing noisy pointcloud
**************************

For a noisy pointcloud we suggest a few approaches:

* (mentioned above) Start with the :ref:`Fixing noisy depth <2. Fixing noisy depth>` chapter, as otherwise noise will produce points all over the pointcloud
* (mentioned above) Continue with the :ref:`Improving depth accuracy <3. Improving depth accuracy>` chapter - depth inaccuracy will be easily visible in the pointcloud

  * Enable Stereo Subpixel mode, especially due to the :ref:`Stereo subpixel effect on layering`

* :ref:`Decimation filter for pointcloud` for faster processing (FPS) and additional filtering
* :ref:`Invalidating pixels around the corner` should help to reduce noise around the corners of the depth frame
* :ref:`Host-side pointcloud filtering` for additional filtering

Decimation filter for pointcloud
--------------------------------

The :ref:`Decimation filter` is especially useful for pointclouds - you don't really want 1 million points (even though it sounds good for marketing), as that's too much data to process. The decimation filter helps here, and should be enabled when working with pointclouds.

When using the decimation filter for pointclouds you should enable **median/mean mode decimation**, as it will provide additional filtering (compared to the pixel skipping mode). It also makes the other :ref:`Stereo postprocessing filters` faster, since there will be less data to process.

Invalidating pixels around the corner
-------------------------------------

There are often invalid/noisy pixels around the corners, and we have seen that some customers preventively invalidate a few pixels (eg. 3) all around the corners of the depth image. We also suggest enabling the :ref:`Brightness filter`, especially due to rectification "artifacts".

Host-side pointcloud filtering
------------------------------

Besides the device-side :ref:`Stereo postprocessing filters`, we also suggest running host-side pointcloud filtering (with eg. the `Open3D `__ or `PCL `__ library).

We especially suggest using pointcloud voxelization and statistical outlier removal techniques; see the `example here `__ for both of these.

..
  Best practices in certain environments
  **************************************

  - In high dynamic range env (like outside), use brightness filter (img above)
  - In more static env, temporal filter

.. 
include:: /includes/footer-short.rst
\ No newline at end of file
diff --git a/docs/source/tutorials/low-latency.rst b/docs/source/tutorials/low-latency.rst
index 72358b398..978e467cb 100644
--- a/docs/source/tutorials/low-latency.rst
+++ b/docs/source/tutorials/low-latency.rst
@@ -44,7 +44,7 @@ disabled for these tests (:code:`pipeline.setXLinkChunkSize(0)`). For an example
 - `link `__
 - **Time-to-Host** is measured time between frame timestamp (:code:`imgFrame.getTimestamp()`) and host timestamp when the frame is received (:code:`dai.Clock.now()`).
-- **Histogram** shows how much Time-to-Host varies frame to frame. Y axis represents number of frame that occured at that time while the X axis represents microseconds.
+- **Histogram** shows how much Time-to-Host varies frame to frame. The Y axis represents the number of frames that occurred at that time, while the X axis represents microseconds.
 - **Bandwidth** is calculated bandwidth required to stream specified frames at specified FPS. Encoded frames