RSDK-6530: Add depth map encode decode methods to C++ SDK #213

hexbabe · 2024-02-15T17:58:25Z

From https://viam.atlassian.net/browse/RSDK-6530 :

Sean Yu wrote a deserialize_depth_map function in order to consume depth maps in cpp-land for the rgb-d-overlay module. It is best to make this a static Camera method in the C++ SDK.

stuqdog

Nice, glad to see this here! Generally looks good though I can't speak much to the actual camera logic. I do have a couple small suggestions/questions.

stuqdog · 2024-02-15T18:29:19Z

src/viam/sdk/components/camera/camera.hpp

+    /// @brief Decode image data of custom MIME type FORMAT_RAW_DEPTH into a standard
+    /// representation.


Can we add docstring comments here to explain the arguments, what is returned, and when this might throw an exception?

stuqdog · 2024-02-15T19:01:21Z

src/viam/sdk/components/camera/camera.cpp

+std::tuple<uint64_t, uint64_t, std::vector<uint16_t>> Camera::deserialize_depth_map(
+    const std::vector<unsigned char>& data) {
+    if (data.size() < 24) {
+        throw std::runtime_error("Data too short to contain valid depth information");


We should be throwing our internal Exception class rather than runtime_error.

stuqdog · 2024-02-15T19:04:45Z

src/viam/sdk/components/camera/camera.hpp


+    /// @brief Decode image data of custom MIME type FORMAT_RAW_DEPTH into a standard
+    /// representation.
+    static std::tuple<uint64_t, uint64_t, std::vector<uint16_t>> deserialize_depth_map(


(q) I'm not familiar with depth map logic so perhaps this is totally standard, but would it make sense to create a struct type to return instead of a tuple so we can make it clear which value is height, which width, and which the arr?

stuqdog · 2024-02-15T19:05:29Z

src/viam/sdk/components/camera/camera.hpp


+    /// @brief Decode image data of custom MIME type FORMAT_RAW_DEPTH into a standard
+    /// representation.
+    static std::tuple<uint64_t, uint64_t, std::vector<uint16_t>> deserialize_depth_map(


Is it possible/sensible to add tests for this method?

I think we should. It will require writing an equivalent encode_depth_map function, but I think that's a good idea, so I will do it.

stuqdog · 2024-02-15T19:05:46Z

src/viam/sdk/components/camera/camera.cpp

+    }
+
+    if (data.size() < 24 + width * height * sizeof(uint16_t)) {
+        throw std::runtime_error("Data size does not match width and height");


See above re Exception.


hexbabe · 2024-02-16T16:23:10Z

Wait, does the run-tests action run unit tests?

src/viam/sdk/components/camera/camera.cpp

stuqdog · 2024-02-16T18:48:45Z

Wait, does the run-tests action run unit tests?

yep! it's a little funky though, if one test fails in a file then often it says they all failed. Lemme know if you want any help with debugging.

bhaney

Thanks for adding this! Just some small changes for clarity

bhaney · 2024-02-22T17:20:52Z

src/viam/sdk/components/camera/camera.cpp

+    size_t offset = 0;
+    const uint64_t magic_number = read_uint64_big_endian(data, offset);
+    if (magic_number != MAGIC_NUMBER) {
+        throw Exception("Invalid magic number. The data may not be a depth map.");


Maybe explain more here? This would be an extremely frustrating error to get (but also a common one, if they tried to feed in an image that wasn't a viam depth map). Maybe say something like "Invalid header for a vnd.viam.dep encoded depth image. The data may be corrupted, or not a viam-encoded depth map."
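As a rough illustration of the suggestion above, here is a minimal sketch of a header check that raises the more descriptive message. The names `depth_header_error`, `k_magic_number`, and `check_depth_header` are hypothetical, and `depth_header_error` is only a stand-in for the SDK's own Exception class, which is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Stand-in for the SDK's Exception class (assumption: this sketch does not use the real one).
struct depth_header_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// 'DEPTHMAP' as eight bytes, most significant byte first.
constexpr std::uint64_t k_magic_number = 0x44455054484D4150ULL;

// Throws with a message that names the expected format when the first eight bytes are wrong.
inline void check_depth_header(const std::vector<unsigned char>& data) {
    if (data.size() < sizeof(std::uint64_t)) {
        throw depth_header_error("Data too short to contain a vnd.viam.dep header.");
    }
    std::uint64_t magic = 0;
    for (std::size_t i = 0; i < sizeof(std::uint64_t); ++i) {
        magic = (magic << 8) | data[i];  // big-endian: most significant byte comes first
    }
    if (magic != k_magic_number) {
        throw depth_header_error(
            "Invalid header for a vnd.viam.dep encoded depth image. "
            "The data may be corrupted, or not a viam-encoded depth map.");
    }
}
```

A caller that accidentally passes JPEG or PNG bytes would then see which format was expected rather than a bare "invalid magic number".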

bhaney · 2024-02-22T17:22:32Z

src/viam/sdk/components/camera/camera.cpp

+    for (size_t i = 0; i < width * height; ++i) {
+        const size_t data_index = HEADER_SIZE + i * sizeof(uint16_t);
+        const uint16_t depth_value = static_cast<uint16_t>(
+            data[data_index] << 8 | data[data_index + 1]);  // Assemble from big endian into uint16


bhaney · 2024-02-22T17:24:45Z

src/viam/sdk/components/camera/camera.hpp

    };

+    /// @brief UTF-8 encoding of 'DEPTHMAP' used in the header of FORMAT_RAW_DEPTH bytes payload.
+    static const uint64_t MAGIC_NUMBER = 0x44455054484D4150ULL;


Could you explain what the ULL at the end is? Seems like "extra" stuff - 0x44455054484D4150 is the complete number

These constants belong in the .cpp file. Also you don't need to use SHOUTY constants in C++. Often in the C++SDK we prefix constants with k_ to indicate that they are constant.

From the gcc manual:

ISO C99 supports data types for integers that are at least 64 bits wide ( . . . ) . To make an integer constant of type long long int, add the suffix LL to the integer. To make an integer constant of type unsigned long long int, add the suffix ULL to the integer.

It should ensure that the literal has the correct type and size @bhaney
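To make the effect of the suffix concrete, the following sketch checks the literal's type at compile time and spells out the bytes of the magic number. The helper `magic_as_text` is hypothetical and exists only for illustration.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// With the suffix, the literal's type is fixed regardless of platform.
static_assert(std::is_same<decltype(0x44455054484D4150ULL), unsigned long long>::value,
              "ULL literal is unsigned long long");

// Without a suffix, a hex literal takes the first type that can hold it. This value fits in
// 63 bits, so the compiler picks a signed type (long or long long, depending on platform).
static_assert(std::is_signed<decltype(0x44455054484D4150)>::value,
              "unsuffixed literal is signed for this value");

// Hypothetical helper: spell out the eight bytes of the magic number, most significant first.
inline std::string magic_as_text(std::uint64_t magic) {
    std::string text;
    for (int shift = 56; shift >= 0; shift -= 8) {
        text.push_back(static_cast<char>((magic >> shift) & 0xFF));
    }
    return text;
}
```

So the digits are indeed the complete number; the suffix only pins down the type the literal has before it is converted to uint64_t.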

bhaney · 2024-02-22T17:27:22Z

src/viam/sdk/components/camera/camera.hpp

+    struct depth_map {
+        uint64_t width;                      ///< Width of the depth map in pixels.
+        uint64_t height;                     ///< Height of the depth map in pixels.
+        std::vector<uint16_t> depth_values;  ///< A flat vector of depth values.


The depth info will almost always be converted into some 2D array eventually. Like you said, a lot of frameworks have methods of reading in a flat vector and turning it into a 2D array. The python SDK reshapes the data into a 2D list (though a numpy array would have been better IMO --- but they don't want to add numpy as a dependency).

Could totally just add a public get(x,y) method to this struct for a bare-bones convenience -- then again -- depending on whether a user is using it as an "image" or a "matrix" will make the accessor different. get(i, j) which treats the data as a matrix indexes on a row first. get(x, y) which treats the data as an image accesses on a column first. I would say that "image-like" access makes more sense, since this is a depth image.
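A minimal sketch of the two accessor conventions discussed above, assuming the flat vector is stored row-major (one row of `width` values after another). The struct name `flat_depth_map` and the accessor names `at_xy` and `at_ij` are hypothetical and not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the struct under review.
struct flat_depth_map {
    std::uint64_t width;
    std::uint64_t height;
    std::vector<std::uint16_t> depth_values;  // assumed row-major layout

    // Image-style access: x is the column, y is the row.
    std::uint16_t at_xy(std::size_t x, std::size_t y) const {
        return depth_values.at(y * width + x);
    }

    // Matrix-style access: i is the row, j is the column.
    std::uint16_t at_ij(std::size_t i, std::size_t j) const {
        return depth_values.at(i * width + j);
    }
};
```

Both accessors read the same storage; they differ only in argument order, which is why the choice of convention matters for callers.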

bhaney · 2024-02-22T17:29:09Z

src/viam/sdk/tests/test_camera.cpp

+    const uint64_t width = 2;
+    const uint64_t height = 2;
+    const std::vector<uint16_t> depth_values = {100, 200, 300, 400};


Can you make the test case non-symmetric, like a 2x3 image at least? if you accidentally swapped height and width, you wouldn't know from this test case.
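The point about symmetric test data can be shown with a small sketch. The helpers `to_rows` and `swap_is_detectable` are hypothetical; they assume a row-major flat buffer and are not taken from the SDK tests.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using rows_t = std::vector<std::vector<std::uint16_t>>;

// Hypothetical helper: split a flat, row-major buffer into `height` rows of `width` values.
inline rows_t to_rows(const std::vector<std::uint16_t>& flat, std::size_t width,
                      std::size_t height) {
    rows_t rows;
    for (std::size_t r = 0; r < height; ++r) {
        rows.emplace_back(flat.begin() + r * width, flat.begin() + (r + 1) * width);
    }
    return rows;
}

// True when swapping the two dimensions changes the result, i.e. when a test using this
// shape would notice a width/height mix-up.
inline bool swap_is_detectable(const std::vector<std::uint16_t>& flat, std::size_t width,
                               std::size_t height) {
    return to_rows(flat, width, height) != to_rows(flat, height, width);
}
```

With the 2x2 data from the diff the swap goes unnoticed, while a buffer of six values with width 3 and height 2 exposes it.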

acmorrow · 2024-02-23T19:14:26Z

@hexbabe -

I think I had sort of misunderstood the point of this review. I expected it was creating a new type like raw_image that would actually be used in the Camera api (e.g. raw_image Camera::get_image(...)). But that isn't what is going on here: the depth_map type isn't returned or consumed by Camera, and the two new functions exist only to transform a byte buffer into a depth_map and vice versa.

So, what is the flow of control here? How does the user of the SDK end up with a byte buffer which happens to contain an encoded depth map, and know that it should be passed to Camera::decode_depth_map? Or is this just a "part 1" review and subsequent changes will add methods that traffic in depth_map to Camera?

acmorrow

Overall, the code looks good. I have some small comments and suggestions.

I do wonder whether 1) the depth_map stuff belongs on Camera at all given that the rest of the Camera API doesn't make use of it in any way (yet?), and 2) how exactly it is expected for clients or module authors to use these APIs.

acmorrow · 2024-02-23T19:17:33Z

src/viam/sdk/components/camera/camera.cpp

+// Appends a uint64_t value in big-endian format to a byte vector and updates the offset.
+void append_uint64_big_endian(std::vector<unsigned char>& data, size_t& offset, uint64_t value) {
+    const uint64_t value_be = boost::endian::native_to_big(value);
+    if (data.size() < offset + 8) {


You can get rid of a lot of these explicit 8's by using sizeof

acmorrow · 2024-02-23T19:18:39Z

src/viam/sdk/components/camera/camera.hpp

    };

+    /// @brief UTF-8 encoding of 'DEPTHMAP' used in the header of FORMAT_RAW_DEPTH bytes payload.
+    static const uint64_t MAGIC_NUMBER = 0x44455054484D4150ULL;


These constants belong in the .cpp file. Also you don't need to use SHOUTY constants in C++. Often in the C++SDK we prefix constants with k_ to indicate that they are constant.

acmorrow · 2024-02-23T19:22:24Z

src/viam/sdk/components/camera/camera.hpp

+    ///         per depth value.
+    /// @throws Exception: if the depth data values do not correspond to height and width.
+    ///
+    static std::vector<unsigned char> encode_depth_map(const Camera::depth_map& m);


I take back my comment about it being part of the public API. My comment was based on my assumption that Camera methods were actually going to be trading in depth_maps, in which case I wouldn't have expected this part of the transformation pair to need to be visible to users. However, per my newer understanding about what is being done here, I agree that it needs to be public. I still have some questions though about how users are expected to know what is and isn't a buffer containing a depth map that they should be passing to Camera::decode_depth_map, or what they are expected to do with a buffer containing an encoded depth map after they call Camera::encode_depth_map.

acmorrow · 2024-02-23T19:22:46Z

src/viam/sdk/components/camera/camera.hpp

+    /// data.
+    /// @throws Exception: if the data is misformatted e.g. doesn't contain valid depth information,
+    ///         or if the data size does not match the expected size based on the width and height.
+    static Camera::depth_map decode_depth_map(const std::vector<unsigned char>& data);


acmorrow · 2024-02-23T19:23:34Z

src/viam/sdk/components/camera/camera.cpp

+}
+
+std::vector<unsigned char> Camera::encode_depth_map(const Camera::depth_map& m) {
+    if (m.depth_values.size() != m.width * m.height) {


Couldn't the depth_map constructor enforce this invariant?

acmorrow · 2024-02-23T19:25:18Z

src/viam/sdk/components/camera/camera.cpp

+    size_t offset = 0;
+
+    // Network data is stored in big-endian, while most host systems are little endian.
+    append_uint64_big_endian(data, offset, MAGIC_NUMBER);


Might be better if offset was passed by pointer here to make it clearer to a reader that it was going to be modified.

acmorrow · 2024-02-23T19:26:53Z

src/viam/sdk/components/camera/camera.cpp

+    depth_map.width = width;
+    depth_map.height = height;
+    depth_map.depth_values = std::move(arr);
+    return depth_map;


I'll bet you can write this entire thing as just return {width, height, std::move(arr)};
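For reference, a sketch of that one-line return. It relies on the struct being an aggregate whose members are declared in the order width, height, depth_values, as in the diff. The names `depth_map_aggregate` and `make_depth_map` are hypothetical.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical mirror of the struct from the diff.
struct depth_map_aggregate {
    std::uint64_t width;
    std::uint64_t height;
    std::vector<std::uint16_t> depth_values;
};

// Builds the result directly in the return statement; members are initialized in
// declaration order, and the vector is moved rather than copied.
inline depth_map_aggregate make_depth_map(std::uint64_t width, std::uint64_t height,
                                          std::vector<std::uint16_t> arr) {
    return {width, height, std::move(arr)};
}
```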

acmorrow · 2024-02-23T19:28:32Z

src/viam/sdk/components/camera/camera.hpp

+    /// depth_map holds the width and height of a depth map, along with a vector
+    /// of actual depth values. Each depth value is a 16-bit unsigned integer representing
+    /// the distance from the camera to a point in the scene.
+    struct depth_map {


There are invariants for a valid depth_map, notably that depth_values.size() == width * height. That'd argue for making this a class so that the invariant can be enforced on construction and then preserved.
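One way this could look, as a sketch only: a class that checks the size invariant in its constructor and exposes read-only accessors so the invariant cannot be broken afterwards. The class name `checked_depth_map` is hypothetical, and `std::invalid_argument` stands in for the SDK's Exception type.

```cpp
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical class; std::invalid_argument is a stand-in for the SDK's own Exception.
class checked_depth_map {
   public:
    checked_depth_map(std::uint64_t width, std::uint64_t height,
                      std::vector<std::uint16_t> values)
        : width_(width), height_(height), values_(std::move(values)) {
        // Enforce the invariant once, at construction.
        if (values_.size() != width_ * height_) {
            throw std::invalid_argument("depth_values.size() must equal width * height");
        }
    }

    std::uint64_t width() const { return width_; }
    std::uint64_t height() const { return height_; }
    const std::vector<std::uint16_t>& values() const { return values_; }

   private:
    std::uint64_t width_;
    std::uint64_t height_;
    std::vector<std::uint16_t> values_;
};
```

With a type like this, an encode function would no longer need its own size check, since every instance it receives is already valid.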

hexbabe · 2024-02-23T19:37:38Z

@hexbabe -

I think I had sort of misunderstood the point of this review. I expected it was creating a new type like raw_image that would actually be used in the Camera api (e.g. raw_image Camera::get_image(...)). But that isn't what is going on here: the depth_map type isn't returned or consumed by Camera, and the two new functions exist only to transform a byte buffer into a depth_map and vice versa.

So, what is the flow of control here? How does the user of the SDK end up with a byte buffer which happens to contain an encoded depth map, and know that it should be passed to Camera::decode_depth_map? Or is this just a "part 1" review and subsequent changes will add methods that traffic in depth_map to Camera?

The scope of this PR is to put decode and encode methods into the SDK as sort of util methods for depth map handling in modules and clients. For decoding, I wanted to mirror how they are handled in the Python SDK. In the Python SDK, we leave it to the client to call raw_img.bytes_to_depth_array() to decode the depth map. The goal here is to provide similar decode util for C++ SDK clients. For encoding, we would expect depth camera module creators to use the encode function to convert their raw bytes into the Viam depth mime type format.

Sorry for all the confusion. There is definitely a misalignment in what the methods in this PR should be doing/where they should be used. My mistake was thinking that this ticket would be as simple as copying over the function I wrote for the RGB-D overlay module into Camera to be used as util functions. If I were to redo this PR, I would've communicated the requirements and thought harder about how to integrate the methods into the existing C++ APIs.

To amend this mistake, I think we should chat briefly to discuss where it makes sense to place these functions in a less haphazard way that is appropriate for each use case.

We should definitely add these methods though because for most of the modules I've worked on so far, I've had to write my own custom transcoding logic for Viam depth map mime return types.
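For readers unfamiliar with the format, here is a self-contained sketch of the round trip these utilities are meant to provide. It assumes a 24-byte header of three big-endian uint64 values (magic number, then width, then height; the width-before-height order is an assumption, not confirmed in this thread) followed by one big-endian uint16 per depth value. All names are hypothetical and the code is not the SDK implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

struct depth_frame {  // hypothetical type
    std::uint64_t width;
    std::uint64_t height;
    std::vector<std::uint16_t> values;  // row-major, width * height entries
};

constexpr std::uint64_t k_magic = 0x44455054484D4150ULL;  // 'DEPTHMAP'
constexpr std::size_t k_header_size = 3 * sizeof(std::uint64_t);

// Appends the low n_bytes of v to out, most significant byte first.
inline void put_be(std::vector<unsigned char>& out, std::uint64_t v, std::size_t n_bytes) {
    for (std::size_t i = 0; i < n_bytes; ++i) {
        out.push_back(static_cast<unsigned char>((v >> (8 * (n_bytes - 1 - i))) & 0xFF));
    }
}

// Reads n_bytes starting at pos as a big-endian unsigned integer.
inline std::uint64_t get_be(const std::vector<unsigned char>& in, std::size_t pos,
                            std::size_t n_bytes) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n_bytes; ++i) {
        v = (v << 8) | in[pos + i];
    }
    return v;
}

inline std::vector<unsigned char> encode_depth_frame(const depth_frame& f) {
    if (f.values.size() != f.width * f.height) {
        throw std::invalid_argument("values.size() must equal width * height");
    }
    std::vector<unsigned char> out;
    out.reserve(k_header_size + f.values.size() * sizeof(std::uint16_t));
    put_be(out, k_magic, sizeof(std::uint64_t));
    put_be(out, f.width, sizeof(std::uint64_t));
    put_be(out, f.height, sizeof(std::uint64_t));
    for (const std::uint16_t v : f.values) {
        put_be(out, v, sizeof(std::uint16_t));
    }
    return out;
}

inline depth_frame decode_depth_frame(const std::vector<unsigned char>& in) {
    if (in.size() < k_header_size || get_be(in, 0, 8) != k_magic) {
        throw std::invalid_argument("not a DEPTHMAP-encoded buffer");
    }
    const std::uint64_t width = get_be(in, 8, 8);
    const std::uint64_t height = get_be(in, 16, 8);
    if (in.size() < k_header_size + width * height * sizeof(std::uint16_t)) {
        throw std::invalid_argument("data size does not match width and height");
    }
    std::vector<std::uint16_t> values;
    values.reserve(width * height);
    for (std::size_t i = 0; i < width * height; ++i) {
        values.push_back(static_cast<std::uint16_t>(
            get_be(in, k_header_size + i * sizeof(std::uint16_t), sizeof(std::uint16_t))));
    }
    return {width, height, std::move(values)};
}
```

A module author would call the encode side before returning bytes with the depth MIME type, and a client would call the decode side on the bytes it receives.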

acmorrow · 2024-03-06T17:53:32Z

@hexbabe and I met to discuss this. The plan is to move forward with this review. The depth_map type will be re-worked in terms of xtensor (either it will be an xtensor type alias, or it will wrap an xtensor). There will then be a subsequent review where the Camera interface is adjusted in order to integrate depth map information. This might look like having Camera::get_image return a variant over raw_image and depth_map, or adding a get_depth_map function that returns an Optional<depth_map>. Similar adjustments will need to be made to get_images to incorporate depth_maps.


acmorrow

This looks really good, though I think there is a memory error that needs to be addressed.

acmorrow · 2024-03-08T19:00:32Z

src/viam/sdk/components/camera.cpp

+        for (size_t j = 0; j < width; ++j) {
+            const uint16_t value = m(i, j);
+            const uint16_t value_be = boost::endian::native_to_big(value);
+            std::memcpy(&data[offset], &value_be, sizeof(uint16_t));
+            offset += sizeof(uint16_t);


If you make append_uint64_big_endian a template on the value type, you could re-use it here:

template <typename T> void append_big_endian(std::vector<unsigned char>& data, size_t* offset, T value);

Wow that's so elegant it brings a tear to my eye
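A possible shape for the templated helpers, combining the earlier suggestions to use sizeof instead of literal 8s and to pass the offset by pointer. This is a sketch under assumptions: manual shifts replace boost::endian so the example has no dependencies, growing the buffer when it is too small is a guess at the intended behavior, and `read_big_endian` is written here as the matching counterpart.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Writes `value` into `data` at `*offset` in big-endian order and advances the offset.
// Grows the buffer if it is too small (assumed behavior).
template <typename T>
void append_big_endian(std::vector<unsigned char>& data, std::size_t* offset, T value) {
    static_assert(std::is_unsigned<T>::value, "sketch supports unsigned integers only");
    if (data.size() < *offset + sizeof(T)) {
        data.resize(*offset + sizeof(T));
    }
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const std::size_t shift = 8 * (sizeof(T) - 1 - i);
        data[*offset + i] = static_cast<unsigned char>((value >> shift) & 0xFF);
    }
    *offset += sizeof(T);
}

// Reads a big-endian T from `data` at `*offset` and advances the offset.
template <typename T>
T read_big_endian(const std::vector<unsigned char>& data, std::size_t* offset) {
    static_assert(std::is_unsigned<T>::value, "sketch supports unsigned integers only");
    if (data.size() < *offset + sizeof(T)) {
        throw std::out_of_range("not enough bytes to read value");
    }
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        value = static_cast<T>((value << 8) | data[*offset + i]);
    }
    *offset += sizeof(T);
    return value;
}
```

At the call site, `&offset` makes it visible that the helper will modify the offset, which was the motivation for the pointer parameter.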

acmorrow · 2024-03-08T19:05:29Z

src/viam/sdk/components/camera.cpp

+        depth_values.push_back(depth_value);
+    }
+
+    return xt::adapt(depth_values, std::array<size_t, 2>{height, width});


I think this is going to cause problems: I think xt::adapt is only going to view the memory in depth_values, and it will go out of scope when this function returns.

I think you will need another approach to construct an xtensor that owns the memory here. Let me know if you need me to look into how to do that best.


hexbabe · 2024-03-08T20:56:13Z

src/viam/sdk/components/camera.cpp

+    }
+
+    xt::xarray<uint16_t> m = xt::xarray<uint16_t>::from_shape({height, width});
+    std::copy(depth_values.begin(), depth_values.end(), m.begin());


I looked up a method to copy the memory in. Let me know if there's a better way to do it @acmorrow

Hey, I was actually just looking at that. So, what you have here is fine, though it does cost you a copy. If you wanted to avoid the copy though, I think that's totally doable.

The trick would be to not create a std::vector, but just to directly create the xt::xarray you want to return, of the appropriate size and shape, before the loop, and then populate it in the loop.

xt::xarray<uint16_t> m = xt::xarray<uint16_t>::from_shape({height, width});
for (size_t i = 0; i < width * height; ++i) {
    m(<ii>, <jj>) = read_big_endian<uint16_t>(data, &offset);
}

Now, that'd require computing the correct ii and jj from i, height, and width, which is a bit of a hassle. However, xtensor lets you reshape and view. So I think you could do something like this:

xt::xarray<uint16_t> m = xt::xarray<uint16_t>::from_shape({height, width});
auto m_linear_view = xt::flatten(m);
for (size_t i = 0; i < width * height; ++i) {
    m_linear_view[i] = read_big_endian<uint16_t>(data, &offset);
}
return m;

It may not be exactly flatten, you might need reshape_view? I'd need to spend a little time with the xtensor docs to be sure. But something from https://xtensor.readthedocs.io/en/latest/view.html.

I'd say spend no more than 15 minutes on it. If you can make it work, great. If not, put in a TODO and leave it as a copy.
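If the view approach does not pan out, the ii and jj computation mentioned above is small. A sketch, assuming row-major order with shape {height, width}; the helper name `row_col_from_linear` is hypothetical.

```cpp
#include <cstddef>
#include <utility>

// Maps a linear, row-major index i to (row, column) for a grid that is `width` columns wide.
inline std::pair<std::size_t, std::size_t> row_col_from_linear(std::size_t i,
                                                               std::size_t width) {
    return {i / width, i % width};
}
```

Inside the decode loop, the first element of the pair would serve as ii and the second as jj.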

acmorrow

LGTM if you want to keep the copy. If you elect to use reshape/view to build directly into the output tensor, please post an update with that so I can give it a look.

OOO

Add method to camera.hpp and camera.cpp

44e961b

hexbabe requested a review from a team as a code owner February 15, 2024 17:58

hexbabe requested review from njooma, purplenicole730 and stuqdog and removed request for a team February 15, 2024 17:58

hexbabe added 2 commits February 15, 2024 13:01

Run clang-format

d02db03

Make values that can be consts consts

6068ba7

hexbabe force-pushed the RSDK-6530 branch from a009255 to 6068ba7 Compare February 15, 2024 18:07

stuqdog requested changes Feb 15, 2024


hexbabe changed the title ~~RSDK-6530: Add deserialize_depth_map to C++ SDK~~ RSDK-6530: Add decode_depth_map to C++ SDK Feb 16, 2024

Add first draft impl of encode/decode + tests. Mostly throwing this up to remote to run tests

8edae9a

hexbabe force-pushed the RSDK-6530 branch from cc67c66 to 8edae9a Compare February 16, 2024 16:19

hexbabe and others added 2 commits February 16, 2024 16:23

Merge branch 'main' into RSDK-6530

390eaf2

Make name and syntax changes

1b0126f

hexbabe changed the title ~~RSDK-6530: Add decode_depth_map to C++ SDK~~ RSDK-6530: Add depth map encode decode methods to C++ SDK Feb 16, 2024

hexbabe commented Feb 16, 2024


Make encode_depth_map static oops

ecd98f4

hexbabe force-pushed the RSDK-6530 branch 5 times, most recently from 93e6c57 to 7f70d7c Compare February 16, 2024 18:06

Add better debugging and little endian check (why the test is failing?)

b6fe35f

hexbabe force-pushed the RSDK-6530 branch from 7f70d7c to b6fe35f Compare February 16, 2024 18:12

hexbabe force-pushed the RSDK-6530 branch 3 times, most recently from 6dddc8f to 3469494 Compare February 16, 2024 19:20

bhaney previously requested changes Feb 22, 2024


acmorrow requested changes Feb 23, 2024


Address all review points except xtensor

988cec8

hexbabe force-pushed the RSDK-6530 branch from 68f9e75 to 06601da Compare March 7, 2024 19:22

Merge branch 'main' into RSDK-6530

09398fe

hexbabe force-pushed the RSDK-6530 branch 4 times, most recently from 1366e6d to 16fba03 Compare March 8, 2024 15:26

Change read bytes helper to take offset as pointer; Change header and implementation to use xtensor aliased depth map type; Fixed import orders; Updated tests accordingly

244dfef

hexbabe force-pushed the RSDK-6530 branch from 16fba03 to 244dfef Compare March 8, 2024 15:47

hexbabe requested review from acmorrow and bhaney March 8, 2024 16:02

acmorrow requested changes Mar 8, 2024


Change xtensor to own depth values memory; Change read and append big-endian helpers to use template value types

df75842

hexbabe force-pushed the RSDK-6530 branch 3 times, most recently from a77c318 to b84b545 Compare March 8, 2024 19:56

Remove unnecessary imports

7bf6b60

hexbabe force-pushed the RSDK-6530 branch from b84b545 to 7bf6b60 Compare March 8, 2024 19:57

hexbabe requested a review from acmorrow March 8, 2024 20:55

hexbabe commented Mar 8, 2024


acmorrow approved these changes Mar 8, 2024


hexbabe merged commit 641a618 into viamrobotics:main Mar 11, 2024

hexbabe deleted the RSDK-6530 branch March 11, 2024 13:48

hexbabe mentioned this pull request Apr 15, 2024

[RSDK-6710] Consume depth map transcoders from C++ SDK viam-modules/viam-camera-realsense#36

Merged
