
Any documentation on the depth frame format? #7

Closed
andybak opened this issue Dec 1, 2020 · 12 comments

Comments

@andybak

andybak commented Dec 1, 2020

I've exported a recording in the native r3d format and I'm attempting to read the depth data

>>> pth = 'winhome/Documents/3D Scans/2020-10-28--15-01-03/rgbd/1.depth'
>>> fh = open(pth, "rb")
>>> compressed = fh.read()
>>> decompressed = liblzfse.decompress(compressed)

But then I'm not sure what to do with the decompressed data. Is it just a case of reading every 4 bytes and unpacking them as a single-precision float? The jpgs are 192×256, and doing the maths on that seems to add up: 192 × 256 × 4 = 196608, and len(decompressed) gives me 196608.

So this looks right:

>>> f = [struct.unpack('f', decompressed[x:x+4])[0] for x in range(0, len(decompressed), 4)]

Then I guess I can just write f into any image format that supports floating point (.hdr or .exr maybe)

Am I on the right lines? Are the values linear distances from the camera?

If so - it would be nice to add this to the docs.
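(If EXR/HDR tooling isn't handy, one alternative is to quantize the meter values to 16-bit millimeters, which any 16-bit PNG writer can store losslessly up to about 65 m. A minimal sketch, not part of Record3D itself, with made-up example values:)

```python
import numpy as np

# Hypothetical example data: depth values in meters (float32),
# as unpacked from a decompressed .depth buffer.
depth_m = np.array([0.5, 1.25, 3.0], dtype=np.float32)

# Quantize to uint16 millimeters (max representable depth ~65.5 m);
# any 16-bit grayscale PNG writer can then save the array as-is.
depth_mm = np.round(depth_m * 1000.0).astype(np.uint16)

print(depth_mm)  # [ 500 1250 3000]
```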

@andybak
Author

andybak commented Dec 1, 2020

Follow up question - what are the .conf files for? Are there some docs on this I've overlooked?

@marek-simonik
Owner

Hello Andy,
yes, you are right. See this simple example of how to load a .depth file:

import numpy as np
import cv2
import liblzfse  # https://pypi.org/project/pyliblzfse/


def load_depth(filepath):
    with open(filepath, 'rb') as depth_fh:
        raw_bytes = depth_fh.read()
        decompressed_bytes = liblzfse.decompress(raw_bytes)
        depth_img = np.frombuffer(decompressed_bytes, dtype=np.float32)

    depth_img = depth_img.reshape((640, 480))  # For a FaceID camera 3D Video
    # depth_img = depth_img.reshape((256, 192))  # For a LiDAR 3D Video

    return depth_img


if __name__ == '__main__':
    depth_filepath = '/tmp/depth_0.lzfse'
    depth_img = load_depth(depth_filepath)

    cv2.imshow('Depth', depth_img)
    cv2.waitKey(0)

As you wrote, the decompressed .depth file is just a buffer of raw float32 depth values (each float32 value is a depth in meters). There are 49,152 (i.e. 192×256) values for a LiDAR frame and 307,200 (i.e. 480×640) values for a FaceID frame.
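(Since the two frame sizes differ, the frame type could also be inferred from the decompressed byte count instead of hard-coding the reshape. A small sketch; the shape table is my own, derived from the sizes above:)

```python
import numpy as np

# Map decompressed byte counts to frame shapes (rows, cols):
# 49,152 float32 values -> LiDAR (256x192); 307,200 -> FaceID (640x480).
SHAPES = {49_152 * 4: (256, 192), 307_200 * 4: (640, 480)}

def reshape_depth(decompressed_bytes: bytes) -> np.ndarray:
    shape = SHAPES[len(decompressed_bytes)]
    return np.frombuffer(decompressed_bytes, dtype=np.float32).reshape(shape)

# Fabricated LiDAR-sized buffer just to exercise the function:
buf = np.zeros(49_152, dtype=np.float32).tobytes()
print(reshape_depth(buf).shape)  # (256, 192)
```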

The .conf files contain a confidence map for each frame. It has the same size as the depth map, and for each pixel of the depth map it contains a uint8 value in the range 0 to 2, which indicates how confident the framework is that the sensed LiDAR depth is "correct". In other words, it is a measure of depth data quality.
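(For illustration, a decompressed .conf buffer could be parsed much like the .depth buffer, just with dtype uint8. This assumes the .conf files are LZFSE-compressed like the .depth files, so you would run them through liblzfse.decompress first; the parsing step itself is sketched below on a fabricated buffer:)

```python
import numpy as np

# Sketch: parse an already-decompressed .conf buffer.
# Each pixel is one uint8 confidence value in {0, 1, 2},
# same resolution as the corresponding depth map.
def parse_conf(decompressed_bytes: bytes, shape=(256, 192)) -> np.ndarray:
    return np.frombuffer(decompressed_bytes, dtype=np.uint8).reshape(shape)

# Fabricated all-high-confidence buffer just to exercise the parser:
conf = parse_conf(bytes([2] * (256 * 192)))
print(conf.shape, conf.max())  # (256, 192) 2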

I think this answers your question, so I am closing this issue, but feel free to ask follow-up questions.

@andybak
Author

andybak commented Dec 18, 2020

(Thanks! The above was really helpful for me. However, will anyone else find it easily, given that it is in a closed GitHub issue? Part of my reason for opening this was to suggest that something like the above would be a great addition to the docs.)

@marek-simonik
Owner

You are right, thanks for reminding me of that. I added a link to this issue to the Wiki.

@wolterlw

wolterlw commented Mar 12, 2021

Had the same confusion and found this issue before going to the Wiki.
It would be very helpful to add some mention of it to the main Readme.

Also on a related note - is it possible to get distance in meters from an exported RGBD video?

@marek-simonik
Owner

OK, I will mention the Wiki in the Readme the next time I push an update.

As for getting the distance in meters from exported RGBD videos: yes, it is possible. I described how to do it in the Readme of this demo.

@wolterlw

Got it, thank you for the great app and library!
Please do add landscape mode for the iPad someday, though.

@marek-simonik
Owner

Thank you for the suggestion, noted! I will include landscape mode in a future update.

@zehuiz2

zehuiz2 commented Sep 1, 2021

Two follow-up questions:

  1. In 'How to use?', you wrote 'JSON config file (containing the intrinsic matrix, FPS, and width/height of the RGBD frames)'. Where could I find this info?
  2. Does 2 mean high confidence, or does 0?

@marek-simonik
Owner

To answer your questions:

  1. After you unzip an exported .r3d file, you will see a metadata file. This is the JSON config file.
  2. In my understanding, 2 is high confidence, 1 is "lower" confidence, and 0 is the lowest confidence.
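(Given that interpretation, a common use of the confidence map is to mask out low-confidence depth pixels. A small sketch with hypothetical same-shaped arrays, keeping only pixels marked 2:)

```python
import numpy as np

# Hypothetical depth (meters) and confidence arrays of identical shape,
# as would be loaded from matching .depth / .conf files.
depth = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
conf = np.array([[2, 1], [0, 2]], dtype=np.uint8)

# Keep only the highest-confidence pixels; blank the rest with NaN.
filtered = np.where(conf == 2, depth, np.nan)
print(filtered)
```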

@zehuiz2

zehuiz2 commented Jan 28, 2022

Hi,
I wonder if you've updated both the LiDAR & FaceID depth resolutions?

depth_img = depth_img.reshape((1280, 960)) # For a FaceID camera 3D Video
depth_img = depth_img.reshape((512, 384)) # For a LiDAR 3D Video

Is the above correct?
Another three questions:

  1. Is it possible you could update the FaceID RGB resolution? It is still 640×480.
  2. It seems the LiDAR confidence resolution has not been updated?
  3. What are the units of the depth measurements? I assume it is mm?

Thank you very much!

@knsjoon

knsjoon commented May 25, 2022

An addition to this issue regarding the Apple ARKit depth confidence map:

From https://developer.apple.com/documentation/arkit/arconfidencelevel, there are only three confidence levels:

  • case low
    Depth-value accuracy in which the framework is less confident.

  • case medium
    Depth-value accuracy in which the framework is moderately confident.

  • case high
    Depth-value accuracy in which the framework is fairly confident.

Hope this helps future users understand why the confidence map only consists of 0, 1, and 2!
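(A tiny sketch of that mapping, assuming the numeric order in the exported .conf files matches ARKit's ARConfidenceLevel cases, low to high:)

```python
# Human-readable labels for the uint8 confidence values,
# following the ARConfidenceLevel ordering (low=0, medium=1, high=2).
ARKIT_CONFIDENCE = {0: "low", 1: "medium", 2: "high"}

print(ARKIT_CONFIDENCE[2])  # high
```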
