
Aligning external camera and depth in Python using numpy arrays results in black image except the first pixel #12677

Open
fcitil opened this issue Feb 17, 2024 · 20 comments


@fcitil commented Feb 17, 2024

Required Info
Camera Model: D435i
Operating System & Version: Ubuntu 20.04.6 LTS
Kernel Version (Linux Only): 5.15.0-94-generic
Platform: PC
SDK Version: 2.53.1.4623
Language: Python 3.9.16

Issue Description

Hi,
I am trying to align an external RGB camera's pixels with RealSense D435i depth using align.process. To achieve that, I am trying to use software-device with two separate ROS topics, one for the RGB frames and one for depth. As a first step, I am currently working with .npy files to confirm that I can create a new frameset consisting of an RGB frame and its corresponding depth. I first tried to run the sample Python script provided by @gwen2018 in issue #12020, but I needed to change the first part to make it work with a RealSense .bag file (I used the official sample .bag files) as follows:

#####################################################################################################
#   Demo                                                                                           ##
#     Align depth to color with precaptured images in software device                              ##
#                                                                                                  ##
##  Purpose                                                                                        ##
##    This example first captures depth and color images from realsense camera and then            ##
##    demonstrates align depth to color with the precaptured images in software device             ##
##                                                                                                 ##
##  Steps:                                                                                         ##
##    1) stream realsense camera with depth 640x480@30fps and color 1280x720@30fps                 ##
##    2) capture camera depth and color intrinsics and extrinsics                                  ##
##    3) capture depth and color images and save into files in npy format                          ##
##    4) construct software device from the saved intrinsics, extrinsics, depth and color images   ##
##    5) align the precaptured depth image to the color image                                      ##
##                                                                                                 ##
#####################################################################################################

import cv2
import pyrealsense2 as rs
import numpy as np
import os
import time

fps = 30                  # frame rate
tv = 1000.0 / fps         # time interval between frames in milliseconds

max_num_frames  = 300    # max number of framesets to be captured into npy files and processed with software device

depth_file_name = "depth"  # depth_file_name + str(i) + ".npy"
color_file_name = "color"  # color_file_name + str(i) + ".npy"

# intrinsic and extrinsic from the camera
camera_depth_intrinsics          = rs.intrinsics()  # camera depth intrinsics
camera_color_intrinsics          = rs.intrinsics()  # camera color intrinsics
camera_depth_to_color_extrinsics = rs.extrinsics()  # camera depth to color extrinsics


######################## Start of first part - capture images from live device #######################################
# stream depth and color on the attached realsense camera and save depth and color frames into files in npy format
try:
  # # create a context object, this object owns the handles to all connected realsense devices
  ctx = rs.context()
  devs = list(ctx.query_devices())
  
  if len(devs) > 0:
      print("Devices: {}".format(devs))
  else:
      print("No camera detected. Please connect a realsense camera and try again.")
      # exit(0)  # no device is needed when working with .bag files, so skip this line
  
  pipeline = rs.pipeline()

  # configure streams
  config = rs.config()
  config.enable_stream(rs.stream.depth, 1280, 720)  # adding rs.format.z16 and fps here makes it fail
  config.enable_stream(rs.stream.color, 640, 480)   # adding rs.format.bgr8 and fps here makes it fail, I don't know why
  
  config.enable_device_from_file("d435i_walking.bag") # add this line to work with .bag file

  # start streaming with pipeline and get the configuration
  cfg = pipeline.start(config)
  
  # get intrinsics
  camera_depth_profile = cfg.get_stream(rs.stream.depth)                                      # fetch depth depth stream profile
  camera_depth_intrinsics = camera_depth_profile.as_video_stream_profile().get_intrinsics()   # downcast to video_stream_profile and fetch intrinsics
  
  camera_color_profile = cfg.get_stream(rs.stream.color)                                      # fetch color stream profile
  camera_color_intrinsics = camera_color_profile.as_video_stream_profile().get_intrinsics()   # downcast to video_stream_profile and fetch intrinsics
  
  camera_depth_to_color_extrinsics = camera_depth_profile.get_extrinsics_to(camera_color_profile)

  print("camera depth intrinsic:", camera_depth_intrinsics)
  print("camera color intrinsic:", camera_color_intrinsics)
  print("camera depth to color extrinsic:", camera_depth_to_color_extrinsics)

  print("streaming attached camera and save depth and color frames into files with npy format ...")

  i = 0
  while i < max_num_frames:
      # wait until a new coherent set of frames is available on the device
      frames = pipeline.wait_for_frames()
      depth = frames.get_depth_frame()
      color = frames.get_color_frame()

      if not depth or not color: continue
      
      # convert images to numpy arrays
      depth_image = np.asanyarray(depth.get_data())
      color_image = np.asanyarray(color.get_data())

      # save images in npy format
      depth_file = depth_file_name + str(i) + ".npy"
      color_file = color_file_name + str(i) + ".npy"
      print("saving frame set ", i, depth_file, color_file)
      
      with open(depth_file, 'wb') as f1:
          np.save(f1,depth_image)
      
      with open(color_file, 'wb') as f2:
          np.save(f2,color_image)

      # next frameset
      i += 1

except Exception as e:
  print(e)
  pass

######################## End of first part - capture images from live device #######################################



######################## Start of second part - align depth to color in software device #############################
# align depth to color with the above precaptured images in software device

# software device
sdev = rs.software_device()

# software depth sensor
depth_sensor: rs.software_sensor = sdev.add_sensor("Depth")

# depth intrinsics
depth_intrinsics = rs.intrinsics()

depth_intrinsics.width  = camera_depth_intrinsics.width
depth_intrinsics.height = camera_depth_intrinsics.height

depth_intrinsics.ppx = camera_depth_intrinsics.ppx
depth_intrinsics.ppy = camera_depth_intrinsics.ppy

depth_intrinsics.fx = camera_depth_intrinsics.fx
depth_intrinsics.fy = camera_depth_intrinsics.fy

depth_intrinsics.coeffs = camera_depth_intrinsics.coeffs       ## [0.0, 0.0, 0.0, 0.0, 0.0]
depth_intrinsics.model = camera_depth_intrinsics.model         ## rs.pyrealsense2.distortion.brown_conrady

#depth stream
depth_stream = rs.video_stream()
depth_stream.type = rs.stream.depth
depth_stream.width = depth_intrinsics.width
depth_stream.height = depth_intrinsics.height
depth_stream.fps = fps
depth_stream.bpp = 2                              # depth z16 2 bytes per pixel
depth_stream.fmt = rs.format.z16
depth_stream.intrinsics = depth_intrinsics
depth_stream.index = 0
depth_stream.uid = 1

depth_profile = depth_sensor.add_video_stream(depth_stream)

# software color sensor
color_sensor: rs.software_sensor = sdev.add_sensor("Color")

# color intrinsic:
color_intrinsics = rs.intrinsics()
color_intrinsics.width = camera_color_intrinsics.width
color_intrinsics.height = camera_color_intrinsics.height

color_intrinsics.ppx = camera_color_intrinsics.ppx
color_intrinsics.ppy = camera_color_intrinsics.ppy

color_intrinsics.fx = camera_color_intrinsics.fx
color_intrinsics.fy = camera_color_intrinsics.fy

color_intrinsics.coeffs = camera_color_intrinsics.coeffs
color_intrinsics.model = camera_color_intrinsics.model

color_stream = rs.video_stream()
color_stream.type = rs.stream.color
color_stream.width = color_intrinsics.width
color_stream.height = color_intrinsics.height
color_stream.fps = fps
color_stream.bpp = 3                                # color stream rgb8 3 bytes per pixel in this example
color_stream.fmt = rs.format.rgb8
color_stream.intrinsics = color_intrinsics
color_stream.index = 0
color_stream.uid = 2

color_profile = color_sensor.add_video_stream(color_stream)

# depth to color extrinsics
depth_to_color_extrinsics = rs.extrinsics()
depth_to_color_extrinsics.rotation = camera_depth_to_color_extrinsics.rotation
depth_to_color_extrinsics.translation = camera_depth_to_color_extrinsics.translation
depth_profile.register_extrinsics_to(color_profile, depth_to_color_extrinsics)

# start software sensors
depth_sensor.open(depth_profile)
color_sensor.open(color_profile)

# synchronize frames from depth and color streams
camera_syncer = rs.syncer()
depth_sensor.start(camera_syncer)
color_sensor.start(camera_syncer)

# create a depth alignment object
# rs.align allows us to perform alignment of depth frames to others frames
# the "align_to" is the stream type to which we plan to align depth frames
# align depth frame to color frame
align_to = rs.stream.color
align = rs.align(align_to)

# colorizer for depth rendering
colorizer = rs.colorizer()

# use "Enter", "Spacebar", "p", keys to pause for 5 seconds
paused = False

# loop through pre-captured frames
for i in range(0, max_num_frames):
  print("\nframe set:", i)
  
  # pause for 5 seconds at frameset 15 to allow user to better observe the images rendered on screen
  if i == 15: paused = True

  # precaptured depth and color image files in npy format
  df = depth_file_name + str(i) + ".npy"
  cf = color_file_name + str(i) + ".npy"

  if (not os.path.exists(cf)) or (not os.path.exists(df)): continue

  # load depth frame from precaptured npy file
  print('loading depth frame ', df)
  depth_npy = np.load(df, mmap_mode='r')

  # create software depth frame
  depth_swframe = rs.software_video_frame()
  depth_swframe.stride = depth_stream.width * depth_stream.bpp
  depth_swframe.bpp = depth_stream.bpp
  depth_swframe.timestamp = i * tv
  depth_swframe.domain = rs.timestamp_domain.hardware_clock
  depth_swframe.frame_number = i
  depth_swframe.profile = depth_profile.as_video_stream_profile()
  depth_swframe.pixels = depth_npy

  depth_sensor.on_video_frame(depth_swframe)

  # load color frame from precaptured npy file
  print('loading color frame ', cf)
  color_npy = np.load(cf, mmap_mode='r')

  # create software color frame
  color_swframe = rs.software_video_frame()
  color_swframe.stride = color_stream.width * color_stream.bpp
  color_swframe.bpp = color_stream.bpp
  color_swframe.timestamp = i * tv
  color_swframe.domain = rs.timestamp_domain.hardware_clock
  color_swframe.frame_number = i
  color_swframe.profile = color_profile.as_video_stream_profile()
  color_swframe.pixels = color_npy

  color_sensor.on_video_frame(color_swframe)
  
  # synchronize depth and color, receive as frameset
  frames = camera_syncer.wait_for_frames()
  print("frame set:", frames.size(), " ", frames)

  # get unaligned depth frame
  unaligned_depth_frame = frames.get_depth_frame()
  if not unaligned_depth_frame: continue

  # align depth frame to color frame
  aligned_frames = align.process(frames)

  aligned_depth_frame = aligned_frames.get_depth_frame()
  cv2.imshow("aligned_depth_frame", np.asanyarray(aligned_depth_frame.get_data()))
  color_frame = aligned_frames.get_color_frame()

  if (not aligned_depth_frame) or (not color_frame): continue

  aligned_depth_frame = colorizer.colorize(aligned_depth_frame)
  
  print("converting frames into npy array")
  npy_aligned_depth_image = np.asanyarray(aligned_depth_frame.get_data())
  npy_color_image = np.asanyarray(color_frame.get_data())

  # render aligned images:
  # depth align to color
  # aligned depth on left
  # color on right
  images = np.hstack((npy_aligned_depth_image, npy_color_image))

  cv2.namedWindow('Align Example', cv2.WINDOW_NORMAL)
  cv2.imshow('Align Example', images)

  # render original unaligned depth as reference
  colorized_unaligned_depth_frame = colorizer.colorize(unaligned_depth_frame)
  npy_unaligned_depth_image = np.asanyarray(colorized_unaligned_depth_frame.get_data())
  cv2.imshow("Unaligned Depth", npy_unaligned_depth_image)
  
  # press ENTER or SPACEBAR key to pause the image window for 5 seconds
  key = cv2.waitKey(1)

  if key == 13 or key == 32: paused = not paused
      
  if paused:
      print("Paused for 5 seconds ...", i, ", press ENTER or SPACEBAR key anytime for additional pauses.")
      time.sleep(5)
      paused = not paused

# end of second part - align depth to color with the precaptured images in software device
######################## End of second part - align depth to color in software device #############################
  
cv2.destroyAllWindows()

Although the first part seems to save the .npy files properly and I can get a frameset with color and depth images from these files, align.process gives a black frame, except for the [0,0] pixel, as in issue #6522. Therefore I applied the solution stated in that issue, even though it is given in C++, by adding the following lines to the Python script:

# add read only option to the depth sensor
depth_sensor.add_read_only_option(rs.option.depth_units, 0.001)  # depth units: 0.001 meters (1 mm)
depth_sensor.add_read_only_option(rs.option.stereo_baseline, 49.93613815307617)  # baseline in millimeters

But I still get a black image, except for the first pixel, in the aligned depth image. Since the first part of the sample Python script appears to work (although we cannot explicitly state fps and data format in the config) and it created the required .npy files, I ended up using a script that includes only the second part of the sample plus the read_only_option additions given above.

Is it possible that the problem with align.process is caused by the fact that fps and encoding cannot be stated explicitly in the first part, which would affect the resulting numpy arrays? If not, what could be wrong with align.process, considering that the given scripts can show the unaligned depth and color images from the frameset created by the software-device? I couldn't find any working reference for this problem; any advice would be appreciated.

Thank you in advance,
Furkan Çitil

@MartyG-RealSense (Collaborator)

Hi @fcitil As only the first pixel is being aligned, that makes me wonder whether this is because the program is not looping through the frames, like in the align_depth2color.py alignment example.

https://github.com/IntelRealSense/librealsense/blob/master/wrappers/python/examples/align-depth2color.py#L63

@fcitil (Author) commented Feb 18, 2024

Hi @MartyG-RealSense, I am able to run the align-depth2color.py script with the official sample RealSense .bag file successfully. That would have been the case if the program were not able to show the unaligned depth and color frames in the loop, but it is. I think the problem is most likely related to the use of software-device with align.process. Instead of the loop over frames in the Python script, is it possible that the C++ code for align.process in the SDK is not able to loop through the frames?

@MartyG-RealSense (Collaborator)

The software-device interface has rarely worked with Python without errors and is best used with C++. There are therefore very few working examples of Python code for software-device, which makes it difficult to diagnose any possible problems in your script that are preventing alignment from working.

Does alignment take place if the sample Python script at #12020 (comment) is run on your computer?

@fcitil (Author) commented Feb 18, 2024

As I stated in the description, it does not work with .bag files directly. I needed to change and add the following lines:

  # configure streams
  config = rs.config()
  config.enable_stream(rs.stream.depth, 1280, 720)  # adding rs.format.z16 and fps here makes it fail
  config.enable_stream(rs.stream.color, 640, 480)   # adding rs.format.bgr8 and fps here makes it fail, I don't know why
  
  config.enable_device_from_file("d435i_walking.bag") # add this line to work with .bag file

And I get a black image as the aligned depth frame, except for the first pixel.

@MartyG-RealSense (Collaborator) commented Feb 18, 2024

What happens if you comment out the config.enable_device_from_file line and have a D435i attached so that the script tries to use a live camera instead of a bag file?

@fcitil (Author) commented Feb 18, 2024

I haven't tried it yet with the camera attached. Let me try it with the D435i in the coming days. However, I am trying to align the depth stream of the RealSense camera with the color stream of an external camera. Therefore I need to work with numpy arrays as here, which will run into the same problem even if I use the D435i, so I don't think having a D435i attached would solve the problem with align.process when working with numpy arrays.
Eventually, I am planning to get depth frames from the RealSense and color frames from the external camera using ROS, and align the numpy frames as in the sample Python script. Do you have a better suggestion for this purpose?
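On the ROS side, the image payloads can be turned into numpy arrays with cv_bridge or, as a dependency-free sketch, by reinterpreting msg.data directly. The helper below is illustrative (the function name is not from any library) and assumes no row padding, i.e. msg.step == width * bytes-per-pixel:

```python
import numpy as np

def image_msg_to_numpy(data, height, width, encoding):
    """Minimal sensor_msgs/Image payload -> numpy conversion (cv_bridge-free sketch).
    Supports the two encodings relevant here: 16-bit depth and 8-bit color.
    Assumes no row padding (step == width * bytes-per-pixel)."""
    if encoding == "16UC1":            # depth, typically millimeters
        return np.frombuffer(data, dtype=np.uint16).reshape(height, width)
    if encoding in ("rgb8", "bgr8"):   # 3-channel color
        return np.frombuffer(data, dtype=np.uint8).reshape(height, width, 3)
    raise ValueError(f"unsupported encoding: {encoding}")
```

Inside a subscriber callback this would be called as image_msg_to_numpy(msg.data, msg.height, msg.width, msg.encoding); for padded or exotic encodings, cv_bridge's imgmsg_to_cv2 is the safer route.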

@MartyG-RealSense (Collaborator)

Is use of numpy compulsory for your project, please? The link below has an Intel guide for using ROS1 to combine the views of two separate cameras together.

https://github.com/IntelRealSense/realsense-ros/wiki/Showcase-of-using-2-cameras

@fcitil (Author) commented Feb 18, 2024

I'm using numpy because the sample Python script saves .npy files and uses numpy to create framesets. But I don't have to stick with numpy; if there's another way to align an external camera with a different format, I'm open to that. The guide you provided looks like it's more about lining up two RealSense cameras by serial number and merging their point clouds. My goal is different: I need to align the RGB frame of an external, non-RealSense camera, whose relative pose w.r.t. the RealSense camera is known, with the RealSense depth frame, like the official align_depth2color.py example shows.
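A side note on feeding a known relative pose into the SDK: rs.extrinsics stores the 3x3 rotation flattened in column-major order (9 values) and the translation in meters (3 values). A minimal numpy sketch of the conversion, assuming the pose is available as a 4x4 homogeneous transform (the function name and the example values are illustrative):

```python
import numpy as np

def pose_to_extrinsics_lists(T):
    """Split a 4x4 homogeneous transform (depth camera -> external RGB camera)
    into the flat lists expected by rs.extrinsics: rotation as 9 values in
    column-major order, translation as 3 values in meters."""
    T = np.asarray(T, dtype=np.float64)
    rotation = T[:3, :3].flatten(order="F").tolist()   # column-major flatten
    translation = T[:3, 3].tolist()                    # meters
    return rotation, translation

# Example: external camera 25 mm to the right of the depth camera, no rotation
T = np.eye(4)
T[0, 3] = 0.025
rotation, translation = pose_to_extrinsics_lists(T)
```

With pyrealsense2, the two lists would then be assigned to an rs.extrinsics() object's .rotation and .translation fields before calling register_extrinsics_to.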

@fcitil (Author) commented Feb 19, 2024

What happens if you comment out the config.enable_device_from_file line and have a D435i attached so that the script tries to use a live camera instead of a bag file?

I have just tried with the D435i camera attached, and I still got the same result. I get the color and depth frames separately, but I don't get an aligned frame from align.process.

@MartyG-RealSense (Collaborator)

I conducted extensive further research into this issue, but #12020 is unfortunately the best available reference, and there does not appear to be any other approach to using software-device with alignment in Python. I do apologize.

@MartyG-RealSense (Collaborator)

Hi @fcitil Bearing in mind the information in the comment above, do you require further assistance with this case please? Thanks!

@fcitil (Author) commented Feb 27, 2024

I have searched through all the related issues but have not found a solution to my problem. I tried to run the script with the SDK version that is said to be working in that issue, but it does not work. I am still investigating a possible solution.

@MartyG-RealSense (Collaborator)

Please do share a solution here if you find it. Good luck!

@MartyG-RealSense (Collaborator)

Hi @fcitil Do you require further assistance on this case, please, as your last comment was in February 2024? Thanks!

@fcitil (Author) commented May 19, 2024

I am still not able to use software-device in Python. I implemented code for pixel matching between an external camera and the D435i depth camera, which performs de-projection from the depth image to depth camera coordinates, transformation from depth camera coordinates to external camera coordinates, and then projection into the external camera for each depth pixel. However, it is too slow for real-time applications, since my current implementation performs these operations pixel by pixel, sequentially, without utilizing the RealSense pipeline, unfortunately.
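The per-pixel deproject/transform/project chain described above can be vectorized with numpy so that all depth pixels are processed at once, which is usually fast enough for real-time use. A minimal sketch assuming ideal pinhole intrinsics with no distortion (the function name, the depth_scale default of 0.001 m, and the matrix conventions are illustrative, not from the SDK):

```python
import numpy as np

def align_depth_to_external(depth, depth_K, ext_K, R, t, depth_scale=0.001):
    """Vectorized deproject -> transform -> project for every depth pixel.
    depth:           (H, W) uint16 depth image
    depth_K, ext_K:  3x3 pinhole intrinsic matrices (distortion ignored)
    R, t:            rotation (3x3) and translation (3,) from depth to external camera
    Returns u, v, z: projected external-image pixel coordinates and depth in
    meters, each shaped (H, W); pixels with non-positive depth get u = v = -1."""
    H, W = depth.shape
    z = depth.astype(np.float64) * depth_scale
    us, vs = np.meshgrid(np.arange(W), np.arange(H))   # per-pixel column/row grids
    # deproject all pixels into 3D depth-camera coordinates at once
    x = (us - depth_K[0, 2]) / depth_K[0, 0] * z
    y = (vs - depth_K[1, 2]) / depth_K[1, 1] * z
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # transform into the external camera frame
    pts = pts @ R.T + t
    # project into the external image plane, guarding against zero depth
    valid = pts[:, 2] > 0
    u = np.full(H * W, -1.0)
    v = np.full(H * W, -1.0)
    u[valid] = pts[valid, 0] / pts[valid, 2] * ext_K[0, 0] + ext_K[0, 2]
    v[valid] = pts[valid, 1] / pts[valid, 2] * ext_K[1, 1] + ext_K[1, 2]
    return u.reshape(H, W), v.reshape(H, W), pts[:, 2].reshape(H, W)
```

The resulting (u, v) maps can be rounded and used to scatter depth values into an external-resolution image; for accurate results with a real camera, the Brown-Conrady distortion coefficients from the actual intrinsics would still need to be applied.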

@MartyG-RealSense (Collaborator)

I researched your case again from the beginning but did not find a better solution than the software-device script at #12677 that you have already looked at, unfortunately.

The current latest librealsense version 2.55.1 has implemented a fix for software-device's matcher creation - as described at #12394 - so it may be worth testing software-device again in 2.55.1 if you have not done so already.

@MartyG-RealSense (Collaborator)

Hi @fcitil Have you tried testing software-device in SDK version 2.55.1 please?

@MartyG-RealSense (Collaborator)

Hi @fcitil Do you require further assistance with this case, please? Thanks!

@fcitil (Author) commented Jun 3, 2024

Hi, I haven't had a chance to test it yet. I will test and reply as soon as possible.

@MartyG-RealSense (Collaborator)

Thanks very much for the update. Good luck!
