
[python/viewer] Add support of user-specified extra cameras (rgb and depth). #826

Merged (1 commit into dev, Jul 16, 2024)

Conversation

duburcqa
Owner

@duburcqa duburcqa commented Jul 14, 2024

import numpy as np
import matplotlib.pyplot as plt

import gymnasium as gym

from panda3d.core import VBase4, Point3, Vec3
from jiminy_py.viewer import Viewer
import pinocchio as pin

Viewer.close()
#Viewer.connect_backend("panda3d-sync")
env = gym.make("gym_jiminy.envs:atlas-pid", viewer_kwargs={"backend": "panda3d-sync"})
env.reset(seed=0)
env.step(env.action)
#env.render()
env.simulator.render(return_rgb_array=True)

env.viewer.add_marker("sphere",
                      shape="sphere",
                      pose=(np.array((1.7, 0.0, 1.5)), None),
                      color="red",
                      radius=0.1,
                      always_foreground=False)

# NOTE: 'Viewer.add_camera' in jiminy 1.8.8 has an argument-inversion bug
# (see discussion below), so the backend method is called directly instead:
# Viewer.add_camera("rgb", height=200, width=200, is_depthmap=False)
Viewer._backend_obj.gui.add_camera("rgb", False, (200, 200))
# Viewer.add_camera("depth", height=128, width=128, is_depthmap=True)
Viewer._backend_obj.gui.add_camera("depth", True, (128, 128))
Viewer.set_camera_transform(
    position=[2.5, -1.4, 1.6],  # [3.0, 0.0, 0.0],
    rotation=[1.35, 0.0, 0.8],  # [np.pi/2, 0.0, np.pi/2]
    camera_name="depth")

frame_index = env.robot.pinocchio_model.getFrameId("head")
frame_pose = env.robot.pinocchio_data.oMf[frame_index]
# Viewer._backend_obj.gui.set_camera_transform(
#     pos=frame_pose.translation + np.array([0.0, 0.0, 0.0]),
#     quat=pin.Quaternion(frame_pose.rotation @ pin.rpy.rpyToMatrix(0.0, 0.0, -np.pi/2)).coeffs(),
#     camera_name="rgb")
Viewer.set_camera_transform(
    position=frame_pose.translation + np.array([0.0, 0.0, 0.0]),
    rotation=pin.rpy.matrixToRpy(frame_pose.rotation @ pin.rpy.rpyToMatrix(np.pi/2, 0.0, -np.pi/2)),
    camera_name="rgb")

lens = Viewer._backend_obj.render.find("user_camera_depth").node().get_lens()
# proj = lens.get_projection_mat_inv()
# buffer = Viewer._backend_obj._user_buffers["depth"]
# buffer.trigger_copy()
# Viewer._backend_obj.graphics_engine.render_frame()
# texture = buffer.get_texture()
# tex_peeker = texture.peek()
# pixel = VBase4()
# tex_peeker.lookup(pixel, 0.5, 0.5)  # (y, x normalized coordinates, from top-left to bottom-right)
# depth_rel = 2.0 * pixel[0] - 1.0   # map range [0.0 (near), 1.0 (far)] to [-1.0, 1.0]
# point = Point3()
# #lens.extrude_depth(Point3(0.0, 0.0, depth_rel), point)
# # proj.xform_point_general(Point3(0.0, 0.0, pixel[0]))
# # depth = point[1]
# depth = 1.0 / (proj[2][3] * depth_rel + proj[3][3])
# print(depth)

rgb_array = Viewer.capture_frame(camera_name="rgb")
depth_array = Viewer.capture_frame(camera_name="depth")
# depth_normalized_array = lens.near / (lens.far - (lens.far - lens.near) * depth_array)
depth_true_array = lens.near / (1.0 - (1.0 - lens.near / lens.far) * depth_array)
fig = plt.figure(figsize=(10, 5))
ax1 = fig.add_subplot(121)
ax1.imshow(rgb_array)
ax2 = fig.add_subplot(122)
ax2.imshow(depth_true_array, cmap=plt.cm.binary)
for ax in (ax1, ax2):
    ax.axis('off')
    ax.xaxis.set_visible(False)
    ax.yaxis.set_visible(False)
fig.tight_layout(pad=1.0)
plt.show(block=False)
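The depth linearization used above (mapping the normalized [0, 1] depth-buffer value back to metric depth) can be sanity-checked in isolation. Here is a minimal sketch, assuming the standard perspective depth-buffer convention; the helper names and the example clip planes are mine, not part of jiminy:

```python
import numpy as np

near, far = 0.1, 100.0  # example clip planes (assumed values)

def buffer_from_metric(d):
    """Normalized [0, 1] depth-buffer value of a point at metric depth d."""
    return far * (d - near) / (d * (far - near))

def metric_from_buffer(z):
    """Inverse mapping, identical to the formula used in the script above."""
    return near / (1.0 - (1.0 - near / far) * z)

d = np.array([0.5, 1.0, 5.0, 50.0])
z = buffer_from_metric(d)
assert np.allclose(metric_from_buffer(z), d)  # the mapping round-trips exactly
```

This confirms that the formula applied to `depth_array` recovers metric depth, given the lens near/far planes.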

(screenshot: rgb and depth renderings displayed side by side)

@duburcqa duburcqa changed the title [python/viewer] Add support of user-specified additional cameras (rgb and depth map). [python/viewer] Add support of user-specified extra cameras (rgb and depth map). Jul 14, 2024
@duburcqa duburcqa changed the title [python/viewer] Add support of user-specified extra cameras (rgb and depth map). [python/viewer] Add support of user-specified extra cameras (rgb and depth). Jul 14, 2024
@duburcqa
Owner Author

duburcqa commented Jul 14, 2024

@mwulfman This is just a POC. Applying filters to make the depth map more realistic is still necessary. Do you have any reference or even source code on this topic? It would be very helpful.

Viewer.capture_frame(camera_name="depth_map") takes 2.5 ms on my machine, which I consider very slow. It caps the real-time factor at x15, compared to >100 for an environment with 40 ms env steps such as Atlas. I don't think this is acceptable. I will investigate alternative options.

@mwulfman
Contributor

If it can help you, here's what I started to draft. It lacks filtering capabilities on the output images. Besides the approach (using a class versus adding it to the viewer), I believe it is very similar to what you've done. I've been considering using cv2 directly to do the filtering. I believe capturing the frame takes around the same amount of time as yours (1-2 ms).

from typing import Tuple

import numpy as np
from panda3d.core import (
    FrameBufferProperties, GraphicsOutput, GraphicsPipe, Texture,
    WindowProperties)
from jiminy_py.viewer.panda3d.panda3d_visualizer import Panda3dApp


class DepthMapSensorPOC:
    """Depth Map Sensor Class"""

    def __init__(
        self,
        panda3d_app: Panda3dApp,
        window_size: Tuple[float, float],
        depth_range: Tuple[float, float],
        field_of_view: Tuple[float, float],
    ):
        winprops = WindowProperties(size=window_size)
        props = FrameBufferProperties()
        props.set_float_depth(True)
        props.set_depth_bits(32)
        depth_buffer = panda3d_app.graphics_engine.make_output(
            panda3d_app.pipe,
            "depth_buffer",
            -2,
            props,
            winprops,
            GraphicsPipe.BFRefuseWindow,
            panda3d_app.win.get_gsg(),
            panda3d_app.win,
        )

        depth_texture = Texture("depth_texture")
        depth_buffer.add_render_texture(
            depth_texture, GraphicsOutput.RTM_triggered_copy_ram, GraphicsOutput.RTPDepthStencil
        )

        self.depth_buffer = depth_buffer

        self.depth_cam = panda3d_app.make_camera(self.depth_buffer, camName="depth_cam")
        self.depth_min, self.depth_max = depth_range
        self.depth_cam.node().getLens().set_near_far(self.depth_min, self.depth_max)
        self.depth_cam.node().getLens().set_fov(*field_of_view)

        self.depth_cam.reparent_to(panda3d_app.render)
        self.depth_cam.set_pos(0.0, 0.0, 0.0)
        self.depth_cam.set_hpr(0.0, 0.0, 0.0)

        # For debugging only, do not enable in a real run!
        # It adds black pixels on the sides of the depth image.
        # self.depth_cam.node().showFrustum()

        panda3d_app.graphics_engine.render_frame()

        self.depth_buffer.inverted = True

        panda3d_app.bufferViewer.setInclude([self.depth_buffer])
        panda3d_app.bufferViewer.enable(1)

    def get_camera_depth_image(self, graphics_engine):
        """
        Returns the camera's depth image, which is of type float32 and has
        values between 0.0 and 1.0.
        """
        self.depth_buffer.trigger_copy()
        graphics_engine.render_frame()

        depth_texture = self.depth_buffer.get_texture()
        data = depth_texture.get_ram_image()
        depth_image = np.frombuffer(data, np.float32)
        depth_image.shape = (depth_texture.get_y_size(), depth_texture.get_x_size())
        normalize_world_depth = np.clip(
            self.depth_min / (self.depth_max - (self.depth_max - self.depth_min) * depth_image),
            0.0,
            1.0,
        )
        return normalize_world_depth

@duburcqa duburcqa force-pushed the panda3d_camera branch 2 times, most recently from 5d5a2c9 to f2d2875 Compare July 15, 2024 09:44
@duburcqa
Owner Author

I made some improvements. Now it should be twice as fast, i.e. about 1.25 ms to render the depth map. I will see if it is possible to further optimise the rendering pipeline, but I'm afraid it is unlikely.

@duburcqa
Owner Author

duburcqa commented Jul 15, 2024

I have an unreliable setup where computation of the depth map takes only 570 µs on my machine. If I find a way to achieve this speed without breaking anything else, then using depth maps in RL may be an option!

Edit: I found a reliable way 🎉 @mwulfman it would be nice if you could benchmark it on a machine with an Nvidia GPU.

Edit2: After some in-depth profiling, this appears to be the absolute limit, because now 90% of the time is spent copying the result from the GPU to the CPU. I don't think it is possible to speed up this operation, but I will have a look. The good thing is that decreasing the resolution now improves the timing, e.g. it takes 390 µs for a 64 x 64 depth map.
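A back-of-envelope look at the buffer sizes makes the copy-bound behaviour plausible: at these resolutions the GPU-to-CPU transfer is dominated by fixed driver/PCIe latency rather than bandwidth. The helper below is just a sketch of mine to make the arithmetic explicit:

```python
def depth_buffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size in bytes of a float32 depth buffer at the given resolution."""
    return width * height * bytes_per_pixel

for size in (64, 128, 200, 500):
    kib = depth_buffer_bytes(size, size) / 1024
    print(f"{size}x{size} float32 depth map: {kib:.0f} KiB")
```

Even the 500x500 buffer is under 1 MiB, far too small for bandwidth to matter, which is consistent with the timings barely changing with resolution.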

… and depth map). Significantly speed-up both offscreen and onscreen rendering.
@duburcqa duburcqa enabled auto-merge (squash) July 16, 2024 11:52
@duburcqa duburcqa merged commit 59464a3 into dev Jul 16, 2024
31 checks passed
@duburcqa duburcqa deleted the panda3d_camera branch July 17, 2024 06:43
@duburcqa duburcqa mentioned this pull request Jul 17, 2024
duburcqa added a commit that referenced this pull request Jul 17, 2024
* [core] Fix robot serialization issue. (#821)
* [core] Minor improvement periodic Perlin process and periodic stair ground. (#799) 
* [core] 'PeriodicGaussianProcess' and 'PeriodicFourierProcess' are now differentiable. (#799) 
* [core] Fix negative time support for all existing random processes. (#799) 
* [core] Add N-dimension Perlin processes. (#799) (#823) 
* [core] Add gradient computation for all Perlin processes. (#799) (#823) (#825)
* [core] Make all Perlin processes faster and copy-able. (#799) 
* [core] Add Perlin ground generators. (#799) 
* [core] Replace MurmurHash3 by xxHash32 which is faster. (#824) 
* [core] Make gradient computation optional for heightmap functions. (#824) 
* [jiminy_py] Fix 'tree.unflatten_as' mixing up key order for 'gym.spaces.Dict'. (#819)
* [python/simulator] Consistent keyword arguments between 'Simulator.build' and 'Simulator.add_robot'. (#821)
* [python/viewer] Fix MacOS support. (#822)
* [python/viewer] Add support of user-specified extra cameras (rgb and depth). (#826)
* [python/viewer] Significantly speed-up both offscreen and onscreen rendering for Panda3D. (#826)
* [gym/common] More generic stacking quantity. (#812) 
* [gym/common] Add termination condition abstraction. (#812) 
* [gym/common] Add quantity shift and drift tracking termination conditions. (#812) 
* [gym/common] Add support of termination composition in pipeline environments. (#812) 
* [gym/common] Add base roll/pitch termination condition. (#813) 
* [gym/common] Add base relative height termination condition. (#813) 
* [gym/common] Add foot collision termination condition. (#813) 
* [gym/common] More generic actuated joint kinematic quantity. (#814) 
* [gym/common] Add multi-ary operator quantity. (#814) 
* [gym/common] Add safety limits termination condition. (#814) 
* [gym/common] Add robot flying termination condition. (#815)
* [gym/common] Add power consumption termination condition. (#816) 
* [gym/common] Add ground impact force termination condition. (#816) 
* [gym/common] Add base odometry pose drift tracking termination condition. (#817)
* [gym/common] Add motor positions shift tracking termination condition. (#817)
* [gym/common] Add relative foot odometry pose shift tracking termination conditions. (#820)
* [gym/common] Add unit test checking that observation wrappers preserve key ordering. (#820)
* [gym/common] Fix quantity hash collision issue in quantity manager. (#821)
* [gym/common] Refactor quantity management to dramatically improve its performance. (#821)
* [gym/common] Add 'order' option to 'AdditiveReward'. (#821)
* [misc] Fix missing compositions documentation. (#812) 

---------

Co-authored-by: Mathias Wulfman <101942083+mwulfman@users.noreply.github.com>
@mwulfman
Contributor

I have an unreliable setup where computation of the depth map takes only 570 µs on my machine. If I find a way to achieve this speed without breaking anything else, then using depth maps in RL may be an option!

Edit: I found a reliable way 🎉 @mwulfman it would be nice if you could benchmark it on a machine with an Nvidia GPU.

Edit2: After some in-depth profiling, this appears to be the absolute limit, because now 90% of the time is spent copying the result from the GPU to the CPU. I don't think it is possible to speed up this operation, but I will have a look. The good thing is that decreasing the resolution now improves the timing, e.g. it takes 390 µs for a 64 x 64 depth map.

Just seen your message. Do you have a reliable script that I can run to benchmark the depth map acquisition speed?

@duburcqa
Owner Author

duburcqa commented Jul 21, 2024

Just seen your message. Do you have a reliable script that I can run to benchmark the depth map acquisition speed?

Yes, execute the script provided in the PR description in IPython, then run the following command:

%timeit Viewer.capture_frame(camera_name="depth")

PS: You must install jiminy 1.8.8 or any commit that was merged after this PR.
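For reference, outside IPython an equivalent plain-Python timing helper gives comparable numbers to `%timeit`. This is just a sketch; `bench` is my name, not part of jiminy:

```python
import time

def bench(fn, repeats: int = 1000) -> float:
    """Return the mean wall-clock time of fn() in seconds over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

# In practice one would pass the capture call, e.g.:
#   bench(lambda: Viewer.capture_frame(camera_name="depth"))
# Demonstrated here on a cheap stand-in:
mean_s = bench(lambda: sum(range(100)))
print(f"{mean_s * 1e6:.2f} us per call")
```

Unlike `%timeit`, this reports a single mean rather than mean ± std over several runs, which is fine for a quick check.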

@mwulfman
Contributor

mwulfman commented Jul 21, 2024

I am getting the following error after installing jiminy 1.8.8 and trying to run the script in IPython:

Cell In[3], line 1
----> 1 Viewer.add_camera("rgb", height=200, width=200, is_depthmap=False)

File ~/.pyenv/versions/3.10.9/envs/test/lib/python3.10/site-packages/jiminy_py/viewer/viewer.py:299, in _with_lock.<locals>.fun_safe(*args, **kwargs)
    297 self = kwargs.get('self', self)
    298 with self._lock:
--> 299     return fun(*args, **kwargs)

File ~/.pyenv/versions/3.10.9/envs/test/lib/python3.10/site-packages/jiminy_py/viewer/viewer.py:287, in _must_be_open.<locals>.fun_safe(*args, **kwargs)
    283 if not self.is_open():
    284     raise RuntimeError(
    285         "No backend available. Please start one before calling "
    286         f"'{fun.__name__}'.")
--> 287 return fun(*args, **kwargs)

File ~/.pyenv/versions/3.10.9/envs/test/lib/python3.10/site-packages/jiminy_py/viewer/viewer.py:1922, in Viewer.add_camera(camera_name, width, height, is_depthmap)
   1918     raise NotImplementedError(
   1919         "This method is only supported by Panda3d.")
   1921 # Add camera
-> 1922 Viewer._backend_obj.gui.add_camera(
   1923     camera_name, (width, height), is_depthmap)

File ~/.pyenv/versions/3.10.9/envs/test/lib/python3.10/site-packages/jiminy_py/viewer/panda3d/panda3d_visualizer.py:824, in Panda3dApp.add_camera(self, name, is_depthmap, size)
    822     fbprops.set_multisamples(4)
    823 winprops = WindowProperties()
--> 824 winprops.set_size(*size)
    825 flags = GraphicsPipe.BF_refuse_window
    826 buffer = self.graphicsEngine.make_output(
    827     self.pipe, f"user_buffer_{name}", 0, fbprops, winprops, flags,
    828     self.win.get_gsg(), self.win)

TypeError: WindowProperties.set_size() argument after * must be an iterable, not bool

Looks like there is a small argument-inversion typo.

@duburcqa
Owner Author

duburcqa commented Jul 21, 2024

Arf damn, the merged code is broken, what a shame... I dirty-fixed the snippet in the PR description. It should be enough to run the benchmark!

@mwulfman
Contributor

mwulfman commented Jul 21, 2024

Here are the results on an Nvidia RTX 3060 GPU:

In [2]: %timeit Viewer.capture_frame(camera_name="depth")
387 μs ± 1.9 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [3]: %timeit Viewer.capture_frame(camera_name="depth")
388 μs ± 3.04 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [4]: %timeit Viewer.capture_frame(camera_name="depth")
405 μs ± 14.7 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [5]: %timeit Viewer.capture_frame(camera_name="depth")
388 μs ± 9.6 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [6]: %timeit Viewer.capture_frame(camera_name="depth")
388 μs ± 2.93 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Congrats on getting such a speed up!

@duburcqa
Owner Author

Nice! These results are encouraging, thanks :)

Apparently, the performance is the same as on Intel integrated graphics, which makes sense because the computations are too small to take advantage of the available resources. If you have time, I'm curious to see whether the timing changes after increasing the resolution:

Viewer._backend_obj.gui.add_camera("depth", True, (200, 200))

or even

Viewer._backend_obj.gui.add_camera("depth", True, (500, 500))

@mwulfman
Contributor

mwulfman commented Jul 21, 2024

200x200:

In [2]: %timeit Viewer.capture_frame(camera_name="depth")
370 μs ± 61.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [3]: %timeit Viewer.capture_frame(camera_name="depth")
395 μs ± 27.1 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [4]: %timeit Viewer.capture_frame(camera_name="depth")
385 μs ± 2.01 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [5]: %timeit Viewer.capture_frame(camera_name="depth")
395 μs ± 9.23 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

500x500:

In [2]: %timeit Viewer.capture_frame(camera_name="depth")
432 μs ± 29.5 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [3]: %timeit Viewer.capture_frame(camera_name="depth")
430 μs ± 24.5 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [4]: %timeit Viewer.capture_frame(camera_name="depth")
425 μs ± 18.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [5]: %timeit Viewer.capture_frame(camera_name="depth")
430 μs ± 25.4 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

@duburcqa
Owner Author

duburcqa commented Jul 21, 2024

Ok perfect, just as I expected :) From these results, I think that with a powerful GPU it should not be a problem to generate the depth map for 10 to 20 environments in parallel without slowdown, provided the resolution is small enough.

Hopefully it is now fast enough to run learning algorithms with it, but the post-processing is still missing!
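The parallel-environments claim can be checked with a quick budget calculation from the measured numbers (an estimate only, assuming sequential captures and ignoring contention between parallel renderers):

```python
capture_s = 390e-6   # measured capture time for a small depth map
step_s = 40e-3       # env step duration for an environment such as Atlas

for n_envs in (10, 20):
    total_s = n_envs * capture_s
    print(f"{n_envs} envs: {total_s * 1e3:.1f} ms of rendering "
          f"per {step_s * 1e3:.0f} ms step "
          f"({100 * total_s / step_s:.1f}% of the budget)")
```

Even 20 environments consume well under a quarter of a 40 ms step, so rendering should not be the bottleneck at small resolutions.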
