

# Ryzen AI CVML Integration with ROS 2

![](images/ros_cvml.png)

The Robot Operating System (ROS) is the de facto standard framework for building robotic applications. In this notebook, we'll integrate the Ryzen AI CVML library with ROS 2, demonstrating how to build NPU-accelerated vision nodes that can be used in real robotic systems.

By combining CVML's optimized computer vision features with ROS 2's robust communication infrastructure, we can create power-efficient perception pipelines for autonomous robots, drones, and edge AI applications.

## Goals

* Understand how to integrate NPU acceleration into ROS 2 nodes
* Build a complete vision pipeline with multiple nodes
* Visualize real-time depth estimation outputs

## References

* [Writing a Simple Publisher and Subscriber (C++)](https://docs.ros.org/en/kilted/Tutorials/Beginner-Client-Libraries/Writing-A-Simple-Cpp-Publisher-And-Subscriber.html)
* [Using ROS 2 Launch Files](https://docs.ros.org/en/kilted/Tutorials/Intermediate/Launch/Launch-Main.html)
* [cv_bridge](https://github.com/ros-perception/vision_opencv/tree/ros2/cv_bridge)

## The `cvml_ros` Package

The `cvml_ros` package provides a bridge between the CVML C++ API and ROS 2, wrapping the computer vision features into standard ROS nodes. This allows you to leverage NPU acceleration within the familiar ROS ecosystem.

### Package Features

The package includes three main nodes, each implementing a different CVML feature:

- **depth_estimation_node**: Subscribes to image topics and publishes dense depth maps using NPU-accelerated Depth Anything V2
- **face_detection_node**: Detects human faces in real-time, publishing bounding boxes and facial landmarks
- **face_mesh_node**: Generates detailed 3D face meshes with 468 landmarks for applications like AR/VR and expression analysis

Each node follows the standard ROS 2 publisher-subscriber pattern, making them composable and easy to integrate into existing robotics stacks.
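
To make the publisher-subscriber idea concrete, here is a toy in-process model of it in plain Python. It is only an illustration under stated assumptions: `TopicBus` and `depth_node` are hypothetical names, the "depth" payload is a placeholder, and none of this is the rclpy or CVML API. The point is that nodes only agree on topic names, so consumers can be attached without changing the producer.

```python
from collections import defaultdict
from typing import Any, Callable


class TopicBus:
    """Toy stand-in for ROS 2 topics (hypothetical, not rclpy)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> None:
        # Deliver the message to every callback registered on this topic.
        for callback in self._subscribers[topic]:
            callback(message)


bus = TopicBus()
received = []


def depth_node(frame: dict) -> None:
    # Placeholder for inference; the real node calls the CVML C++ API here.
    result = {"source_frame": frame["id"], "depth": "placeholder"}
    bus.publish("/depth_estimation/depth", result)


# The depth node consumes camera frames and republishes its results.
bus.subscribe("/camera/image_raw", depth_node)
# Any consumer can attach to the output topic without touching the depth node.
bus.subscribe("/depth_estimation/depth", received.append)

bus.publish("/camera/image_raw", {"id": 0})
bus.publish("/camera/image_raw", {"id": 1})
print(received)
```

In real ROS 2 the middleware handles discovery and transport between processes, but the wiring by topic name works the same way.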

## Explore the Package Structure

Let's examine the `cvml_ros` package structure to understand how it's organized. Each feature has a Python launch file in `cvml_ros/launch` that we will use to start its node. The nodes themselves are built on the CVML C++ APIs and live in the `cvml_ros/src` directory. Feel free to explore the individual implementations.

In [1]:
!ls -R cvml_ros/

cvml_ros/:
CMakeLists.txt	include  launch  package.xml  README.md  scripts  src

cvml_ros/include:
cvml_ros

cvml_ros/include/cvml_ros:
depth_estimation_node.hpp  face_detection_node.hpp  face_mesh_node.hpp

cvml_ros/launch:
depth_estimation.launch.py  face_detection.launch.py  face_mesh.launch.py

cvml_ros/scripts:
video_publisher.py

cvml_ros/src:
depth_estimation_node.cpp  face_detection_node.cpp  face_mesh_node.cpp
face_detection_main.cpp    face_mesh_main.cpp	    main.cpp


## Building the ROS Package

ROS 2 uses the `colcon` build system to compile packages. Before we can run our CVML nodes, we need to ensure the package is built with all its dependencies linked correctly.

The build process compiles the C++ source files, links against the CVML libraries, and sets up the necessary ROS 2 message types and interfaces.

In [2]:
%%bash
# This cell might take a few minutes

# colcon writes build/, install/ and log/ into the directory it is run from,
# so check for the package's build output in the workspace root.
if [ -d "build/cvml_ros" ]; then
    echo "Package appears to be already built."
    echo "Skipping build. If you need to rebuild, run: colcon build --packages-select cvml_ros"
else
    echo "Building cvml_ros package..."
    source /opt/ros/kilted/setup.bash
    # Only report success if colcon actually succeeded
    colcon build --packages-select cvml_ros && echo "Build complete!"
fi

Building cvml_ros package...
Starting >>> cvml_ros
Finished <<< cvml_ros [7.14s]

Summary: 1 package finished [7.24s]
Build complete!
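
If you want to check the build state from Python rather than bash, a small helper along these lines may be useful. It is a sketch that assumes colcon's default layout, where `build/<package>` and `install/setup.sh` appear in the directory the build was run from. The name `workspace_is_built` is hypothetical, and the demo runs against a throwaway directory rather than the real workspace.

```python
import tempfile
from pathlib import Path


def workspace_is_built(workspace: Path, package: str) -> bool:
    """True if colcon's build and install outputs for `package` exist."""
    workspace = Path(workspace)
    has_build_dir = (workspace / "build" / package).is_dir()
    has_setup = (workspace / "install" / "setup.sh").is_file()
    return has_build_dir and has_setup


# Demo against a temporary directory that mimics colcon's layout.
with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    before = workspace_is_built(ws, "cvml_ros")
    (ws / "build" / "cvml_ros").mkdir(parents=True)
    (ws / "install").mkdir()
    (ws / "install" / "setup.sh").touch()
    after = workspace_is_built(ws, "cvml_ros")

print(before, after)
```

In the notebook you would call `workspace_is_built(Path.cwd(), "cvml_ros")` from the workspace root.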


## Launch Video Publisher Node

To test our depth estimation pipeline, we need a source of images. The `video_publisher.py` script is a simple ROS node that reads frames from a video file and publishes them to a ROS topic, simulating a camera stream.

This loopback approach is perfect for development and testing - it gives us a consistent, repeatable video stream without needing physical hardware.

### Start the Video Publisher

For this step, you'll want to open a new Jupyter terminal:

![](images/new_terminal.png)

Copy-paste the following command into the terminal:

```bash
sudo bash -c "source install/setup.sh && \
              ros2 run cvml_ros video_publisher.py \
                --ros-args \
                -p video_path:=/ryzers/RyzenAI-SW/Ryzen-AI-CVML-Library/samples/video_call.mp4 \
                -p topic:=/camera/image_raw"
```

**What this does:**
- Sources the ROS workspace setup to make our package available
- Runs the video_publisher.py script from the cvml_ros package
- Sets parameters: input video path and output topic name
- Publishes frames to `/camera/image_raw` topic
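
The `--ros-args -p name:=value` syntax is easy to mistype, so it can help to assemble the command from a dictionary. The sketch below does that; `ros2_run_command` is a hypothetical helper, not part of ROS 2, and it only builds the argument list without running anything.

```python
import shlex


def ros2_run_command(package: str, executable: str, params: dict) -> list:
    """Build the argv list for `ros2 run` with node parameters."""
    argv = ["ros2", "run", package, executable]
    if params:
        argv.append("--ros-args")
        for name, value in params.items():
            # Each parameter is passed as `-p name:=value`.
            argv += ["-p", f"{name}:={value}"]
    return argv


cmd = ros2_run_command(
    "cvml_ros",
    "video_publisher.py",
    {
        "video_path": "/ryzers/RyzenAI-SW/Ryzen-AI-CVML-Library/samples/video_call.mp4",
        "topic": "/camera/image_raw",
    },
)
# Print a copy-pasteable shell command.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` from an environment where the workspace has been sourced.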

You should see these messages if successful:
```
[INFO] [1760488629.706959565] [video_publisher]: Publishing video from: /ryzers/RyzenAI-SW/Ryzen-AI-CVML-Library/samples/video_call.mp4
[INFO] [1760488629.707166260] [video_publisher]: Publishing to topic: /camera/image_raw
```

### Verify the Node is Running

Let's use ROS 2 tools to check if our video publisher is active and publishing data:

In [3]:
!source install/setup.sh && ros2 topic list

/camera/image_raw
/parameter_events
/rosout


Perfect! The `/camera/image_raw` topic is now available in our ROS system, publishing video frames. Now we can connect our CVML depth estimation node to subscribe to this stream and process it using the NPU.
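
The same check can be scripted instead of read by eye. The sketch below parses `ros2 topic list` output and reports which expected topics are missing. The helper names are hypothetical, `list_topics` needs a sourced ROS 2 environment, and the demo uses the output shown above rather than a live system.

```python
import subprocess


def parse_topics(output: str) -> set:
    """Collect topic names (lines starting with '/') from `ros2 topic list` output."""
    return {line.strip() for line in output.splitlines() if line.strip().startswith("/")}


def list_topics() -> set:
    """Query the live ROS 2 graph; requires `ros2` on PATH."""
    result = subprocess.run(
        ["ros2", "topic", "list"], capture_output=True, text=True, check=True
    )
    return parse_topics(result.stdout)


def missing_topics(output: str, expected: list) -> list:
    """Return the expected topics that do not appear in the output."""
    return sorted(set(expected) - parse_topics(output))


sample = """/camera/image_raw
/parameter_events
/rosout
"""
print(missing_topics(sample, ["/camera/image_raw"]))  # []
print(missing_topics(sample, ["/camera/image_raw", "/depth_estimation/depth"]))  # ['/depth_estimation/depth']
```

Note that a topic appearing in the list only shows that it is advertised; it does not by itself prove that messages are flowing.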

## Launch Depth Estimation Pipeline

Now comes the exciting part - launching our NPU-accelerated depth estimation node! This node will:

1. Subscribe to `/camera/image_raw` (our video stream)
2. Convert ROS messages to OpenCV format using cv_bridge
3. Run depth estimation on the NPU using CVML
4. Publish the depth maps to `/depth_estimation/depth`

### Start the Depth Estimation Node

Open another new terminal:

![](images/new_terminal.png)

Execute these commands in the new terminal:

```bash
source /opt/ros/kilted/setup.bash
source install/setup.sh

sudo bash -c "source /opt/ros/kilted/setup.bash && \
              source install/setup.sh && \
              export LD_LIBRARY_PATH=$LD_LIBRARY_PATH && \
              ros2 launch cvml_ros depth_estimation.launch.py"
```

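**Note:** We need sudo permissions because the XDNA driver requires elevated privileges for NPU access. Because the command string is double-quoted, `$LD_LIBRARY_PATH` is expanded by your current shell before `sudo` runs, so the library paths set up by the two `source` commands above carry over into the root shell and all libraries are found.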
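
The role of the double quotes around the `sudo bash -c` command can be reproduced without `sudo`. In the sketch below, `env -i` stands in for the environment scrubbing that `sudo` typically applies (an assumption about your sudo configuration), and `/opt/demo/lib` is a made-up path. It assumes `bash` and `env` are available under `/usr/bin` or `/bin`.

```python
import subprocess


def run_bash(script: str) -> str:
    """Run `script` in bash with a known LD_LIBRARY_PATH and return its stdout."""
    result = subprocess.run(
        ["bash", "-c", script],
        env={"PATH": "/usr/bin:/bin", "LD_LIBRARY_PATH": "/opt/demo/lib"},
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Double quotes: the outer shell substitutes the value before the inner shell starts.
double_quoted = run_bash(
    'env -i PATH=/usr/bin:/bin bash -c "echo value=$LD_LIBRARY_PATH"'
)
# Single quotes: the inner shell expands the variable itself and finds it unset.
single_quoted = run_bash(
    "env -i PATH=/usr/bin:/bin bash -c 'echo value=$LD_LIBRARY_PATH'"
)

print(double_quoted)
print(single_quoted)
```

With double quotes the inner shell still sees the path even though its environment was cleared, which is the behavior the launch command above relies on.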

### What to Look For

The node will subscribe to our video_publisher and begin processing frames. You should observe initialization messages followed by processing logs. Key lines to look out for:

```
...
[depth_estimation_node-1] [INFO] [1760488970.246808904] [depth_estimation_node]: Ryzen AI Depth Estimation initialized
[depth_estimation_node-1] [INFO] [1760488970.491575724] [depth_estimation_node]: Created publisher on: /depth_estimation/depth
[depth_estimation_node-1] [INFO] [1760488970.491899477] [depth_estimation_node]: Created subscriber on: /camera/image_raw
[depth_estimation_node-1] [INFO] [1760488970.491906947] [depth_estimation_node]: Depth estimation node started successfully
[depth_estimation_node-1] [INFO] [1760488970.525778512] [depth_estimation_node]: Received first image: 1920x1080
...
```

These messages confirm:
1. CVML depth estimation context initialized successfully
2. ROS publisher and subscriber created
3. Node is processing incoming frames
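
When the launch output is long, the key lines can be checked with a short script. This is a sketch that assumes the log layout shown above (`[LEVEL] [timestamp] [node]: message`); the `EXPECTED` list and the function name are illustrative choices, not part of `cvml_ros`.

```python
import re

# Matches the "[INFO] [1760488970.246808904] [node_name]: message" part of a line.
LOG_LINE = re.compile(
    r"\[(?P<level>[A-Z]+)\] \[(?P<stamp>\d+\.\d+)\] \[(?P<node>[^\]]+)\]: (?P<message>.*)$"
)

EXPECTED = [
    "Ryzen AI Depth Estimation initialized",
    "Created publisher on: /depth_estimation/depth",
    "Created subscriber on: /camera/image_raw",
    "Received first image",
]


def missing_messages(log_text: str) -> list:
    """Return the expected messages that the depth node has not logged yet."""
    messages = []
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match and match["node"] == "depth_estimation_node":
            messages.append(match["message"])
    return [exp for exp in EXPECTED if not any(m.startswith(exp) for m in messages)]


sample = """\
[depth_estimation_node-1] [INFO] [1760488970.246808904] [depth_estimation_node]: Ryzen AI Depth Estimation initialized
[depth_estimation_node-1] [INFO] [1760488970.491575724] [depth_estimation_node]: Created publisher on: /depth_estimation/depth
"""
print(missing_messages(sample))
```

An empty list means all four key lines have appeared.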

### Check Active Topics and Nodes

Now let's verify the topics in our ROS system. We should see both the input camera topic and the new depth estimation output:

In [4]:
!source install/setup.sh && ros2 topic list

/camera/image_raw
/depth_estimation/depth
/parameter_events
/rosout


Excellent! We now have a complete pipeline:
- `/camera/image_raw` - Input video stream
- `/depth_estimation/depth` - NPU-generated depth maps

### Verify NPU Utilization

Let's confirm the NPU is actually being used for inference. This matters because we want depth estimation to run on the NPU hardware rather than fall back to the CPU.

In [5]:
!sudo /opt/xilinx/xrt/bin/xrt-smi examine --report aie-partitions


--------------------------------
[0000:c6:00.1] : NPU Strix Halo
--------------------------------
AIE Partitions
  Total Memory Usage: N/A
  Partition Index   : 0
    Columns: [0, 1, 2, 3, 4, 5, 6, 7]
    HW Contexts:
      |PID                 |Ctx ID     |Submissions |Migrations  |Err  |Priority |
      |Process Name        |Status     |Completions |Suspensions |     |GOPS     |
      |Memory Usage        |Instr BO   |            |            |     |FPS      |
      |                    |           |            |            |     |Latency  |
      |11624               |1          |1314        |0           |0    |Normal   |
      |N/A                 |Active     |1314        |6           |     |9        |
      |N/A                 |1712 KB    |            |            |     |N/A      |
      |                    |           |            |            |     |N/A      |
      |--------------------|-----------|------------|------------|-----|---------|


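Perfect! The report shows an active hardware context on the NPU:
- **Active status**: The context owned by PID 11624 is currently running work on the NPU
- **Submissions/Completions**: 1314 commands were submitted and 1314 completed; rerun the cell and these counters should increase as more frames are processed
- **Columns [0-7]**: Partition 0 spans all eight columns, so the full NPU array is allocated for depth estimation
- **Instr BO**: 1712 KB of instruction buffers are loaded for this context (the **Memory Usage** fields read `N/A` in this report)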
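
The counters in this report can also be pulled out programmatically, for example to compare two readings taken a few seconds apart. The parser below is a sketch that assumes the exact table layout shown above (a four-row stacked header followed by four rows per context). That layout may differ between XRT versions, and `parse_hw_contexts` is a hypothetical helper.

```python
def parse_hw_contexts(report: str) -> list:
    """Extract per-context counters from the HW Contexts table of the report."""
    rows = []
    for line in report.splitlines():
        line = line.strip()
        # Keep table rows only and skip the dashed separator row.
        if not line.startswith("|") or set(line) <= {"|", "-"}:
            continue
        rows.append([cell.strip() for cell in line.split("|")[1:-1]])

    contexts = []
    # The first four rows are the stacked header; each later group of four is one context.
    for start in range(4, len(rows) - 1, 4):
        first, second = rows[start], rows[start + 1]
        contexts.append({
            "pid": int(first[0]),
            "ctx_id": int(first[1]),
            "submissions": int(first[2]),
            "status": second[1],
            "completions": int(second[2]),
        })
    return contexts


sample = """
      |PID                 |Ctx ID     |Submissions |Migrations  |Err  |Priority |
      |Process Name        |Status     |Completions |Suspensions |     |GOPS     |
      |Memory Usage        |Instr BO   |            |            |     |FPS      |
      |                    |           |            |            |     |Latency  |
      |11624               |1          |1314        |0           |0    |Normal   |
      |N/A                 |Active     |1314        |6           |     |9        |
      |N/A                 |1712 KB    |            |            |     |N/A      |
      |                    |           |            |            |     |N/A      |
      |--------------------|-----------|------------|------------|-----|---------|
"""
contexts = parse_hw_contexts(sample)
print(contexts)
```

If `completions` grows between two readings while the status stays `Active`, the NPU is still processing work.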

We now have a complete pipeline: **MP4 → video_publisher → depth_estimation_node (NPU)** 

What's missing? Visualization! We need a way to see the depth maps being generated in real-time.

## Visualize the Pipeline

To see what our depth estimation node is producing, we'll use `web_video_server` - a ROS package that provides HTTP endpoints for streaming ROS image topics to web browsers or Jupyter notebooks.

### Start the Web Video Server

Open another terminal and run:

```bash
sudo bash -c "source install/setup.sh && \
              ros2 run web_video_server web_video_server --ros-args -p port:=8080 -p address:=0.0.0.0"
```

This creates a web server on port 8080 that can serve snapshots and streams from any image topic in our system.
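
A snapshot URL is just the server address plus the topic name as a query parameter. The helpers below are a small sketch for building that URL and sanity-checking a response before displaying it. Both function names are hypothetical, and the JPEG check assumes the `/snapshot` endpoint returns JPEG data.

```python
from urllib.parse import urlencode


def snapshot_url(topic: str, host: str = "localhost", port: int = 8080) -> str:
    """Build the web_video_server snapshot URL for a ROS image topic."""
    # Keep '/' unescaped so the topic name stays readable in the URL.
    query = urlencode({"topic": topic}, safe="/")
    return f"http://{host}:{port}/snapshot?{query}"


def looks_like_jpeg(data: bytes) -> bool:
    """JPEG data starts with the bytes FF D8 FF."""
    return data[:3] == b"\xff\xd8\xff"


print(snapshot_url("/camera/image_raw"))
print(snapshot_url("/depth_estimation/depth"))
```

Checking the first bytes of a response with `looks_like_jpeg` is a cheap way to tell a real frame from an error page.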

### Check Available Topics

Let's confirm all our topics are still active:

In [6]:
!source install/setup.sh && ros2 topic list

/camera/image_raw
/depth_estimation/depth
/parameter_events
/rosout


### Use `ipywidgets` to Visualize the Published Streams

We will use `ipywidgets` and HTML display to visualize the node outputs.

In [7]:
from IPython.display import HTML, display
from ipywidgets import Button, Output
import requests
import base64

out = Output()
display(out)

def update_display(b=None):
    with out:
        out.clear_output(wait=True)
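        try:
            # Fetch one JPEG snapshot per topic from web_video_server
            camera = requests.get('http://localhost:8080/snapshot?topic=/camera/image_raw', timeout=3)
            depth = requests.get('http://localhost:8080/snapshot?topic=/depth_estimation/depth', timeout=3)
            # Raise on HTTP errors instead of embedding an error page as an image
            camera.raise_for_status()
            depth.raise_for_status()

            # Embed the images inline as base64 data URIs
            cam_b64 = base64.b64encode(camera.content).decode()
            depth_b64 = base64.b64encode(depth.content).decode()
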
            display(HTML(f'''
            <div style="display: flex; gap: 20px;">
                <div>
                    <h3>Camera</h3>
                    <img src="data:image/jpeg;base64,{cam_b64}" width="480">
                </div>
                <div>
                    <h3>Depth (NPU)</h3>
                    <img src="data:image/jpeg;base64,{depth_b64}" width="480">
                </div>
            </div>
            '''))
        except Exception as e:
            print(f"Error: {e}")

button = Button(description="Refresh Images")
button.on_click(update_display)
display(button)
update_display()  # Initial display


## Key Takeaways

Congratulations! You have finished the NPU portion of the workshop! You should now be able to:
* Inspect the NPU device status for debugging.
* Compile and run CVML library features on the NPU.
* Integrate an NPU application into a ROS 2 system.

## Next Steps

* Try launching other nodes like face detection or face mesh.
* Check the NPU status as you launch new nodes. Can the device handle multiple applications running simultaneously?
* If running locally, try other ROS 2 visualization nodes like `rviz2`.

---
Copyright© 2025 AMD, Inc SPDX-License-Identifier: MIT