Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserialization error publishing a LaserScan on ESP32 at certain message sizes #1427

Closed
iovsiann opened this issue Jun 20, 2023 · 5 comments
Closed
Assignees

Comments

@iovsiann
Copy link

iovsiann commented Jun 20, 2023

Issue template

  • Hardware description: a ESP32-WROOM-32 board connected to a Docker
  • RTOS: Arduino "ESP32 Dev Module" board, default settings, with Micro-ROS
  • Installation type: Docker microros/micro-ros-agent:humble Docker image, v2.0.6-humble precompiled Micro-ROS Arduino library, Arduino IDE 1.8.13 (Windows)
  • Version or commit hash: Humble

Steps to reproduce the issue

  1. Edit the sketch below to specify your WiFi SSID, password and IP of your PC that will run a Micro-ROS agent. Use the Arduino IDE (Windows) to compile and upload the sketch to an ESP32 module.
  2. Launch the Micro-ROS agent. In my case, I'm using a Windows machine with Docker-for-Windows installed. Here is the Windows shell command to launch the Docker Micro-ROS agent image.
docker run --name uros-agent -it -p 8888:8888/udp microros/micro-ros-agent:humble udp4 -p 8888
  1. ESP32 connects to the Micro-ROS agent and starts publishing a dummy LaserScan message on /scan topic. Launch a Docker bash session to observe LaserScan getting published successfully.
docker exec -it uros-agent bash
ros2 topic echo /scan  # in Micro-ROS container bash
  1. ISSUE 1. Edit the sketch to change BUF_LEN value from 1000 to 500. Compile and upload the sketch.
//#define BUF_LEN 1000 // works
#define BUF_LEN 500 // causes Micro-ROS agent deserialization error
  1. ISSUE 1. ESP32 connects to the Micro-ROS agent. The Micro-ROS agent reports "deserialization error".
  2. ISSUE 2. Undo the sketch code changes from step 4. Edit the sketch to use rclc_publisher_init_best_effort() instead of rclc_publisher_init_default(), see below. Compile and upload the sketch and open the Arduino serial monitor to see the debug messages. Make sure the Micro-ROS agent is running (step 2).
// rclc_publisher_init_default() works
//  RCCHECK(rclc_publisher_init_default(&publisher, &node,
 //   ROSIDL_GET_MSG_TYPE_SUPPORT(sensor_msgs, msg, LaserScan), TOPIC_NAME));

// rclc_publisher_init_best_effort() causes rcl_publish() to error out
RCCHECK(rclc_publisher_init_best_effort(&publisher, &node,
  ROSIDL_GET_MSG_TYPE_SUPPORT(sensor_msgs, msg, LaserScan), TOPIC_NAME));
  1. ISSUE 2. ESP32 connects to the Micro-ROS agent successfully, but the Arduino serial monitor prints "rcl_publish() error", i.e. the call to rcl_publish().
  2. ISSUE 3. Open the micro-ros_publisher_wifi example in the Arduino IDE, edit the WiFi SSID, password and Micro-ROS agent IP, compile and upload the code to ESP32. Make sure the Micro-ROS agent is running (step 2).
  3. ISSUE 3. ESP32 connects to the Micro-ROS agent successfully. Launch a Docker bash to observe the int32 messages getting published to /topic_name, at first, successfully.
docker exec -it uros-agent bash
ros2 topic echo /topic_name  # in Micro-ROS container bash
  1. ISSUE 3. Wait for a couple of minutes or so until "ros2 topic echo /topic_name" freezes and does not output message data anymore.

Expected behavior

LaserScan message gets successfully published, as in step 3 for issues 1 and 2.
int32 message gets successfully published - and does not freeze after a while - for issue 3.

Actual behavior

Issue 1: Micro-ROS agent receives LaserScan messages, but reports "deserialization error" and thus fails to publish the received messages.

Additional information

The LaserScan dummy publisher sketch is based on the micro-ros_publisher_wifi Micro-ROS Arduino example.
Note that in issue 1, changing just one line of code (BUF_LEN from, say, 1000 to 500 or 600) breaks the successful operation. If I set BUF_LEN to 400 or 800, the sketch works fine.
Note that in issue 2, replacing rclc_publisher_init_default() with rclc_publisher_init_best_effort() for the LaserScan dummy publisher breaks the operation. However, the original micro-ros_publisher_wifi example works fine with rclc_publisher_init_best_effort().
Note that in issue 3, the official micro-ros_publisher_wifi example stops working (specifically, the Micro-ROS agent stops receiving messages) simply after waiting for a couple of minutes or so.

I've searched Micro-ROS issues before opening this ticket and I've noticed a couple of related issues, but I haven't been able to find a solution that would work for me.

Any help would be much appreciated!

// docker run --name uros-agent -it -p 8888:8888/udp microros/micro-ros-agent:humble udp4 -p 8888
// docker exec -it uros-agent bash
// ros2 topic echo /scan
#include <micro_ros_arduino.h>
#include <stdio.h>
#include <rcl/rcl.h>
#include <rcl/error_handling.h>
#include <rclc/rclc.h>
#include <rclc/executor.h>
#include <sensor_msgs/msg/laser_scan.h>

#if !defined(ESP32)
#error This example is only avaible for ESP32 Dev module
#endif

#define FRAME_ID "lds"
#define TOPIC_NAME "scan"
#define NODE_NAME "uros_node"
#define BUF_LEN 1000 // 400 and 1000 work, 500...600 give "deserialization error"
#define LED_PIN 2
#define RCCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){error_loop();}}
#define RCSOFTCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){}}

rcl_publisher_t publisher;
rclc_support_t support;
rcl_allocator_t allocator;
rcl_node_t node;

sensor_msgs__msg__LaserScan laserScanMsg;
struct timespec tv = {0};
static float memory[BUF_LEN];
static float dist_offset = 0.15;

void setup() {
  Serial.begin(115200);
  set_microros_wifi_transports((char*)"YOUR_WIFI_SSID", (char*)"your_wifi_password",
    (char*)"192.168.1.YOUR-UROS-PC-IP", 8888);

  pinMode(LED_PIN, OUTPUT);
  digitalWrite(LED_PIN, HIGH);

  laserScanMsg.header.frame_id.data = (char*)FRAME_ID;
  laserScanMsg.header.frame_id.capacity = strlen(laserScanMsg.header.frame_id.data);
  laserScanMsg.header.frame_id.size = laserScanMsg.header.frame_id.capacity;

  // LaserScan  
  laserScanMsg.angle_min = 0.0;               // start angle of the scan [rad]
  laserScanMsg.angle_max = TWO_PI;           // end angle of the scan [rad]
  laserScanMsg.angle_increment =
    (laserScanMsg.angle_max - laserScanMsg.angle_min) / BUF_LEN; // rad/meas
  laserScanMsg.range_min = 0.12;               // minimum range value [m]
  laserScanMsg.range_max = 10.0;               // maximum range value [m]

  //https://micro.ros.org/docs/tutorials/advanced/handling_type_memory/
  laserScanMsg.ranges.capacity = BUF_LEN;
  laserScanMsg.ranges.data = memory;
  laserScanMsg.ranges.size = 0;

  //float time_increment; // time between measurements
  //float scan_time; // time between scans
  //rosidl_runtime_c__float__Sequence intensities; // intensity or empty

  delay(2000);

  allocator = rcl_get_default_allocator();
  Serial.println(F("After allocator()"));

  Serial.println(F("Connecting to ROS agent ..."));
  RCCHECK(rclc_support_init(&support, 0, NULL, &allocator));
  Serial.println(F("After rclc_support_init()"));

  // https://micro.ros.org/docs/tutorials/programming_rcl_rclc/node/
  RCCHECK(rclc_node_init_default(&node, NODE_NAME, "", &support));
  Serial.println(F("After rclc_node_init_default()"));

  // Creates a reliable rcl publisher
  RCCHECK(rclc_publisher_init_default(&publisher, &node,
    ROSIDL_GET_MSG_TYPE_SUPPORT(sensor_msgs, msg, LaserScan), TOPIC_NAME));

  // rcl_publish() errors out
  //RCCHECK(rclc_publisher_init_best_effort(&publisher, &node,
  //  ROSIDL_GET_MSG_TYPE_SUPPORT(sensor_msgs, msg, LaserScan), TOPIC_NAME));

  Serial.println(F("After rclc_publisher_init_*()"));
}

void loop() {
  dist_offset = dist_offset <= 5.0 ? dist_offset + 0.1 : 0.15;
  laserScanMsg.ranges.size = BUF_LEN;
  for(int32_t i = 0; i < BUF_LEN; i++) {
    laserScanMsg.ranges.data[i] = dist_offset + i*0.05;
  }
  clock_gettime(CLOCK_REALTIME, &tv);         //main/time_sync.c
  laserScanMsg.header.stamp.sec = tv.tv_sec;
  laserScanMsg.header.stamp.nanosec = tv.tv_nsec;
  
  //RCSOFTCHECK(rcl_publish(&publisher, &laserScanMsg, NULL));
  rcl_ret_t rc = rcl_publish(&publisher, &laserScanMsg, NULL);
  if (rc == RCL_RET_OK)
    digitalWrite(LED_PIN, !digitalRead(LED_PIN));
  else
    Serial.println("rcl_publish() error");
  delay(100);
}

void error_loop(){
  Serial.println(F("error_loop()"));
  while(1){
    digitalWrite(LED_PIN, !digitalRead(LED_PIN));
    delay(500);
  }
}

/*
void timer_callback(rcl_timer_t * timer, int64_t last_call_time)
{
  Serial.println(F("timer_callback()"));
  RCLC_UNUSED(last_call_time);
  if (timer != NULL) {
    RCSOFTCHECK(rcl_publish(&publisher, &laserScanMsg, NULL));
  }
}
*/
@pablogs9
Copy link
Member

Some details:

@iovsiann
Copy link
Author

Thank you for the instant reply @pablogs9 ! As a quick test, I've reduced the message size a lot and that fixed the Best Effort publisher. That said, "ros2 topic echo /scan" with the Best Effort publisher still freezes after a minute or two. I'll check about recompiling the library, but for now it seems the freezing issue will still keep me from using the Best Effort publisher.

I've tried zeroing out the LaserScan message and filling out the intensities array (with dummy values) while using rclc_node_init_default() - I'm afraid the deserialization error is still there. Here is the code to reproduce it. The code works (no deserialization error) when I set BUF_LEN=500, but doesn't work (serialization error) with BUF_LEN=250.

#include <micro_ros_arduino.h>
#include <stdio.h>
#include <rcl/rcl.h>
#include <rcl/error_handling.h>
#include <rclc/rclc.h>
#include <rclc/executor.h>
#include <sensor_msgs/msg/laser_scan.h>

#if !defined(ESP32)
#error This example is only available for ESP32 Dev module
#endif

#define FRAME_ID "lds"
#define TOPIC_NAME "scan"
#define NODE_NAME "uros_node"
#define BUF_LEN 250 // causes a "deserialization error"
//#define BUF_LEN 500 // works fine
#define LED_PIN 2
#define RCCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){error_loop();}}
#define RCSOFTCHECK(fn) { rcl_ret_t temp_rc = fn; if((temp_rc != RCL_RET_OK)){}}

rcl_publisher_t publisher;
rclc_support_t support;
rcl_allocator_t allocator;
rcl_node_t node;

sensor_msgs__msg__LaserScan laserScanMsg;
struct timespec tv = {0};
static float memory[BUF_LEN];
static float mem_intensity[BUF_LEN];
static float dist_offset = 0.15;

void error_loop(){
  Serial.println(F("error_loop()"));
  while(1){
    digitalWrite(LED_PIN, !digitalRead(LED_PIN));
    delay(500);
  }
}

void setup() {
  Serial.begin(115200);
  set_microros_wifi_transports((char*)"YOUR_WIFI_SSID", (char*)"your_wifi_password",
    (char*)"192.168.1.YOUR_UROS_PC_IP", 8888);

  pinMode(LED_PIN, OUTPUT);
  digitalWrite(LED_PIN, HIGH);

  memset((void*)&laserScanMsg, 0, sizeof(laserScanMsg));

  laserScanMsg.header.frame_id.data = (char*)FRAME_ID;
  laserScanMsg.header.frame_id.capacity = strlen(laserScanMsg.header.frame_id.data);
  laserScanMsg.header.frame_id.size = laserScanMsg.header.frame_id.capacity;

  // LaserScan  
  laserScanMsg.angle_min = 0.0;               // start angle of the scan [rad]
  laserScanMsg.angle_max = TWO_PI;           // end angle of the scan [rad]
  laserScanMsg.angle_increment =
    (laserScanMsg.angle_max - laserScanMsg.angle_min) / BUF_LEN; // rad/meas
  laserScanMsg.range_min = 0.12;               // minimum range value [m]
  laserScanMsg.range_max = 10.0;               // maximum range value [m]

  //https://micro.ros.org/docs/tutorials/advanced/handling_type_memory/
  laserScanMsg.ranges.capacity = BUF_LEN;
  laserScanMsg.ranges.data = memory;
  laserScanMsg.ranges.size = 0;

  //float time_increment; // time between measurements
  //float scan_time; // time between scans
  //rosidl_runtime_c__float__Sequence intensities; // intensity or empty

  laserScanMsg.intensities.capacity = BUF_LEN;
  laserScanMsg.intensities.data = mem_intensity;
  laserScanMsg.intensities.size = 0;

  delay(2000);

  allocator = rcl_get_default_allocator();
  Serial.println(F("After allocator()"));

  Serial.println(F("Connecting to ROS agent ..."));
  RCCHECK(rclc_support_init(&support, 0, NULL, &allocator));
  Serial.println(F("After rclc_support_init()"));

  // https://micro.ros.org/docs/tutorials/programming_rcl_rclc/node/
  RCCHECK(rclc_node_init_default(&node, NODE_NAME, "", &support));
  Serial.println(F("After rclc_node_init_default()"));

  // Creates a reliable rcl publisher
  RCCHECK(rclc_publisher_init_default(&publisher, &node,
    ROSIDL_GET_MSG_TYPE_SUPPORT(sensor_msgs, msg, LaserScan), TOPIC_NAME));

  // rcl_publish() errors out
  //RCCHECK(rclc_publisher_init_best_effort(&publisher, &node,
  //  ROSIDL_GET_MSG_TYPE_SUPPORT(sensor_msgs, msg, LaserScan), TOPIC_NAME));

  Serial.println(F("After rclc_publisher_init_*()"));
}

void loop() {
  dist_offset = dist_offset <= 5.0 ? dist_offset + 0.1 : 0.15;
  laserScanMsg.ranges.size = BUF_LEN;
  laserScanMsg.intensities.size = BUF_LEN;
  for(int32_t i = 0; i < BUF_LEN; i++) {
    laserScanMsg.ranges.data[i] = dist_offset + i*0.05;
    laserScanMsg.intensities.data[i] = laserScanMsg.ranges.data[i];
  }
  clock_gettime(CLOCK_REALTIME, &tv);         //main/time_sync.c
  laserScanMsg.header.stamp.sec = tv.tv_sec;
  laserScanMsg.header.stamp.nanosec = tv.tv_nsec;
  
  //RCSOFTCHECK(rcl_publish(&publisher, &laserScanMsg, NULL));
  rcl_ret_t rc = rcl_publish(&publisher, &laserScanMsg, NULL);
  if (rc == RCL_RET_OK)
    digitalWrite(LED_PIN, !digitalRead(LED_PIN));
  else
    Serial.println("rcl_publish() error");
  delay(100);
}

@Acuadros95 Acuadros95 self-assigned this Jun 20, 2023
@Acuadros95
Copy link
Contributor

@iovsiann You did in fact discover a bug in our middleware! We have fixed this bug here: eProsima/Micro-XRCE-DDS-Client#360. This repository branches has been updated to include this fix.

Please give it a try, we will create a new minor release once your feedback is positive.

@iovsiann
Copy link
Author

@Acuadros95 awesome, thank you! I have recompiled the Arduino library with the output_reliable_stream.c patch and the reliable publisher seems to work properly on my end - with and without intensities.* filled out.

@Acuadros95
Copy link
Contributor

Nice! The latest minor v2.0.7 releases include the fix.

Closing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants