Large PointCloud Jumping and Latency #144

Open
skeel3r opened this issue Apr 15, 2016 · 8 comments

skeel3r commented Apr 15, 2016

I'm visualizing a Velodyne point cloud scan; however, the points look like they are jumping back and forth between two scans. There is also a big delay between when the velodyne_points topic stops publishing and when the scene stops updating.

It seems like there is a growing buffer of messages that can't be visualized fast enough (verified with htop). Unfortunately, the PointCloud2 object doesn't include a rate parameter.

I've set the max points from really low to really high and it has no effect.

The point clouds are large, but I'd expect the visualization to at least be smooth, even if delayed.

Here is the relevant part of the code:

    // Create the main viewer.
    var viewer = new ROS3D.Viewer({
      divID : 'viewer',
      width : 800,
      height : 600,
      antialias : true
    });

    // Setup a client to listen to TFs.
    var tfClient = new ROSLIB.TFClient({
      ros : ros,
      angularThres : 0.1,
      transThres : 0.1,
      rate : 10.0,
      fixedFrame : '/velodyne'
    });

    // Setup PointCloud2 stream
    pointCloud = new ROS3D.PointCloud2({
      ros : ros,
      topic : '/velodyne_points',
      tfClient : tfClient,
      //texture: image_url,
      rootObject : viewer.scene,
      size: 0.02,
      max_pts: 10000
    });

My thoughts are that

  1. The data rate is just too high and it's causing both jumping and a growing queue.
  2. There is an issue with the tf2_web_republisher (again because of long processing time) that is causing the jumping.
  3. I need to make a callback with no queue that only updates after the pointcloud has been processed.
  4. Or I could modify pointcloud.js to take advantage of the rate parameter (a rough sketch of a throttled subscription is at the end of this post).

Relevant Info
I'm running this on Ubuntu 14.04 with Apache2 on an Intel NUC i5.
I've followed RobotWebTools/rosbridge_suite#192 and RobotWebTools/rosbridge_suite#212, which seem relevant.
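
For reference on option 4, here is a rough sketch (not my actual code) of a throttled raw subscription: roslibjs lets ROSLIB.Topic pass throttle_rate and queue_length to rosbridge, which then throttles and drops stale messages on the server side.

    // Rough sketch (not my actual code): subscribe directly with roslibjs and
    // let rosbridge throttle and drop stale messages server-side.
    var cloudTopic = new ROSLIB.Topic({
      ros : ros,
      name : '/velodyne_points',
      messageType : 'sensor_msgs/PointCloud2',
      throttle_rate : 100,  // at most one message every 100 ms from rosbridge
      queue_length : 1      // keep only the newest message in rosbridge's queue
    });

    cloudTopic.subscribe(function(message) {
      // hand the cloud to the renderer here; ROS3D.PointCloud2 would need a
      // small patch to expose these two options on its internal subscription
    });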

skeel3r commented Apr 17, 2016

I added queue size and rate options to the ROS3D.PointCloud2 config, and after some fiddling that seemed to help with the jumpiness. The remaining issue is that the client side doesn't seem able to keep up with the large data size (even with a reasonable queue size on the server). The client subscribes to another topic as well, and it becomes extremely laggy once the point cloud is published, so the websocket seems to be saturated. I don't know whether this is a problem with the viewer rendering or latency in the websocket, but any input would be appreciated.
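
Something along these lines (the option names are only illustrative; in my patch they just get passed through to the internal ROSLIB.Topic subscription):

    // Illustrative only: queue size and rate forwarded through the PointCloud2
    // options to its internal rosbridge subscription (names are hypothetical).
    pointCloud = new ROS3D.PointCloud2({
      ros : ros,
      topic : '/velodyne_points',
      tfClient : tfClient,
      rootObject : viewer.scene,
      size : 0.02,
      max_pts : 10000,
      throttle_rate : 200,  // hypothetical: ms between messages from rosbridge
      queue_length : 1      // hypothetical: keep only the newest message queued
    });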

FuzzAU commented Apr 17, 2016

It certainly isn't just high-rate point cloud data that is an issue.

I was recently playing with moving markers using the script from the second answer here:
http://answers.ros.org/question/11135/plotting-a-markerarray-of-spheres-with-rviz/

While it's pretty to look at, the latency and buffering become apparent when you stop the script and realise that the markers continue to move for quite some time.

I likely won't have the chance to do so this week, but I think it'd be important (for someone, anyone) to set up a test in which we log the send time of the data, the time each item is received at rosbridge, and the time it is received in ros3djs. Unless someone can think of a better way, this will at least pinpoint whether it's a rosbridge issue or something inside ros3djs. I feel like it's possibly a rosbridge thing...
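
For the ros3djs end, a quick sketch of what I have in mind (assuming the sender's and the browser machine's clocks roughly agree, and a message type with a std_msgs/Header): compare the header stamp against the local clock as each message arrives.

    // Sketch of the client-side half of the test: log how far behind the
    // sender each message arrives (assumes the two machines' clocks agree).
    var probe = new ROSLIB.Topic({
      ros : ros,
      name : '/velodyne_points',
      messageType : 'sensor_msgs/PointCloud2'
    });

    probe.subscribe(function(msg) {
      var sentMs = msg.header.stamp.secs * 1000 + msg.header.stamp.nsecs / 1e6;
      console.log('lag at ros3djs: ' + (Date.now() - sentMs).toFixed(1) + ' ms');
    });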

skeel3r commented Apr 19, 2016

Interesting that the marker visualizer lags with only 100 markers. I also suspect the problem is on the rosbridge side of things. I might have some time this weekend to look at logging the timestamps. I'll let you know if I confirm our suspicions.

skeel3r commented Apr 20, 2016

I found this rosbridge issue, which sounds like it confirms our problem: RobotWebTools/rosbridge_suite#69

FuzzAU commented Apr 24, 2016

I've done a fair amount of debugging and I think I know where the overhead is, but I'm not quite sure what to do about it...

My test case was the one discussed above, in which I had 100 markers in a MarkerArray being sent at 100 Hz. I didn't actually expect rosbridge to be able to keep up, but I figured this would help exacerbate the issues.

Using a combination of PyCharm's profiling tool and line_profiler, I found that the queue of messages coming in from the subscriber inside rosbridge keeps growing. In particular, in rosbridge_library/internal/subscribers.py, the line json = message_conversion.extract_values(msg) stood out as taking a huge amount of time.

I dug a little deeper and found that the parsing of my messages was simply taking a long time (most of which happens in message_conversion.py in rosbridge_library/internal). This is to be expected given that my MarkerArray contained 100 Marker objects, and a Marker is made up of quite a few other ROS message types.

There wasn't a lot that stood out as particularly inefficient code. I made a few small improvements to the main parser components in _from_inst(inst, rostype), but it wasn't enough to make a significant impact given the deeply nested message types and high data rate.

Your Velodyne point clouds are much simpler message types, but they contain large binary blobs, so perhaps it is the memory copying that hurts.

Do you have a test data set I could experiment with? Or perhaps let me know how much data you're sending and at what rate.

skeel3r commented Apr 30, 2016

Nice work! I've uploaded a bagfile here for testing. I took a cursory look at the file you mentioned, but like you, nothing stood out to me. Should we throw out messages that arrive before the parser has finished with the previous one? I'm not sure that just making the parser faster will be the fix.

yuyuyu00 commented Jul 4, 2017

Has this problem been solved? I have come across a similar situation. @FuzzAU @skeel3r

mbredif commented Apr 4, 2018

On a related topic, I have added message and point subsampling options to PointCloud2 and LaserScan and optimized them somewhat in #218.
A new example (examples/kitti.html) shows that subsampling helps to keep up with the incoming flow of points when visualizing the KITTI Velodyne data, but that the optimizations are not sufficient (at least on my desktop) to visualize all points of all messages.
@FuzzAU @skeel3r @yuyuyu00, this example may provide you with a use case for tracking down where things could be optimized further.
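
A minimal sketch of how the subsampling options can be used (option names as in #218): pointRatio keeps every n-th point of a cloud, messageRatio renders only every n-th message.

    // Minimal sketch using the subsampling options from #218:
    // keep every 2nd point of each cloud and render only every 3rd message.
    var cloud = new ROS3D.PointCloud2({
      ros : ros,
      topic : '/velodyne_points',
      tfClient : tfClient,
      rootObject : viewer.scene,
      max_pts : 100000,
      pointRatio : 2,
      messageRatio : 3
    });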
