
Strange RAM usage when creating subscriptions #257

Closed
alsora opened this issue Feb 15, 2019 · 19 comments · Fixed by eProsima/Fast-DDS#444
Labels
more-information-needed Further information is required

Comments

@alsora
Contributor

alsora commented Feb 15, 2019

Bug report

Required Info:

  • Operating System:
    • Ubuntu 18.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • Crystal
  • DDS implementation:
    • Fast-RTPS, RTI Connext
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

Define messages of fixed size

Array100kb.msg

byte[100000] data

Instantiate a subscription to a topic of this message type

#include <functional>  // std::bind, std::placeholders
#include <memory>
#include <string>

#include "rclcpp/rclcpp.hpp"
#include "simple_memory_test/msg/array1kb.hpp"

class SimpleSubscriberNode : public rclcpp::Node {

public:
    SimpleSubscriberNode(std::string name) : Node(name)
    {
        // Default QoS profile; only the cost of creating the subscription matters here.
        rmw_qos_profile_t custom_qos_profile = rmw_qos_profile_default;

        _sub = this->create_subscription<simple_memory_test::msg::Array1kb>(
            "chatter",
            std::bind(&SimpleSubscriberNode::simple_callback, this, std::placeholders::_1),
            custom_qos_profile);
    }

private:
    // The callback intentionally does nothing: no messages are ever published in this test.
    void simple_callback(const simple_memory_test::msg::Array1kb::SharedPtr msg)
    {
        (void)msg;
    }

    rclcpp::Subscription<simple_memory_test::msg::Array1kb>::SharedPtr _sub;
};



int main(int argc, char ** argv)
{
    rclcpp::init(argc, argv);

    std::shared_ptr<SimpleSubscriberNode> sub_node = std::make_shared<SimpleSubscriberNode>("this_node");

    rclcpp::spin(sub_node);

    rclcpp::shutdown();

    return 0;
}

Behavior

From what I have seen, the RAM required for instantiating a subscription should be more or less invariant to the message type.
Note that I'm only running a system containing that subscriber node, no other nodes are alive.

However, testing this assumption for several message sizes, I got the following:

[Plot "fastrtps_peak": physical memory usage vs. message size]

The physical memory required by the code above is larger for messages between 50 KB and 500 KB than for messages of more than 1 MB.
After the peak in physical memory (56 MB for a 450 KB message), the usage returns to the same constant value.

As for virtual memory usage, it keeps increasing considerably, reaching 1.1 GB for a 4 MB message.

This behavior does not occur when creating publishers.

This behavior does not occur when using OpenSplice DDS.

Do you have any idea why?
Could it be related to #203?

Here you can find the package I used for running the tests.
https://github.com/alsora/ros2-code-examples/tree/master/simple_memory_test

It creates an executable for each data point.
I am measuring the RAM usage with
ps aux | grep sub_nodes

The package also contains a plot.py file where the exact results of the tests are hardcoded, so you can plot them again to better inspect the graph.
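
For reference, the same numbers can also be sampled from inside the process. Below is a minimal Linux-only sketch (illustrative only, not part of the simple_memory_test package) that reads VmRSS and VmSize from /proc/self/status; calling it after create_subscription gives the same data points without an external ps.

// Minimal Linux-only sketch (not part of the simple_memory_test package): read the
// process's own VmRSS / VmSize from /proc/self/status instead of polling `ps` externally.
#include <fstream>
#include <iostream>
#include <string>

void print_memory_usage()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        // VmRSS = resident (physical) memory, VmSize = virtual memory; values are in kB.
        if (line.rfind("VmRSS:", 0) == 0 || line.rfind("VmSize:", 0) == 0) {
            std::cout << line << std::endl;
        }
    }
}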

@dirk-thomas
Member

Please clarify which exact executable you are running. When looking at the pub_node_<SIZE> sources (e.g. https://github.com/alsora/ros2-code-examples/blob/c58fa3b3b4196e5e3333f11baa0b295d008e8419/simple_memory_test/src/pub/4mb_main.cpp), the node never seems to publish any messages and just sits there idle?

@dirk-thomas dirk-thomas added the more-information-needed Further information is required label Feb 15, 2019
@alsora
Contributor Author

alsora commented Feb 15, 2019

Yes, exactly.
In the plots I'm only creating subscribers and letting them spin, without anyone publishing.
I just wanted to look at the memory requirement of calling create_subscription.

Similarly, I tested whether the issue was present for publishers, i.e. only calling create_publisher without ever publishing (sketched below).
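
A hedged sketch of what such a publisher-only node looks like (the actual sources are the pub/*_main.cpp files in the linked package; exact rclcpp signatures vary between releases):

// Illustrative publisher-only node analogous to the subscriber above (Crystal-era
// rclcpp API; the real test sources are in the linked package). The publisher is
// created but publish() is never called, so only the cost of create_publisher is measured.
#include <memory>
#include <string>

#include "rclcpp/rclcpp.hpp"
#include "simple_memory_test/msg/array1kb.hpp"

class SimplePublisherNode : public rclcpp::Node {
public:
    explicit SimplePublisherNode(std::string name) : Node(name)
    {
        _pub = this->create_publisher<simple_memory_test::msg::Array1kb>(
            "chatter", rmw_qos_profile_default);
    }

private:
    rclcpp::Publisher<simple_memory_test::msg::Array1kb>::SharedPtr _pub;
};

int main(int argc, char ** argv)
{
    rclcpp::init(argc, argv);
    rclcpp::spin(std::make_shared<SimplePublisherNode>("this_node"));
    rclcpp::shutdown();
    return 0;
}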

@dirk-thomas
Member

So when running e.g. just the pub_node_4mb executable you see an increased memory usage over time for that process? I see a constant memory usage of 9.5 MiB for that process (on 18.04 with the latest Debian packages). For sub_node_4mb the value is 9.9 MiB but also constant.

@alsora
Contributor Author

alsora commented Feb 15, 2019

Values are always constant in time.

The problem is that I would expect them to also be more or less invariant to the message type.

The value I see with sub_node_10b is 20MB.
The value I see with sub_node_450kb is 56MB.
The value I see with sub_node_4MB is 22MB.

There is a strangely high memory usage for subscriptions to messages in the 50 KB to 450 KB range.

I'm using the official Crystal docker image.

@dirk-thomas
Member

The value I see with sub_node_10b is 20MB.
The value I see with sub_node_450kb is 56MB.
The value I see with sub_node_4MB is 22MB.

Which RMW implementation are these for? FastRTPS or Connext?

@alsora
Contributor Author

alsora commented Feb 15, 2019

Fast-RTPS.

With OpenSplice the memory usage for the same tests is almost invariant.

@dirk-thomas
Member

I can't reproduce the numbers on my end - all pub/sub binaries have a memory usage of ~9-10 MiB.

One thing to keep in mind is that when creating publishers and subscribers the RMW implementation might choose to preallocate a number of message instances. That is usually limited to smaller messages, whereas larger messages might only be allocated dynamically. That might be the reason for the drop in your plot.

@alsora
Contributor Author

alsora commented Feb 16, 2019

One thing to keep in mind is that when creating publishers and subscribers the RMW implementation might choose to preallocate a number of message instances. That is usually limited to smaller messages, whereas larger messages might only be allocated dynamically. That might be the reason for the drop in your plot.

That's very interesting.
Is there a way to control how this feature works?

However, I tested the nodes again on a fresh Ubuntu 18.04 system, installing the latest Debian packages, and I still see the issue.
Moreover, I never get less than 20 MB of memory usage.

I first noticed this problem on an ARMv7 platform, where I was still getting 11 MB for most of the nodes, except for a subscription to a statically allocated 77 KB message, which was taking 18 MB.

@MiguelCompany
Contributor

Hi @alsora

Fast-RTPS will by default pre-allocate space for 100 message instances. You can change this behavior using an XML file. By default, Fast-RTPS looks for DEFAULT_FASTRTPS_PROFILES.xml in the current working directory, but a full path and filename can be specified using the environment variable FASTRTPS_DEFAULT_PROFILES_FILE. You can find the documentation on the subscriber profile format here.

You could start with this simple example:

<?xml version="1.0" encoding="UTF-8" ?>
<dds>
    <profiles>
        <!--
            Please note that rmw_fastrtps will only use the profile marked as default,
            so the is_default_profile attribute here should be true.
        -->
        <subscriber profile_name="test_subscriber_profile" is_default_profile="true">
            <topic>
                <resourceLimitsQos>
                    <!-- Number of message instances initially allocated. Default is 100. -->
                    <allocated_samples>0</allocated_samples>

                    <!-- 
                        Maximum number of message instances allocated. 0 means no limit. 
                        Default is 5000. 
                    -->
                    <max_samples>0</max_samples>
                </resourceLimitsQos>
            </topic>
            
            <!--
                For each message instance, this indicates whether memory for the serialized
                payload is preallocated or not. On ROS 2, this will be overridden to
                PREALLOCATED_WITH_REALLOC unless 'RMW_FASTRTPS_USE_QOS_FROM_XML=1' is defined
                in the environment.
                Possible values:
                   DYNAMIC                     Memory is not allocated until needed.
                   PREALLOCATED                Memory is allocated when message instances are
                                               created and is not allowed to grow. Useful for
                                               fixed-size messages.
                   PREALLOCATED_WITH_REALLOC   Memory is allocated when message instances are
                   (default)                   created, based on estimates of the maximum size
                                               of the serialized payload, but it is allowed to
                                               grow upon serialization. Useful for dynamic-size
                                               messages.
            -->
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>
    </profiles>
</dds>
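
As a usage note (the file path below is hypothetical): the profile has to be visible before the RMW layer is initialized, so either place DEFAULT_FASTRTPS_PROFILES.xml in the working directory or export FASTRTPS_DEFAULT_PROFILES_FILE before launching the node. A minimal sketch of doing the latter from the program itself:

// Sketch only: point Fast-RTPS at the XML profile above from inside the program.
// The path below is hypothetical; exporting FASTRTPS_DEFAULT_PROFILES_FILE in the
// shell before launching the node works just as well. Either way, it must be set
// before the RMW layer is initialized (i.e. before rclcpp::init / node creation).
#include <cstdlib>   // setenv (POSIX)
#include <memory>

#include "rclcpp/rclcpp.hpp"

int main(int argc, char ** argv)
{
    setenv("FASTRTPS_DEFAULT_PROFILES_FILE", "/tmp/fastrtps_profiles.xml", 1);

    rclcpp::init(argc, argv);
    rclcpp::spin(std::make_shared<rclcpp::Node>("profile_test_node"));
    rclcpp::shutdown();
    return 0;
}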

@alsora
Contributor Author

alsora commented Feb 19, 2019

@MiguelCompany Thank you very much.
Changing allocated_samples to 1 worked perfectly.

Indeed, in my plot the difference in memory between a subscription to a 100 KB message and one to a 10-byte message was approximately 10 MB (100 KB × 100 pre-allocated instances).

@dirk-thomas are there any plans to better integrate this feature?
I was thinking that if a subscription has a depth of 1 in its QoS profile, then it should not preallocate more than 1 message (sketched below).
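
For illustration, a hedged sketch at the API level: a drop-in variant of the SimpleSubscriberNode constructor above that requests a keep-last history of depth 1 (whether the RMW layer actually derives its preallocation from this depth is the open question).

// Drop-in variant of the SimpleSubscriberNode constructor above (Crystal-era rclcpp
// API): request a keep-last history with depth 1. Whether the RMW layer derives its
// preallocation from this depth is the open question raised here.
SimpleSubscriberNode(std::string name) : Node(name)
{
    rmw_qos_profile_t custom_qos_profile = rmw_qos_profile_default;
    custom_qos_profile.history = RMW_QOS_POLICY_HISTORY_KEEP_LAST;
    custom_qos_profile.depth = 1;

    _sub = this->create_subscription<simple_memory_test::msg::Array1kb>(
        "chatter",
        std::bind(&SimpleSubscriberNode::simple_callback, this, std::placeholders::_1),
        custom_qos_profile);
}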

@dirk-thomas
Member

are there any plans to better integrate this feature?

Since it works well with other RMW implementations, it seems to be a FastRTPS-internal implementation decision (I'm not sure whether the DDS spec defines any of that behavior?). Therefore I would defer to @MiguelCompany to provide more context on whether that is something they could change in the FastRTPS code base.

@MiguelCompany
Contributor

@alsora You're welcome

I was thinking that if a subscription has a depth of 1 in its QoS profile, then it should not preallocate more than 1 message.

At first I thought it was like that, but it seems it is only the case for publishers. I have created eProsima/Fast-DDS#434 so that the subscriber history behavior becomes consistent.

@alsora
Contributor Author

alsora commented Mar 5, 2019

I think we can close this, as the new Fast-RTPS version 1.7.1 solves the issue.

@alsora alsora closed this as completed Mar 5, 2019
@ubercool2

@alsora just for your info, I do not see this fix in https://github.com/eProsima/Fast-RTPS/commits/v1.7.1.
Also, the latest release of ROS 2 only includes the Fast-RTPS 1.7.0 release: https://github.com/ros2/ros2/blob/release-crystal-20190214/ros2.repos#L37.

I'd reopen this as a critical bug, since it actually crashes applications if you send point clouds or images (e.g. 1 MB messages).

@alsora alsora reopened this Mar 6, 2019
@sagniknitr

@ubercool2 are you seeing this crash while visualizing point clouds in rviz2? For example, does rviz2 crash with a segmentation fault when you subscribe to any topic publishing PointCloud2?

@dejanpan

dejanpan commented Mar 6, 2019

@sagniknitr we have this problem when running https://github.com/ApexAI/performance_test. We can see that the memory usage increases (left plot, orange line)
log_Array2m_27-02-2019_18-44-46.pdf.

Now I've talked to @MiguelCompany offline and these are the findings:

  1. The reason for the increased memory is twofold:
    1. We used KEEP_ALL instead of KEEP_LAST for the "History Kind" QoS setting. This, in combination with TRANSIENT_LOCAL, causes memory to keep growing.
    2. This bug: Made Subscriber history resource limits dependent on history depth [4782] eProsima/Fast-DDS#434.
      1. eProsima/Fast-DDS#434 will get cherry-picked as a hotfix into the FastRTPS master branch; they will release version 1.7.2 in the next TBD days.
  2. As a workaround:
    1. [ ] we would need to repeat the performance_test experiment with KEEP_LAST QoS (see the sketch after this list)
    2. [ ] then we have 2 options: a) wait for the above hotfix, or b) add to the known limitations that TRANSIENT_LOCAL QoS should not be used in the current ROS 2 Crystal Patch 2
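
A hedged sketch of the QoS change in item 2.1, expressed with the Crystal-era rmw_qos_profile_t used elsewhere in this thread (not the actual performance_test sources):

// Sketch only: a bounded keep-last history and volatile durability, instead of the
// KEEP_ALL + TRANSIENT_LOCAL combination that made memory keep growing. Pass this
// profile to create_subscription / create_publisher as in the examples above.
rmw_qos_profile_t qos = rmw_qos_profile_default;
qos.history = RMW_QOS_POLICY_HISTORY_KEEP_LAST;        // instead of KEEP_ALL
qos.depth = 10;                                        // bounded history depth
qos.durability = RMW_QOS_POLICY_DURABILITY_VOLATILE;   // instead of TRANSIENT_LOCAL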

@MiguelCompany is this correct?

@MiguelCompany
Contributor

@dejanpan

eProsima/Fast-RTPS#434 will get cherry-picked as a hotfix into the FastRTPS master branch; they will release version 1.7.2 in the next TBD days

The hotfix will get cherry-picked to master today, and patch release 1.7.2 will be out next week. Meanwhile, I have updated the 1.7.1 release notes.

then we have 2 options:
a) wait for above hotfix
b) add into known limitations that TRANSIENT_LOCAL QoS should not be used in current ROS 2 Crystal Patch 2

A third option is to use the workaround I mentioned in this comment.

@MiguelCompany
Contributor

Fast-RTPS v1.7.2 has been released and includes eProsima/Fast-DDS#434

@alsora feel free to check that the behavior is the expected one.

nuclearsandwich added a commit to ros2/ros2_documentation that referenced this issue Mar 14, 2019
Issue References:

*  ros2/rmw_fastrtps#257
*  ros2/rmw_fastrtps#258

Signed-off-by: Steven! Ragnarök <steven@nuclearsandwich.com>
nuclearsandwich added a commit to ros2/ros2_documentation that referenced this issue Mar 14, 2019
* Update the known issues section in Crystal.

Issue References:

*  ros2/rmw_fastrtps#257
*  ros2/rmw_fastrtps#258

Signed-off-by: Steven! Ragnarök <steven@nuclearsandwich.com>

* remove duplicate periods
@alsora
Contributor Author

alsora commented Mar 18, 2019

OK, I finally tested release 1.7.2 with the current ROS 2 master branch, and the issue is now fixed.

We can close it.

@alsora alsora closed this as completed Mar 18, 2019