
Strange RAM usage when creating subscriptions #257

Closed
alsora opened this issue Feb 15, 2019 · 19 comments · Fixed by eProsima/Fast-DDS#444
Labels
more-information-needed Further information is required

Comments

@alsora
Contributor

alsora commented Feb 15, 2019

Bug report

Required Info:

  • Operating System:
    • Ubuntu 18.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • Crystal
  • DDS implementation:
    • Fast-RTPS, RTI Connext
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

Define messages of fixed size

Array100kb.msg

byte[100000] data

Instantiate a subscription to a topic of this message type

#include <functional>  // std::bind, std::placeholders
#include <memory>
#include <string>

#include "rclcpp/rclcpp.hpp"
#include "simple_memory_test/msg/array1kb.hpp"

class SimpleSubscriberNode : public rclcpp::Node {

public:
    SimpleSubscriberNode(std::string name) : Node(name)
    {
        // Default QoS profile; only the cost of creating the subscription matters here.
        rmw_qos_profile_t custom_qos_profile = rmw_qos_profile_default;

        _sub = this->create_subscription<simple_memory_test::msg::Array1kb>(
            "chatter",
            std::bind(&SimpleSubscriberNode::simple_callback, this, std::placeholders::_1),
            custom_qos_profile);
    }

private:
    // The callback intentionally does nothing: no messages are ever published in this test.
    void simple_callback(const simple_memory_test::msg::Array1kb::SharedPtr msg)
    {
        (void)msg;
    }

    rclcpp::Subscription<simple_memory_test::msg::Array1kb>::SharedPtr _sub;
};



int main(int argc, char ** argv)
{
    rclcpp::init(argc, argv);

    std::shared_ptr<SimpleSubscriberNode> sub_node = std::make_shared<SimpleSubscriberNode>("this_node");

    rclcpp::spin(sub_node);

    rclcpp::shutdown();

    return 0;
}

Behavior

From what I have seen, the RAM required for instantiating a subscription should be more or less invariant to the message type.
Note that I'm only running a system containing that subscriber node, no other nodes are alive.

However, testing this assumption for several message sizes, I got the following:

[Plot "fastrtps_peak": physical memory usage vs. message size]

The physical memory required by the code above is larger for messages between 50 KB and 500 KB than for messages of more than 1 MB.
After the peak in physical memory (56 MB for a 450 KB message), the usage returns to the same constant value.

As for virtual memory usage, it keeps increasing considerably, reaching 1.1 GB for a 4 MB message.

This behavior does not occur when creating publishers.

This behavior does not occur when using OpenSplice DDS.

Do you have any idea why?
Could it be related to #203?

Here you can find the package I used for running the tests.
https://github.com/alsora/ros2-code-examples/tree/master/simple_memory_test

It creates an executable for each data point.
I am measuring the RAM usage with
ps aux | grep sub_nodes

The package also contains a plot.py file where the exact results of the tests are hardcoded, so you can plot them again to better inspect the graph.
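
For reference, the same numbers can also be sampled from inside the process. Below is a minimal Linux-only sketch (illustrative only, not part of the simple_memory_test package) that reads VmRSS and VmSize from /proc/self/status; calling it after create_subscription gives the same data points without an external ps.

// Minimal Linux-only sketch (not part of the simple_memory_test package): read the
// process's own VmRSS / VmSize from /proc/self/status instead of polling `ps` externally.
#include <fstream>
#include <iostream>
#include <string>

void print_memory_usage()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        // VmRSS = resident (physical) memory, VmSize = virtual memory; values are in kB.
        if (line.rfind("VmRSS:", 0) == 0 || line.rfind("VmSize:", 0) == 0) {
            std::cout << line << std::endl;
        }
    }
}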

@dirk-thomas
Member

Please clarify which exact executable you are running. When looking at the pub_node_<SIZE> sources (e.g. https://github.com/alsora/ros2-code-examples/blob/c58fa3b3b4196e5e3333f11baa0b295d008e8419/simple_memory_test/src/pub/4mb_main.cpp), the node never seems to publish any messages and just sits there idle?

@dirk-thomas dirk-thomas added the more-information-needed Further information is required label Feb 15, 2019
@alsora
Contributor Author

alsora commented Feb 15, 2019

Yes, exactly.
In the plots I'm only creating subscribers and letting them spin, without anyone publishing.
I just wanted to look at the memory requirement of calling create_subscription.

Similarly, I tested whether the issue was present for publishers, i.e. only calling create_publisher without ever publishing (sketched below).
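
A hedged sketch of what such a publisher-only node looks like (the actual sources are the pub/*_main.cpp files in the linked package; exact rclcpp signatures vary between releases):

// Illustrative publisher-only node analogous to the subscriber above (Crystal-era
// rclcpp API; the real test sources are in the linked package). The publisher is
// created but publish() is never called, so only the cost of create_publisher is measured.
#include <memory>
#include <string>

#include "rclcpp/rclcpp.hpp"
#include "simple_memory_test/msg/array1kb.hpp"

class SimplePublisherNode : public rclcpp::Node {
public:
    explicit SimplePublisherNode(std::string name) : Node(name)
    {
        _pub = this->create_publisher<simple_memory_test::msg::Array1kb>(
            "chatter", rmw_qos_profile_default);
    }

private:
    rclcpp::Publisher<simple_memory_test::msg::Array1kb>::SharedPtr _pub;
};

int main(int argc, char ** argv)
{
    rclcpp::init(argc, argv);
    rclcpp::spin(std::make_shared<SimplePublisherNode>("this_node"));
    rclcpp::shutdown();
    return 0;
}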

@dirk-thomas
Member

So when running e.g. just the pub_node_4mb executable you see an increased memory usage over time for that process? I see a constant memory usage of 9.5 MiB for that process (on 18.04 with the latest Debian packages). For sub_node_4mb the value is 9.9 MiB but also constant.

@alsora
Contributor Author

alsora commented Feb 15, 2019

Values are always constant in time.

The problem is that I would expect them to also be more or less invariant to the message type.

The value I see with sub_node_10b is 20MB.
The value I see with sub_node_450kb is 56MB.
The value I see with sub_node_4MB is 22MB.

There is a strangely high memory usage for subscriptions to messages in the 50 KB to 450 KB range.

I'm using the official Crystal docker image.

@dirk-thomas
Member

The value I see with sub_node_10b is 20MB.
The value I see with sub_node_450kb is 56MB.
The value I see with sub_node_4MB is 22MB.

Which RMW implementation are these for? FastRTPS or Connext?

@alsora
Contributor Author

alsora commented Feb 15, 2019

Fast-RTPS.

With OpenSplice the memory usage for the same tests is almost invariant.

@dirk-thomas
Member

I can't reproduce the numbers on my end - all pub/sub binaries have a memory usage of ~9-10 MiB.

One thing to keep in mind is that when creating publishers and subscribers the RMW implementation might choose to preallocate a number of message instances. That is usually limited to smaller messages, whereas larger messages might only be allocated dynamically. That might be the reason for the drop in your plot.

@alsora
Contributor Author

alsora commented Feb 16, 2019

One thing to keep in mind is that when creating publishers and subscribers the RMW implementation might choose to preallocate a number of message instances. That is usually limited to smaller messages, whereas larger messages might only be allocated dynamically. That might be the reason for the drop in your plot.

That's very interesting.
Is there a way to control how this feature works?

However, I tested the nodes again on a fresh Ubuntu 18.04 system, installing the latest Debian packages, and I still see the issue.
Moreover, I never get less than 20 MB of memory usage.

I first noticed this problem on an ARMv7 platform, where I was still getting 11 MB for most of the nodes, except for a subscription to a statically allocated 77 KB message, which was taking 18 MB.

@MiguelCompany
Contributor

Hi @alsora

Fast-RTPS will by default pre-allocate space for 100 message instances. You can change this behavior using an XML file. By default, Fast-RTPS looks for DEFAULT_FASTRTPS_PROFILES.xml in the current working directory, but a full path and filename can be specified using the environment variable FASTRTPS_DEFAULT_PROFILES_FILE. You can find the documentation on the subscriber profile format here.

You could start with this simple example:

<?xml version="1.0" encoding="UTF-8" ?>
<dds>
    <profiles>
        <!--
            Please note that rmw_fastrtps will only use the profile marked as default,
            so the is_default_profile attribute here should be true.
        -->
        <subscriber profile_name="test_subscriber_profile" is_default_profile="true">
            <topic>
                <resourceLimitsQos>
                    <!-- Number of message instances initially allocated. Default is 100. -->
                    <allocated_samples>0</allocated_samples>

                    <!-- 
                        Maximum number of message instances allocated. 0 means no limit. 
                        Default is 5000. 
                    -->
                    <max_samples>0</max_samples>
                </resourceLimitsQos>
            </topic>
            
            <!--
                For each message instance, this indicates whether memory for the serialized
                payload is preallocated or not. On ROS 2, this will be overridden to
                PREALLOCATED_WITH_REALLOC unless 'RMW_FASTRTPS_USE_QOS_FROM_XML=1' is defined
                in the environment.
                Possible values:
                   DYNAMIC                     Memory is not allocated until needed.
                   PREALLOCATED                Memory is allocated when message instances are
                                               created and is not allowed to grow. Useful for
                                               fixed-size messages.
                   PREALLOCATED_WITH_REALLOC   Memory is allocated when message instances are
                   (default)                   created, based on estimates of the maximum size
                                               of the serialized payload, but it is allowed to
                                               grow upon serialization. Useful for dynamic-size
                                               messages.
            -->
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>
    </profiles>
</dds>
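
As a usage note (the file path below is hypothetical): the profile has to be visible before the RMW layer is initialized, so either place DEFAULT_FASTRTPS_PROFILES.xml in the working directory or export FASTRTPS_DEFAULT_PROFILES_FILE before launching the node. A minimal sketch of doing the latter from the program itself:

// Sketch only: point Fast-RTPS at the XML profile above from inside the program.
// The path below is hypothetical; exporting FASTRTPS_DEFAULT_PROFILES_FILE in the
// shell before launching the node works just as well. Either way, it must be set
// before the RMW layer is initialized (i.e. before rclcpp::init / node creation).
#include <cstdlib>   // setenv (POSIX)
#include <memory>

#include "rclcpp/rclcpp.hpp"

int main(int argc, char ** argv)
{
    setenv("FASTRTPS_DEFAULT_PROFILES_FILE", "/tmp/fastrtps_profiles.xml", 1);

    rclcpp::init(argc, argv);
    rclcpp::spin(std::make_shared<rclcpp::Node>("profile_test_node"));
    rclcpp::shutdown();
    return 0;
}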

@alsora
Contributor Author

alsora commented Feb 19, 2019

@MiguelCompany Thank you very much.
Changing allocated_samples to 1 worked perfectly.

Indeed, in my plot the difference in memory between a subscription to a 100 KB message and one to a 10-byte message was approximately 10 MB (100 KB × 100 pre-allocated instances).

@dirk-thomas are there any plans to better integrate this feature?
I was thinking that if a subscription has a depth of 1 in its QoS profile, then it should not preallocate more than 1 message (sketched below).
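
For illustration, a hedged sketch at the API level: a drop-in variant of the SimpleSubscriberNode constructor above that requests a keep-last history of depth 1 (whether the RMW layer actually derives its preallocation from this depth is the open question).

// Drop-in variant of the SimpleSubscriberNode constructor above (Crystal-era rclcpp
// API): request a keep-last history with depth 1. Whether the RMW layer derives its
// preallocation from this depth is the open question raised here.
SimpleSubscriberNode(std::string name) : Node(name)
{
    rmw_qos_profile_t custom_qos_profile = rmw_qos_profile_default;
    custom_qos_profile.history = RMW_QOS_POLICY_HISTORY_KEEP_LAST;
    custom_qos_profile.depth = 1;

    _sub = this->create_subscription<simple_memory_test::msg::Array1kb>(
        "chatter",
        std::bind(&SimpleSubscriberNode::simple_callback, this, std::placeholders::_1),
        custom_qos_profile);
}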

@dirk-thomas
Member

are there any plans to better integrate this feature?

Since it works well with other RMW implementations, it seems to be a FastRTPS-internal implementation decision (I'm not sure whether the DDS spec defines any of that behavior?). Therefore I would defer to @MiguelCompany to provide more context on whether that is something they could change in the FastRTPS code base.

@MiguelCompany
Contributor

@alsora You're welcome

I was thinking that if a subscription has a depth of 1 in its QoS profile, then it should not preallocate more than 1 message.

At first I thought it was like that, but it seems it is only the case for publishers. I have created eProsima/Fast-DDS#434 so that the subscriber history behavior becomes consistent.

@alsora
Contributor Author

alsora commented Mar 5, 2019

I think we can close this, as the new Fast-RTPS version 1.7.1 solves the issue.

@alsora alsora closed this as completed Mar 5, 2019
@ubercool2

@alsora just for your info, I do not see this fix in https://github.com/eProsima/Fast-RTPS/commits/v1.7.1.
Also, the latest release of ROS 2 only includes the Fast-RTPS 1.7.0 release: https://github.com/ros2/ros2/blob/release-crystal-20190214/ros2.repos#L37.

I'd reopen this as a critical bug, since it actually crashes applications if you send point clouds or images (e.g. 1 MB messages).

@alsora alsora reopened this Mar 6, 2019
@sagniknitr

@ubercool2 are you seeing this crash while visualizing point clouds in rviz2? For example, does rviz2 crash with a segmentation fault when you subscribe to any topic publishing PointCloud2?

@dejanpan

dejanpan commented Mar 6, 2019

@sagniknitr we have this problem when running https://github.com/ApexAI/performance_test. We can see that the memory usage increases (left plot, orange line)
log_Array2m_27-02-2019_18-44-46.pdf.

Now I've talked to @MiguelCompany offline and these are the findings:

  1. The reason for the increased memory is twofold:
    1. We used KEEP_ALL instead of KEEP_LAST for the "History Kind" QoS setting. This, in combination with TRANSIENT_LOCAL, causes memory to keep growing.
    2. This bug: Made Subscriber history resource limits dependent on history depth [4782] eProsima/Fast-DDS#434.
      1. eProsima/Fast-DDS#434 will get cherry-picked as a hotfix into the FastRTPS master branch; they will release version 1.7.2 in the next TBD days.
  2. As a workaround:
    1. [ ] we would need to repeat the performance_test experiment with KEEP_LAST QoS (see the sketch after this list)
    2. [ ] then we have 2 options: a) wait for the above hotfix, or b) add to the known limitations that TRANSIENT_LOCAL QoS should not be used in the current ROS 2 Crystal Patch 2
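
A hedged sketch of the QoS change in item 2.1, expressed with the Crystal-era rmw_qos_profile_t used elsewhere in this thread (not the actual performance_test sources):

// Sketch only: a bounded keep-last history and volatile durability, instead of the
// KEEP_ALL + TRANSIENT_LOCAL combination that made memory keep growing. Pass this
// profile to create_subscription / create_publisher as in the examples above.
rmw_qos_profile_t qos = rmw_qos_profile_default;
qos.history = RMW_QOS_POLICY_HISTORY_KEEP_LAST;        // instead of KEEP_ALL
qos.depth = 10;                                        // bounded history depth
qos.durability = RMW_QOS_POLICY_DURABILITY_VOLATILE;   // instead of TRANSIENT_LOCAL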

@MiguelCompany is this correct?

@MiguelCompany
Contributor

@dejanpan

eProsima/Fast-RTPS#434 will get cherry-picked as a hotfix into the FastRTPS master branch; they will release version 1.7.2 in the next TBD days

The hotfix will get cherry-picked to master today, and patch release 1.7.2 will be out next week. Meanwhile, I have updated the 1.7.1 release notes.

then we have 2 options:
a) wait for above hotfix
b) add into known limitations that TRANSIENT_LOCAL QoS should not be used in current ROS 2 Crystal Patch 2

A third option is to use the workaround I mentioned in this comment.

@MiguelCompany
Contributor

Fast-RTPS v1.7.2 has been released and includes eProsima/Fast-DDS#434

@alsora feel free to check that the behavior is the expected one.

nuclearsandwich added a commit to ros2/ros2_documentation that referenced this issue Mar 14, 2019
Issue References:

*  ros2/rmw_fastrtps#257
*  ros2/rmw_fastrtps#258

Signed-off-by: Steven! Ragnarök <steven@nuclearsandwich.com>
nuclearsandwich added a commit to ros2/ros2_documentation that referenced this issue Mar 14, 2019
* Update the known issues section in Crystal.

Issue References:

*  ros2/rmw_fastrtps#257
*  ros2/rmw_fastrtps#258

Signed-off-by: Steven! Ragnarök <steven@nuclearsandwich.com>

* remove duplicate periods
@alsora
Contributor Author

alsora commented Mar 18, 2019

OK, I finally tested release 1.7.2 with the current ROS 2 master branch, and the issue is now fixed.

We can close it.

@alsora alsora closed this as completed Mar 18, 2019