Support static topologies #126

jpsamper2009 · 2017-10-23T21:09:45Z

Feature request

Allow users to define a static topology during rmw_init, such that the system will allocate only as many resources as specified by the user.

Motivation

Currently, ROS2 does not support the use of a static topology, i.e. one in which the user specifies exactly how many publishers, subscribers, topics, etc. a node needs, and the system allocates exactly that many: no more, no less. While Fast-RTPS and RTI Connext Pro do not require this feature, since they allow for dynamic allocation, I foresee that with the introduction of DDS-XRCE and the use of ROS in embedded devices, we will eventually need to statically allocate resources during initialization.
While it is possible to over-allocate resources, this solution is obviously not ideal (especially if you are working on a low-resource device).

Feature description

We see two ways to implement this: 1) creating rmw_pre_init() and rmw_post_init() functions, or 2) passing a parameter to rmw_init that would be used to specify the topology.
e.g.:
Option 1

rmw_ret_t rmw_pre_init(void * context_ptr) // not necessarily void *
{
  // pre-init stuff here
  return RMW_RET_OK;
}

rmw_ret_t rmw_post_init(void * context_ptr) // not necessarily void *
{
  // post-init stuff here
  return RMW_RET_OK;
}

int main(void)
{
  void * context_ptr = ...;
  rmw_ret_t retval = rmw_pre_init(context_ptr);
  // Check for and handle errors
  retval = rmw_init();
  if (retval == RMW_RET_OK) {
    retval = rmw_post_init(context_ptr);
    // Check for and handle errors
  } else {
    // error handling...
  }

  return 0;
}

Option 2

rmw_ret_t entry_callback(void * caller_context_ptr)
{
  // do pre-init stuff
  return RMW_RET_OK;
}

rmw_ret_t exit_callback(void * caller_context_ptr, rmw_ret_t current_rmw_init_status)
{
  // do post-init stuff
  return RMW_RET_OK;
}

int main(void)
{
  rmw_init_callbacks_t init_callbacks = {
    .m_caller_context_ptr = NULL,
    .m_callback_at_rmw_init_entry = &entry_callback,
    .m_callback_at_rmw_init_exit = &exit_callback
  };

  rmw_ret_t ret = rmw_init(&init_callbacks); // new rmw_init, takes a pointer to the callbacks
  // error handling...

  return 0;
}

rmw_ret_t rmw_init(const rmw_init_callbacks_t * callback_ptr) // new rmw_init
{
  rmw_ret_t retval = RMW_RET_ERROR;
  retval = callback_ptr->m_callback_at_rmw_init_entry(callback_ptr->m_caller_context_ptr);
  // check for and handle errors
  retval = old_rmw_init(); // old rmw_init
  // check for and handle errors
  retval = callback_ptr->m_callback_at_rmw_init_exit(callback_ptr->m_caller_context_ptr, retval);
  // check for and handle errors
  return retval;
}

** Credit to @wjwwood @serge-nikulin for the original examples (which I may have bastardized when I elaborated on them)

Implementation considerations

Option 1
- Pros:
  1. Backwards-compatible (if you haven't needed to do any pre-init or post-init steps, your code won't have to change).
- Cons:
  1. User code written initially for a dynamic topology will require some changes before it is compatible with the static DDS implementation.
Option 2
- Pros:
  1. When switching from a dynamic to a static topology, the user code will not have to change since the rmw implementation will handle the static allocation of resources. If the users have an idea of what resources are needed, they will be able to specify this in a configuration file, and if they don't, then there can be a default over-allocating configuration that can later be tuned for the specific use-case.
- Cons:
  1. Not backwards-compatible.
At a high level, for the static topology you would need to specify a graph of how your nodes are connected and what kind of resources they need to communicate (on an edge-by-edge basis), e.g.:
- Number of nodes
- Per node:
  - Number of publishers
  - Number of subscribers
  - Topics
- Per publisher/subscriber:
  - Number of Writers/Readers
  - Number of samples
- Routes between nodes
- QoS settings?
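
To make the list above a bit more concrete, here is a minimal sketch of how such a description could be laid out as plain C data. All type and function names (`static_endpoint_t`, `static_node_t`, `static_topology_t`, `static_topology_total_samples`) are hypothetical and not part of rmw; routes and QoS settings are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one publisher or subscriber endpoint. */
typedef struct {
  const char * topic_name;  /* topic this endpoint is attached to */
  size_t max_samples;       /* number of samples to reserve for it */
} static_endpoint_t;

/* Hypothetical per-node resource description. */
typedef struct {
  const char * node_name;
  const static_endpoint_t * publishers;
  size_t publisher_count;
  const static_endpoint_t * subscribers;
  size_t subscriber_count;
} static_node_t;

/* Hypothetical whole-system topology. */
typedef struct {
  const static_node_t * nodes;
  size_t node_count;
} static_topology_t;

/* Total number of samples that would have to be reserved up front. */
size_t static_topology_total_samples(const static_topology_t * topology)
{
  assert(topology != NULL);
  size_t total = 0;
  for (size_t i = 0; i < topology->node_count; ++i) {
    const static_node_t * node = &topology->nodes[i];
    for (size_t p = 0; p < node->publisher_count; ++p) {
      total += node->publishers[p].max_samples;
    }
    for (size_t s = 0; s < node->subscriber_count; ++s) {
      total += node->subscribers[s].max_samples;
    }
  }
  return total;
}
```

An implementation could walk a structure like this once during initialization to size its buffers, instead of growing them on demand.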

@dejanpan @dirk-thomas @wjwwood @serge-nikulin FYI

dirk-thomas · 2017-10-23T21:32:32Z

Before diving into the various ways these arguments could be passed through the API, I want to take a step back and ask the question if this is really a programming time decision. I can imagine scenarios where I program a few nodes but don't need to limit the resources on my development machine. Only when deploying a set of nodes on a target machine (which has to deal with constrained resources) would I like to define the upper bounds. This kind of use case sounds to me that it should be possible to provide such configuration from the "outside" (instead of within the code calling rmw_init).

jpsamper2009 · 2017-10-24T19:10:03Z

@dirk-thomas

[is] this really a programming time decision?

I would argue yes, since one has to write one's code in such a way that it will be runnable on the target machine. A simple analogy is that if a developer is using a microprocessor with 16-bit registers, she may want to write her code using 16-bit (or smaller) numbers, even though she is developing on a 64-bit system. Similarly, even if the development machine has 64GB of RAM, she may want to run programs that have a limited, pre-allocated amount of memory because the target system has a limited amount of memory. This ability to limit can be especially useful if the developer is trying to spec out what kind of micro-controller she needs, for example.

This kind of use case sounds to me that it should be possible to provide such configuration from the "outside" (instead of within the code calling rmw_init).

I agree that it could be provided outside of rmw_init (one suggestion was an rmw_pre_init function), but ideally the API can be standardized across rmw implementations, such that ROS2 retains its modularity across middlewares. My thought is that the current rmw implementations (RTI Connext and Fast-RTPS) both assume that the target machine has the ability to dynamically allocate memory, and therefore ROS2 ends up with functionality, such as dynamic discovery, which will not be compatible with embedded systems. By having the API standardized, users will have a placeholder for the moment in which they decide to start optimizing their code for an embedded system: at first, for example, they'll want to be able to switch between a dynamic rmw implementation and an over-allocated static rmw implementation to see if their programs still work.

dirk-thomas · 2017-10-24T19:35:30Z

I agree that it could be provided outside of rmw_init (one suggestion was an rmw_pre_init function)

With "outside" I was aiming for outside-of-code, e.g. a configuration file. Connext allows to specify those limits in an xml file afaik.

serge-nikulin · 2017-10-24T19:39:02Z

File systems and files very often do not exist on production controllers.

Connext allows to specify those limits in an xml file afaik.

RTI does not recommend using this mechanism for production. It's for R&D only.

iluetkeb · 2017-10-25T08:38:00Z

Firstly, I support this feature request in general.

File systems and files very often do not exist on production controllers.

I surmise that "configuration file" was short-hand for "configuration as data" (as opposed to code, which seems to have been the initial suggestion).

I agree with @dirk-thomas that this seems to be something which could be handled with configuration data supplied "from the outside". In my opinion, this has many advantages:

- it would be easy to adapt to different devices (which might require differently sized buffers, etc.)
- it separates the specification from the implementation, so it would allow both implementations which pre-allocate everything on process start-up, as well as implementations which do so at compile or link time.

However, I'm not sure whether it is sufficient and/or advisable to use the underlying DDS implementation's mechanisms. For one thing, ROS2 might require internal buffers, or other things which also need to be statically allocated. I would prefer that ROS2 defines the configuration format itself.

wjwwood · 2017-10-26T02:20:23Z

These are my takeaways from this discussion so far:

- I also support this kind of API in rmw, but it would probably be needed at the client library level as well.
- The limits should be expressed in terms of ROS primitives, e.g. 1 ROS node, 2 publishers, 1 subscription, etc., rather than in terms of the underlying implementation's primitives.
  - else we cannot pre-allocate the ros structures; using one of the DDS configurations will only pre-allocate their stuff (may or may not be sufficient)
- You need to be able to do it programmatically before you can do it with an external configuration file (dog fooding).
  - if you read from a file you need to somehow make the contents affect the runtime, that API might as well be public too
- It should be possible to express these limits (or not) programmatically without modifying a typical example's statements or interleaving new statements into existing code.
  - i.e. you can put new functions before or after the content of main, but not within it
  - that means no new arguments to rmw_init or rmw_create_node, etc.
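
A minimal sketch of what the last two points could look like, assuming limits are stored process-wide and set by one call placed before the otherwise unchanged body of main. The names `ros_resource_limits_t`, `ros_set_resource_limits` and `example_create_node` are hypothetical; `example_create_node` only stands in for an existing call whose signature stays the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits expressed in ROS primitives, not DDS ones. */
typedef struct {
  size_t max_nodes;
  size_t max_publishers;
  size_t max_subscriptions;
} ros_resource_limits_t;

/* Process-wide state; limits are ignored until they have been set. */
static ros_resource_limits_t g_limits = {0, 0, 0};
static bool g_limits_set = false;
static size_t g_nodes_in_use = 0;

/* Hypothetical hook, called before the unchanged body of main(). */
void ros_set_resource_limits(const ros_resource_limits_t * limits)
{
  assert(limits != NULL);
  g_limits = *limits;
  g_limits_set = true;
}

/* Stand-in for a create-node call whose signature does not change. */
bool example_create_node(void)
{
  if (g_limits_set && g_nodes_in_use >= g_limits.max_nodes) {
    return false;  /* would exceed the pre-declared limit */
  }
  ++g_nodes_in_use;
  return true;
}
```

Code that never calls `ros_set_resource_limits` behaves as before, which is the backwards-compatibility property asked for above. Only the node counter is enforced here; publishers and subscriptions would follow the same pattern.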

is really a programming time decision.

Whether or not it is a programming time decision depends on how it is implemented, imo. If you want to do static initialization on the stack or data section (what @iluetkeb described as compile/link time), then you need to change the code (unless maybe you do some fancy linking stuff). However, if it is fine to have a period of time in the runtime of the program where memory allocation is acceptable (what @iluetkeb described as "pre-allocation on process start-up"), then it could be done after programming/compile time.

As I said in one of my bullets above, it needs to be doable programmatically and it needs to be expressed in terms of ROS 2 entities, so whether or not we take in settings from the filesystem or argv or eeprom doesn't really matter to me.

As I see it there are a few questions (mostly decoupled from one another) that we need to answer before making/judging concrete proposals:

- Is pre-allocation at process startup sufficient or does it have to be done at compile time?
  - If compile time, do we also need to support pre-allocation at startup?
  - If the answer is compile time, then that makes external configuration hard/impossible in my opinion.
- What happens when the underlying rmw implementation does not support pre-allocation, i.e. do we abort or just pre-allocate the ros structures as much as possible? Is that useful?

Then there are many technical questions like:

- How do we communicate these settings and/or pre-allocated resources to the rmw functions without passing them directly/optionally?
  - either global settings/storage or we start passing a context object to everything (we've discussed doing the latter in the past)
  - but what does that state/structure look like?
  - the current proposal in the original post only allows you an opportunity to configure the underlying middleware's resource limits before ros tries to utilize or create them, but it does not let you pre-allocate the ros ones
- How do we change the rmw functions (and allocation functions) in a way that they use the pre-allocated ros structures?
- What do we currently allow to be dynamically sized which would not be possible anymore and how do we handle that?
  - e.g. there are "soft" limits on the node name/namespace and topic names but storage for them is dynamically allocated on demand
  - many structures have something like PIMPL pointers, which cannot be pre-allocated on the stack atm, how to hook into that?
- Do we need to assert (at compile time or runtime) that bounded sized messages are being used?
- Do we need a message structure loaning system for the ros messages?
- other things, probably
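
To illustrate the questions about dynamically sized names and PIMPL-style storage, here is a rough sketch of one possible direction: implementation structs come from a fixed-capacity pool in the data section and names get a hard upper bound. The limits and all names (`EXAMPLE_MAX_NODES`, `EXAMPLE_MAX_NAME_LENGTH`, `example_node_impl_t` and the acquire/release functions) are made up for this example and do not exist in rmw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_NODES 4         /* hypothetical compile-time limit */
#define EXAMPLE_MAX_NAME_LENGTH 32  /* hypothetical hard limit, including the terminator */

/* Stand-in for the implementation data usually hidden behind a PIMPL pointer. */
typedef struct {
  char name[EXAMPLE_MAX_NAME_LENGTH];
  bool in_use;
} example_node_impl_t;

/* Storage lives in the data section; nothing is taken from the heap. */
static example_node_impl_t g_node_pool[EXAMPLE_MAX_NODES];

/* Returns a free slot, or NULL if the pool is exhausted or the name is too long. */
example_node_impl_t * example_node_impl_acquire(const char * name)
{
  assert(name != NULL);
  if (strlen(name) >= EXAMPLE_MAX_NAME_LENGTH) {
    return NULL;
  }
  for (size_t i = 0; i < EXAMPLE_MAX_NODES; ++i) {
    if (!g_node_pool[i].in_use) {
      strcpy(g_node_pool[i].name, name);  /* length was checked above */
      g_node_pool[i].in_use = true;
      return &g_node_pool[i];
    }
  }
  return NULL;
}

/* Hands a slot back to the pool so it can be reused. */
void example_node_impl_release(example_node_impl_t * impl)
{
  assert(impl != NULL);
  impl->in_use = false;
}
```

The "soft" name limit becomes a hard one in this scheme, which is exactly the kind of behaviour change the question above is asking about.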

If we can begin to answer some of the questions, then an implementation will be easier to propose/review in my opinion.

serge-nikulin · 2017-11-07T18:37:28Z

@wjwwood,

I’d be much more comfortable answering your questions after trying to do the actual code hardening.
Nevertheless, see my current opinion on two of your questions below. This opinion might change later on.

Is pre-allocation at process startup sufficient or does it have to be done at compile time?

Process startup is OK.

What happens when the underlying rmw implementation does not support pre-allocation, i.e. do we abort or just pre-allocate the ros structures as much as possible? Is that useful?

Our choices:

- If we invoke an explicit pre-allocate RMW function, it should fail in a non-static RMW provider.
- If we invoke an explicit pre-allocate RMW function, it returns OK in a non-static RMW provider.

I don't have a hard preference between the two: a non-static provider should not appear in a safety-critical app and hence any choice is good. We could have a QoS policy to select either behavior or a bit flag in rclcpp::init (and preserve backward compatibility). This bit set parameter could serve a lot (<=63) of other uses later on.

/// Initialize communications via the rmw implementation and set up a global signal handler.
/**
 * \param[in] argc Number of arguments.
 * \param[in] argv Argument vector. Will eventually be used for passing options to rclcpp.
 * \param[in] bit_flags A set of bit flags that control various run time aspects.
 */
RCLCPP_PUBLIC
void
init(int argc, char * argv[], uint64_t bit_flags = 0ULL);
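
As a rough illustration of how one such bit could select between the two behaviors, here is a small C sketch. The flag name, the return codes and `example_preallocate` are all hypothetical; a real version would use rmw_ret_t and whatever pre-allocation entry point ends up being defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical return codes, standing in for rmw_ret_t values. */
typedef enum {
  EXAMPLE_RET_OK = 0,
  EXAMPLE_RET_UNSUPPORTED = 1
} example_ret_t;

/* Hypothetical flag: bit 0 of the 64-bit flag set. */
#define EXAMPLE_FLAG_STRICT_PREALLOCATION (UINT64_C(1) << 0)

/* Hypothetical pre-allocation entry point of an RMW provider. */
example_ret_t example_preallocate(bool provider_is_static, uint64_t bit_flags)
{
  if (provider_is_static) {
    /* A static provider would reserve its resources here. */
    return EXAMPLE_RET_OK;
  }
  if ((bit_flags & EXAMPLE_FLAG_STRICT_PREALLOCATION) != 0) {
    /* Strict mode: a non-static provider reports the request as a failure. */
    return EXAMPLE_RET_UNSUPPORTED;
  }
  /* Lenient mode: a non-static provider treats the request as a no-op. */
  return EXAMPLE_RET_OK;
}
```

With the default of 0 the call stays a no-op on dynamic providers, so existing callers are unaffected.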

gbiggs · 2017-11-16T03:56:44Z

Is pre-allocation at process startup sufficient or does it have to be done at compile time?

I think this is decided by the sort of microprocessor you want to include in your target systems. There are still many micros in use that don't have an OS. Do they support memory allocation in any way (e.g. OS-like, or a hack by the micro's compiler to mimic allocation)? If there is no way to "allocate" memory except when the code is compiled, then you must go with the latter option to support those platforms.

If compile time, do we also need to support pre-allocation at startup?

Having both options available would be the most flexible. If I am targeting a micro that requires it be done at compile time, then obviously that's the route I take, but if my micro can do allocation at run-time then I may prefer the option of pre-allocation at startup combined with a chunk of configuration data, because that may be more flexible.

If the answer is compile time, then that makes external configuration hard/impossible in my opinion.

Yes, I think I agree that if it's compile time, it would need to be in the code. Otherwise some kind of custom pre-processor would be needed and that significantly raises maintenance costs for the rmw implementer.

What happens when the underlying rmw implementation does not support pre-allocation, i.e. do we abort or just pre-allocate the ros structures as much as possible? Is that useful?

For compile-time, obviously the compilation fails. For run-time, I think it would be better to abort than to keep on going and potentially give the developer the wrong idea about what was pre-allocated.

How do we communicate these settings and/or pre-allocated resources to the rmw functions without passing them directly/optionally? either global settings/storage or we start passing a context object to everything (we've discussed doing the latter in the past)

I am generally in favour of using a context object because I think it is a cleaner architectural solution that avoids nasty problems that globals can introduce. But I accept the argument that a global declared and accessible internally leads to an easier public API and reduces the chance of errors made by the developer.
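
For comparison with the global approach, a minimal sketch of the context-object style, assuming the limits and usage counters travel with a struct that is handed to every call. `example_context_t` and `example_create_publisher` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context that carries limits and usage counters explicitly. */
typedef struct {
  size_t max_publishers;
  size_t publishers_in_use;
} example_context_t;

/* Every call receives the context, so no hidden global state is needed. */
bool example_create_publisher(example_context_t * context)
{
  assert(context != NULL);
  if (context->publishers_in_use >= context->max_publishers) {
    return false;  /* this context has no publisher slots left */
  }
  ++context->publishers_in_use;
  return true;
}
```

Two contexts with different limits can coexist in one process this way, at the cost of one extra argument on every call.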

What do we currently allow to be dynamically sized which would not be possible anymore and how do we handle that? e.g. there are "soft" limits on the node name/namespace and topic names but storage for them is dynamically allocated on demand

If the number of nodes and their names are known at compile time, then it should be possible to pre-allocate exactly what is needed, even if that information comes from a separate configuration data chunk. But that would lead to information duplication which is not so nice.

Do we need to assert (at compile time or runtime) that bounded sized messages are being used?

I think we should.
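
One way this could look in C11, as a sketch only: the message type uses a fixed-capacity array, a static_assert rejects types that outgrow a per-sample budget at compile time, and a small function checks the variable part at run time. The budget, the message type and the function name are invented for this example.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_MESSAGE_SIZE 64  /* hypothetical budget per sample, in bytes */

/* Hypothetical bounded message: fixed-capacity array plus a length field. */
typedef struct {
  uint8_t data[32];
  uint32_t length;  /* number of valid bytes in data */
} example_bounded_msg_t;

/* Compile-time check: the build fails if the message outgrows the budget. */
static_assert(sizeof(example_bounded_msg_t) <= EXAMPLE_MAX_MESSAGE_SIZE,
  "message type exceeds the pre-allocated sample size");

/* Run-time check for the variable part of the message. */
int example_msg_is_valid(const example_bounded_msg_t * msg)
{
  return msg != NULL && msg->length <= sizeof(msg->data);
}
```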

iluetkeb · 2017-11-16T09:03:35Z

I think this is decided by the sort of microprocessor you want to include in your target systems.

From our side, for fairly classical mobile robotics use cases, we're currently looking at ARM M3/M4 class micro-controllers (in other words, fairly powerful ones), and the RTOSes we're considering in the first step do support memory allocation.

I wouldn't rule out smaller devices forever, but I don't see a pull on that, yet. Also, given their other constraints, I would expect that a first step in that direction would start from an independent implementation which can talk DDS-XRCE. This means I don't think we need to consider those requirements for the ROS2 rcl.

btw... I don't think we need to hold up the ROS2 release for all this stuff. Not sure what you guys think, but I think the options being discussed currently can all be introduced in a backwards compatible way.

wjwwood added the question Further information is requested label Feb 22, 2018

wjwwood added this to the untargeted milestone Feb 22, 2018

wjwwood added enhancement New feature or request and removed question Further information is requested labels Feb 22, 2018

wjwwood mentioned this issue Nov 8, 2018

refactor init to allow options to be passed and to not be global #154

Merged
