Feature/persistent ids #242

arjo129 · 2020-12-31T06:07:51Z

This PR adds persistence to participant IDs so that fleet adapters or the schedule node may be restarted without the system losing the IDs. It does so by maintaining a transaction log. An entry is added to this log whenever a participant registers or unregisters. The log also records footprint and vicinity information. Upon a restart, the log file is read and replayed through the Database in sequential order; this automatically restores the correct IDs because IDs are issued sequentially.
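To illustrate why replaying the log in order is enough to restore IDs, here is a minimal sketch. It is not the code in this PR: `ToyDatabase`, `ToyLogEntry`, `ToyOp` and `replay` are hypothetical stand-ins, and the only property assumed is the one stated above, namely that IDs are issued sequentially.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the PR's types; all names here are illustrative.
using ToyParticipantId = std::uint64_t;

enum class ToyOp { Add, Remove };

struct ToyLogEntry
{
  ToyOp op;
  std::string name;
  std::string owner;
};

// Toy database that hands out IDs sequentially and never reuses them.
class ToyDatabase
{
public:
  using Key = std::pair<std::string, std::string>;

  ToyParticipantId register_participant(
    const std::string& name, const std::string& owner)
  {
    const Key key{name, owner};
    // The (name, owner) pair is expected to be unique among live entries.
    assert(_ids.count(key) == 0);
    const ToyParticipantId id = _next_id++;
    _ids.emplace(key, id);
    return id;
  }

  void unregister_participant(
    const std::string& name, const std::string& owner)
  {
    _ids.erase(Key{name, owner});
  }

  std::optional<ToyParticipantId> find(
    const std::string& name, const std::string& owner) const
  {
    const auto it = _ids.find(Key{name, owner});
    if (it == _ids.end())
      return std::nullopt;
    return it->second;
  }

private:
  std::map<Key, ToyParticipantId> _ids;
  ToyParticipantId _next_id = 0;
};

// Re-run every logged operation, in order, against a fresh database.
ToyDatabase replay(const std::vector<ToyLogEntry>& log)
{
  ToyDatabase database;
  for (const auto& entry : log)
  {
    if (entry.op == ToyOp::Add)
      database.register_participant(entry.name, entry.owner);
    else
      database.unregister_participant(entry.name, entry.owner);
  }
  return database;
}
```

Because removals are replayed too, the ID counter advances exactly as it did before the restart, so surviving participants get their old IDs back and new participants continue from where the counter left off.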

This PR also adds checks to enforce that the (name, owner) pair is unique during registration, as this makes it much easier to create a log file. The architecture of the PR is simple: there is a YamlLogger class which is in charge of the actual file I/O and YAML. This class inherits from AbstractParticipantLogger, which defines an interface. If one wishes to use a format other than YAML, that is also possible by simply inheriting from AbstractParticipantLogger. This is useful for testing purposes as well.

The ParticipantRegistry class is in charge of input validation and acts as a wrapper around the Database class for handling participant registration and unregistration. Registration and unregistration requests are converted to AtomicEvent objects and serialized by whatever AbstractParticipantLogger has been passed to the registry.
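A rough sketch of the layering described above, with an in-memory logger of the kind that is handy in tests. Everything here is a simplified assumption rather than the real API: `ToyEvent`, `ToyLogger`, `InMemoryLogger`, `ToyRegistry` and the `write_operation` method name are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified event; the real AtomicEvent carries more data.
struct ToyEvent
{
  std::string name;
  std::string owner;
};

// Hypothetical logger interface, playing the role of AbstractParticipantLogger.
class ToyLogger
{
public:
  virtual void write_operation(const ToyEvent& event) = 0;
  virtual ~ToyLogger() = default;
};

// Keeps events in memory instead of writing YAML to disk.
class InMemoryLogger : public ToyLogger
{
public:
  void write_operation(const ToyEvent& event) override
  {
    events.push_back(event);
  }

  std::vector<ToyEvent> events;
};

// Validates input, then logs every accepted registration.
class ToyRegistry
{
public:
  explicit ToyRegistry(std::shared_ptr<ToyLogger> logger)
  : _logger(std::move(logger))
  {
    assert(_logger != nullptr);
  }

  std::size_t add_participant(
    const std::string& name, const std::string& owner)
  {
    const auto key = std::make_pair(name, owner);
    if (_ids.count(key) > 0)
      throw std::runtime_error("Participant (name, owner) pair already exists");

    const std::size_t id = _ids.size();
    _ids.emplace(key, id);
    _logger->write_operation(ToyEvent{name, owner});
    return id;
  }

private:
  std::shared_ptr<ToyLogger> _logger;
  std::map<std::pair<std::string, std::string>, std::size_t> _ids;
};
```

A shared pointer is used here only so that a test can inspect the logger after handing it to the registry; it is not a statement about the ownership model in this PR.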

On the node side we have a minor change: during initialization it looks for the rosparam log_file_location and reads from, or creates, a log file at this location. If no location is passed, it defaults to loading from and logging to the file .rmf_schedule_node.yml.

This PR also adds scaffolding for unit-testing as I needed to test the correctness of the serialization without entering a tmux hell.
Finally, this PR adds a dependency on yaml-cpp (already used in traffic editor) so you may or may not need to run rosdep again.

Expected behaviours and failure modes

If a log file is not found, the system will create a log file.
If a log file is invalid, the system will immediately shut down and throw an error. The rationale for this is that in the event that a user feeds in the wrong file as a log file, we should not be erasing the user's data.
Updates to a participant's description are not supported yet. If a participant's description has changed, the participant should unregister first and then re-register, which gives it a different ID. This will be implemented in a separate PR (it likely needs a change within register_participant in Database.cpp, or a new update_participant call, also in Database.cpp).
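The first two behaviours can be sketched as follows. This is only an illustration under assumptions: the log in this PR is YAML, whereas the sketch swaps in a made-up line-based format (every line must start with `add `) so that it needs no YAML library, and `load_or_create_log` is a hypothetical helper name.

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper. A missing file is created empty; a file with an
// unrecognized line is rejected without being modified.
std::vector<std::string> load_or_create_log(const std::filesystem::path& path)
{
  if (!std::filesystem::exists(path))
  {
    // No log yet: create an empty one and start with no entries.
    std::ofstream created(path);
    if (!created)
      throw std::runtime_error("Unable to create log file: " + path.string());
    return {};
  }

  // The file is opened read-only, so an invalid file is never overwritten.
  std::ifstream input(path);
  if (!input)
    throw std::runtime_error("Unable to open log file: " + path.string());

  std::vector<std::string> entries;
  std::string line;
  while (std::getline(input, line))
  {
    // Made-up validity rule for this sketch: each entry starts with "add ".
    if (line.rfind("add ", 0) != 0)
      throw std::runtime_error("Invalid log entry in " + path.string());
    entries.push_back(line);
  }
  return entries;
}
```

Throwing before anything is written is what protects a file that was passed in by mistake.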

mxgrey

This is a very well designed and well implemented approach to persistent participant IDs.

Many of my review comments are nitpicks, and most of them are aimed at improving the failure modes. However, there's an important fundamental change that deserves special attention:

After thinking it over very carefully, I've come to the conclusion that we should not unregister participants from the schedule node, ever. The rmf_traffic::schedule::Database class can keep its unregister_participant function, but we should not use it for a distributed system that intends to have consistent, persistent identities for participants.

Because of the RAII design of the rmf_traffic::schedule::Participant class, we will receive an UnregisterParticipant message any time a fleet adapter is gracefully torn down, but we should not treat that as a true unregistering within the schedule node. Instead we should erase the participant's itinerary while keeping it registered. Then if that participant "registers" itself back later, we just give it back its original ID. The motivation for this is to allow fleet adapters to tear down and spin back up without their participant IDs changing.

I can't think of a compelling reason that a realistic deployment would ever want a participant to have its ID reassigned. Also, the memory savings of permanently unregistering a participant are pretty negligible; erasing the itinerary would free up roughly the same amount of resources.

I've left an inline comment that spells out the changes needed for this. Mostly it's replacing the database->unregister_participant(~) function and getting rid of the remove_participant API for the participant logger.
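As a sketch of the behaviour proposed above (not the actual change), the idea is that an unregister request only clears the itinerary, and a later registration with the same (name, owner) pair gets the original ID back. `StickyRegistry` and its method names are hypothetical, and itineraries are reduced to lists of strings for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry in which identities are never released.
class StickyRegistry
{
public:
  using Id = std::uint64_t;
  using Key = std::pair<std::string, std::string>;

  // Returns the existing ID for a known (name, owner) pair,
  // otherwise issues the next sequential ID.
  Id register_participant(const std::string& name, const std::string& owner)
  {
    const Key key{name, owner};
    const auto it = _ids.find(key);
    if (it != _ids.end())
      return it->second;

    const Id id = static_cast<Id>(_itineraries.size());
    _ids.emplace(key, id);
    _itineraries.emplace_back();
    return id;
  }

  void set_itinerary(Id id, std::vector<std::string> routes)
  {
    _itineraries.at(id) = std::move(routes);
  }

  // Called when an unregister message arrives: drop the itinerary,
  // keep the identity.
  void handle_unregister(Id id)
  {
    assert(id < _itineraries.size());
    _itineraries.at(id).clear();
  }

  std::size_t itinerary_size(Id id) const
  {
    return _itineraries.at(id).size();
  }

private:
  std::map<Key, Id> _ids;
  std::vector<std::vector<std::string>> _itineraries;
};
```

With this shape the logger only ever needs to record additions, which is why the remove operation can be dropped from its interface.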

mxgrey · 2021-01-18T05:13:59Z

rmf_traffic_ros2/src/rmf_traffic_ros2/schedule/ParticipantRegistry.cpp

+    return ParticipantDescription::Rx::Unresponsive;
+  if(response == "Responsive")
+    return ParticipantDescription::Rx::Responsive;
+  throw std::runtime_error("Responsiveness field contains invalid identifier");


Nitpick: Since Invalid is one of the identifiers, it would be more accurate to say Responsiveness field contains unknown identifier.

mxgrey · 2021-01-19T05:24:15Z

rmf_traffic_ros2/include/rmf_traffic_ros2/schedule/ParticipantRegistry.hpp

+
+  /// Removes a participant from the registry.
+  /// \param[in] id - participant to remove
+  void remove_participant(ParticipantId id);


Let's get rid of this function for now.

mxgrey · 2021-01-19T05:24:56Z

rmf_traffic_ros2/include/rmf_traffic_ros2/schedule/ParticipantRegistry.hpp

+  enum class OpType : uint8_t
+  {
+    Add = 0,
+    Remove


Let's get rid of OpType::Remove for now.

mxgrey · 2021-01-19T05:37:48Z

rmf_traffic_ros2/include/rmf_traffic_ros2/schedule/ParticipantRegistry.hpp

+
+} // end namespace rmf_traffic_ros2
+
+#endif


Nitpick: Include a newline at the end of the file.

mxgrey · 2021-01-19T05:44:48Z

rmf_traffic_ros2/src/rmf_traffic_ros2/schedule/Node.cpp

+  get_parameter_or<std::string>(
+    "log_file_location", 
+    log_file_name, 
+    ".rmf_schedule_node.yml");


Nitpick: Official docs recommend the use of .yaml as the extension name: https://yaml.org/faq.html

mxgrey · 2021-01-19T15:07:38Z

rmf_traffic_ros2/src/rmf_traffic_ros2/schedule/YamlLogger.cpp

+  Implementation(std::string file_path):
+  _file_path(file_path)
+  {
+    _counter = 0;


Suggested change

-    _counter = 0;
+    _counter = 0;
+    if (!std::filesystem::exists(file_path))
+    {
+      std::filesystem::create_directories(
+            std::filesystem::absolute(file_path).parent_path());
+      return;
+    }

If the file doesn't exist, let's create the directory here (to avoid an exception later) and skip trying to read it.

mxgrey · 2021-01-19T15:15:01Z

rmf_traffic_ros2/test/main.cpp

+#define CATCH_CONFIG_MAIN
+#include <rmf_utils/catch.hpp>
+
+// This will create the main(int argc, char* argv[]) entry point for testing


Nitpick: Add new line at the end of the file.

mxgrey · 2021-01-19T15:18:23Z

rmf_traffic_ros2/test/unit/test_ParticipantRegistry.cpp

+      }
+    }
+  }
+}


Nitpick: Add a newline at the end of the file.

mxgrey · 2021-01-19T15:26:56Z

rmf_traffic_ros2/src/rmf_traffic_ros2/schedule/YamlLogger.cpp

+YamlLogger::YamlLogger(std::string file_path): 
+  _pimpl(rmf_utils::make_unique_impl<Implementation>(file_path))
+{
+


Nitpick: I recommend putting something like // Do nothing in any function (including constructors) where the body is empty. It's a way of saying "This function is intentionally left blank".

mxgrey · 2021-01-19T15:28:05Z

rmf_traffic_ros2/test/unit/test_ParticipantRegistry.cpp

+  }
+}
+
+bool file_exists(const char *fileName)


We're using C++17 so you can start using std::filesystem::exists(~) instead of doing this.

arjo129 · 2021-01-25T05:16:05Z

@mxgrey I have addressed all the issues. I think it's ready for round 2.

mxgrey · 2021-01-26T06:52:20Z

An important detail just occurred to me. When a participant is rebooted, the schedule node will have to tell it what its last known schedule version was so that the rebooted participant can pick back up where it left off. I'm afraid this might require a small API breakage, although maybe I can come up with a clever way to work it in.

I think this will be a blocking issue for this PR, unfortunately. I'll work on a fix for this as soon as time permits.
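One way to picture the requirement is a registration response that carries the last known version alongside the ID. The sketch below is an assumption-laden illustration, not the eventual fix: `ToyRegistration`, `VersionTrackingRegistry` and the field names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical registration response; the real message fields may differ.
struct ToyRegistration
{
  std::uint64_t id;
  std::uint64_t last_itinerary_version;
};

// Hypothetical registry that remembers the latest version per participant.
class VersionTrackingRegistry
{
public:
  using Key = std::pair<std::string, std::string>;

  // A returning participant gets back its ID and its last known version.
  ToyRegistration register_participant(
    const std::string& name, const std::string& owner)
  {
    const Key key{name, owner};
    const auto it = _records.find(key);
    if (it != _records.end())
      return it->second;

    const ToyRegistration fresh{
      static_cast<std::uint64_t>(_records.size()), 0};
    _records.emplace(key, fresh);
    return fresh;
  }

  // Called whenever the participant publishes a new itinerary version.
  void record_itinerary_version(
    const std::string& name, const std::string& owner, std::uint64_t version)
  {
    const auto it = _records.find(Key{name, owner});
    assert(it != _records.end());
    it->second.last_itinerary_version = version;
  }

private:
  std::map<Key, ToyRegistration> _records;
};
```

A rebooted participant would then continue numbering from the returned version instead of starting again from zero.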

arjo129 · 2021-02-04T01:05:38Z

Alternatively, we don't actually need to change any API; we could add persistence on the fleet adapter side as well.

arjo129 added 23 commits December 22, 2020 16:03

Added Database APIs for loading items from config.

71962e2

Changes can be tracked within schedule node itself

1be0e82

Add a participant registry

21d327e

Finished parser

43c42a2

Partially seriallized

7b0fea4

Added unit tests

5fbcebe

Check idempotency of shape type

8c6d9bb

Update header

5796767

Serialize shapecontexts.

108b71f

append-only log rework in progress

5ac4be9

Finalize API

d5bcc1f

fix segfault when removing items

e978cee

remove redundant include

9081b3e

More comprehensive test for ParticipantRegistry

bd9a602

Add support for YAML logging to disk

2812507

fix parsing issues

65db90a

test normal serialization

a4ab4f2

Integrate with main node

cd76278

Undo early experimentation

74c58e9

more undos

353b0ab

clear last disturbances

1aefc03

add mutexes (required to make sure commits are in order)

69fa936

More comprehensive tests

9d3467c

arjo129 requested review from mxgrey and Yadunund January 4, 2021 05:12

mxgrey suggested changes Jan 19, 2021

arjo129 added 4 commits January 22, 2021 17:42

Address *some* of the feedback.

6171948

- Switched to unique pointers - Added new line at end of the file - Moved serialization functions into internal header. - Removed `remove()`

More expressive error messages.

7ace85d

Minor style fixes

68d32c4

exceptions on sentinel values.

3e87605

Use std::filesystem instead of hack.

e8d2a62

mxgrey added 5 commits February 24, 2021 12:38

Merge branch 'master' into feature/persistent-ids

aed4e38

Merge branch 'feature/persistent-ids' of ssh://github.com/osrf/rmf_co…

842b7e5

…re into feature/persistent-ids

Added participant restarting

4dfbb8b

Add last Route ID info to registration response

711f85f

Clean up parser implementation

2749ead

mxgrey approved these changes Feb 25, 2021

Update CHANGELOG

98231e8

Signed-off-by: Michael X. Grey <grey@openrobotics.org>

mxgrey merged commit e8dc70b into master Feb 25, 2021

mxgrey deleted the feature/persistent-ids branch February 25, 2021 06:57

mxgrey mentioned this pull request Mar 3, 2021

Allow schedule participant profiles (footprint/vicinity) to change/update #309

Closed
