Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction based sqlite3 inserts #225

Merged
merged 25 commits into from Apr 1, 2020
Merged

Conversation

sriramster
Copy link
Contributor

Hi @Karsten1987 in continuation with the discussion we've had about using trasactions.

This is a very high level implementation of how the code looks. Plaes have a look and let me know your thoughts.

https://gist.github.com/sriramster/2f9bdd620730855e6172c7136b56d6fc

Sriram

Copy link
Collaborator

@Karsten1987 Karsten1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good start. I'll give it a try ASAP and see if I can extract some meaningful numbers and stats.

For the time being, a few remarks. Also please make sure that apart from colcon build ... you also run colcon test to address linters and style guides throughout your code.

@sriramster
Copy link
Contributor Author

Hi @Karsten1987 Thank you for the review and your time. I've updated some pieces based on your comments.

Let me know if you've a test script, and I'd be more than happy to test and share the numbers with you.

@Karsten1987 Karsten1987 mentioned this pull request Dec 13, 2019
@sriramster
Copy link
Contributor Author

Hi @Karsten1987 the branch compiles with master. Could you please have a look?

@sriramster
Copy link
Contributor Author

Hi @Karsten1987 Have you had time to look at the above patch set?

@sriramster
Copy link
Contributor Author

Hi Karsten,

Just trying to be on your github notifications, for this pull request. Have you had time to review this? Do I need to rework anywhere?

@Karsten1987
Copy link
Collaborator

I'll revisit it by the end of today. For the time being, could you rebase your PR on top of master? It looks like there are conflicting files.

@sriramster
Copy link
Contributor Author

@Karsten1987 done. Rebased.

Copy link
Collaborator

@Karsten1987 Karsten1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments.
We have to polish this a bit more before we can merge though. I am thinking about the ifdef and the hardcoded 10000 here.

@sriramster
Copy link
Contributor Author

Hi @Karsten1987

Fixed the points mentioned above, except for few things.

  • Warning user about active transaction. Do we need this? Since transactions are controlled by the plugin layer. And a user can not interact with a transaction, active or inactive. But it might be a point to debug if this why you intend to have it. I'll add it.
  • Adding the transaction number as a part of storage options, makes us change the open() API, from the base interfaces. Do you mind if I overload the function open() with a different control flow? The problem with this approach is, separation of reader and writer flow control APIs from class loading to sqlite changes.

Please let me know your thoughts on this. I'll update the review accordingly

Sorry for the delay, could'nt work over the weekends.

Copy link
Collaborator

@Karsten1987 Karsten1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a few outstanding comments.

Warning user about active transaction. Do we need this? Since transactions are controlled by the plugin layer. And a user can not interact with a transaction, active or inactive. But it might be a point to debug if this why you intend to have it. I'll add it.

I see your point here. If the transaction handling is completely handled internally, we don't need to do that (previously you had the activate/deactivate transition functions part of the public interface). We might however log to DEBUG instead.

Adding the transaction number as a part of storage options, makes us change the open() API, from the base interfaces. Do you mind if I overload the function open() with a different control flow? The problem with this approach is, separation of reader and writer flow control APIs from class loading to sqlite changes.

We might want to change the io interface then with a struct like rosbag2_storage::IOOptions or rosbag2_storage::IOConfig which includes the existing IOFlags and can then be extended with something like a chunk_size or any generic term which fits to the transactions size.

@sriramster
Copy link
Contributor Author

Hi @Karsten1987 I've cleaned up a bit focussing your review comments. And working on the chunk_size/transaction_size as an option. Kind of stuck at point, just updating you. Have'nt given up yet here.

@sriramster
Copy link
Contributor Author

Hi @Karsten1987 I've pushed in the changes as suggested by you. Could you please have a look and let me know if they fit into the model.

@Karsten1987
Copy link
Collaborator

@sriramster thanks a lot for not giving up on this ;-)

Up front: This PR has to be rebased once more as there're some merge conflicts.

I am seeing two potential ways on how to integrate a generic chunk_size option.

  • The SQLite plugin (in fact, every plugin) gets the chunk size passed in when being opened as read_write. It's then in the plugin's responsibility to deal appropriately with it. The open_read_write() function has therefore to be enhanced. I think this is what you've currently implemented in this PR.
  • We deal with chunk sizes exclusively in the rosbag2_cpp API. That is, we keep implement a cache in the c++ API which holds all incoming messages until it's succeeding the chunk size and only then asks the storage API to write it to disk. We therefore have to enhance the writer interface with an additional write(std::vector<SerializedBagMessage>) method which writes multiple messages at once.

The first option would give more leeway to each individual plugin to come up with an optimal way of dealing with chunk sizes. The second one, however, eases the burden of each plugin as the logic has to be implemented only once and is therefore less error-prone.

I believe I am tending towards the second option here as I imagine that we could technically also leverage sqlite3 transaction when writing multiple messages at once. I further think that keeping the serialized bag messages in a vector within rosbag2_cpp does not introduce a big computation overhead.

So essentially what I would imagine could work nicely:

// in rosbag2_cpp

rosbag2_cpp::Writer::write(
  std::shared_ptr<rosbag2_storage::SerializedBagMessage> message)
{
...
// There might exist an optimal data type for this,
// but at first a `std::vector` might do
cache.push_back(message);
if (cache.size() >= storage_options_.chunk_size) {
  // new signature of write in BaseWriterInterface to write multiple
  storage_->write(cache);
}
// in SqliteStorage plugin

void SqliteStorage::write(
  std::vector<std::shared_ptr<const rosbag2_storage::SerializedBagMessage> & messages)
{
  start_sqlite_transaction();
  for (auto & serialized_bag_message : messages) {
     write(serialized_bag_message);  // call to normal write
  } 
  commit_transaction();
}

Does this make sense what I writing here? What do you think?

@emersonknapp emersonknapp changed the title Transaction based sqlite3 insets Transaction based sqlite3 inserts Jan 29, 2020
@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros2-tooling-wg-next-meeting/12545/1

@sriramster
Copy link
Contributor Author

@Karsten1987 I'm reworking this. Been a bit busy at work. Will update you at the earliest, sorry about the delay.

@sriramster
Copy link
Contributor Author

Hi @Karsten1987 I've reworked the patch set.

I'd designed it based on the first method we'd discussed above to make sure the control of transactios is closer to the sqlite plugins only. But in a way, I feel moving it to a higher layer would give the flexibility of working with transactions and plugins per-say. Not all the plugins might have transactions that is one reason why this was sqlite3 specific. Just stating thought process behind the idea.

Pleae let me know if this patch set looks ok.
Thanks for the time.

@sriramster
Copy link
Contributor Author

Hi @Karsten1987 friendly reminder, have you had time to look into this?

@sriramster
Copy link
Contributor Author

Hi @Karsten1987 remind you again on this. :-)

Copy link
Collaborator

@Karsten1987 Karsten1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @sriramster, I like that approach.

Can you please address all outstanding comments? I've added a few new ones, but there are a bunch of previously mentioned comments as well.
Once you've addressed all of them, please make sure to run colcon build and colcon test in your rosbag2 workspace to address all other compilation and style/test failures.

@Karsten1987
Copy link
Collaborator

@sriramster did you have a chance to look at the review? I would love to have this PR merged for the Foxy release.

Sriram Raghunathan and others added 12 commits March 31, 2020 17:08
The idea here is to move the transaction control to a higher layer
i.e rosbag2_cpp from that of lower level fine control.

This gives the flexibilty of independent plugin and caching
implementation wihtin the application

Signed-off-by: Sriram Raghunathan <rsriram7@visteon.com>
rosbag2_compression.

Signed-off-by: Sriram Raghunathan <rsriram7@visteon.com>
Signed-off-by: Sriram Raghunathan <rsriram7@visteon.com>
rosbag2_compression.

Signed-off-by: Sriram Raghunathan <rsriram7@visteon.com>
Signed-off-by: Sriram Raghunathan <rsriram7@visteon.com>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
@Karsten1987
Copy link
Collaborator

new round of CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
Signed-off-by: Karsten Knese <karsten@openrobotics.org>
@Karsten1987
Copy link
Collaborator

Karsten1987 commented Apr 1, 2020

In order to verify that the transactions are used correctly, I had to expose the max_cache_size through the ros2cli interface in bac3558.
I tried to put that step actually in a separate PR, but it turned out that the sqlite3 statements were wrongly placed. That's corrected in 83042cc.

With the option being exposed in the command line interface, I'll have to enhance this PR with some end-to-end tests. I could confirm locally though that the transactions are working currently with the current set changes.

@Karsten1987
Copy link
Collaborator

CI (with cache end to end test):

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Signed-off-by: Karsten Knese <karsten@openrobotics.org>
@Karsten1987
Copy link
Collaborator

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Copy link
Contributor

@zmichaels11 zmichaels11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with using 0 as default since it will be more consistent with max_bagfile_size = 0 also being default.

@sriramster
Copy link
Contributor Author

@Karsten1987 Thanks for considering this. I see a lot of activity here, hope this gets merged soon.

Copy link
Contributor

@piraka9011 piraka9011 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pending Green CI and Zach's comments addressed.

Signed-off-by: Karsten Knese <karsten@openrobotics.org>
@Karsten1987
Copy link
Collaborator

Karsten1987 commented Apr 1, 2020

one more round:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@Karsten1987 Karsten1987 merged commit f386849 into ros2:master Apr 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants