Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for race condition in rosbag2 recording tests with wait_for_db() #1141

Merged
merged 1 commit into from
Nov 2, 2022

Conversation

MichaelOrlov
Copy link
Contributor

@MichaelOrlov MichaelOrlov commented Oct 29, 2022

RCA

After some analysis @MichaelOrlov came to the conclusion that failure happening due to the race condition for the resource. i.e. database file.
In recording integration tests we usually spawn child process with ros2 bag record command and in parent process calling wait_for_db() function.
The problem is that wait_for_db() inside trying to open the same database file as recorder supposed to create for recording.

void wait_for_db()
{
const auto database_path = get_bag_file_path(0);
while (true) {
try {
std::this_thread::sleep_for(50ms); // wait a bit to not query constantly
if (database_path.exists()) {
rosbag2_storage_plugins::SqliteWrapper db{
database_path.string(), rosbag2_storage::storage_interfaces::IOFlag::READ_ONLY};
return;
}
} catch (const rosbag2_storage_plugins::SqliteException & ex) {
(void) ex; // still waiting
}
}
}

Some times wait_for_db() creating DB reader and locking database at the moment when recorder already created database but hasn't fully initialized it.
Somewhere in between one of the database_->prepare_statement(create_stmt)->execute_and_reset();
database_->prepare_statement(create_stmt)->execute_and_reset();
create_stmt = "CREATE TABLE messages(" \
"id INTEGER PRIMARY KEY," \
"topic_id INTEGER NOT NULL," \
"timestamp INTEGER NOT NULL, " \
"data BLOB NOT NULL);";
database_->prepare_statement(create_stmt)->execute_and_reset();
create_stmt = "CREATE INDEX timestamp_idx ON messages (timestamp ASC);";
database_->prepare_statement(create_stmt)->execute_and_reset();

Recorder fails in this case since database get occupied from another parent process.

Signed-off-by: Michael Orlov michael.orlov@apex.ai

- Get rid from attempt to open DB file in `wait_for_db()` to avoid
race condition with rosbag2 recorder.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
@MichaelOrlov MichaelOrlov marked this pull request as ready for review October 29, 2022 19:53
@MichaelOrlov MichaelOrlov requested a review from a team as a code owner October 29, 2022 19:53
@MichaelOrlov MichaelOrlov requested review from emersonknapp and jhdcs and removed request for a team October 29, 2022 19:53
@MichaelOrlov
Copy link
Contributor Author

Gist: https://gist.githubusercontent.com/MichaelOrlov/7b93540c8d427e4e59d775efd271a784/raw/542cc78201056376b2637c961bca22da156ba526/ros2.repos
BUILD args: --packages-above-and-dependencies rosbag2_tests rosbag2_transport
TEST args: --packages-above rosbag2_tests rosbag2_transport
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/11059

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MichaelOrlov MichaelOrlov merged commit fcafa27 into rolling Nov 2, 2022
@delete-merged-branch delete-merged-branch bot deleted the morlov/race-condition-in-recording-tests branch November 2, 2022 15:03
@MichaelOrlov
Copy link
Contributor Author

https://github.com/Mergifyio backport humble

mergify bot pushed a commit that referenced this pull request Nov 2, 2022
…1141)

- Get rid from attempt to open DB file in `wait_for_db()` to avoid
race condition with rosbag2 recorder.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
(cherry picked from commit fcafa27)
@mergify
Copy link

mergify bot commented Nov 2, 2022

backport humble

✅ Backports have been created

MichaelOrlov added a commit that referenced this pull request Nov 3, 2022
…1141) (#1148)

- Get rid from attempt to open DB file in `wait_for_db()` to avoid
race condition with rosbag2 recorder.

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
(cherry picked from commit fcafa27)

Co-authored-by: Michael Orlov <michael.orlov@apex.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race condition in rosbag2 recording tests with wait_for_db()
2 participants