Fix callback for second barrier not being attached #435

MathieuBordere · 2023-06-16T12:29:13Z

Related to #372 and possibly #431

The test in this PR fails because when all active open segments are already involved in a barrier, the next barrier will fail to be attached to an open segment and the barrier callback will never fire. My first - hacky - idea, would be to push a new open segment and attach the barrier to that one. Any other inspiration?

cole-miller · 2023-06-16T14:44:28Z

My first - hacky - idea, would be to push a new open segment and attach the barrier to that one.

Would this be a purely in-memory thing with no connection to any disk file?

freeekanayaka · 2023-06-16T15:12:29Z

Related to #372 and possibly #431

The test in this PR fails because when all active open segments are already involved in a barrier, the next barrier will fail to be attached to an open segment and the barrier callback will never fire. My first - hacky - idea, would be to push a new open segment and attach the barrier to that one. Any other inspiration?

The test is using the UvBarrier() internal function, in real world I assume there would be a call to raft_io->truncate() or raft_io->snapshot_put() instead?

I'd suggest to use only public APIs in these tests, so it's easier to understand what real-world scenarios they are exercising, and there's no risk of the test running logic that would not be run in real-world.

freeekanayaka · 2023-06-16T15:15:09Z

I'm mainly trying to first understand what real-world situation this test is simulating.

freeekanayaka · 2023-06-16T15:40:19Z

I didn't go through all the details, but an idea could be to add a queue for all barrier requests that are been submitted after all open segments are already attached to a barrier requests. The callbacks for those queued barrier requests should be fired once the last attached barrier completes.

freeekanayaka · 2023-06-17T15:08:13Z

I didn't go through all the details, but an idea could be to add a queue for all barrier requests that are been submitted after all open segments are already attached to a barrier requests. The callbacks for those queued barrier requests should be fired once the last attached barrier completes.

Or to make this a bit less ad-hoc and more general, all barrier requests could be put in a queue, each barrier would hold a pointer to the open segment they wait to be finalized (which is always the "active" open segment at the time the barrier request was placed). Whenever an open segment gets finalized, the queue should be scanned to see if there are barrier requests that can now be removed and the associated callback would be fired.

MathieuBordere · 2023-06-19T06:50:04Z

I'm mainly trying to first understand what real-world situation this test is simulating.

It happened in a Jepsen run, but unfortunately, a while ago and the test artifacts are no longer available. link

MathieuBordere · 2023-06-19T06:51:21Z

I didn't go through all the details, but an idea could be to add a queue for all barrier requests that are been submitted after all open segments are already attached to a barrier requests. The callbacks for those queued barrier requests should be fired once the last attached barrier completes.

Or to make this a bit less ad-hoc and more general, all barrier requests could be put in a queue, each barrier would hold a pointer to the open segment they wait to be finalized (which is always the "active" open segment at the time the barrier request was placed). Whenever an open segment gets finalized, the queue should be scanned to see if there are barrier requests that can now be removed and the associated callback would be fired.

That sounds interesting, will try to explore that.

freeekanayaka · 2023-06-19T08:25:20Z

I'm mainly trying to first understand what real-world situation this test is simulating.

It happened in a Jepsen run, but unfortunately, a while ago and the test artifacts are no longer available. link

The comment there seems to provide useful information to possibly to come up with a reproducer (perhaps something along the lines of calling raft_io->truncate() while raft_io->put_snapshot() is in progress.

How about trying to change the test in this PR to something along those lines (using public APIs only) and see if it ends up reproducing the broken disk data as in #372?

freeekanayaka · 2023-06-19T08:26:19Z

Just to be sure we really understand the issue before putting in place a solution.

Don't continue writing when someone has asked to stop writing new segments. Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

MathieuBordere · 2023-07-04T14:43:28Z

Happy jepsen run: https://github.com/canonical/jepsen.dqlite/actions/runs/5455855501

This is almost ready for review, still have some cleanup function I want to take a look at, because it will fail under certain scenarios I think.

codecov · 2023-07-05T13:04:14Z

Codecov Report

Merging #435 (02fb737) into master (ce3e318) will increase coverage by 0.06%.
The diff coverage is 84.41%.

@@            Coverage Diff             @@
##           master     canonical/raft#435      +/-   ##
==========================================
+ Coverage   76.73%   76.80%   +0.06%     
==========================================
  Files          51       51              
  Lines        9597     9651      +54     
  Branches     2443     2458      +15     
==========================================
+ Hits         7364     7412      +48     
- Misses       1160     1169       +9     
+ Partials     1073     1070       -3

Impacted Files	Coverage Δ
src/uv_finalize.c	`67.05% <66.66%> (+0.77%)`	⬆️
src/uv_snapshot.c	`55.57% <66.66%> (ø)`
src/uv_append.c	`85.80% <85.07%> (+1.04%)`	⬆️
src/uv_truncate.c	`57.50% <100.00%> (+1.56%)`	⬆️

... and 4 files with indirect coverage changes

MathieuBordere · 2023-07-05T13:20:03Z

I think this is ready for review, it surely fixes a bug where a barrier fails to attach if all segments are already involved in a barrier, I've used both internal and external API to reproduce, however still need to see if we can get into the state with incompatible segments from #372 and #431 , but I've already spent quite some time on this.

The idea is that a UvBarrier now has a queue of UvBarrierReq structs, The UvBarrierReq has a callback, when a segment gets finalized, UvBarrierMaybeTrigger is called and calls the first callback in the queue and then relies on UvUnblock to call UvBarrierMaybeTrigger again. When the queu is empty the uv->barrier can be removed. The advantage is that the barrier logic can be kept as is, without fundamentally reworking it, leading to a potential new can of worms.

MathieuBordere · 2023-07-05T13:29:40Z

jepsen run not entirely happy, couple of core dumps, unfortunately canonical/jepsen.dqlite#105 is getting in the way ...

edit: looks happier after a small fix, one occurrence of #386
edit2: clean run https://github.com/canonical/jepsen.dqlite/actions/runs/5455855501

cole-miller

Nice work! I have a few questions, mostly just making sure I understand what's going on since this is a complicated subsystem.

src/uv_append.c

src/uv_truncate.c

src/uv_append.c

cole-miller · 2023-07-06T13:07:38Z

please test downstream

test/integration/test_uv_truncate_snapshot.c

Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

This is for consistency with uvSnapshotPutBarrierCb where this is also done. Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

MathieuBordere · 2023-07-06T13:32:06Z

please test downstream

cole-miller · 2023-07-06T13:56:57Z

src/uv_append.c

+            /* Fire all barrier cb's, this is safe because the barrier cb exits
+             * early when uv->closing is true. */
+            uvBarrierTriggerAll(barrier);
+            RaftHeapFree(barrier);
        }
        /* The segment->barrier field is used:
         *


I'm not sure how to interpret this (pre-existing) comment.

src/uv_append.c

freeekanayaka · 2023-07-06T15:04:11Z

I've just started to look at the diff, before going deep here are few questions to better understand the broader picture.

I think this is ready for review, it surely fixes a bug where a barrier fails to attach if all segments are already involved in a barrier, I've used both internal and external API to reproduce

I guess "I've used external API to reproduce" means that the snapshot_truncate/snapshotThenTruncate test introduced in this PR is triggering the bug that you are mentioning?

, however still need to see if we can get into the state with incompatible segments from #372 and #431 , but I've already spent quite some time on this.

Is there anything specific that came out from inspecting the Jepsen logs?

Also, what about the other idea of eliminating non-blocking barriers? I presume that it was something "unrelated" that you wanted to do regardless of the bug being fixed in this PR, in order to simplify the current logic and then possibly get a more clear picture of the situation?

The idea is that a UvBarrier now has a queue of UvBarrierReq structs, The UvBarrierReq has a callback, when a segment gets finalized, UvBarrierMaybeTrigger is called and calls the first callback in the queue and then relies on UvUnblock to call UvBarrierMaybeTrigger again. When the queu is empty the uv->barrier can be removed. The advantage is that the barrier logic can be kept as is, without fundamentally reworking it, leading to a potential new can of worms.

This would be basically created a "nested" structure? You have a top level barrier object, and then one or more barrier requests can be associated with the top level object. We'd ideally have a flat structure of barrier requests, but the nested approach is a way to minimize changes to existing code. Is that accurate?

MathieuBordere · 2023-07-06T15:17:24Z

I guess "I've used external API to reproduce" means that the snapshot_truncate/snapshotThenTruncate test introduced in this PR is triggering the bug that you are mentioning?

Yeah, it triggers the bug where the second barrier callback never fires.

Is there anything specific that came out from inspecting the Jepsen logs?

The logs are lost but in a comment #372 I say that the truncate barrier cb was never fired, don't have the proof anymore, but I trust the analysis I made back then.

Also, what about the other idea of eliminating non-blocking barriers? I presume that it was something "unrelated" that you wanted to do regardless of the bug being fixed in this PR, in order to simplify the current logic and then possibly get a more clear picture of the situation?

Yeah, that was to simplify the logic a bit, but that got me in other problems and decided it was not worth it.

This would be basically created a "nested" structure? You have a top level barrier object, and then one or more barrier requests can be associated with the top level object. We'd ideally have a flat structure of barrier requests, but the nested approach is a way to minimize changes to existing code. Is that accurate?

That's accurate.

freeekanayaka · 2023-07-06T15:50:59Z

Is there anything specific that came out from inspecting the Jepsen logs?

The logs are lost but in a comment #372 I say that the truncate barrier cb was never fired, don't have the proof anymore, but I trust the analysis I made back then.

Ok, so probably I confused the logs you had been looking at. I assume you were recently inspecting Jepsen logs with failures associated with the change to drop non-blocking barriers? If that's true, a write-up of what issues you found might be helpful in case we want to pick up the idea again later (see below).

One thing that puzzles me is that AFAIU the Jepsen failure in #372 seem to have not manifested again since then. Is that accurate or maybe it did occur and we might simply have missed it?

In theory such a failure should have some statistical occurrence. Even if it's rare, if you have, say, 100 Jepsen runs, it should manifest at least 1 time (I'm making up numbers, just to illustrate the point). It could also be that the failure is so rare that it did not occur once in the runs we had in months, that sounds a bit weird though, but maybe it's the reality.

Also, what about the other idea of eliminating non-blocking barriers? I presume that it was something "unrelated" that you wanted to do regardless of the bug being fixed in this PR, in order to simplify the current logic and then possibly get a more clear picture of the situation?

Yeah, that was to simplify the logic a bit, but that got me in other problems and decided it was not worth it.

This would be basically created a "nested" structure? You have a top level barrier object, and then one or more barrier requests can be associated with the top level object. We'd ideally have a flat structure of barrier requests, but the nested approach is a way to minimize changes to existing code. Is that accurate?

That's accurate.

That'd be fine, although I'm slightly concerned that we might be adding complexity without actually addressing the issue. Lacking better evidence, it's somehow fine to assume that the comment made in #372 is indeed pointing to precisely the bug being fixed here, but it might also be a red herring instead.

If we merge this change, I believe we'd have a couple of aspects that I would consider something we'd want to eventually improve:

non-blocking barriers: as we said, those are needed because we can't differentiate between an open segment containing entry 1, and an open segment containing entries past an installed snapshot, and ideally we'd instead want to introduce some way to differentiate between those two.
nested barrier requests: as we said, we'd ideally want to have a flat structure, which although might require a bit broader change, should result in a bit less complex code.

So, from a high-level point of view, if you are convinced that this PR solves both a Jepsen failure and something that happened in real-world, I'd be in favor of it, provided that we take note of 1. and 2. (perhaps opening issues) so we might possibly work on fixing them later down the road.

Still, the fact that there is no actual Jepsen failure occurring that we can look at is unfortunate.

freeekanayaka · 2023-07-06T18:09:48Z

test/integration/test_uv_truncate_snapshot.c

+
+    SNAPSHOT_CLEANUP();
+    return MUNIT_OK;
+}


Still trying to understand exactly the situation: if we managed to drop non-blocking barriers (by having a way to differentiate between open-segment-starting-at-1 and open-segment-past-a-snapshot), then there should be no need to have a barrier when a snapshot is taken right? I'm wondering if that'd mean that this test would pass.

In other words, the "run a non-blocking barrier when a snapshot is taken" change fixed the differentiation problem, but introduced the bug being fixed here. Is that accurate?

Indeed in the past Barriers were only used for Truncate operations, that are more rare than snapshots. With multiple truncate requests it's probably possible to hit the bug though (that was always there), but it is a lot easier to hit now.

An issue I encountered when removing the barrier when taking a snapshot, was that the truncate operation that runs after the snapshot (to remove a prefix of the log) could remove all closed segments, again leading to confusion if the open segments remaining in the data folder were newer or older than the snapshot. In the case of taking a snapshot and all closed segments being removed, the open segments were older, in the case of installing a snapshot the open segments will be newer, with no meaningful way to discern the situation. I then more or less felt that the barrier before a snapshot made the situation easier to reason about and that it was not that bad in the end (imo :-))

We could have come up with special rules when taking a snapshot like "Don't remove the last closed segment when truncating a prefix of the log otherwise we will be confused", but that feels very ad hoc and the meaning of the Trailing parameter loses its value somewhat. Or we could start adding special files in the directory to discern the situations but that also quickly becomes a mess IMO. The right solution would be to create a new segment format that encodes the start index, but it's a big(ish) change and we break backwards compatibility.

Indeed in the past Barriers were only used for Truncate operations, that are more rare than snapshots. With multiple truncate requests it's probably possible to hit the bug though (that was always there), but it is a lot easier to hit now.

An issue I encountered when removing the barrier when taking a snapshot, was that the truncate operation that runs after the snapshot (to remove a prefix of the log) could remove all closed segments, again leading to confusion if the open segments remaining in the data folder were newer or older than the snapshot. In the case of taking a snapshot and all closed segments being removed, the open segments were older, in the case of installing a snapshot the open segments will be newer, with no meaningful way to discern the situation. I then more or less felt that the barrier before a snapshot made the situation easier to reason about and that it was not that bad in the end (imo :-))

We could have come up with special rules when taking a snapshot like "Don't remove the last closed segment when truncating a prefix of the log otherwise we will be confused", but that feels very ad hoc and the meaning of the Trailing parameter loses its value somewhat. Or we could start adding special files in the directory to discern the situations but that also quickly becomes a mess IMO. The right solution would be to create a new segment format that encodes the start index, but it's a big(ish) change and we break backwards compatibility.

Ok, thanks for the write up, I can better see the whole picture now.

If there's something that simplifies the situation, that'd be good, otherwise ad-hoc rules are just going to flip the problem around.

Just for my understanding: a new segment format would break forward compatibility, right? In the sense that it'd be trick to support downgrades from a version that supports the new format to a version that doesn't. However backward compatibility should be fine (upgrading from old version to new version). Is that accurate?

Yes that's accurate, got confused by the terminology.

Ok, what about merging this change and at the same time opening a (longer term) issue to investigate the format change or other ideas?

If and when we find a solution that avoids the downsides you described, we could then simplify the code by removing non-blocking barriers and assume that barriers are used only for truncation (that assumption should help too for reducing complexity).

https://github.com/canonical/raft/issues/453

#453

Nice write-up, thanks.

MathieuBordere force-pushed the 2barriers branch from 54031b7 to 13a23af Compare June 23, 2023 12:38

MathieuBordere mentioned this pull request Jun 28, 2023

Remove blocking barrier distinctions #446

Closed

MathieuBordere force-pushed the 2barriers branch 2 times, most recently from 7accfb9 to de93e52 Compare July 4, 2023 10:11

barrier: Upgrade non-blocking barrier to blocking.

ad11428

Don't continue writing when someone has asked to stop writing new segments. Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

MathieuBordere force-pushed the 2barriers branch from de93e52 to fb80643 Compare July 4, 2023 14:13

MathieuBordere force-pushed the 2barriers branch 3 times, most recently from 7e8f867 to 2e42a07 Compare July 5, 2023 13:03

MathieuBordere marked this pull request as ready for review July 5, 2023 13:20

MathieuBordere force-pushed the 2barriers branch from 2e42a07 to 25a842a Compare July 5, 2023 13:43

MathieuBordere requested review from freeekanayaka and cole-miller July 5, 2023 14:12

cole-miller changed the title ~~test_uv_append: Failing test 2 barriers.~~ Fix callback for second barrier not being attached Jul 5, 2023

cole-miller reviewed Jul 5, 2023

View reviewed changes

src/uv_append.c Show resolved Hide resolved

src/uv_append.c Show resolved Hide resolved

src/uv_truncate.c Show resolved Hide resolved

MathieuBordere commented Jul 6, 2023

View reviewed changes

src/uv_append.c Show resolved Hide resolved

MathieuBordere force-pushed the 2barriers branch from 25a842a to bb92ccf Compare July 6, 2023 06:51

MathieuBordere commented Jul 6, 2023

View reviewed changes

src/uv_append.c Outdated Show resolved Hide resolved

cole-miller reviewed Jul 6, 2023

View reviewed changes

test/integration/test_uv_truncate_snapshot.c Outdated Show resolved Hide resolved

Mathieu Borderé added 5 commits July 6, 2023 15:14

test: Failing tests with multiple barriers.

2c499bb

Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

uv: Trigger barrier callback(s) through function.

d5bc0a6

Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

UvBarrier: A barrier can have multiple barrier requests.

50d72f2

Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

uv_append: Only set uv->barrier = NULL when it's safe to do so.

23eb201

Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

uv_truncate: call uvMaybeFireCloseCb when closing in barrier cb.

02fb737

This is for consistency with uvSnapshotPutBarrierCb where this is also done. Signed-off-by: Mathieu Borderé <mathieu.bordere@canonical.com>

MathieuBordere force-pushed the 2barriers branch from 54915d6 to 02fb737 Compare July 6, 2023 13:14

cole-miller reviewed Jul 6, 2023

View reviewed changes

cole-miller approved these changes Jul 6, 2023

View reviewed changes

freeekanayaka reviewed Jul 6, 2023

View reviewed changes

MathieuBordere merged commit 1783708 into canonical:master Jul 7, 2023
19 checks passed

MathieuBordere mentioned this pull request Jul 7, 2023

tackle open segment start index ambiguity canonical/dqlite#578

Open

MathieuBordere mentioned this pull request Jul 27, 2023

Segment writes blocked when taking a snapshot. #461

Closed

cole-miller mentioned this pull request Jul 27, 2023

raft_start(): io: closed segment 0000000000003069-0000000000003157 is past last snapshot #372

Closed

MathieuBordere mentioned this pull request Aug 3, 2023

raft_start(): io: closed segment xxx is past last snapshot xxx #431

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix callback for second barrier not being attached #435

Fix callback for second barrier not being attached #435

MathieuBordere commented Jun 16, 2023 •

edited

Loading

cole-miller commented Jun 16, 2023

freeekanayaka commented Jun 16, 2023

freeekanayaka commented Jun 16, 2023

freeekanayaka commented Jun 16, 2023

freeekanayaka commented Jun 17, 2023

MathieuBordere commented Jun 19, 2023

MathieuBordere commented Jun 19, 2023

freeekanayaka commented Jun 19, 2023

freeekanayaka commented Jun 19, 2023

MathieuBordere commented Jul 4, 2023

codecov bot commented Jul 5, 2023 •

edited

Loading

MathieuBordere commented Jul 5, 2023

MathieuBordere commented Jul 5, 2023 •

edited

Loading

cole-miller left a comment

cole-miller commented Jul 6, 2023

MathieuBordere commented Jul 6, 2023

cole-miller Jul 6, 2023 •

edited

Loading

freeekanayaka commented Jul 6, 2023

MathieuBordere commented Jul 6, 2023

freeekanayaka commented Jul 6, 2023 •

edited

Loading

freeekanayaka Jul 6, 2023

MathieuBordere Jul 7, 2023 •

edited

Loading

freeekanayaka Jul 7, 2023 •

edited

Loading

MathieuBordere Jul 7, 2023

freeekanayaka Jul 7, 2023

MathieuBordere Jul 7, 2023

freeekanayaka Jul 7, 2023

Fix callback for second barrier not being attached #435

Fix callback for second barrier not being attached #435

Conversation

MathieuBordere commented Jun 16, 2023 • edited Loading

cole-miller commented Jun 16, 2023

freeekanayaka commented Jun 16, 2023

freeekanayaka commented Jun 16, 2023

freeekanayaka commented Jun 16, 2023

freeekanayaka commented Jun 17, 2023

MathieuBordere commented Jun 19, 2023

MathieuBordere commented Jun 19, 2023

freeekanayaka commented Jun 19, 2023

freeekanayaka commented Jun 19, 2023

MathieuBordere commented Jul 4, 2023

codecov bot commented Jul 5, 2023 • edited Loading

Codecov Report

MathieuBordere commented Jul 5, 2023

MathieuBordere commented Jul 5, 2023 • edited Loading

cole-miller left a comment

Choose a reason for hiding this comment

cole-miller commented Jul 6, 2023

MathieuBordere commented Jul 6, 2023

cole-miller Jul 6, 2023 • edited Loading

Choose a reason for hiding this comment

freeekanayaka commented Jul 6, 2023

MathieuBordere commented Jul 6, 2023

freeekanayaka commented Jul 6, 2023 • edited Loading

freeekanayaka Jul 6, 2023

Choose a reason for hiding this comment

MathieuBordere Jul 7, 2023 • edited Loading

Choose a reason for hiding this comment

freeekanayaka Jul 7, 2023 • edited Loading

Choose a reason for hiding this comment

MathieuBordere Jul 7, 2023

Choose a reason for hiding this comment

freeekanayaka Jul 7, 2023

Choose a reason for hiding this comment

MathieuBordere Jul 7, 2023

Choose a reason for hiding this comment

freeekanayaka Jul 7, 2023

Choose a reason for hiding this comment

MathieuBordere commented Jun 16, 2023 •

edited

Loading

codecov bot commented Jul 5, 2023 •

edited

Loading

MathieuBordere commented Jul 5, 2023 •

edited

Loading

cole-miller Jul 6, 2023 •

edited

Loading

freeekanayaka commented Jul 6, 2023 •

edited

Loading

MathieuBordere Jul 7, 2023 •

edited

Loading

freeekanayaka Jul 7, 2023 •

edited

Loading