fix: handle read-while-write race on online redo LWN blocks#10

Merged
rophy merged 3 commits into master from fix/online-redo-lwn-race
Mar 25, 2026

Conversation

rophy (Owner) commented Mar 25, 2026

Summary

Fixes #8. OLR crashed with ERROR 50051/50053 when reading online redo logs
under heavy write load due to a read-while-write race on LWN block headers.

Root Cause

Oracle writes redo blocks in stages. OLR can read a block where the
header/CRC is valid but the LWN vld flag or record size fields are
not yet complete. Previously OLR threw a fatal error and crashed.

Confirmed by hex-dump: crash block showed vld=0x05 (valid LWN)
after OLR crashed — Oracle completed the write between OLR's read and
the file copy.

Fix

For online redo (group != 0): treat incomplete LWN blocks as "not
ready yet" — break and let the Reader re-read the block later. Timeout
after 30s with a clear error message.

For archived redo (group == 0): unchanged — throw immediately since
archives are fully written.

Test plan

  • Before fix: Swingbench 4 users + validator → OLR crashed on attempt 1
  • After fix: 612,092 events matched between LogMiner and OLR, zero crashes
  • RAC e2e regression: 54 passed, 0 failures

Summary by CodeRabbit

  • Bug Fixes
    • Online redo parsing now retries partially written or incomplete blocks with short timed retries instead of failing immediately; in online mode, header/member validation and oversized-record checks retry and only time out after repeated attempts.
  • Tests
    • Validation updated to skip BENCHMARK_RESULTS events when generating match keys.

When reading online redo logs, Oracle may be in the process of writing
a block. The block header (sequence, CRC) can be valid while the LWN
vld flag or record size fields are not yet complete. Previously OLR
crashed with ERROR 50051 or 50053.

For online redo (group != 0): treat incomplete LWN blocks as "not
ready yet" — break and let the Reader re-read the block later. Track
how long we've been waiting and timeout after 30s with a clear error.

For archived redo (group == 0): keep existing behavior (throw
immediately) since archives are fully written.

Confirmed by hex-dump: crash block showed valid vld=0x05 after OLR
crashed, proving Oracle completed the write between OLR's read and
the file copy. Reproduced reliably with Swingbench under load.
coderabbitai bot commented Mar 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 12.50%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Description Check (✅ Passed): check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title precisely matches the main change: retry logic for read-while-write races on online redo LWN blocks.
  • Linked Issues Check (✅ Passed): the changes address issue #8; LWN validation failures now retry instead of failing immediately for online redo, while archived redo still fails immediately.
  • Out of Scope Changes Check (✅ Passed): the validator.py change excluding BENCHMARK_RESULTS from event matching supports the perf test scenario and is reasonably related to validating the fix under load.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


rophy (Owner, Author) commented Mar 25, 2026

@coderabbitai review

coderabbitai bot commented Mar 25, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/parser/Parser.cpp`:
- Around line 1366-1367: The local variable lwnMidWriteStart and its constexpr
LWN_MID_WRITE_TIMEOUT_S must be converted into Parser state so the mid-write
deadline persists across yields/resumes and is not reinitialized to 0 on every
parse() entry; move lwnMidWriteStart into the Parser class (e.g. as a member
like m_lwnMidWriteStart) and keep LWN_MID_WRITE_TIMEOUT_S as a constant, then
update parse() to set the start time only the first time the parser begins
waiting for the mid-write and to check the timeout against the persistent member
on subsequent resumes; additionally clear/reset this member when the exact
pending header or record is recovered/validated (i.e. when the pending LWN
header becomes valid or a recovered record-size wait completes) so old deadlines
cannot leak into later waits within the same LWN group.
- Around line 1474-1485: The retry logic currently does a local `break` which
still lets the outer loop advance `currentBlock` and `confirmedBufferStart` and
leaves `lwnAllocated` incremented; instead, stop advancing the block and bubble
the retry to the outer block loop: when detecting a mid-write (the branch with
lwnMidWriteStart and LWN_MID_WRITE_TIMEOUT_S) set a retry flag (or use continue
on the outer loop) so you do not increment `currentBlock` or
`confirmedBufferStart`, and roll back or defer any `lwnAllocated`/chunk
allocation performed earlier (undo increment of lwnAllocated or postpone
allocation until recordSize is validated) to avoid exhausting MAX_LWN_CHUNKS;
ensure the code path that currently does `break` instead either continues the
outer block-processing loop or returns to re-read the same block so the Reader
will re-fetch the same block for retry.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f957a12f-b19e-43ce-a614-37db9705785a

📥 Commits

Reviewing files that changed from the base of the PR and between 2d5bc8a and aaaa329.

📒 Files selected for processing (1)
  • src/parser/Parser.cpp

Comment on lines +1366 to +1367
time_t lwnMidWriteStart = 0;
static constexpr time_t LWN_MID_WRITE_TIMEOUT_S = 30;

⚠️ Potential issue | 🟠 Major

Persist the mid-write timer across resumes, and reset it on record recovery too.

If this parser yields while waiting, lwnMidWriteStart is rebuilt as 0 on the next parse() call, so the 30s budget never accumulates. It's also only cleared on a valid LWN header here, so a recovered record-size wait can leak its old deadline into a later wait in the same LWN group. This needs to live in Parser state and be cleared when the exact pending header/record becomes valid again.

Also applies to: 1406-1406


rophy force-pushed the fix/online-redo-lwn-race branch from 39c0627 to 5d4780e on March 25, 2026 at 07:43
coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/reader/Reader.h (1)

140-140: Mark reReadBlock as [[nodiscard]] to prevent ignored read results.

At Line 140, this method’s return value is operationally important in retry logic. Marking it [[nodiscard]] helps catch accidental ignored failures at compile time.

Suggested API tweak
-        int reReadBlock(uint8_t* buffer, uint64_t offset, uint size);
+        [[nodiscard]] int reReadBlock(uint8_t* buffer, uint64_t offset, uint size);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/reader/Reader.h` at line 140, The declaration of reReadBlock(uint8_t*
buffer, uint64_t offset, uint size) should be marked [[nodiscard]] so callers
cannot accidentally ignore its return value; update the declaration in Reader.h
to [[nodiscard]] int reReadBlock(...), and make the corresponding change to its
definition/override(s) in the implementation files (e.g., Reader.cpp and any
subclasses) so the attribute is present everywhere the symbol is defined to
ensure compile-time warnings when return values are ignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/reader/Reader.cpp`:
- Around line 763-765: Reader::reReadBlock currently forwards to redoRead and
returns whatever byte count redoRead returns, but callers expect a full-block
re-read; a short read leaves stale bytes in buffer. Change reReadBlock (the
method named reReadBlock) to call redoRead(buffer, offset, size), check the
returned byte count, and if it is not equal to size treat that as a failure
(return an error code such as -1 or another existing failure sentinel used by
the codebase) so callers can detect and retry/abort; do not let partial-byte
counts propagate as success.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 53b8143b-4cb0-4755-9eac-6600609ed836

📥 Commits

Reviewing files that changed from the base of the PR and between 39c0627 and 5d4780e.

📒 Files selected for processing (3)
  • src/parser/Parser.cpp
  • src/reader/Reader.cpp
  • src/reader/Reader.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/parser/Parser.cpp

rophy force-pushed the fix/online-redo-lwn-race branch 2 times, most recently from af93fdb to 4617808 on March 25, 2026 at 13:07
coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/parser/Parser.cpp`:
- Around line 1423-1428: The reReadBlock calls in the group retry branches
(triggered when group != 0 with lwnRereadAttempts++) can silently overwrite
redoBuffer bytes with raw data and return only >0/-1; instead, after
reader->reReadBlock(...) succeeds you must revalidate the freshly-read buffer
with Reader::checkBlockHeader (the same validation used on the normal path) and
handle special results like OVERWRITTEN or EMPTY by retrying (e.g., goto
lwnBlockRetry or increment attempts) or failing appropriately; update each
occurrence (the blocks that call reader->reReadBlock and then continue parsing —
e.g., the branch containing lwnRereadAttempts/MAX_REREAD_ATTEMPTS, and the
similar sections at the other noted locations) to run checkBlockHeader on the
buffer and conditionally proceed only when the header is valid.
- Around line 1498-1513: The loop can mark resolved=true even when recordSize4
becomes 0 after rereads, which leads to creating a zero-length LwnMember; update
the retry/validation logic around recordSize4 (the value filled by ctx->read32)
so you only set resolved = true and break when recordSize4 > 0 and the size
check (((*recordSize + sizeof(LwnMember) + recordSize4 + 7) & 0xFFFFFFF8) <=
Ctx::MEMORY_CHUNK_SIZE_MB * 1024 * 1024) passes; if recordSize4 == 0 continue
retrying instead of breaking, and keep the existing RedoLogException paths
(50053) for timeouts/oversize unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 07fe9cd8-ae73-4e14-a5d5-a3593e2c74b9

📥 Commits

Reviewing files that changed from the base of the PR and between af93fdb and 4617808.

📒 Files selected for processing (4)
  • src/parser/Parser.cpp
  • src/reader/Reader.cpp
  • src/reader/Reader.h
  • tests/sql/environments/rac/debezium/perf/validator.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/reader/Reader.h

rophy (Owner, Author) commented Mar 25, 2026

@coderabbitai review

coderabbitai bot commented Mar 25, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

♻️ Duplicate comments (2)
src/parser/Parser.cpp (2)

1498-1503: ⚠️ Potential issue | 🔴 Critical

Guard against recordSize4 == 0 to avoid constructing zero-length records.

After the reread at line 1498, recordSize4 could become 0 (e.g., if Oracle hasn't written the size field yet). The condition at line 1499 only checks if the size fits within the memory chunk, which is true when recordSize4 == 0. This causes:

  1. resolved = true and the loop exits
  2. An LwnMember is created with size = 0 (line 1517)
  3. Later in analyzeLwn (line 130-133), this triggers error 50046 or produces undefined behavior
🐛 Proposed fix
                                        recordSize4 = (static_cast<uint64_t>(ctx->read32(redoBlock + blockOffset)) + 3U) & 0xFFFFFFFC;
-                                        if (((*recordSize + sizeof(LwnMember) + recordSize4 + 7) & 0xFFFFFFF8) <= Ctx::MEMORY_CHUNK_SIZE_MB * 1024 * 1024) {
+                                        if (recordSize4 > 0 &&
+                                            ((*recordSize + sizeof(LwnMember) + recordSize4 + 7) & 0xFFFFFFF8) <= Ctx::MEMORY_CHUNK_SIZE_MB * 1024 * 1024) {
                                             resolved = true;
                                             break;
                                         }

1423-1428: ⚠️ Potential issue | 🟠 Major

Missing block header validation after reread may cause silent data corruption.

After reReadBlock() succeeds, the code immediately jumps to lwnBlockRetry and continues parsing without revalidating the block header. The normal read path uses Reader::checkBlockHeader (per context snippet at src/reader/Reader.cpp:110-169) to verify CRC, sequence, and detect OVERWRITTEN/EMPTY states. These retry paths bypass that validation.

If Oracle overwrites the block with a different sequence during the 30-second retry window, the parser could process stale or corrupted data without detecting the overwrite.

🛡️ Suggested fix pattern
                                if (reader->reReadBlock(reader->redoBufferList[redoBufferNum] + redoBufferPos,
                                                        static_cast<uint64_t>(currentBlock) * reader->getBlockSize(),
-                                                        reader->getBlockSize()) > 0)
+                                                        reader->getBlockSize()) > 0) {
+                                    // Revalidate block header after reread
+                                    auto headerCheck = reader->checkBlockHeader(
+                                        reader->redoBufferList[redoBufferNum] + redoBufferPos, currentBlock, false);
+                                    if (headerCheck == Reader::REDO_CODE::OK)
                                        goto lwnBlockRetry;
+                                    else if (headerCheck == Reader::REDO_CODE::OVERWRITTEN)
+                                        throw RedoLogException(50056, "block overwritten during retry at offset: " + 
+                                                               confirmedBufferStart.toString());
+                                    // else continue retrying
+                                }

This concern also applies to the retry blocks at lines 1435-1440, 1454-1459, and 1491-1496.

🧹 Nitpick comments (1)
src/parser/Parser.cpp (1)

1401-1403: Consider consolidating duplicate MAX_REREAD_ATTEMPTS definitions.

The constant MAX_REREAD_ATTEMPTS = 300 is defined twice: once at line 1401 (for LWN header retries) and again at line 1489 (for record size retries). While this works due to different scopes, consolidating to a single class-level or file-level constant would improve maintainability.

🔧 Suggested refactor
+namespace {
+    constexpr int MAX_REREAD_ATTEMPTS = 300;  // 300 * 100ms = 30s
+}
+
 // ... in parse() function ...
-                    static constexpr int MAX_REREAD_ATTEMPTS = 300;  // 300 * 100ms = 30s
                     int lwnRereadAttempts = 0;
 // ...
-                                    static constexpr int MAX_REREAD_ATTEMPTS = 300;
                                     bool resolved = false;

Also applies to: 1489-1489


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a3599b82-7ef3-4374-83e0-4185951dedfc

📥 Commits

Reviewing files that changed from the base of the PR and between af93fdb and 4617808.

📒 Files selected for processing (4)
  • src/parser/Parser.cpp
  • src/reader/Reader.cpp
  • src/reader/Reader.h
  • tests/sql/environments/rac/debezium/perf/validator.py
✅ Files skipped from review due to trivial changes (1)
  • tests/sql/environments/rac/debezium/perf/validator.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/reader/Reader.cpp
  • src/reader/Reader.h

When reading online redo logs, Oracle may be writing a block that has a
valid header/CRC but incomplete LWN content. The Reader's buffer caches
the stale copy.

For online redo (group != 0): when LWN validation fails (50049, 50050,
50051) or record size is invalid (50053), re-read the block from disk
via Reader::reReadAndValidate() — which does pread + checkBlockHeader
to verify the re-read is valid (correct seq, CRC, not OVERWRITTEN).
Retry up to 300 times (30s). On OVERWRITTEN, stop gracefully.

For archived redo (group == 0): unchanged, throw immediately.

Also excludes BENCHMARK_RESULTS from validator (Swingbench internal
table with timing mismatch at test end).

Validated: 517,995 events matched under Swingbench 4-user load,
126/126 redo fixture tests pass, 54/54 RAC SQL e2e pass.
rophy force-pushed the fix/online-redo-lwn-race branch from 4617808 to b534546 on March 25, 2026 at 14:37
rophy merged commit 26bacac into master on Mar 25, 2026
2 checks passed
rophy deleted the fix/online-redo-lwn-race branch on March 25, 2026 at 23:55


Development

Successfully merging this pull request may close these issues.

bug: OLR crashes reading online redo with stale/long-lived sequence
