Skip to content

oracledb_cdc: handle LOB_TRIM redo SQL separately from SELECT_LOB_LOCATOR#4430

Merged
josephwoodward merged 4 commits into
mainfrom
jw/oracledblobtrim
May 19, 2026
Merged

oracledb_cdc: handle LOB_TRIM redo SQL separately from SELECT_LOB_LOCATOR#4430
josephwoodward merged 4 commits into
mainfrom
jw/oracledblobtrim

Conversation

@josephwoodward
Copy link
Copy Markdown
Contributor

@josephwoodward josephwoodward commented May 15, 2026

When updating an existing LOB field via a SQL UPDATE command, OracleDB will handle that update in a number of ways.

  1. SecureFile (Oracle Free 23c default): emits LOB_WRITE(s) → LOB_TRIM(N). The trim finalises the value after fragments are written, so clearing fragments on LOB_TRIM was the bug. May emit no DML UPDATE — in that case a synthetic UPDATE is synthesized from the assembled accumulator.

  2. BASICFILE: emits LOB_TRIM(0) → LOB_WRITE(s) (clear-then-write). The DML UPDATE is always present; fragments arrive after the trim, so LOB_TRIM(0) must be a no-op against the fragment list.

  3. BASICFILE out-of-row (DISABLE STORAGE IN ROW): no SELECT_LOB_LOCATOR is emitted. The inferLOBLocator path creates the accumulator from an existing DML event rather than from the locator SQL, and the DML UPDATE is always present for PK matching.

This change ensures we support these three variants and includes a test to verify each.

image

@claude
Copy link
Copy Markdown

claude Bot commented May 15, 2026

Commits
LGTM

Review
This is a focused bug fix that splits OpLobTrim handling out of the OpSelectLobLocator case in internal/impl/oracledb/logminer/logminer.go. The previous code path attempted to parse the LOB_TRIM SQL with ParseSelectLobLocator, which would have failed because LOB_TRIM's redo SQL is dbms_lob.trim(loc_b, N) rather than a SELECT ... INTO ... FROM statement. The new case correctly relies on the active LOB key established by a preceding SELECT_LOB_LOCATOR and clears the accumulator fragments to prepare for subsequent LOB_WRITE events. Nil/existence guards are in place.

LGTM

Copy link
Copy Markdown
Contributor

@Jeffail Jeffail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The storage-variant breakdown in the description and the integration tests across SecureFile, BASICFILE, and BASICFILE out-of-row are a meaningful improvement, and the dbms_lob.trim(loc, N) finalisation vs. clear-then-write distinction is now correctly drawn. Approving overall.

A few non-blocking points worth surfacing:

1. Pass 3 fallback (internal/impl/oracledb/logminer/sqlredo/lob.go:189-201) — the code unconditionally selects the most-recent matching event, while the comment describes "exactly one candidate ... otherwise most recent". Largely a documentation cleanup, but there is a latent edge case: if a transaction contains two LOB-only UPDATEs on the same table for different rows and Oracle strips PK columns from both WHERE clauses, both accumulators bypass Pass 1/2 and converge on the same Pass 3 target — one row's LOB silently overwrites the other's. Likely rare, but worth either tightening the predicate or adding a warn-log when multiple accumulators target the same event.

2. LOB_TRIM(N) warning predicate (internal/impl/oracledb/logminer/logminer.go:362-371) — the warning only fires when N > 0 && len(Fragments) == 0. Two adjacent cases also produce a potentially incorrect assembled value but do not warn:

  • N < M (writes extend past N): Assemble produces M bytes; the real LOB is N bytes.
  • N > M with M > 0: a preserved prefix exists beyond what we have.

Assemble is also not currently truncated to N. The current scope catches the most blatant case (nothing to publish at all) and is defensible given SecureFile's typical full-rewrite UPDATE pattern; worth noting in the comment that this is an explicit tradeoff rather than a complete check.

3. Synthetic UPDATE shape (internal/impl/oracledb/logminer/logminer.go:433-442) — the synthesized event publishes only {<pk_columns>, <lob_column>}; other row columns are unavailable from redo. Downstream consumers expecting a full-row UPDATE will see a sparse payload. Is sparse-row UPDATE the intended downstream contract? It may be worth documenting, or carrying over column values from a prior INSERT/UPDATE in the same transaction when one is present.

4. (Minor) Form A in OpLobTrim (logminer.go:340) overwrites state.Accumulators[key] unconditionally, whereas OpSelectLobLocator (line 298) uses if !exists. The comment "establish (or reset)" signals intent, but if Form A can ever fire after fragments have already accumulated for the same key, those fragments would be lost. Worth confirming Oracle does not emit that sequence.

None of the above are blockers — happy for this to land.

@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Commits

  1. Both commits use sentence-case after the system prefix (Add, Fix), but the commit policy requires the message after system: to start lowercase in imperative mood (e.g., oracledb_cdc: add LOB_TRIM failing test, oracledb_cdc: fix LOB_TRIM handling ...).

Review
LGTM. The split of OpLobTrim from OpSelectLobLocator and the dual-form handling in logminer.go#L309-L370 is well-commented and tested against three storage variants. The MergeLOBsIntoDMLEvents return-unmerged change in lob.go#L120-L208 and synthesized UPDATE in logminer.go#L423-L443 correctly handle the SecureFile no-DML-UPDATE case.

LGTM

@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Commits

  1. Commits 1 and 2 have capitalized subject text after the system: prefix: oracledb_cdc: Add LOB_TRIM failing test and oracledb_cdc: Fix LOB_TRIM handling and LOB merge for all storage variants. Per the commit policy, after the system: prefix the message must start lowercase (add, fix). Commit 3 follows this correctly.
  2. Commit 1 introduces failing tests that only pass after Commit 2's fix lands, so commit 1 is not a self-contained logical change (CI on commit 1 alone would fail). These two commits should be squashed.

Review
LOB_TRIM is correctly split out from SELECT_LOB_LOCATOR with Form A / Form B handling, the synthetic UPDATE path looks consistent (OldValues and PKValues are both map[string]any), and Pass 3 of MergeLOBsIntoDMLEvents is appropriately conservative when no PK is available. Integration tests cover all three Oracle LOB storage variants.

LGTM

Comment thread internal/impl/oracledb/logminer/logminer.go Outdated
@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Commits

  1. oracledb_cdc: Add LOB_TRIM failing test — message after : should start lowercase (add, not Add) per the project commit policy ("message starts lowercase and uses imperative mood").
  2. oracledb_cdc: Fix LOB_TRIM handling and LOB merge for all storage variants — same: fix rather than Fix.

Commits 3 and 4 LGTM.

Review
The fix correctly splits LOB_TRIM from SELECT_LOB_LOCATOR, leaves the accumulator untouched in Form B (so SecureFile's write-then-trim ordering no longer destroys data), and adds an unmerged-accumulator return from MergeLOBsIntoDMLEvents plus synthetic UPDATE generation for the no-DML-UPDATE case. The Pass 3 schema/table fallback is correctly gated on a single candidate to avoid silent row collisions. Integration coverage exercises all three storage variants.

One minor issue (inline): the new doc block describing Form B in logminer.go still claims "we just clear accumulated fragments" / "clearing fragments is correct" when N=0, which contradicts the actual implementation and the inline comment a few lines below — see logminer.go#L319-L327.

@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Commits

  1. Commit 8e55d11 (oracledb_cdc: Add LOB_TRIM failing test) and commit 8476c42 (oracledb_cdc: Fix LOB_TRIM handling and LOB merge for all storage var…) start the message portion with an uppercase letter. Per the commit policy, after system: the message must start lowercase and use imperative mood — add LOB_TRIM failing test and fix LOB_TRIM handling … would conform.
  2. Commit 8476c42's subject line is truncated mid-word (…var…) with the remainder (…iants…) spilling into the body. The first line of the body is a continuation of the subject rather than a standalone paragraph. The subject line should be a self-contained sentence under the conventional length limit, with the explanatory paragraph starting fresh in the body.

Review

The change cleanly separates LOB_TRIM handling from SELECT_LOB_LOCATOR, documents the three Oracle LOB storage variants thoroughly, and the integration tests cover each variant end-to-end. The Pass 3 schema/table fallback is appropriately conservative (only merging on a single unambiguous candidate). The synthesized UPDATE flow is well-documented.

LGTM

…iants

SecureFile (Oracle Free 23c default): emits LOB_WRITE(s) → LOB_TRIM(N). The trim
finalises the value after fragments are written, so clearing fragments on LOB_TRIM was
the bug. May emit no DML UPDATE — in that case a synthetic UPDATE is synthesized from the
assembled accumulator.

BASICFILE: emits LOB_TRIM(0) → LOB_WRITE(s) (clear-then-write). The DML UPDATE is always
present; fragments arrive after the trim, so LOB_TRIM(0) must be a no-op against the
fragment list.

BASICFILE out-of-row (DISABLE STORAGE IN ROW): no SELECT_LOB_LOCATOR is emitted. The
inferLOBLocator path creates the accumulator from an existing DML event rather than from
the locator SQL, and the DML UPDATE is always present for PK matching.
Pass 3 of MergeLOBsIntoDMLEvents previously took the most-recent matching event unconditionally when no PK columns were available in the WHERE clause
(ROWID-only). If two LOB-only UPDATEs for different rows in the same transaction both reached Pass 3, the second accumulator's value would silently overwrite the
first on the same target event.

Pass 3 now collects all candidate events for the schema/table first. It only merges when exactly one candidate exists — the unambiguous case. When multiple
candidates exist, it warns and leaves the accumulator unmerged so the caller synthesizes a separate UPDATE per row instead. The comment previously described
"exactly one candidate" behaviour that the code did not implement; they now agree.
@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Commits

  1. Commits 1 and 2 capitalize the first word of the message after the system prefix (Add, Fix). Per the commit policy, message must start lowercase in imperative mood — should be oracledb_cdc: add LOB_TRIM failing test and oracledb_cdc: fix LOB_TRIM handling and LOB merge for all storage variants.
  2. Commit 2's subject line ends with var… (an ellipsis character), leaving the headline truncated/incomplete. Subject lines should be self-contained — the word variants (or similar) is missing.
  3. Granularity: commit 1 adds a failing integration test and commit 2 contains the fix. Splitting the failing test from its fix is not a self-contained logical change — they would typically be squashed together (the integration test is gated by integration.CheckSkip, so CI is not actually broken, but the split still violates the "one self-contained logical change" rule).

Review
LGTM.

@josephwoodward josephwoodward merged commit 6a606d2 into main May 19, 2026
8 checks passed
@josephwoodward josephwoodward deleted the jw/oracledblobtrim branch May 19, 2026 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants