fix(connectors): replace overloaded InvalidRecord with distinct error variants #3194

Open
atharvalade wants to merge 5 commits into apache:master from atharvalade:fix/distinct-iceberg-sink-error-variants

@atharvalade
Contributor

@atharvalade atharvalade commented Apr 29, 2026

Which issue does this PR close?

Closes #3176

Rationale

Error::InvalidRecord was used for five unrelated failure modes in the Iceberg sink's write_data function, making it impossible for callers to distinguish schema mismatches from I/O failures from catalog outages.

What changed?

The Iceberg sink mapped Arrow schema conversion errors, Parquet write failures, and Iceberg catalog transaction failures all to Error::InvalidRecord. Callers could not programmatically decide whether to fix a table definition, skip a corrupt message, or retry a catalog outage.

Three new SDK error variants — SchemaMismatch(String), WriteFailure(String), CatalogError(String) — replace the overloaded InvalidRecord at the appropriate call sites. InvalidRecord is preserved only for the genuine record-batch deserialization error.
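
To illustrate why distinct variants matter to callers, here is a minimal sketch of how a consumer of the sink might branch on them. It is not the SDK's actual definition: the enum below is a cut-down stand-in, `InvalidRecord` is assumed to carry no payload, only the `Write failure: {0}` message is taken from the diff in this PR (the other messages are guesses), `Display` is written by hand so the sketch builds without external crates, and `Action` and `decide` are hypothetical names invented for the example.

```rust
use std::fmt;

/// Cut-down stand-in for the SDK error enum (assumption: the real enum has
/// more variants; only the ones discussed in this PR are modelled here).
#[derive(Debug, Clone, PartialEq)]
pub enum Error {
    /// Kept for the record-batch deserialization failure only.
    InvalidRecord,
    SchemaMismatch(String),
    WriteFailure(String),
    CatalogError(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Message texts other than "Write failure" are assumptions.
            Error::InvalidRecord => write!(f, "Invalid record"),
            Error::SchemaMismatch(msg) => write!(f, "Schema mismatch: {msg}"),
            Error::WriteFailure(msg) => write!(f, "Write failure: {msg}"),
            Error::CatalogError(msg) => write!(f, "Catalog error: {msg}"),
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical caller-side policy, mirroring the choices named in the
/// PR description.
#[derive(Debug, PartialEq)]
pub enum Action {
    SkipMessage,
    FixTableDefinition,
    ReportIoFailure,
    /// Whether a retry is safe depends on which catalog step failed.
    MaybeRetry,
}

/// Maps each error variant to the action a caller might take.
pub fn decide(err: &Error) -> Action {
    match err {
        Error::InvalidRecord => Action::SkipMessage,
        Error::SchemaMismatch(_) => Action::FixTableDefinition,
        Error::WriteFailure(_) => Action::ReportIoFailure,
        Error::CatalogError(_) => Action::MaybeRetry,
    }
}
```

With a single overloaded `InvalidRecord`, `decide` would have only one arm and no way to tell these cases apart.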

Local Execution

  • Passed
  • Pre-commit hooks ran

AI Usage

  • Opus 4.6
  • Writing comments, writing PR Description
  • Yes

@codecov

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.49%. Comparing base (93227c1) to head (e3def27).
⚠️ Report is 30 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...re/connectors/sinks/iceberg_sink/src/router/mod.rs | 0.00% | 5 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (50.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@              Coverage Diff              @@
##             master    #3194       +/-   ##
=============================================
- Coverage     74.10%   58.49%   -15.62%     
  Complexity      943      943               
=============================================
  Files          1164     1163        -1     
  Lines        103048    92923    -10125     
  Branches      80081    69957    -10124     
=============================================
- Hits          76368    54351    -22017     
- Misses        23995    35971    +11976     
+ Partials       2685     2601       -84     

| Components | Coverage Δ |
|---|---|
| Rust Core | 54.49% <0.00%> (-20.83%) ⬇️ |
| Java SDK | 60.14% <ø> (ø) |
| C# SDK | 69.38% <ø> (-0.06%) ⬇️ |
| Python SDK | 81.43% <ø> (ø) |
| Node SDK | 91.40% <ø> (-0.13%) ⬇️ |
| Go SDK | 39.60% <ø> (ø) |

| Files with missing lines | Coverage Δ |
|---|---|
| core/connectors/sdk/src/lib.rs | 56.17% <ø> (ø) |
| ...re/connectors/sinks/iceberg_sink/src/router/mod.rs | 39.23% <0.00%> (ø) |

... and 248 files with indirect coverage changes

Comment thread core/connectors/sdk/src/lib.rs Outdated
#[error("Write failure: {0}")]
WriteFailure(String),
/// A catalog or transaction-level failure (e.g. applying or committing an
/// Iceberg transaction). Callers may retry on transient catalog outages.
Contributor

The doc comment "callers may retry on transient catalog outages" is misleading. action.apply() at router/mod.rs:213 is in-memory transaction prep, so its failures are deterministic (invalid partition spec, schema validation) and retrying cannot help. Only tx.commit(catalog) at router/mod.rs:222 hits the network. Suggest dropping the retry claim, or splitting into ApplyError (deterministic) vs CommitError (transient-eligible).
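
One way the suggested split could look, as a rough sketch only: the variant names `ApplyError` and `CommitError` come from the comment above, while the `CatalogFailure` enum, the `is_retry_eligible` method, and the `retry_commit` helper are hypothetical names introduced for illustration and are not part of the SDK or of this PR.

```rust
/// Hypothetical split of the catalog-level error into two cases.
#[derive(Debug, Clone, PartialEq)]
pub enum CatalogFailure {
    /// In-memory transaction preparation failed; deterministic, so a retry
    /// with the same input would fail the same way.
    ApplyError(String),
    /// Committing to the catalog failed; this step touches the network, so
    /// the failure may be transient.
    CommitError(String),
}

impl CatalogFailure {
    /// Only commit failures are candidates for a retry.
    pub fn is_retry_eligible(&self) -> bool {
        matches!(self, CatalogFailure::CommitError(_))
    }
}

/// Runs `op` at least once and at most `max_attempts` times, retrying only
/// while the returned error is retry-eligible. No backoff is applied here;
/// a real caller would likely sleep between attempts.
pub fn retry_commit<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, CatalogFailure>,
) -> Result<T, CatalogFailure> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Transient-eligible failure with attempts left: try again.
            Err(e) if e.is_retry_eligible() && attempt < max_attempts => attempt += 1,
            // Deterministic failure, or attempts exhausted: give up.
            Err(e) => return Err(e),
        }
    }
}
```

With this shape, an `ApplyError` is returned to the caller after the first attempt, while a `CommitError` is retried until the attempt budget runs out.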

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR.

Thank you for your contribution!

@github-actions github-actions Bot added stale Inactive issue or pull request and removed stale Inactive issue or pull request labels May 13, 2026
@hubcio
Contributor

hubcio commented May 14, 2026

/author

@github-actions github-actions Bot added the S-waiting-on-author PR is waiting on author response label May 14, 2026

Labels

S-waiting-on-author PR is waiting on author response

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Error::InvalidRecord is overloaded across unrelated failure modes

3 participants