feat(contrib): introduce contrib extension SPI #4339

Draft

schenksj wants to merge 27 commits into apache:main from schenksj:comet-contrib-spi

Conversation


@schenksj schenksj commented May 15, 2026

feat(contrib): introduce contrib extension SPI

Branch: comet-contrib-spi
Base: main
Commits: 18 (one architectural cluster + four review-fix rounds + a doc-completeness pass + the symmetric-distribution refactor)

Summary

Adds the infrastructure for contrib extensions — self-contained modules that ship as
part of Comet's release but plug in via a stable SPI rather than being hard-wired into
core. Core's behaviour is functionally unchanged: with no contrib feature/profile
enabled, the dispatch hooks are no-ops and registered_contrib_kinds() is empty.

The first concrete contrib (contrib/example/) is a worked reference: Scala extension
classes, a CometScanRuleExtension implementation, a Rust rlib registering a
ContribOperatorPlanner, a META-INF/services/ entry, its own .proto schema with
prost-build wiring, and unit tests covering registration, proto decode round-trip, and
the error path. New contrib authors copy this directory layout.

This is the first of a two-PR sequence. PR2 (a follow-up off main after this
lands) will port the existing draft PR for delta-kernel-rs integration onto this SPI and into
contrib/delta/. The SPI shape has been validated end-to-end against a real Delta port
on a separate branch — see the testing section below.

Distribution model

Both halves of a contrib are bundled into Comet's released artifacts at build time when
their matching flags are enabled. Contribs are not independently distributable — they
ship inside Comet's release.

  • JVM half — Scala/Java sources under contrib/<name>/src/main/scala/, compiled
    INTO comet-spark.jar by activating -Pcontrib-<name> on spark/pom.xml. The
    contrib's META-INF/services/ entries go along for the ride; ServiceLoader at
    runtime discovers them from inside comet-spark.jar itself. The contrib has a tiny
    <packaging>pom</packaging> Maven pom that exists solely to enumerate external
    deps (e.g., a Delta contrib's pom would carry <dependency>io.delta:delta-spark</dependency>).
  • Native half — a Rust rlib crate (NOT cdylib) linked INTO libcomet via the
    matching --features contrib-<name> Cargo flag on native/core. The contrib's
    #[ctor] registers its operator planners during library load.
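
For orientation, the native half's registration hook looks roughly like the sketch below. This is a minimal sketch assuming the SPI names described in this PR (register_contrib_planner, ContribOperatorPlanner, ContribError::BadPayload); exact signatures in the committed code may differ.

    use std::sync::Arc;

    use comet_contrib_spi::{ContribError, ContribOperatorPlanner, ContribPlannerContext};
    use datafusion::physical_plan::ExecutionPlan;

    // Stand-in planner; the example contrib's real NoOpPlanner returns a
    // sentinel error so the JVM -> JNI -> native -> contrib chain is testable.
    struct NoOpPlanner;

    impl ContribOperatorPlanner for NoOpPlanner {
        fn plan(
            &self,
            _ctx: &dyn ContribPlannerContext,
            _payload: &[u8],
            _children: Vec<Arc<dyn ExecutionPlan>>,
        ) -> Result<Arc<dyn ExecutionPlan>, ContribError> {
            Err(ContribError::BadPayload("example-no-op: sentinel".into()))
        }
    }

    // Runs when libcomet is loaded, before any query is planned. The kind
    // string must match what the contrib's JVM serde writes into ContribOp.
    #[ctor::ctor]
    fn register_example_contrib() {
        comet_contrib_spi::register_contrib_planner("example-no-op", Arc::new(NoOpPlanner));
    }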

mvn install -Pcontrib-example && cargo build --features contrib-example produces a
Comet build that includes the example contrib in both comet-spark.jar and libcomet.
A vanilla mvn install && cargo build produces a build with zero contrib surface.

The wire format between JVM and native uses a single generic envelope on the operator
proto, ContribOp { kind, payload }. Core's planner dispatches by kind; the contrib's
native crate registers planners against the same kind string the contrib's JVM code
writes into the proto.
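
For reference, the prost-generated Rust side of that envelope looks roughly like this (a sketch, not the generated file; field tags 1 and 2 are inferred from the proto's reserved block):

    // Approximate shape of the generated ContribOp message (tags 1 and 2
    // assumed; the proto's `reserved 3 to 9` has no Rust representation).
    #[derive(Clone, PartialEq, ::prost::Message)]
    pub struct ContribOp {
        /// Registry key -- the same string the contrib's native #[ctor] registered.
        #[prost(string, tag = "1")]
        pub kind: String,
        /// Contrib-private bytes; core never inspects them. The contrib decodes
        /// them on the native side with its own prost::Message type.
        #[prost(bytes = "vec", tag = "2")]
        pub payload: Vec<u8>,
    }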

Architecture

                       JVM                                Native (single libcomet)

  CometSparkSessionExtensions.apply()         #[cfg(feature = "contrib-example")]
    │                                         extern crate comet_contrib_example;
    ├─ injectQueryStagePrepRule(                            │
    │    CometScanRule._apply                               ▼
    │      ├─ CometExtensionRegistry.load()    contrib's #[ctor] runs at lib load
    │      │    (lazy, first invocation)                    │
    │      │    discovers META-INF/services                 ▼
    │      │    inside comet-spark.jar         comet_contrib_spi::registry (ArcSwap)
    │      │                                   { "delta-scan"           → DeltaPlanner,
    │      ├─ preTransform fold pass             "example-no-op"        → NoOpPlanner,
    │      │   (V1 only, gated on                "example-constant-scan"→ ConstantScanPlanner }
    │      │   COMET_NATIVE_SCAN_ENABLED)                   │
    │      │                                                │
    │      └─ per-scan dispatch:               core's `OpStruct::ContribOp` arm:
    │           iterate registered exts          lookup_contrib_planner_by_kind(co.kind)
    │           first match wins                 .plan(ctx, co.payload, children)
    │                                                       │
    └─ injectQueryStagePrepRule(                ctx: ContribPlannerContext, gives the
         CometExecRule._apply                     contrib core's planner services:
           ├─ CometExtensionRegistry.load()       - session_ctx
           │    (lazy mirror of above)            - build_physical_expr
           │                                      - convert_spark_schema
           └─ (allExecs ∪                         - prepare_object_store
               CometExtensionRegistry              - build_parquet_datasource_exec
                 .mergedSerdes)
                .get(op.getClass)
       )
                       ───── wire format ─────
                           OpStruct.contrib_op {
                             kind:    "delta-scan",
                             payload: <contrib-private bytes,
                                       decoded by the contrib's own prost::Message>,
                             reserved 3 to 9;
                           }

What's in this PR (18 commits)

Foundational SPI (5 commits)

Commit What
51eb0fff ContribOp { kind, payload } proto envelope; Rust planner registry SPI
f448693b Native OpStruct::ContribOp dispatcher arm in planner.rs
f23500df Scala SPI: CometScanRuleExtension, CometOperatorSerdeExtension, CometExtensionRegistry
42234b96 CometScanRule (V1 + V2) and CometExecRule consult the registry
8b694715 CometSparkSessionExtensions.apply hooks the registry

Worked-reference contrib (3 commits)

Commit What
d1553b55 Rust half of contrib/example/: rlib crate, #[ctor] registration; introduced native/contrib-spi/ leaf crate to break a cyclic dep
5cb7099a JVM half of contrib/example/: Scala extension, ServiceLoader entry, integration test
8508ec50 First version of the contributor guide

SPI shape refinements (2 commits)

Commit What
e018076d Refinements from a Delta-port confidence check: preTransform tree-level hook on CometScanRuleExtension, proto layer in contrib/example/, class-keyed dispatch convention documented
14e49448 ContribPlannerContext trait + ParquetDatasourceParams argument bundle (SPI gap #4 — see "Delta-port confidence check" below)

Review-fix rounds (4 commits)

Commit What
8930b698 First review pass: test isolation, V2-asymmetry doc, #[non_exhaustive] markers, preTransform corruption guard, #[ctor] panic safety, drop contrib-example from default features, gate registry load lazily, cache mergedSerdes, multi-extension dispatch semantics, prepare_object_store returns Path, gate preTransform on COMET_NATIVE_SCAN_ENABLED, 16 MiB payload cap, "none discovered" diagnostic
68fff43f Second pass: CometExecRule self-loads, stale docstring, dead doc refs, Display wildcard fix, corruption-guard rewrite (identity check, class-changing replacements caught), ContribOp size guard ordering, encryption-asymmetry positional-arg test, owned String for session_timezone, production-canary #[cfg] test
e4e6e6c6 Third pass: IdentityHashMap survivors set, synchronized load() publication order, empty/whitespace kind rejection, doc accuracy, scope notes
6652963c Fourth pass: identity survivors + cost comment, resetForTesting synchronized, doc trims, whitespace kind rejection, #[cfg(not(any(...)))] form, public resetForTesting, wildcard-arm comment, Display debug repr, ContribOp proto reserved block, payload cap doc

Contributor guide completeness (2 commits)

Commit What
91c40e0a Comprehensive rewrite: prerequisites, JVM-side proto guidance, full plan() body walkthrough using every ContribPlannerContext method, CometOperatorSerde[T] contract, diagnostics story, multi-Spark-version, end-to-end testing recipe, Cargo-canary maintenance note
2c46552c Second-pass review fixes

Architectural pivot to symmetric distribution (2 commits)

Commit What
cf5253ed Bundle JVM half INTO comet-spark.jar via Maven profile. Mirrors the native side's "Cargo feature pulls rlib into libcomet" model. No more separate contrib JARs. ~70 lines of protobuf-shading boilerplate deleted from the contributor guide (shading now handled automatically by comet-spark's existing shade execution).
c7656fcc Deps-only pom per contrib so contribs like Delta can pull in external Maven deps (e.g. delta-spark) without recreating a Maven reactor cycle. Registry primitive: RwLock<HashMap> → ArcSwap<HashMap> for lock-free reads on the dispatch hot path.

Notable design decisions

Why bundle into one artifact?

Previously the contrib's JVM half shipped as a separate Maven JAR that users would
add to their classpath via --packages or --jars. That asymmetry made no sense given the
native side already requires a Comet rebuild (Cargo feature flag) for the contrib to
work — pretending the JVM half was distributable independently was a fiction. The
symmetric design has one artifact per side, both varying based on which contribs were
enabled at Comet build time. This eliminated the protobuf-shading recipe that
externally-published contrib JARs needed (which was the single biggest source of doc
complexity).

Why a separate comet-contrib-spi Rust crate?

Cycle break. Core would need to depend on contribs (to link them); contribs need core's
trait types (ContribOperatorPlanner). Solution: a leaf crate both depend on, with
nothing depending back on core from a contrib.

Why ContribPlannerContext?

Surfaced by the Delta-port confidence check. A real file-scan contrib needs five
core-side facilities: init_datasource_exec, prepare_object_store_with_configs,
convert_spark_types_to_arrow_schema, expression-planning (create_expr), and a
SessionContext handle. Exposing these as a trait core implements (and contribs use
via &dyn ContribPlannerContext) avoids a back-dep on core while giving contribs
everything they need to compose with Comet's tuned parquet path.
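
A sketch of the trait's shape, with assumed signatures (only the five method names come from this PR; ContribError and ParquetDatasourceParams are the SPI's own types, and the import paths are illustrative):

    use std::sync::Arc;

    use datafusion::arrow::datatypes::SchemaRef;
    use datafusion::execution::context::SessionContext;
    use datafusion::execution::object_store::ObjectStoreUrl;
    use datafusion::physical_plan::{ExecutionPlan, PhysicalExpr};

    // Illustrative shape only -- the real trait lives in comet-contrib-spi.
    pub trait ContribPlannerContext {
        /// Handle to core's shared DataFusion session.
        fn session_ctx(&self) -> Arc<SessionContext>;
        /// Plan a serialized Catalyst expression into a DataFusion PhysicalExpr.
        fn build_physical_expr(
            &self,
            expr_proto: &[u8],
            input_schema: &SchemaRef,
        ) -> Result<Arc<dyn PhysicalExpr>, ContribError>;
        /// Convert a Spark proto schema into an Arrow schema.
        fn convert_spark_schema(&self, schema_proto: &[u8]) -> Result<SchemaRef, ContribError>;
        /// Resolve the object store for a table URL (returns store URL + path).
        fn prepare_object_store(
            &self,
            url: &str,
        ) -> Result<(ObjectStoreUrl, object_store::path::Path), ContribError>;
        /// Build Comet's tuned parquet DataSourceExec from a params bundle.
        fn build_parquet_datasource_exec(
            &self,
            params: ParquetDatasourceParams,
        ) -> Result<Arc<dyn ExecutionPlan>, ContribError>;
    }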

Why ArcSwap instead of RwLock for the registry?

Reads are on the dispatch hot path; writes happen exclusively during library init from
sequential #[ctor]s. The init-once / read-many access pattern is what ArcSwap is
designed for. The original RwLock<HashMap> would have introduced reader-writer
contention with no actual concurrent-write workload to justify it.
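
The access pattern in miniature, as a self-contained sketch against the real arc-swap API (the actual registry stores Arc<dyn ContribOperatorPlanner> values, not the &'static str stand-ins used here):

    use std::collections::HashMap;
    use std::sync::{Arc, LazyLock}; // LazyLock is std since Rust 1.80;
                                    // once_cell::sync::Lazy works on older toolchains
    use arc_swap::ArcSwap;

    static REGISTRY: LazyLock<ArcSwap<HashMap<String, &'static str>>> =
        LazyLock::new(|| ArcSwap::from_pointee(HashMap::new()));

    // Write path: only runs from sequential #[ctor]s during library init,
    // so the clone-and-swap cost is irrelevant.
    fn register(kind: &str, planner: &'static str) {
        REGISTRY.rcu(|old| {
            let mut next = (**old).clone();
            next.insert(kind.to_string(), planner);
            next
        });
    }

    // Read path (dispatch hot path): one atomic load, no lock, no writer contention.
    fn lookup(kind: &str) -> Option<&'static str> {
        REGISTRY.load().get(kind).copied()
    }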

Why bundling-via-source-injection rather than bundling-via-shaded-JAR?

A separate Maven module per contrib whose JAR gets shaded into comet-spark.jar
would form a Maven reactor cycle (contrib's pom depends on comet-spark for SPI
types; comet-spark's contrib profile depends on contrib's JAR). Source-injection
avoids the cycle: the contrib's Scala sources are compiled INSIDE comet-spark's
own compilation pass (via build-helper-maven-plugin's add-source goal); no
separate compile, no per-contrib JAR, no cycle. External Maven deps (delta-spark,
etc.) flow through the contrib's separate <packaging>pom</packaging> artifact.

The Delta-port confidence check

Before opening this PR, I ported Comet's existing Delta integration (~3,200 lines on
the delta-kernel-phase-1 branch) onto this SPI as a confidence check. The port
itself is not committed here — its purpose was to surface SPI gaps before review.

Four gaps were found, all addressed in this PR:

  1. No tree-level pre-pass hook → added CometScanRuleExtension.preTransform.
  2. No reference for the proto layer in contrib/example/ → added a trivial
    ExampleConstantScan message, build.rs, prost-build wiring, and tests.
  3. Class-keyed serde dispatch convention undocumented → documented in the
    contributor guide.
  4. ContribOperatorPlanner::plan lacked access to core's parquet / expression
    machinery
    → introduced ContribPlannerContext trait + ParquetDatasourceParams
    bundle in commit 14e49448. The full ~150-line Delta dispatcher body compiled
    clean against the new SPI surface; every trait method was exercised end-to-end.

Full findings live in PR1-delta-port-findings.md (not committed; review-prep
artifact).

Net conclusion: the SPI is the right shape for a real consumer. PR2's Delta port can
proceed mechanically with no further SPI surprises expected.

Review iterations

The branch went through four independent clean-context code review passes (general-purpose
subagent reviews launched fresh on each iteration's HEAD). Each pass surfaced a different
class of issue:

  • Pass 1 (review of 14e49448, fixed by 8930b698): 6 blockers + 10 important +
    nits. Test isolation, SemVer markers, V2 asymmetry, #[ctor] panic safety, default
    feature leakage, ServiceLoader gating, more.
  • Pass 2 (review of 8930b698, fixed by 68fff43f): 3 regressions + 4 polish
    items + 10 new findings. Stale class docstring, dead doc refs, Display
    info-loss, corruption-guard class-change bypass.
  • Pass 3 (review of 68fff43f, fixed by e4e6e6c6): 2 regressions + 8 polish.
    Survivors-set false positive, load() publication race.
  • Pass 4 (review of e4e6e6c6, fixed by 6652963c): no blockers found; 6 polish.
    Cost-comment accuracy, identity-map upgrade, payload-guard message ordering.

A separate completeness validation of the contributor guide (the doc-only audit pass)
identified ~13 gaps a real contrib author would hit. Closed in 91c40e0a and
2c46552c.

Build verification

  • cargo check (default features): green.
  • cargo check --no-default-features: green; zero contrib surface in the resulting
    libcomet.
  • cargo test -p comet-contrib-spi: 7 tests pass (registry round-trip, scoped
    registration, kinds snapshot, params constructor/setters, ContribError Display
    preservation).
  • cargo test -p datafusion-comet --lib -- execution::planner::contrib: 5 tests pass,
    including the production-canary that asserts default cdylib has no registered
    contribs and the encryption-asymmetry test that catches positional-arg swaps in
    init_datasource_exec.
  • cargo test -p comet-contrib-example: 4 tests pass (ctor registration, decode +
    build, zero rows, bad payload).
  • Maven -Pcontrib-example activates the profile cleanly; comet-contrib-example-deps
    builds first, then comet-spark with the contrib's sources injected. No reactor
    cycle.

What this PR is NOT

  • It does NOT migrate Delta / kernel-rs to the new SPI. That's PR2.
  • It does NOT exercise CometOperatorSerdeExtension from the example contrib (the
    example only demonstrates CometScanRuleExtension; the trait surface is documented
    and validated against the Delta-port confidence check).
  • The Maven BanDuplicateClasses enforcer is no longer overridden per-contrib
    (contribs are no longer separate Maven modules); the rule applies to comet-spark
    itself as before.

How to review

Suggested reading order:

  1. docs/source/contributor-guide/contrib-extensions.md — author-facing guide; doubles
    as the architectural overview.
  2. native/proto/src/proto/operator.proto — the ContribOp envelope (small, look for
    the new variant on OpStruct).
  3. native/contrib-spi/src/lib.rs — the leaf SPI crate (~370 lines incl. tests).
  4. spark/src/main/scala/org/apache/comet/spi/ — three small files defining the JVM
    SPI.
  5. native/core/src/execution/planner.rs — the OpStruct::ContribOp dispatcher arm
    (~lines 1960–2020).
  6. native/core/src/execution/planner/contrib.rs — the CorePlannerContext adapter that
    exposes core's parquet/expression infrastructure to contribs through the SPI trait.
  7. spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala and
    CometExecRule.scala — the integration hooks. Each is a small insertion at the
    top of _apply; the preTransform fold runs once per plan.
  8. contrib/example/ — the worked reference (deps-pom + Scala source + Cargo crate).
  9. spark/pom.xml's contrib-example profile — the template for wiring a new
    contrib into the build.

Risks / follow-ups (tracked for PR2)

  • CometOperatorSerdeExtension not yet exercised by a contrib. The example
    contrib only demonstrates CometScanRuleExtension. PR2's Delta port will exercise
    the operator serde path via CometDeltaNativeScanExec's dedicated class.
  • Native test runner needs libjvm on the dyld path. Running
    cargo test -p datafusion-comet --lib on macOS requires
    DYLD_LIBRARY_PATH=$JAVA_HOME/lib/server (only relevant when the test binary
    transitively links against the JNI crate). Preexisting on main — not introduced
    by this PR — but worth documenting.
  • CI matrix should add a -Pcontrib-example + --features contrib-example row so the
    bundling path is exercised in CI on every PR. Today only the slim build is in CI.

Implemented with Claude Code (Opus 4.7, 1M context)

schenksj and others added 19 commits May 13, 2026 22:15
First two pieces of the contrib extension SPI (PR1.1 + PR1.2 from
docs/contrib-delta-migration-plan.md). No consumers yet -- the
dispatcher arm, Scala SPI, and integration hooks land in subsequent
commits on this branch.

Proto envelope (operator.proto):
  - New ContribOp { kind: string, payload: bytes } message added as
    variant 117 on the OpStruct oneof. Contrib operators travel
    through this envelope so core's proto stays stable when contribs
    ship and evolve independently.

Rust SPI (planner/contrib.rs):
  - register_contrib_planner(kind, planner: Arc<dyn OperatorBuilder>)
    -- intended to be called from a contrib crate's #[ctor] at lib
    init time. Last-write-wins on duplicate kinds (test re-registration
    convenience; production contribs only register once).
  - lookup_contrib_planner_by_kind(kind) -> Option<Arc<dyn OperatorBuilder>>
    -- read path the dispatcher (PR1.3) will use.
  - registered_contrib_kinds() -> Vec<String> -- diagnostics.
  - 2 unit tests covering registration round-trip + duplicate-kind
    replacement; both pass.

Exhaustive-match accommodations (operator_registry.rs, jni_api.rs):
  - operator_registry::get_operator_type returns None for ContribOp;
    PR1.3 will add the dispatcher arm in planner.rs that bypasses
    this lookup and goes through the contrib registry instead.
  - jni_api::op_name returns "ContribOp" for the new variant
    (informational tracing label).

The reused trait is core's existing OperatorBuilder rather than a new
ContribOperatorPlanner trait -- their signatures are identical and
duplicating would force contribs to maintain a parallel definition.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR1.3 from docs/contrib-delta-migration-plan.md. The Rust planner now
dispatches OpStruct::ContribOp through the registry added in PR1.2 by
calling lookup_contrib_planner_by_kind(kind) and delegating to the
returned OperatorBuilder. When no planner is registered for the kind,
surfaces a clear ExecutionError that names the missing Cargo feature
-- this is the typical case when the contrib's JVM JAR is on the
classpath but core was built without the matching `contrib-<name>`
feature.

No behaviour change for any existing operator. Contribs activate once
their rlib is linked into core's cdylib via the Cargo feature and
their #[ctor] runs at lib-init time. PR1.7 (contrib/example/) will
exercise this end-to-end with the first concrete contrib.
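
The arm's shape, sketched as a fragment (illustrative error text, and the delegation call is shown as plan(...) for illustration; the committed code in planner.rs differs in detail):

    // Sketch of the new arm inside the planner's OpStruct match.
    OpStruct::ContribOp(co) => {
        let planner = lookup_contrib_planner_by_kind(&co.kind).ok_or_else(|| {
            ExecutionError::GeneralError(format!(
                "no contrib planner registered for ContribOp kind {:?}; \
                 was native core built with `--features contrib-<name>`?",
                co.kind
            ))
        })?;
        planner.plan(&co.payload, children)?
    }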

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR1.4 from docs/contrib-delta-migration-plan.md. Three new files under
spark/src/main/scala/org/apache/comet/spi/:

  CometScanRuleExtension.scala
    Trait that contrib JARs implement to intercept scan transformation
    in CometScanRule. Exposes both matchesV1/transformV1 (FileSourceScanExec)
    and matchesV2/transformV2 (BatchScanExec) overrides; both default to
    "doesn't match", letting contribs claim only the scan flavour they
    care about.

  CometOperatorSerdeExtension.scala
    Trait that contribs implement to contribute additional
    SparkPlan-class to CometOperatorSerde mappings. CometExecRule (PR1.5)
    merges these with its built-in allExecs.

  CometExtensionRegistry.scala
    ServiceLoader-backed process-wide singleton. Idempotent `load()` that
    discovers contrib JARs on the classpath via standard
    META-INF/services entries. Failures to instantiate individual
    extensions are logged but never fatal -- one broken contrib JAR
    doesn't take down the Spark session. Test-only resetForTesting() hook.

No callers yet; PR1.5 wires CometScanRule and CometExecRule to consult
the registry, and PR1.6 wires CometSparkSessionExtensions to call
load() during installation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR1.5 from docs/contrib-delta-migration-plan.md. Three integration
hooks added; all are no-ops until contribs are present on the classpath
and PR1.6 calls CometExtensionRegistry.load() during extension install.

CometScanRule.transformV1Scan
  After the Spark 3.4 AQE-DPP gate, iterate
  CometExtensionRegistry.scanExtensions. First extension whose
  `matchesV1` returns true gets `transformV1` called. Some result
  replaces the scan; None falls through to core's existing
  file-format dispatch.

CometScanRule.transformV2Scan
  Same shape for BatchScanExec via matchesV2/transformV2.

CometExecRule.transform.convertNode (operator dispatch)
  When a non-shuffle, non-broadcast operator has all-native children,
  the lookup now consults `(allExecs ++ contribSerdes)` where
  `contribSerdes` is the union of every registered
  CometOperatorSerdeExtension's `serdes` map. Contrib operator classes
  (e.g. a future Delta-contrib's CometDeltaNativeScanExec) get matched
  here without core having to know about them.

Iteration order is registration order (i.e. ServiceLoader discovery
order, which is classpath-stable per JVM run). Contribs that need
priority should be the first META-INF/services entry on the
classpath; in practice contribs claim disjoint scan types so the
ordering rarely matters.

No regression risk: with no extensions loaded (the state on this
branch and on main today), every hook short-circuits in O(1) and
falls through to the existing code path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR1.6 from docs/contrib-delta-migration-plan.md. Adds a single call to
CometExtensionRegistry.load() at the top of
CometSparkSessionExtensions.apply, before any of Comet's rules are
registered. Discovery happens once per JVM (idempotent), so subsequent
SparkSession installs are no-ops.

With no contrib JARs on the classpath the call discovers nothing and
returns; with contribs present, their META-INF/services entries are
enumerated and the registered extensions become visible to
CometScanRule (PR1.5) and CometExecRule (PR1.5).

Closes the JVM half of the contrib SPI: every PR1 piece for the JVM
side is now in place. Remaining PR1 deliverables are the
contrib/example/ minimum example (PR1.7) and the contributor guide
(PR1.8).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Substantive piece of PR1.7. Two structural changes break what was
otherwise going to be a cyclic dependency between core and contribs:

1. New crate native/contrib-spi/ (`comet-contrib-spi`):
   - Defines the `ContribOperatorPlanner` trait that contribs implement.
   - Owns the process-wide registry (`register_contrib_planner`,
     `lookup_contrib_planner_by_kind`, `registered_contrib_kinds`).
   - Light-weight `ContribError` enum for SPI errors. Core converts to
     its own `ExecutionError` at the dispatch site.
   - 1 unit test covering registration round-trip.
   - Only deps: `datafusion` + `log`. No deps on core, no deps from
     core back. The SPI is the leaf both core and contribs depend on.

2. New crate contrib/example/native/ (`comet-contrib-example`):
   - rlib (NOT cdylib) -- linked INTO core's libcomet via the
     `contrib-example` Cargo feature flag on core.
   - Registers a `NoOpPlanner` against kind `"example-no-op"` via
     `#[ctor::ctor]`. The planner returns a sentinel error so tests
     can verify the full JVM->JNI->native->contrib dispatch chain.
   - Depends on `comet-contrib-spi` (NOT on core).
   - Real contribs follow the same shape: rlib + #[ctor] + thin
     dependency on the SPI crate.

Core rewiring:
   - native/core/Cargo.toml: `comet-contrib-spi` mandatory dep;
     `comet-contrib-example` optional dep gated by feature
     `contrib-example` (default-on so released builds ship the
     example registered).
   - native/core/src/lib.rs: `extern crate comet_contrib_example` gated
     by the feature so #[ctor] runs at libcomet load.
   - native/core/src/execution/planner/contrib.rs: now just re-exports
     the SPI surface for backwards-compatible imports within core.
   - native/core/src/execution/planner.rs: ContribOp dispatcher now
     recursively builds native children, calls the SPI's `plan(payload,
     children) -> Arc<dyn ExecutionPlan>`, wraps in `SparkPlan`. Maps
     `ContribError` to `ExecutionError::GeneralError` with a clear
     contrib-identified prefix.

Workspace wiring:
   - native/Cargo.toml: adds `contrib-spi` to default-members so the
     SPI crate is built/checked with the rest of the workspace.
   - Adds `../contrib/example/native` to workspace members (NOT
     default-members) so it shares the workspace lockfile and
     dependency overrides but isn't compiled standalone.

Build state: `cargo check` on all three crates (core, contrib-spi,
contrib-example) is clean. SPI unit test passes. The Maven side of the
example contrib (pom.xml, Scala extension, ServiceLoader entry,
integration test) is NOT in this commit -- it lands in a follow-up
on the same branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR1.7 part 2 (completes PR1.7). The Rust half landed in d1553b5; this
commit lands the Maven module + Scala extension + ServiceLoader entry
so `mvn install` from the repo root produces a published
comet-contrib-example-* JAR alongside core's comet-spark-*.

New files:
  contrib/example/pom.xml
    Maven module. Inherits the parent pom; depends on comet-spark
    (provided scope, transitive Spark/Scala). Disables the parent
    pom's BanDuplicateClasses enforcer for this contrib because the
    parent rule was tuned for core (comet-spark shades
    scala-collection-compat and Spark drags in the same classes
    unshaded -- a per-module override is cleaner than reshaping the
    parent rule for every future contrib).

  contrib/example/src/main/scala/.../ExampleScanRuleExtension.scala
    Trivial CometScanRuleExtension impl. matchesV1 keys on a
    test-only marker option so the SPI can be exercised
    deterministically; matchesV2 / transformV1 / transformV2 inherit
    trait defaults. Real contribs replace these with their own
    file-format probes + native dispatch.

  contrib/example/src/main/resources/META-INF/services/
    org.apache.comet.spi.CometScanRuleExtension
    ServiceLoader manifest entry. This is the single line that makes
    the contrib JVM-discoverable.

  contrib/example/src/test/scala/.../ExampleScanRuleExtensionSuite.scala
    Two tests:
      1. ServiceLoader discovers ExampleScanRuleExtension via
         CometExtensionRegistry.load() with no other configuration.
      2. matchesV1 honours the test marker option.

Root pom.xml:
  Adds `<module>contrib/example</module>` to the modules list so
  `mvn install` from the repo root builds and installs the contrib
  alongside core.

Build state: `mvn install -DskipTests -Pspark-3.5` builds the new
module successfully. Native-side Rust artifact (rlib linked into
libcomet via Cargo feature `contrib-example`) was already committed
in d1553b5. PR1.7 closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR1.8 from docs/contrib-delta-migration-plan.md. New doc at
docs/source/contributor-guide/contrib-extensions.md walks future
contrib authors through:

  * Architecture overview (JVM JAR + native rlib linked into libcomet
    via Cargo feature flag, single cdylib at runtime, ContribOp proto
    envelope as the dispatch hop).
  * The SPI surface on both sides: traits, registry, error type.
  * Required files for a new contrib, mirroring contrib/example/.
  * The three existing-file edits needed (root pom, native workspace,
    core Cargo features).
  * End-to-end wire-format flow.
  * Cargo feature gating semantics (--no-default-features for slim
    builds; the JVM side is always classpath-driven).
  * Testing recommendations modeled on contrib/example/'s suite.
  * Cross-references to the migration plan and the SPI crate.

Closes PR1.8. With this commit, all eight PR1 deliverables from
docs/contrib-delta-migration-plan.md are in place on the
comet-contrib-spi branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three additions surfaced by porting Delta onto the SPI as a local
confidence check (port not committed; see PR1-delta-port-findings.md):

1. CometScanRuleExtension.preTransform tree-level hook

   Default-identity method that runs once per plan in CometScanRule._apply
   before per-scan dispatch. Lets contribs undo wrapper rewrites their own
   Catalyst strategies applied (Delta's PreprocessTableWithDVs is the
   motivating case; its strategy wraps DV-bearing scans in
   Project(Filter(...)) referencing a synthetic column Comet's reader
   can't produce). Without this hook, Delta couldn't move into a contrib
   at all without losing the unwrap step.

   Shared state between preTransform and transformV1 is the contrib's
   problem -- the recommended pattern (documented) is Spark's TreeNodeTag
   mechanism, which the existing CometSpark34AqeDppFallbackRule already
   uses.

2. Proto layer in contrib/example/

   Each contrib now ships its own .proto schema, build.rs running
   prost-build, and gitignored src/generated/. contrib/example/ carries
   a trivial ExampleConstantScan { row_count } message; a new
   ConstantScanPlanner registered under kind="example-constant-scan"
   decodes the payload via prost::Message::decode and returns an
   EmptyExec sized by the field. Three new tests:

   * ctor registers both planners
   * payload decode-and-build round-trip
   * bad payload surfaces ContribError::BadPayload

   This makes the worked reference complete -- future contrib authors
   have a runnable proto setup to copy (decode sketch after this list).

3. Class-subclass convention documented

   CometExecRule dispatches by op.getClass. Documented the convention
   that contribs needing a custom executor should define their own
   CometScanExec subclass (or similar) and register the serde keyed on
   that class, rather than reusing a generic class with a stringly-typed
   scanImpl tag (the legacy Delta pattern that has no analogue in the
   class-based SPI dispatch).
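
The decode step from item 2 above, sketched (prost::Message::decode is the real API; the BadPayload mapping assumes a String-carrying variant):

    use prost::Message;

    // Decode the contrib-private payload carried inside the ContribOp envelope.
    fn decode_constant_scan(payload: &[u8]) -> Result<ExampleConstantScan, ContribError> {
        ExampleConstantScan::decode(payload)
            .map_err(|e| ContribError::BadPayload(format!("ExampleConstantScan: {e}")))
    }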

Files touched:
  spark/src/main/scala/org/apache/comet/spi/CometScanRuleExtension.scala
  spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala
  contrib/example/native/{build.rs, Cargo.toml, src/lib.rs}
  contrib/example/native/src/proto/example_op.proto
  docs/source/contributor-guide/contrib-extensions.md
  .gitignore + native/Cargo.lock

Build state: cargo check across core + contrib-spi + contrib-example
clean. cargo test -p comet-contrib-example: 3/3 pass. cargo test -p
comet-contrib-spi: 1/1 pass. mvn install all modules: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
apache#4)

Extends the contrib SPI so file-scan contribs can build a parquet scan
through core without depending on core. Adds to comet-contrib-spi:

  * ContribPlannerContext trait -- contribs receive a &dyn impl in their
    plan() call. Methods: session_ctx, build_physical_expr (Catalyst Expr
    proto -> PhysicalExpr), convert_spark_schema, prepare_object_store,
    build_parquet_datasource_exec.
  * ParquetDatasourceParams struct -- 15-field argument bundle mirroring
    core's init_datasource_exec one-to-one.
  * ContribOperatorPlanner::plan now takes &dyn ContribPlannerContext as
    its first argument.

Core implements the trait via CorePlannerContext, a thin adapter that
borrows &PhysicalPlanner. Dispatcher constructs one per ContribOp arm.

Updates the example contrib to take and ignore the new ctx param; tests
now use a TestCtx with unimplemented panics for unused trait methods.

Surfaced and validated by attempting to host the full Delta dispatcher
(~150 lines from delta-kernel-phase-1's OpStruct::DeltaScan arm) on the
SPI -- branch contrib-delta-port carries that work. The validation port
compiled clean, linked into core's cdylib, and exercised every trait
method end-to-end (column-mapping rewrites, DV filter wrapping, schema
conversion, expression-planner round-trip, parquet exec construction).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses every finding from the first review:

Blockers
- B1: Test isolation via ScopedContribPlannerRegistration RAII guard +
  _clear_for_test escape hatch (cfg-gated on test or "test-utils" feature).
  Negative-lookup test added.
- B2: preTransform documented as V1-only; transformV2 explicitly does not
  receive a plan-tree reference. Trait + dispatcher docs aligned.
- B3/B4: #[non_exhaustive] on ParquetDatasourceParams and ContribError.
  Constructor (`new`) + `with_*` setters on the params struct so contribs
  don't use struct-literal syntax. WrongChildCount.expected switched from
  &'static str to String.
- B5: CometScanRule preTransform corruption guard -- log warning when an
  extension replaces a FileSourceScanExec whose relation it does not claim.
- B6: Example contrib's #[ctor] wrapped in catch_unwind. Contributor guide
  documents panic semantics, the logger-not-ready issue (use eprintln!),
  cross-platform ctor-order nondeterminism.
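
B6's pattern, sketched (std::panic::catch_unwind and #[ctor::ctor] are real APIs; the SPI call and NoOpPlanner are assumed from this PR's naming):

    use std::panic::catch_unwind;
    use std::sync::Arc;

    #[ctor::ctor]
    fn register_example_contrib() {
        // A panic escaping a #[ctor] aborts library load, so fence it. The
        // global logger is not initialized this early either, hence eprintln!.
        let result = catch_unwind(|| {
            comet_contrib_spi::register_contrib_planner("example-no-op", Arc::new(NoOpPlanner));
        });
        if result.is_err() {
            eprintln!("comet-contrib-example: registration panicked; contrib disabled");
        }
    }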

Important
- I1: contrib-example removed from default features. Production cdylib has
  empty registered_contrib_kinds(). Build docs updated.
- I2: CometExtensionRegistry.load() moved out of CometSparkSessionExtensions.apply
  into a lazy call at the top of CometScanRule._apply (after isCometLoaded).
  Sessions that never enable Comet pay zero ServiceLoader cost.
- I3: CometExtensionRegistry.mergedSerdes pre-computed at load() time;
  CometExecRule now consults it via .get() instead of rebuilding the merged
  map per operator transform. Duplicate-class detector logs a warning when
  two contribs claim the same SparkPlan class.
- I4: Multi-extension dispatch now loops over every matching extension and
  takes the first that returns Some; "matched but declined" continues to the
  next extension before falling back to core. Trait docs updated.
- I5: Unit tests added: ParquetDatasourceParams constructor + setters with
  distinguishable bool tuple, CorePlannerContext smoke test that builds a
  DataSourceExec and verifies schema flow-through, session_ctx Arc identity,
  empty-schema conversion. 7 tests total in contrib-spi, 3 in core.
- I6: prepare_object_store returns (ObjectStoreUrl, object_store::path::Path);
  contribs no longer have to reimplement URL parsing for PartitionedFile.location.
- I7: preTransform fold gated on COMET_NATIVE_SCAN_ENABLED. Disabled-Comet +
  Delta JAR on classpath no longer strips load-bearing Catalyst wrappers.
- I8: Display test for every ContribError variant verifies the dispatcher's
  format!("contrib planner {kind:?}: {e}") preserves variant-discriminating info.
- I9: Dispatcher rejects ContribOp payloads larger than 16 MiB with a
  descriptive error.
- I10: CometExtensionRegistry.load() logs a positive INFO message when no
  extensions are discovered, so users get a signal in deploy modes where the
  context classloader doesn't see the contrib JAR.

Nits
- N2: ConstantScanPlanner log moved from info! to debug!.
- N4: Dead doc link to docs/contrib-delta-migration-plan.md removed.
- N5: ExampleScanRuleExtensionSuite no longer calls SparkSession.stop()
  in finally (that tears down the JVM-wide singleton).
- N8: Trimmed comet-contrib-example crate description.
- N9: operator_registry test asserts ContribOp returns None from get_operator_type.
- N10: row_count=0 covered by an additional unit test in the example contrib.

Open-question documentation (contributor guide)
- Send+Sync asymmetry between ContribOperatorPlanner and ContribPlannerContext.
- SPI is alpha-stable; #[non_exhaustive] markers make additive changes minor.
- &[u8] vs Bytes rationale.
- --no-default-features verification + CI matrix suggestion.
- Thin JAR convention + shading guidance.
- Registry-primitive note (may switch to ArcSwap; API unchanged).
- WrongChildCount.expected convention (free-form phrase).

Verified
- cargo check --no-default-features and cargo check (default features) green.
- cargo test -p comet-contrib-spi -p comet-contrib-example: 10 tests pass.
- cargo test -p datafusion-comet --lib (filtered): 4 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Regressions
- R1: CometExecRule._apply now calls CometExtensionRegistry.load() at the top
  (after isCometLoaded). Previously only CometScanRule.load()ed; rule-injection
  order changes or partial injection would have left mergedSerdes empty and
  silently un-dispatched contribs.
- R2: CometExtensionRegistry class docstring updated to match the lazy-load
  semantics introduced in the first-pass fix.
- R3: Three remaining dead references to docs/contrib-delta-migration-plan.md
  rewritten to point at contrib-extensions.md (native/core/Cargo.toml,
  root pom.xml, PR1-description.md).
- R4: ContribError::Display wildcard arm now emits {self:?} instead of
  "unknown contrib error" so future variants under #[non_exhaustive] keep
  their debug-repr message through the dispatcher's format chain (sketch after this list).
- R5/R6: preTransform corruption guard rewritten to scan-identity check.
  Snapshots every FileSourceScanExec the extension does NOT claim before the
  pass, then verifies each one survives in the rewritten tree by reference
  equality. Catches class-changing replacements (which the old check missed)
  and is robust to plan-tree reordering (which the old zip-by-position was
  not).
- R7: New unit test core_planner_context_encryption_flag_reaches_init_datasource_exec
  uses the encryption_enabled asymmetry (true triggers a factory lookup that
  fails when no factory is configured; false silently succeeds) to verify
  that bool reaches the right positional slot in init_datasource_exec.
  A swap with case_sensitive / use_field_id / etc. would now fail this test.
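
The R4 wildcard pattern, sketched (BadPayload and WrongChildCount are variants named elsewhere in this PR; their payload types here are assumptions):

    #[derive(Debug)]
    #[non_exhaustive]
    pub enum ContribError {
        BadPayload(String),
        WrongChildCount { expected: String, actual: usize },
    }

    impl std::fmt::Display for ContribError {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            match self {
                ContribError::BadPayload(msg) => write!(f, "bad payload: {msg}"),
                // R4: the wildcard emits the debug repr, so variants without a
                // bespoke message (including future ones added under
                // #[non_exhaustive]) stay distinguishable through the
                // dispatcher's format!("contrib planner {kind:?}: {e}") chain.
                other => write!(f, "{other:?}"),
            }
        }
    }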

Surface tweak
- N-NEW-1: ParquetDatasourceParams::session_timezone switched from &'a str
  to owned String. with_session_timezone now accepts `impl Into<String>` so
  contribs can pass runtime-computed timezones (from a session config lookup)
  without juggling lifetimes. ParquetDatasourceParams loses its lifetime
  parameter entirely.
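
The setter's shape after the switch (sketch; the field and method names are from this commit, the remaining fields are elided):

    pub struct ParquetDatasourceParams {
        pub session_timezone: String, // owned; the struct no longer carries a lifetime
        // ... remaining ~14 fields elided ...
    }

    impl ParquetDatasourceParams {
        /// `impl Into<String>` accepts &str and runtime-computed String values
        /// alike, with no lifetime plumbing on the caller's side.
        pub fn with_session_timezone(mut self, tz: impl Into<String>) -> Self {
            self.session_timezone = tz.into();
            self
        }
    }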

Dispatcher
- N-NEW-2: Payload-size guard moved to AFTER the planner lookup. A bogus
  kind now produces the "not registered" error rather than misleadingly
  blaming an oversized payload.

CI / regression guard
- N-NEW-7: New unit test production_build_has_no_contrib_planners_registered,
  gated on `#[cfg(not(feature = "contrib-example"))]`, asserts the default
  cdylib carries zero contrib surface. Catches an accidental re-introduction
  of a contrib into core's `default = [...]` feature set.

Verified
- cargo check (default features): green.
- cargo test -p datafusion-comet --lib --no-default-features: 135 tests pass
  including the new production-canary.
- cargo test -p datafusion-comet --lib -- execution::planner::contrib:
  4 tests pass including the encryption-flag witness.
- cargo test -p comet-contrib-spi -p comet-contrib-example: 10 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Regressions
- R-NEW-1: CometScanRule preTransform corruption guard switched from
  mutable.Set[FileSourceScanExec] (Spark case-class equality) to a Vector
  with `_ eq b` lookup. Two value-equal-but-reference-distinct scans (e.g.,
  self-join after AQE dedup) no longer trigger a false-positive warning.
  Cost stays O(K * (P + S)).
- R-NEW-2: CometExtensionRegistry.load() now runs inside `synchronized`
  with explicit publication order (write @volatile fields, THEN flip
  `loaded`). The previous AtomicBoolean-only gate let Thread B observe
  `loaded=true` and read Seq.empty/Map.empty while Thread A was still
  loading. AQE concurrent rule application across sub-queries now sees
  consistent registry state.

Polish
- N1: Cost comment added to the preTransform guard fold.
- N2: Guard comment notes V2 BatchScanExec is out of scope by design.
- N3: ContribOp dispatcher now rejects empty `kind` with a dedicated error
  ("the JVM-side serde produced a malformed envelope") instead of the
  misleading "build core with `contrib-` feature" message.
- N4: Payload-size guard comment corrected -- prost has already decoded
  the payload by the time we get here; the guard fences the contrib's
  plan() body, not the original allocation.
- N5: Scope limitation documented on the encryption-asymmetry test --
  catches swaps involving the encryption_enabled slot only; new bools
  must come with their own asymmetry witness.
- N6: Production canary cfg switched to `not(any(...))` form with a
  MAINTENANCE comment listing the contract for future contrib features.
- N7: resetForTesting visibility widened from `private[comet]` to public;
  docstring explains that contribs are not required to package under
  org.apache.comet.* and must still be able to reset between tests.
- N8: ContribError::Display wildcard comment clarified -- the wildcard
  defends downstream Display-as-source consumers; inside the defining
  crate the match must be exhaustive anyway.

Verified
- cargo check default features: green.
- cargo test -p datafusion-comet --lib -- execution::planner::contrib:
  5 tests pass (added 1, was 4).
- cargo test -p comet-contrib-spi -p comet-contrib-example: 10 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- F1: CometScanRule preTransform corruption guard swaps ArrayBuffer +
  `_ eq` for java.util.IdentityHashMap, making survivor lookup O(1) and
  the documented O(K * (P + S)) cost accurate.
- F2: CometExtensionRegistry.resetForTesting() now `synchronized`. Without
  it a concurrent load() could observe torn state (loaded=false but the
  fields still populated, or vice versa), causing the next load() to
  short-circuit and miss re-discovery.
- F3: Trimmed overstated comment in load()'s no-extensions branch.
- F4: ContribOp dispatcher rejects whitespace-only `kind` (not just
  empty); displays the raw `kind` repr in the error message.
- F5: ContribOp proto reserves tags 3-9 for additive evolution
  (payload_format_version, compression, contrib_version, etc.) so
  evolving contribs can't accidentally reuse one.
- F6: Contributor guide documents the 16 MiB ContribOp.payload cap and
  notes contribs with a legitimate need for a higher ceiling should
  file an issue rather than work around it.

Also adds a "MUST NOT call load() from a class's static initializer"
note to the load() docstring (Scala monitors are reentrant so it
wouldn't deadlock but would shadow the in-flight publication).

Verified: cargo check green, 21 core planner tests pass, 10 SPI +
example tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses every gap surfaced by the doc-completeness validation pass:

Missing topics now covered
- JVM-side proto compilation (protoc-jar-maven-plugin block + shaded
  protobuf-java rationale -- contribs MUST inherit the parent pom's
  com.google.protobuf -> ${comet.shade.packageName}.protobuf relocation).
- Worked Scala snippet for building a ContribOp envelope from a serde,
  including the Java setContribOp(...) name (vs Rust op_struct).
- CometOperatorSerde[T <: SparkPlan] trait shape: enabledConfig,
  requiresNativeChildren, getSupportLevel, convert, createExec.
- Full walked-through plan() body exercising every ContribPlannerContext
  method (convert_spark_schema, build_physical_expr, prepare_object_store,
  build_parquet_datasource_exec) -- mirrors what Delta/Iceberg ports do.
- ServiceLoader diagnostics: the INFO "none discovered" line, the WARN
  per-failed-entry line, which logger to enable for debugging.
- Classloader-order story (lazy load post-`--jars` so order doesn't matter).
- CometExtensionRegistry.load() MUST NOT be called from static initializers
  (reentrancy shadows in-flight publication).
- Logging conventions (eprintln in #[ctor], log::* with target: elsewhere,
  do NOT re-prefix errors with the contrib's kind).
- Error message convention (dispatcher already prefixes with kind).
- Version pinning for out-of-tree contribs (explicit Comet patch version,
  not ${project.version}).
- Multi-Spark-version shimming: pick a spark.version.short profile, mirror
  Comet's per-profile artifact ID pattern.
- End-to-end Rust+Scala round-trip test pattern with concrete recipe.
- Cargo feature canary maintenance note (when adding contrib-<name>,
  extend the not(any(...)) cfg in production_build_has_no_contrib_planners_registered).

Reorganised so a new contrib author finds things in the right order:
- "Required files" + "Wiring into core" + Cargo feature gate moved BEFORE
  the SPI deep-dive.
- Prerequisites + .gitignore + workspace-placement constraint called out
  upfront.

Inaccuracies fixed
- Operator proto field name (op_struct in Rust, setContribOp on Java
  Builder -- explained as a code-generator language difference).
- "open for inheritance" qualifier sharpened: additive default-implemented
  methods are a minor bump; abstract-method additions are breaking.
- out_dir = "src/generated" pattern justified as a deliberate deviation
  from idiomatic prost (stable include! path for editor tooling).
- contrib-example,contrib-delta example reworded so it doesn't reference
  a feature that doesn't exist in-tree yet.
- PR1's CI -> Comet's CI.
- MAX_CONTRIB_PAYLOAD_BYTES named so readers can rg for it.

Nit cleanups
- _clear_for_test added to the SPI table with explicit "test escape hatch
  only" caveat alongside ScopedContribPlannerRegistration.
- ContribError convention paragraph cross-linked from the SPI table row.

The result is a 758-line single-document reference that a contrib author
can follow end-to-end without reading core's source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the validation findings against commit 91c40e0:

Blockers
- I4: JVM proto-shading recipe rewritten. The original claim that the
  contrib pom would inherit shading from the parent was wrong -- shading
  is configured in spark/pom.xml as a per-module execution, not via
  pluginManagement. A contrib generating its own Java proto without its
  own shade-plugin execution would NoSuchMethodError on setPayload() at
  runtime because ContribOp.Builder expects ${comet.shade.packageName}.protobuf.ByteString.
  New section gives the full pom snippet: protoc-jar-maven-plugin +
  maven-shade-plugin execution that relocates com.google.protobuf to
  the parent's shade prefix.
- R3: with_session_timezone(&scan.session_timezone) didn't compile
  (&String doesn't impl Into<String>). Fixed to use scan.session_timezone.as_str()
  with a brief inline comment explaining why.
- R2/M1: build_partitioned_files was hand-waved with no shape. Added a
  full sketched implementation that builds PartitionedFile per task,
  resolves URLs to object_store::path::Path, sets object_meta.location,
  and notes the common real-world variations (file-range splitting,
  partition_values, format-specific filter wrappers).

Other fixes
- I6: gitignore guidance corrected -- the entry lives in the repo-root
  .gitignore, not in contrib/example/. Verified the actual entry exists.
- I2: build.rs snippet now mirrors contrib/example/native/build.rs
  exactly (including the cargo:rerun-if-changed=src/proto/ line that
  prost-build needs to rebuild on schema changes).
- I5: op_struct vs contrib_op naming clarified -- op_struct is the
  oneof name (Rust pattern-match handle), contrib_op is the field
  name on that oneof (Java setter name). They are not "the same field"
  with different names; they're a oneof and one of its members.
- M3: ServiceLoader-diagnostics section now also covers
  detectDuplicateSerdeClasses (cross-contrib serde key collision) and
  register_contrib_planner's last-write-wins WARN on duplicate kinds.
- cfg(not(any(...))) placeholder example replaced with the literal
  current form, plus the explicit "add feature = "contrib-<name>" here"
  instruction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…model

The original SPI design had asymmetric distribution: the native rlib was
compiled INTO libcomet via a Cargo feature, but the JVM half shipped as a
separate Maven artifact discovered at runtime via ServiceLoader. This made
the protobuf-shading recipe in the contributor guide load-bearing (~70 lines
of XML to relocate `com.google.protobuf` per contrib) and it gave users a
distribution model that didn't actually work -- the native side required a
Comet rebuild regardless of how the JVM half shipped.

The fix mirrors the native side: contribs are now source directories under
contrib/<name>/, NOT Maven modules. Activating `-Pcontrib-<name>` on
spark/pom.xml folds the contrib's Scala + resources + proto into
comet-spark.jar's normal compile + shade execution. `mvn install` produces
a vanilla comet-spark.jar with zero contribs; `mvn install -Pcontrib-example`
produces one with the example contrib's classes inside. Same shape as
`cargo build --features contrib-example`.

Files touched
- spark/pom.xml: new contrib-example profile using build-helper-maven-plugin
  (source roots), maven-resources-plugin (META-INF/services), and an extra
  protoc-jar-maven-plugin execution (Java proto generation). The default
  shade execution gains a ServicesResourceTransformer so contrib service
  files merge cleanly.
- pom.xml: <module>contrib/example</module> removed; contribs aren't modules.
- contrib/example/pom.xml: deleted. The example is now Scala + resources +
  Cargo crate, no Maven pom.
- spark/.../spi/CometExtensionRegistry.scala: docstring rewritten to describe
  the bundled model; no logic change.
- docs/source/contributor-guide/contrib-extensions.md: rewritten "Architecture
  at a glance", "Required files", "Wiring into core", "Build matrix",
  "Proto, JVM side", and "Maven packaging" sections to reflect the new
  model. The protobuf-shading recipe is gone -- shading is handled by
  comet-spark's existing shade execution automatically.

What this fixes
- No Maven cycle (the previous separate-module design hit one and required
  a dedicated SPI module to break it; the source-injection model avoids
  the cycle entirely).
- One artifact installed: `comet-spark-with-<contribs>.jar` rather than
  a JAR + per-contrib JARs.
- ~70 lines of protobuf-shading boilerplate removed from the contributor
  guide. The new "Proto, JVM side" section is ~15 lines.
- Distribution model is honest: contribs are build-time options on Comet,
  JVM and native both.

Verified: spark/pom.xml parses; `-Pcontrib-example` profile activates
cleanly with no Maven reactor errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Q1 (external deps): the previous source-injection-only refactor lost the
encapsulation needed for contribs that pull in external Maven deps like
delta-spark. Reintroduce a per-contrib pom.xml but as a deps-only artifact
(`<packaging>pom</packaging>`, no code, no JAR). The contrib pom enumerates
external deps; spark/pom.xml's `contrib-<name>` profile depends on it via
`<type>pom</type>` to pull those deps transitively onto comet-spark's
classpath. No reactor cycle: the deps pom has no `<dependency>` on
comet-spark; it's a leaf list of external deps.

contrib/example/pom.xml is the template -- its `<dependencies>` block is
empty (the example has no external deps) but the file demonstrates the
pattern that a real Delta contrib would use to pull in delta-spark.

Q2 (registry primitive): swap RwLock<HashMap> for ArcSwap<HashMap> in
comet-contrib-spi's registry. Reads on the dispatch hot path drop from
"acquire RwLock read guard + drop" to "atomic load + ref-count bump"; there
was never any meaningful reason to make readers interact with a lock since
writes happen only during library init (sequential, single-threaded
#[ctor] calls). Public API unchanged; all 7 SPI tests still pass.
ScopedContribPlannerRegistration and _clear_for_test reworked to use rcu /
atomic store respectively.
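
Concretely, reusing the ArcSwap<HashMap> registry shape sketched earlier in this description (the real map holds planner trait objects):

    // _clear_for_test: atomic store of a fresh, empty map.
    fn _clear_for_test() {
        REGISTRY.store(Arc::new(HashMap::new()));
    }

    // ScopedContribPlannerRegistration's Drop side: rcu out a single key, so
    // planners registered under other kinds by concurrent tests survive.
    fn unregister(kind: &str) {
        REGISTRY.rcu(|old| {
            let mut next = (**old).clone();
            next.remove(kind);
            next
        });
    }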

Audit of other concurrency / perf hot spots: no other meaningful issues
found. Per-dispatch Arc::clone is already optimal (single atomic refcount
bump). CometExtensionRegistry's `synchronized` load() runs once.
CometExecRule's mergedSerdes lookup is O(1). The preTransform corruption
guard is O(K * (P + S)) per plan with K typically 1-3 -- microseconds,
real safety value, keep.

Verified: cargo test -p comet-contrib-spi passes (7 tests); maven profile
contrib-example activates cleanly without reactor cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ture

The PR description had drifted significantly from the actual branch:
- Said "9 commits"; the branch is now 18.
- Claimed JVM contribs ship as separate Maven JARs and users opt in via
  classpath; the symmetric-distribution refactor inverted that.
- Said the registry uses RwLock; it now uses ArcSwap.
- Missing all the post-`e018076d` work: ContribPlannerContext, four review-fix
  rounds, the contributor-guide completeness pass, the symmetric-distribution
  pivot, the deps-pom pattern.
- The architecture diagram still showed ServiceLoader-from-separate-JAR.
- The Delta-port confidence check section listed three SPI gaps closed; the
  fourth (Gap apache#4: ContribPlannerContext) was added later and was missing.

The new description accurately enumerates the 18 commits, walks through the
current distribution model, and explains the design decisions (single-artifact
bundling, ArcSwap registry, source-injection over JAR-shading, deps-only pom
per contrib) that emerged across the iterations. It also documents the four
review passes and the doc-completeness validation as part of the audit trail.

Not committed: this file is a working artifact, used when manually opening
the PR. The previous version was a draft; the current is review-ready.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@schenksj
Author

@parthchandra - appreciate your feedback on this proposed design for the contrib capability

@andygrove - FYI

schenksj and others added 2 commits May 14, 2026 22:21
PR1-description.md is a working artifact used when manually opening PR1
upstream; it should not be committed. It slipped in via commit a2ac715.
This commit removes it from tracking (file stays on disk locally) and adds
it to .gitignore so future `git add -A` doesn't pick it up again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@schenksj schenksj force-pushed the comet-contrib-spi branch 2 times, most recently from a47d978 to 071286a on May 15, 2026 12:30
Format-agnostic surface additions that PR2's Delta port (and any future
file-format contrib) needs, without duplicating ~600 lines of
CometScanRule's existing schema-check / encryption-gate / marker-dispatch
helpers. No specific format is named anywhere in core; contribs register
their own tags through the SPI.

1. CometScanRule.isSchemaSupported widened from private to private[comet]
   so contrib extensions under org.apache.comet.contrib.* can reuse the
   same schema-check + fallback-reason emission rather than duplicating
   the 25-line body. Zero behaviour change for in-tree callers.

2. CometOperatorSerdeExtension.matchOperator new default-Some-or-None
   method for predicate-based serde dispatch. The class-keyed `serdes`
   map can't disambiguate a marker pattern like
   `CometScanExec(scanImpl="<contrib-specific-tag>")` (the class is
   shared with core's generic CometScanExec). Contribs using such
   markers override matchOperator. Backwards compatible: existing
   contribs that only populate `serdes` see no change. `serdes` now
   defaults to Map.empty so contribs that ONLY use matchOperator don't
   need to override both.

3. CometExecRule three-step dispatch: allExecs (core class map) ->
   mergedSerdes (contrib class map) -> matchOperator iteration (contrib
   predicate). First Some wins; multiple extensions' matchOperator
   results are tried in registration order.

4. CometOperatorSerdeExtension.nativeParquetScanImpls new
   default-Set.empty method. Contribs that use the CometScanExec marker
   pattern AND go through Comet's tuned ParquetSource declare their
   scanImpl tag(s) here. CometScanExec.supportedDataFilters consults the
   merged set (via CometExtensionRegistry.nativeParquetScanImpls) to
   decide whether to apply native-parquet filter exclusions. Core no
   longer needs to hard-code any contrib's tag name.

5. CometExtensionRegistry.nativeParquetScanImpls publishes the merged
   tag set at load() time. Populated/reset alongside mergedSerdesCache
   under the same monitor.
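
The trait surface and dispatch described in items 2-4, as a hedged,
self-contained sketch. Comet's real types are stubbed here and the exact
signatures on the branch differ:

```scala
// Stub types so this sketch compiles stand-alone; the branch's real
// signatures differ.
trait SparkPlan
trait CometOperatorSerde

trait CometOperatorSerdeExtension {
  // Item 2: class-keyed map, now defaulting to empty so matchOperator-only
  // contribs don't have to override it.
  def serdes: Map[Class[_], CometOperatorSerde] = Map.empty

  // Item 2: predicate dispatch for markers the class map can't
  // disambiguate; Some claims the operator, None declines.
  def matchOperator(op: SparkPlan): Option[CometOperatorSerde] = None

  // Item 4: scanImpl tags this contrib routes through Comet's tuned
  // ParquetSource.
  def nativeParquetScanImpls: Set[String] = Set.empty
}

object DispatchSketch {
  // Item 3: core class map -> merged contrib class map -> matchOperator
  // in registration order; first Some wins.
  def dispatch(
      op: SparkPlan,
      allExecs: Map[Class[_], CometOperatorSerde],
      mergedSerdes: Map[Class[_], CometOperatorSerde],
      extensions: Seq[CometOperatorSerdeExtension]): Option[CometOperatorSerde] =
    allExecs
      .get(op.getClass)
      .orElse(mergedSerdes.get(op.getClass))
      .orElse(extensions.view.flatMap(_.matchOperator(op)).headOption)
}
```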

Contributor guide updated with the matchOperator + nativeParquetScanImpls
patterns and explicit guidance that contribs define their own scanImpl
strings in their own code -- core's CometConf only carries
SCAN_NATIVE_DATAFUSION / SCAN_NATIVE_ICEBERG_COMPAT for core's own
variants.

Verified
- cargo check (default features): green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
schenksj force-pushed the comet-contrib-spi branch from 071286a to 525b980 on May 15, 2026 13:05
schenksj and others added 3 commits May 15, 2026 14:25
…patch

Two related additions to the contrib SPI surface, both driven by gaps the
Delta regression exposed:

1. Generic per-partition metadata hook on `CometExecRDD` so contribs can
   populate executor-side thread-locals (e.g. `InputFileBlockHolder` for
   `input_file_name()` / `_metadata.file_path`) from their serialized
   per-partition payloads BEFORE the native iterator starts producing
   rows. Without this, Delta's UPDATE/DELETE/MERGE/CDC commands resolve
   `_metadata.file_path` to empty and throw `DELTA_FILE_TO_OVERWRITE_NOT_FOUND`
   for every touched file. Three pieces:

     - `CometExecRDD.PartitionMetadataHandler` type alias +
       `registerPartitionMetadataHandler` SPI. The signature takes the
       `Map[String, Array[Byte]]` data shape (NOT the spark-internal
       `CometExecPartition`), so contribs don't have to live under
       `org.apache.spark.*` to use it.
     - `CometExecRDD.compute()` invokes handlers after plan-data injection,
       before instantiating the native iterator.
     - `clearPartitionMetadataHandlers()` for test isolation; called from
       `CometExtensionRegistry.resetForTesting`.

   Lifecycle hook: `CometOperatorSerdeExtension.init(): Unit` (default
   no-op), called once per JVM by `CometExtensionRegistry.load` after
   discovery. Contribs override it to register their handler (see the
   sketch after this list). Failures in one contrib's `init` are caught
   and logged so they don't take down sibling contribs.

2. `matchOperator` dispatch now fires for `CometScanExec` markers tagged
   with a registered contrib's `scanImpl` (anything in
   `nativeParquetScanImpls`). Before this, the generic
   `case op if isCometScan(op)` branch matched first and routed the
   marker through `CometScanWrapper`, so the contrib's `serialize` (and
   any format-specific concerns inside it -- Delta column-mapping
   physical-name substitution, etc.) was never reached. The dispatch
   order is now: `SCAN_NATIVE_DATAFUSION` -> contrib-marker via
   `matchOperator` -> generic `isCometScan` catch.
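
A hedged sketch of the contrib-side wiring for the handler and the `init`
hook. `InputFileBlockHolder.set` is Spark's real API; the Comet-side types
are stand-ins shaped after this commit's text, and the `"file_path"`
payload key and offsets are purely illustrative:

```scala
import org.apache.spark.rdd.InputFileBlockHolder

// Stand-ins for the branch's SPI surface, shaped after this commit's text.
object CometExecRDD {
  type PartitionMetadataHandler = Map[String, Array[Byte]] => Unit
  private var handlers = Seq.empty[PartitionMetadataHandler]
  def registerPartitionMetadataHandler(h: PartitionMetadataHandler): Unit =
    synchronized { handlers :+= h }
}

trait CometOperatorSerdeExtension { def init(): Unit = () }

class MyContribSerdeExtension extends CometOperatorSerdeExtension {
  // Called once per JVM by CometExtensionRegistry.load after discovery.
  override def init(): Unit =
    CometExecRDD.registerPartitionMetadataHandler { metadata =>
      // Runs in CometExecRDD.compute() after plan-data injection and
      // before the native iterator starts, so input_file_name() and
      // _metadata.file_path resolve. The "file_path" key and the offsets
      // are made up for this sketch.
      metadata.get("file_path").foreach { bytes =>
        InputFileBlockHolder.set(new String(bytes, "UTF-8"), 0L, Long.MaxValue)
      }
    }
}
```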

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Revert the `case scan: CometScanExec if nativeParquetScanImpls.contains(...)`
branch added in e417211. Unconditionally routing the contrib marker through
the contrib's `matchOperator` -> serde regressed ~525 previously-passing
Delta tests: the full conversion path (`CometDeltaNativeScan.serialize` ->
`CometDeltaNativeScanExec`) returns 0 rows for the streaming-source
`PreparedDeltaFileIndex` shape that the existing JVM-side `CometScanWrapper`
path handles correctly.

The motivation for the dispatch change (column-mapping names returning wrong
values after a rename) still stands but needs a different fix: the
marker-passthrough path has to apply Delta's logical->physical name
substitution itself rather than forcing every Delta scan through the
heavier kernel path. Left as a TODO.

The partition-metadata SPI hook (also added in e417211) is unaffected by
this revert and remains in place -- it's what addresses the broader
UPDATE/DELETE/MERGE/CDC failures via `InputFileBlockHolder` population.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reinstate the marker-dispatch case reverted in 35f1b3b, with a tighter
charter:

When a `CometScanExec` is tagged with a contrib's `scanImpl` (any string
listed in the contrib's `nativeParquetScanImpls`), route it through that
contrib's `matchOperator` serde rather than the generic `CometScanWrapper`.
The contrib chooses per-scan whether to claim it (returning `Some`) or
defer to the wrapper (returning `None`). If `matchOperator` returns `None`,
OR if the chosen serde's `convertToComet` returns `None`, the marker falls
back to the generic wrapper -- so the contrib retains full control over
when the heavier native conversion fires.

The previous revert was driven by a downstream bug: when this dispatch
unconditionally claimed every marker, the Delta contrib's
`CometDeltaNativeScan.serialize` returned 0 rows for simple streaming-source
reads, regressing ~525 tests. With the new design that bug is gated behind
the contrib's `matchOperator` -- contribs that aren't ready for the full
conversion just return `None` until their `serialize` handles every case.
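
A self-contained sketch of the two fallback points described above. All
types are stubs; `convertToComet` and `CometScanWrapper` are names carried
over from earlier commits on this branch, not guaranteed signatures:

```scala
// Stubs so the fallback logic compiles on its own.
trait SparkPlan
case class CometScanExec(scanImpl: String) extends SparkPlan
case class CometScanWrapper(scan: CometScanExec) extends SparkPlan
trait Serde { def convertToComet(op: SparkPlan): Option[SparkPlan] }
trait Ext { def matchOperator(op: SparkPlan): Option[Serde] }

object MarkerDispatchSketch {
  def dispatch(scan: CometScanExec,
               extensions: Seq[Ext],
               contribTags: Set[String]): SparkPlan =
    if (!contribTags.contains(scan.scanImpl)) CometScanWrapper(scan)
    else
      extensions.view
        .flatMap(_.matchOperator(scan))    // fallback 1: contrib returns None
        .flatMap(_.convertToComet(scan))   // fallback 2: conversion declines
        .headOption
        .getOrElse(CometScanWrapper(scan)) // either way: generic wrapper
}
```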

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
schenksj and others added 2 commits May 15, 2026 21:24
`PlanDataInjector` previously had a hardcoded list of built-in injectors with
a `// Future: DeltaPlanDataInjector, HudiPlanDataInjector, etc.` comment.
That meant a Delta-contrib (PR2) could not actually plug its per-partition
proto-injection logic into the execution path: tasks serialized via
`perPartitionByKey` never got merged back into the operator tree at
`CometExecRDD.compute` time, so the native side decoded `DeltaScan` with an
empty `tasks` list and returned `EmptyExec` (0 rows) for any non-empty Delta
scan that took the native conversion path.

Promote the injector list to a built-in seq + a registerable contrib seq, and
expose `registerInjector` / `clearContribInjectors` on the singleton. Same
pattern as the `CometExecRDD.PartitionMetadataHandler` SPI added in e417211:
contribs register their injector from `CometOperatorSerdeExtension.init`,
`CometExtensionRegistry.resetForTesting` clears the registry alongside other
contrib state for test isolation.

Visibility of the `PlanDataInjector` object had to be widened from
`private[comet]` to the default (public) so the registry-reset call site
in `org.apache.comet.spi.CometExtensionRegistry` can reach the
`clearContribInjectors` method. The trait stays `private[comet]` (and so
does the rest of the implementation) -- contribs in
`org.apache.comet.contrib.*` can still see it via the subpackage rule.
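
A sketch of the promoted registry shape. Only the built-in/contrib split,
`registerInjector`, and `clearContribInjectors` come from this commit; the
injector trait's signature is a guess for illustration:

```scala
// Stand-in injector trait; the branch's real trait stays private[comet].
trait Injector {
  def inject(serializedPlan: Array[Byte],
             perPartition: Map[String, Array[Byte]]): Array[Byte]
}

object PlanDataInjector {
  // Formerly the hardcoded built-in list.
  private val builtIn: Seq[Injector] = Seq.empty /* core injectors here */
  private var contribInjectors: Seq[Injector] = Seq.empty

  // Contribs call this from CometOperatorSerdeExtension.init.
  def registerInjector(i: Injector): Unit =
    synchronized { contribInjectors :+= i }

  // Called by CometExtensionRegistry.resetForTesting for test isolation.
  def clearContribInjectors(): Unit =
    synchronized { contribInjectors = Seq.empty }

  // CometExecRDD.compute runs built-ins first, then contrib injectors.
  def all: Seq[Injector] = synchronized { builtIn ++ contribInjectors }
}
```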

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…put boundary

`foreachUntilCometInput` enumerates known Comet input-class types (`CometNativeScanExec`,
`CometScanExec`, etc.) and recurses past everything else. When a contrib-defined leaf
native exec (e.g. the Delta contrib's `CometDeltaNativeScanExec`) appeared in the plan,
it matched the generic `case _: CometPlan` arm and recursed into its empty children list
without ever invoking `func`. The caller then saw an empty `sparkPlans` buffer and crashed
on `firstNonBroadcastPlan.get` -- `None.get` at operators.scala:534.

Add a case before the `_: CometPlan` recurse arm: any `CometNativeExec` with zero children
is a Comet input. The explicit list above still wins for the known types (preserves existing
behaviour exactly), and contribs' leaf scans now participate without needing core-side
class enumeration.
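
A stubbed sketch of the new traversal arm, placed before the generic
`CometPlan` recurse exactly as described (the plan hierarchy here is
reduced to placeholders):

```scala
// Stub plan hierarchy so this compiles stand-alone.
trait SparkPlan { def children: Seq[SparkPlan] }
trait CometPlan extends SparkPlan
trait CometNativeExec extends CometPlan

object TraversalSketch {
  def foreachUntilCometInput(plan: SparkPlan)(func: SparkPlan => Unit): Unit =
    plan match {
      // ... the explicit known input types (CometNativeScanExec,
      // CometScanExec, etc.) still match first, exactly as before ...
      case leaf: CometNativeExec if leaf.children.isEmpty =>
        func(leaf) // new arm: a childless native exec IS a Comet input
      case p: CometPlan =>
        p.children.foreach(foreachUntilCometInput(_)(func)) // recurse
      case _ => // non-Comet boundary: stop
    }
}
```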

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>