Skip to content

Extend pre-delete dep-guard to triggers#221

Merged
jogrogan merged 16 commits into
mainfrom
jogrogan/trigger-dep-guard
May 13, 2026
Merged

Extend pre-delete dep-guard to triggers#221
jogrogan merged 16 commits into
mainfrom
jogrogan/trigger-dep-guard

Conversation

@jogrogan
Copy link
Copy Markdown
Collaborator

Summary

Extends the pre-delete dependency guard (which previously only saw Pipeline CRDs) to also block deletes when a TableTrigger references the resource being dropped. Along the way it generalizes the labels/checker machinery so it's not
Pipeline-specific in name or shape.

Stacked on jogrogan/dependency-guard-validator-v2.

What's in it

  • Split the dep annotation into directional depends-on-sources and depends-on-sinks. The old single depends-on annotation lost source/sink direction, which the visualizer (next PR up the stack) needs and the checker was already implicitly reading as two-sided.
  • Dep-guard scans TableTrigger CRDs. DependencyChecker.findBlockers is generic over any KubernetesObject; we just call it twice (Pipelines and TableTriggers) and merge the blocker lists. Self-owned children are excluded from both sides so LogicalTableDeployer.delete() can still cascade.
  • Renames — PipelineDependency{Labels,Checker} → Dependency{Labels,Checker}, K8sPipelineDependencyValidator → K8sDependencyValidator. Nothing about these is Pipeline-specific anymore.
  • Collapsed labels API — single DependencyLabels.stamp(meta, Collection, Collection) that writes both the labels and the directional annotations. Replaces three separate labelsFor/sourcesAnnotation/sinkAnnotation helpers and the
    inline duplication that had crept into K8sTriggerDeployer.
  • Wire format is symmetric — both depends-on-sources and depends-on-sinks are comma-separated identifier lists. Today every caller passes a singleton sink, but the format doesn't bake in the assumption — a future multi-sink resource needs no migration.
  • Adds optional Sink to trigger - currently only used by LogicalTables for a known rETL trigger, this should be generalized (TODO left in code).

Test plan

  • K8sTriggerDeployerTest — adds coverage for the dep-label stamping on both update (pause/resume partial path) and toK8sObject (full path), for source-only and source+sink trigger shapes.
  • k8s-trigger-validation.id Quidem integration test: DROP TABLE on a source with an active trigger is blocked with the expected error message.

jogrogan and others added 14 commits May 12, 2026 11:35
Refuses a DROP TABLE while an active Pipeline still references the
resource (as either source or sink), so dropping the underlying Kafka
topic / Venice store / MySQL table can't silently orphan a downstream
pipeline.

Validator framework, made Connection-aware:
- Validated.validate(Issues, Connection)              (was: validate(Issues))
- ValidatorProvider.validators(T, Connection)         (was: validators(T))
- ValidationService.validate(T, Issues, Connection)
- ValidationService.validateOrThrow(T, Connection)
- ValidationService.validateOrThrow(Collection<T>, Connection)
- ValidationService.validators(T, Connection)

PendingDelete<T> wrapper (hoptimator-api):
- Explicit "this is being deleted" signal so unrelated callers of
  validateOrThrow(source, connection) don't accidentally trigger
  pre-delete checks.
- Carries an optional selfOwnerUid so cascade-deleted children can be
  excluded from the dependent set.

K8s indexed lookup:
- PipelineDependencyLabels stamps `depends-on-<slug>` labels on every
  Pipeline CRD at create time, naming each source/sink. The slug is a
  16-char SHA-256 prefix of `<database>_<dot-joined-path>`; an
  annotation lists the full identifiers so a slug collision can be
  detected at check time.
- PipelineDependencyChecker uses a server-indexed label-selector list
  + annotation cross-check + selfOwnerUid filter.
- K8sPipelineDeployer threads sources/sink through and calls
  PipelineDependencyLabels.labelsFor / annotationFor at toK8sObject().
  K8sPipelineBundle and K8sMaterializedViewDeployer pass the data
  through.

Dispatch:
- K8sValidatorProvider returns a K8sPipelineDependencyValidator for
  PendingDelete<Source>; registered via
  META-INF/services/com.linkedin.hoptimator.ValidatorProvider.
- K8sPipelineDependencyValidator wraps PipelineDependencyChecker as a
  Validator.

DROP TABLE wiring:
- HoptimatorDdlExecutor calls
  ValidationService.validateOrThrow(new PendingDelete<>(source),
  connection) before DeploymentService.delete in the table branch.
  HoptimatorDdlUtils.removeTableFromSchema() is the symmetric inverse
  of registerTemporaryTableInSchema() for cleanup.

Implementor side-effects (no behavior change):
- KafkaDeployer / VeniceDeployer / MySqlDeployer no longer need a
  declarative DependencyGuarded marker — the guard fires from the
  validator framework before delete() is reached.
- All existing Validated implementors (DefaultValidator,
  CompatibilityValidatorBase, AvroTableValidator, K8sViewTable) and
  ValidatorProvider implementors (DefaultValidatorProvider,
  CompatibilityValidatorProvider, AvroValidatorProvider) updated to
  the new signatures.

Tests: PipelineDependencyLabelsTest, PipelineDependencyCheckerTest,
K8sPipelineDeployerTest assertions for stamping, validator-framework
test updates throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LogicalTableDeployer.delete() previously threw SQLFeatureNotSupported.
Now implemented end-to-end as a per-tier sequence that mirrors what
running DROP TABLE on each tier independently would do, plus the
LogicalTable CRD removal at the top.

Flow:
1. Per-tier pre-flight via the validator framework:
   ValidationService.validateOrThrow(new PendingDelete<>(tierSource,
   logicalTableUid), connection) — refuses the drop if any active
   external pipeline still references a tier resource. The
   selfOwnerUid is the LogicalTable CRD's UID so the implicit
   inter-tier pipelines (owned by the CRD, cascade-deleted with it)
   don't self-block.
2. Delete the LogicalTable CRD. K8s owner-ref cascade removes its
   owned Pipeline and TableTrigger CRDs.
3. Best-effort physical cleanup of each tier resource (Kafka topic,
   Venice store, ...). A failed tier delete logs a warning but does
   not abort — a stranded tier is recoverable; aborting mid-DROP
   isn't.
4. Per-tier schema cleanup: deregister the TemporaryTable entry in
   each tier schema only when its physical delete succeeded.

Tests:
- LogicalTableDeployerTest deleteRemovesCrdAndCleansUpTierResources,
  deletePropagatesCrdDeletionFailure, deleteSwallowsTierCleanupFailures.
- logical-ddl.id integration test: DROP TABLE LOGICAL.testevent now
  succeeds and cascades the implicit nearline-to-online pipeline.
- logical-offline-ddl.id companion update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kafka-ddl-create-table.id: cross-driver dependency-guard scenarios
exercising the new pre-delete check end-to-end through the kafka
driver — drop-table-while-pipeline-depends-on-it (source side and
partial-view sink side).

The bulk of the file count is mechanical noise reduction across
existing test files: dropped unused imports, tightened generics on
@SuppressWarnings, etc. — fallout from the warning_cleanup pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pipelines previously stamped a single `depends-on` annotation listing
every source and sink undifferentiated. The dep-guard collision check
worked on this, but it loses the source/sink direction information
needed for visualization.

Replace the unified annotation with two directional annotations:

  hoptimator.linkedin.com/depends-on-sources: <s1>,<s2>,...
  hoptimator.linkedin.com/depends-on-sink:    <single sink>

The dep-guard's annotationConfirms now reads sources or sink as the
collision-guard set — same correctness guarantee, no semantic change
for the dep-guard. The split unlocks directional rendering for the
upcoming pipeline visualizer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related changes that bring triggers into the depends-on label
index so the pre-delete dep-guard can find them.

Trigger API:

  - Replace the old `(String database, List<String> path)` pair with
    a typed `Source source` field (existing hoptimator-api type).
    Drop the convenience getters database() / path() / table() /
    schema() — callers go through trigger.source().X() for symmetry
    with the new Sink.
  - Add an optional `Sink sink` field for bridging-tier triggers
    (LogicalTableDeployer's offline → online reverse-ETL flow), so
    the deployer can stamp a depends-on-sink annotation in addition
    to the source side.
  - Source is nullable for DROP / PAUSE / RESUME paths that only
    need the trigger name.

K8sTriggerDeployer stamping:

  - On both the toK8sObject and the partial-update paths, stamp:
      depends-on-<sourceSlug>: <sourceIdentifier>
      annotation depends-on-sources: <sourceIdentifier>
    and when sink is set:
      depends-on-<sinkSlug>: <sinkIdentifier>
      annotation depends-on-sink: <sinkIdentifier>

LogicalTableDeployer wiring:

  - Pass `offlineSource` directly as the trigger's Source and a Sink
    derived from `onlineSource` (when present) as the trigger's Sink.

HoptimatorDdlExecutor:

  - Resolve the target table's database name the same way DROP TABLE
    resolves it (HoptimatorJdbcTable / TemporaryTable unwrap), so
    user-created `CREATE TRIGGER ... ON <schema>.<table>` triggers
    participate in the dep-guard. Without this, Trigger.source was
    null and K8sTriggerDeployer skipped label stamping — a trigger
    could outlive its source silently.

Tests:

  - K8sTriggerDeployerTest gains updateStampsSinkLabelWhenTriggerCarriesASink
    pinning that a Trigger carrying a Sink stamps both source-side and
    sink-side depends-on labels on the partial-update path.
  - TriggerTest covers the new accessor shape (source(), sink(), null
    source for the bare-name case).
  - LogicalTableDeployerTest asserts trigger.source() / trigger.sink()
    instead of the removed convenience getters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Triggers carry the same depends-on-<slug> labels Pipelines do (stamped
by K8sTriggerDeployer in the previous commit), but the dep-guard's
PipelineDependencyChecker only consulted the Pipeline CRD list. That
left a hole: a user could DROP TABLE on a source still referenced by
a live trigger.

PipelineDependencyChecker now runs the same label-selector +
annotation-confirmation logic against TableTrigger CRDs as well. The
inner loop is genericized over KubernetesObject; each blocker is
tagged with its CRD kind (pipeline/foo, trigger/bar) so the error
message points the user at what to unhook. Self-owner exclusion still
applies — LogicalTable-owned triggers don't block their parent's
cascade-delete.

Coverage:
- PipelineDependencyCheckerTest gets paired blocksOnExternalTrigger,
  skipsSelfOwnedTrigger, and errorMessageListsAllBlockersAcrossKinds
  cases that prove triggers participate alongside pipelines.
- New k8s-trigger-validation.id integration scenario:
  CREATE TABLE → CREATE TRIGGER → DROP TABLE blocked → DROP TRIGGER →
  DROP TABLE succeeds. Mirrors the MV pattern the existing
  k8s-validation.id scenarios use.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PipelineDependency{Labels,Checker} and K8sPipelineDependencyValidator
were specific to Pipeline CRDs in name only — TableTrigger CRDs now
wear the same depends-on labels and annotations. Renamed to
DependencyLabels, DependencyChecker, and K8sDependencyValidator.

Collapsed the labels API: DependencyLabels now exposes a single
stamp(V1ObjectMeta, Collection<Source>, Sink) that writes both the
depends-on labels and the directional annotations. K8sTriggerDeployer
had reimplemented this inline in two places; both call sites now
collapse to one stamp() call.

Also dropped the 5-arg Trigger constructor; callers must pass a sink
(or null) explicitly.
@jogrogan jogrogan changed the title Extend pre-delete dep-guard to TableTrigger CRDs Extend pre-delete dep-guard to triggers May 12, 2026
Base automatically changed from jogrogan/dependency-guard-validator-v2 to main May 13, 2026 01:36
@ryannedolan
Copy link
Copy Markdown
Collaborator

@copilot resolve the merge conflicts in this pull request

…guard

# Conflicts:
#	hoptimator-api/src/test/java/com/linkedin/hoptimator/PendingDeleteTest.java
#	hoptimator-k8s/src/main/java/com/linkedin/hoptimator/k8s/K8sPipelineBundle.java
#	hoptimator-k8s/src/main/java/com/linkedin/hoptimator/k8s/K8sPipelineDeployer.java
#	hoptimator-k8s/src/main/java/com/linkedin/hoptimator/k8s/K8sValidatorProvider.java
#	hoptimator-k8s/src/test/java/com/linkedin/hoptimator/k8s/K8sPipelineDeployerTest.java
#	hoptimator-k8s/src/test/java/com/linkedin/hoptimator/k8s/K8sValidatorProviderTest.java
#	hoptimator-kafka/src/test/resources/kafka-ddl-create-table.id
#	hoptimator-logical/src/main/java/com/linkedin/hoptimator/logical/LogicalTableDeployer.java
#	hoptimator-logical/src/test/java/com/linkedin/hoptimator/logical/LogicalTableDeployerTest.java
#	hoptimator-logical/src/test/resources/logical-ddl.id

Co-authored-by: ryannedolan <1387539+ryannedolan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI commented May 13, 2026

@copilot resolve the merge conflicts in this pull request

Resolved in commit 9fcfb4d. The stacked branch (jogrogan/dependency-guard-validator-v2) had been merged into main as PR #219, so I merged origin/main into this branch and resolved all conflicts, preserving all the new trigger dep-guard work on top of the current main.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • linkedin.jfrog.io
    • Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-opens=java.base/java.nio.charset=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.xml/javax.xml.namespace=ALL-UNNAMED -XX:MaxMetaspaceSize=384m -XX:&#43;HeapDumpOnOutOfMemoryError -Xms256m -Xmx512m -Dfile.encoding=UTF-8 -Duser.country -Duser.language=en (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI requested a review from ryannedolan May 13, 2026 01:46
Copy link
Copy Markdown
Collaborator

@ryannedolan ryannedolan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat!

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 13, 2026

Code Coverage

Overall Project 84.42% -0.22% 🟢
Files changed 87.89% 🟢

File Coverage
K8sDependencyValidator.java 100% 🟢
K8sValidatorProvider.java 100% 🟢
Trigger.java 100% 🟢
K8sPipelineDeployer.java 97.01% 🟢
DependencyLabels.java 94.37% -5.63% 🟢
K8sTriggerDeployer.java 92.36% -4.93% 🟢
HoptimatorDdlExecutor.java 92.08% -0.89% 🟢
K8sPipelineBundle.java 87.79% 🟢
DependencyChecker.java 86.86% -13.14% 🟢
LogicalTableDeployer.java 76.34% -0.69% 🟢

@jogrogan jogrogan merged commit f8d9685 into main May 13, 2026
1 check passed
@jogrogan jogrogan deleted the jogrogan/trigger-dep-guard branch May 13, 2026 03:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants