Extend pre-delete dep-guard to triggers#221
Merged
Merged
Conversation
Refuses a DROP TABLE while an active Pipeline still references the resource (as either source or sink), so dropping the underlying Kafka topic / Venice store / MySQL table can't silently orphan a downstream pipeline. Validator framework, made Connection-aware: - Validated.validate(Issues, Connection) (was: validate(Issues)) - ValidatorProvider.validators(T, Connection) (was: validators(T)) - ValidationService.validate(T, Issues, Connection) - ValidationService.validateOrThrow(T, Connection) - ValidationService.validateOrThrow(Collection<T>, Connection) - ValidationService.validators(T, Connection) PendingDelete<T> wrapper (hoptimator-api): - Explicit "this is being deleted" signal so unrelated callers of validateOrThrow(source, connection) don't accidentally trigger pre-delete checks. - Carries an optional selfOwnerUid so cascade-deleted children can be excluded from the dependent set. K8s indexed lookup: - PipelineDependencyLabels stamps `depends-on-<slug>` labels on every Pipeline CRD at create time, naming each source/sink. The slug is a 16-char SHA-256 prefix of `<database>_<dot-joined-path>`; an annotation lists the full identifiers so a slug collision can be detected at check time. - PipelineDependencyChecker uses a server-indexed label-selector list + annotation cross-check + selfOwnerUid filter. - K8sPipelineDeployer threads sources/sink through and calls PipelineDependencyLabels.labelsFor / annotationFor at toK8sObject(). K8sPipelineBundle and K8sMaterializedViewDeployer pass the data through. Dispatch: - K8sValidatorProvider returns a K8sPipelineDependencyValidator for PendingDelete<Source>; registered via META-INF/services/com.linkedin.hoptimator.ValidatorProvider. - K8sPipelineDependencyValidator wraps PipelineDependencyChecker as a Validator. DROP TABLE wiring: - HoptimatorDdlExecutor calls ValidationService.validateOrThrow(new PendingDelete<>(source), connection) before DeploymentService.delete in the table branch. HoptimatorDdlUtils.removeTableFromSchema() is the symmetric inverse of registerTemporaryTableInSchema() for cleanup. Implementor side-effects (no behavior change): - KafkaDeployer / VeniceDeployer / MySqlDeployer no longer need a declarative DependencyGuarded marker — the guard fires from the validator framework before delete() is reached. - All existing Validated implementors (DefaultValidator, CompatibilityValidatorBase, AvroTableValidator, K8sViewTable) and ValidatorProvider implementors (DefaultValidatorProvider, CompatibilityValidatorProvider, AvroValidatorProvider) updated to the new signatures. Tests: PipelineDependencyLabelsTest, PipelineDependencyCheckerTest, K8sPipelineDeployerTest assertions for stamping, validator-framework test updates throughout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LogicalTableDeployer.delete() previously threw SQLFeatureNotSupported. Now implemented end-to-end as a per-tier sequence that mirrors what running DROP TABLE on each tier independently would do, plus the LogicalTable CRD removal at the top. Flow: 1. Per-tier pre-flight via the validator framework: ValidationService.validateOrThrow(new PendingDelete<>(tierSource, logicalTableUid), connection) — refuses the drop if any active external pipeline still references a tier resource. The selfOwnerUid is the LogicalTable CRD's UID so the implicit inter-tier pipelines (owned by the CRD, cascade-deleted with it) don't self-block. 2. Delete the LogicalTable CRD. K8s owner-ref cascade removes its owned Pipeline and TableTrigger CRDs. 3. Best-effort physical cleanup of each tier resource (Kafka topic, Venice store, ...). A failed tier delete logs a warning but does not abort — a stranded tier is recoverable; aborting mid-DROP isn't. 4. Per-tier schema cleanup: deregister the TemporaryTable entry in each tier schema only when its physical delete succeeded. Tests: - LogicalTableDeployerTest deleteRemovesCrdAndCleansUpTierResources, deletePropagatesCrdDeletionFailure, deleteSwallowsTierCleanupFailures. - logical-ddl.id integration test: DROP TABLE LOGICAL.testevent now succeeds and cascades the implicit nearline-to-online pipeline. - logical-offline-ddl.id companion update. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kafka-ddl-create-table.id: cross-driver dependency-guard scenarios exercising the new pre-delete check end-to-end through the kafka driver — drop-table-while-pipeline-depends-on-it (source side and partial-view sink side). The bulk of the file count is mechanical noise reduction across existing test files: dropped unused imports, tightened generics on @SuppressWarnings, etc. — fallout from the warning_cleanup pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pipelines previously stamped a single `depends-on` annotation listing every source and sink undifferentiated. The dep-guard collision check worked on this, but it loses the source/sink direction information needed for visualization. Replace the unified annotation with two directional annotations: hoptimator.linkedin.com/depends-on-sources: <s1>,<s2>,... hoptimator.linkedin.com/depends-on-sink: <single sink> The dep-guard's annotationConfirms now reads sources or sink as the collision-guard set — same correctness guarantee, no semantic change for the dep-guard. The split unlocks directional rendering for the upcoming pipeline visualizer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related changes that bring triggers into the depends-on label
index so the pre-delete dep-guard can find them.
Trigger API:
- Replace the old `(String database, List<String> path)` pair with
a typed `Source source` field (existing hoptimator-api type).
Drop the convenience getters database() / path() / table() /
schema() — callers go through trigger.source().X() for symmetry
with the new Sink.
- Add an optional `Sink sink` field for bridging-tier triggers
(LogicalTableDeployer's offline → online reverse-ETL flow), so
the deployer can stamp a depends-on-sink annotation in addition
to the source side.
- Source is nullable for DROP / PAUSE / RESUME paths that only
need the trigger name.
K8sTriggerDeployer stamping:
- On both the toK8sObject and the partial-update paths, stamp:
depends-on-<sourceSlug>: <sourceIdentifier>
annotation depends-on-sources: <sourceIdentifier>
and when sink is set:
depends-on-<sinkSlug>: <sinkIdentifier>
annotation depends-on-sink: <sinkIdentifier>
LogicalTableDeployer wiring:
- Pass `offlineSource` directly as the trigger's Source and a Sink
derived from `onlineSource` (when present) as the trigger's Sink.
HoptimatorDdlExecutor:
- Resolve the target table's database name the same way DROP TABLE
resolves it (HoptimatorJdbcTable / TemporaryTable unwrap), so
user-created `CREATE TRIGGER ... ON <schema>.<table>` triggers
participate in the dep-guard. Without this, Trigger.source was
null and K8sTriggerDeployer skipped label stamping — a trigger
could outlive its source silently.
Tests:
- K8sTriggerDeployerTest gains updateStampsSinkLabelWhenTriggerCarriesASink
pinning that a Trigger carrying a Sink stamps both source-side and
sink-side depends-on labels on the partial-update path.
- TriggerTest covers the new accessor shape (source(), sink(), null
source for the bare-name case).
- LogicalTableDeployerTest asserts trigger.source() / trigger.sink()
instead of the removed convenience getters.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Triggers carry the same depends-on-<slug> labels Pipelines do (stamped by K8sTriggerDeployer in the previous commit), but the dep-guard's PipelineDependencyChecker only consulted the Pipeline CRD list. That left a hole: a user could DROP TABLE on a source still referenced by a live trigger. PipelineDependencyChecker now runs the same label-selector + annotation-confirmation logic against TableTrigger CRDs as well. The inner loop is genericized over KubernetesObject; each blocker is tagged with its CRD kind (pipeline/foo, trigger/bar) so the error message points the user at what to unhook. Self-owner exclusion still applies — LogicalTable-owned triggers don't block their parent's cascade-delete. Coverage: - PipelineDependencyCheckerTest gets paired blocksOnExternalTrigger, skipsSelfOwnedTrigger, and errorMessageListsAllBlockersAcrossKinds cases that prove triggers participate alongside pipelines. - New k8s-trigger-validation.id integration scenario: CREATE TABLE → CREATE TRIGGER → DROP TABLE blocked → DROP TRIGGER → DROP TABLE succeeds. Mirrors the MV pattern the existing k8s-validation.id scenarios use. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PipelineDependency{Labels,Checker} and K8sPipelineDependencyValidator
were specific to Pipeline CRDs in name only — TableTrigger CRDs now
wear the same depends-on labels and annotations. Renamed to
DependencyLabels, DependencyChecker, and K8sDependencyValidator.
Collapsed the labels API: DependencyLabels now exposes a single
stamp(V1ObjectMeta, Collection<Source>, Sink) that writes both the
depends-on labels and the directional annotations. K8sTriggerDeployer
had reimplemented this inline in two places; both call sites now
collapse to one stamp() call.
Also dropped the 5-arg Trigger constructor; callers must pass a sink
(or null) explicitly.
Collaborator
|
@copilot resolve the merge conflicts in this pull request |
…guard # Conflicts: # hoptimator-api/src/test/java/com/linkedin/hoptimator/PendingDeleteTest.java # hoptimator-k8s/src/main/java/com/linkedin/hoptimator/k8s/K8sPipelineBundle.java # hoptimator-k8s/src/main/java/com/linkedin/hoptimator/k8s/K8sPipelineDeployer.java # hoptimator-k8s/src/main/java/com/linkedin/hoptimator/k8s/K8sValidatorProvider.java # hoptimator-k8s/src/test/java/com/linkedin/hoptimator/k8s/K8sPipelineDeployerTest.java # hoptimator-k8s/src/test/java/com/linkedin/hoptimator/k8s/K8sValidatorProviderTest.java # hoptimator-kafka/src/test/resources/kafka-ddl-create-table.id # hoptimator-logical/src/main/java/com/linkedin/hoptimator/logical/LogicalTableDeployer.java # hoptimator-logical/src/test/java/com/linkedin/hoptimator/logical/LogicalTableDeployerTest.java # hoptimator-logical/src/test/resources/logical-ddl.id Co-authored-by: ryannedolan <1387539+ryannedolan@users.noreply.github.com>
Contributor
Resolved in commit 9fcfb4d. The stacked branch (jogrogan/dependency-guard-validator-v2) had been merged into main as PR #219, so I merged origin/main into this branch and resolved all conflicts, preserving all the new trigger dep-guard work on top of the current main. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Code Coverage
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Extends the pre-delete dependency guard (which previously only saw Pipeline CRDs) to also block deletes when a TableTrigger references the resource being dropped. Along the way it generalizes the labels/checker machinery so it's not
Pipeline-specific in name or shape.
Stacked on jogrogan/dependency-guard-validator-v2.
What's in it
inline duplication that had crept into K8sTriggerDeployer.
Test plan