refactor: replace hook model with RuntimeConfig/FeatureSpec discriminated unions #74
…ated unions

Flatten the entangled HookDefinition → HookManifest → FeatureSchema hierarchy into a clean two-axis model: how a hook runs (RuntimeConfig) and what it produces (FeatureSpec), both using Pydantic discriminated unions for extensibility.

- Add OciConfig (runtime) and TableFeatureSpec (feature) as first variants
- Rename HookLimits → OciLimits (runtime-specific)
- Delete HookManifest, FeatureSchema, HookSnapshot, and the lossy _snapshot_to_hook_definition() converter
- Pass HookDefinition directly through event payloads (no more snapshots)
- Unify FeatureService.create_table() / create_table_from_snapshot()
- Update OCI runner to read hook.runtime.* and hook.name
- Update SDK deploy payload to match new shape
Remove generate_feature_schema alias (now generate_columns), generate_dockerfile alias (now generate_hook_dockerfile), and update all tests to use the new names and payload shapes directly.
Greptile Summary
Replaces the entangled …
Key improvements
Testing
All 621 unit tests pass, the new discriminated union model has comprehensive coverage, and integration tests verify that duplicate table handling works correctly.
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| server/osa/domain/shared/model/hook.py | Replaced entangled hierarchy with clean RuntimeConfig/FeatureSpec discriminated unions - well-designed extensible architecture |
| server/osa/domain/feature/handler/create_feature_tables.py | Updated to use HookDefinition directly and gracefully handle duplicate table creation with ConflictError |
| server/osa/domain/validation/service/validation.py | Deleted lossy snapshot converter, now passes HookDefinition directly to runner |
| server/osa/infrastructure/oci/runner.py | Updated field access to use nested runtime structure (hook.runtime.image, hook.name) |
| sdk/py/osa/cli/deploy.py | Updated payload generation to match new RuntimeConfig/FeatureSpec structure |
| server/osa/domain/deposition/event/convention_registered.py | Changed event payload from HookSnapshot list to HookDefinition list |
Class Diagram
%%{init: {'theme': 'neutral'}}%%
classDiagram
class HookDefinition {
+PgIdentifier name
+OciConfig runtime
+TableFeatureSpec feature
}
class RuntimeConfig {
<<abstract>>
+string type
}
class OciConfig {
+type: "oci"
+string image
+string digest
+dict config
+OciLimits limits
}
class OciLimits {
+int timeout_seconds
+string memory
+string cpu
}
class FeatureSpec {
<<abstract>>
+string kind
}
class TableFeatureSpec {
+kind: "table"
+string cardinality
+ColumnDef[] columns
}
class ColumnDef {
+PgIdentifier name
+string json_type
+string format
+bool required
}
RuntimeConfig <|-- OciConfig
FeatureSpec <|-- TableFeatureSpec
HookDefinition --> OciConfig : runtime
HookDefinition --> TableFeatureSpec : feature
OciConfig --> OciLimits : limits
TableFeatureSpec --> ColumnDef : columns
note for RuntimeConfig "Discriminated on 'type'\nExtensible for NextflowConfig, etc."
note for FeatureSpec "Discriminated on 'kind'\nExtensible for TimeSeriesFeatureSpec, etc."
Last reviewed commit: a4646cb
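The two-axis shape in the class diagram can be illustrated without any dependencies. The block below is a minimal sketch only: it uses plain dataclasses and a tag-to-parser lookup as a stand-in for Pydantic's discriminated unions, so it is not the code in hook.py. The `parse_hook` helper, the parser registries, the default limits, and every payload value are hypothetical; field names follow the diagram.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class OciLimits:
    # Default values are illustrative assumptions, not taken from the PR.
    timeout_seconds: int = 300
    memory: str = "512m"
    cpu: str = "1"


@dataclass(frozen=True)
class OciConfig:
    image: str
    digest: str
    config: dict[str, Any] = field(default_factory=dict)
    limits: OciLimits = field(default_factory=OciLimits)
    type: str = "oci"  # discriminator for the runtime axis


@dataclass(frozen=True)
class ColumnDef:
    name: str
    json_type: str
    format: Optional[str] = None
    required: bool = False


@dataclass(frozen=True)
class TableFeatureSpec:
    cardinality: str
    columns: tuple[ColumnDef, ...]
    kind: str = "table"  # discriminator for the feature axis


@dataclass(frozen=True)
class HookDefinition:
    name: str
    runtime: OciConfig
    feature: TableFeatureSpec


def _parse_oci(data: dict[str, Any]) -> OciConfig:
    return OciConfig(
        image=data["image"],
        digest=data["digest"],
        config=data.get("config", {}),
        limits=OciLimits(**data.get("limits", {})),
    )


def _parse_table(data: dict[str, Any]) -> TableFeatureSpec:
    columns = tuple(ColumnDef(**col) for col in data["columns"])
    return TableFeatureSpec(cardinality=data["cardinality"], columns=columns)


# One registry per axis; a new variant would add one entry here.
RUNTIME_PARSERS = {"oci": _parse_oci}
FEATURE_PARSERS = {"table": _parse_table}


def parse_hook(payload: dict[str, Any]) -> HookDefinition:
    """Select a parser per axis from the 'type' / 'kind' tag."""
    runtime, feature = payload["runtime"], payload["feature"]
    try:
        runtime_parser = RUNTIME_PARSERS[runtime["type"]]
        feature_parser = FEATURE_PARSERS[feature["kind"]]
    except KeyError as exc:
        raise ValueError(f"unknown or missing discriminator: {exc}") from exc
    return HookDefinition(
        name=payload["name"],
        runtime=runtime_parser(runtime),
        feature=feature_parser(feature),
    )


# Hypothetical deploy payload in the new shape.
EXAMPLE_PAYLOAD = {
    "name": "example_hook",
    "runtime": {"type": "oci", "image": "example/hook", "digest": "sha256:0000"},
    "feature": {
        "kind": "table",
        "cardinality": "one",
        "columns": [{"name": "score", "json_type": "number", "required": True}],
    },
}

hook = parse_hook(EXAMPLE_PAYLOAD)
print(hook.name, hook.runtime.image, hook.feature.columns[0].name)
```

With this layout the runner reads `hook.runtime.image` and `hook.name`, matching the field access described for runner.py, and an unrecognised `type` or `kind` is rejected at parse time rather than at execution time.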
Change mock return value from events=[] to deliveries=[] to match updated ClaimResult interface
Add SDK path filtering and test job for Python SDK validation. Add server contract tests for API contract verification. Add server integration tests with PostgreSQL service for end-to-end testing with database interactions.
test: improve mock setup with returncode and direct httpx patching
feat: split integration tests into regular and postgres-specific variants
test: add deliveries table to truncation list in test cleanup
refactor: migrate from events-based to delivery-based event processing
@greptile

        outbox: Outbox

        async def handle(self, event: ConventionRegistered) -> None:
    -       for hook_snapshot in event.hooks:
    +       for hook in event.hooks:
                logger.info(
                    "Creating feature table: hook=%s convention=%s",
    -               hook_snapshot.name,
    +               hook.name,
                    event.convention_srn,
                )
    -           await self.feature_service.create_table_from_snapshot(hook_snapshot)
    +           await self.feature_service.create_table(hook)

create_table is no longer idempotent — handler will fail on event redelivery
The old FeatureStore.create_table() was idempotent (used checkfirst=True / ON CONFLICT DO NOTHING). The integration test added in this PR explicitly changes this behaviour:

    # test_feature_store.py (new)
    with pytest.raises(ConflictError, match="already exists"):
        await store.create_table("duplicate_hook", hook.feature.columns)

This handle() method makes no attempt to catch ConflictError. In an at-least-once delivery system (which the outbox/event model here is), ConventionRegistered can be redelivered, e.g. after a worker restart, a stale claim reset, or an upstream retry. On the second delivery, create_table will raise ConflictError, the hook loop will abort, and ConventionReady will never be emitted for that convention, silently stalling the pipeline.
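The difference between idempotent and non-idempotent DDL can be reproduced in isolation. This is a minimal sketch that uses the standard-library `sqlite3` module as a stand-in for the PostgreSQL-backed store in this PR; the column layout is an assumption, and only the table name is borrowed from the test above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not the project's PostgreSQL store
conn.execute("CREATE TABLE duplicate_hook (id INTEGER PRIMARY KEY)")

# A plain CREATE TABLE is not idempotent: repeating it raises an error,
# which is the situation a redelivered event would hit.
try:
    conn.execute("CREATE TABLE duplicate_hook (id INTEGER PRIMARY KEY)")
    second_plain_create_failed = False
except sqlite3.OperationalError:
    second_plain_create_failed = True

# IF NOT EXISTS turns the same statement into a no-op when the table exists,
# so it is safe to run any number of times.
conn.execute("CREATE TABLE IF NOT EXISTS duplicate_hook (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE IF NOT EXISTS duplicate_hook (id INTEGER PRIMARY KEY)")

table_count = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'table' AND name = 'duplicate_hook'"
).fetchone()[0]
print(second_plain_create_failed, table_count)
```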
Options:
- Restore idempotent DDL (re-add `checkfirst=True` / `IF NOT EXISTS` to the store): simplest and matches the original intent.
- Catch `ConflictError` and treat it as a no-op:

    from osa.domain.shared.error import ConflictError

    async def handle(self, event: ConventionRegistered) -> None:
        for hook in event.hooks:
            logger.info("Creating feature table: hook=%s convention=%s", hook.name, event.convention_srn)
            try:
                await self.feature_service.create_table(hook)
            except ConflictError:
                logger.info("Feature table already exists, skipping: hook=%s", hook.name)
        await self.outbox.append(ConventionReady(...))

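The catch-and-skip option can be exercised end to end with in-memory stand-ins. The sketch below is an assumption-laden illustration, not the project's handler: `InMemoryFeatureService`, `Handler`, the `emitted` list (standing in for the outbox), and the SRN string are all hypothetical, and `ConflictError` is redefined locally instead of imported from `osa.domain.shared.error`.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class ConflictError(Exception):
    """Local stand-in for osa.domain.shared.error.ConflictError."""


class InMemoryFeatureService:
    """Hypothetical service whose create_table is NOT idempotent."""

    def __init__(self) -> None:
        self.tables: set[str] = set()

    async def create_table(self, hook_name: str) -> None:
        if hook_name in self.tables:
            raise ConflictError(f"table {hook_name!r} already exists")
        self.tables.add(hook_name)


class Handler:
    def __init__(self, service: InMemoryFeatureService) -> None:
        self.service = service
        self.emitted: list[tuple[str, str]] = []  # stand-in for the outbox

    async def handle(self, convention_srn: str, hook_names: list[str]) -> None:
        for name in hook_names:
            try:
                await self.service.create_table(name)
            except ConflictError:
                # Redelivery: the table is already there, so carry on.
                logger.info("Feature table already exists, skipping: hook=%s", name)
        # Reached on first delivery and on redelivery alike.
        self.emitted.append(("ConventionReady", convention_srn))


async def main() -> Handler:
    handler = Handler(InMemoryFeatureService())
    await handler.handle("example-convention", ["hook_a", "hook_b"])
    await handler.handle("example-convention", ["hook_a", "hook_b"])  # simulated redelivery
    return handler


handler = asyncio.run(main())
print(sorted(handler.service.tables), handler.emitted)
```

In this sketch a redelivery emits ConventionReady a second time, so downstream consumers would also need to tolerate duplicates; whether that holds in this codebase is not established by the PR text.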
Additional Comments (1)
The … As written, the … To honour the stated extensibility goal today and avoid a silent no-op annotation:

    from typing import Annotated, Union

    from pydantic import Field

    # After adding each new variant, extend this union:
    RuntimeConfigUnion = Union[OciConfig]  # → Union[OciConfig, NextflowConfig, ...]
    FeatureSpecUnion = Union[TableFeatureSpec]

    class HookDefinition(ValueObject):
        name: PgIdentifier
        runtime: Annotated[RuntimeConfigUnion, Field(discriminator="type")]
        feature: Annotated[FeatureSpecUnion, Field(discriminator="kind")]

This keeps the field types correct today (single-element unions collapse to the concrete type in practice) while giving a clear extension point and making …
Handle ConflictError when creating feature tables to support event redelivery scenarios where tables may already exist, ensuring ConventionReady event is still emitted for idempotent processing
@greptile
Summary
- Flatten the entangled HookDefinition → HookManifest → FeatureSchema hierarchy into a clean two-axis model: how a hook runs (RuntimeConfig) and what it produces (FeatureSpec), both using Pydantic discriminated unions for extensibility
- Delete HookManifest, FeatureSchema, HookSnapshot, and the lossy _snapshot_to_hook_definition() converter; HookDefinition is now a value object passed directly through event payloads
- Unify FeatureService.create_table() / create_table_from_snapshot() into a single method, and update the OCI runner + SDK deploy payload to match the new shape

What changed
New model (hook.py):
Deleted: HookManifest, FeatureSchema, HookSnapshot, hook_snapshot.py, _snapshot_to_hook_definition()
Updated: 5 event files, 5 services, 1 handler, OCI runner, SDK manifest + deploy, 15 test files
Test plan
- uv run pytest tests/unit -v
- uv run ruff check osa/ tests/
- uv run ty check: no new diagnostics
- osa deploy from rcsb-pdb/, verify conventions serialize correctly