Skip to content

refactor: replace hook model with RuntimeConfig/FeatureSpec discriminated unions#74

Merged
rorybyrne merged 7 commits into
mainfrom
refactor/hook-model-discriminated-unions
Mar 2, 2026
Merged

refactor: replace hook model with RuntimeConfig/FeatureSpec discriminated unions#74
rorybyrne merged 7 commits into
mainfrom
refactor/hook-model-discriminated-unions

Conversation

@rorybyrne

Copy link
Copy Markdown
Contributor

Summary

  • Flatten the entangled HookDefinition → HookManifest → FeatureSchema hierarchy into a clean two-axis model: how a hook runs (RuntimeConfig) and what it produces (FeatureSpec), both using Pydantic discriminated unions for extensibility
  • Delete HookManifest, FeatureSchema, HookSnapshot, and the lossy _snapshot_to_hook_definition() converter — HookDefinition is now a value object passed directly through event payloads
  • Unify FeatureService.create_table() / create_table_from_snapshot() into a single method, and update the OCI runner + SDK deploy payload to match the new shape

What changed

New model (hook.py):

class HookDefinition(ValueObject):
    name: PgIdentifier
    runtime: Annotated[OciConfig, Field(discriminator="type")]
    feature: Annotated[TableFeatureSpec, Field(discriminator="kind")]

Deleted: HookManifest, FeatureSchema, HookSnapshot, hook_snapshot.py, _snapshot_to_hook_definition()

Updated: 5 event files, 5 services, 1 handler, OCI runner, SDK manifest + deploy, 15 test files

Test plan

  • All 621 unit tests pass (uv run pytest tests/unit -v)
  • Lint clean (uv run ruff check osa/ tests/)
  • Type check passes (uv run ty check) — no new diagnostics
  • Pre-commit hooks all green
  • Nuke demo DB, re-run osa deploy from rcsb-pdb/, verify conventions serialize correctly
  • Verify records publish and features insert with new shape

…ated unions

Flatten the entangled HookDefinition → HookManifest → FeatureSchema
hierarchy into a clean two-axis model: how a hook runs (RuntimeConfig)
and what it produces (FeatureSpec), both using Pydantic discriminated
unions for extensibility.

- Add OciConfig (runtime) and TableFeatureSpec (feature) as first variants
- Rename HookLimits → OciLimits (runtime-specific)
- Delete HookManifest, FeatureSchema, HookSnapshot, and the lossy
  _snapshot_to_hook_definition() converter
- Pass HookDefinition directly through event payloads (no more snapshots)
- Unify FeatureService.create_table() / create_table_from_snapshot()
- Update OCI runner to read hook.runtime.* and hook.name
- Update SDK deploy payload to match new shape
@github-actions

github-actions Bot commented Mar 1, 2026

Copy link
Copy Markdown

Code Coverage

Package Line Rate Complexity Health
. 70% 0
application 0% 0
application.api 100% 0
application.api.rest 0% 0
application.api.v1 82% 0
application.api.v1.routes 10% 0
application.event 100% 0
cli 40% 0
cli.commands 18% 0
cli.util 53% 0
domain 100% 0
domain.auth 100% 0
domain.auth.command 97% 0
domain.auth.event 100% 0
domain.auth.model 92% 0
domain.auth.port 99% 0
domain.auth.query 93% 0
domain.auth.service 90% 0
domain.auth.util 100% 0
domain.auth.util.di 81% 0
domain.curation 100% 0
domain.curation.adapter 100% 0
domain.curation.command 100% 0
domain.curation.event 100% 0
domain.curation.handler 92% 0
domain.curation.model 100% 0
domain.curation.port 100% 0
domain.curation.query 100% 0
domain.curation.service 100% 0
domain.deposition 100% 0
domain.deposition.adapter 100% 0
domain.deposition.command 68% 0
domain.deposition.event 100% 0
domain.deposition.handler 100% 0
domain.deposition.model 98% 0
domain.deposition.port 100% 0
domain.deposition.query 60% 0
domain.deposition.service 97% 0
domain.export 100% 0
domain.export.adapter 100% 0
domain.export.command 100% 0
domain.export.event 100% 0
domain.export.model 100% 0
domain.export.port 100% 0
domain.export.query 100% 0
domain.export.service 100% 0
domain.feature 100% 0
domain.feature.event 100% 0
domain.feature.handler 100% 0
domain.feature.model 0% 0
domain.feature.port 100% 0
domain.feature.service 96% 0
domain.feature.util 100% 0
domain.feature.util.di 0% 0
domain.index 100% 0
domain.index.event 100% 0
domain.index.handler 75% 0
domain.index.model 84% 0
domain.index.service 100% 0
domain.record 100% 0
domain.record.adapter 100% 0
domain.record.command 100% 0
domain.record.event 100% 0
domain.record.handler 100% 0
domain.record.model 100% 0
domain.record.port 100% 0
domain.record.query 100% 0
domain.record.service 89% 0
domain.search 100% 0
domain.search.adapter 100% 0
domain.search.command 100% 0
domain.search.event 100% 0
domain.search.model 100% 0
domain.search.port 100% 0
domain.search.query 100% 0
domain.search.service 100% 0
domain.semantics 100% 0
domain.semantics.command 31% 0
domain.semantics.event 100% 0
domain.semantics.handler 100% 0
domain.semantics.model 100% 0
domain.semantics.port 100% 0
domain.semantics.query 0% 0
domain.semantics.service 100% 0
domain.semantics.util 100% 0
domain.semantics.util.di 0% 0
domain.shared 91% 0
domain.shared.authorization 87% 0
domain.shared.model 93% 0
domain.shared.port 100% 0
domain.source 100% 0
domain.source.event 100% 0
domain.source.handler 92% 0
domain.source.model 100% 0
domain.source.port 100% 0
domain.source.schedule 67% 0
domain.source.service 98% 0
domain.validation 100% 0
domain.validation.adapter 100% 0
domain.validation.command 0% 0
domain.validation.event 100% 0
domain.validation.handler 90% 0
domain.validation.model 85% 0
domain.validation.port 100% 0
domain.validation.query 100% 0
domain.validation.service 89% 0
infrastructure 100% 0
infrastructure.auth 0% 0
infrastructure.event 64% 0
infrastructure.http 36% 0
infrastructure.index 0% 0
infrastructure.index.vector 73% 0
infrastructure.messaging 100% 0
infrastructure.oci 52% 0
infrastructure.persistence 71% 0
infrastructure.persistence.adapter 74% 0
infrastructure.persistence.mappers 56% 0
infrastructure.persistence.repository 32% 0
infrastructure.source 0% 0
sdk 100% 0
sdk.index 100% 0
util 100% 0
util.di 25% 0
Summary 63% (4041 / 6419) 0

Remove generate_feature_schema alias (now generate_columns),
generate_dockerfile alias (now generate_hook_dockerfile), and update
all tests to use the new names and payload shapes directly.
@greptile-apps

greptile-apps Bot commented Mar 1, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

Replaces the entangled HookDefinition → HookManifest → FeatureSchema hierarchy with a clean two-axis discriminated union model, eliminating the intermediate HookSnapshot type and its lossy converter.

Key improvements

  • New architecture: HookDefinition now directly contains RuntimeConfig (how it runs) and FeatureSpec (what it produces), both using Pydantic discriminated unions for type-safe extensibility
  • Deleted complexity: Removed HookManifest, FeatureSchema, HookSnapshot, and the lossy _snapshot_to_hook_definition() converter that was discarding data
  • Event payloads: All events (ConventionRegistered, RecordPublished, etc.) now carry complete HookDefinition objects instead of partial snapshots
  • Service simplification: Unified FeatureService.create_table() and create_table_from_snapshot() into a single method
  • Graceful conflict handling: CreateFeatureTables handler now catches ConflictError for duplicate tables, preventing event processing failures on redelivery
  • CI expansion: Added contract, integration, and SDK test workflows

Testing

All 621 unit tests pass, comprehensive coverage of the new discriminated union model, and integration tests verify the duplicate table handling works correctly.

Confidence Score: 5/5

  • This PR is safe to merge with high confidence
  • Well-executed architectural refactoring with comprehensive test coverage (621 passing unit tests), proper handling of edge cases (duplicate table creation), clean discriminated union pattern for extensibility, and backwards-compatible event payload changes that don't require database migrations
  • No files require special attention - the refactoring is thorough and well-tested across all layers

Important Files Changed

Filename Overview
server/osa/domain/shared/model/hook.py Replaced entangled hierarchy with clean RuntimeConfig/FeatureSpec discriminated unions - well-designed extensible architecture
server/osa/domain/feature/handler/create_feature_tables.py Updated to use HookDefinition directly and gracefully handle duplicate table creation with ConflictError
server/osa/domain/validation/service/validation.py Deleted lossy snapshot converter, now passes HookDefinition directly to runner
server/osa/infrastructure/oci/runner.py Updated field access to use nested runtime structure (hook.runtime.image, hook.name)
sdk/py/osa/cli/deploy.py Updated payload generation to match new RuntimeConfig/FeatureSpec structure
server/osa/domain/deposition/event/convention_registered.py Changed event payload from HookSnapshot list to HookDefinition list

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class HookDefinition {
        +PgIdentifier name
        +OciConfig runtime
        +TableFeatureSpec feature
    }
    
    class RuntimeConfig {
        <<abstract>>
        +string type
    }
    
    class OciConfig {
        +type: "oci"
        +string image
        +string digest
        +dict config
        +OciLimits limits
    }
    
    class OciLimits {
        +int timeout_seconds
        +string memory
        +string cpu
    }
    
    class FeatureSpec {
        <<abstract>>
        +string kind
    }
    
    class TableFeatureSpec {
        +kind: "table"
        +string cardinality
        +ColumnDef[] columns
    }
    
    class ColumnDef {
        +PgIdentifier name
        +string json_type
        +string format
        +bool required
    }
    
    RuntimeConfig <|-- OciConfig
    FeatureSpec <|-- TableFeatureSpec
    HookDefinition --> OciConfig : runtime
    HookDefinition --> TableFeatureSpec : feature
    OciConfig --> OciLimits : limits
    TableFeatureSpec --> ColumnDef : columns
    
    note for RuntimeConfig "Discriminated on 'type'\nExtensible for NextflowConfig, etc."
    note for FeatureSpec "Discriminated on 'kind'\nExtensible for TimeSeriesFeatureSpec, etc."
Loading

Last reviewed commit: a4646cb

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

47 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Change mock return value from events=[] to deliveries=[] to match
updated ClaimResult interface
Add SDK path filtering and test job for Python SDK validation.
Add server contract tests for API contract verification.
Add server integration tests with PostgreSQL service for
end-to-end testing with database interactions.
test: improve mock setup with returncode and direct httpx patching

feat: split integration tests into regular and postgres-specific variants

test: add deliveries table to truncation list in test cleanup

refactor: migrate from events-based to delivery-based event processing
@rorybyrne

Copy link
Copy Markdown
Contributor Author

@greptile

Comment on lines +23 to +32
outbox: Outbox

async def handle(self, event: ConventionRegistered) -> None:
for hook_snapshot in event.hooks:
for hook in event.hooks:
logger.info(
"Creating feature table: hook=%s convention=%s",
hook_snapshot.name,
hook.name,
event.convention_srn,
)
await self.feature_service.create_table_from_snapshot(hook_snapshot)
await self.feature_service.create_table(hook)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_table is no longer idempotent — handler will fail on event redelivery

The old FeatureStore.create_table() was idempotent (used checkfirst=True / ON CONFLICT DO NOTHING). The integration test added in this PR explicitly changes this behaviour:

# test_feature_store.py (new)
with pytest.raises(ConflictError, match="already exists"):
    await store.create_table("duplicate_hook", hook.feature.columns)

This handle() method makes no attempt to catch ConflictError. In an at-least-once delivery system (which the outbox/event model here is), ConventionRegistered can be redelivered — e.g. after a worker restart, a stale claim reset, or an upstream retry. On the second delivery, create_table will raise ConflictError, the hook loop will abort, and ConventionReady will never be emitted for that convention, silently stalling the pipeline.

Options:

  1. Restore idempotent DDL (re-add checkfirst=True / IF NOT EXISTS to the store) — simplest and matches the original intent.
  2. Catch ConflictError and treat it as a no-op:
from osa.domain.shared.error import ConflictError

async def handle(self, event: ConventionRegistered) -> None:
    for hook in event.hooks:
        logger.info("Creating feature table: hook=%s convention=%s", hook.name, event.convention_srn)
        try:
            await self.feature_service.create_table(hook)
        except ConflictError:
            logger.info("Feature table already exists, skipping: hook=%s", hook.name)
    await self.outbox.append(ConventionReady(...))

@greptile-apps

greptile-apps Bot commented Mar 1, 2026

Copy link
Copy Markdown
Contributor
Additional Comments (1)

server/osa/domain/shared/model/hook.py
Discriminated union applied to concrete type, not a Union

The runtime and feature fields are annotated with Field(discriminator=...) but the type is a single concrete class (OciConfig / TableFeatureSpec), not a Union. Pydantic v2 discriminated unions require a Union type for the discriminator to actually function.

As written, the discriminator annotation is effectively ignored during validation — Pydantic just treats the field as OciConfig or TableFeatureSpec directly. The PR comment promises "new variants slot in without touching existing code," but that won't be true: adding NextflowConfig would also require updating HookDefinition itself to change the type to Union[OciConfig, NextflowConfig].

To honour the stated extensibility goal today and avoid a silent no-op annotation:

from typing import Union

# After adding each new variant, extend this union:
RuntimeConfigUnion = Union[OciConfig]   # → Union[OciConfig, NextflowConfig, ...]
FeatureSpecUnion   = Union[TableFeatureSpec]

class HookDefinition(ValueObject):
    name: PgIdentifier
    runtime: Annotated[RuntimeConfigUnion, Field(discriminator="type")]
    feature: Annotated[FeatureSpecUnion,   Field(discriminator="kind")]

This keeps the field types correct today (single-element unions collapse to the concrete type in practice) while giving a clear extension point and making model_validate work correctly with the discriminator once a second variant is added.

Handle ConflictError when creating feature tables to support event
redelivery scenarios where tables may already exist, ensuring
ConventionReady event is still emitted for idempotent processing
@rorybyrne

Copy link
Copy Markdown
Contributor Author

@greptile

@rorybyrne rorybyrne merged commit cefb391 into main Mar 2, 2026
9 checks passed
@rorybyrne rorybyrne deleted the refactor/hook-model-discriminated-unions branch March 2, 2026 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant