Skip to content

Feature request: typed query artifacts for SQL files #509

@cofin

Description

@cofin

Summary

Follow-up feature request for brainstorming typed artifacts generated from SQL files. This is a wishlist / design exploration item, not committed implementation scope for PR #497.

Background

PR #497 adds -- param: metadata, runtime validation, optional named parameters, and declared type resolution. The next possible step is generating type-checker and IDE-friendly artifacts from the SQL file loader metadata plus query shape.

Wishlist / Brainstorming Topics

  • Generate typed query artifacts from existing SQLSpec surfaces, especially SQLFileLoader, ParameterDeclaration, SQLGlot parsing, and data dictionary metadata where available.
  • Factor in query shape, not just declared params: selected columns, aliases, expressions, select *, joins, casts, functions, and scalar/count-style queries.
  • Generate parameter artifacts such as TypedDict classes, keyword-only accessor signatures, and shared type aliases for JSON/UUID/Decimal/date/time values.
  • Generate result artifacts in one or more forms:
    • msgspec.Struct
    • Pydantic BaseModel
    • TypedDict
    • potentially dataclass/attrs if there is demand
  • Generate typed runtime accessors, not only .pyi stubs. .pyi may still be useful when runtime code stays dynamic, but should not be assumed to be the only output.
  • Centralize import generation so emitted files include only the imports actually needed by generated parameter/result types.
  • Use explicit SQL annotations as an escape hatch for ambiguous or non-portable query shapes, for example expression result types, JSON paths, driver-specific functions, duplicate column names, and unresolved select * projections.
  • Keep runtime validation separate from generated typing. Generated code should improve IDE/type-checker ergonomics without duplicating the runtime validation contract.

Possible Artifact Shape

A generator might emit files similar to:

generated/
  params.py      # param TypedDicts and aliases
  models.py      # result models, e.g. msgspec/Pydantic/TypedDict
  queries.py     # typed runtime accessors / optional typed execution helpers
  queries.pyi    # optional, if needed for dynamic runtime wrappers
  __init__.py

Open Design Questions

  • Which output families should be supported first: TypedDict, msgspec, Pydantic, or configurable combinations?
  • Should typed execution helpers call driver methods directly, or should generation stop at typed (SQL, params) accessors?
  • How should cardinality be declared or inferred: many, one, one-or-none, scalar, execute?
  • What annotation syntax should be used for result hints when SQL/data dictionary inference is insufficient?
  • How much should this rely on live database data dictionary introspection versus offline SQL-only inference?
  • Where should generated files live, and should the generator be CLI-based, Python API-based, or both?

Non-goals to consider

  • Do not build a separate SQL parser when existing SQLFileLoader / SQLGlot / data dictionary surfaces can be reused.
  • Do not duplicate execute-time runtime validation.
  • Do not infer full ORM-like schema models or relationships.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions