Skip to content

refactor!: explicit backend api cleanup#6

Merged
gkennos merged 21 commits into
mainfrom
api_cleanup
May 19, 2026
Merged

refactor!: explicit backend api cleanup#6
gkennos merged 21 commits into
mainfrom
api_cleanup

Conversation

@gkennos
Copy link
Copy Markdown
Member

@gkennos gkennos commented May 14, 2026

BREAKING CHANGE: known breaking change - removal of the dialect argument from _merge_replace() and _merge_upsert().

Opinionated backend now means that in the unlikely event that someone was using this on some other dialect, it will no longer work, but normal usage on SQLite and PostgreSQL is intended to remain functionally the same.

CLOSES: #3, CLOSES: #5, CLOSES: #7

@gkennos gkennos requested review from Copilot and nicoloesch May 14, 2026 22:26
@gkennos gkennos self-assigned this May 14, 2026
@gkennos gkennos added the enhancement New feature or request label May 14, 2026
@gkennos gkennos changed the title explicit backend api cleanup refactor!: explicit backend api cleanup May 14, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors orm-loader to make database-specific behavior explicit via a backend abstraction, while updating the PostgreSQL path to use psycopg (v3) and expanding tests/docs accordingly.

Changes:

  • Introduces a DatabaseBackend contract with concrete SQLite/PostgreSQL implementations plus resolve_backend() dispatch.
  • Routes staging-table creation, fast-path loading, merge behavior, FK toggling, and materialized view operations through the resolved backend.
  • Updates packaging to add a postgres extra (psycopg[binary]), bumps version to 0.4.0, and adds/updates backend-focused test coverage and docs.

Reviewed changes

Copilot reviewed 34 out of 36 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
uv.lock Bumps package version and adds locked dependencies for new postgres extra (psycopg).
TODO.txt Adds future-facing notes about optional malformed text repair.
tests/models.py Adds an index to support index-management behavior in merge tests.
tests/conftest.py Switches PostgreSQL test DSN to SQLAlchemy psycopg driver.
tests/backends/test_sqlite_backend.py Adds unit tests for SQLite backend staging, FK toggling, merge SQL, and pragmas hooks.
tests/backends/test_postgres_backend.py Adds unit tests for Postgres backend staging, FK toggling, merge SQL, and materialized view SQL emission.
tests/backends/test_base_backend.py Adds contract tests for backend base class, capabilities, backend resolution, and import behavior without psycopg installed.
src/orm_loader/tables/typing.py Tightens typing (generics, TypedDict/Unpack) and adds index_strategy/updated merge signatures to the protocol.
src/orm_loader/tables/serialisable_table.py Improves type annotations for JSON/dict serialization helpers.
src/orm_loader/tables/orm_table.py Tightens ORMTableBase typing and reduces cast() usage.
src/orm_loader/tables/loadable_table.py Refactors staging/merge/index-management to use backend resolution and introduces index_strategy.
src/orm_loader/mappers/materialised_view_mixin.py Routes MV create/refresh through backend implementations and improves typing.
src/orm_loader/loaders/loading_helpers.py Updates PostgreSQL COPY helper to a psycopg3-style copy loop and introduces COPY_BLOCK_SIZE.
src/orm_loader/loaders/data_classes.py Removes stale commented-out dedupe code and trailing whitespace.
src/orm_loader/helpers/sqlite.py Replaces direct event-hook/pragmas logic with backend-delegated helpers for SQLite.
src/orm_loader/helpers/logging.py Simplifies log-level coercion and adds typing for formatter methods.
src/orm_loader/helpers/discovery.py Updates typing and changes defaulting behavior for the base parameter.
src/orm_loader/helpers/bulk.py Delegates FK toggling and bulk-load context behavior to the resolved backend; delegates replica-role engine context.
src/orm_loader/helpers/bootstrap.py Adds type annotations for bootstrap/schema creation helpers.
src/orm_loader/helpers/init.py Re-exports new/renamed SQLite helpers.
src/orm_loader/backends/sqlite.py Adds SQLite backend implementation (staging, merge SQL, FK toggling, connect hooks, FK error explanation, journal restore).
src/orm_loader/backends/resolve.py Adds backend resolver selecting a concrete backend based on SQLAlchemy dialect.
src/orm_loader/backends/postgres.py Adds Postgres backend implementation (unlogged staging, COPY fast-path, merge SQL, MV operations, replica-role context).
src/orm_loader/backends/base.py Adds backend base contract, capabilities, shared helpers, and a generic bulk-load context manager.
src/orm_loader/backends/init.py Exposes backend public API and resolver.
README.md Updates project description and documents backend support and staged ingestion approach.
pyproject.toml Bumps version, adds postgres optional dependency, and expands metadata/classifiers and tooling config.
docs/tables/mat_view.md Updates MV docs to reflect backend resolution and PostgreSQL-only support.
docs/tables/loadable_table.md Updates staged ingestion docs to reflect backend-specific behavior and index handling.
docs/loaders/loaders.md Clarifies loader responsibilities vs table/backend merge behavior and updates dedupe description.
docs/loaders/index.md Updates loader overview to emphasize staged loading and table/backend ownership of merge semantics.
docs/loaders/helpers.md Updates helper docs, including PostgreSQL COPY behavior and fallback behavior.
docs/loaders/context.md Documents quote_mode in loader context.
docs/index.md Updates project overview and explicitly documents current backend support limits.
CHANGELOG.md Adds 0.4.0 notes about psycopg and backend API cleanup.
.gitignore Adds _temp/ ignore entry.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pyproject.toml Outdated
Comment thread src/orm_loader/backends/postgres.py
Comment thread src/orm_loader/backends/sqlite.py
Comment thread src/orm_loader/backends/sqlite.py
Comment thread src/orm_loader/helpers/discovery.py
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 36 changed files in this pull request and generated 4 comments.

Comment thread pyproject.toml Outdated
Comment thread src/orm_loader/backends/postgres.py
Comment thread src/orm_loader/helpers/discovery.py
Comment thread src/orm_loader/helpers/sqlite.py Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 39 changed files in this pull request and generated 4 comments.

Comment thread src/orm_loader/backends/base.py Outdated
Comment thread src/orm_loader/tables/orm_table.py Outdated
Comment thread src/orm_loader/backends/sqlite.py Outdated
Comment thread src/orm_loader/backends/postgres.py Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 39 changed files in this pull request and generated 4 comments.

Comment thread src/orm_loader/backends/sqlite.py
Comment thread src/orm_loader/backends/postgres.py
Comment thread src/orm_loader/backends/postgres.py
Comment thread src/orm_loader/helpers/logging.py
Comment thread src/orm_loader/backends/base.py Outdated
Comment thread src/orm_loader/backends/resolve.py
Copy link
Copy Markdown
Collaborator

@nicoloesch nicoloesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid work overall. The backend abstraction is clean and the three stated goals are met. A few things to address before merge:

Medium

  1. SQL injection in quick_load_pg (loading_helpers.py): encoding and delimiter are interpolated directly into the COPY SQL string without validation. delimiter is safe in practice (only , or \t), but encoding comes from chardet and should be checked against an allow-list before interpolation.

  2. quote_mode default mismatch: load_csv defaults to "csv", quick_load_pg defaults to "auto". Auto-detection is the main new feature but it's unreachable through the normal API path. Suggest changing load_csv's default to "auto" (and updating LoaderContext to match) so it actually fires by default.

Low / Cleanup

  1. session.bind deprecation (loadable_table.py lines 96, 120, 180, 221, 441): Deprecated in SA 2.x; can return None silently with the recommended session construction pattern. Replace with session.get_bind().

  2. Bare assert isinstance(...) in production backend code (postgres.py lines 101, 107; sqlite.py lines 115, 121): Assertions are stripped by python -O. Use if not isinstance(...): raise RuntimeError(...).

  3. Orphaned comment block in data_classes.py (lines 174–199): The _dedupe_db method stub was removed but its commented-out body wasn't. Delete these lines.

  4. get_model_by_tablename (discovery.py): Only searches direct Base subclasses. Won't find models behind a mixin layer. Functionality is also not in main but we may need to address this.

  5. helpers/sqlite.py clarity: File is a valid compat shim (old callers using enable_sqlite_foreign_keys as a bare event hook function still work). The docstring should say so explicitly and point callers toward attach_sqlite_bulk_load_pragmas as the preferred path.

Comment thread src/orm_loader/helpers/discovery.py Outdated
Comment thread src/orm_loader/helpers/sqlite.py
Comment thread src/orm_loader/tables/typing.py
Comment thread src/orm_loader/tables/loadable_table.py
Comment thread src/orm_loader/backends/postgres.py
Comment thread src/orm_loader/backends/sqlite.py
Comment thread src/orm_loader/helpers/bulk.py
Comment thread src/orm_loader/loaders/data_classes.py Outdated
@gkennos gkennos requested a review from nicoloesch May 19, 2026 05:01
Copy link
Copy Markdown
Collaborator

@nicoloesch nicoloesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 asserts are still present, one comment about docstring. Apart from that, approved!

Comment thread src/orm_loader/helpers/bulk.py Outdated
Comment thread src/orm_loader/helpers/bulk.py Outdated
Comment thread src/orm_loader/helpers/sqlite.py
Comment thread src/orm_loader/helpers/bulk.py
@gkennos
Copy link
Copy Markdown
Member Author

gkennos commented May 19, 2026

All comments addressed - merging

@gkennos gkennos merged commit aa26f58 into main May 19, 2026
3 checks passed
@gkennos gkennos deleted the api_cleanup branch May 19, 2026 06:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

3 participants