Summary
SqlProof.client_for_dataset({}) calls _insert_dataset unconditionally,
which calls insertion_order(schema_info.tables) to topo-sort tables
before iterating any rows. If the target schema has a foreign-key cycle,
the topo sort raises CircularDependencyError — even when the dataset
is empty and no rows ever need to land.
Reproduction
Any Postgres schema with a mutual FK cycle. In our schema the cycle is:
content.current_snapshot_id → content_import_snapshots(id)
content_import_snapshots.content_id → content(id)
(A common "row points at the most recent child" pattern.)
    from sqlproof import SqlProof

    proof = SqlProof.from_connection_string(dsn)
    with proof.client_for_dataset({}) as client:  # raises here
        pass
Output:
sqlproof.exceptions.CircularDependencyError: Circular foreign-key dependency detected: ai_response_contexts, ai_responses, blog_drafts, content, content_edit_feedback, content_edits, content_findings, content_import_attempts, content_import_snapshots, content_versions, gap_content_drafts, geo_citations_old, geo_questions, webflow_published_items, wordpress_published_posts, workflow_metadata
Note: the table list in the error reports everything transitively
downstream of the cycle, not just the two tables that form it — which
makes the actual cycle hard to spot.
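To illustrate how the error could name only the tables that form the cycle, here is a minimal sketch using the standard library's graphlib. It is not how sqlproof does its sort; the `fk_deps` mapping and the `find_fk_cycle` helper are hypothetical, and the graph is a trimmed-down version of our schema.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical FK graph: each table maps to the tables it references.
# Only content and content_import_snapshots form the cycle; the other
# tables merely sit downstream of it.
fk_deps = {
    "content": {"content_import_snapshots"},
    "content_import_snapshots": {"content"},
    "content_versions": {"content"},
    "content_edits": {"content_versions"},
}


def find_fk_cycle(deps):
    """Return the tables forming one FK cycle, or None if the graph is acyclic."""
    try:
        # static_order() is lazy, so consume it to trigger the sort.
        list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # args[1] holds the detected cycle; its first and last nodes are equal.
        return exc.args[1]
    return None


cycle = find_fk_cycle(fk_deps)
print(" -> ".join(cycle))
```

Reporting just that list would point straight at the two offending tables instead of all sixteen.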
Why this is a bug
For an empty dataset the topo-sort result is never used. The loop body
in src/sqlproof/core.py:250-261 finds no rows for any table and exits:
    for table in insertion_order(schema_info.tables):
        rows = dataset.get(table.name, [])
        for row in rows:
            if not row:
                continue
            # … insert
The effect: every property test that uses the standard
proof.client_for_dataset({}) pattern against a schema with a cycle
can't even open the client. In our repo this took out the entire
DB-backed test suite (~30 tests), not just the ones that would
actually have wanted to insert into the cyclic tables.
Suggested fix
Short-circuit _insert_dataset when the dataset has no rows to insert:
    def _insert_dataset(client, schema_info, dataset):
        if not any(rows for rows in dataset.values()):
            return
        for table in insertion_order(schema_info.tables):
            …
Two-line patch. Doesn't change behavior for non-empty datasets and
doesn't paper over real cycle errors when rows actually need to land —
those still get raised, with the same message, at the same call site.
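A self-contained sketch of the guard's behavior, for anyone who wants to check it without a database. Everything here is a stand-in: `insertion_order` is reimplemented on top of graphlib, `CircularDependencyError` is a local class, and `insert_dataset` takes a plain dict of FK dependencies instead of `schema_info`. Only the guard line itself matches the proposed patch.

```python
from graphlib import CycleError, TopologicalSorter


class CircularDependencyError(Exception):
    """Stand-in for sqlproof.exceptions.CircularDependencyError."""


def insertion_order(fk_deps):
    """Stand-in topo sort: fk_deps maps table name -> referenced table names."""
    try:
        return list(TopologicalSorter(fk_deps).static_order())
    except CycleError as exc:
        raise CircularDependencyError(str(exc.args[1])) from exc


def insert_dataset(fk_deps, dataset, inserted):
    """Simplified _insert_dataset with the proposed empty-dataset guard."""
    if not any(rows for rows in dataset.values()):
        return  # nothing to insert, so the ordering is never needed
    for table in insertion_order(fk_deps):
        for row in dataset.get(table, []):
            if not row:
                continue
            inserted.append((table, row))  # placeholder for the real INSERT


cyclic = {
    "content": {"content_import_snapshots"},
    "content_import_snapshots": {"content"},
}

log = []
insert_dataset(cyclic, {}, log)               # returns before sorting
insert_dataset(cyclic, {"content": []}, log)  # also counts as empty

try:
    insert_dataset(cyclic, {"content": [{"id": 1}]}, log)
    raised = False
except CircularDependencyError:
    raised = True  # real rows still surface the cycle
```

One edge case the guard does not cover: a dataset such as `{"content": [{}]}` is still treated as non-empty, so it reaches the topo sort even though the loop would skip its only row.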
Workaround we're using
Reaches into a private attribute and skips the framework's
setup/teardown lifecycle, but unblocks tests today:
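    import pytest
    from sqlproof import SqlProof

    @pytest.fixture
    def cycle_safe_db(proof: SqlProof):
        # _db_manager is private; this bypasses client_for_dataset entirely.
        with proof._db_manager.acquire() as client:
            client.execute("SAVEPOINT my_test")
            try:
                yield client
            finally:
                # Undo everything the test wrote, then drop the savepoint.
                client.execute("ROLLBACK TO SAVEPOINT my_test")
                client.execute("RELEASE SAVEPOINT my_test")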
Related (probably a separate issue / feature request)
There's no public escape hatch for schemas with a real cycle the test
author can't fix (e.g. tables owned by another team, third-party
extensions, materialized snapshots of historical schemas). An
excluded_tables or included_tables argument on SqlProofConfig
would let those projects opt cyclic tables out of introspection
entirely. The empty-dataset fix above unblocks the common case; this
would unblock the harder case. Happy to file separately if useful.
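A rough sketch of what an exclusion option might do before the topo sort, assuming it simply drops the excluded tables and any FK edges pointing at them. The function name, the `fk_deps` shape, and the `excluded_tables` parameter are all hypothetical; nothing like this exists in sqlproof today as far as we know.

```python
from graphlib import TopologicalSorter


def insertion_order_excluding(fk_deps, excluded_tables):
    """Topo-sort tables, ignoring excluded tables and FK edges that point at them."""
    excluded = set(excluded_tables)
    kept = {
        table: {ref for ref in refs if ref not in excluded}
        for table, refs in fk_deps.items()
        if table not in excluded
    }
    return list(TopologicalSorter(kept).static_order())


# Hypothetical FK graph: each table maps to the tables it references.
fk_deps = {
    "content": {"content_import_snapshots"},
    "content_import_snapshots": {"content"},
    "content_versions": {"content"},
}

order = insertion_order_excluding(fk_deps, ["content_import_snapshots"])
```

Dropping an edge this way only works if the referencing column (here `content.current_snapshot_id`) can be left NULL on insert, so a real implementation would probably need to check nullability as well.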
Environment
- sqlproof: 0.1.0a1 (editable, local path)
- Python: 3.11.8
- Postgres: 16 (Supabase local)
- macOS 14