Problem
Two related bugs surface when running `create-tables` against an MS-SQL destination whose source `orm.yaml` was generated from a schema-qualified PostgreSQL or MS-SQL source (e.g. `SRC_SCHEMA=mimic100`).
Bug 1 — `NoReferencedTableError` on schema-qualified FK targets
FK targets are stored in `orm.yaml` with the source schema prefix:
```yaml
gender_concept_id:
foreign_keys:
- mimic100.concept.concept_id # 3-part: schema.table.column
```
When `dict_to_metadata` builds the destination `MetaData`, tables are registered without a schema prefix (just `concept`). SQLAlchemy interprets `mimic100.concept.concept_id` as schema `mimic100`, table `concept` — and cannot find it in the MetaData. This causes a crash during `metadata.create_all()`'s topological sort:
```
sqlalchemy.exc.NoReferencedTableError: Foreign key associated with column
'person.gender_concept_id' could not find table 'mimic100.concept'
```
Fix
Added `_unqualify_fk_target(fk)` in `datafaker/serialize_metadata.py` to strip the schema qualifier before constructing `ForeignKey` objects:
```python
def _unqualify_fk_target(fk: str) -> str:
parts = fk.split(".")
return ".".join(parts[-2:]) if len(parts) == 3 else fk
```
The original 3-part string is still passed to `ignore_fk` so that `should_ignore_fk`'s schema-aware lookup continues to work correctly.
Bug 2 — MS-SQL error 1785 (multiple cascade paths)
Once Bug 1 is fixed, MS-SQL raises error 1785 when creating tables that have multiple FK columns all pointing to the same target table with `ON DELETE CASCADE`:
```
ProgrammingError: ('42000', "Introducing FOREIGN KEY constraint
'person_ethnicity_source_concept_id_fkey' on table 'person' may cause cycles or
multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or
modify other FOREIGN KEY constraints.")
```
OMOP-style schemas (and many vocabulary-heavy schemas) commonly have many FK columns on a single table all referencing the same vocabulary table (e.g. `concept`), which triggers this MS-SQL restriction. PostgreSQL allows multiple cascade paths freely.
Fix
Added a `@compiles(CreateTable, "mssql")` hook in `datafaker/create.py` that strips `ON DELETE CASCADE` from MS-SQL DDL, mirroring the existing DuckDB hook:
```python
@compiles(CreateTable, "mssql")
def remove_mssql_on_delete_cascade(element, compiler, **kw):
text = compiler.visit_create_table(element, **kw)
return text.replace(" ON DELETE CASCADE", "")
```
Dropping CASCADE is safe for datafaker because rows are inserted in topological (FK-respecting) order — referential integrity is enforced by insert order, not by the database engine.
Tests added
`tests/test_serialize_metadata_mssql.py`
- `TestUnqualifyFkTarget` — 3 unit tests for `_unqualify_fk_target`: 3-part target drops schema, 2-part unchanged, single-part unchanged
- `TestSchemaQualifiedFKResolution.test_schema_qualified_fk_resolves_in_metadata` — builds a MetaData with a schema-qualified FK target and asserts the FK resolves without `NoReferencedTableError`
`tests/test_create_mssql.py`
- `TestMSSQLRemoveOnDeleteCascade` — 3 tests: CASCADE absent from MS-SQL DDL, FOREIGN KEY constraint preserved, DuckDB hook not broken
Problem
Two related bugs surface when running `create-tables` against an MS-SQL destination whose source `orm.yaml` was generated from a schema-qualified PostgreSQL or MS-SQL source (e.g. `SRC_SCHEMA=mimic100`).
Bug 1 — `NoReferencedTableError` on schema-qualified FK targets
FK targets are stored in `orm.yaml` with the source schema prefix:
```yaml
gender_concept_id:
foreign_keys:
- mimic100.concept.concept_id # 3-part: schema.table.column
```
When `dict_to_metadata` builds the destination `MetaData`, tables are registered without a schema prefix (just `concept`). SQLAlchemy interprets `mimic100.concept.concept_id` as schema `mimic100`, table `concept` — and cannot find it in the MetaData. This causes a crash during `metadata.create_all()`'s topological sort:
```
sqlalchemy.exc.NoReferencedTableError: Foreign key associated with column
'person.gender_concept_id' could not find table 'mimic100.concept'
```
Fix
Added `_unqualify_fk_target(fk)` in `datafaker/serialize_metadata.py` to strip the schema qualifier before constructing `ForeignKey` objects:
```python
def _unqualify_fk_target(fk: str) -> str:
parts = fk.split(".")
return ".".join(parts[-2:]) if len(parts) == 3 else fk
```
The original 3-part string is still passed to `ignore_fk` so that `should_ignore_fk`'s schema-aware lookup continues to work correctly.
Bug 2 — MS-SQL error 1785 (multiple cascade paths)
Once Bug 1 is fixed, MS-SQL raises error 1785 when creating tables that have multiple FK columns all pointing to the same target table with `ON DELETE CASCADE`:
```
ProgrammingError: ('42000', "Introducing FOREIGN KEY constraint
'person_ethnicity_source_concept_id_fkey' on table 'person' may cause cycles or
multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or
modify other FOREIGN KEY constraints.")
```
OMOP-style schemas (and many vocabulary-heavy schemas) commonly have many FK columns on a single table all referencing the same vocabulary table (e.g. `concept`), which triggers this MS-SQL restriction. PostgreSQL allows multiple cascade paths freely.
Fix
Added a `@compiles(CreateTable, "mssql")` hook in `datafaker/create.py` that strips `ON DELETE CASCADE` from MS-SQL DDL, mirroring the existing DuckDB hook:
```python
@compiles(CreateTable, "mssql")
def remove_mssql_on_delete_cascade(element, compiler, **kw):
text = compiler.visit_create_table(element, **kw)
return text.replace(" ON DELETE CASCADE", "")
```
Dropping CASCADE is safe for datafaker because rows are inserted in topological (FK-respecting) order — referential integrity is enforced by insert order, not by the database engine.
Tests added
`tests/test_serialize_metadata_mssql.py`
`tests/test_create_mssql.py`