Background
It's a pretty common use case in SQLAlchemy to split models up into multiple modules/files. These models will often refer to each other using ForeignKey objects, which accept either a Model.attribute or Table.column syntax. To avoid circular dependencies in Python, the Table.column syntax is often used, wherein a string is passed to ForeignKey to be resolved later. In the follow example, two related models are split between two files. Even though one is loaded before the other, no exception is raised because SQLAlchemy delays resolution until all the modules have been loaded.
models/init.py
from .employee import Employee
from .manager import Manager
models/employee.py
class Employee(db.Model):
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String)
phone = db.Column(db.String)
manager_id = db.Column(db.ForeignKey("manager.id"))
models/manager.py
class Manager(db.Model):
id = db.Column(db.Integer, primary_key=True)
title = db.Column(db.String)
salary = db.Column(db.BigInteger)
Another way of structuring the above is to treat a manager as a subclass of employee. Internally, additional fields specified in the Manager class are put in a separate table. The database is queried with a JOIN on employee to retrieve all the fields for a manager.
This creates a situation where a foreign key is inherited from the base class. In the below example, the id of a child model is set to a foreign key--the id of the base class.
models/init.py
from .employee import Employee, Manager
from .report import Report
models/employee.py
class Employee(db.Model):
__tablename__ = "employee"
employee_type = db.Column(db.String)
__mapper_args__ = {
"polymorphic_identity": "employee",
"polymorphic_on": employee_type
}
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String)
reports = db.relationship("report.id")
class Manager(Employee):
__tablename__ = "manager"
__mapper_args__ = {"polymorphic_identity": "manager"}
id = db.Column(db.ForeignKey("employee.id"), primary_key=True, unique=True)
salary = db.Column(db.BigInteger)
models/report.py
class Report(db.Model):
id = db.Column(db.Integer, primary_key=True)
employee_id = db.Column(db.Integer, db.ForeignKey("employee.id"))
title = db.Column(db.String)
contents = db.Column(db.Text)
First Issue
When creating a Mapper for a history table, chrononaut is aware of models that inherit from other classes. It checks for inheritance by setting super_mapper to the Mapper object of the base class. Later on, it inspects columns to see if there are foreign keys referencing the base class:
if super_mapper and col_references_table(column, super_mapper.local_table):
super_fks.append(
(col.key, list(super_history_mapper.local_table.primary_key)[0])
)
Normally, this works fine. In the above example, both Manager and Employee are in the same file, and if they weren't, we would still need to make sure that Employee was imported before declaring Manager.
The problem arises when considering Report. This model also has a foreign key, but it is not declared at the same time as Employee. The code that makes model inheritance from a base class work also breaks other foreign keys.
The issue is that col_references_table, the method that checks for an inherited foreign key, calls ForeignKey.references(). This method immediately tries to resolve Table.column strings rather than waiting until all the models have been loaded. This behavior breaks our ability to split related models across files.
Resolution
In the case of an inherited foreign key, the base class will already have been loaded. Therefore, ForeignKey.references() will not raise an exception. It is only in the case that the foreign key is not inherited that there is an issue. In this case, a sqlalchemy.exc.NoReferencedTableError exception is raised. If this exception is raised, we know for a fact that the key was not inherited.
The proposed fix is to just catch this exception and return False.
Second Issue
There are two additional spots in chrononaut where model resolution is forced via accessing the SQLAlchemy API.
for prop in local_mapper.iterate_properties:
getattr(cls, prop.key).impl.active_history = True
In the above, iterate_properties tries to resolve everything before allowing you to iterate over the mapper's properties.
if super_history_mapper:
bases = (super_history_mapper.class_,)
if table is not None:
properties['changed'] = (
(table.c.changed, ) + tuple(super_history_mapper.attrs.changed.columns)
)
Similarly, accessing .attrs on a mapper tries to resolve Table.column immediately.
Resolution
These can be fixed by using ._props, which is the private API that exists before the resolver runs. I think it's OK to rely on this since we are already deep into SQLAlchemy's API and many things (public or private) changed upstream will likely affect this package.
Background
It's a pretty common use case in SQLAlchemy to split models up into multiple modules/files. These models will often refer to each other using
ForeignKeyobjects, which accept either aModel.attributeorTable.columnsyntax. To avoid circular dependencies in Python, theTable.columnsyntax is often used, wherein a string is passed toForeignKeyto be resolved later. In the follow example, two related models are split between two files. Even though one is loaded before the other, no exception is raised because SQLAlchemy delays resolution until all the modules have been loaded.models/init.py
models/employee.py
models/manager.py
Another way of structuring the above is to treat a manager as a subclass of employee. Internally, additional fields specified in the Manager class are put in a separate table. The database is queried with a JOIN on employee to retrieve all the fields for a manager.
This creates a situation where a foreign key is inherited from the base class. In the below example, the
idof a child model is set to a foreign key--theidof the base class.models/init.py
models/employee.py
models/report.py
First Issue
When creating a
Mapperfor a history table,chrononautis aware of models that inherit from other classes. It checks for inheritance by settingsuper_mapperto theMapperobject of the base class. Later on, it inspects columns to see if there are foreign keys referencing the base class:Normally, this works fine. In the above example, both
ManagerandEmployeeare in the same file, and if they weren't, we would still need to make sure thatEmployeewas imported before declaringManager.The problem arises when considering
Report. This model also has a foreign key, but it is not declared at the same time asEmployee. The code that makes model inheritance from a base class work also breaks other foreign keys.The issue is that
col_references_table, the method that checks for an inherited foreign key, callsForeignKey.references(). This method immediately tries to resolveTable.columnstrings rather than waiting until all the models have been loaded. This behavior breaks our ability to split related models across files.Resolution
In the case of an inherited foreign key, the base class will already have been loaded. Therefore,
ForeignKey.references()will not raise an exception. It is only in the case that the foreign key is not inherited that there is an issue. In this case, asqlalchemy.exc.NoReferencedTableErrorexception is raised. If this exception is raised, we know for a fact that the key was not inherited.The proposed fix is to just catch this exception and return
False.Second Issue
There are two additional spots in
chrononautwhere model resolution is forced via accessing the SQLAlchemy API.In the above,
iterate_propertiestries to resolve everything before allowing you to iterate over the mapper's properties.Similarly, accessing
.attrson a mapper tries to resolveTable.columnimmediately.Resolution
These can be fixed by using
._props, which is the private API that exists before the resolver runs. I think it's OK to rely on this since we are already deep into SQLAlchemy's API and many things (public or private) changed upstream will likely affect this package.