Skip to content

Referential Integrity

Val Huber edited this page Dec 21, 2020 · 22 revisions

Background

Referential Integrity (RI) has been implemented by most modern DBMS products, addressing:

  • Child Insert / Update - Parent Check: the parent must exist unless the Foreign Key has any null values

    • E.g., Employees must belong to a valid department, or have a null DepartmentId
  • Parent Update / Delete of Primary Key - child rows should be affected:

    • Cascade Update - on parent primary key update, propagate the changed key to children

      • E.g., Orders are still attached if you alter a Customers' Id
    • Cascade Nullify - on parent delete, nullify child Foreign Keys

      • E.g., on Department delete, nullify Employees DepartmentId
    • Cascade Delete - on parent delete, child rows are deleted

      • E.g., on Order delete, delete the Order Details

The sections below describe RI support in the DBMS, in SQLAlchemy, and in Logic Bank.

DBMS - Preferred RI Enforcement

DBMS services can centralize RI, whether or not access is via the ORM. It is the preferable approach.

It is not, however, sufficient in all cases:

  • Some organizations do not include RI declarations in some of their databases. Moreover, the schema is often locked to prevent impacting existing applications. So, new applications must find their own mechanisms to enforce RI.

  • Some databases, such as SqlLite, do not provide RI support.

So, SQLAlchemy and Logic Bank support is provided to address these cases.

SQLAlchemy - Supplemental RI Enforcement

Parent Check - not provided (rely on DBMS), so use Logic Bank rule

The philosophy of SQLAlchemy is that DBMS systems are the proper portion of the architecture to enforce RI, so Parent Check logic is deferred to the DBMS runtime.

This leaves a Parent Check functionality gap, so, see below: Logic Bank augments SQLAlchemy RI Support.

Cascade Delete / Nullify - ORM support provided

As shown below, SQLAlchemy provides cascade support for child row behavior based on the relationship cascade property: when parent rows are deleted.

The code below from the referenential integrity sample contains the following code in the db/models.py file:

  • Cascade Delete - e.g., delete each Child when a ```Parent is deleted
  • Cascade Nullify - e.g., change the ChildOrphan foreeign key to Null when a Parent is deleted
class Parent(Base):
    __tablename__ = 'Parent'

    parent_attr_1 = Column(String(16), primary_key=True)
    parent_attr_2 = Column(String(16), primary_key=True)

    ChildList = relationship("Child"
                             , backref="Parent"
                             , cascade="all"  # cascade delete
                             # , passive_deletes=True  use *only* when DBMS does the cascade delete
                             # for LogicBank delete logic
                             , cascade_backrefs=True
                             )
    ChildOrphanList = relationship("ChildOrphan"
                                   , backref="Parent"
                                   , cascade="save-update, merge, refresh-expire, expunge"
                                   # no delete option means "nullify"
                                   , cascade_backrefs=True
                                   )


class Child(Base):
    __tablename__ = 'Child'

    child_key = Column(String(16), primary_key=True)
    parent_1 = Column(String(16))
    parent_2 = Column(String(16))
    __table_args__ = (ForeignKeyConstraint([parent_1, parent_2],
                                           [Parent.parent_attr_1, Parent.parent_attr_2]),
                      {})

# similarly for ChildOrphan

Using these Cascade options triggers ORM support for:

  1. Database Creation (DDL) - see options
  2. Parent Update Cascade operations:
    1. Cascade Delete / Nullify operations on child rows, and
    2. beforeFlush delete events for child rows (which enable Logic Bank rule execution)

Logic Bank - Augments SQLAlchemy RI Support

Use Logic Bank to enforce Parent Check

To address cases where the DBMS is not enforcing RI, Logic Bank provides a parent_check rule:

Rule.parent_check(validate=model.ChildClass, error_msg="...", enable=True)

DBMS doing cascade, disable ORM cascade processing with passive_delete

Use the passive_delete option to suppress ORM Cascade Delete / Nullify when relying on DBMS cascade processing. But this introduces a logic problem if the deleted children should affect other rows. For example:

  • You might delete an Order, which cascade-deletes the Order Details
  • But each deleted Order Details needs to adjust a Product.

Logic Bank is therefore sensitive to this option: for cascade deletes where passive_delete is set it will visit each child row to run its logic. In such cases, you will see log entries containing:

Cascade Delete to run rules on

Note: all the Logic Bank examples use SqlLite. Since it does not enforce cascade, there are no examples of passive_delete.