Skip to content

Temporal Tables

Joshua Essex edited this page Oct 4, 2019 · 1 revision

Our database includes application-time temporal tables for each primary table in each schema.

Application-time temporal tables

An application-time temporal table is a table that tracks the changes over time to an entity reflected in another table (the "primary table"). E.g., person_history tracks the changes over time to a given person in the person table.

Each row in the temporal table (a "snapshot") contains the state of that entity for a certain period of time, along with valid_from and valid_to timestamp columns defining the period over which that state is valid. The most recent snapshot is always identical to the current state of the entity on the primary table, and has a valid_to value of NULL.

These tables are designed to be queried both horizontally (the complete state of the database, including all relationships, at a given historical point in time) and vertically (the changes to one entity over time).

The valid period columns reflect the state of the entity in the real world (to our best approximation), not the state of the entity in the database. E.g., if historical data for a sentence that was imposed in 2005 are ingested into the system, the corresponding historical snapshot will be dated to 2005 (the time the event happened in the real world), not the time the data were recorded in the database. This latter alternative is called a system-time temporal table.

As part of the temporal table design, no rows are ever deleted from the primary tables. Any event corresponding to a delete is instead indicated by updating a status value. This makes it easier to keep track of all entities that might ever have been related to a given entity at any point in its history, even if some of those entities are no longer valid at some later date.

Determining valid periods

As a complication to the above, the vast majority of ingested sources do not provide sufficient detail for us to ascertain the real-world time that the state of any given entity changed. Because of this, for all updates to existing entities, we use the time the update was ingested and treat it as the time the update occurred. It is only when ingesting new entities that we use whatever provided times are available to determine the valid period of the data.

County Schema Specifics

The most important time considered when ingesting new entities is the booking admission date. When a new booking is ingested, if its provided admission date is earlier than the ingestion date, the booking, as well as all of its children that do not provide more granular dates, are treated as valid from the provided admission date. Additionally, if the booking is the first booking for the given person, that person will also have their valid period dated from the provided admission date.

Note this only applies to new bookings. A new bond, for example, added to an existing booking will have its valid period dated from ingestion time, not the booking's admission date.

To see which descendants of booking provide more granular dates for setting their valid periods, see the BOOKING_DESCENDANT_START_DATE_FIELD and BOOKING_DESCENDANT_END_DATE_FIELD maps in update_historical_snapshots.py.