feat: Add Single Table Inheritance support #966

Merged
merged 21 commits into from
Mar 3, 2024
32 changes: 2 additions & 30 deletions docs/docs/advanced/class-table-inheritance.md
@@ -112,42 +112,14 @@ After this, Joist will enforce that all `Animal`s must be either `Dog`s or `Cat`

For example, if an `em.load(Animal, "a:1")` finds a row only in the `animals` table, and no matching row in the `dogs` or `cats` table, then the `em.load` method will fail with an error message.

## What about Single Table Inheritance?

An alternative to Class Table Inheritance (CTI) is [Single Table Inheritance](https://www.martinfowler.com/eaaCatalog/singleTableInheritance.html) (STI), where `Dog`s and `Cat`s don't have their own tables, but have their subtype-specific fields stored directly on the `animals` table (e.g. both `animals.can_bark` and `animals.can_meow` would be columns directly in the `animals` table even though, for dogs, the `can_meow` column is not applicable).

Joist currently does not support STI, generally because CTI has several pros:

1. With CTI, the database schema makes it obvious what the class hierarchy should be.

Given how schema-driven Joist's `joist-codegen` is, it's very convenient to have the per-type fields already split out (into separate tables) and then to use the `id` foreign keys to discover the `extends` relationships.

With STI, this sort of "obvious" visibility does not exist, and we'd have to encode the type hierarchy in `joist-config.json`, i.e. some sort of mapping that says `animals.can_bark` is only applicable for the `Dog` subtype, and `animals.can_meow` is only applicable for the `Cat` subtype.

2. With CTI, the schema is safer, because the subtype-only columns can have not-null constraints.

With STI, even if `can_bark` is required for all `Dog`s, because there will be `Cat` rows in the `animals` table that just fundamentally cannot have a `can_bark` value, the column must be nullable.

Which is fine if it's already nullable, but if you wanted it to be non-null, now we have to encode in `joist-config.json` that it is _technically_ required, and rely on Joist's runtime code to enforce it.

3. With CTI, we can have foreign keys directly to subtypes.

For example, we could have a `DogCollar` entity that had a `dog_collars.dog_id` foreign key that points _only_ to `dogs`, and is fundamentally unable to point to `Cat`s.

With STI, it's not possible in the database to represent/enforce that FKs are only valid for a specific subtype.

That said, the pro of STI is that you don't need `LEFT OUTER JOIN`s to load entities, b/c all data for all subtypes is a single table, so Joist could likely support STI someday, it just does not currently.

## But Isn't Inheritance Bad Design?

Yes, inheritance can be abused, particularly with deep inheritance hierarchies and/or just "bad design".
Yes, inheritance can be abused, particularly with deep inheritance hierarchies and/or just bad design decisions.

But when you have a situation that fits it well, it can be an appropriate/valid way to design a schema, at your own choice/discretion.

If it helps, inheritance can also be thought of Abstract Data Types, which as a design pattern is generally considered a modern/"good" approach for accurately & type-safely modeling values that have different fields based on their current kind/type.
If it helps, inheritance can also be thought of as Abstract Data Types, which as a design pattern is generally considered a modern/good approach for accurately & type-safely modeling values that have different fields based on their current kind/type.

ADTs also focus just on the per-kind/per-type data attributes, and less on the polymorphic behavior of methods encoded/implemented within the class hierarchy which was the focus of traditional OO-based inheritance.

When using inheritance with Joist entities, you can pick whichever approach you prefer: either more "just data" ADT-ish inheritance or "implementation-hiding methods" OO-ish inheritance.


165 changes: 165 additions & 0 deletions docs/docs/advanced/single-table-inheritance.md
@@ -0,0 +1,165 @@
---
title: Single Table Inheritance
sidebar_position: 7
---

Joist supports [Single Table Inheritance](https://www.martinfowler.com/eaaCatalog/singleTableInheritance.html), which allows inheritance/subtyping of entities (like `class Dog extends Animal`), by automatically mapping multiple logical polymorphic entities (`Dog`, `Cat`, and `Animal`) into a single physical SQL table (`animals`).

## Database Representation

For example, let's say we have a `Dog` entity and a `Cat` entity, and we want them to both extend the `Animal` entity.

For single table inheritance, we represent this in Postgres by having a single table: `animals`.

- The `animals` table has all columns for `Animal`s, `Dog`s, or `Cat`s
- A discriminator column, i.e. `type_id`, tells Joist whether a given row is a `Dog` or a `Cat`
  - We currently require the discriminator field to be an enum column
- Any `Dog`-only columns are configured in `joist-config.json`
- Any `Cat`-only columns are configured in `joist-config.json`
- Any `Dog`- or `Cat`-only columns must be nullable

The `joist-config.json` might look like:

```json
{
  "entities": {
    "Animal": {
      "fields": {
        "type": { "stiDiscriminator": { "DOG": "Dog", "CAT": "Cat" } },
        "canBark": { "stiType": "Dog" },
        "canMeow": { "stiType": "Cat", "stiNotNull": true }
      },
      "tag": "a"
    },
    "DogPack": {
      "relations": {
        "leader": { "stiType": "Dog" }
      },
      "tag": "dp"
    }
  }
}
```

## Entity Representation

When `joist-codegen` sees the above `joist-config.json` setup, Joist will ensure that the `Dog` model extends the `Animal` model.

Note that because of the codegen entities, it will actually end up looking like:

```typescript
// in Dog.ts
class Dog extends DogCodegen {
  // any custom logic
}

// in DogCodegen.ts
abstract class DogCodegen extends Animal {
  canBark: boolean;
}

// in Animal.ts
class Animal extends AnimalCodegen {
  // any custom logic
}

// in AnimalCodegen.ts
abstract class AnimalCodegen extends BaseEntity {
  name: string;
}
```

And when you load several `Animal`s, Joist will automatically read the `type_id` column and create the respective subtype:

```typescript
const [a1, a2] = await em.loadAll(Animal, ["a:1", "a:2"]);
// If a1 was saved as a dog, it will be a Dog
expect(a1).toBeInstanceOf(Dog);
// if a2 was saved as a cat, it will be a Cat
expect(a2).toBeInstanceOf(Cat);
```
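
As a rough illustration of the discriminator lookup described above, the sketch below hydrates plain rows into subtype instances. It is not Joist's implementation: the `AnimalRow` shape, the `hydrate` helper, and the numeric `type_id` values (1 for dogs, 2 for cats) are all assumptions made for this example.

```typescript
// Hypothetical row shape for the single `animals` table (not Joist's real types).
interface AnimalRow {
  id: number;
  type_id: number;
  name: string;
  can_bark: boolean | null;
  can_meow: boolean | null;
}

class Animal {
  id: number;
  name: string;
  constructor(id: number, name: string) {
    this.id = id;
    this.name = name;
  }
}

class Dog extends Animal {
  canBark: boolean | null;
  constructor(id: number, name: string, canBark: boolean | null) {
    super(id, name);
    this.canBark = canBark;
  }
}

class Cat extends Animal {
  canMeow: boolean | null;
  constructor(id: number, name: string, canMeow: boolean | null) {
    super(id, name);
    this.canMeow = canMeow;
  }
}

// Assumed discriminator values; a real schema defines these in its enum table.
const DOG_TYPE_ID = 1;
const CAT_TYPE_ID = 2;

// Pick the subtype constructor based on the discriminator column.
function hydrate(row: AnimalRow): Animal {
  switch (row.type_id) {
    case DOG_TYPE_ID:
      return new Dog(row.id, row.name, row.can_bark);
    case CAT_TYPE_ID:
      return new Cat(row.id, row.name, row.can_meow);
    default:
      throw new Error(`Unknown type_id ${row.type_id} for animals row ${row.id}`);
  }
}

const rows: AnimalRow[] = [
  { id: 1, type_id: DOG_TYPE_ID, name: "Rex", can_bark: true, can_meow: null },
  { id: 2, type_id: CAT_TYPE_ID, name: "Tom", can_bark: null, can_meow: true },
];
const [a1, a2] = rows.map(hydrate);
console.log(a1 instanceof Dog, a2 instanceof Cat); // true true
```

Throwing on an unrecognized `type_id` is one possible design choice for this sketch; it keeps a bad row from silently becoming a plain `Animal`.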

## SubType Configuration

Due to STI's lack of schema-based encoding (see Pros/Cons section below), you may often need to manually configure the `joist-config.json` to give Joist hints about "which subtype" a given relation should be.

For example, instead of the `DogPack.leader` relation (from the `dog_packs.leader_id` FK) being typed as `Animal` (which is the `animals` table that the `leader_id` FK points to in the database schema), you want it to be typed as `Dog` because you know all `DogPack` leaders should be `Dog`s.

These hints in `joist-config.json` generally look like:

1. Adding an `stiDiscriminator` mapping to the `type` field that Joist will use to know "which subtype is this?"
2. Adding `stiType: "Dog"` or `stiType: "Cat"` to any field (like `canBark` or `canMeow`) in the `animals` table that should be limited to a specific subtype
   - The value of `"Dog"` or `"Cat"` should match a name in the `stiDiscriminator` mapping
   - Currently, we only support a field being in a single subtype
3. Adding `stiNotNull: true` to any fields that you want Joist to enforce as not null
   - For example, if you want `canMeow` to be required for all `Cat`s, you can add `stiNotNull: true` to the `canMeow` field
   - Without an explicit `stiNotNull` set, we assume subtype fields are nullable, which is how they're represented in the database
   - See the "Pros/Cons" section later for why this can't be encoded in the database
4. On any FKs that point _to_ your base type, add `stiType: "SubType"` to indicate that the FK is only valid for the given subtype.
   - See the `DogPack` example in the above example config
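
The hints above can be cross-checked mechanically. The sketch below is a hypothetical validator, not part of `joist-codegen`: the `FieldConfig` type only mirrors the three `sti*` keys shown in this document, and `findUnknownStiTypes` is a name invented for the example. It reports any field whose `stiType` does not appear as a value in an `stiDiscriminator` mapping.

```typescript
// Mirrors only the STI-related keys from the example `joist-config.json`.
type FieldConfig = {
  stiDiscriminator?: Record<string, string>;
  stiType?: string;
  stiNotNull?: boolean;
};

type EntityConfig = { fields?: Record<string, FieldConfig>; tag?: string };

// Returns the names of fields whose `stiType` is not a declared subtype.
function findUnknownStiTypes(entity: EntityConfig): string[] {
  const fields = Object.entries(entity.fields ?? {});
  // Collect every subtype name declared by a discriminator mapping.
  const subtypes = new Set<string>();
  for (const [, config] of fields) {
    for (const subtypeName of Object.values(config.stiDiscriminator ?? {})) {
      subtypes.add(subtypeName);
    }
  }
  return fields
    .filter(([, config]) => config.stiType !== undefined && !subtypes.has(config.stiType))
    .map(([fieldName]) => fieldName);
}

const animalConfig: EntityConfig = {
  tag: "a",
  fields: {
    type: { stiDiscriminator: { DOG: "Dog", CAT: "Cat" } },
    canBark: { stiType: "Dog" },
    canMeow: { stiType: "Cat", stiNotNull: true },
    canFly: { stiType: "Bird" }, // deliberately wrong, to show the check firing
  },
};

// Logs an array containing only "canFly".
console.log(findUnknownStiTypes(animalConfig));
```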

## Tagged Ids

Currently, subtypes share the same tagged id as the base type.

For example, `dog1.id` returns `a:1` because the `Dog`'s base type is `Animal`, and all `Animal`s (regardless of whether they're `Dog`s or `Cat`s) use the `a` tag.
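
To make the shared tag concrete, here is a small sketch of formatting and parsing tagged ids. The `tagsByType` table and the `toTaggedId` and `parseTaggedId` helpers are hypothetical and exist only for this example; they are not Joist's API.

```typescript
// Subtypes reuse the base type's tag, so both Dog and Cat map to "a".
const tagsByType: Record<string, string> = { Animal: "a", Dog: "a", Cat: "a", DogPack: "dp" };

// Format a numeric id as `tag:id`, using the tag of the entity's base type.
function toTaggedId(typeName: string, id: number): string {
  const tag = tagsByType[typeName];
  if (tag === undefined) throw new Error(`No tag for ${typeName}`);
  return `${tag}:${id}`;
}

// Split a tagged id back into its tag and numeric id.
function parseTaggedId(taggedId: string): { tag: string; id: number } {
  const [tag, rawId] = taggedId.split(":");
  const id = Number(rawId);
  if (!tag || !Number.isInteger(id)) throw new Error(`Invalid tagged id ${taggedId}`);
  return { tag, id };
}

console.log(toTaggedId("Dog", 1)); // a:1
console.log(toTaggedId("Cat", 2)); // a:2
```

Because the tag alone says only "some `Animal`", the subtype still has to come from the discriminator column rather than from the id.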

## Abstract Base Types

If you'd like to enforce that the base type is abstract, i.e. that users cannot instantiate `Animal` directly and must instead instantiate either a `Dog` or a `Cat`, then you can mark `Animal` as `abstract` in the `joist-config.json` file:

```json
{
  "entities": {
    "Animal": {
      "tag": "a",
      "abstract": true
    }
  }
}
```

You also need to manually update the `Animal.ts` file to make the class `abstract`:

```typescript
export abstract class Animal extends AnimalCodegen {}
```

After this, Joist will enforce that all `Animal`s must be either `Dog`s or `Cat`s.

For example, if an `em.load(Animal, "a:1")` finds a row in the `animals` table whose `type_id` does not map to either the `Dog` or `Cat` subtype, then the `em.load` method will fail with an error message.

## Pros/Cons to Single Table Inheritance

Between Single Table Inheritance (STI) and [Class Table Inheritance](./class-table-inheritance.md) (CTI), Joist generally recommends using CTI over STI for the following reasons:

1. With CTI, the database schema makes it obvious what the class hierarchy should be.

Given the schema itself already has the per-type fields split out (into separate tables), there is very little configuration for CTI, and instead the generated entities are basically "automatically correct".

With STI, this schema-based encoding does not exist, so we have to configure items like the discriminator value, and which fields belong to which subtype, in the `joist-config.json`. This is doable, but tedious.

2. With CTI, the schema is safer, because the subtype-only columns can have not-null constraints.

With STI, if we want `can_bark` to be required for all `Dog`s, we cannot use a `can_bark boolean NOT NULL` in the schema, because the `animals` table will also have `Cat` rows that fundamentally don't have `can_bark` values.

Instead, we have to indicate in `joist-config.json` that Joist should enforce model-level not-null constraints, which is okay, but not as good as database-level enforcement.

3. With CTI, we can have foreign keys point directly to subtypes.

For example, we could have a `DogPack` entity with a `dog_packs.leader_id` foreign key that references the `dogs` subtype table, and so points _only_ to `Dog`s, and is fundamentally unable to point to `Cat`s (even at the database level, this is enforced b/c the `dogs` table will not have any ids of `Cat` entities).

With STI, it's not possible in the database to represent/enforce that FKs are only valid for a specific subtype (`dog_packs.leader_id` can only point to the `animals` table).

That said, the pro of STI is that you don't need `LEFT OUTER JOIN`s to load entities (see the [CTI](./class-table-inheritance.md) docs), b/c all data for all subtypes is in a single table.
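
The second trade-off, model-level rather than database-level not-null enforcement, can be sketched as a validation pass over an entity's values. This is only an illustration under assumed names (`StiField`, `animalStiFields`, `validateRequired`); Joist's actual rule machinery is not reproduced here.

```typescript
// Hypothetical description of a subtype-only column in an STI table.
type StiField = { name: string; stiType: string; stiNotNull: boolean };

// Matches the example config: only `canMeow` is marked `stiNotNull`.
const animalStiFields: StiField[] = [
  { name: "canBark", stiType: "Dog", stiNotNull: false },
  { name: "canMeow", stiType: "Cat", stiNotNull: true },
];

// Returns one error message per required subtype field that is missing.
// Fields belonging to other subtypes are ignored, since their columns
// are legitimately null for this row.
function validateRequired(typeName: string, values: Record<string, unknown>): string[] {
  return animalStiFields
    .filter((f) => f.stiType === typeName && f.stiNotNull)
    .filter((f) => values[f.name] === null || values[f.name] === undefined)
    .map((f) => `${typeName}.${f.name} is required`);
}

// One error, for canMeow.
console.log(validateRequired("Cat", { name: "Tom", canMeow: null }));
// No errors: canBark is not marked stiNotNull in this sketch.
console.log(validateRequired("Dog", { name: "Rex", canBark: null }));
```

A check like this only runs when writes go through the application, which is why database-level constraints remain the stronger guarantee.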

## When to Choose STI/CTI

To application code, the STI and CTI approach can look near identical, because both approaches result in the same `Dog`, `Cat`, and `Animal` type hierarchy.

But, generally Joist recommends:

- Use CTI when the polymorphism is an integral part of your domain model, i.e. you have "true" `Cat` and `Dog` entities as separate concepts you want to model in your domain.
- Use STI when the polymorphism is for a transient implementation detail, e.g. migrating your `Cat` model to a `CatV2` model.

And, either way, use both approaches judiciously; in a system of 50-100 entities, you should probably be using CTI/STI only a handful of times.
6 changes: 6 additions & 0 deletions packages/codegen/src/EntityDbMetadata.ts
@@ -220,6 +220,11 @@ export class EntityDbMetadata {
updatedAt: PrimitiveField | undefined;
deletedAt: PrimitiveField | undefined;
baseClassName: string | undefined;
inheritanceType: "sti" | "cti" | undefined;
/** This will only be set on the base meta. */
stiDiscriminatorField?: string;
/** This will only be set on the sub metas. */
stiDiscriminatorValue?: number;
abstract: boolean;
invalidDeferredFK: boolean;

@@ -228,6 +233,7 @@ export class EntityDbMetadata {

if (isSubClassTable(table)) {
this.baseClassName = tableToEntityName(config, table.columns.get("id").foreignKeys[0].referencedTable);
this.inheritanceType = "cti";
}

this.primaryKey =
7 changes: 6 additions & 1 deletion packages/codegen/src/config.ts
@@ -18,6 +18,9 @@ const fieldConfig = z
zodSchema: z.optional(z.string()),
type: z.optional(z.string()),
serde: z.optional(z.string()),
stiDiscriminator: z.optional(z.record(z.string(), z.string())),
stiType: z.optional(z.string()),
stiNotNull: z.optional(z.boolean()),
})
.strict();

@@ -28,6 +31,8 @@ const relationConfig = z
polymorphic: z.optional(z.union([z.literal("notNull"), z.literal(true)])),
large: z.optional(z.boolean()),
orderBy: z.optional(z.string()),
stiType: z.optional(z.string()),
stiNotNull: z.optional(z.boolean()),
})
.strict();

@@ -122,7 +127,7 @@ export function warnInvalidConfigEntries(config: Config, db: DbMetadata): void {
const [entity] = entities;

// Check fields
const fields = [...entity.primitives, ...entity.manyToOnes];
const fields = [...entity.primitives, ...entity.manyToOnes, ...entity.enums];
for (const [name, config] of Object.entries(entityConfig.fields || {})) {
if (config.ignore) continue;
const field = fields.find((f) => f.fieldName === name);
36 changes: 30 additions & 6 deletions packages/codegen/src/generateEntityCodegenFile.ts
@@ -2,6 +2,7 @@ import { camelCase, pascalCase } from "change-case";
import { Code, code, imp, joinCode } from "ts-poet";
import { DbMetadata, EntityDbMetadata, EnumField, PrimitiveField, PrimitiveTypescriptType } from "./EntityDbMetadata";
import { Config } from "./config";
import { getStiEntities } from "./index";
import { keywords } from "./keywords";
import {
BaseEntity,
@@ -30,15 +31,16 @@ import {
OptsOf,
OrderBy,
PartialOrNull,
ReactiveField,
PersistedAsyncReference,
PolymorphicReference,
ProjectEntity,
ReactiveField,
SSAssert,
TaggedId,
ValueFilter,
ValueGraphQLFilter,
Zod,
cannotBeUpdated,
cleanStringValue,
failNoIdYet,
getField,
@@ -53,6 +55,7 @@ import {
isEntity,
isLoaded,
loadLens,
mustBeSubType,
newChangesProxy,
newRequiredRule,
setField,
@@ -371,7 +374,7 @@ export function generateEntityCodegenFile(config: Config, dbMeta: DbMetadata, me
: "";

// Set up the codegen artifacts to extend from the base type if necessary
const baseEntity = meta.baseClassName ? dbMeta.entities.find((e) => e.name === meta.baseClassName)! : undefined;
const baseEntity = dbMeta.entities.find((e) => e.name === meta.baseClassName);
const subEntities = dbMeta.entities.filter((e) => e.baseClassName === meta.name);
const base = baseEntity?.entity.type ?? code`${BaseEntity}<${EntityManager}, ${idType}>`;
const maybeBaseFields = baseEntity ? code`extends ${imp(baseEntity.name + "Fields@./entities")}` : "";
@@ -480,7 +483,7 @@ export function generateEntityCodegenFile(config: Config, dbMeta: DbMetadata, me

export const ${configName} = new ${ConfigApi}<${entity.type}, ${contextType}>();

${generateDefaultValidationRules(meta, configName)}
${generateDefaultValidationRules(dbMeta, meta, configName)}

export abstract class ${entityName}Codegen extends ${base} implements ${ProjectEntity} {
static defaultValues: object = {
@@ -635,13 +638,31 @@ function generateDefaultValues(config: Config, meta: EntityDbMetadata): Code[] {
return [...primitives, ...enums, ...pgEnums];
}

function generateDefaultValidationRules(meta: EntityDbMetadata, configName: string): Code[] {
function generateDefaultValidationRules(db: DbMetadata, meta: EntityDbMetadata, configName: string): Code[] {
// Add required rules for all not-null columns
const fields = [...meta.primitives, ...meta.enums, ...meta.manyToOnes, ...meta.polymorphics];
return fields
const rules = fields
.filter((p) => p.notNull)
.map(({ fieldName }) => {
return code`${configName}.addRule(${newRequiredRule}("${fieldName}"));`;
});
// Add STI discriminator cannot change
if (meta.stiDiscriminatorField) {
const field = meta.enums.find((e) => e.fieldName === meta.stiDiscriminatorField) ?? fail("STI field not found");
rules.push(code`${configName}.addRule(${cannotBeUpdated}("${field.fieldName}"));`);
}
// Add STI type must match
const stiEntities = getStiEntities(db.entities);
if (stiEntities.size > 0) {
for (const m2o of meta.manyToOnes) {
// The `m2o.otherEntity` may already be pointing at the subtype, but stiEntities has subtypes in it as well...
const target = stiEntities.get(m2o.otherEntity.name);
if (target && m2o.otherEntity.name !== target.base.name) {
rules.push(code`${configName}.addRule("${m2o.fieldName}", ${mustBeSubType}("${m2o.fieldName}"));`);
}
}
}
return rules;
}

// Make our opts type
@@ -655,7 +676,10 @@ function generateOptsFields(config: Config, meta: EntityDbMetadata): Code[] {
});
const enums = meta.enums.map((field) => {
const { fieldName, enumType, notNull, isArray } = field;
if (isArray) {
if (meta.stiDiscriminatorField === fieldName) {
// Don't include the discriminator as an opt b/c we'll infer it from the instance type
return code``;
} else if (isArray) {
// Arrays are always optional and we'll default to `[]`
return code`${fieldName}?: ${enumType}[];`;
} else {
6 changes: 5 additions & 1 deletion packages/codegen/src/generateMetadataFile.ts
@@ -26,12 +26,16 @@ export function generateMetadataFile(config: Config, dbMeta: DbMetadata, meta: E
Object.values(fields).forEach((code) => code.asOneline());

const maybeBaseType = meta.baseClassName ? `"${meta.baseClassName}"` : undefined;
// We want to put inheritanceType: sti/cti onto base classes as well
const maybeInheritanceType = meta.inheritanceType ? `inheritanceType: "${meta.inheritanceType}",` : "";
const maybeStiColumn = meta.stiDiscriminatorField ? `stiDiscriminatorField: "${meta.stiDiscriminatorField}",` : "";
const maybeStiValue = meta.stiDiscriminatorValue !== undefined ? `stiDiscriminatorValue: ${meta.stiDiscriminatorValue},` : "";

return code`
export const ${entity.metaName}: ${EntityMetadata}<${entity.name}> = {
cstr: ${entity.type},
type: "${entity.name}",
baseType: ${maybeBaseType},
baseType: ${maybeBaseType}, ${maybeInheritanceType} ${maybeStiColumn} ${maybeStiValue}
idType: "${config.idType ?? "tagged-string"}",
idDbType: "${meta.primaryKey.columnType}",
tagName: "${meta.tagName}",