SE 13 Database Design Principles

SE-13: Database Design Principles

Part of the Software Engineering Principles series

Why Database Design Matters

The database is the persistent foundation of most applications. Unlike code, which can be refactored and redeployed, the database schema and the data it holds are far harder to change once in production. A poorly designed schema propagates bad data, degrades performance, and constrains every future development decision.

Good database design produces schemas that:

Accurately represent the business domain
Enforce data integrity constraints
Perform well at expected data volumes
Can be extended without destructive migrations

Relational Database Concepts

Tables, Rows, and Columns

A table models an entity type (patients, bills, medicines). Each row is one instance of that entity. Each column is an attribute of that entity.

Primary Keys

Every table must have a primary key — a column or set of columns whose values uniquely identify each row.

CREATE TABLE patient (
    id         BIGINT      NOT NULL AUTO_INCREMENT,
    name       VARCHAR(200) NOT NULL,
    dob        DATE,
    PRIMARY KEY (id)
);

Use surrogate keys (auto-generated integers or UUIDs) for primary keys in most cases. Avoid using natural keys (like national ID numbers) as primary keys because they can change and they couple your schema to external systems.

Foreign Keys

A foreign key is a column in one table that references the primary key of another, enforcing referential integrity.

CREATE TABLE bill (
    id         BIGINT NOT NULL AUTO_INCREMENT,
    patient_id BIGINT NOT NULL,
    net_value  DECIMAL(15,2) NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (patient_id) REFERENCES patient(id)
);

A foreign key constraint prevents inserting a bill for a non-existent patient and prevents deleting a patient who has bills.

Normalisation

Normalisation is the process of structuring a database to reduce data redundancy and improve integrity. It is formalised as a series of "Normal Forms" (1NF, 2NF, 3NF, etc.).

First Normal Form (1NF)

Each column contains atomic (indivisible) values
No repeating groups or arrays in a single column

-- Bad: storing multiple phone numbers in one column
patient: id, name, phones = "077-123, 011-456"

-- Good: separate table for phone numbers
patient_phone: patient_id, phone_number, phone_type

Second Normal Form (2NF)

Satisfies 1NF
Every non-key column depends on the entire primary key (relevant for composite keys)

-- Bad: a composite key where description depends on only one part
order_item: (order_id, medicine_id, quantity, medicine_description)
-- medicine_description depends only on medicine_id, not the combination

-- Good: medicine description lives in the medicine table
medicine: (id, name, description)
order_item: (order_id, medicine_id, quantity)

Third Normal Form (3NF)

Satisfies 2NF
No non-key column depends on another non-key column (no transitive dependencies)

-- Bad: department_location depends on department_id, not bill_id
bill: (id, patient_id, department_id, department_location)

-- Good: location is in the department table
department: (id, name, location)
bill: (id, patient_id, department_id)

Practical Guideline

Aim for 3NF in most cases. Denormalise (deliberately introduce redundancy) only when a specific performance requirement demands it, and document the reason.

Relationships

One-to-Many

The most common relationship. One patient has many bills.

-- patient (1) ──< bill (many)
bill.patient_id references patient.id

Many-to-Many

Implemented with a junction table. A prescription can contain many medicines, and a medicine can appear in many prescriptions.

CREATE TABLE prescription_item (
    prescription_id BIGINT NOT NULL,
    medicine_id     BIGINT NOT NULL,
    quantity        INT    NOT NULL,
    dosage          VARCHAR(100),
    PRIMARY KEY (prescription_id, medicine_id),
    FOREIGN KEY (prescription_id) REFERENCES prescription(id),
    FOREIGN KEY (medicine_id)     REFERENCES medicine(id)
);

One-to-One

One row in table A corresponds to exactly one row in table B. Useful when a large or rarely-needed subset of attributes should be split into a separate table.

Data Integrity

Constraints

Constraints enforce business rules at the database level — they cannot be bypassed by application bugs.

CREATE TABLE stock_item (
    id          BIGINT NOT NULL AUTO_INCREMENT,
    medicine_id BIGINT NOT NULL,
    quantity    INT NOT NULL DEFAULT 0,
    unit_price  DECIMAL(10,2) NOT NULL,
    CONSTRAINT chk_quantity_non_negative CHECK (quantity >= 0),
    CONSTRAINT chk_price_positive        CHECK (unit_price > 0),
    PRIMARY KEY (id)
);

Constraint	Purpose
`NOT NULL`	Column must have a value
`UNIQUE`	No two rows can have the same value in this column
`PRIMARY KEY`	Unique, not null — identifies the row
`FOREIGN KEY`	References a row in another table
`CHECK`	Value must satisfy a condition
`DEFAULT`	Value to use when none is provided

Transactions

A transaction is a unit of work that either completes entirely or not at all (atomicity). Transactions ensure that related changes succeed or fail together.

@Transactional
public void transferStock(StockItem from, StockItem to, int quantity) {
    from.deduct(quantity);   // If this succeeds...
    to.add(quantity);        // ...this must also succeed, or both roll back
    stockRepo.save(from);
    stockRepo.save(to);
}
// If any step throws an exception, the entire transaction rolls back

The ACID properties of transactions:

Atomicity — all or nothing
Consistency — the database moves from one valid state to another
Isolation — concurrent transactions do not interfere
Durability — committed data survives system failure

Indexing

An index is a data structure that allows the database to find rows quickly without scanning the entire table.

When to Index

Columns used frequently in WHERE clauses
Columns used in JOIN conditions (foreign keys)
Columns used in ORDER BY when large result sets are sorted

-- Without index: full table scan for every lookup
SELECT * FROM bill WHERE patient_id = 42;

-- With index: direct lookup
CREATE INDEX idx_bill_patient ON bill (patient_id);

Index Trade-offs

Benefit	Cost
Faster reads	Slower writes (index must be updated)
Faster joins	More disk space
Faster sorts	More maintenance

Do not index every column. Index columns that are actually used in queries on large tables. Over-indexing degrades write performance.

Composite Indexes

An index on multiple columns is most useful when queries filter on those columns together, in the same order as the index.

-- Useful for: WHERE department_id = 5 AND created_date > '2025-01-01'
CREATE INDEX idx_bill_dept_date ON bill (department_id, created_date);

Naming Conventions

Consistent naming makes schemas readable.

Object	Convention	Example
Table	lowercase, underscore	`patient`, `bill_item`, `stock_transfer`
Column	lowercase, underscore	`patient_id`, `created_date`, `net_value`
Primary key	`id`	`id`
Foreign key	`{referenced_table}_id`	`patient_id`, `department_id`
Index	`idx_{table}_{column(s)}`	`idx_bill_patient`, `idx_stock_dept_date`
Constraint	`chk_{table}_{description}`	`chk_stock_quantity_non_negative`

Schema Migration

A migration script modifies an existing database schema in a controlled, repeatable way. Every schema change — adding a table, adding a column, adding an index — should be implemented as a versioned migration script.

Principles

Never modify a deployed migration — once it has run in production, it is permanent history
Make migrations reversible — include rollback steps where possible
Test migrations against a copy of production data before deploying
Keep migrations small — one logical change per migration file

-- v2.1.5/migration.sql
-- Add expiry tracking to stock items

ALTER TABLE stock_item 
    ADD COLUMN expiry_date DATE NULL,
    ADD COLUMN batch_number VARCHAR(50) NULL;

CREATE INDEX idx_stock_expiry ON stock_item (expiry_date);

Previous: SE-12: Security Principles
Next: SE-14: Agile and Scrum

Back to Software Engineering Principles

සිංහල English

SE 13 Database Design Principles

SE-13: Database Design Principles

Why Database Design Matters

Relational Database Concepts

Tables, Rows, and Columns

Primary Keys

Foreign Keys

Normalisation

First Normal Form (1NF)

Second Normal Form (2NF)

Third Normal Form (3NF)

Practical Guideline

Relationships

One-to-Many

Many-to-Many

One-to-One

Data Integrity

Constraints

Transactions

Indexing

When to Index

Index Trade-offs

Composite Indexes

Naming Conventions

Schema Migration

Principles

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!