# SQL Task (Solved — Intro Level) — Normalising a Flat-File Database (Veterinary Clinic)

> **Teacher reference solution (introductory SQL level)**  
> This version avoids engine-specific clauses (e.g., `ENGINE=...`) and avoids named `CONSTRAINT ...` blocks.  
> It uses only basic **PRIMARY KEY**, **FOREIGN KEY**, and simple table design.


## 0) Setup (JupySQL)

Connect to your database from this notebook using JupySQL.

- **MySQL example:** `mysql+pymysql://USER:PASSWORD@HOST:PORT/DBNAME`  
- **PostgreSQL example:** `postgresql://USER:PASSWORD@HOST:PORT/DBNAME`


In [None]:
# %load_ext sql
# %sql mysql+pymysql://USER:PASSWORD@HOST:PORT/DBNAME
# %%sql
# SELECT 1;

In [10]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [11]:
%sql mysql+pymysql://mysql_user:mysql_password@localhost:3306/mydatabase

%config SqlMagic.displaylimit = 0

In [12]:
%%sql
SELECT 1;

1
1


## 1) Starting point: the flat table (given)

Dates are stored in SQL as ISO `YYYY-MM-DD`.


In [14]:
%%sql
DROP TABLE IF EXISTS vet_flat;

CREATE TABLE vet_flat (
  owner_id        INT,
  owner_name      VARCHAR(100),
  owner_address   VARCHAR(200),
  patient_id      INT,
  patient_name    VARCHAR(100),
  patient_type    VARCHAR(50),
  vet_id          INT,
  vet_name        VARCHAR(100),
  vet_address     VARCHAR(200),
  visit_date      DATE,
  treatment       VARCHAR(100),
  treatment_type  VARCHAR(50),
  cost            DECIMAL(10,2)
);

INSERT INTO vet_flat VALUES
(1029, 'Alison Bachman', '12 Green Lane, 20192', 1011, 'Oskar',  'Dog',     4400, 'Rachel', '345 Ridley St, 99554',  '2025-05-23', 'Worming',      'Tablet',     50),
(1922, 'Aria Mathers',   '458 Rigistr, 4993',    1012, 'Seb',    'Cat',     4100, 'Lucy',   '29 Entle Street, 3049', '2025-05-23', 'Broken Tail',  'Surgery',    400),
(1029, 'Alison Bachman', '12 Green Lane, 20192', 3999, 'Jaques', 'Hamster', 4400, 'Rachel', '345 Ridley St, 99554',  '2025-05-23', 'Cut of Paw',   'Medication', 50),
(2032, 'Theo Naidoo',    '45 Rue Martignac',     2393, 'Kai',    'Dog',     4400, 'Rachel', '345 Ridley St, 99554',  '2025-05-23', 'Broken Leg',   'Surgery',    450),
(2032, 'Theo Naidoo',    '45 Rue Martignac',     2393, 'Kai',    'Dog',     4400, 'Rachel', '345 Ridley St, 99554',  '2025-08-17', 'Cast Removal', 'Surgery',    200);


### 1.1 Explore the flat table

The cell above creates `vet_flat` and inserts the sample data. Use the cells below to explore it before normalising.

In [None]:
%%sql
-- Exploration query #1: column names and types (DESCRIBE is MySQL syntax)
DESCRIBE vet_flat;

In [None]:
%%sql
-- Exploration query #2: every row of the flat table
SELECT * FROM vet_flat;

In [None]:
%%sql
-- TODO: Add your exploration query #3


In [None]:
%%sql
-- TODO: Add your exploration query #4


In [None]:
%%sql
-- TODO: Add your exploration query #5


## 2) Diagnose the problems (answers)

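- **Redundancy:** owner, vet, and patient details repeat across rows; for example, Rachel's name and address are stored in four of the five rows.
- **Anomalies:** an update must change every copy of a value (update anomaly); you can’t insert an owner without a visit (insertion anomaly); deleting a row can lose the only record of an owner, patient, or vet (deletion anomaly).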
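The update and deletion anomalies can be reproduced on a trimmed copy of the flat table. The sketch below is illustrative only: it assumes an in-memory SQLite database (via Python's standard `sqlite3` module) instead of the MySQL connection used in this notebook, keeps only a few columns and rows, and uses a made-up new address (`'7 New Road, 20192'`).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the notebook's MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vet_flat (
        owner_id INTEGER, owner_name TEXT, owner_address TEXT,
        patient_id INTEGER, patient_name TEXT, visit_date TEXT
    )
""")
# Trimmed copy of three sample rows (only the columns needed for the demo).
conn.executemany(
    "INSERT INTO vet_flat VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1029, "Alison Bachman", "12 Green Lane, 20192", 1011, "Oskar", "2025-05-23"),
        (1922, "Aria Mathers", "458 Rigistr, 4993", 1012, "Seb", "2025-05-23"),
        (1029, "Alison Bachman", "12 Green Lane, 20192", 3999, "Jaques", "2025-05-23"),
    ],
)

# Update anomaly: the address is changed on only one of Alison's two rows.
# The new address is hypothetical.
conn.execute(
    "UPDATE vet_flat SET owner_address = '7 New Road, 20192' "
    "WHERE owner_id = 1029 AND patient_id = 1011"
)
addresses = conn.execute(
    "SELECT DISTINCT owner_address FROM vet_flat "
    "WHERE owner_id = 1029 ORDER BY owner_address"
).fetchall()
print(addresses)  # the same owner now has two different addresses

# Deletion anomaly: removing Aria's only visit row removes Aria as well.
conn.execute("DELETE FROM vet_flat WHERE patient_id = 1012")
remaining = conn.execute(
    "SELECT COUNT(*) FROM vet_flat WHERE owner_id = 1922"
).fetchone()[0]
print(remaining)  # no trace of owner 1922 is left
```

After the partial update, owner 1029 has two conflicting addresses, and after the delete no row mentions owner 1922 at all.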


## 3) Normalised design (simple 3NF)

Tables:
- `owners(owner_id PK, owner_name, owner_address)`
- `patients(patient_id PK, owner_id FK, patient_name, patient_type)`
- `vets(vet_id PK, vet_name, vet_address)`
- `visits(visit_id PK, patient_id FK, vet_id FK, visit_date)`
- `treatments(treatment_id PK, treatment_name UNIQUE, treatment_type)`
- `visit_treatments(visit_id FK, treatment_id FK, cost, PK(visit_id, treatment_id))`

This supports **more than one treatment per visit**.
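As a rough check of that claim, the sketch below builds only the `visit_treatments` table and shows that the composite primary key accepts two different treatments for one visit but rejects an exact repeat. It assumes an in-memory SQLite database through Python's `sqlite3` module rather than the notebook's MySQL server, and it leaves out the foreign keys for brevity; the values mirror the sample inserts in section 6.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the notebook's MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE visit_treatments (
        visit_id     INTEGER NOT NULL,
        treatment_id INTEGER NOT NULL,
        cost         NUMERIC NOT NULL,
        PRIMARY KEY (visit_id, treatment_id)
    )
""")

# Two different treatments on the same visit: both rows are accepted.
conn.execute("INSERT INTO visit_treatments VALUES (4, 4, 450)")
conn.execute("INSERT INTO visit_treatments VALUES (4, 6, 30)")

# The same (visit_id, treatment_id) pair a second time violates the key.
try:
    conn.execute("INSERT INTO visit_treatments VALUES (4, 4, 450)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows_for_visit_4 = conn.execute(
    "SELECT COUNT(*) FROM visit_treatments WHERE visit_id = 4"
).fetchone()[0]
print(rows_for_visit_4, duplicate_rejected)
```

One consequence of this key choice is that the same treatment cannot be recorded twice within a single visit; if that were required, a quantity column or a separate surrogate key would be needed.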


## 4) ERD (text)

```mermaid
erDiagram
  OWNERS ||--o{ PATIENTS : owns
  PATIENTS ||--o{ VISITS : has
  VETS ||--o{ VISITS : attends
  VISITS ||--o{ VISIT_TREATMENTS : includes
  TREATMENTS ||--o{ VISIT_TREATMENTS : recorded_in
```


## 5) Create the normalised tables (basic SQL)

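The statements use no `ENGINE=...` clause and no named `CONSTRAINT ...` blocks. Tables are dropped child-first and created parent-first, so every foreign key references a table that already exists.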


In [None]:
%%sql
DROP TABLE IF EXISTS visit_treatments;
DROP TABLE IF EXISTS visits;
DROP TABLE IF EXISTS treatments;
DROP TABLE IF EXISTS patients;
DROP TABLE IF EXISTS vets;
DROP TABLE IF EXISTS owners;

CREATE TABLE owners (
  owner_id      INT PRIMARY KEY,
  owner_name    VARCHAR(100) NOT NULL,
  owner_address VARCHAR(200) NOT NULL
);

CREATE TABLE patients (
  patient_id   INT PRIMARY KEY,
  owner_id     INT NOT NULL,
  patient_name VARCHAR(100) NOT NULL,
  patient_type VARCHAR(50) NOT NULL,
  FOREIGN KEY (owner_id) REFERENCES owners(owner_id)
);

CREATE TABLE vets (
  vet_id      INT PRIMARY KEY,
  vet_name    VARCHAR(100) NOT NULL,
  vet_address VARCHAR(200) NOT NULL
);

CREATE TABLE visits (
  visit_id   INT PRIMARY KEY,
  patient_id INT NOT NULL,
  vet_id     INT NOT NULL,
  visit_date DATE NOT NULL,
  FOREIGN KEY (patient_id) REFERENCES patients(patient_id),
  FOREIGN KEY (vet_id) REFERENCES vets(vet_id)
);

CREATE TABLE treatments (
  treatment_id   INT PRIMARY KEY,
  treatment_name VARCHAR(100) NOT NULL,
  treatment_type VARCHAR(50)  NOT NULL,
  UNIQUE (treatment_name)
);

CREATE TABLE visit_treatments (
  visit_id     INT NOT NULL,
  treatment_id INT NOT NULL,
  cost         DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (visit_id, treatment_id),
  FOREIGN KEY (visit_id) REFERENCES visits(visit_id),
  FOREIGN KEY (treatment_id) REFERENCES treatments(treatment_id)
);


## 6) Insert data (simple inserts)

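IDs are written out explicitly to keep the inserts simple (intro level). Visit 4 has **two** treatments: 'Broken Leg' from the flat table plus an extra 'Pain Relief' row, which is not in `vet_flat` and is added here to demonstrate several treatments in one visit.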


In [None]:
%%sql
INSERT INTO owners VALUES
(1029, 'Alison Bachman', '12 Green Lane, 20192'),
(1922, 'Aria Mathers',   '458 Rigistr, 4993'),
(2032, 'Theo Naidoo',    '45 Rue Martignac');

INSERT INTO patients VALUES
(1011, 1029, 'Oskar',  'Dog'),
(1012, 1922, 'Seb',    'Cat'),
(3999, 1029, 'Jaques', 'Hamster'),
(2393, 2032, 'Kai',    'Dog');

INSERT INTO vets VALUES
(4400, 'Rachel', '345 Ridley St, 99554'),
(4100, 'Lucy',   '29 Entle Street, 3049');

INSERT INTO treatments VALUES
(1, 'Worming',      'Tablet'),
(2, 'Broken Tail',  'Surgery'),
(3, 'Cut of Paw',   'Medication'),
(4, 'Broken Leg',   'Surgery'),
(5, 'Cast Removal', 'Surgery'),
(6, 'Pain Relief',  'Medication');

INSERT INTO visits VALUES
(1, 1011, 4400, '2025-05-23'),
(2, 1012, 4100, '2025-05-23'),
(3, 3999, 4400, '2025-05-23'),
(4, 2393, 4400, '2025-05-23'),
(5, 2393, 4400, '2025-08-17');

INSERT INTO visit_treatments VALUES
(1, 1,  50),
(2, 2, 400),
(3, 3,  50),
(4, 4, 450),
(5, 5, 200),
(4, 6,  30);
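The hand-typed inserts above are the intro-level route. As an alternative sketch, the parent tables can also be filled straight from the flat table with `INSERT ... SELECT DISTINCT`. The example assumes an in-memory SQLite database (Python's `sqlite3` module) in place of the notebook's MySQL connection, and it covers only `owners` and `patients` with a trimmed `vet_flat`.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the notebook's MySQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flat table trimmed to the owner and patient columns.
    CREATE TABLE vet_flat (
        owner_id INTEGER, owner_name TEXT, owner_address TEXT,
        patient_id INTEGER, patient_name TEXT, patient_type TEXT
    );
    INSERT INTO vet_flat VALUES
    (1029, 'Alison Bachman', '12 Green Lane, 20192', 1011, 'Oskar',  'Dog'),
    (1922, 'Aria Mathers',   '458 Rigistr, 4993',    1012, 'Seb',    'Cat'),
    (1029, 'Alison Bachman', '12 Green Lane, 20192', 3999, 'Jaques', 'Hamster'),
    (2032, 'Theo Naidoo',    '45 Rue Martignac',     2393, 'Kai',    'Dog'),
    (2032, 'Theo Naidoo',    '45 Rue Martignac',     2393, 'Kai',    'Dog');

    CREATE TABLE owners (
        owner_id      INTEGER PRIMARY KEY,
        owner_name    TEXT NOT NULL,
        owner_address TEXT NOT NULL
    );
    CREATE TABLE patients (
        patient_id   INTEGER PRIMARY KEY,
        owner_id     INTEGER NOT NULL,
        patient_name TEXT NOT NULL,
        patient_type TEXT NOT NULL,
        FOREIGN KEY (owner_id) REFERENCES owners(owner_id)
    );

    -- DISTINCT collapses the repeated owner and patient rows.
    INSERT INTO owners (owner_id, owner_name, owner_address)
    SELECT DISTINCT owner_id, owner_name, owner_address FROM vet_flat;

    INSERT INTO patients (patient_id, owner_id, patient_name, patient_type)
    SELECT DISTINCT patient_id, owner_id, patient_name, patient_type FROM vet_flat;
""")

owner_count = conn.execute("SELECT COUNT(*) FROM owners").fetchone()[0]
patient_count = conn.execute("SELECT COUNT(*) FROM patients").fetchone()[0]
print(owner_count, patient_count)
```

This only works when the repeated copies agree with each other. If one owner had two different addresses in the flat table, `SELECT DISTINCT` would return two rows for the same `owner_id` and the primary key would reject the insert, which is a useful signal that the data needs cleaning first.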


## 7) Validation queries (6 examples)

In [None]:
%%sql
-- (1) All visits for a specific owner (owner_id = 1029)
SELECT o.owner_id, o.owner_name,
       p.patient_name,
       v.visit_date,
       ve.vet_name,
       t.treatment_name,
       vt.cost
FROM owners o
JOIN patients p ON p.owner_id = o.owner_id
JOIN visits v ON v.patient_id = p.patient_id
JOIN vets ve ON ve.vet_id = v.vet_id
JOIN visit_treatments vt ON vt.visit_id = v.visit_id
JOIN treatments t ON t.treatment_id = vt.treatment_id
WHERE o.owner_id = 1029
ORDER BY v.visit_date, p.patient_name, t.treatment_name;


In [None]:
%%sql
-- (2) Patient history (patient_id = 2393)
SELECT p.patient_id, p.patient_name,
       v.visit_date,
       ve.vet_name,
       t.treatment_name,
       vt.cost
FROM patients p
JOIN visits v ON v.patient_id = p.patient_id
JOIN vets ve ON ve.vet_id = v.vet_id
JOIN visit_treatments vt ON vt.visit_id = v.visit_id
JOIN treatments t ON t.treatment_id = vt.treatment_id
WHERE p.patient_id = 2393
ORDER BY v.visit_date, t.treatment_name;


In [None]:
%%sql
-- (3) Total spend per owner
SELECT o.owner_id, o.owner_name, SUM(vt.cost) AS total_spent
FROM owners o
JOIN patients p ON p.owner_id = o.owner_id
JOIN visits v ON v.patient_id = p.patient_id
JOIN visit_treatments vt ON vt.visit_id = v.visit_id
GROUP BY o.owner_id, o.owner_name
ORDER BY total_spent DESC;


In [None]:
%%sql
-- (4) Visits per vet in a date range (BETWEEN includes both end dates)
SELECT ve.vet_id, ve.vet_name, COUNT(*) AS num_visits
FROM vets ve
JOIN visits v ON v.vet_id = ve.vet_id
WHERE v.visit_date BETWEEN '2025-05-01' AND '2025-06-01'
GROUP BY ve.vet_id, ve.vet_name
ORDER BY num_visits DESC;


In [None]:
%%sql
-- (5) Top 3 treatments by times given (treatment name breaks ties)
SELECT t.treatment_name, COUNT(*) AS times_given
FROM visit_treatments vt
JOIN treatments t ON t.treatment_id = vt.treatment_id
GROUP BY t.treatment_name
ORDER BY times_given DESC, t.treatment_name
LIMIT 3;


In [None]:
%%sql
-- (6) Patients by species, with their owners
SELECT p.patient_type, p.patient_name, o.owner_name
FROM patients p
JOIN owners o ON o.owner_id = p.owner_id
ORDER BY p.patient_type, p.patient_name;
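A further validation idea is a round-trip check: join the normalised tables back together and compare the result with the flat table. The sketch below is one possible way to do this, assuming an in-memory SQLite database (Python's `sqlite3` module) instead of the notebook's MySQL server and using trimmed tables that keep only IDs, dates, treatments, and costs. It relies on `EXCEPT`, which is not available in every MySQL version.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the notebook's MySQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flat table trimmed to IDs, date, treatment, and cost.
    CREATE TABLE vet_flat (
        owner_id INTEGER, patient_id INTEGER, vet_id INTEGER,
        visit_date TEXT, treatment TEXT, treatment_type TEXT, cost NUMERIC
    );
    INSERT INTO vet_flat VALUES
    (1029, 1011, 4400, '2025-05-23', 'Worming',      'Tablet',      50),
    (1922, 1012, 4100, '2025-05-23', 'Broken Tail',  'Surgery',    400),
    (1029, 3999, 4400, '2025-05-23', 'Cut of Paw',   'Medication',  50),
    (2032, 2393, 4400, '2025-05-23', 'Broken Leg',   'Surgery',    450),
    (2032, 2393, 4400, '2025-08-17', 'Cast Removal', 'Surgery',    200);

    CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL);
    INSERT INTO patients VALUES (1011, 1029), (1012, 1922), (3999, 1029), (2393, 2032);

    CREATE TABLE visits (
        visit_id INTEGER PRIMARY KEY, patient_id INTEGER NOT NULL,
        vet_id INTEGER NOT NULL, visit_date TEXT NOT NULL
    );
    INSERT INTO visits VALUES
    (1, 1011, 4400, '2025-05-23'), (2, 1012, 4100, '2025-05-23'),
    (3, 3999, 4400, '2025-05-23'), (4, 2393, 4400, '2025-05-23'),
    (5, 2393, 4400, '2025-08-17');

    CREATE TABLE treatments (
        treatment_id INTEGER PRIMARY KEY,
        treatment_name TEXT NOT NULL UNIQUE, treatment_type TEXT NOT NULL
    );
    INSERT INTO treatments VALUES
    (1, 'Worming', 'Tablet'), (2, 'Broken Tail', 'Surgery'),
    (3, 'Cut of Paw', 'Medication'), (4, 'Broken Leg', 'Surgery'),
    (5, 'Cast Removal', 'Surgery'), (6, 'Pain Relief', 'Medication');

    CREATE TABLE visit_treatments (
        visit_id INTEGER NOT NULL, treatment_id INTEGER NOT NULL,
        cost NUMERIC NOT NULL,
        PRIMARY KEY (visit_id, treatment_id)
    );
    INSERT INTO visit_treatments VALUES
    (1, 1, 50), (2, 2, 400), (3, 3, 50), (4, 4, 450), (5, 5, 200), (4, 6, 30);
""")

# Join the normalised tables back into the flat column layout.
REBUILT = """
    SELECT p.owner_id, p.patient_id, v.vet_id, v.visit_date,
           t.treatment_name, t.treatment_type, vt.cost
    FROM visits v
    JOIN patients p ON p.patient_id = v.patient_id
    JOIN visit_treatments vt ON vt.visit_id = v.visit_id
    JOIN treatments t ON t.treatment_id = vt.treatment_id
"""

# Flat rows that the normalised tables fail to reproduce (should be none).
missing = conn.execute(f"SELECT * FROM vet_flat EXCEPT {REBUILT}").fetchall()
# Rebuilt rows that were never in the flat table.
extra = conn.execute(f"{REBUILT} EXCEPT SELECT * FROM vet_flat").fetchall()
print(missing)
print(extra)
```

With the sample data, `missing` is empty, so nothing from the flat table was lost. `extra` contains exactly one row, the 'Pain Relief' treatment that section 6 adds to visit 4.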


## 8) Justification (short, intro level)

- The flat table repeats owner/vet/patient data (redundancy).
- The normalised tables store each owner, patient, vet, and treatment once.
- Foreign keys link the tables and keep the data consistent.
- `visit_treatments` allows more than one treatment per visit.
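The point about foreign keys keeping the data consistent can be seen in a small experiment. The sketch below assumes an in-memory SQLite database (Python's `sqlite3` module) rather than the notebook's MySQL server, uses cut-down `owners` and `patients` tables, and tries to insert a made-up patient (`9001`, 'Ghost') for an owner ID (`9999`) that does not exist. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the notebook's MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn on FK checks

conn.execute("""
    CREATE TABLE owners (
        owner_id   INTEGER PRIMARY KEY,
        owner_name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE patients (
        patient_id   INTEGER PRIMARY KEY,
        owner_id     INTEGER NOT NULL,
        patient_name TEXT NOT NULL,
        FOREIGN KEY (owner_id) REFERENCES owners(owner_id)
    )
""")
conn.execute("INSERT INTO owners VALUES (1029, 'Alison Bachman')")
conn.execute("INSERT INTO patients VALUES (1011, 1029, 'Oskar')")

# A patient whose owner does not exist is rejected (hypothetical IDs).
try:
    conn.execute("INSERT INTO patients VALUES (9001, 9999, 'Ghost')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# An owner who still has patients cannot be deleted.
try:
    conn.execute("DELETE FROM owners WHERE owner_id = 1029")
    delete_rejected = False
except sqlite3.IntegrityError:
    delete_rejected = True

patient_count = conn.execute("SELECT COUNT(*) FROM patients").fetchone()[0]
print(orphan_rejected, delete_rejected, patient_count)
```

Both statements fail, so the database never holds a patient without an owner, and the one valid patient row is still in place afterwards.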
