# Formative Worksheet 03: SQL (MySQL) - Many-to-Many (Students ↔ Majors)

> Goal: repeat the full workflow (**CREATE TABLE**, **INSERT**, **SELECT**, **UPDATE**, **DELETE**) using a **many-to-many** relationship:
- A student can enroll in **multiple** majors
- A major can have **multiple** students
- `gpa` is stored **per enrollment** (one GPA per student-major pair)

⚠️ Start from **zero** (assume previous worksheets do not exist).


## 📓 Environment setup (Jupyter Notebook)
If you're running this in a fresh environment, install the required packages.


In [None]:
%pip install ipykernel jupyterlab jupysql pymysql cryptography --upgrade --no-cache-dir


## Connect JupySQL to your MySQL server
1) Load the `sql` extension.
2) Connect to your server (replace user, password, host, port, and database).


In [1]:
%load_ext sql


In [2]:
%sql mysql+pymysql://mysql_user:mysql_password@localhost:3306/mydatabase

%config SqlMagic.displaylimit = 0
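
If the password contains characters such as `@`, `/`, or `:`, the connection URL can be misread. A minimal sketch, assuming you build the URL in Python first; `build_mysql_url` is a hypothetical helper name (not part of JupySQL) and the credentials are placeholders:

```python
from urllib.parse import quote_plus


def build_mysql_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a SQLAlchemy-style MySQL URL, percent-encoding the credentials."""
    # quote_plus escapes reserved characters (for example '@' becomes '%40')
    # so they are not mistaken for URL separators.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder credentials; replace them with your own.
url = build_mysql_url("mysql_user", "p@ss/word", "localhost", 3306, "mydatabase")
print(url)
```

The printed string can be used in place of the hand-written URL after `%sql`.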


---
## Exercise 1 - Create the tables (`students`, `majors`, and `enrollments`)

Create three tables:

### Table A: `students`
- `id` (integer, primary key, auto-increment)
- `name` (text, not null)
- `birthdate` (date, not null)

### Table B: `majors`
- `id` (integer, primary key, auto-increment)
- `name` (text, not null, **unique**)

### Table C: `enrollments` (junction table)
- `student_id` (integer, not null)
- `major_id` (integer, not null)
- `gpa` (decimal/numeric, 2 decimal places, not null)

Rules:
- A student can appear multiple times (different majors)
- A major can appear multiple times (different students)
- The pair (`student_id`, `major_id`) should be **unique** (use a composite PRIMARY KEY)

💡 Tip: drop the tables first to make sure you start from scratch. Drop `enrollments` before the others, because its foreign keys reference them:
`DROP TABLE IF EXISTS enrollments;` then `DROP TABLE IF EXISTS students;` and `DROP TABLE IF EXISTS majors;`

💡 Tip (many-to-many pattern):
Use a junction table with two foreign keys and a composite primary key:
```sql
CREATE TABLE <junction_table> (
  <a_id> INT NOT NULL,
  <b_id> INT NOT NULL,
  PRIMARY KEY (<a_id>, <b_id>),
  FOREIGN KEY (<a_id>) REFERENCES <table_a>(id),
  FOREIGN KEY (<b_id>) REFERENCES <table_b>(id)
);
```


In [12]:
%%sql

-- EXERCISE 1:
-- 1) DROP TABLE IF EXISTS ... (enrollments first)
DROP TABLE IF EXISTS enrollments;
DROP TABLE IF EXISTS students;
DROP TABLE IF EXISTS majors;


In [None]:
%%sql


-- 2) CREATE TABLE students
CREATE TABLE students (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    birthdate DATE NOT NULL
);


In [None]:
%%sql

-- 3) CREATE TABLE majors
CREATE TABLE majors (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL UNIQUE
);


In [None]:
%%sql

-- 4) CREATE TABLE enrollments (with FK + composite PK)
CREATE TABLE enrollments (
    student_id INT NOT NULL,
    major_id INT NOT NULL,
    gpa DECIMAL(4,2) NOT NULL, 
    
    PRIMARY KEY (student_id, major_id),
    FOREIGN KEY (student_id) REFERENCES students(id),
    FOREIGN KEY (major_id) REFERENCES majors(id),
    
    CHECK (gpa >= 0.00 AND gpa <= 20.00)
);
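
To see what these constraints actually reject, here is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in for MySQL, so it runs without a server. The column types are SQLite's (`INTEGER PRIMARY KEY`, `REAL`), and `is_rejected` is a hypothetical helper name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE majors   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        major_id   INTEGER NOT NULL,
        gpa        REAL NOT NULL CHECK (gpa >= 0.00 AND gpa <= 20.00),
        PRIMARY KEY (student_id, major_id),
        FOREIGN KEY (student_id) REFERENCES students(id),
        FOREIGN KEY (major_id)   REFERENCES majors(id)
    );
    INSERT INTO students (name) VALUES ('Ana Silva');
    INSERT INTO majors (name) VALUES ('Computer Science'), ('Mathematics');
    INSERT INTO enrollments VALUES (1, 1, 17.50);
""")


def is_rejected(sql: str) -> bool:
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_pair = is_rejected("INSERT INTO enrollments VALUES (1, 1, 12.00)")   # same pair again
unknown_major = is_rejected("INSERT INTO enrollments VALUES (1, 99, 12.00)")   # no major with id 99
gpa_too_high = is_rejected("INSERT INTO enrollments VALUES (1, 2, 25.00)")     # outside 0..20
second_major = is_rejected("INSERT INTO enrollments VALUES (1, 2, 16.80)")     # new pair, valid GPA
print(duplicate_pair, unknown_major, gpa_too_high, second_major)
```

Only the last insert is accepted: the same student may appear again as long as the (`student_id`, `major_id`) pair is new.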


## Exercise 2 - Confirm the structure (SELECT)
Check that all three tables exist and that the column types are correct.

💡 Tip: In MySQL you can use `DESCRIBE <table_name>;` or `SHOW COLUMNS FROM <table_name>;`.


In [16]:
%%sql

-- EXERCISE 2:
-- DESCRIBE students;
DESCRIBE students;


Field,Type,Null,Key,Default,Extra
id,int,NO,PRI,,auto_increment
name,varchar(255),NO,,,
birthdate,date,NO,,,


In [17]:
%%sql

-- EXERCISE 2 (continued):
-- DESCRIBE majors;
DESCRIBE majors;


Field,Type,Null,Key,Default,Extra
id,int,NO,PRI,,auto_increment
name,varchar(255),NO,UNI,,


In [18]:
%%sql

-- EXERCISE 2 (continued):
-- DESCRIBE enrollments;
DESCRIBE enrollments;


Field,Type,Null,Key,Default,Extra
student_id,int,NO,PRI,,
major_id,int,NO,PRI,,
gpa,"decimal(4,2)",NO,,,


---
## Exercise 3 - Insert data into `majors`
Insert the following majors into `majors`:
- Computer Science
- Economics
- Biology
- Engineering
- Mathematics
- Physics
- Chemistry


In [19]:
%%sql

-- EXERCISE 3:
-- Insert the 7 majors here
INSERT INTO majors (name) VALUES
    ('Computer Science'),
    ('Economics'),
    ('Biology'),
    ('Engineering'),
    ('Mathematics'),
    ('Physics'),
    ('Chemistry')
;


## Exercise 4 - Insert data into `students` (10 records)
Insert 10 students (no `major_id` here; majors are handled through `enrollments`).

| name | birthdate |
|---|---|
| Ana Silva | 2007-03-14 |
| Bruno Costa | 2006-11-02 |
| Carla Mendes | 2007-07-29 |
| Daniel Rocha | 2006-01-18 |
| Eva Santos | 2007-09-05 |
| Filipe Almeida | 2006-05-21 |
| Guilherme Ferreira | 2007-12-10 |
| Helena Sousa | 2006-08-03 |
| Inês Pereira | 2007-02-27 |
| João Martins | 2006-04-16 |

- Make sure dates use the format `YYYY-MM-DD` (See more about [ISO 8601](https://www.iso.org/iso-8601-date-and-time-format.html) and the [MySQL reference](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-literals.html)).
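
A minimal sketch of how the `YYYY-MM-DD` format could be checked in Python before the values reach the database; `is_iso_date` is a hypothetical helper name, not something the worksheet or MySQL provides:

```python
import re
from datetime import date

# Exactly four digits, two digits, two digits, separated by hyphens.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(text: str) -> bool:
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    if not ISO_DATE.fullmatch(text):
        return False
    year, month, day = map(int, text.split("-"))
    try:
        date(year, month, day)  # raises ValueError for dates like 2007-02-30
    except ValueError:
        return False
    return True


for candidate in ("2007-03-14", "14/03/2007", "2007-02-30"):
    print(candidate, is_iso_date(candidate))
```

The regular expression checks the shape, and the `date` constructor checks that the day really exists in that month.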


In [20]:
%%sql

-- EXERCISE 4:
-- Insert the 10 students here (omit id if AUTO_INCREMENT)
INSERT INTO students (name, birthdate) VALUES
    ('Ana Silva', '2007-03-14'),
    ('Bruno Costa', '2006-11-02'),
    ('Carla Mendes', '2007-07-29'),
    ('Daniel Rocha', '2006-01-18'),
    ('Eva Santos', '2007-09-05'),
    ('Filipe Almeida', '2006-05-21'),
    ('Guilherme Ferreira', '2007-12-10'),
    ('Helena Sousa', '2006-08-03'),
    ('Inês Pereira', '2007-02-27'),
    ('João Martins', '2006-04-16')
;


## Exercise 5 - Verify base tables (SELECT)
Show all rows in `students` and `majors`.


In [23]:
%%sql

-- EXERCISE 5 (1):
-- SELECT * FROM students;
SELECT * FROM students;


id,name,birthdate
1,Ana Silva,2007-03-14
2,Bruno Costa,2006-11-02
3,Carla Mendes,2007-07-29
4,Daniel Rocha,2006-01-18
5,Eva Santos,2007-09-05
6,Filipe Almeida,2006-05-21
7,Guilherme Ferreira,2007-12-10
8,Helena Sousa,2006-08-03
9,Inês Pereira,2007-02-27
10,João Martins,2006-04-16


In [29]:
%%sql

-- EXERCISE 5 (2):
-- SELECT * FROM majors;
SELECT * FROM majors
ORDER BY id;


id,name
1,Computer Science
2,Economics
3,Biology
4,Engineering
5,Mathematics
6,Physics
7,Chemistry


---
## Exercise 6 - Insert data into `enrollments` (many-to-many)
Insert the enrollments below. Remember: **gpa is per major**.

Dataset (use major names for readability):

| student_name | major_name | gpa |
|---|---|---:|
| Ana Silva | Computer Science | 17.50 |
| Ana Silva | Mathematics | 16.80 |
| Bruno Costa | Economics | 14.20 |
| Bruno Costa | Mathematics | 13.90 |
| Carla Mendes | Biology | 16.10 |
| Carla Mendes | Chemistry | 15.40 |
| Daniel Rocha | Engineering | 13.80 |
| Eva Santos | Mathematics | 18.30 |
| Eva Santos | Physics | 17.10 |
| Filipe Almeida | Mathematics | 12.60 |
| Guilherme Ferreira | Mathematics | 15.70 |
| Guilherme Ferreira | Economics | 15.10 |
| Helena Sousa | Physics | 16.90 |
| Inês Pereira | Biology | 13.10 |
| João Martins | Chemistry | 14.90 |
| João Martins | Computer Science | 14.40 |

💡 Tip: you can insert using subqueries to fetch ids:
```sql
INSERT INTO enrollments (student_id, major_id, gpa)
VALUES (
  (SELECT id FROM students WHERE name = '<student_name>'),
  (SELECT id FROM majors WHERE name = '<major_name>'),
  <gpa_value>
);
```


In [30]:
%%sql

-- EXERCISE 6:
-- Insert the enrollments (student_id + major_id must come from lookups)

INSERT INTO enrollments (student_id, major_id, gpa) VALUES
    (
        (SELECT id FROM students WHERE name='Ana Silva'), 
        (SELECT id FROM majors WHERE name='Computer Science'), 
        17.50
    ),
    
    ((SELECT id FROM students WHERE name='Ana Silva'), (SELECT id FROM majors WHERE name='Mathematics'), 16.80),

    ((SELECT id FROM students WHERE name='Bruno Costa'), (SELECT id FROM majors WHERE name='Economics'), 14.20),
    ((SELECT id FROM students WHERE name='Bruno Costa'), (SELECT id FROM majors WHERE name='Mathematics'), 13.90),

    ((SELECT id FROM students WHERE name='Carla Mendes'), (SELECT id FROM majors WHERE name='Biology'), 16.10),
    ((SELECT id FROM students WHERE name='Carla Mendes'), (SELECT id FROM majors WHERE name='Chemistry'), 15.40),

    ((SELECT id FROM students WHERE name='Daniel Rocha'), (SELECT id FROM majors WHERE name='Engineering'), 13.80),

    ((SELECT id FROM students WHERE name='Eva Santos'), (SELECT id FROM majors WHERE name='Mathematics'), 18.30),
    ((SELECT id FROM students WHERE name='Eva Santos'), (SELECT id FROM majors WHERE name='Physics'), 17.10),

    ((SELECT id FROM students WHERE name='Filipe Almeida'), (SELECT id FROM majors WHERE name='Mathematics'), 12.60),

    ((SELECT id FROM students WHERE name='Guilherme Ferreira'), (SELECT id FROM majors WHERE name='Mathematics'), 15.70),
    ((SELECT id FROM students WHERE name='Guilherme Ferreira'), (SELECT id FROM majors WHERE name='Economics'), 15.10),

    ((SELECT id FROM students WHERE name='Helena Sousa'), (SELECT id FROM majors WHERE name='Physics'), 16.90),

    ((SELECT id FROM students WHERE name='Inês Pereira'), (SELECT id FROM majors WHERE name='Biology'), 13.10),

    ((SELECT id FROM students WHERE name='João Martins'), (SELECT id FROM majors WHERE name='Chemistry'), 14.90),
    ((SELECT id FROM students WHERE name='João Martins'), (SELECT id FROM majors WHERE name='Computer Science'), 14.40)
;
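
The lookup-by-name pattern can also be driven from a list of rows. A minimal sketch using Python's built-in `sqlite3` as a stand-in for MySQL (SQLite types and `?` placeholders are assumptions of this sketch, and only a few rows of the dataset are used):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE majors   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        major_id   INTEGER NOT NULL,
        gpa        REAL NOT NULL,
        PRIMARY KEY (student_id, major_id)
    );
    INSERT INTO students (name) VALUES ('Ana Silva'), ('Bruno Costa');
    INSERT INTO majors (name) VALUES ('Computer Science'), ('Economics'), ('Mathematics');
""")

# Each scalar subquery resolves a name to its id at insert time.
INSERT_BY_NAME = """
    INSERT INTO enrollments (student_id, major_id, gpa)
    VALUES (
        (SELECT id FROM students WHERE name = ?),
        (SELECT id FROM majors   WHERE name = ?),
        ?
    )
"""

rows = [
    ("Ana Silva", "Computer Science", 17.50),
    ("Ana Silva", "Mathematics", 16.80),
    ("Bruno Costa", "Economics", 14.20),
]
conn.executemany(INSERT_BY_NAME, rows)

inserted = conn.execute(
    "SELECT student_id, major_id, gpa FROM enrollments ORDER BY student_id, major_id"
).fetchall()
print(inserted)

# A misspelled name makes the subquery return NULL, which NOT NULL rejects.
try:
    conn.execute(INSERT_BY_NAME, ("Ana Sylva", "Economics", 15.00))
    typo_rejected = False
except sqlite3.IntegrityError:
    typo_rejected = True
print(typo_rejected)
```

In this sketch the `NOT NULL` constraints double as a safety net: a typo in a name produces an error instead of a silently wrong row.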



## Exercise 7 - Verify enrollments (SELECT with JOINs)
Show a readable list of enrollments with:
- student id, student name
- major name
- gpa (per major)

💡 Tip (JOIN):
```sql
SELECT <columns>
FROM <junction> j
JOIN <table_a> a ON j.<a_id> = a.id
JOIN <table_b> b ON j.<b_id> = b.id;
```


In [39]:
%%sql

-- EXERCISE 7:
-- SELECT with JOINs to display enrollments
-- Without renaming columns, the output will have ambiguous column names (id and name)

SELECT *
    
FROM enrollments

JOIN students ON enrollments.student_id = students.id -- Join with students table
JOIN majors   ON enrollments.major_id   = majors.id   -- Join with majors table
    
;



student_id,major_id,gpa,id,name,birthdate,id_1,name_1
1,1,17.5,1,Ana Silva,2007-03-14,1,Computer Science
1,5,16.8,1,Ana Silva,2007-03-14,5,Mathematics
2,2,14.2,2,Bruno Costa,2006-11-02,2,Economics
2,5,13.9,2,Bruno Costa,2006-11-02,5,Mathematics
3,3,16.1,3,Carla Mendes,2007-07-29,3,Biology
3,7,15.4,3,Carla Mendes,2007-07-29,7,Chemistry
4,4,13.8,4,Daniel Rocha,2006-01-18,4,Engineering
5,5,18.3,5,Eva Santos,2007-09-05,5,Mathematics
5,6,17.1,5,Eva Santos,2007-09-05,6,Physics
6,5,12.6,6,Filipe Almeida,2006-05-21,5,Mathematics


In [None]:
%%sql

-- EXERCISE 7:
-- SELECT with JOINs to display enrollments
-- Without renaming columns, the output will have ambiguous column names (id and name)

SELECT
    students.id ,
    students.name,
    majors.name,
    enrollments.gpa 
    
FROM enrollments

JOIN students
    ON enrollments.student_id = students.id

JOIN majors
    ON enrollments.major_id = majors.id
    
ORDER BY students.id, majors.name;



id,name,name_1,gpa
1,Ana Silva,Computer Science,17.5
1,Ana Silva,Mathematics,16.8
2,Bruno Costa,Economics,14.2
2,Bruno Costa,Mathematics,13.9
3,Carla Mendes,Biology,16.1
3,Carla Mendes,Chemistry,15.4
4,Daniel Rocha,Engineering,13.8
5,Eva Santos,Mathematics,18.3
5,Eva Santos,Physics,17.1
6,Filipe Almeida,Mathematics,12.6


In [36]:
%%sql

-- EXERCISE 7:
-- SELECT with JOINs to display enrollments
-- Renaming columns with AS, so the output no longer has ambiguous column names (id and name)

SELECT
    students.id     AS student_id,
    students.name   AS student_name,
    majors.name     AS major_name,
    enrollments.gpa
    
FROM enrollments

JOIN students
    ON enrollments.student_id = students.id

JOIN majors
    ON enrollments.major_id = majors.id
    
ORDER BY students.id, majors.name;



student_id,student_name,major_name,gpa
1,Ana Silva,Computer Science,17.5
1,Ana Silva,Mathematics,16.8
2,Bruno Costa,Economics,14.2
2,Bruno Costa,Mathematics,13.9
3,Carla Mendes,Biology,16.1
3,Carla Mendes,Chemistry,15.4
4,Daniel Rocha,Engineering,13.8
5,Eva Santos,Mathematics,18.3
5,Eva Santos,Physics,17.1
6,Filipe Almeida,Mathematics,12.6


In [35]:
%%sql

-- EXERCISE 7:
-- SELECT with JOINs to display enrollments
-- Renaming columns with AS, so the output no longer has ambiguous column names (id and name)
-- Using table aliases (e, s, m) for shorter code

SELECT

    s.id     AS student_id,
    s.name   AS student_name,
    m.name   AS major_name,
    e.gpa

FROM enrollments AS e

JOIN students AS s
    ON e.student_id = s.id

JOIN majors AS m
    ON e.major_id = m.id

ORDER BY s.id, m.name;



student_id,student_name,major_name,gpa
1,Ana Silva,Computer Science,17.5
1,Ana Silva,Mathematics,16.8
2,Bruno Costa,Economics,14.2
2,Bruno Costa,Mathematics,13.9
3,Carla Mendes,Biology,16.1
3,Carla Mendes,Chemistry,15.4
4,Daniel Rocha,Engineering,13.8
5,Eva Santos,Mathematics,18.3
5,Eva Santos,Physics,17.1
6,Filipe Almeida,Mathematics,12.6
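
The effect of `AS` on the result's column names can be inspected directly. A minimal sketch using Python's built-in `sqlite3` as a stand-in for MySQL; `column_names` is a hypothetical helper, and how unaliased columns are labelled depends on the database and client (SQLite typically reports both as `name`, while the outputs above show `name` and `name_1`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE majors   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        major_id   INTEGER NOT NULL,
        gpa        REAL NOT NULL,
        PRIMARY KEY (student_id, major_id)
    );
    INSERT INTO students (name) VALUES ('Ana Silva');
    INSERT INTO majors (name) VALUES ('Computer Science');
    INSERT INTO enrollments VALUES (1, 1, 17.50);
""")

JOINS = """
    FROM enrollments AS e
    JOIN students AS s ON e.student_id = s.id
    JOIN majors   AS m ON e.major_id   = m.id
"""


def column_names(sql: str) -> list:
    """Return the column labels the database reports for a query."""
    cursor = conn.execute(sql)
    return [column[0] for column in cursor.description]


without_aliases = column_names("SELECT s.id, s.name, m.name, e.gpa" + JOINS)
with_aliases = column_names(
    "SELECT s.id AS student_id, s.name AS student_name, m.name AS major_name, e.gpa"
    + JOINS
)
print(without_aliases)
print(with_aliases)
```

With aliases, every label is unique and is chosen by you rather than by the tool.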


---
## Exercise 8 - Update data (UPDATE)
Update Bruno Costa's GPA **for the Economics major** to **15.00**.

💡 Tip: In a junction table you often update using BOTH keys:
```sql
UPDATE <junction_table>
SET <column> = <new_value>
WHERE <a_id> = <value> AND <b_id> = <value>;
```


In [None]:
%%sql

-- EXERCISE 8:
-- UPDATE the enrollment GPA for Bruno Costa in Economics
UPDATE enrollments

JOIN students  ON enrollments.student_id = students.id
JOIN majors    ON enrollments.major_id   = majors.id

SET enrollments.gpa = 15.00

WHERE students.name = 'Bruno Costa'
  AND majors.name = 'Economics'
;


In [41]:
%%sql

-- EXERCISE 8:
-- UPDATE the enrollment GPA for Bruno Costa in Economics
UPDATE enrollments AS e
    JOIN students  AS s  ON  e.student_id = s.id
    JOIN majors    AS m  ON  e.major_id   = m.id
    
SET e.gpa = 15.00

WHERE s.name = 'Bruno Costa'
  AND m.name = 'Economics'
;
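
The multi-table `UPDATE ... JOIN` form above is MySQL syntax. A minimal sketch of the same change written with a subquery for each key, which matches the "both keys" tip; it runs on Python's built-in `sqlite3`, used here only as a stand-in (SQLite does not accept `UPDATE ... JOIN`), with a reduced dataset:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE majors   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        major_id   INTEGER NOT NULL,
        gpa        REAL NOT NULL,
        PRIMARY KEY (student_id, major_id)
    );
    INSERT INTO students (name) VALUES ('Ana Silva'), ('Bruno Costa');
    INSERT INTO majors (name) VALUES ('Economics'), ('Mathematics');
    INSERT INTO enrollments VALUES (1, 2, 16.80), (2, 1, 14.20), (2, 2, 13.90);
""")

# Both keys are pinned, so exactly one enrollment row can match.
cursor = conn.execute(
    """
    UPDATE enrollments
    SET gpa = ?
    WHERE student_id = (SELECT id FROM students WHERE name = ?)
      AND major_id   = (SELECT id FROM majors   WHERE name = ?)
    """,
    (15.00, "Bruno Costa", "Economics"),
)
rows_changed = cursor.rowcount

after_update = conn.execute(
    "SELECT student_id, major_id, gpa FROM enrollments ORDER BY student_id, major_id"
).fetchall()
print(rows_changed, after_update)
```

Filtering on the student alone would also have changed Bruno's Mathematics GPA, which is why both keys appear in the `WHERE` clause.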


## Exercise 9 - Confirm the update (SELECT)
Show Bruno Costa's enrollments (with major names) to confirm the change.


In [45]:
%%sql

-- EXERCISE 9:
-- SELECT Bruno Costa enrollments (JOIN)
SELECT 
    students.name AS student,
    majors.name   AS major, 
    enrollments.gpa

FROM enrollments
JOIN students ON enrollments.student_id = students.id
JOIN majors   ON enrollments.major_id   = majors.id

WHERE students.name = 'Bruno Costa'
;


student,major,gpa
Bruno Costa,Economics,15.0
Bruno Costa,Mathematics,13.9


---
## Exercise 10 - Delete records (DELETE)
Delete **one or more enrollments** with `gpa` below **13.00**.

💡 Tip: `DELETE` can be applied to the junction table without deleting the student itself.


In [46]:
%%sql

-- EXERCISE 10:
-- DELETE enrollments where gpa < 13.00
DELETE FROM enrollments
WHERE gpa < 13.00;
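
A common precaution is to run the same `WHERE` clause in a `SELECT` first, to see how many rows a `DELETE` would remove. A minimal sketch using Python's built-in `sqlite3` as a stand-in for MySQL, with a reduced dataset:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        major_id   INTEGER NOT NULL,
        gpa        REAL NOT NULL,
        PRIMARY KEY (student_id, major_id)
    );
    INSERT INTO students (name) VALUES ('Ana Silva'), ('Filipe Almeida');
    INSERT INTO enrollments VALUES (1, 1, 17.50), (2, 5, 12.60);
""")

# Preview: count the rows that match the condition before deleting them.
to_delete = conn.execute(
    "SELECT COUNT(*) FROM enrollments WHERE gpa < 13.00"
).fetchone()[0]

deleted = conn.execute("DELETE FROM enrollments WHERE gpa < 13.00").rowcount

students_left = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
enrollments_left = conn.execute("SELECT COUNT(*) FROM enrollments").fetchone()[0]
print(to_delete, deleted, students_left, enrollments_left)
```

The student row for Filipe Almeida is still there afterwards; only his enrollment is gone.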


## Exercise 11 - Verification query after DELETE (SELECT)
Show all remaining enrollments (JOIN) after the delete.


In [51]:
%%sql

-- EXERCISE 11:
-- SELECT with JOINs to verify remaining enrollments
SELECT
  s.id    AS  student_id,
  s.name  AS  student_name,
  m.name  AS  major_name,
  e.gpa
  
FROM enrollments AS e
JOIN students    AS s  ON  e.student_id = s.id
JOIN majors      AS m  ON  e.major_id = m.id

ORDER BY e.gpa DESC;



student_id,student_name,major_name,gpa
5,Eva Santos,Mathematics,18.3
1,Ana Silva,Computer Science,17.5
5,Eva Santos,Physics,17.1
8,Helena Sousa,Physics,16.9
1,Ana Silva,Mathematics,16.8
3,Carla Mendes,Biology,16.1
7,Guilherme Ferreira,Mathematics,15.7
3,Carla Mendes,Chemistry,15.4
7,Guilherme Ferreira,Economics,15.1
2,Bruno Costa,Economics,15.0


---
## Exercise 12 - Global verification query (SELECT + ORDER BY)
Show `student_name`, `major_name`, `gpa`, ordered by `gpa` (highest to lowest).


In [None]:
%%sql

-- EXERCISE 12:
-- SELECT with JOINs + ORDER BY gpa DESC

SELECT
    s.name AS student_name,
    m.name AS major_name,
    e.gpa
  
FROM enrollments AS  e
JOIN students    AS  s  ON e.student_id = s.id
JOIN majors      AS  m  ON e.major_id = m.id

ORDER BY e.gpa DESC;



student_name,major_name,gpa
Eva Santos,Mathematics,18.3
Ana Silva,Computer Science,17.5
Eva Santos,Physics,17.1
Helena Sousa,Physics,16.9
Ana Silva,Mathematics,16.8
Carla Mendes,Biology,16.1
Guilherme Ferreira,Mathematics,15.7
Carla Mendes,Chemistry,15.4
Guilherme Ferreira,Economics,15.1
Bruno Costa,Economics,15.0


---
## Exercise 13 - SELECT with filters
Write **two** queries:
1) Show only enrollments for the major `Computer Science`.
2) Show enrollments with `gpa` between **15.00** and **18.00** (inclusive).


In [53]:
%%sql

-- EXERCISE 13 (1):
-- Filter by major name

SELECT
    s.name AS student_name,
    m.name AS major_name,
    e.gpa

FROM enrollments AS e
JOIN students AS s ON e.student_id = s.id
JOIN majors   AS m ON e.major_id = m.id
WHERE m.name = 'Computer Science';



student_name,major_name,gpa
Ana Silva,Computer Science,17.5
João Martins,Computer Science,14.4


In [54]:
%%sql

-- EXERCISE 13 (2):
-- Filter by GPA range
SELECT
    s.name AS student_name,
    m.name AS major_name,
    e.gpa
FROM enrollments  AS  e
JOIN students     AS  s ON e.student_id = s.id
JOIN majors      AS  m ON e.major_id = m.id
WHERE e.gpa BETWEEN 15.00 AND 18.00;


student_name,major_name,gpa
Ana Silva,Computer Science,17.5
Ana Silva,Mathematics,16.8
Bruno Costa,Economics,15.0
Carla Mendes,Biology,16.1
Carla Mendes,Chemistry,15.4
Eva Santos,Physics,17.1
Guilherme Ferreira,Economics,15.1
Guilherme Ferreira,Mathematics,15.7
Helena Sousa,Physics,16.9
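
`BETWEEN` includes both endpoints, which is why Bruno Costa's 15.0 appears above. A minimal sketch using Python's built-in `sqlite3` as a stand-in for MySQL; the `scores` table and its values are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (gpa REAL NOT NULL);
    INSERT INTO scores VALUES (14.99), (15.00), (16.50), (18.00), (18.01);
""")

between_rows = conn.execute(
    "SELECT gpa FROM scores WHERE gpa BETWEEN 15.00 AND 18.00 ORDER BY gpa"
).fetchall()

# The same filter spelled out with explicit comparisons.
explicit_rows = conn.execute(
    "SELECT gpa FROM scores WHERE gpa >= 15.00 AND gpa <= 18.00 ORDER BY gpa"
).fetchall()
print(between_rows, explicit_rows)
```

Both queries return the same rows, including 15.0 and 18.0 themselves.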


---
## Challenge (optional) - GROUP BY (by major)
Write a query that shows, **for each major**:
- the major name
- the **number of enrolled students** in that major
- the **average GPA** in that major

💡 Tip: Use `GROUP BY` + aggregates like `COUNT()` and `AVG()`.


In [55]:
%%sql

-- CHALLENGE:
-- GROUP BY major name (with JOINs) to compute COUNT and AVG
-- (Optional) order by average GPA DESC

SELECT
    m.name AS major_name,
    COUNT(*) AS num_students,
    ROUND(AVG(e.gpa), 2) AS avg_gpa
FROM enrollments AS e
JOIN majors      AS m ON e.major_id = m.id
GROUP BY m.id, m.name
ORDER BY avg_gpa DESC;



major_name,num_students,avg_gpa
Physics,2,17.0
Mathematics,4,16.18
Computer Science,2,15.95
Chemistry,2,15.15
Economics,2,15.05
Biology,2,14.6
Engineering,1,13.8
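
The averages can be cross-checked by hand: Physics has two enrollments, so its average is (17.1 + 16.9) / 2 = 17.0. A minimal sketch of the same aggregation on a reduced dataset, using Python's built-in `sqlite3` as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE majors (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        major_id   INTEGER NOT NULL,
        gpa        REAL NOT NULL,
        PRIMARY KEY (student_id, major_id)
    );
    INSERT INTO majors (name) VALUES ('Engineering'), ('Physics');
    INSERT INTO enrollments VALUES (4, 1, 13.80), (5, 2, 17.10), (8, 2, 16.90);
""")

# One output row per major: COUNT and AVG are computed within each group.
per_major = conn.execute("""
    SELECT
        m.name               AS major_name,
        COUNT(*)             AS num_students,
        ROUND(AVG(e.gpa), 2) AS avg_gpa
    FROM enrollments AS e
    JOIN majors      AS m ON e.major_id = m.id
    GROUP BY m.id, m.name
    ORDER BY avg_gpa DESC
""").fetchall()

for major_name, num_students, avg_gpa in per_major:
    print(major_name, num_students, avg_gpa)
```

Because the query starts from `enrollments` with an inner join, a major with no enrollments would not appear in the result at all.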


---

# Extra: VIEWS

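-- A view needs a name; `enrollment_details` is an illustrative choice, any valid name works.
CREATE VIEW enrollment_details AS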

SELECT
  s.id    AS  student_id,
  s.name  AS  student_name,
  m.name  AS  major_name,
  e.gpa
  
FROM enrollments AS e
JOIN students    AS s  ON  e.student_id = s.id
JOIN majors      AS m  ON  e.major_id = m.id

ORDER BY e.gpa DESC;
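
A view stores the query, not the rows, so it reflects later changes to the underlying tables. A minimal sketch using Python's built-in `sqlite3` as a stand-in for MySQL; the view name `enrollment_details` is a hypothetical choice and the dataset is reduced:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE majors   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        major_id   INTEGER NOT NULL,
        gpa        REAL NOT NULL,
        PRIMARY KEY (student_id, major_id)
    );
    INSERT INTO students (name) VALUES ('Ana Silva'), ('Bruno Costa');
    INSERT INTO majors (name) VALUES ('Computer Science'), ('Economics');
    INSERT INTO enrollments VALUES (1, 1, 17.50), (2, 2, 14.20);

    -- The view wraps the JOIN so it can be queried like a table.
    CREATE VIEW enrollment_details AS
    SELECT
        s.id   AS student_id,
        s.name AS student_name,
        m.name AS major_name,
        e.gpa
    FROM enrollments AS e
    JOIN students    AS s ON e.student_id = s.id
    JOIN majors      AS m ON e.major_id   = m.id;
""")

QUERY = "SELECT student_name, major_name, gpa FROM enrollment_details ORDER BY gpa DESC"
before = conn.execute(QUERY).fetchall()

# Change the base table; the view shows the new value on the next query.
conn.execute("UPDATE enrollments SET gpa = 15.00 WHERE student_id = 2 AND major_id = 2")
after = conn.execute(QUERY).fetchall()
print(before)
print(after)
```

The sort order is applied in the query that reads the view, which keeps the view reusable with any ordering.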
