# Set Theory in SQL: Classlist Database
© Explore Data Science Academy

## Instructions to Students

This challenge is designed to determine how much you have learned so far and will test your knowledge of set theory through the use of SQL queries.

The answers for this challenge should be selected on Athena for each corresponding multiple choice question. The questions are included in this notebook and are numbered according to the Athena questions; the options to choose from for each question have also been included.

Do not add or remove cells in this notebook. Do not edit or remove the `%%sql` magic command at the top of each cell, as it is required to run the cell.

**_Good Luck!_**

## Honour Code

I, EDNA KOBO, confirm - by submitting this document - that the solutions in this notebook are a result of my own work and that I abide by the EDSA honour code (https://drive.google.com/file/d/1QDCjGZJ8-FmJE3bZdIQNwnJyQKPhHZBn/view?usp=sharing).

Non-compliance with the honour code constitutes a material breach of contract.

## The Classlist Database

![Hi](https://upload.wikimedia.org/wikipedia/commons/3/39/Student_in_Class_%283618969705%29.jpg)

The Classlist database contains the records of multiple students who have undertaken primary and supplementary examinations in multiple subjects. This data is split across two tables: 

 - **Exammarks**; and 
 - **Supplementarymarks**

Unlike our previous challenge, we leave it up to you to investigate the contents of these tables and the various attributes they contain.  

## Loading the database

To start running SQL queries, you first need to prepare your SQL environment. Do this by loading the SQL extension with the magic command `%load_ext sql`. Next, load the database. To do this, make sure you have downloaded the `classlist.db` SQLite file from Athena and stored it in a known location.

Now that you have all the prerequisites, you can go ahead and load the database into the notebook.

In [1]:
%load_ext sql

In [2]:
%%sql 

sqlite:///classlist.db
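A quick sanity check before querying can save time. SQLite creates a new, empty database when the file in the connection string does not exist, and every query against that empty database then fails with `no such table`. The sketch below is a minimal helper under stated assumptions: `list_tables` is a hypothetical name, and the path in the example call is whatever location you saved the file to.

```python
import os
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the table names stored in the SQLite file at db_path."""
    # sqlite3.connect() would silently create an empty file here, so fail early.
    if not os.path.isfile(db_path):
        raise FileNotFoundError(f"No database file at {db_path!r}")
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Example call (the path is an assumption; adjust it to where you saved the file):
# print(list_tables("classlist.db"))
```

If the returned list is empty or the call raises `FileNotFoundError`, the notebook is probably not pointing at the downloaded file.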

## Questions on Set Theory 

Use the cell below each question to run your SQL queries and find the correct answer among the options provided for the multiple choice questions on Athena.

**Question 2**

How many students did not write any of their final exams?

**Options:** 
 - 5
 - 95
 - 70
 - 25

**Solution**

In [3]:
%%sql 
SELECT COUNT(DISTINCT sm.StudentNo) - COUNT(DISTINCT em.StudentNo)
FROM Supplementarymarks sm
LEFT JOIN exammarks em
ON sm.StudentNo = em.StudentNo;

 * sqlite:///classlist.db
(sqlite3.OperationalError) no such table: Supplementarymarks
[SQL: SELECT COUNT(DISTINCT sm.StudentNo) - COUNT(DISTINCT em.StudentNo)
FROM Supplementarymarks sm
LEFT JOIN exammarks em
ON sm.StudentNo = em.StudentNo;]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
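Counting students who appear in one table but not the other is a set difference. As an illustration only, the sketch below builds two toy tables in an in-memory database (`exam` and `supp` are hypothetical names and rows, not the Classlist data) and compares `EXCEPT` with the equivalent anti-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exam (StudentNo INTEGER);
    CREATE TABLE supp (StudentNo INTEGER);
    INSERT INTO exam VALUES (1), (2), (3);
    INSERT INTO supp VALUES (2), (3), (4), (5);
""")

# Set difference: student numbers in supp that never appear in exam.
except_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT StudentNo FROM supp
        EXCEPT
        SELECT StudentNo FROM exam
    )
""").fetchone()[0]

# Anti-join: keep only supp rows with no matching exam row.
anti_join_count = conn.execute("""
    SELECT COUNT(DISTINCT s.StudentNo)
    FROM supp s
    LEFT JOIN exam e ON s.StudentNo = e.StudentNo
    WHERE e.StudentNo IS NULL
""").fetchone()[0]

print(except_count, anti_join_count)
conn.close()
```

Both forms count students 4 and 5 in this toy data; which one reads better is largely a matter of taste.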


**Question 4**

What are the names of the students in the grade who scored the highest marks for Science? _(hint: you need to consider the exam AND supplementary exam marks)_

**Options:**
 - Jack and Jane
 - Joe and Duane
 - Leroy and Harold
 - Craig and Danielle

**Solution**

In [5]:
%%sql
SELECT Name, Science
FROM (
  SELECT Name, Science FROM Exammarks
  UNION
  SELECT Name, Science FROM SupplementaryMarks
) AS combined_marks
WHERE Science > 50
ORDER BY Science DESC
LIMIT 2;

 * sqlite:///classlist.db
(sqlite3.OperationalError) no such table: SupplementaryMarks
[SQL: SELECT Name, Science
FROM (
  SELECT Name, Science FROM Exammarks
  UNION
  SELECT Name, Science FROM SupplementaryMarks
) AS combined_marks
WHERE Science > 50
ORDER BY Science DESC
LIMIT 2;]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
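When combining marks from two tables, it helps to remember that `UNION` removes duplicate rows while `UNION ALL` keeps them. The following sketch uses made-up tables and rows (`exam`, `supp`, and the names in them are assumptions for illustration) to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exam (Name TEXT, Science INTEGER);
    CREATE TABLE supp (Name TEXT, Science INTEGER);
    INSERT INTO exam VALUES ('Ann', 90), ('Ben', 40);
    INSERT INTO supp VALUES ('Ann', 90), ('Ben', 75);
""")

# UNION treats the combined rows as a set, so ('Ann', 90) appears once.
union_rows = conn.execute("""
    SELECT Name, Science FROM exam
    UNION
    SELECT Name, Science FROM supp
    ORDER BY Science DESC, Name
""").fetchall()

# UNION ALL keeps every row from both inputs, duplicates included.
union_all_rows = conn.execute("""
    SELECT Name, Science FROM exam
    UNION ALL
    SELECT Name, Science FROM supp
    ORDER BY Science DESC, Name
""").fetchall()

print(union_rows)
print(union_all_rows)
conn.close()
```

Note that a `LIMIT` on the combined result can return the same student twice if that student has two different marks, so it is worth inspecting a few extra rows.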


**Question 5**

How many students had to re-write both their Maths and Science exams? *(hint: a passing mark is considered to be 50 or greater)*

**Options:**
 - 12
 - 4
 - 20
 - 9

**Solution**

In [6]:
%%sql
SELECT COUNT(StudentNo) 
FROM ExamMarks
WHERE Maths < 50 AND Science < 50;

 * sqlite:///classlist.db
(sqlite3.OperationalError) no such table: ExamMarks
[SQL: SELECT COUNT(StudentNo) 
FROM ExamMarks
WHERE Maths < 50 AND Science < 50;]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
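A condition on two subjects at once is a set intersection: the students who failed Maths, intersected with the students who failed Science. The sketch below, on a hypothetical `marks` table with invented rows, shows that `INTERSECT` and a combined `WHERE ... AND ...` select the same students.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (StudentNo INTEGER, Maths INTEGER, Science INTEGER);
    INSERT INTO marks VALUES (1, 40, 45), (2, 40, 60), (3, 70, 30), (4, 80, 90);
""")

# Intersection of two sets of student numbers.
failed_both_intersect = [row[0] for row in conn.execute("""
    SELECT StudentNo FROM marks WHERE Maths < 50
    INTERSECT
    SELECT StudentNo FROM marks WHERE Science < 50
""")]

# The same condition expressed as a single filter.
failed_both_where = [row[0] for row in conn.execute("""
    SELECT StudentNo FROM marks WHERE Maths < 50 AND Science < 50
""")]

print(failed_both_intersect, failed_both_where)
conn.close()
```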


**Question 6**

What was the average mark, rounded down, for students who wrote the supplementary Accounting exam after missing the first?

**Options:**
 - 73
 - 79
 - 76
 - 82

**Solution**

In [8]:
%%sql
SELECT ROUND(AVG(sm.Accounting) - 0.5)
FROM SupplementaryMarks sm
WHERE NOT EXISTS (
  SELECT *
  FROM ExamMarks em
  WHERE sm.StudentNo = em.StudentNo
);


 * sqlite:///classlist.db
(sqlite3.OperationalError) no such table: SupplementaryMarks
[SQL: SELECT ROUND(AVG(sm.Accounting) - 0.5)
FROM SupplementaryMarks sm
WHERE NOT EXISTS (
  SELECT *
  FROM ExamMarks em
  WHERE sm.StudentNo = em.StudentNo
);]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
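There is more than one way to round an average down in SQLite. The sketch below compares `ROUND(x - 0.5)` with `CAST(x AS INTEGER)` on a hypothetical single-column `marks` table. `CAST` truncates toward zero, which matches rounding down only for non-negative values such as marks.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (Accounting INTEGER);
    INSERT INTO marks VALUES (70), (75), (76);
""")

avg_mark, cast_floor, round_trick = conn.execute("""
    SELECT AVG(Accounting),
           CAST(AVG(Accounting) AS INTEGER),   -- truncates toward zero
           ROUND(AVG(Accounting) - 0.5)        -- shift down, then round
    FROM marks
""").fetchone()

# Cross-check against Python's floor on the raw average.
python_floor = math.floor(avg_mark)

print(avg_mark, cast_floor, round_trick, python_floor)
conn.close()
```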


**Question 7**

What was the average mark, rounded down, for students who wrote the supplementary Accounting exam after failing the first?

**Options:**
 - 79
 - 82
 - 76
 - 73

**Solution**

In [9]:
%%sql
SELECT ROUND(AVG(sm.Accounting) - 0.5)
FROM SupplementaryMarks sm
LEFT OUTER JOIN ExamMarks em ON sm.StudentNo = em.StudentNo
WHERE em.Accounting < 50 OR em.Accounting IS NULL;

 * sqlite:///classlist.db
(sqlite3.OperationalError) no such table: SupplementaryMarks
[SQL: SELECT ROUND(AVG(sm.Accounting) - 0.5)
FROM SupplementaryMarks sm
LEFT OUTER JOIN ExamMarks em ON sm.StudentNo = em.StudentNo
WHERE em.Accounting < 50 OR em.Accounting IS NULL;]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
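With a `LEFT JOIN`, an `IS NULL` test on the right-hand table also admits rows that had no match at all, so "failed the first exam" and "missed the first exam" can end up in the same average. The toy example below (hypothetical `exam` and `supp` tables with invented marks) makes the difference visible. Whether the two groups should be combined depends on how the question is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exam (StudentNo INTEGER, Accounting INTEGER);
    CREATE TABLE supp (StudentNo INTEGER, Accounting INTEGER);
    INSERT INTO exam VALUES (1, 40), (2, 80);
    INSERT INTO supp VALUES (1, 60), (2, 85), (3, 90);
""")

# Only students with a first-exam row below 50 (student 1).
failed_only = conn.execute("""
    SELECT AVG(s.Accounting)
    FROM supp s
    JOIN exam e ON s.StudentNo = e.StudentNo
    WHERE e.Accounting < 50
""").fetchone()[0]

# Adding IS NULL also pulls in student 3, who has no first-exam row.
failed_or_missing = conn.execute("""
    SELECT AVG(s.Accounting)
    FROM supp s
    LEFT JOIN exam e ON s.StudentNo = e.StudentNo
    WHERE e.Accounting < 50 OR e.Accounting IS NULL
""").fetchone()[0]

print(failed_only, failed_or_missing)
conn.close()
```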


**Question 9**

What is the full name of the student in the grade who scored the highest mark for Biology? *(hint: consider both supplementary and exam marks)*

**Options:**
 - Tracy Grady
 - Bertha Hoff
 - Daryl Finn
 - Lillie Deaton

**Solution**

In [10]:
%%sql 
SELECT Name, Surname, Biology 
FROM (
  SELECT Name, Surname, Biology
  FROM SupplementaryMarks
  WHERE Biology > 50
  UNION
  SELECT Name, Surname, Biology
  FROM ExamMarks
  WHERE Biology > 50
) AS subquery
ORDER BY Biology DESC
LIMIT 1;

 * sqlite:///classlist.db
(sqlite3.OperationalError) no such table: ExamMarks
[SQL: SELECT Name, Surname, Biology 
FROM (
  SELECT Name, Surname, Biology
  FROM SupplementaryMarks
  WHERE Biology > 50
  UNION
  SELECT Name, Surname, Biology
  FROM ExamMarks
  WHERE Biology > 50
) AS subquery
ORDER BY Biology DESC
LIMIT 1;]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
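If a single full-name column is more convenient than separate name and surname columns, SQLite's `||` operator concatenates strings. This is a small sketch on invented data (the `exam` and `supp` tables and the three students are assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exam (Name TEXT, Surname TEXT, Biology INTEGER);
    CREATE TABLE supp (Name TEXT, Surname TEXT, Biology INTEGER);
    INSERT INTO exam VALUES ('Ann', 'Lee', 88), ('Ben', 'Ray', 70);
    INSERT INTO supp VALUES ('Cal', 'Poe', 93);
""")

# Combine both tables, then build the full name for the top Biology mark.
top_student = conn.execute("""
    SELECT Name || ' ' || Surname AS FullName, Biology
    FROM (
        SELECT Name, Surname, Biology FROM exam
        UNION
        SELECT Name, Surname, Biology FROM supp
    )
    ORDER BY Biology DESC
    LIMIT 1
""").fetchone()

print(top_student)
conn.close()
```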


**Question 10**

Assuming all subjects are weighted equally, what was the average total mark, rounded down, for students who didn’t write any supplementary exams?
 
**Options:**
 - 74
 - 66
 - 73
 - 76

**Solution**

In [11]:
%%sql 
SELECT ROUND(AVG((em.Maths + em.Accounting + em.CompSci + em.Science + em.Biology) / 5.0) - 0.5)
FROM ExamMarks em
LEFT OUTER JOIN SupplementaryMarks sm ON em.StudentNo = sm.StudentNo
WHERE sm.StudentNo IS NULL;

 * sqlite:///classlist.db
(sqlite3.OperationalError) no such table: ExamMarks
[SQL: SELECT ROUND(AVG((em.Maths + em.Accounting + em.CompSci + em.Science + em.Biology) / 5.0) - 0.5)
FROM ExamMarks em
LEFT OUTER JOIN SupplementaryMarks sm ON em.StudentNo = sm.StudentNo
WHERE sm.StudentNo IS NULL;]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
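When averaging several subjects per student, keep in mind that SQLite's `/` performs integer division when both operands are integers, which truncates each student's average before `AVG` sees it. The sketch below uses a hypothetical two-subject `marks` table to show how dividing by `2` versus `2.0` changes the result. Whether the real mark columns are stored as integers is an assumption you would need to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (A INTEGER, B INTEGER);
    INSERT INTO marks VALUES (71, 72), (72, 73);
""")

int_avg, float_avg = conn.execute("""
    SELECT AVG((A + B) / 2),    -- integer division: 143 / 2 -> 71, 145 / 2 -> 72
           AVG((A + B) / 2.0)   -- real division:    71.5 and 72.5
    FROM marks
""").fetchone()

print(int_avg, float_avg)
conn.close()
```

In this toy data the two averages fall on different sides of a whole number, so rounding down gives different answers.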
