# Set Theory in SQL: Classlist Database
#### © Explore Data Science Academy

## Instructions to Students

This challenge was designed to determine how much has been learned so far and to test knowledge of SQL set theory statements.

Questions were provided and answered using SQL queries to test my understanding of the subject matter.

## Honour Code

I YINKA, AKINDELE, confirm - by submitting this document - that the solutions in this notebook are a result of my own work and that I abided by the EDSA honour code.

Non-compliance with the honour code constitutes a material breach of contract.

## The TMDb Database

In this challenge I explored [The Movie Database](https://www.themoviedb.org/), an online movie and TV show database that puts some of the most popular movies and TV shows at your fingertips. The TMDb database supports 39 official languages used in over 180 countries daily and dates all the way back to 2008.


<img src="images/sql_tmdb.jpg" width=80%/>


Below is an Entity Relationship Diagram (ERD) of the TMDb database:

<img src="images/TMDB_ER_diagram.png" width=70%/>

As can be seen from the ER diagram, the TMDb database consists of 12 tables containing information about movies, cast, genres and much more.



#### Getting started!

This challenge is designed to test your knowledge of SQL set theory statements, based on the classlist database.

## Loading the database

The SQL environment was prepared by loading the `sql` magic extension below so that SQL queries can be run in the notebook.

The database, classlist.db, was then loaded into the SQL environment in the following cell.


In [1]:
%load_ext sql

In [2]:
%%sql 

sqlite:///data/classlist.db


#### The query below was used to view all the tables in the database

In [3]:
%%sql
SELECT name FROM sqlite_master WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' ORDER BY 1

 * sqlite:///data/classlist.db
Done.


name
exammarks
supplementarymarks
sysdiagrams
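
The same catalogue lookup can be tried outside the notebook. The sketch below uses Python's built-in `sqlite3` module and a throwaway in-memory database. The two table names mirror the listing above, but the single `StudentNo` column is a placeholder and not the real classlist schema.

```python
import sqlite3

# Hypothetical in-memory database: table names follow the listing above,
# the columns are placeholders only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exammarks (StudentNo TEXT)")
conn.execute("CREATE TABLE supplementarymarks (StudentNo TEXT)")

# sqlite_master holds one row per schema object. Keep tables and views,
# and skip SQLite's internal objects, whose names start with "sqlite_".
rows = conn.execute(
    """
    SELECT name
    FROM sqlite_master
    WHERE type IN ('table', 'view')
      AND name NOT LIKE 'sqlite_%'
    ORDER BY name
    """
).fetchall()

table_names = [name for (name,) in rows]
print(table_names)
conn.close()
```

Ordering by `name` is equivalent to `ORDER BY 1` here, because `name` is the first and only selected column.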


## Questions on Set Theory 

#### The following cells contain questions that were answered from the database using set theory SQL queries.

### Question

How many students did not write any of their final exams?

**Solution**

In [4]:
%%sql

SELECT StudentNo, Maths, Science, Biology, Accounting, CompSci,
       count(*) AS "Number_of_students_that_did_not_write_any_final_exams"
FROM (
    -- Rows with no mark in any subject
    SELECT StudentNo, Maths, Science, Biology, Accounting, CompSci
    FROM exammarks
    WHERE Maths IS NULL
      AND Science IS NULL
      AND Biology IS NULL
      AND Accounting IS NULL
      AND CompSci IS NULL

    -- UNION drops duplicate rows, so a student blank in both tables counts once
    UNION

    SELECT StudentNo, Maths, Science, Biology, Accounting, CompSci
    FROM supplementarymarks
    WHERE Maths IS NULL
      AND Science IS NULL
      AND Biology IS NULL
      AND Accounting IS NULL
      AND CompSci IS NULL
)

 * sqlite:///data/classlist.db
Done.


StudentNo,Maths,Science,Biology,Accounting,CompSci,Number_of_students_that_did_not_write_any_final_exams
DODJAM003,,,,,,1
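
Whether a student who is blank in both tables is counted once or twice depends on the set operator. The sketch below illustrates this with `sqlite3` and a small made-up data set. The two-subject tables and the student numbers are hypothetical and stand in for the five-subject classlist tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: B002 has no marks in either table.
conn.executescript("""
CREATE TABLE exammarks (StudentNo TEXT, Maths INTEGER, Science INTEGER);
CREATE TABLE supplementarymarks (StudentNo TEXT, Maths INTEGER, Science INTEGER);
INSERT INTO exammarks VALUES ('A001', 60, 70), ('B002', NULL, NULL), ('C003', NULL, 55);
INSERT INTO supplementarymarks VALUES ('B002', NULL, NULL), ('C003', 65, NULL);
""")

# Template for the compound query; {op} is filled with UNION or UNION ALL.
blank_rows = """
    SELECT StudentNo FROM exammarks
    WHERE Maths IS NULL AND Science IS NULL
    {op}
    SELECT StudentNo FROM supplementarymarks
    WHERE Maths IS NULL AND Science IS NULL
"""

def count_rows(op):
    """Count the rows produced by the compound query for the given operator."""
    sql = "SELECT count(*) FROM (" + blank_rows.format(op=op) + ")"
    return conn.execute(sql).fetchone()[0]

union_count = count_rows("UNION")          # duplicates removed
union_all_count = count_rows("UNION ALL")  # duplicates kept
print(union_count, union_all_count)
conn.close()
```

With this data `UNION` reports one student and `UNION ALL` reports two rows, which is why `UNION` is the safer choice when counting distinct students.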


### Question

What are the names of the students in the grade who scored the highest marks for Science?


**Solution**

In [5]:
%%sql

SELECT Name AS "First_Name", Science AS "Science_score"
FROM exammarks

UNION

SELECT Name AS "First_Name", Science AS "Science_score"
FROM supplementarymarks

ORDER BY Science DESC
LIMIT 5;

 * sqlite:///data/classlist.db
Done.


First_Name,Science_score
CRAIG,100
DANIELLE,100
BILLIE,99
DUANE,99
JOE,97
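
An alternative to sorting and applying a fixed `LIMIT` is to keep only the rows whose mark equals the maximum of the combined set, which returns every tied top scorer however many there are. The sketch below shows the idea with `sqlite3`. The names and marks are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: ANN and CAT share the top Science mark.
conn.executescript("""
CREATE TABLE exammarks (Name TEXT, Science INTEGER);
CREATE TABLE supplementarymarks (Name TEXT, Science INTEGER);
INSERT INTO exammarks VALUES ('ANN', 100), ('BEN', 88), ('CAT', 45);
INSERT INTO supplementarymarks VALUES ('CAT', 100), ('DAN', 72);
""")

top_scorers = conn.execute("""
    WITH all_marks AS (
        SELECT Name, Science FROM exammarks
        UNION
        SELECT Name, Science FROM supplementarymarks
    )
    SELECT Name, Science
    FROM all_marks
    WHERE Science = (SELECT MAX(Science) FROM all_marks)  -- keep ties only
    ORDER BY Name
""").fetchall()

print(top_scorers)
conn.close()
```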


### Question

How many students had to rewrite both their Maths and Science exams? *(hint: a passing mark is considered to be 50 or greater)*

**Solution**

In [6]:
%%sql

SELECT count(*) AS "Number_of_students_rewriting_Maths_and_Science"
FROM (SELECT Maths, Science
FROM exammarks
WHERE Maths < 50
AND Science < 50)

 * sqlite:///data/classlist.db
Done.


Number_of_students_rewriting_Maths_and_Science
4
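
Because this is a set theory exercise, the same count can also be phrased as the intersection of two sets: students who failed Maths and students who failed Science. The sketch below compares both formulations with `sqlite3`. The data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: A001 and D004 fail both subjects, E005 has no Maths mark.
conn.executescript("""
CREATE TABLE exammarks (StudentNo TEXT, Maths INTEGER, Science INTEGER);
INSERT INTO exammarks VALUES
    ('A001', 40, 45), ('B002', 30, 80), ('C003', 75, 20),
    ('D004', 49, 10), ('E005', NULL, 30);
""")

# Set formulation: failed Maths INTERSECT failed Science.
intersect_count = conn.execute("""
    SELECT count(*) FROM (
        SELECT StudentNo FROM exammarks WHERE Maths < 50
        INTERSECT
        SELECT StudentNo FROM exammarks WHERE Science < 50
    )
""").fetchone()[0]

# Predicate formulation: both conditions in one WHERE clause.
and_count = conn.execute(
    "SELECT count(*) FROM exammarks WHERE Maths < 50 AND Science < 50"
).fetchone()[0]

print(intersect_count, and_count)
conn.close()
```

A `NULL` mark never satisfies `< 50`, so a student with a missing mark is left out of both formulations.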


### Question

What was the average mark, rounded down, for students who wrote the supplementary accounting exam after missing the first?

**Solution**

In [7]:
%%sql

-- Students with a supplementary mark but no row at all in exammarks.
-- round(x - 0.5) rounds a positive average down.
SELECT round(avg(sm.Accounting)-0.5)
FROM supplementarymarks sm
WHERE NOT EXISTS
      (SELECT 1
       FROM exammarks em
       WHERE em.StudentNo = sm.StudentNo)

 * sqlite:///data/classlist.db
Done.


round(avg(sm.Accounting)-0.5)
76.0
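
The set of students in question can also be written with `EXCEPT`. The sketch below assumes, as the query above does, that "missing the first exam" means the student has no row in `exammarks`. The data is hypothetical, and `CAST(... AS INTEGER)` is used as one way to round a non-negative average down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: C003 and D004 appear only in supplementarymarks.
conn.executescript("""
CREATE TABLE exammarks (StudentNo TEXT, Accounting INTEGER);
CREATE TABLE supplementarymarks (StudentNo TEXT, Accounting INTEGER);
INSERT INTO exammarks VALUES ('A001', 40), ('B002', 90);
INSERT INTO supplementarymarks VALUES ('A001', 60), ('C003', 70), ('D004', 75);
""")

avg_floor = conn.execute("""
    SELECT CAST(avg(sm.Accounting) AS INTEGER)  -- truncates 72.5 to 72
    FROM supplementarymarks sm
    WHERE sm.StudentNo IN (
        SELECT StudentNo FROM supplementarymarks
        EXCEPT                                  -- set difference
        SELECT StudentNo FROM exammarks
    )
""").fetchone()[0]

print(avg_floor)
conn.close()
```

`CAST` truncates toward zero, so it matches rounding down only for non-negative values, which holds for exam marks.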


### Question

What was the average mark, rounded down, for students who wrote the supplementary accounting exam after failing the first?


**Solution**

In [8]:
%%sql

SELECT FLOOR(avg(s.Accounting))
FROM supplementarymarks s, exammarks e
WHERE e.Accounting < 50 

 * sqlite:///data/classlist.db
Done.


FLOOR(avg(s.Accounting))
73
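
When two tables are listed in `FROM` without a condition linking them, every row of one table is paired with every row of the other. The average of the supplementary marks then covers all supplementary rows rather than only the students who failed. The sketch below contrasts that with a join on `StudentNo`, using `sqlite3` and made-up data. It illustrates the effect only and does not reproduce the classlist result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: A001 and C003 fail the first Accounting exam.
conn.executescript("""
CREATE TABLE exammarks (StudentNo TEXT, Accounting INTEGER);
CREATE TABLE supplementarymarks (StudentNo TEXT, Accounting INTEGER);
INSERT INTO exammarks VALUES ('A001', 40), ('B002', 90), ('C003', 30);
INSERT INTO supplementarymarks VALUES ('A001', 60), ('B002', 100), ('C003', 51);
""")

# No join condition: each supplementary row is paired with every failing row.
cross_avg = conn.execute("""
    SELECT avg(s.Accounting)
    FROM supplementarymarks s, exammarks e
    WHERE e.Accounting < 50
""").fetchone()[0]

# Join on StudentNo: only the failing students' own supplementary marks remain.
matched_avg = conn.execute("""
    SELECT avg(s.Accounting)
    FROM supplementarymarks s
    JOIN exammarks e ON s.StudentNo = e.StudentNo
    WHERE e.Accounting < 50
""").fetchone()[0]

print(cross_avg, matched_avg)
conn.close()
```

Here the unlinked version averages all three supplementary marks (211 / 3), while the joined version averages only 60 and 51.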


### Question

What is the full name of the student in the grade who scored the highest mark for Biology? *(hint: consider both supplementary and exam marks)*


**Solution**

In [9]:
%%sql

SELECT Name || ' ' || Surname AS "Full Name", Biology AS "Biology_Score"
FROM exammarks

UNION

SELECT Name || ' ' || Surname AS "Full Name", Biology AS "Biology_Score"
FROM supplementarymarks

ORDER BY Biology DESC
LIMIT 2;

 * sqlite:///data/classlist.db
Done.


Full Name,Biology_Score
TRACY GRADY,99
BERTHA HOFF,98
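
The pattern of combining both tables, building the full name with `||` and keeping the top row can be sketched with `sqlite3` as follows. The names and marks are hypothetical. Single quotes are used for the space because standard SQL reserves double quotes for identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: BEN OBI's best Biology mark comes from the supplementary exam.
conn.executescript("""
CREATE TABLE exammarks (Name TEXT, Surname TEXT, Biology INTEGER);
CREATE TABLE supplementarymarks (Name TEXT, Surname TEXT, Biology INTEGER);
INSERT INTO exammarks VALUES ('ANN', 'LEE', 91), ('BEN', 'OBI', 67);
INSERT INTO supplementarymarks VALUES ('BEN', 'OBI', 95), ('CAT', 'DOE', 80);
""")

best = conn.execute("""
    SELECT Name || ' ' || Surname AS full_name, Biology
    FROM (
        SELECT Name, Surname, Biology FROM exammarks
        UNION
        SELECT Name, Surname, Biology FROM supplementarymarks
    )
    ORDER BY Biology DESC  -- highest mark first
    LIMIT 1                -- returns a single row even if marks are tied
""").fetchone()

print(best)
conn.close()
```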


### Question

Assuming all subjects are weighted equally, what was the average total mark, rounded down, for students who didn’t write any supplementary exams?

 
*(Correct answer: 74)*

**Solution**

In [10]:
%%sql

SELECT FLOOR(avg((e.Maths + e.Science + e.Biology + e.Accounting + e.CompSci)/5)) AS "Average_of_those_who_didn't_write_any_supplementary_exams"
FROM exammarks e
WHERE NOT EXISTS
(
    SELECT *
    FROM supplementarymarks s
    WHERE s.StudentNo = e.StudentNo
)

 * sqlite:///data/classlist.db
Done.


Average_of_those_who_didn't_write_any_supplementary_exams
74
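
One detail worth checking in a query like this is the division. If the mark columns are stored as integers (an assumption, since the classlist schema is not shown here), SQLite truncates each student's total before the average is taken. The sketch below uses `sqlite3`, two hypothetical subjects instead of five, and made-up marks to show how dividing by `2` and by `2.0` can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: C003 wrote a supplementary exam and is excluded.
conn.executescript("""
CREATE TABLE exammarks (StudentNo TEXT, Maths INTEGER, Science INTEGER);
CREATE TABLE supplementarymarks (StudentNo TEXT, Maths INTEGER, Science INTEGER);
INSERT INTO exammarks VALUES ('A001', 70, 75), ('B002', 80, 83), ('C003', 30, 40);
INSERT INTO supplementarymarks VALUES ('C003', 55, 60);
""")

# {divisor} is filled with 2 (integer division) or 2.0 (real division).
query = """
    SELECT avg((e.Maths + e.Science) / {divisor})
    FROM exammarks e
    WHERE NOT EXISTS (
        SELECT 1 FROM supplementarymarks s
        WHERE s.StudentNo = e.StudentNo
    )
"""

int_avg = conn.execute(query.format(divisor="2")).fetchone()[0]     # 72 and 81
real_avg = conn.execute(query.format(divisor="2.0")).fetchone()[0]  # 72.5 and 81.5
print(int_avg, real_avg)
conn.close()
```

With this data the integer version averages to 76.5 and the real version to 77.0, so the rounded-down answers differ by one.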
