# SQL Exam


## Instructions to students

This challenge is designed to determine how much you have learned so far and will test your knowledge of SQL.

The answers for this challenge should be selected on Athena for each corresponding multiple-choice question. The questions are included in this notebook and are numbered according to the Athena questions. The options for each question have also been included.

Do not add or remove cells in this notebook. Do not edit or remove the `%%sql` comment as it is required to run each cell.

**_Good luck!_**

## Honour code

I, OLAKUNLE ADEOTI OPELOYERU, confirm – by submitting this document – that the solutions in this notebook are a result of my own work and that I abide by the EDSA honour code.


## The TMDb database

In this supplementary exam, you will be exploring [The Movie Database](https://www.themoviedb.org/) – an online movie and TV show database that puts some of the most popular movies and TV shows at your fingertips. TMDb supports 39 official languages, is used in over 180 countries daily, and dates back to 2008.


<img src="https://github.com/Explore-AI/Pictures/blob/master/sql_tmdb.jpg?raw=true" width=80%/>


Below is an Entity Relationship Diagram (ERD) of the TMDb database:

<img src="https://github.com/Explore-AI/Pictures/blob/master/TMDB_ER_diagram.png?raw=true" width=70%/>

As can be seen from the ERD, the TMDb database consists of `12 tables` containing information about movies, cast, genre, and so much more.  

Let's get started!

## Loading the database

Before you begin, you need to prepare your SQL environment. You can do this by running the magic command `%load_ext sql`, which loads the SQL extension.

In [1]:
# Load and activate the SQL extension to allow us to execute SQL in a Jupyter notebook. 
# If you get an error here, make sure that ipython-sql and SQLAlchemy are installed correctly. 

%load_ext sql

Next, go ahead and load your database. To do this, you will need to ensure you have downloaded the `TMDB.db` SQLite file from Athena and have stored it in a known location.

In [4]:
# Establish a connection to the local database using the '%sql' magic command.
# A SQLite connection string needs no password; 'sqlite:///TMDB.db' points to TMDB.db in the current working directory.
# If you get an error here, please make sure the path to the database file is correct.

%sql sqlite:///TMDB.db

'Connected: @TMDB.db'

In [3]:
from sqlalchemy import create_engine, text

# Create an engine for the SQLite database file
engine = create_engine('sqlite:///TMDB.db')

# Open a connection to the database
with engine.connect() as connection:
    # Wrap the raw SQL in text(); SQLAlchemy 2.x no longer accepts plain strings in execute()
    result = connection.execute(text("SELECT name FROM sqlite_master WHERE type='table';"))
    tables = result.fetchall()

    # Print the name of each table
    for table in tables:
        print(table[0])

actors
casts
genremap
genres
keywordmap
keywords
languagemap
languages
movies
oscars
productioncompanies
productioncompanymap
productioncountries
productioncountrymap
sysdiagrams


If the cells above ran without any errors, then you should be good to go. Good luck with the exam! 

## Questions on SQL

Use the given cell below each question to execute your SQL queries and find the correct answer among the options provided. Your solution should match one of the multiple-choice options on Athena.

### Question 1

Who won the Oscar for “Actor in a Leading Role” in 2015?

(Hint: The winner is indicated as '1.0'.)

**Options:** 

  - Michael Fassbender
  - Natalie Portman
  - Leonardo DiCaprio
  - Eddie Redmayne


In [198]:
%%sql

SELECT *
FROM oscars
WHERE award LIKE 'Actor%Leading%' AND winner = 1.0
AND year = 2015;

 * sqlite:///TMDB.db
Done.


year,award,winner,name,film,movie_title
2015,Actor in a Leading Role,1.0,Leonardo DiCaprio,The Revenant,


### Question 2

What query will produce the ten oldest movies in the database?

**Options:**

 - SELECT TOP(10) * FROM movies WHERE release_date ORDER BY release_date ASC

 - SELECT  * FROM movies WHERE release_date IS NOT NULL ORDER BY release_date ASC LIMIT 10

 - SELECT * FROM movies WHERE release_date IS NOT NULL ORDER BY release_date DESC LIMIT 10

 -  SELECT * FROM movies WHERE release_date IS NULL ORDER BY release_date DESC LIMIT 10

In [5]:
%%sql

SELECT *
FROM movies
WHERE release_date IS NOT NULL ORDER BY release_date ASC
LIMIT 10;

 * sqlite:///TMDB.db
Done.


movie_id,title,release_date,budget,homepage,original_language,original_title,overview,popularity,revenue,runtime,release_status,tagline,vote_average,vote_count
3059,Intolerance,1916-09-04 00:00:00.000000,385907,,en,Intolerance,"The story of a poor young woman, separated by prejudice from her husband and baby, is interwoven with tales of intolerance from throughout history.",3.232447,8394751.0,197.0,Released,The Cruel Hand of Intolerance,7.4,60
3060,The Big Parade,1925-11-05 00:00:00.000000,245000,,en,The Big Parade,"The story of an idle rich boy who joins the US Army's Rainbow Division and is sent to France to fight in World War I, becomes friends with two working class men, experiences the horrors of trench warfare, and finds love with a French girl.",0.785744,22000000.0,151.0,Released,,7.0,21
19,Metropolis,1927-01-10 00:00:00.000000,92620000,,de,Metropolis,"In a futuristic city sharply divided between the working class and the city planners, the son of the city's mastermind falls in love with a working class prophet who predicts the coming of a savior to mediate their differences.",32.351527,650422.0,153.0,Released,There can be no understanding between the hands and the brain unless the heart acts as mediator.,8.0,657
905,Pandora's Box,1929-01-30 00:00:00.000000,0,,de,Die Bnchse der Pandora,The rise and inevitable fall of an amoral but naive young woman whose insouciant eroticism inspires lust and violence in those around her.,1.824184,0.0,109.0,Released,,7.6,45
65203,The Broadway Melody,1929-02-08 00:00:00.000000,379000,,en,The Broadway Melody,"Harriet and Queenie Mahoney, a vaudeville act, come to Broadway, where their friend Eddie Kerns needs them for his number in one of Francis Zanfield's shows. Eddie was in love with Harriet, but when he meets Queenie, he falls in love to her, but she is courted by Jock Warriner, a member of the New Yorker high society. It takes a while till Queenie recognizes, that she is for Jock nothing more than a toy, and it also takes a while till Harriet recognizes, that Eddie is in love with Queenie",0.968865,4358000.0,100.0,Released,The pulsating drama of Broadway's bared heart speaks and sings with a voice to stir your soul!,5.0,19
22301,Hell's Angels,1930-11-15 00:00:00.000000,3950000,,en,Hell's Angels,"Two brothers attending Oxford enlist with the Royal Flying Corps when World War I breaks out. Roy and Monte Rutledge have very different personalities. Monte is a freewheeling womanizer, even with his brother's girlfriend Helen. He also proves to have a yellow streak when it comes to his Night Patrol duties. Roy is made of strong moral fiber and attempts to keep his brother in line. Both volunteer for an extremely risky two man bombing mission for different reasons. Monte wants to lose his cowardly reputation and Roy seeks to protect his brother. Roy loves Helen; Helen enjoys an affair with Monte; before they leave on their mission over Germany they find her in still another man's arms. Their assignment to knock out a strategic German munitions facility is a booming success, but with a squadron of fighters bearing down on them afterwards, escape seems unlikely.",8.484123,8000000.0,127.0,Released,Howard Hughes' Thrilling Multi-Million Dollar Air Spectacle,6.1,19
22649,A Farewell to Arms,1932-12-08 00:00:00.000000,4,,en,A Farewell to Arms,"British nurse Catherine Barkley (Helen Hayes) and American Lieutenant Frederic Henry (Gary Cooper) fall in love during the First World War in Italy. Eventually separated by Frederic's transfer, tremendous challenges and difficult decisions face each, as the war rages on. Academy Awards winner for Best Cinematography and for Best Sound, Recording. Nominated for Best Picture and for Best Art Direction.",1.199451,25.0,89.0,Released,Every woman who has loved will understand,6.2,28
3062,42nd Street,1933-02-02 00:00:00.000000,439000,,en,42nd Street,"A producer puts on what may be his last Broadway show, and at the last moment a chorus girl has to replace the star.",1.933366,2281000.0,89.0,Released,,6.1,37
43595,She Done Him Wrong,1933-02-09 00:00:00.000000,200000,,en,She Done Him Wrong,"""New York singer and nightclub owner Lady Lou has more men friends than you can imagine. Unfortunately one of them is a vicious criminal who's escaped and is on the way to see """"his"""" girl, not realising she hasn't exactly been faithful in his absence. Help is at hand in the form of young Captain Cummings a local temperance league leader though.""",0.622752,2200000.0,66.0,Released,Mae West gives a 'Hot Time' to the nation!,5.1,27
3078,It Happened One Night,1934-02-22 00:00:00.000000,325000,,en,It Happened One Night,"Ellie Andrews has just tied the knot with society aviator King Westley when she is whisked away to her father's yacht and out of King's clutches. Ellie jumps ship and eventually winds up on a bus headed back to her husband. Reluctantly she must accept the help of out-of- work reporter Peter Warne. Actually, Warne doesn't give her any choice: either she sticks with him until he gets her back to her husband, or he'll blow the whistle on Ellie to her father. Either way, Peter gets what he wants... a really juicy newspaper story!",11.871424,4500000.0,105.0,Released,TOGETHER... for the first time,7.7,275


### Question 3

How many unique awards are there in the Oscars table?

**Options:**
 - 141
 - 53 
 - 80
 - 114

In [284]:
%%sql

SELECT
    COUNT(DISTINCT award) AS Unique_awards
FROM oscars;

 * sqlite:///TMDB.db
Done.


Unique_awards
114
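
The difference between counting rows and counting distinct values can be checked on a small scale. The sketch below uses Python's built-in `sqlite3` module with an in-memory table; the table layout and all rows are made up for illustration and are not taken from `TMDB.db`.

```python
import sqlite3

# Toy stand-in for the oscars table; every row here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oscars (year TEXT, award TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO oscars VALUES (?, ?, ?)",
    [
        ("2001", "Actor", "Person A"),
        ("2001", "Actress", "Person B"),
        ("2002", "Actor", "Person C"),
        ("2002", "Directing", "Person D"),
    ],
)

# COUNT(award) counts every non-NULL value, duplicates included.
total_rows = conn.execute("SELECT COUNT(award) FROM oscars").fetchone()[0]

# COUNT(DISTINCT award) counts each award category once.
unique_awards = conn.execute("SELECT COUNT(DISTINCT award) FROM oscars").fetchone()[0]

print(total_rows, unique_awards)
conn.close()
```

Here "Actor" appears twice, so the plain count is 4 while the distinct count is 3.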


### Question 4

How many movies are there that contain the word “Spider” within their title?

**Options:**
 - 0
 - 5
 - 1
 - 9

In [10]:
%%sql

SELECT
    COUNT(title) AS Spider_movies
FROM movies 
WHERE title LIKE '%Spider%';

 * sqlite:///TMDB.db
Done.


Spider_movies
9


### Question 5

How many movies are there that are both in the "Thriller" genre and contain the word “love” anywhere in the keywords?


**Options:**
 - 48
 - 38
 - 14
 - 1

In [297]:
%%sql

SELECT
    COUNT(*)
FROM 
    keywords kw 
INNER JOIN keywordmap kwm
    ON kwm.keyword_id = kw.keyword_id
INNER JOIN movies m
    ON m.movie_id = kwm.movie_id
INNER JOIN genremap gm
    ON gm.movie_id = m.movie_id
INNER JOIN genres g
    ON g.genre_id = gm.genre_id
WHERE g.genre_name = "Thriller"
AND keyword_name LIKE 'love';

 * sqlite:///TMDB.db
Done.


COUNT(*)
14
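
Two details of this kind of query are worth checking: `LIKE 'love'` without wildcards matches only keywords that are exactly "love", and `COUNT(*)` counts joined rows rather than distinct movies. The sketch below illustrates both on an in-memory SQLite database; the tables are simplified and all rows are invented, so the numbers say nothing about the real `TMDB.db`.

```python
import sqlite3

# Simplified, made-up versions of the keywords and keywordmap tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keywords (keyword_id INTEGER PRIMARY KEY, keyword_name TEXT);
CREATE TABLE keywordmap (movie_id INTEGER, keyword_id INTEGER);
INSERT INTO keywords VALUES (1, 'love'), (2, 'love triangle'), (3, 'glove'), (4, 'revenge');
INSERT INTO keywordmap VALUES (10, 1), (10, 2), (11, 3), (12, 4);
""")

def count(select_expr, pattern):
    """Run the same join with a given count expression and LIKE pattern."""
    query = f"""
        SELECT {select_expr}
        FROM keywords kw
        INNER JOIN keywordmap kwm ON kwm.keyword_id = kw.keyword_id
        WHERE kw.keyword_name LIKE ?
    """
    return conn.execute(query, (pattern,)).fetchone()[0]

exact_rows = count("COUNT(*)", "love")                             # exact match only
wildcard_rows = count("COUNT(*)", "%love%")                        # substring match, one row per keyword
wildcard_movies = count("COUNT(DISTINCT kwm.movie_id)", "%love%")  # each movie counted once

print(exact_rows, wildcard_rows, wildcard_movies)
conn.close()
```

Note that `'%love%'` also matches "glove", so a substring pattern can be broader than the word the question has in mind.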


### Question 6

How many movies are there that were released between 1 August 2006 ('2006-08-01') and 1 October 2009 ('2009-10-01') that have a popularity score of more than 40 and a budget of less than 50 000 000?

 
**Options:**

 - 29
 - 23
 - 28
 - 35

In [9]:
%%sql

SELECT COUNT (*)
FROM 
    movies
WHERE release_date BETWEEN '2006-08-01' AND '2009-10-01' 
AND popularity > 40 AND budget < 50000000;

 * sqlite:///TMDB.db
Done.


COUNT (*)
29
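
One edge case with `BETWEEN` on dates: if `release_date` is stored as text with a time component (as the output of Question 2 suggests), the comparison is a string comparison, and a value such as `'2009-10-01 00:00:00.000000'` sorts after `'2009-10-01'`. The sketch below shows the effect on an in-memory table with invented rows; whether it changes the count in `TMDB.db` depends on whether any movie was released exactly on the upper bound.

```python
import sqlite3

# Made-up movies whose release_date is stored as text with a time component.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [
        ("A", "2006-08-01 00:00:00.000000"),
        ("B", "2008-05-05 00:00:00.000000"),
        ("C", "2009-10-01 00:00:00.000000"),
        ("D", "2010-01-01 00:00:00.000000"),
    ],
)

# Plain string comparison: movie C falls just outside the upper bound.
raw = [r[0] for r in conn.execute(
    "SELECT title FROM movies "
    "WHERE release_date BETWEEN '2006-08-01' AND '2009-10-01' ORDER BY title"
)]

# date() strips the time part, so the upper bound becomes inclusive.
by_date = [r[0] for r in conn.execute(
    "SELECT title FROM movies "
    "WHERE date(release_date) BETWEEN '2006-08-01' AND '2009-10-01' ORDER BY title"
)]

print(raw, by_date)
conn.close()
```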


### Question 7

How many unique characters has "Vin Diesel" played so far in the database?

**Options:**
 - 24
 - 19
 - 18
 - 16

In [14]:
%%sql

SELECT 
    COUNT(DISTINCT characters) AS Unique_Role
FROM 
    casts
INNER JOIN actors 
    ON actors.actor_id = casts.actor_id
WHERE actors.actor_name = 'Vin Diesel';

 * sqlite:///TMDB.db
Done.


Unique_Role
16


### Question 8

What are the genres of the movie “The Royal Tenenbaums”?


**Options:**
 - Action, Romance
 - Drama, Comedy
 - Crime, Thriller
 - Drama, Romance

In [16]:
%%sql

SELECT 
    g.genre_name,
    m.title,
    gm.genre_id
FROM 
    movies m
INNER JOIN genremap gm
ON gm.movie_id = m.movie_id
INNER JOIN genres g
ON g.genre_id = gm.genre_id
WHERE m.title = 'The Royal Tenenbaums';

 * sqlite:///TMDB.db
Done.


genre_name,title,genre_id
Drama,The Royal Tenenbaums,18
Comedy,The Royal Tenenbaums,35


### Question 9

What are the three production companies that have the highest movie popularity score on average, as recorded within the database?


**Options:**

 - MCL Films S.A., Turner Pictures, and George Stevens Productions
 - The Donners' Company, Bulletproof Cupid, and Kinberg Genre
 - Bulletproof Cupid, The Donners' Company, and MCL Films S.A
 - B.Sting Entertainment, Illumination Pictures, and Aztec Musique

In [138]:
%%sql

SELECT 
    production_company_name, AVG(popularity) AS Avg_Popularity_score
FROM 
    movies m
INNER JOIN productioncompanymap pcm
    ON pcm.movie_id = m.movie_id 
INNER JOIN productioncompanies pc
    ON pc.production_company_id = pcm.production_company_id 
GROUP BY production_company_name
ORDER BY 
    Avg_popularity_score DESC
LIMIT 3;

 * sqlite:///TMDB.db
Done.


production_company_name,Avg_Popularity_score
The Donners' Company,514.569956
Bulletproof Cupid,481.098624
Kinberg Genre,326.92099900000005
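
The join-then-aggregate pattern used here can be tried on a toy dataset. The sketch below mirrors the table and column names from the query, but the companies, movies, and popularity values are invented; the extra `n_movies` column is an addition for illustration that shows how many movies each average is based on.

```python
import sqlite3

# Made-up data: three companies, four movies, one movie shared by two companies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, popularity REAL);
CREATE TABLE productioncompanies (
    production_company_id INTEGER PRIMARY KEY,
    production_company_name TEXT
);
CREATE TABLE productioncompanymap (movie_id INTEGER, production_company_id INTEGER);
INSERT INTO movies VALUES (1, 10.0), (2, 20.0), (3, 30.0), (4, 2.0);
INSERT INTO productioncompanies VALUES (1, 'Company X'), (2, 'Company Y'), (3, 'Company Z');
INSERT INTO productioncompanymap VALUES (1, 1), (2, 1), (3, 2), (1, 3), (4, 3);
""")

# Average popularity per company, highest first, keeping the top two.
top_two = conn.execute("""
    SELECT pc.production_company_name,
           AVG(m.popularity) AS avg_popularity,
           COUNT(*) AS n_movies
    FROM movies m
    INNER JOIN productioncompanymap pcm
        ON pcm.movie_id = m.movie_id
    INNER JOIN productioncompanies pc
        ON pc.production_company_id = pcm.production_company_id
    GROUP BY pc.production_company_name
    ORDER BY avg_popularity DESC
    LIMIT 2
""").fetchall()

print(top_two)
conn.close()
```

In this toy data the top company has only one movie, which is a reminder that an average over very few rows can dominate such a ranking.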


### Question 10

How many female actors (i.e. gender = 1) have a name that starts with the letter "N"?


**Options:**

 - 0
 - 355
 - 7335
 - 1949

In [302]:
%%sql

SELECT gender, COUNT (*) 
FROM actors a
WHERE gender = 1 AND actor_name LIKE 'N%';

 * sqlite:///TMDB.db
Done.


gender,COUNT (*)
1,355


### Question 11

Which genre has, on average, the lowest movie popularity score? 


**Options:**

 - Science Fiction
 - Animation
 - Documentary
 - Foreign

In [307]:
%%sql


SELECT genre_name, AVG(popularity) AS AVG_Popularity_Genre
FROM 
    movies m
INNER JOIN genremap gm
ON gm.movie_id = m.movie_id
INNER JOIN genres g
ON g.genre_id = gm.genre_id
GROUP BY genre_name
ORDER BY 
AVG_Popularity_Genre ASC
LIMIT 1;

 * sqlite:///TMDB.db
Done.


genre_name,AVG_Popularity_Genre
Foreign,0.686786794117647


### Question 12

Which award category has the highest number of actor nominations (actors can be male or female)? (Hint: `Oscars.name` contains both actors' names and film names.)

**Options:**

- Special Achievement Award
- Actor in a Supporting Role
- Actress in a Supporting Role
- Best Picture



In [282]:
%%sql

SELECT award,COUNT(*) AS nomination_counts
FROM 
    oscars o
INNER JOIN actors a
    ON o.name = a.actor_name
GROUP BY award
ORDER BY nomination_counts DESC
LIMIT 1;

 * sqlite:///TMDB.db
Done.


award,nomination_counts
Actor in a Supporting Role,356


### Question 13

For all entries in the Oscars table before 1934, the year is stored differently from the subsequent years. For example, the year would be saved as “1932/1933” instead of just “1933” (the second indicated year). Which of the following options would be the appropriate code to make the format of the year consistent throughout the entire table, keeping only the second indicated year?


**Options:**

- `UPDATE Oscars SET year = RIGHT(year, -4)`
- `UPDATE Oscars SET year = SELECT substr(year, -4)`
- `UPDATE Oscars SET year = substr(year, -4)`
- `UPDATE Oscars year =  substr(year, 4)`

In [91]:
%%sql

UPDATE Oscars SET year = substr(year, -4);

 * sqlite:///TMDB.db
9964 rows affected.


[]
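
The effect of `substr(year, -4)` can be verified on a few sample values. This is a minimal sketch using an in-memory SQLite table, assuming the `year` column holds text; the award names are placeholders.

```python
import sqlite3

# Toy Oscars table with both year formats; award names are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Oscars (year TEXT, award TEXT)")
conn.executemany(
    "INSERT INTO Oscars VALUES (?, ?)",
    [
        ("1927/1928", "Award A"),
        ("1932/1933", "Award B"),
        ("1934", "Award C"),
        ("2015", "Award D"),
    ],
)

# A negative start position counts from the end of the string,
# so substr(year, -4) keeps the last four characters.
cursor = conn.execute("UPDATE Oscars SET year = substr(year, -4)")
rows_affected = cursor.rowcount

years = [r[0] for r in conn.execute("SELECT year FROM Oscars ORDER BY year")]
print(rows_affected, years)
conn.close()
```

Because the `UPDATE` has no `WHERE` clause, every row is rewritten, including those that already hold a four-character year; their values simply stay the same.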

### Question 14

DStv will be having a special week dedicated to the actor Alan Rickman. Which of the following queries would create a new _view_ that shows the titles, release dates, taglines, and overviews of all movies that Alan Rickman has played in?



**Options:**

- SELECT title, release_date, tagline, overview 
FROM Movies LEFT JOIN Casts ON Casts.movie_id = Movies.movie_id Left JOIN Actors ON Casts.actor_id = Actors.actor_id 
WHERE Actors.actor_name = 'Alan Rickman'
AS VIEW Alan_Rickman_Movies

- CREATE VIEW Alan_Rickman_Movies AS  
SELECT title, release_date, tagline, overview FROM Movies  
LEFT JOIN Casts ON Casts.movie_id = Movies.movie_id Left JOIN Actors
ON Casts.actor_id = Actors.actor_id
WHERE Actors.actor_name = 'Alan Rickman' 


- CREATE NEW VIEW  Name  = Alan_Rickman_Movies AS SELECT title, release_date, tagline, overview FROM Movies LEFT JOIN Casts ON Casts.movie_id = Movies.movie_id Left JOIN Actors ON Casts.actor_id = Actors.actor_id WHERE Actors.actor_name = 'Alan Rickman'

- VIEW Alan_Rickman_Movies AS SELECT title, release_date, tagline, overview FROM Movies LEFT JOIN Casts ON Casts.movie_id = Movies.movie_id Left JOIN Actors ON Casts.actor_id = Actors.actor_id WHERE Actors.actor_name = 'Alan Rickman'

In [None]:
%%sql


CREATE VIEW Alan_Rickman_Movies AS
SELECT title, release_date, tagline, overview
FROM Movies
LEFT JOIN Casts
    ON Casts.movie_id = Movies.movie_id
LEFT JOIN Actors
    ON Casts.actor_id = Actors.actor_id
WHERE Actors.actor_name = 'Alan Rickman';
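
A view is stored as a named query and can then be selected from like a table. The sketch below creates a view of the same shape on an in-memory database; the tables are reduced to the columns needed, and the actor and film names are placeholders rather than real data.

```python
import sqlite3

# Reduced, made-up versions of the Movies, Actors, and Casts tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movies (movie_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Actors (actor_id INTEGER PRIMARY KEY, actor_name TEXT);
CREATE TABLE Casts (movie_id INTEGER, actor_id INTEGER);
INSERT INTO Movies VALUES (1, 'Film One'), (2, 'Film Two'), (3, 'Film Three');
INSERT INTO Actors VALUES (1, 'Actor A'), (2, 'Actor B');
INSERT INTO Casts VALUES (1, 1), (2, 1), (2, 2);

CREATE VIEW Actor_A_Movies AS
SELECT title
FROM Movies
LEFT JOIN Casts ON Casts.movie_id = Movies.movie_id
LEFT JOIN Actors ON Casts.actor_id = Actors.actor_id
WHERE Actors.actor_name = 'Actor A';
""")

# The view is recorded in sqlite_master with type 'view'.
views = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'view'")]

# Querying the view runs the stored SELECT.
titles = [r[0] for r in conn.execute("SELECT title FROM Actor_A_Movies ORDER BY title")]

print(views, titles)
conn.close()
```

'Film Three' has no cast rows, so the `LEFT JOIN` yields a NULL `actor_name` that the `WHERE` clause filters out; with this filter the query returns the same rows an `INNER JOIN` would.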

### Question 15

Which of the statements about database normalisation are true?

**Statements:**
 
i) Database normalisation reduces data redundancy, saves on storage space, and fulfils the requirement of records to be uniquely identified.

ii) Database normalisation supports up to the Third Normal Form and removes all data anomalies.

iii) Database normalisation removes inconsistencies that may cause the analysis of our data to be more complicated.

iv) Database normalisation increases data redundancy, saves on storage space, and fulfils the requirement of records to be uniquely identified.

**Options:**

 - (i) and (ii)
 - (i) and (iii)
 - (ii) and (iv)
 - (iii) and (iv)

In [None]:
Number 15 Answer: (i) and (iii)