## The TMDb database

I will be exploring [The Movie Database](https://www.themoviedb.org/) (TMDb), an online database of movies and TV shows. TMDb supports 39 official languages, is used daily in over 180 countries, and dates back to 2008.


<img src="https://github.com/Explore-AI/Pictures/blob/master/sql_tmdb.jpg?raw=true" width=80%/>


Below is an Entity Relationship Diagram (ERD) of the TMDb database:

<img src="https://github.com/Explore-AI/Pictures/blob/master/TMDB_ER_diagram.png?raw=true" width=70%/>

As the ERD shows, the TMDb database consists of 12 tables containing information about movies, cast, genres, keywords, production companies, and more.

Let's get started!

In [1]:
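# Load and activate the SQL extension so that SQL can be executed in a Jupyter notebook.
# If you get an error here, make sure that the SQL extension (for example, the ipython-sql package) is installed correctly.
# The connection below uses SQLite, so no MySQL driver is needed.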

%load_ext sql

In [2]:
# Establish a connection to the local database using the '%sql' magic command.

%sql sqlite:///TMDB.db

'Connected: @TMDB.db'
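
Before running queries, it can help to confirm which tables the connection actually exposes. The sketch below uses Python's built-in `sqlite3` module instead of the `%sql` magic, and an in-memory database with two hypothetical tables standing in for `TMDB.db`. The helper name `list_tables` is an assumption made for illustration; pointing `sqlite3.connect` at the real file should list the tables from the ERD.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all user tables in a SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# In-memory stand-in for TMDB.db (hypothetical, tiny schema for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE genres (genre_id INTEGER PRIMARY KEY, genre_name TEXT)")

tables = list_tables(conn)
print(tables)
```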

### Question 1

Who won the Oscar for “Actor in a Leading Role” in 2015?

(Hint: The winner is indicated as '1.0'.)

In [3]:
%%sql

SELECT
    *
FROM
    OSCARS
WHERE
    year = 2015 AND
    award = 'Actor in a Leading Role' AND
    winner = '1.0'; 

 * sqlite:///TMDB.db
Done.


year,award,winner,name,film
2015,Actor in a Leading Role,1.0,Leonardo DiCaprio,The Revenant


### Question 2

What are the ten oldest movies in the database?

In [4]:
%%sql

SELECT
    movie_id,
    title,
    release_date
FROM
    movies
WHERE 
    release_date IS NOT NULL
ORDER BY
    release_date asc
limit 10;

 * sqlite:///TMDB.db
Done.


movie_id,title,release_date
3059,Intolerance,1916-09-04 00:00:00.000000
3060,The Big Parade,1925-11-05 00:00:00.000000
19,Metropolis,1927-01-10 00:00:00.000000
905,Pandora's Box,1929-01-30 00:00:00.000000
65203,The Broadway Melody,1929-02-08 00:00:00.000000
22301,Hell's Angels,1930-11-15 00:00:00.000000
22649,A Farewell to Arms,1932-12-08 00:00:00.000000
3062,42nd Street,1933-02-02 00:00:00.000000
43595,She Done Him Wrong,1933-02-09 00:00:00.000000
3078,It Happened One Night,1934-02-22 00:00:00.000000
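
The ordering above works because `release_date` is stored as ISO-8601 text, which sorts chronologically when compared as plain strings. The sketch below reproduces the pattern on a few hypothetical rows in an in-memory database; it also trims the timestamp to the date part with `substr`, and shows why the `IS NOT NULL` filter matters (SQLite sorts `NULL` first in ascending order).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [
        ("Metropolis", "1927-01-10 00:00:00.000000"),
        ("Intolerance", "1916-09-04 00:00:00.000000"),
        ("Undated film", None),  # hypothetical row with no release date
        ("The Big Parade", "1925-11-05 00:00:00.000000"),
    ],
)

# ISO-formatted text sorts in date order; substr keeps only YYYY-MM-DD.
oldest = conn.execute(
    """
    SELECT title, substr(release_date, 1, 10) AS release_day
    FROM movies
    WHERE release_date IS NOT NULL
    ORDER BY release_date ASC
    LIMIT 2
    """
).fetchall()

# Without the filter, the NULL row would come first.
first_unfiltered = conn.execute(
    "SELECT title FROM movies ORDER BY release_date ASC LIMIT 1"
).fetchone()[0]

print(oldest, first_unfiltered)
```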


### Question 3

How many unique awards are there in the Oscars table?

In [5]:
%%sql

SELECT 
    count(distinct award) as unique_awards
FROM
    oscars;

 * sqlite:///TMDB.db
Done.


unique_awards
114


### Question 4

How many movies are there that contain the word “Spider” within their title?


In [6]:
%%sql

SELECT
    count(title) as movies_with_spider
FROM
    movies
WHERE
    title like '%Spider%';

 * sqlite:///TMDB.db
Done.


movies_with_spider
9
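
Note that SQLite's `LIKE` is case-insensitive for ASCII characters by default, so the pattern `'%Spider%'` also matches titles containing "spider" in lowercase. The sketch below, on hypothetical titles in an in-memory table, contrasts `LIKE` with the case-sensitive `GLOB` operator (which uses `*` as its wildcard).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [
        ("Spider-Man",),
        ("Along Came a Spider",),
        ("kiss of the spider woman",),  # deliberately lowercase
        ("Arachnophobia",),
    ],
)

# LIKE ignores ASCII case, so all three "spider" titles match.
like_count = conn.execute(
    "SELECT COUNT(*) FROM movies WHERE title LIKE '%Spider%'"
).fetchone()[0]

# GLOB is case-sensitive, so the lowercase title is excluded.
glob_count = conn.execute(
    "SELECT COUNT(*) FROM movies WHERE title GLOB '*Spider*'"
).fetchone()[0]

print(like_count, glob_count)
```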


### Question 5

How many movies are there that are both in the "Thriller" genre and contain the word “love” anywhere in the keywords?


In [7]:
%%sql

SELECT
    COUNT(DISTINCT gm.movie_id) AS thriller_love
FROM
    genremap as gm
JOIN
    genres as g on g.genre_id = gm.genre_id
JOIN
    keywordmap as km on km.movie_id = gm.movie_id
JOIN
    keywords as key on key.keyword_id = km.keyword_id
WHERE
    g.genre_name like 'thriller' AND
    key.keyword_name like '%love%';

 * sqlite:///TMDB.db
Done.


thriller_love
48
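
`COUNT(DISTINCT gm.movie_id)` matters here because a movie can carry several keywords that contain "love", and each match produces its own joined row. A minimal sketch on hypothetical data (table names follow the notebook, the rows are invented) shows the difference between counting joined rows and counting movies:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE keywords (keyword_id INTEGER, keyword_name TEXT);
    CREATE TABLE keywordmap (movie_id INTEGER, keyword_id INTEGER);
    INSERT INTO keywords VALUES (1, 'love'), (2, 'love triangle'), (3, 'heist');
    -- movie 10 has two "love" keywords, movie 11 has one, movie 12 has none
    INSERT INTO keywordmap VALUES (10, 1), (10, 2), (11, 1), (12, 3);
    """
)

joined_rows, distinct_movies = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT km.movie_id)
    FROM keywordmap AS km
    JOIN keywords AS k ON k.keyword_id = km.keyword_id
    WHERE k.keyword_name LIKE '%love%'
    """
).fetchone()

print(joined_rows, distinct_movies)
```

Counting rows would report movie 10 twice; counting distinct movie IDs reports each movie once.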


### Question 6

How many movies are there that were released between 1 August 2006 ('2006-08-01') and 1 October 2009 ('2009-10-01') that have a popularity score of more than 40 and a budget of less than 50 000 000?

 
**Options:**

 - 29
 - 23
 - 28
 - 35

In [8]:
%%sql

SELECT
    count(*) as old_low_budget_movies
FROM
    movies
WHERE
    release_date BETWEEN '2006-08-01' AND '2009-10-01'
    AND popularity > 40
    AND budget < 50000000;

 * sqlite:///TMDB.db
Done.


old_low_budget_movies
29
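
One subtlety with `BETWEEN` on this column: `release_date` is stored as text with a time component (for example `2009-10-01 00:00:00.000000`), and that string compares as greater than the bare date `'2009-10-01'`. A movie released exactly on the upper boundary would therefore be excluded. Whether this affects the count above depends on the data, which is not shown here. The sketch below demonstrates the effect on three hypothetical rows and one possible workaround using `substr`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [
        ("First day", "2006-08-01 00:00:00.000000"),
        ("Middle", "2008-01-15 00:00:00.000000"),
        ("Last day", "2009-10-01 00:00:00.000000"),
    ],
)

# Comparing the raw timestamp text: the upper-boundary row falls outside the range.
raw_count = conn.execute(
    "SELECT COUNT(*) FROM movies "
    "WHERE release_date BETWEEN '2006-08-01' AND '2009-10-01'"
).fetchone()[0]

# Comparing only the YYYY-MM-DD prefix: both boundaries are inclusive.
day_count = conn.execute(
    "SELECT COUNT(*) FROM movies "
    "WHERE substr(release_date, 1, 10) BETWEEN '2006-08-01' AND '2009-10-01'"
).fetchone()[0]

print(raw_count, day_count)
```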


### Question 7

How many unique characters has "Vin Diesel" played so far in the database?

**Options:**
 - 24
 - 19
 - 18
 - 16

In [9]:
%%sql

SELECT
    count(distinct characters) as unique_vindiesel
FROM
    casts as c
JOIN
    actors as a on c.actor_id = a.actor_id
JOIN
    movies as m on m.movie_id = c.movie_id
WHERE
    actor_name = 'Vin Diesel';

 * sqlite:///TMDB.db
Done.


unique_vindiesel
16


### Question 8

What are the genres of the movie “The Royal Tenenbaums”?


**Options:**
 - Action, Romance
 - Drama, Comedy
 - Crime, Thriller
 - Drama, Romance

In [10]:
%%sql

SELECT
    m.title,
    g.genre_name
FROM
    movies as m 
JOIN
    genremap as gm on gm.movie_id = m.movie_id
JOIN
    genres as g on g.genre_id = gm.genre_id
WHERE
    m.title = 'The Royal Tenenbaums';

 * sqlite:///TMDB.db
Done.


title,genre_name
The Royal Tenenbaums,Drama
The Royal Tenenbaums,Comedy


### Question 9

What are the three production companies that have the highest movie popularity score on average, as recorded within the database?


**Options:**

 - MCL Films S.A., Turner Pictures, and George Stevens Productions
 - The Donners' Company, Bulletproof Cupid, and Kinberg Genre
 - Bulletproof Cupid, The Donners' Company, and MCL Films S.A
 - B.Sting Entertainment, Illumination Pictures, and Aztec Musique

In [11]:
%%sql

SELECT
    p.production_company_name as cmpy_name, 
    round(AVG(m.popularity)) as avg_popularity
FROM
    movies as m 
JOIN
    productioncompanymap as pcm on m.movie_id = pcm.movie_id
JOIN
    productioncompanies as p on p.production_company_id = pcm.production_company_id
GROUP BY
    p.production_company_name 
ORDER BY
    avg_popularity DESC
limit 3;

 * sqlite:///TMDB.db
Done.


cmpy_name,avg_popularity
The Donners' Company,515.0
Bulletproof Cupid,481.0
Kinberg Genre,327.0


### Question 10

How many female actors (i.e. gender = 1) have a name that starts with the letter "N"?


**Options:**

 - 0
 - 355
 - 7335
 - 1949

In [12]:
%%sql

SELECT
    count(*) as female_N
FROM
    actors
WHERE
    gender = 1
    AND actor_name like 'N%'
;

 * sqlite:///TMDB.db
Done.


female_N
355


### Question 11

Which genre has, on average, the lowest movie popularity score? 


**Options:**

 - Science Fiction
 - Animation
 - Documentary
 - Foreign

In [13]:
%%sql

SELECT
    g.genre_name,
    AVG(m.popularity) as avg_pop
FROM
    genres as g
JOIN
    genremap as gm on g.genre_id = gm.genre_id
JOIN
    movies as m on m.movie_id = gm.movie_id
GROUP BY
    g.genre_name
ORDER BY 
    avg_pop asc
limit 1;

 * sqlite:///TMDB.db
Done.


genre_name,avg_pop
Foreign,0.686786794117647


### Question 12

Which award category has the highest number of actor nominations (actors can be male or female)? (Hint: `Oscars.name` contains both actors' names and film names.)

**Options:**

- Special Achievement Award
- Actor in a Supporting Role
- Actress in a Supporting Role
- Best Picture



In [14]:
%%sql

SELECT
    o.award,
    count(o.award) as highest_actor_nom
FROM
    oscars as o
JOIN
    actors as a on a.actor_name = o.name
GROUP BY
    o.award
ORDER BY
    highest_actor_nom desc
LIMIT 1;

 * sqlite:///TMDB.db
Done.


award,highest_actor_nom
Actor in a Supporting Role,356


### Question 13

For all of the entries in the Oscars table before 1934, the year is stored differently than in all the subsequent years. For example, the year would be saved as “1932/1933” instead of just “1933” (the second indicated year). Which of the following options would be the appropriate code to update this column so that the year format is consistent throughout the entire table (showing only the second indicated year)?


**Options:**

- `UPDATE Oscars SET year = RIGHT(year, -4)`
- `UPDATE Oscars SET year = SELECT substr(year, -4)`
- `UPDATE Oscars SET year = substr(year, -4)`
- `UPDATE Oscars year =  substr(year, 4)`

In [16]:
%%sql

CREATE TEMPORARY TABLE IF NOT EXISTS temp_oscars as
SELECT 
    * 
FROM 
    oscars;


UPDATE temp_oscars SET year = substr(year, -4);

SELECT 
    * 
FROM 
    temp_oscars 
WHERE year < '1934'
ORDER BY year DESC
LIMIT 5;

 * sqlite:///TMDB.db
Done.
9964 rows affected.
Done.


year,award,winner,name,film
1933,Actor,,Leslie Howard,Berkeley Square
1933,Actor,1.0,Charles Laughton,The Private Life of Henry VIII
1933,Actor,,Paul Muni,I Am a Fugitive from a Chain Gang
1933,Actress,1.0,Katharine Hepburn,Morning Glory
1933,Actress,,May Robson,Lady for a Day
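
With a negative start position, SQLite's `substr` counts from the end of the string, so `substr(year, -4)` keeps the last four characters. Because the `UPDATE` has no `WHERE` clause, it rewrites every row, including rows whose year is already four characters long, which would explain the "9964 rows affected" message. A small sketch on hypothetical rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_oscars (year TEXT, award TEXT)")
conn.executemany(
    "INSERT INTO temp_oscars VALUES (?, ?)",
    [
        ("1927/1928", "Actor"),
        ("1932/1933", "Actress"),
        ("1934", "Actor"),
        ("2015", "Actor in a Leading Role"),
    ],
)

# Keep only the last four characters of every year value.
cursor = conn.execute("UPDATE temp_oscars SET year = substr(year, -4)")
rows_touched = cursor.rowcount  # all rows, not only the "YYYY/YYYY" ones

years = [y for (y,) in conn.execute("SELECT year FROM temp_oscars ORDER BY rowid")]
print(rows_touched, years)
```

Four-character years pass through unchanged, so the statement is safe to run more than once.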


### Question 14

DStv will be having a special week dedicated to the actor Alan Rickman. Create a new _view_ that shows the titles, release dates, taglines, and overviews of all movies in which Alan Rickman has appeared.


In [17]:
%%sql

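-- IF EXISTS keeps the cell from failing when the view has not been created yet.
DROP VIEW IF EXISTS Alan_Rickman_Movies;
CREATE VIEW Alan_Rickman_Movies AS
SELECT
    m.title,
    m.release_date,
    m.tagline,
    m.overview
FROM
    movies AS m
-- The WHERE filter on actor_name discards unmatched rows, so a plain JOIN is equivalent here.
JOIN
    casts AS c ON c.movie_id = m.movie_id
JOIN
    actors AS a ON c.actor_id = a.actor_id
WHERE a.actor_name = 'Alan Rickman';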

SELECT
    *
FROM
    Alan_Rickman_Movies;

 * sqlite:///TMDB.db
Done.
Done.
Done.


title,release_date,tagline,overview
Love Actually,2003-09-07 00:00:00.000000,The ultimate romantic comedy.,Follows seemingly unrelated people as their lives begin to intertwine while they fall in – and out – of love. Affections languish and develop as Christmas draws near.
Die Hard,1988-07-15 00:00:00.000000,40 Stories. Twelve Terrorists. One Cop.,"NYPD cop, John McClane's plan to reconcile with his estranged wife is thrown for a serious loop when minutes after he arrives at her office, the entire building is overtaken by a group of terrorists. With little help from the LAPD, wisecracking McClane sets out to single-handedly rescue the hostages and bring the bad guys down."
Harry Potter and the Philosopher's Stone,2001-11-16 00:00:00.000000,Let the Magic Begin.,"Harry Potter has lived under the stairs at his aunt and uncle's house his whole life. But on his 11th birthday, he learns he's a powerful wizard -- with a place waiting for him at the Hogwarts School of Witchcraft and Wizardry. As he learns to harness his newfound powers with the help of the school's kindly headmaster, Harry uncovers the truth about his parents' deaths -- and about the villain who's to blame."
Harry Potter and the Chamber of Secrets,2002-11-13 00:00:00.000000,Hogwarts is back in session.,"Ignoring threats to his life, Harry returns to Hogwarts to investigate – aided by Ron and Hermione – a mysterious series of attacks."
Harry Potter and the Prisoner of Azkaban,2004-05-31 00:00:00.000000,Something wicked this way comes.,"Harry, Ron and Hermione return to Hogwarts for another magic-filled year. Harry comes face to face with danger yet again, this time in the form of escaped convict, Sirius Black – and turns to sympathetic Professor Lupin for help."
Harry Potter and the Goblet of Fire,2005-11-05 00:00:00.000000,Dark And Difficult Times Lie Ahead.,"Harry starts his fourth year at Hogwarts, competes in the treacherous Triwizard Tournament and faces the evil Lord Voldemort. Ron and Hermione help Harry manage the pressure – but Voldemort lurks, awaiting his chance to destroy Harry and all that he stands for."
Harry Potter and the Order of the Phoenix,2007-06-28 00:00:00.000000,Evil Must Be Confronted.,"Returning for his fifth year of study at Hogwarts, Harry is stunned to find that his warnings about the return of Lord Voldemort have been ignored. Left with no choice, Harry takes matters into his own hands, training a small group of students – dubbed 'Dumbledore's Army' – to defend themselves against the dark arts."
Harry Potter and the Half-Blood Prince,2009-07-07 00:00:00.000000,Dark Secrets Revealed,"As Harry begins his sixth year at Hogwarts, he discovers an old book marked as 'Property of the Half-Blood Prince', and begins to learn more about Lord Voldemort's dark past."
Galaxy Quest,1999-12-23 00:00:00.000000,A comedy of Galactic Proportions.,"The stars of a 1970s sci-fi show - now scraping a living through re-runs and sci-fi conventions - are beamed aboard an alien spacecraft. Believing the cast's heroic on-screen dramas are historical documents of real-life adventures, the band of aliens turn to the ailing celebrities for help in their quest to overcome the oppressive regime in their solar system."
Perfume: The Story of a Murderer,2006-09-13 00:00:00.000000,Based on the best-selling novel,"Jean-Baptiste Grenouille, born in the stench of 18th century Paris, develops a superior olfactory sense, which he uses to create the world's finest perfumes. However, his work takes a dark turn as he tries to preserve scents in the search for the ultimate perfume."
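
A `DROP VIEW` without `IF EXISTS` raises an error when the view does not exist, for example the first time the cell runs against a fresh database. The sketch below shows an idempotent variant on a hypothetical three-table schema in memory (only `title` is selected to keep it short, and `rebuild_view` is a name invented for this example). Running it twice succeeds both times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE movies (movie_id INTEGER, title TEXT);
    CREATE TABLE casts (movie_id INTEGER, actor_id INTEGER);
    CREATE TABLE actors (actor_id INTEGER, actor_name TEXT);
    INSERT INTO movies VALUES (1, 'Die Hard'), (2, 'Galaxy Quest'), (3, 'Metropolis');
    INSERT INTO actors VALUES (7, 'Alan Rickman'), (8, 'Bruce Willis');
    INSERT INTO casts VALUES (1, 7), (1, 8), (2, 7);
    """
)


def rebuild_view(conn):
    """Drop the view if present, then recreate it; safe to call repeatedly."""
    conn.executescript(
        """
        DROP VIEW IF EXISTS Alan_Rickman_Movies;
        CREATE VIEW Alan_Rickman_Movies AS
        SELECT m.title
        FROM movies AS m
        JOIN casts AS c ON c.movie_id = m.movie_id
        JOIN actors AS a ON a.actor_id = c.actor_id
        WHERE a.actor_name = 'Alan Rickman';
        """
    )


rebuild_view(conn)  # first run: no view exists yet
rebuild_view(conn)  # second run: the existing view is dropped and recreated

titles = sorted(t for (t,) in conn.execute("SELECT title FROM Alan_Rickman_Movies"))
print(titles)
```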
