# TMDb Movie Database Analysis

## Movie Database

The Movie Database (TMDb) supports 39 official languages, is used daily in over 180 countries, and dates back to 2008.


Below is an Entity Relationship Diagram (ERD) of the TMDb database:

![TMDB_ER_diagram.png](attachment:TMDB_ER_diagram.png)

According to the ER diagram above, the TMDb database consists of 12 tables containing information about movies, cast, genres and more.

## Loading the database

In [1]:
# Load the SQL extension, which provides the %sql and %%sql magic commands

%load_ext sql

In [2]:
# Connect to the SQLite database file

%%sql 

sqlite:///TMDB.db
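
The table count read off the ERD can also be checked from the database itself, since SQLite records every table in `sqlite_master`. The following is a minimal sketch using Python's built-in `sqlite3` module. It runs against an in-memory database with two stand-in tables so that it works without the file; connecting to `TMDB.db` instead is assumed to list the real tables.

```python
import sqlite3

# In-memory stand-in for TMDB.db (assumption: swap in sqlite3.connect("TMDB.db")
# to inspect the real file).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE genres (genre_id INTEGER PRIMARY KEY, genre_name TEXT);
""")

# sqlite_master holds one row per schema object; keep only the tables.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(len(table_names), table_names)
```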

## Database Analysis

### Question 1

Who won the Oscar for “Actor in a Leading Role” in 2015?

In [3]:
%%sql 

SELECT name
FROM oscars
WHERE winner = 1.0 AND award = 'Actor in a Leading Role' AND year = '2015'

 * sqlite:///TMDB.db
Done.


name
Leonardo DiCaprio


**Solution:** Leonardo DiCaprio won the Oscar for "Actor in a Leading Role" in 2015.

### Question 2

Write a query that returns the ten oldest movies in the TMDb database.

In [4]:
%%sql 

SELECT *
FROM movies
WHERE release_date IS NOT NULL
ORDER BY release_date ASC
LIMIT 10

 * sqlite:///TMDB.db
Done.


movie_id,title,release_date,budget,homepage,original_language,original_title,overview,popularity,revenue,runtime,release_status,tagline,vote_average,vote_count
3059,Intolerance,1916-09-04 00:00:00.000000,385907,,en,Intolerance,"The story of a poor young woman, separated by prejudice from her husband and baby, is interwoven with tales of intolerance from throughout history.",3.232447,8394751.0,197.0,Released,The Cruel Hand of Intolerance,7.4,60
3060,The Big Parade,1925-11-05 00:00:00.000000,245000,,en,The Big Parade,"The story of an idle rich boy who joins the US Army's Rainbow Division and is sent to France to fight in World War I, becomes friends with two working class men, experiences the horrors of trench warfare, and finds love with a French girl.",0.785744,22000000.0,151.0,Released,,7.0,21
19,Metropolis,1927-01-10 00:00:00.000000,92620000,,de,Metropolis,"In a futuristic city sharply divided between the working class and the city planners, the son of the city's mastermind falls in love with a working class prophet who predicts the coming of a savior to mediate their differences.",32.351527,650422.0,153.0,Released,There can be no understanding between the hands and the brain unless the heart acts as mediator.,8.0,657
905,Pandora's Box,1929-01-30 00:00:00.000000,0,,de,Die Bnchse der Pandora,The rise and inevitable fall of an amoral but naive young woman whose insouciant eroticism inspires lust and violence in those around her.,1.824184,0.0,109.0,Released,,7.6,45
65203,The Broadway Melody,1929-02-08 00:00:00.000000,379000,,en,The Broadway Melody,"Harriet and Queenie Mahoney, a vaudeville act, come to Broadway, where their friend Eddie Kerns needs them for his number in one of Francis Zanfield's shows. Eddie was in love with Harriet, but when he meets Queenie, he falls in love to her, but she is courted by Jock Warriner, a member of the New Yorker high society. It takes a while till Queenie recognizes, that she is for Jock nothing more than a toy, and it also takes a while till Harriet recognizes, that Eddie is in love with Queenie",0.968865,4358000.0,100.0,Released,The pulsating drama of Broadway's bared heart speaks and sings with a voice to stir your soul!,5.0,19
22301,Hell's Angels,1930-11-15 00:00:00.000000,3950000,,en,Hell's Angels,"Two brothers attending Oxford enlist with the Royal Flying Corps when World War I breaks out. Roy and Monte Rutledge have very different personalities. Monte is a freewheeling womanizer, even with his brother's girlfriend Helen. He also proves to have a yellow streak when it comes to his Night Patrol duties. Roy is made of strong moral fiber and attempts to keep his brother in line. Both volunteer for an extremely risky two man bombing mission for different reasons. Monte wants to lose his cowardly reputation and Roy seeks to protect his brother. Roy loves Helen; Helen enjoys an affair with Monte; before they leave on their mission over Germany they find her in still another man's arms. Their assignment to knock out a strategic German munitions facility is a booming success, but with a squadron of fighters bearing down on them afterwards, escape seems unlikely.",8.484123,8000000.0,127.0,Released,Howard Hughes' Thrilling Multi-Million Dollar Air Spectacle,6.1,19
22649,A Farewell to Arms,1932-12-08 00:00:00.000000,4,,en,A Farewell to Arms,"British nurse Catherine Barkley (Helen Hayes) and American Lieutenant Frederic Henry (Gary Cooper) fall in love during the First World War in Italy. Eventually separated by Frederic's transfer, tremendous challenges and difficult decisions face each, as the war rages on. Academy Awards winner for Best Cinematography and for Best Sound, Recording. Nominated for Best Picture and for Best Art Direction.",1.199451,25.0,89.0,Released,Every woman who has loved will understand,6.2,28
3062,42nd Street,1933-02-02 00:00:00.000000,439000,,en,42nd Street,"A producer puts on what may be his last Broadway show, and at the last moment a chorus girl has to replace the star.",1.933366,2281000.0,89.0,Released,,6.1,37
43595,She Done Him Wrong,1933-02-09 00:00:00.000000,200000,,en,She Done Him Wrong,"""New York singer and nightclub owner Lady Lou has more men friends than you can imagine. Unfortunately one of them is a vicious criminal who's escaped and is on the way to see """"his"""" girl, not realising she hasn't exactly been faithful in his absence. Help is at hand in the form of young Captain Cummings a local temperance league leader though.""",0.622752,2200000.0,66.0,Released,Mae West gives a 'Hot Time' to the nation!,5.1,27
3078,It Happened One Night,1934-02-22 00:00:00.000000,325000,,en,It Happened One Night,"Ellie Andrews has just tied the knot with society aviator King Westley when she is whisked away to her father's yacht and out of King's clutches. Ellie jumps ship and eventually winds up on a bus headed back to her husband. Reluctantly she must accept the help of out-of- work reporter Peter Warne. Actually, Warne doesn't give her any choice: either she sticks with him until he gets her back to her husband, or he'll blow the whistle on Ellie to her father. Either way, Peter gets what he wants... a really juicy newspaper story!",11.871424,4500000.0,105.0,Released,TOGETHER... for the first time,7.7,275


**Solution:** The query returns the 10 oldest movies with a known release date, ordered from oldest to newest.

### Question 3

How many unique awards are there in the Oscars table?

In [5]:
%%sql 

SELECT COUNT(DISTINCT(award))
FROM oscars

 * sqlite:///TMDB.db
Done.


COUNT(DISTINCT(award))
114


**Solution:** There are 114 unique awards in the Oscars table.

### Question 4

How many movies are there that contain the word “Spider” within their title?

In [6]:
%%sql 

SELECT COUNT(*)
FROM movies
WHERE title LIKE '%Spider%'

 * sqlite:///TMDB.db
Done.


COUNT(*)
9


**Solution:** There are 9 movies that have the word "Spider" in their title.
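
SQLite's `LIKE` is case-insensitive for ASCII letters by default, so a single pattern already matches "Spider", "spider" and "SPIDER". The sketch below illustrates this on a few made-up titles in an in-memory database; it is not run against `TMDB.db`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")
# Toy rows for illustration only.
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [("Spider Story",), ("Tale of the spider",), ("SPIDER",), ("Other Film",)],
)

# One pattern, relying on the default case-insensitive LIKE.
one_pattern = conn.execute(
    "SELECT COUNT(*) FROM movies WHERE title LIKE '%spider%'"
).fetchone()[0]

# Two patterns joined with OR match exactly the same rows.
two_patterns = conn.execute(
    "SELECT COUNT(*) FROM movies "
    "WHERE title LIKE '%Spider%' OR title LIKE '%spider%'"
).fetchone()[0]

print(one_pattern, two_patterns)
```

This behaviour changes if `PRAGMA case_sensitive_like` has been switched on for the connection.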

### Question 5

How many movies are there that are both in the "Thriller" Genre and contain the keyword “love”?

In [7]:
%%sql 

SELECT COUNT(*)
FROM genres, keywords
WHERE genre_name = "Thriller" AND keyword_name = "love"

 * sqlite:///TMDB.db
Done.


COUNT(*)
1


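**Solution:** The query returns 1, but this does not count movies. Selecting from `genres, keywords` pairs the rows of the two lookup tables, and there is exactly one "Thriller" row and one "love" row, so the result is one pair. Counting the movies that are both thrillers and tagged "love" requires joining through the movie-to-genre and movie-to-keyword mapping tables on `movie_id`.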
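
A join through the mapping tables could look like the sketch below. It runs on invented rows in an in-memory database. `genremap` appears elsewhere in this notebook, while `keywordmap` and its columns are assumed names modelled on it, so they should be checked against the ERD before use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genres (genre_id INTEGER, genre_name TEXT);
    CREATE TABLE keywords (keyword_id INTEGER, keyword_name TEXT);
    CREATE TABLE genremap (movie_id INTEGER, genre_id INTEGER);
    CREATE TABLE keywordmap (movie_id INTEGER, keyword_id INTEGER);  -- assumed name

    INSERT INTO genres VALUES (1, 'Thriller'), (2, 'Comedy');
    INSERT INTO keywords VALUES (10, 'love'), (11, 'heist');
    -- movies 100 and 101 are thrillers, 102 is a comedy
    INSERT INTO genremap VALUES (100, 1), (101, 1), (102, 2);
    -- all three movies carry 'love'; 101 also carries 'heist'
    INSERT INTO keywordmap VALUES (100, 10), (101, 10), (101, 11), (102, 10);
""")

# Pairing the lookup tables yields one row no matter how many movies match.
cross_join_count = conn.execute("""
    SELECT COUNT(*) FROM genres, keywords
    WHERE genre_name = 'Thriller' AND keyword_name = 'love'
""").fetchone()[0]

# Joining both mapping tables on movie_id counts actual movies.
movie_count = conn.execute("""
    SELECT COUNT(DISTINCT gm.movie_id)
    FROM genremap AS gm
    INNER JOIN genres AS g ON g.genre_id = gm.genre_id
    INNER JOIN keywordmap AS km ON km.movie_id = gm.movie_id
    INNER JOIN keywords AS k ON k.keyword_id = km.keyword_id
    WHERE g.genre_name = 'Thriller' AND k.keyword_name = 'love'
""").fetchone()[0]

print(cross_join_count, movie_count)
```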

### Question 6

How many movies are there that were released between 1 August 2006 ('2006-08-01') and 1 October 2009 ('2009-10-01') that have a popularity score of more than 40 and a budget of less than 50 000 000?

In [8]:
%%sql 

SELECT count(*)
FROM movies
WHERE release_date BETWEEN '2006-08-01' AND '2009-10-01'
AND popularity > 40
AND budget < 50000000

 * sqlite:///TMDB.db
Done.


count(*)
29


**Solution:** There are 29 movies released between 1 August 2006 and 1 October 2009 that have a popularity score of more than 40 and a budget of less than 50 000 000.
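
One subtlety is worth checking. The Question 2 output shows `release_date` values such as `1916-09-04 00:00:00.000000`. If these are stored as text (an assumption here), `BETWEEN` compares strings, and a timestamp on the upper boundary day sorts after the bare date `'2009-10-01'`. The sketch below shows the effect on made-up rows, with `date(release_date)` as one possible way to include the last day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
# Toy rows using the timestamp format seen in the notebook output.
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [
        ("first day", "2006-08-01 00:00:00.000000"),
        ("last day", "2009-10-01 00:00:00.000000"),
        ("too late", "2009-10-02 00:00:00.000000"),
    ],
)

# Plain text comparison: the 'last day' timestamp is greater than '2009-10-01'.
raw_text = [r[0] for r in conn.execute(
    "SELECT title FROM movies "
    "WHERE release_date BETWEEN '2006-08-01' AND '2009-10-01' "
    "ORDER BY release_date"
)]

# date() strips the time part, so both boundary days are included.
by_date = [r[0] for r in conn.execute(
    "SELECT title FROM movies "
    "WHERE date(release_date) BETWEEN '2006-08-01' AND '2009-10-01' "
    "ORDER BY release_date"
)]

print(raw_text, by_date)
```

Whether this affects the count of 29 depends on whether any qualifying movie was released on 1 October 2009 exactly.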

### Question 7

How many unique characters has "Vin Diesel" played so far in the database?

In [9]:
%%sql 

SELECT count(characters)
FROM casts c
INNER JOIN actors a
ON c.actor_id = a.actor_id
WHERE a.actor_name = "Vin Diesel"

 * sqlite:///TMDB.db
Done.


count(characters)
19


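**Solution:** The query returns 19, which is the number of cast entries for Vin Diesel. Because it uses `count(characters)` rather than `count(DISTINCT characters)`, a character played in several movies is counted once per movie, so 19 is an upper bound on the number of unique characters.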
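
The difference between counting cast rows and counting unique characters can be seen on a toy example. This is a sketch on invented rows in an in-memory database, reusing the `casts` and `actors` column names from the query above; the actor and characters are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actors (actor_id INTEGER, actor_name TEXT);
    CREATE TABLE casts (movie_id INTEGER, actor_id INTEGER, characters TEXT);
    INSERT INTO actors VALUES (1, 'Actor A');
    -- 'Hero' is played in two different movies
    INSERT INTO casts VALUES (100, 1, 'Hero'), (101, 1, 'Hero'), (102, 1, 'Villain');
""")

# COUNT counts every non-NULL cast row; COUNT(DISTINCT ...) collapses repeats.
all_roles, unique_roles = conn.execute("""
    SELECT COUNT(c.characters), COUNT(DISTINCT c.characters)
    FROM casts AS c
    INNER JOIN actors AS a ON c.actor_id = a.actor_id
    WHERE a.actor_name = 'Actor A'
""").fetchone()

print(all_roles, unique_roles)
```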

### Question 8

What are the Genres of the movie “The Royal Tenenbaums”?

In [10]:
%%sql 

SELECT m.title, g.genre_name
FROM genremap AS gm
INNER JOIN movies AS m
ON gm.movie_id = m.movie_id
INNER JOIN genres AS g
ON g.genre_id = gm.genre_id
WHERE m.title = 'The Royal Tenenbaums'

 * sqlite:///TMDB.db
Done.


title,genre_name
The Royal Tenenbaums,Drama
The Royal Tenenbaums,Comedy


**Solution:** The genres of "The Royal Tenenbaums" are Drama and Comedy.

### Question 9

What are the three production companies that have the highest movie popularity score on average, as recorded within the database?

In [12]:
%%sql 

SELECT pcs.production_company_name, avg(m.popularity)
FROM movies as m
INNER JOIN productioncompanymap as pcm
ON m.movie_id = pcm.movie_id
INNER JOIN productioncompanies pcs
ON pcm.production_company_id = pcs.production_company_id
GROUP BY pcs.production_company_name
ORDER BY avg(m.popularity) DESC
LIMIT 3

 * sqlite:///TMDB.db
Done.


production_company_name,avg(m.popularity)
The Donners' Company,514.569956
Bulletproof Cupid,481.098624
Kinberg Genre,326.92099900000005


**Solution:** The three production companies that have the highest movie popularity score on average are: "The Donners' Company", "Bulletproof Cupid" and "Kinberg Genre".

### Question 10

How many female actors (i.e. gender = 1) have a name that starts with the letter "N"?

In [13]:
%%sql

SELECT count(*)
FROM actors
WHERE gender = 1 AND actor_name LIKE 'N%'

 * sqlite:///TMDB.db
Done.


count(*)
355


**Solution:** There are 355 female actors whose name begins with the letter "N".

### Question 11

Which genre has, on average, the lowest movie popularity score?

In [14]:
%%sql

SELECT g.genre_name, CAST(ROUND(AVG(m.popularity),2) AS DEC(10,2)) avg_popularity_score
FROM genremap AS gm
INNER JOIN movies AS m
ON gm.movie_id = m.movie_id
INNER JOIN genres AS g
ON g.genre_id = gm.genre_id
GROUP BY g.genre_name
ORDER BY m.popularity

 * sqlite:///TMDB.db
Done.


genre_name,avg_popularity_score
Documentary,3.95
Foreign,0.69
TV Movie,6.39
Music,13.1
Comedy,18.22
Crime,22.85
History,17.44
War,23.78
Western,18.24
Horror,18.3


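**Solution:** Of the genres listed above, "Foreign" has the lowest average popularity score (0.69). The rows are not sorted by that average, because `ORDER BY m.popularity` sorts on a single movie's popularity from each group rather than on the group average, so the answer has to be read off by scanning the column.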
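
Sorting on the aggregate itself and keeping one row returns the lowest-scoring genre directly. Below is a minimal sketch on invented rows in an in-memory database, using the same table and column names as the query above; the genre names and scores are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INTEGER, popularity REAL);
    CREATE TABLE genres (genre_id INTEGER, genre_name TEXT);
    CREATE TABLE genremap (movie_id INTEGER, genre_id INTEGER);
    INSERT INTO movies VALUES (1, 10.0), (2, 30.0), (3, 1.0), (4, 2.0);
    INSERT INTO genres VALUES (1, 'Genre A'), (2, 'Genre B');
    -- Genre A averages 20.0, Genre B averages 1.5
    INSERT INTO genremap VALUES (1, 1), (2, 1), (3, 2), (4, 2);
""")

# ORDER BY the averaged column, ascending, then keep the first row only.
lowest = conn.execute("""
    SELECT g.genre_name, ROUND(AVG(m.popularity), 2) AS avg_popularity_score
    FROM genremap AS gm
    INNER JOIN movies AS m ON gm.movie_id = m.movie_id
    INNER JOIN genres AS g ON g.genre_id = gm.genre_id
    GROUP BY g.genre_name
    ORDER BY avg_popularity_score ASC
    LIMIT 1
""").fetchone()

print(lowest)
```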

### Question 12

Which award category has the highest number of actor nominations (actors can be male or female)? (Hint: `Oscars.name` contains both actors' names and film names.)

In [15]:
%%sql

SELECT count(o.name) as "Actor Nominations", o.award
FROM oscars as o
INNER JOIN actors as a
ON o.name = a.actor_name
GROUP BY o.award
ORDER BY count(o.name) DESC
LIMIT 4

 * sqlite:///TMDB.db
Done.


Actor Nominations,award
356,Actor in a Supporting Role
331,Actress in a Supporting Role
198,Actress in a Leading Role
197,Actor in a Leading Role


**Solution:** The award category that has the highest number of actor/actress nominations is "Actor in a Supporting Role".

### Question 13

For all entries in the Oscars table before 1934, the year is stored differently from the subsequent years. For example, the year is saved as “1932/1933” instead of just “1933” (the second indicated year).

Write a query that updates this column so that the year format is consistent throughout the table (showing only the second indicated year).

In [16]:
%%sql

UPDATE Oscars SET year = substr(year, -4)

 * sqlite:///TMDB.db
9964 rows affected.


[]

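**Solution:** The statement rewrote the `year` value of all 9964 rows in the table. Only values stored as "YYYY/YYYY" actually changed, because `substr(year, -4)` keeps the last four characters and therefore leaves four-digit years as they were.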
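
Adding a `WHERE` clause would restrict the update to the rows that need it. The sketch below shows this variant on made-up rows in an in-memory database; the `LIKE '%/%'` filter is an addition for illustration and was not part of the statement run above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oscars (year TEXT, award TEXT)")
# Toy rows: two in the old 'YYYY/YYYY' format, two already in 'YYYY' format.
conn.executemany(
    "INSERT INTO oscars VALUES (?, ?)",
    [
        ("1927/1928", "Award A"),
        ("1932/1933", "Award B"),
        ("1934", "Award C"),
        ("2015", "Award D"),
    ],
)

# substr(year, -4) keeps the last four characters; the filter skips clean rows.
cursor = conn.execute(
    "UPDATE oscars SET year = substr(year, -4) WHERE year LIKE '%/%'"
)
rows_changed = cursor.rowcount

years = [r[0] for r in conn.execute("SELECT year FROM oscars ORDER BY rowid")]
print(rows_changed, years)
```

With the filter, the reported row count reflects only the rows whose format was changed.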

### Question 14

DStv will be having a special week dedicated to the actor Alan Rickman.

Write a query that creates a new _view_ showing the titles, release dates, taglines, and overviews of all movies that Alan Rickman has appeared in.

In [17]:
%%sql

CREATE VIEW Alan_Rickman_Movies AS
SELECT m.title, m.release_date, m.tagline, m.overview
FROM movies AS m
INNER JOIN casts AS c
ON c.movie_id = m.movie_id
INNER JOIN actors AS a
ON c.actor_id = a.actor_id
WHERE a.actor_name = 'Alan Rickman'

 * sqlite:///TMDB.db
Done.


[]

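**Solution:** The statement created the view `Alan_Rickman_Movies`, which lists the title, release date, tagline and overview of every movie in which Alan Rickman appears.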
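
Once created, a view can be queried like any table. The following sketch builds the same kind of view on invented rows in an in-memory database and selects from it; the actor, movie and view names here are placeholders, not entries from `TMDB.db`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INTEGER, title TEXT, release_date TEXT,
                         tagline TEXT, overview TEXT);
    CREATE TABLE actors (actor_id INTEGER, actor_name TEXT);
    CREATE TABLE casts (movie_id INTEGER, actor_id INTEGER, characters TEXT);

    INSERT INTO movies VALUES
        (1, 'Film A', '2001-01-01', 'Tag A', 'Overview A'),
        (2, 'Film B', '2002-02-02', 'Tag B', 'Overview B');
    INSERT INTO actors VALUES (1, 'Actor A'), (2, 'Actor B');
    INSERT INTO casts VALUES (1, 1, 'Role X'), (2, 2, 'Role Y');

    -- Same shape as the view above, for a placeholder actor.
    CREATE VIEW Actor_A_Movies AS
    SELECT m.title, m.release_date, m.tagline, m.overview
    FROM movies AS m
    INNER JOIN casts AS c ON c.movie_id = m.movie_id
    INNER JOIN actors AS a ON c.actor_id = a.actor_id
    WHERE a.actor_name = 'Actor A';
""")

# The view is re-evaluated on each SELECT, so it always reflects current data.
titles = [
    r[0]
    for r in conn.execute("SELECT title FROM Actor_A_Movies ORDER BY release_date")
]
print(titles)
```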