# SQL Movie-Rating Query Exercises

You've started a new movie-rating website, and you've been collecting data on reviewers' ratings of various movies. There's not much data yet, but you can still try out some interesting queries. Here's the schema: 

Movie ( mID, title, year, director ) 
English: There is a movie with ID number _mID_, a _title_, a release _year_, and a _director_.

Reviewer ( rID, name ) 
English: The reviewer with ID number _rID_ has a certain _name_.

Rating ( rID, mID, stars, rating_date ) 
English: The reviewer _rID_ gave the movie _mID_ a rating of _stars_ (1-5) on a certain *rating\_date*.

Your queries will run over a small data set conforming to the schema. [View the database](https://lagunita.stanford.edu/c4x/DB/SQL/asset/moviedata.html). (You can also [download the schema and data](https://s3-us-west-2.amazonaws.com/prod-c2g/db/Winter2013/files/rating.sql).) 

**Notes:**
- I've renamed _ratingDate_ to *rating\_date*
- I'm using PostgreSQL
- I've combined the "regular" and "extra" questions

In [485]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [486]:
%%sql
postgresql://kbammarito@localhost:5432/kbammarito

'Connected: kbammarito@kbammarito'

## Create the schema for the tables

In [488]:
%%sql
create table movie(mID int, title text, year int, director text);
create table reviewer(rID int, name text);
create table rating(rID int, mID int, stars int, rating_date date);

Done.
Done.
Done.


[]

## Populate the tables with the data

In [489]:
%%sql
insert into movie values(101, 'Gone with the Wind', 1939, 'Victor Fleming');
insert into movie values(102, 'Star Wars', 1977, 'George Lucas');
insert into movie values(103, 'The Sound of Music', 1965, 'Robert Wise');
insert into movie values(104, 'E.T.', 1982, 'Steven Spielberg');
insert into movie values(105, 'Titanic', 1997, 'James Cameron');
insert into movie values(106, 'Snow White', 1937, null);
insert into movie values(107, 'Avatar', 2009, 'James Cameron');
insert into movie values(108, 'Raiders of the Lost Ark', 1981, 'Steven Spielberg');

insert into reviewer values(201, 'Sarah Martinez');
insert into reviewer values(202, 'Daniel Lewis');
insert into reviewer values(203, 'Brittany Harris');
insert into reviewer values(204, 'Mike Anderson');
insert into reviewer values(205, 'Chris Jackson');
insert into reviewer values(206, 'Elizabeth Thomas');
insert into reviewer values(207, 'James Cameron');
insert into reviewer values(208, 'Ashley White');

insert into rating values(201, 101, 2, '2011-01-22');
insert into rating values(201, 101, 4, '2011-01-27');
insert into rating values(202, 106, 4, null);
insert into rating values(203, 103, 2, '2011-01-20');
insert into rating values(203, 108, 4, '2011-01-12');
insert into rating values(203, 108, 2, '2011-01-30');
insert into rating values(204, 101, 3, '2011-01-09');
insert into rating values(205, 103, 3, '2011-01-27');
insert into rating values(205, 104, 2, '2011-01-22');
insert into rating values(205, 108, 4, null);
insert into rating values(206, 107, 3, '2011-01-15');
insert into rating values(206, 106, 5, '2011-01-19');
insert into rating values(207, 107, 5, '2011-01-20');
insert into rating values(208, 104, 3, '2011-01-02');

1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


[]

## View each table

In [490]:
%%sql
select * from movie;

8 rows affected.


mid,title,year,director
101,Gone with the Wind,1939,Victor Fleming
102,Star Wars,1977,George Lucas
103,The Sound of Music,1965,Robert Wise
104,E.T.,1982,Steven Spielberg
105,Titanic,1997,James Cameron
106,Snow White,1937,
107,Avatar,2009,James Cameron
108,Raiders of the Lost Ark,1981,Steven Spielberg


In [491]:
%%sql
select * from reviewer;

8 rows affected.


rid,name
201,Sarah Martinez
202,Daniel Lewis
203,Brittany Harris
204,Mike Anderson
205,Chris Jackson
206,Elizabeth Thomas
207,James Cameron
208,Ashley White


In [492]:
%%sql
select * from rating;

14 rows affected.


rid,mid,stars,rating_date
201,101,2,2011-01-22
201,101,4,2011-01-27
202,106,4,
203,103,2,2011-01-20
203,108,4,2011-01-12
203,108,2,2011-01-30
204,101,3,2011-01-09
205,103,3,2011-01-27
205,104,2,2011-01-22
205,108,4,


## Q1: Find the titles of all movies directed by Steven Spielberg.

In [493]:
%%sql
select title
from movie
where director = 'Steven Spielberg'
order by title;

2 rows affected.


title
E.T.
Raiders of the Lost Ark


## Q2: Find all years that have a movie that received a rating of 4 or 5, and sort them in increasing order.

In [494]:
%%sql
select distinct year
from movie join rating using(mid)
where stars >= 4
order by year;

4 rows affected.


year
1937
1939
1981
2009


## Q3: Find the titles of all movies that have no ratings.

In [495]:
%%sql
select title
from movie
where mid not in
(select mid
  from rating)
order by title;

2 rows affected.


title
Star Wars
Titanic
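
`not in` works here because `rating.mid` contains no NULLs; if the subquery ever returned a NULL, `not in` would match no rows at all. A `not exists` formulation avoids that. Below is a minimal sketch, assuming Python's built-in `sqlite3` with an in-memory database in place of this notebook's PostgreSQL connection, and a trimmed copy of the data (only the columns the query needs). The rating with a NULL `mID` is hypothetical and is not part of the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.executescript("""
create table movie(mID int, title text);
create table rating(rID int, mID int, stars int);
""")
conn.executemany("insert into movie values (?, ?)", [
    (101, "Gone with the Wind"), (102, "Star Wars"),
    (103, "The Sound of Music"), (104, "E.T."), (105, "Titanic"),
    (106, "Snow White"), (107, "Avatar"), (108, "Raiders of the Lost Ark"),
])
conn.executemany("insert into rating values (?, ?, ?)", [
    (201, 101, 2), (201, 101, 4), (202, 106, 4), (203, 103, 2),
    (203, 108, 4), (203, 108, 2), (204, 101, 3), (205, 103, 3),
    (205, 104, 2), (205, 108, 4), (206, 107, 3), (206, 106, 5),
    (207, 107, 5), (208, 104, 3),
])

NOT_EXISTS = """
select title from movie m
where not exists (select 1 from rating r where r.mID = m.mID)
order by title
"""
NOT_IN = """
select title from movie
where mID not in (select mID from rating)
order by title
"""

# On the original data both forms agree.
unrated = [title for (title,) in conn.execute(NOT_EXISTS)]
print(unrated)

# Hypothetical row (not in the original data): a rating whose mID is NULL.
conn.execute("insert into rating values (209, null, 3)")
not_in_after = [title for (title,) in conn.execute(NOT_IN)]
not_exists_after = [title for (title,) in conn.execute(NOT_EXISTS)]
print(not_in_after, not_exists_after)
```

With the NULL present, the `not in` query returns nothing, while `not exists` still reports the two unrated movies.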


## Q4: Some reviewers didn't provide a date with their rating. Find the names of all reviewers who have ratings with a NULL value for the date.

In [496]:
%%sql
select name
from rating join reviewer using(rid)
where rating_date is null
order by name;

2 rows affected.


name
Chris Jackson
Daniel Lewis


## Q5: Write a query to return the ratings data in a more readable format: reviewer name, movie title, stars, and *rating\_date*. Also, sort the data, first by reviewer name, then by movie title, and lastly by number of stars.

In [497]:
%%sql
select name, title, stars, rating_date
from (rating join reviewer using(rid)) join movie using(mid)
order by name, title, stars;

14 rows affected.


name,title,stars,rating_date
Ashley White,E.T.,3,2011-01-02
Brittany Harris,Raiders of the Lost Ark,2,2011-01-30
Brittany Harris,Raiders of the Lost Ark,4,2011-01-12
Brittany Harris,The Sound of Music,2,2011-01-20
Chris Jackson,E.T.,2,2011-01-22
Chris Jackson,Raiders of the Lost Ark,4,
Chris Jackson,The Sound of Music,3,2011-01-27
Daniel Lewis,Snow White,4,
Elizabeth Thomas,Avatar,3,2011-01-15
Elizabeth Thomas,Snow White,5,2011-01-19


## Q6: For all cases where the same reviewer rated the same movie twice and gave it a higher rating the second time, return the reviewer's name and the title of the movie.

In [498]:
%%sql
select S2.name, S2.title
from
(select name, title, stars, rating_date
  from (movie join rating using(mid)) join reviewer using(rid)) S1,
(select name, title, stars, rating_date
  from (movie join rating using(mid)) join reviewer using(rid)) S2
where S2.name = S1.name
and S2.title = S1.title
and S2.rating_date > S1.rating_date
and S2.stars > S1.stars
order by name;

1 rows affected.


name,title
Sarah Martinez,Gone with the Wind
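
Matching the two subqueries on `name` and `title` assumes that reviewer names and movie titles are unique. Joining `rating` to itself on `rid` and `mid` avoids that assumption and looks up the name and title only at the end. A minimal sketch, assuming Python's built-in `sqlite3` with an in-memory database instead of PostgreSQL; the aliases `r1`, `r2`, `rv`, and `m` are illustrative, and dates are stored as ISO text so that string comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.executescript("""
create table movie(mID int, title text);
create table reviewer(rID int, name text);
create table rating(rID int, mID int, stars int, rating_date text);
""")
conn.executemany("insert into movie values (?, ?)", [
    (101, "Gone with the Wind"), (102, "Star Wars"),
    (103, "The Sound of Music"), (104, "E.T."), (105, "Titanic"),
    (106, "Snow White"), (107, "Avatar"), (108, "Raiders of the Lost Ark"),
])
conn.executemany("insert into reviewer values (?, ?)", [
    (201, "Sarah Martinez"), (202, "Daniel Lewis"), (203, "Brittany Harris"),
    (204, "Mike Anderson"), (205, "Chris Jackson"), (206, "Elizabeth Thomas"),
    (207, "James Cameron"), (208, "Ashley White"),
])
conn.executemany("insert into rating values (?, ?, ?, ?)", [
    (201, 101, 2, "2011-01-22"), (201, 101, 4, "2011-01-27"),
    (202, 106, 4, None), (203, 103, 2, "2011-01-20"),
    (203, 108, 4, "2011-01-12"), (203, 108, 2, "2011-01-30"),
    (204, 101, 3, "2011-01-09"), (205, 103, 3, "2011-01-27"),
    (205, 104, 2, "2011-01-22"), (205, 108, 4, None),
    (206, 107, 3, "2011-01-15"), (206, 106, 5, "2011-01-19"),
    (207, 107, 5, "2011-01-20"), (208, 104, 3, "2011-01-02"),
])

# r1 is the earlier rating, r2 the later and higher one by the same
# reviewer for the same movie. Ratings with a NULL date never match.
improved = conn.execute("""
select distinct rv.name, m.title
from rating r1
join rating r2 on r2.rID = r1.rID and r2.mID = r1.mID
join reviewer rv on rv.rID = r1.rID
join movie m on m.mID = r1.mID
where r2.rating_date > r1.rating_date
  and r2.stars > r1.stars
order by rv.name, m.title
""").fetchall()
print(improved)
```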


## Q7: For each movie that has at least one rating, find the highest number of stars that movie received. Return the movie title and number of stars. Sort by movie title.

In [499]:
%%sql
select title, max(stars)
from movie join rating using(mid)
group by title
order by title;

6 rows affected.


title,max
Avatar,5
E.T.,3
Gone with the Wind,4
Raiders of the Lost Ark,4
Snow White,5
The Sound of Music,3


## Q8: For each movie, return the title and the 'rating spread', that is, the difference between highest and lowest ratings given to that movie. Sort by rating spread from highest to lowest, then by movie title.

In [500]:
%%sql
select title, (max(stars) - min(stars)) as rating_spread
from rating join movie using(mid)
group by title
order by rating_spread desc, title;

6 rows affected.


title,rating_spread
Avatar,2
Gone with the Wind,2
Raiders of the Lost Ark,2
E.T.,1
Snow White,1
The Sound of Music,1


## Q9: Find the difference between the average rating of movies released before 1980 and the average rating of movies released after 1980. (Make sure to calculate the average rating for each movie, then the average of those averages for movies before 1980 and movies after. Don't just calculate the overall average rating before and after 1980.)

In [501]:
%%sql
select abs(avg(after1980.after) - avg(before1980.before)) as difference
from
(select avg(stars) as after
  from movie join rating using(mid)
  where year > 1980
  group by title) as after1980,
(select avg(stars) as before
  from movie join rating using(mid)
  where year < 1980
  group by title) as before1980;

1 rows affected.


difference
0.0555555555555555
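
The distinction the question draws matters on this data. Pooling all ratings on each side of 1980 gives two identical averages (23 stars over 7 ratings on each side), so the pooled difference is zero, while averaging per movie first gives the value shown above. A minimal sketch, assuming Python's built-in `sqlite3` with an in-memory database instead of PostgreSQL and a trimmed copy of the data; the names `per_movie` and `pooled` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.executescript("""
create table movie(mID int, title text, year int);
create table rating(rID int, mID int, stars int);
""")
conn.executemany("insert into movie values (?, ?, ?)", [
    (101, "Gone with the Wind", 1939), (102, "Star Wars", 1977),
    (103, "The Sound of Music", 1965), (104, "E.T.", 1982),
    (105, "Titanic", 1997), (106, "Snow White", 1937),
    (107, "Avatar", 2009), (108, "Raiders of the Lost Ark", 1981),
])
conn.executemany("insert into rating values (?, ?, ?)", [
    (201, 101, 2), (201, 101, 4), (202, 106, 4), (203, 103, 2),
    (203, 108, 4), (203, 108, 2), (204, 101, 3), (205, 103, 3),
    (205, 104, 2), (205, 108, 4), (206, 107, 3), (206, 106, 5),
    (207, 107, 5), (208, 104, 3),
])

# Average per movie first, then average those averages on each side of 1980.
per_movie = conn.execute("""
select avg(case when year < 1980 then movie_avg end)
     - avg(case when year > 1980 then movie_avg end)
from (select m.mID, m.year, avg(r.stars) as movie_avg
      from movie m join rating r on r.mID = m.mID
      group by m.mID, m.year) as t
""").fetchone()[0]

# Pooled version: every rating weighted equally (what the question warns against).
pooled = conn.execute("""
select avg(case when year < 1980 then stars end)
     - avg(case when year > 1980 then stars end)
from movie m join rating r on r.mID = m.mID
""").fetchone()[0]

print(per_movie, pooled)
```

This sketch reports before minus after, which is positive here; the notebook query wraps the difference in `abs`, so the magnitude is the same.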


## Q10: Find the names of all reviewers who rated _Gone with the Wind_.

In [502]:
%%sql
select distinct name
from (reviewer join rating using(rid)) join movie using(mid)
where title = 'Gone with the Wind'
order by name;

2 rows affected.


name
Mike Anderson
Sarah Martinez


## Q11: For any rating where the reviewer is the same as the director of the movie, return the reviewer name, movie title, and number of stars.

In [503]:
%%sql
select name, title, stars
from (reviewer join rating using(rid)) join movie using(mid)
where name = director
order by name;

1 rows affected.


name,title,stars
James Cameron,Avatar,5


## Q12: Return all reviewer names and movie names together in a single list, alphabetized. (Sorting by the first name of the reviewer and first word in the title is fine; no need for special processing on last names or removing "The".)

In [504]:
%%sql
select name as all_names
from
(select name
  from reviewer
  union
  select title
  from movie) as X
order by name;

16 rows affected.


all_names
Ashley White
Avatar
Brittany Harris
Chris Jackson
Daniel Lewis
E.T.
Elizabeth Thomas
Gone with the Wind
James Cameron
Mike Anderson


## Q13: Find the titles of all movies not reviewed by Chris Jackson.

In [505]:
%%sql
select title
from movie
where title not in
(select title
  from (reviewer join rating using(rid)) join movie using(mid)
  where name = 'Chris Jackson')
order by title;

5 rows affected.


title
Avatar
Gone with the Wind
Snow White
Star Wars
Titanic


## Q14: For all pairs of reviewers such that both reviewers gave a rating to the same movie, return the names of both reviewers. Eliminate duplicates, don't pair reviewers with themselves, and include each pair only once. For each pair, return the names in the pair in alphabetical order.

In [506]:
%%sql
select distinct S1.name as reviewer1, S2.name as reviewer2
from ((reviewer join rating using(rid)) join movie using(mid)) S1,
((reviewer join rating using(rid)) join movie using(mid)) S2
where S1.title = S2.title and S1.name < S2.name
order by S1.name, S2.name;

5 rows affected.


reviewer1,reviewer2
Ashley White,Chris Jackson
Brittany Harris,Chris Jackson
Daniel Lewis,Elizabeth Thomas
Elizabeth Thomas,James Cameron
Mike Anderson,Sarah Martinez


## Q15: For each rating that is the lowest (fewest stars) currently in the database, return the reviewer name, movie title, and number of stars.

In [507]:
%%sql
select name, title, stars
from
(select name, title, stars
  from (reviewer join rating using(rid)) join movie using(mid)) X
where stars <= all (select stars from rating)
order by name, title;

4 rows affected.


name,title,stars
Brittany Harris,Raiders of the Lost Ark,2
Brittany Harris,The Sound of Music,2
Chris Jackson,E.T.,2
Sarah Martinez,Gone with the Wind,2


## Q16: List movie titles and average ratings, from highest-rated to lowest-rated. If two or more movies have the same average rating, list them in alphabetical order.

In [508]:
%%sql
select title, avg(stars) as average_rating
from movie join rating using(mid)
group by title
order by average_rating desc, title;

6 rows affected.


title,average_rating
Snow White,4.5
Avatar,4.0
Raiders of the Lost Ark,3.333333333333333
Gone with the Wind,3.0
E.T.,2.5
The Sound of Music,2.5


## Q17: Find the names of all reviewers who have contributed three or more ratings.

In [509]:
%%sql
select name
from reviewer join rating using(rid)
group by name
having count(*) >= 3
order by name;

2 rows affected.


name
Brittany Harris
Chris Jackson


## Q18: Some directors directed more than one movie. For all such directors, return the titles of all movies directed by them, along with the director name. Sort by director name, then movie title.

In [510]:
%%sql
select title, director
from
(select director
  from movie
  group by director
  having count(*) >=2) X
join movie using(director)
order by director, title;

4 rows affected.


title,director
Avatar,James Cameron
Titanic,James Cameron
E.T.,Steven Spielberg
Raiders of the Lost Ark,Steven Spielberg


## Q19: Find the movie(s) with the highest average rating. Return the movie title(s) and average rating.

In [511]:
%%sql
select title, average_rating
from (select title, avg(stars) as average_rating
  from movie join rating using(mid)
  group by title) R1
where average_rating in
(select max(average_rating)
  from (select title, avg(stars) as average_rating
    from movie join rating using(mid)
    group by title) R2)
order by title;

1 rows affected.


title,average_rating
Snow White,4.5


## Q20: Find the movie(s) with the lowest average rating. Return the movie title(s) and average rating.

In [512]:
%%sql
select title, average_rating
from (select title, avg(stars) as average_rating
  from movie join rating using(mid)
  group by title) Y
where average_rating in
(select min(average_rating)
  from (select title, avg(stars) as average_rating
    from movie join rating using(mid)
    group by title) X)
order by title;

2 rows affected.


title,average_rating
E.T.,2.5
The Sound of Music,2.5
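
Q19 and Q20 each write the per-movie average subquery twice. A common table expression (`with`) names it once and reuses it. A minimal sketch, assuming Python's built-in `sqlite3` with an in-memory database instead of PostgreSQL and a trimmed copy of the data; the helper `extreme` and the CTE name `movie_avg` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.executescript("""
create table movie(mID int, title text);
create table rating(rID int, mID int, stars int);
""")
conn.executemany("insert into movie values (?, ?)", [
    (101, "Gone with the Wind"), (102, "Star Wars"),
    (103, "The Sound of Music"), (104, "E.T."), (105, "Titanic"),
    (106, "Snow White"), (107, "Avatar"), (108, "Raiders of the Lost Ark"),
])
conn.executemany("insert into rating values (?, ?, ?)", [
    (201, 101, 2), (201, 101, 4), (202, 106, 4), (203, 103, 2),
    (203, 108, 4), (203, 108, 2), (204, 101, 3), (205, 103, 3),
    (205, 104, 2), (205, 108, 4), (206, 107, 3), (206, 106, 5),
    (207, 107, 5), (208, 104, 3),
])


def extreme(agg):
    """Return the movie(s) whose average rating is the max or min."""
    # Only these two fixed keywords are ever interpolated into the SQL text.
    if agg not in ("max", "min"):
        raise ValueError(agg)
    return conn.execute(f"""
        with movie_avg as (
            select title, avg(stars) as average_rating
            from movie join rating using(mID)
            group by title)
        select title, average_rating
        from movie_avg
        where average_rating = (select {agg}(average_rating) from movie_avg)
        order by title
    """).fetchall()


highest = extreme("max")
lowest = extreme("min")
print(highest, lowest)
```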


## Q21: For each director, return the director's name together with the title(s) of the movie(s) they directed that received the highest rating among all of their movies, and the value of that rating. Ignore movies whose director is NULL.

In [513]:
%%sql
select director, title, highest_rating
from
((select director, max(stars) as highest_rating
  from movie join rating using(mid)
  where director is not null
  group by director) X
join
(select director, title, max(stars) as highest_rating
  from movie join rating using(mid)
  where director is not null
  group by director, title) Y using(director, highest_rating))
order by director;

4 rows affected.


director,title,highest_rating
James Cameron,Avatar,5
Robert Wise,The Sound of Music,3
Steven Spielberg,Raiders of the Lost Ark,4
Victor Fleming,Gone with the Wind,4
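
The query above joins two aggregates on `director` and `highest_rating`. The same condition, "this rating equals the maximum across all of the director's movies", can also be written as a correlated subquery. A minimal sketch, assuming Python's built-in `sqlite3` with an in-memory database instead of PostgreSQL and a trimmed copy of the data; the aliases `m`, `r`, `m2`, and `r2` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.executescript("""
create table movie(mID int, title text, director text);
create table rating(rID int, mID int, stars int);
""")
conn.executemany("insert into movie values (?, ?, ?)", [
    (101, "Gone with the Wind", "Victor Fleming"),
    (102, "Star Wars", "George Lucas"),
    (103, "The Sound of Music", "Robert Wise"),
    (104, "E.T.", "Steven Spielberg"),
    (105, "Titanic", "James Cameron"),
    (106, "Snow White", None),
    (107, "Avatar", "James Cameron"),
    (108, "Raiders of the Lost Ark", "Steven Spielberg"),
])
conn.executemany("insert into rating values (?, ?, ?)", [
    (201, 101, 2), (201, 101, 4), (202, 106, 4), (203, 103, 2),
    (203, 108, 4), (203, 108, 2), (204, 101, 3), (205, 103, 3),
    (205, 104, 2), (205, 108, 4), (206, 107, 3), (206, 106, 5),
    (207, 107, 5), (208, 104, 3),
])

# Keep a (director, title, stars) row only when stars equals the highest
# rating found across all movies by that director. `distinct` collapses
# repeated top ratings of the same movie.
best = conn.execute("""
select distinct m.director, m.title, r.stars
from movie m join rating r on r.mID = m.mID
where m.director is not null
  and r.stars = (select max(r2.stars)
                 from movie m2 join rating r2 on r2.mID = m2.mID
                 where m2.director = m.director)
order by m.director, m.title
""").fetchall()
for row in best:
    print(row)
```

Directors whose movies have no ratings do not appear, which matches the notebook output.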


## Q22: Add the reviewer Roger Ebert to your database, with an rID of 209.

In [514]:
%%sql
select * from reviewer;

8 rows affected.


rid,name
201,Sarah Martinez
202,Daniel Lewis
203,Brittany Harris
204,Mike Anderson
205,Chris Jackson
206,Elizabeth Thomas
207,James Cameron
208,Ashley White


In [515]:
%%sql
insert into reviewer
    values (209, 'Roger Ebert');

1 rows affected.


[]

In [516]:
%%sql
select * from reviewer;

9 rows affected.


rid,name
201,Sarah Martinez
202,Daniel Lewis
203,Brittany Harris
204,Mike Anderson
205,Chris Jackson
206,Elizabeth Thomas
207,James Cameron
208,Ashley White
209,Roger Ebert


## Q23: Insert 5-star ratings by James Cameron for all movies in the database. Leave the review date as NULL.

### James Cameron's rid is 207, and there are eight movies, so eight ratings to add

In [517]:
%%sql
select * from rating
where rid = 207;

1 rows affected.


rid,mid,stars,rating_date
207,107,5,2011-01-20


In [518]:
%%sql
insert into rating
    select
    (select rid
        from reviewer
        where name = 'James Cameron'), mID, 5 , null
    from
    (select mID
        from movie) a;

8 rows affected.


[]

In [519]:
%%sql
select * from rating
where rid = 207;

9 rows affected.


rid,mid,stars,rating_date
207,107,5,2011-01-20
207,101,5,
207,102,5,
207,103,5,
207,104,5,
207,105,5,
207,106,5,
207,107,5,
207,108,5,


## Q24: For all movies that have an average rating of 4 stars or higher, add 25 to the release year. (Update the existing tuples; don't insert new tuples.)

In [520]:
%%sql
select *
from movie
where mid in
(select mid
    from
    (select mid, avg(stars) as "4+"
        from movie join rating using(mid)
        group by mid) a
    where "4+" >= 4)
order by mid;

4 rows affected.


mid,title,year,director
102,Star Wars,1977,George Lucas
105,Titanic,1997,James Cameron
106,Snow White,1937,
107,Avatar,2009,James Cameron


In [521]:
%%sql
update movie
    set year = year + 25
    where mid in
    (select mid
        from movie
        where mid in
        (select mid
            from
            (select mid, avg(stars) as "4+"
                from movie join rating using(mid)
                group by mid) a
            where "4+" >= 4));

4 rows affected.


[]

In [522]:
%%sql
select *
from movie
where mid in
(select mid
    from
    (select mid, avg(stars) as "4+"
        from movie join rating using(mid)
        group by mid) a
    where "4+" >= 4)
order by mid;

4 rows affected.


mid,title,year,director
102,Star Wars,2002,George Lucas
105,Titanic,2022,James Cameron
106,Snow White,1962,
107,Avatar,2034,James Cameron
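
The filter "average rating of 4 or higher" can also be expressed with `group by ... having` directly on `rating`, without the nested subqueries. A minimal sketch, assuming Python's built-in `sqlite3` with an in-memory database instead of PostgreSQL; the data reproduces the state after Q23 (the original 14 ratings plus James Cameron's eight 5-star ratings), trimmed to the columns the update needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.executescript("""
create table movie(mID int, title text, year int);
create table rating(rID int, mID int, stars int);
""")
conn.executemany("insert into movie values (?, ?, ?)", [
    (101, "Gone with the Wind", 1939), (102, "Star Wars", 1977),
    (103, "The Sound of Music", 1965), (104, "E.T.", 1982),
    (105, "Titanic", 1997), (106, "Snow White", 1937),
    (107, "Avatar", 2009), (108, "Raiders of the Lost Ark", 1981),
])
original = [
    (201, 101, 2), (201, 101, 4), (202, 106, 4), (203, 103, 2),
    (203, 108, 4), (203, 108, 2), (204, 101, 3), (205, 103, 3),
    (205, 104, 2), (205, 108, 4), (206, 107, 3), (206, 106, 5),
    (207, 107, 5), (208, 104, 3),
]
# State after Q23: reviewer 207 gave every movie 5 stars.
q23 = [(207, mid, 5) for mid in range(101, 109)]
conn.executemany("insert into rating values (?, ?, ?)", original + q23)

# `having` filters the groups by their average, so one subquery is enough.
cur = conn.execute("""
update movie
set year = year + 25
where mID in (select mID
              from rating
              group by mID
              having avg(stars) >= 4)
""")
updated = cur.rowcount
years = dict(conn.execute("select mID, year from movie"))
print(updated, years)
```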


## Q25: Remove all ratings where the movie's year is before 1970 or after 2000, and the rating is fewer than 4 stars.

### First, undo the Q24 update: subtracting 25 from the same movies restores the original release years

In [523]:
%%sql
update movie
    set year = year - 25
    where mid in
    (select mid
        from movie
        where mid in
        (select mid
            from
            (select mid, avg(stars) as "4+"
                from movie join rating using(mid)
                group by mid) a
            where "4+" >= 4));

4 rows affected.


[]

In [524]:
%%sql
select rID, rating.mID, title, year, stars
from rating join movie on (rating.mID = movie.mID)
order by year, title, stars;

22 rows affected.


rid,mid,title,year,stars
202,106,Snow White,1937,4
206,106,Snow White,1937,5
207,106,Snow White,1937,5
201,101,Gone with the Wind,1939,2
204,101,Gone with the Wind,1939,3
201,101,Gone with the Wind,1939,4
207,101,Gone with the Wind,1939,5
203,103,The Sound of Music,1965,2
205,103,The Sound of Music,1965,3
207,103,The Sound of Music,1965,5


In [525]:
%%sql
select *
from movie
where mid in
(select mid
    from
    (select mid, avg(stars) as "4+"
        from movie join rating using(mid)
        group by mid) a
    where "4+" >= 4)
order by mid;

4 rows affected.


mid,title,year,director
102,Star Wars,1977,George Lucas
105,Titanic,1997,James Cameron
106,Snow White,1937,
107,Avatar,2009,James Cameron


In [526]:
%%sql
delete from rating
    where mid in
    (select mid
        from movie
        where year < 1970 or year > 2000) and stars < 4;

5 rows affected.


[]

In [527]:
%%sql
select rID, rating.mID, title, year, stars
from rating join movie on (rating.mID = movie.mID)
order by year, title, stars;

17 rows affected.


rid,mid,title,year,stars
202,106,Snow White,1937,4
207,106,Snow White,1937,5
206,106,Snow White,1937,5
201,101,Gone with the Wind,1939,4
207,101,Gone with the Wind,1939,5
207,103,The Sound of Music,1965,5
207,102,Star Wars,1977,5
203,108,Raiders of the Lost Ark,1981,2
205,108,Raiders of the Lost Ark,1981,4
203,108,Raiders of the Lost Ark,1981,4
