# Lab | SQL Queries 8

In this lab, you will be using the [Sakila](https://dev.mysql.com/doc/sakila/en/) database of movie rentals. You have been using this database for a couple of labs already, but if you need to get the data again, refer to the official [installation link](https://dev.mysql.com/doc/sakila/en/sakila-installation.html).

The database is structured as follows:
![DB schema](https://education-team-2020.s3-eu-west-1.amazonaws.com/data-analytics/database-sakila-schema.png)

### Instructions

1. Rank films by length (filter out the rows with nulls or zeros in the `length` column). Select only the columns title, length and rank in your output.
2. Rank films by length within the `rating` category (filter out the rows with nulls or zeros in the `length` column). In your output, only select the columns title, length, rating and rank.  
3. How many films are there for each of the categories in the category table? **Hint**: Use an appropriate join between the tables "category" and "film_category".
4. Which actor has appeared in the most films? **Hint**: You can create a join between the tables "actor" and "film_actor" and count the number of times an actor appears.
5. Which is the most active customer (the customer that has rented the most films)? **Hint**: Use an appropriate join between the tables "customer" and "rental" and count the `rental_id` for each customer.

**Bonus**: Which is the most rented film? (The answer is Bucket Brotherhood).

This query might require using more than one join statement. Give it a try. We will talk about queries with multiple join statements later in the lessons.

**Hint**: You can use a join between three tables - "Film", "Inventory", and "Rental" - and count the *rental ids* for each film.
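The chained join described in the hint can be sketched on toy data. This is a minimal illustration, not the lab solution: it uses Python's built-in `sqlite3` with an in-memory database instead of MySQL, and the rows (`FILM A`, `FILM B`, and all ids) are made up for the example. Only the table and key column names follow the Sakila schema.

```python
import sqlite3

# Toy stand-in for three Sakila tables (hypothetical rows, not real Sakila data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table film (film_id integer primary key, title text);
    create table inventory (inventory_id integer primary key, film_id integer);
    create table rental (rental_id integer primary key, inventory_id integer);
    insert into film values (1, 'FILM A'), (2, 'FILM B');
    insert into inventory values (10, 1), (11, 1), (20, 2);
    insert into rental values (100, 10), (101, 11), (102, 11), (103, 20);
""")

# film -> inventory -> rental: each rental row is one rental of one copy of a film.
most_rented = conn.execute("""
    select f.title, count(r.rental_id) as rentals
    from film as f
    join inventory as i on i.film_id = f.film_id
    join rental as r on r.inventory_id = i.inventory_id
    group by f.film_id, f.title
    order by rentals desc
    limit 1
""").fetchall()

print(most_rented)
```

The same query shape should carry over to MySQL by prefixing the tables with `sakila.`, since a film is linked to its rentals only through its inventory copies.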

# Your solution here:

In [2]:
import pymysql
from sqlalchemy import create_engine
import pandas as pd
import getpass  # To get the password without showing the input

##### Prepare SQL connection

In [3]:
password = getpass.getpass()
connection_string = 'mysql+pymysql://root:' + password + '@localhost/bank'
engine = create_engine(connection_string)
%load_ext sql
%sql {connection_string}

 ·············


'Connected: root@bank'
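One caveat with building the connection string by plain concatenation: a password containing characters such as `@` or `/` would break the URL. A small sketch of one way around this is to URL-escape the password first. The helper name `build_connection_string` and its default arguments are hypothetical and only for illustration; the default database is set to `sakila` here because that is the schema the queries below read from.

```python
from urllib.parse import quote_plus


def build_connection_string(password, user="root", host="localhost", database="sakila"):
    """Build a SQLAlchemy-style MySQL URL with the password URL-escaped.

    Hypothetical helper for illustration; user, host and database are assumptions.
    """
    return f"mysql+pymysql://{user}:{quote_plus(password)}@{host}/{database}"


# Special characters in the password are percent-encoded instead of breaking the URL.
example = build_connection_string("p@ss/word")
print(example)
```

The resulting string can be passed to `create_engine` in the same way as the hand-built one above.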

##### 1. Rank films by length (filter out the rows with nulls or zeros in the `length` column). Select only the columns title, length and rank in your output.


In [19]:
%%sql ranked_length <<
select title, length, rank() over (order by length desc) as `rank`
from sakila.film
where length is not null and length > 0;

 * mysql+pymysql://root:***@localhost/bank
1000 rows affected.
Returning data to local variable ranked_length
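To see how `rank()` treats ties, and what the null/zero filter removes, here is a minimal sketch on toy data. It assumes Python's built-in `sqlite3` linked against SQLite 3.25 or newer (needed for window functions) rather than MySQL, and the five rows are invented for the example. `dense_rank()` is shown next to `rank()` only for comparison; the task itself asks for `rank()`.

```python
import sqlite3

# Toy film table (hypothetical rows): two films tie at 120, one has length 0, one is null.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table film (title text, length integer);
    insert into film values ('A', 120), ('B', 120), ('C', 90), ('D', 0), ('E', null);
""")

rows = conn.execute("""
    select title, length,
           rank() over (order by length desc) as rnk,
           dense_rank() over (order by length desc) as dense_rnk
    from film
    where length is not null and length > 0
    order by rnk, title
""").fetchall()

for row in rows:
    print(row)
```

Films `A` and `B` share rank 1, `C` gets rank 3 under `rank()` (a gap after the tie) but 2 under `dense_rank()`, and `D` and `E` are filtered out.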


##### 2. Rank films by length within the `rating` category (filter out the rows with nulls or zeros in the `length` column). In your output, only select the columns title, length, rating and rank.  


In [22]:
%%sql rank_by_length_and_rating <<
select title, length, rating, rank() over (partition by rating order by length desc) as `rank`
from sakila.film
where length is not null and length > 0;

 * mysql+pymysql://root:***@localhost/bank
1000 rows affected.
Returning data to local variable rank_by_length_and_rating
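The effect of `partition by` is easiest to see on a handful of rows: the ranking restarts at 1 inside each rating. The sketch below again assumes the built-in `sqlite3` module with SQLite 3.25+ as a stand-in for MySQL, and the rows are made up for illustration.

```python
import sqlite3

# Toy film table (hypothetical rows) with two rating groups.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table film (title text, rating text, length integer);
    insert into film values
        ('A', 'G', 100), ('B', 'G', 80),
        ('C', 'PG', 120), ('D', 'PG', 120), ('E', 'PG', 60);
""")

ranked = conn.execute("""
    select title, length, rating,
           rank() over (partition by rating order by length desc) as rnk
    from film
    where length is not null and length > 0
    order by rating, rnk, title
""").fetchall()

for row in ranked:
    print(row)
```

Each rating has its own rank 1, and the tie between `C` and `D` inside `PG` pushes `E` to rank 3.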


##### 3. How many films are there for each of the categories in the category table? **Hint**: Use an appropriate join between the tables "category" and "film_category".


In [45]:
%%sql films_by_category <<
select name, count(film_id) as number_of_films
from sakila.film_category as f
join sakila.category as c
on f.category_id = c.category_id
group by name;

 * mysql+pymysql://root:***@localhost/bank
16 rows affected.
Returning data to local variable films_by_category
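The choice of join matters when a category might have no films. An inner join silently drops such a category, while a left join from `category` keeps it with a count of 0. In Sakila the query above returns all 16 categories, so the inner join appears sufficient there. The sketch below shows the difference on invented rows, using the built-in `sqlite3` module as a stand-in for MySQL.

```python
import sqlite3

# Toy tables (hypothetical rows): 'Drama' has no films assigned.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table category (category_id integer primary key, name text);
    create table film_category (film_id integer, category_id integer);
    insert into category values (1, 'Action'), (2, 'Comedy'), (3, 'Drama');
    insert into film_category values (1, 1), (2, 1), (3, 2);
""")

inner_counts = conn.execute("""
    select c.name, count(fc.film_id) as number_of_films
    from category as c
    join film_category as fc on fc.category_id = c.category_id
    group by c.category_id, c.name
    order by c.name
""").fetchall()

# count(fc.film_id) ignores the NULL produced for categories without a match.
left_counts = conn.execute("""
    select c.name, count(fc.film_id) as number_of_films
    from category as c
    left join film_category as fc on fc.category_id = c.category_id
    group by c.category_id, c.name
    order by c.name
""").fetchall()

print(inner_counts)
print(left_counts)
```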


##### 4. Which actor has appeared in the most films? **Hint**: You can create a join between the tables "actor" and "film_actor" and count the number of times an actor appears.


In [63]:
%%sql most_actor <<
select first_name, last_name, count(f.actor_id) as amount
from sakila.film_actor as f
join sakila.actor as a
on f.actor_id = a.actor_id
group by f.actor_id
order by amount desc
limit 1; 

 * mysql+pymysql://root:***@localhost/bank
1 rows affected.
Returning data to local variable most_actor
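Note that `limit 1` returns a single row even if several actors share the highest count. One possible way to surface ties is to compare each actor's count against the maximum count. This is a sketch on invented rows using the built-in `sqlite3` module, not a claim about what the Sakila data contains.

```python
import sqlite3

# Toy tables (hypothetical rows): two actors tie with two films each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table actor (actor_id integer primary key, first_name text, last_name text);
    create table film_actor (actor_id integer, film_id integer);
    insert into actor values (1, 'ANN', 'LEE'), (2, 'BOB', 'KAY'), (3, 'CY', 'ROE');
    insert into film_actor values (1, 1), (1, 2), (2, 3), (2, 4), (3, 5);
""")

top_actors = conn.execute("""
    select a.first_name, a.last_name, count(*) as amount
    from film_actor as fa
    join actor as a on a.actor_id = fa.actor_id
    group by a.actor_id, a.first_name, a.last_name
    having count(*) = (
        select max(cnt)
        from (select count(*) as cnt from film_actor group by actor_id) as t
    )
    order by a.last_name
""").fetchall()

print(top_actors)
```

Both tied actors are returned, whereas `order by amount desc limit 1` would keep only one of them.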


##### 5. Which is the most active customer (the customer that has rented the most films)? **Hint**: Use an appropriate join between the tables "customer" and "rental" and count the `rental_id` for each customer.

In [74]:
%%sql most_customer <<
select first_name, last_name, count(r.rental_id) as rentals
from sakila.rental as r
join sakila.customer as c
on r.customer_id = c.customer_id
group by r.customer_id
order by rentals desc;

 * mysql+pymysql://root:***@localhost/bank
599 rows affected.
Returning data to local variable most_customer


In [75]:
most_customer

first_name,last_name,rentals
ELEANOR,HUNT,46
KARL,SEAL,45
CLARA,SHAW,42
MARCIA,DEAN,42
TAMMY,SANDERS,41
SUE,PETERS,40
WESLEY,BULL,40
RHONDA,KENNEDY,39
MARION,SNYDER,39
TIM,CARY,39
