# Sakila

* Classic sample database designed to resemble a real-world database system.

* Database to manage a chain of movie rental stores, like Blockbuster...

* Here is the entire schema. To see it at full size, open the image file in an image viewer. It is located at: images/sakila.png

![](images/sakila.png)

In [2]:
from sqlalchemy import create_engine
import pandas as pd
from warnings import filterwarnings
import pymysql
filterwarnings('ignore', category=pymysql.Warning)
import os
engine = create_engine('mysql+pymysql://root:kcmo1728@localhost/sakila') 
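
The connection string above embeds the password directly in the notebook. A minimal sketch of one alternative builds the same kind of URL from environment variables. The variable names `SAKILA_USER`, `SAKILA_PASSWORD`, and `SAKILA_HOST` and the helper `build_mysql_url` are hypothetical and not part of the original notebook.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(env=os.environ):
    """Build a mysql+pymysql URL from (hypothetical) environment variables."""
    user = env.get("SAKILA_USER", "root")
    password = env.get("SAKILA_PASSWORD", "")
    host = env.get("SAKILA_HOST", "localhost")
    # quote_plus escapes characters such as '@' or '/' that would
    # otherwise be misread as URL delimiters.
    return f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}@{host}/sakila"


# Demonstration with an explicit mapping instead of the real environment.
url = build_mysql_url({"SAKILA_USER": "root", "SAKILA_PASSWORD": "p@ss/word"})
print(url)
```

The resulting string could then be passed to `create_engine` in place of the literal URL.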

## How many distinct actor last names are there?

In [3]:
sql_query = """
select count(distinct last_name) 
from actor;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,count(distinct last_name)
0,121


## Which last names are not repeated?

In [3]:
sql_query = """
select last_name from actor group by last_name having count(*) = 1;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,last_name
0,ASTAIRE
1,BACALL
2,BALE
3,BALL
4,BARRYMORE


## Which last names appear more than once?

In [4]:
sql_query = """
select last_name from actor group by last_name having count(*) > 1;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,last_name
0,AKROYD
1,ALLEN
2,BAILEY
3,BENING
4,BERRY
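
The three queries above are related: every distinct last name either appears exactly once or more than once, so the two `having` results should add up to the `count(distinct ...)` total. The following is a rough sketch of that check on a tiny in-memory SQLite table. The table contents are made up for illustration, and SQLite stands in for the MySQL server used in the notebook.

```python
import sqlite3

# Toy stand-in for the sakila `actor` table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("create table actor (actor_id integer primary key, last_name text)")
conn.executemany(
    "insert into actor (last_name) values (?)",
    [("ALLEN",), ("ALLEN",), ("BALE",), ("BERRY",), ("BERRY",), ("BERRY",)],
)


def last_names(having_clause):
    """Return last names whose group satisfies the given HAVING condition."""
    sql = (
        "select last_name from actor "
        f"group by last_name having {having_clause} "
        "order by last_name"
    )
    return [row[0] for row in conn.execute(sql)]


unique = last_names("count(*) = 1")    # names that are not repeated
repeated = last_names("count(*) > 1")  # names that appear more than once
distinct_total = conn.execute(
    "select count(distinct last_name) from actor"
).fetchone()[0]

print(unique, repeated, distinct_total)
```
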


## Which actor has appeared in the most films?

In [5]:
sql_query = """
select actor.actor_id, actor.first_name, actor.last_name,
       count(actor_id) as film_count
from actor join film_actor using (actor_id)
group by actor_id
order by film_count desc
limit 1;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,actor_id,first_name,last_name,film_count
0,107,GINA,DEGENERES,42


In [11]:
sql_query = """
select UPPER(CONCAT(actor.first_name, ' ', actor.last_name)) AS `Actor Name`,
count(actor_id) as `Film Count`
from actor join film_actor using (actor_id)
group by actor_id
order by `Film Count` desc
limit 1;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,Actor Name,Film Count
0,GINA DEGENERES,42
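
Note that `limit 1` returns a single row even when several actors share the highest film count. Whether sakila contains such a tie is not checked here. The sketch below, on made-up `film_actor` rows in an in-memory SQLite database (an assumption for illustration, not the real data), shows one way to return every actor tied for the maximum.

```python
import sqlite3

# Toy film_actor table: actors 1 and 2 both appear in two films.
conn = sqlite3.connect(":memory:")
conn.execute("create table film_actor (actor_id integer, film_id integer)")
conn.executemany(
    "insert into film_actor values (?, ?)",
    [(1, 10), (1, 11), (2, 10), (2, 12), (3, 11)],
)

# The notebook's approach: only one row comes back, even with a tie.
single = conn.execute("""
select actor_id, count(*) as film_count
from film_actor
group by actor_id
order by film_count desc, actor_id
limit 1
""").fetchall()

# Alternative: keep every actor whose count equals the maximum count.
top = conn.execute("""
select actor_id, count(*) as film_count
from film_actor
group by actor_id
having count(*) = (
    select max(c)
    from (select count(*) as c from film_actor group by actor_id) as t
)
order by actor_id
""").fetchall()

print(single, top)
```
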


## Find the films that contain the letters QUEST

In [15]:
## step one
sql_query = """
select title from film
where title like '%%QUEST%%';
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,title
0,CONGENIALITY QUEST
1,QUEST MUSSOLINI


## How many copies of ‘ZOOLANDER FICTION’ exist in Store 1?

In [17]:
sql_query = """
select film.title, film.film_id from film
where film.title = 'ZOOLANDER FICTION';
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,title,film_id
0,ZOOLANDER FICTION,999


In [19]:
sql_query = """
select * from inventory
where inventory.film_id = 999;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,inventory_id,film_id,store_id,last_update
0,4569,999,1,2006-02-15 05:09:17
1,4570,999,1,2006-02-15 05:09:17
2,4571,999,2,2006-02-15 05:09:17
3,4572,999,2,2006-02-15 05:09:17
4,4573,999,2,2006-02-15 05:09:17


In [20]:
sql_query = """
select film.title, film.film_id, inventory.store_id 
from inventory
join film using (film_id)
where inventory.film_id = 999;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,title,film_id,store_id
0,ZOOLANDER FICTION,999,1
1,ZOOLANDER FICTION,999,1
2,ZOOLANDER FICTION,999,2
3,ZOOLANDER FICTION,999,2
4,ZOOLANDER FICTION,999,2


In [25]:
sql_query = """
select film.title, film.film_id, inventory.store_id 
from inventory
join film using (film_id)
where film.title = 'ZOOLANDER FICTION' and inventory.store_id = 1;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,title,film_id,store_id
0,ZOOLANDER FICTION,999,1
1,ZOOLANDER FICTION,999,1
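
The last query lists the matching inventory rows, and the answer is the number of rows returned. A small sketch of getting the number directly with `count(*)` follows. It runs against an in-memory SQLite database seeded only with the five inventory rows shown in the output above, so it is an illustration under that assumption rather than a query against the full sakila data.

```python
import sqlite3

# Toy tables seeded with the rows shown in the notebook output.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table film (film_id integer primary key, title text);
create table inventory (
    inventory_id integer primary key,
    film_id integer,
    store_id integer
);
insert into film values (999, 'ZOOLANDER FICTION');
insert into inventory values
    (4569, 999, 1), (4570, 999, 1),
    (4571, 999, 2), (4572, 999, 2), (4573, 999, 2);
""")

# Count the copies instead of listing them.
copies = conn.execute(
    """
    select count(*)
    from inventory
    join film using (film_id)
    where film.title = ? and inventory.store_id = ?
    """,
    ("ZOOLANDER FICTION", 1),
).fetchone()[0]
print(copies)
```
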


## What is the average running time of all the films in the sakila DB?

In [26]:
## step one
sql_query = """
select avg(length) from film;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return.head()

Unnamed: 0,avg(length)
0,115.272


## What is the average running time of films by category?

In [28]:
## step one
sql_query = """
select category.name, avg(length) from film 
join film_category 
using (film_id) 
join category 
using (category_id)
group by category.name
order by avg(length) desc;
"""
query_return = pd.read_sql_query(sql_query, engine)
query_return

Unnamed: 0,name,avg(length)
0,Sports,128.2027
1,Games,127.8361
2,Foreign,121.6986
3,Drama,120.8387
4,Comedy,115.8276
5,Family,114.7826
6,Music,113.6471
7,Travel,113.3158
8,Horror,112.4821
9,Classics,111.6667
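
The per-category average relies on two joins: `film` to `film_category`, then `film_category` to `category`. The sketch below reproduces that join chain on a tiny in-memory SQLite database so the arithmetic can be checked by hand. The rows are invented for illustration, and the column alias `avg_length` is an addition that makes the result easier to reference than `avg(length)`.

```python
import sqlite3

# Toy versions of the three tables involved in the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table film (film_id integer primary key, length integer);
create table category (category_id integer primary key, name text);
create table film_category (film_id integer, category_id integer);
insert into film values (1, 100), (2, 120), (3, 90);
insert into category values (1, 'Sports'), (2, 'Drama');
insert into film_category values (1, 1), (2, 1), (3, 2);
""")

rows = conn.execute("""
select category.name, avg(length) as avg_length
from film
join film_category using (film_id)
join category using (category_id)
group by category.name
order by avg_length desc
""").fetchall()

# Sports: (100 + 120) / 2 = 110.0; Drama: 90 / 1 = 90.0
print(rows)
```
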
