## Exploring a PostgreSQL Database

Using common SQL commands to explore a database of DVD rentals.

- http://www.postgresqltutorial.com/postgresql-sample-database/

In [1]:
%load_ext sql

Connect to the database created with pgAdmin (initially empty, then loaded with the sample data).

In [2]:
%sql postgresql://postgres:eric@localhost:5432/postgres

'Connected: postgres@postgres'

### The Tables

In [6]:
%sql select * from language limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


language_id,name,last_update
1,English,2017-02-15 15:02:19+00:00


In [7]:
%sql select * from film limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


film_id,title,description,release_year,language_id,rental_duration,rental_rate,length,replacement_cost,rating,special_features
1,ACADEMY DINOSAUR,A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies,2010,6,6,0.99,86,20.99,PG,"{""Deleted Scenes"",""Behind the Scenes""}"


### Analyse Italian and French films from 2005

In [16]:
%sql SELECT title, description FROM film AS f LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


title,description
ACADEMY DINOSAUR,A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies
ACE GOLDFINGER,A Astounding Epistle of a Database Administrator And a Explorer who must Find a Car in Ancient China
ADAPTATION HOLES,A Astounding Reflection of a Lumberjack And a Car who must Sink a Lumberjack in A Baloon Factory
AFFAIR PREJUDICE,A Fanciful Documentary of a Frisbee And a Lumberjack who must Chase a Monkey in A Shark Tank
AFRICAN EGG,A Fast-Paced Documentary of a Pastry Chef And a Dentist who must Pursue a Forensic Psychologist in The Gulf of Mexico


Add the language data with a join.

In [9]:
%%sql
SELECT title, description
FROM film AS f
INNER JOIN language AS l
  ON f.language_id = l.language_id
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


title,description
ACE GOLDFINGER,A Astounding Epistle of a Database Administrator And a Explorer who must Find a Car in Ancient China
AGENT TRUMAN,A Intrepid Panorama of a Robot And a Boy who must Escape a Sumo Wrestler in Ancient China
DATE SPEED,A Touching Saga of a Composer And a Moose who must Discover a Dentist in A MySQL Convention
ALLEY EVOLUTION,A Fast-Paced Drama of a Robot And a Composer who must Battle a Astronaut in New Orleans
AMERICAN CIRCUS,A Insightful Drama of a Girl And a Astronaut who must Face a Database Administrator in A Shark Tank


Use IN to restrict the languages and filter on release_year = 2005.

In [15]:
%%sql
SELECT title, description
FROM film AS f
INNER JOIN language AS l
  ON f.language_id = l.language_id
WHERE name IN ('Italian','French')
  AND release_year = 2005
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


title,description
ALI FOREVER,A Action-Packed Drama of a Dentist And a Crocodile who must Battle a Feminist in The Canadian Rockies
BEHAVIOR RUNAWAY,A Unbelieveable Drama of a Student And a Husband who must Outrace a Sumo Wrestler in Berlin
BIRCH ANTITRUST,A Fanciful Panorama of a Husband And a Pioneer who must Outgun a Dog in A Baloon
BOWFINGER GABLES,A Fast-Paced Yarn of a Waitress And a Composer who must Outgun a Dentist in California
BROTHERHOOD BLANKET,A Fateful Character Study of a Butler And a Technical Writer who must Sink a Astronaut in Ancient Japan


### Get a list of top-paying customers

Look at the tables first, then list the largest individual payments made by active customers.

In [18]:
%sql select * from customer limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


customer_id,first_name,last_name,email,address_id,active
1,MARY,SMITH,MARY.SMITH@sakilacustomer.org,5,False


In [19]:
%sql select * from payment limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


payment_id,customer_id,rental_id,amount,payment_date
16050,269,7,1.99,2017-01-25 02:40:19+00:00


In [21]:
%%sql
SELECT first_name,
       last_name,
       amount
FROM payment AS p
INNER JOIN customer AS c
  ON p.customer_id = c.customer_id
WHERE active = 'True'
ORDER BY amount DESC
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


first_name,last_name,amount
ROSEMARY,SCHMIDT,11.99
VICTORIA,GIBSON,11.99
ALMA,AUSTIN,11.99
NICHOLAS,BARFIELD,11.99
VANESSA,SIMS,11.99


### Transform numbers & strings
Run a 50% off promotion for films released prior to 2006. To prepare, return the films that qualify for the promotion and convert their titles to lower case so they are easier to read. Return both the original_rate and the sale_rate.

In [22]:
%sql select * from film limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


film_id,title,description,release_year,language_id,rental_duration,rental_rate,length,replacement_cost,rating,special_features
1,ACADEMY DINOSAUR,A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies,2010,6,6,0.99,86,20.99,PG,"{""Deleted Scenes"",""Behind the Scenes""}"


In [25]:
%%sql
SELECT LOWER(title) AS title,
       rental_rate AS original_rate,
       rental_rate * 0.5 AS sale_rate
FROM film
WHERE release_year < 2006
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


title,original_rate,sale_rate
airport pollock,4.99,2.495
ali forever,4.99,2.495
alone trip,0.99,0.495
american circus,4.99,2.495
analyze hoosiers,2.99,1.495
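
The sale_rate values above carry three decimal places (for example 2.495), which is not an amount a customer can actually be charged. Below is a minimal sketch of rounding the discounted rate to whole cents. It assumes half-up rounding is the desired policy, which the notebook does not specify, and `sale_rate` is a hypothetical helper rather than anything defined in the database.

```python
from decimal import Decimal, ROUND_HALF_UP


def sale_rate(original_rate: str, discount: str = "0.5") -> Decimal:
    """Return the discounted rate rounded to whole cents (half-up).

    Hypothetical helper: rates are passed as strings so that Decimal
    sees the exact value rather than a binary float approximation.
    """
    raw = Decimal(original_rate) * Decimal(discount)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# The three distinct original rates that appear in the output above.
rates = {rate: sale_rate(rate) for rate in ("0.99", "2.99", "4.99")}

for original, discounted in rates.items():
    print(original, discounted)
```

Inside the query itself, wrapping the expression as ROUND((rental_rate * 0.5)::numeric, 2) is one way to get a comparable result in PostgreSQL.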


### Using EXTRACT

In [26]:
%sql select * from payment limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


payment_id,customer_id,rental_id,amount,payment_date
16050,269,7,1.99,2017-01-25 02:40:19+00:00


In [30]:
%%sql
SELECT payment_date,
  EXTRACT(DAY FROM payment_date) AS payment_day 
FROM payment
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


payment_date,payment_day
2017-01-25 02:40:19+00:00,25.0
2017-01-25 20:16:50+00:00,25.0
2017-01-29 02:44:14+00:00,29.0
2017-01-29 05:58:02+00:00,29.0
2017-01-29 13:10:06+00:00,29.0


In [32]:
%%sql
SELECT payment_date,
  EXTRACT(YEAR FROM payment_date) AS payment_year
FROM payment
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


payment_date,payment_year
2017-01-25 02:40:19+00:00,2017.0
2017-01-25 20:16:50+00:00,2017.0
2017-01-29 02:44:14+00:00,2017.0
2017-01-29 05:58:02+00:00,2017.0
2017-01-29 13:10:06+00:00,2017.0


In [34]:
%%sql
SELECT payment_date,
  EXTRACT(HOUR FROM payment_date) AS payment_hour
FROM payment
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


payment_date,payment_hour
2017-01-25 02:40:19+00:00,2.0
2017-01-25 20:16:50+00:00,20.0
2017-01-29 02:44:14+00:00,2.0
2017-01-29 05:58:02+00:00,5.0
2017-01-29 13:10:06+00:00,13.0


### Aggregating finances
Explore the differences in payments between the customers who are active and those who are not.

In [35]:
%%sql
SELECT active, 
       COUNT(active) AS num_active, 
       AVG(amount) AS avg_amount, 
       SUM(amount) AS total_amount
FROM payment AS p
INNER JOIN customer AS c
  ON p.customer_id = c.customer_id
GROUP BY active;

 * postgresql://postgres:***@localhost:5432/postgres
2 rows affected.


active,num_active,avg_amount,total_amount
False,3278,4.23893227577773,13895.2199999994
True,12771,4.19084566596158,53521.2899999954
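
Two things are worth noting about the result above. Each row of the join is one payment, so COUNT(active) counts payments per group rather than customers. The trailing digits in the totals (for example 13895.2199999994) suggest the amounts are being summed as binary floating-point values. The sketch below reproduces the same grouping in plain Python with `Decimal`, so the totals stay exact. The sample rows and the `num_payments` name are illustrative assumptions, not data from the database.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical (active, amount) pairs standing in for payment JOIN customer.
rows = [
    (True, "1.99"),
    (True, "4.99"),
    (True, "0.99"),
    (False, "2.99"),
    (False, "4.99"),
]

# Group the amounts by the customer's active flag.
groups = defaultdict(list)
for active, amount in rows:
    groups[active].append(Decimal(amount))

# One summary per group: payment count, exact total, exact average.
summary = {
    active: {
        "num_payments": len(amounts),
        "total_amount": sum(amounts),
        "avg_amount": sum(amounts) / len(amounts),
    }
    for active, amounts in groups.items()
}

for active, stats in summary.items():
    print(active, stats)
```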


### Aggregating strings
You want to update the storefront window to show how family-friendly and multilingual your DVD collection is. To prepare, build a comma-separated list of G-rated film titles released in 2010, grouped by language.

In [37]:
%%sql
SELECT name,
       STRING_AGG(title, ',') AS film_titles
FROM film AS f
INNER JOIN language AS l
  ON f.language_id = l.language_id
WHERE release_year = 2010
  AND rating = 'G'
GROUP BY name;

 * postgresql://postgres:***@localhost:5432/postgres
6 rows affected.


name,film_titles
English,"ACE GOLDFINGER,VALLEY PACKER"
Japanese,"AMISTAD MIDSUMMER,BUGSY SONG,DOCTOR GRAIL,MARRIED GO"
German,BEAUTY GREASE
Mandarin,"ATLANTIS CAUSE,AUTUMN CROW,CASUALTIES ENCINO,GARDEN ISLAND,RINGS HEARTBREAKERS,SAMURAI LION,SUICIDES SILENCE"
French,"CAT CONEHEADS,DANCING FEVER,LUST LOCK"
Italian,"DESPERATE TRAINSPOTTING,DWARFS ALTER,GRAPES FURY,JAWS HARRY,PACIFIC AMISTAD,PANIC CLUB"
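
STRING_AGG concatenates values in whatever order the rows reach it unless an ORDER BY is placed inside the call, as in STRING_AGG(title, ',' ORDER BY title). The sketch below mirrors the filter, grouping, and ordered concatenation in plain Python. The English, German, and French titles are copied from the output above. The two excluded rows use a placeholder language name because the notebook does not show which language they belong to.

```python
from collections import defaultdict

# (language, title, rating, release_year); "(n/a)" is a placeholder language.
films = [
    ("English", "VALLEY PACKER", "G", 2010),
    ("English", "ACE GOLDFINGER", "G", 2010),
    ("German", "BEAUTY GREASE", "G", 2010),
    ("French", "LUST LOCK", "G", 2010),
    ("French", "CAT CONEHEADS", "G", 2010),
    ("(n/a)", "ACADEMY DINOSAUR", "PG", 2010),  # excluded: rating is PG
    ("(n/a)", "AFFAIR PREJUDICE", "G", 2009),   # excluded: released in 2009
]

# WHERE release_year = 2010 AND rating = 'G', then GROUP BY language.
titles_by_language = defaultdict(list)
for language, title, rating, release_year in films:
    if release_year == 2010 and rating == "G":
        titles_by_language[language].append(title)

# Sort inside each group so the joined string is deterministic.
film_titles = {
    language: ",".join(sorted(titles))
    for language, titles in titles_by_language.items()
}

for language, joined in film_titles.items():
    print(language, joined)
```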


### What tables are in the database?

In [39]:
%%sql
SELECT * 
FROM pg_catalog.pg_tables
WHERE schemaname = 'public';

 * postgresql://postgres:***@localhost:5432/postgres
10 rows affected.


schemaname,tablename,tableowner,tablespace,hasindexes,hasrules,hastriggers,rowsecurity
public,actor,postgres,,False,False,False,False
public,address,postgres,,False,False,False,False
public,category,postgres,,False,False,False,False
public,customer,postgres,,False,False,False,False
public,film,postgres,,False,False,False,False
public,film_actor,postgres,,False,False,False,False
public,inventory,postgres,,False,False,False,False
public,language,postgres,,False,False,False,False
public,payment,postgres,,False,False,False,False
public,rental,postgres,,False,False,False,False


### How much does the business make per month?

In [40]:
%%sql
-- Preview the payment table
SELECT *
FROM payment
LIMIT 10;

-- Total payments per calendar month
SELECT EXTRACT(MONTH FROM payment_date) AS month,
       SUM(amount) AS total_payment
FROM payment
GROUP BY month;

 * postgresql://postgres:***@localhost:5432/postgres
10 rows affected.
5 rows affected.


month,total_payment
1.0,4781.54999999986
4.0,27427.0600000035
3.0,23886.5600000021
5.0,1646.58
2.0,9674.7599999996
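
The rows above come back in no particular order because the query has no ORDER BY. EXTRACT(MONTH ...) also keeps only the month number, so payments from the same month in different years would be merged if the data spanned more than one year. The sketch below groups by (year, month) and sorts the result. Only the first timestamp and amount come from the payment output shown earlier; the remaining rows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from decimal import Decimal

# (payment_date, amount); every row except the first is made up for illustration.
payments = [
    ("2017-01-25 02:40:19+00:00", "1.99"),
    ("2017-01-29 13:10:06+00:00", "0.99"),
    ("2017-02-14 08:00:00+00:00", "4.99"),
    ("2017-02-20 18:30:00+00:00", "2.99"),
]

# Accumulate exact totals keyed by (year, month).
totals = defaultdict(Decimal)
for payment_date, amount in payments:
    stamp = datetime.fromisoformat(payment_date)
    totals[(stamp.year, stamp.month)] += Decimal(amount)

# Sorting the keys plays the role of ORDER BY year, month.
monthly_totals = dict(sorted(totals.items()))

for (year, month), total in monthly_totals.items():
    print(year, month, total)
```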


### Find the columns in the database

In [43]:
%%sql
SELECT * 
FROM information_schema.columns
LIMIT 5

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


table_catalog,table_schema,table_name,column_name,ordinal_position,column_default,is_nullable,data_type,character_maximum_length,character_octet_length,numeric_precision,numeric_precision_radix,numeric_scale,datetime_precision,interval_type,interval_precision,character_set_catalog,character_set_schema,character_set_name,collation_catalog,collation_schema,collation_name,domain_catalog,domain_schema,domain_name,udt_catalog,udt_schema,udt_name,scope_catalog,scope_schema,scope_name,maximum_cardinality,dtd_identifier,is_self_referencing,is_identity,identity_generation,identity_start,identity_increment,identity_maximum,identity_minimum,identity_cycle,is_generated,generation_expression,is_updatable
postgres,public,actor,actor_id,1,,YES,integer,,,32.0,2.0,0.0,,,,,,,,,,,,,postgres,pg_catalog,int4,,,,,1,NO,NO,,,,,,NO,NEVER,,YES
postgres,public,actor,first_name,2,,YES,text,,1073741824.0,,,,,,,,,,,,,,,,postgres,pg_catalog,text,,,,,2,NO,NO,,,,,,NO,NEVER,,YES
postgres,public,actor,last_name,3,,YES,text,,1073741824.0,,,,,,,,,,,,,,,,postgres,pg_catalog,text,,,,,3,NO,NO,,,,,,NO,NEVER,,YES
postgres,public,address,address_id,1,,YES,integer,,,32.0,2.0,0.0,,,,,,,,,,,,,postgres,pg_catalog,int4,,,,,1,NO,NO,,,,,,NO,NEVER,,YES
postgres,public,address,address,2,,YES,text,,1073741824.0,,,,,,,,,,,,,,,,postgres,pg_catalog,text,,,,,2,NO,NO,,,,,,NO,NEVER,,YES


In [46]:
%%sql
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_schema = 'public';

 * postgresql://postgres:***@localhost:5432/postgres
45 rows affected.


table_name,column_name
actor,actor_id
actor,first_name
actor,last_name
address,address_id
address,address
address,district
address,city
address,postal_code
address,phone
category,film_id


### A VIEW of all your columns
Using the system view information_schema.columns, concatenate each table's column names into a single entry.

Make the query easily reusable by creating a VIEW for it called table_columns.

In [50]:
%%sql
SELECT table_name, 
       STRING_AGG(column_name, ', ') AS columns
FROM information_schema.columns
WHERE table_schema = 'public'
GROUP BY table_name;

 * postgresql://postgres:***@localhost:5432/postgres
10 rows affected.


table_name,columns
rental,"rental_id, rental_date, inventory_id, customer_id, return_date"
film_actor,"actor_id, film_id"
film,"film_id, title, description, release_year, language_id, rental_duration, rental_rate, length, replacement_cost, rating, special_features"
customer,"customer_id, first_name, last_name, email, address_id, active"
actor,"actor_id, first_name, last_name"
language,"language_id, name, last_update"
payment,"payment_id, customer_id, rental_id, amount, payment_date"
category,"film_id, category"
inventory,"inventory_id, film_id"
address,"address_id, address, district, city, postal_code, phone"


- Store the previous query in a new VIEW called table_columns.
- Query the newly created view by selecting all of its rows and columns.

In [61]:
%sql DROP VIEW IF EXISTS table_columns;

 * postgresql://postgres:***@localhost:5432/postgres
Done.


[]

In [62]:
%%sql
CREATE VIEW table_columns AS
SELECT table_name,
       STRING_AGG(column_name, ', ') AS columns
FROM information_schema.columns
WHERE table_schema = 'public'
GROUP BY table_name;

 * postgresql://postgres:***@localhost:5432/postgres
Done.


[]
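
The drop-then-create pattern for the view can be tried out in a self-contained way. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, with these assumptions: SQLite has no information_schema, so a small hypothetical table named `columns_catalog` plays that role, and SQLite's `group_concat` replaces STRING_AGG.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# columns_catalog is a hypothetical stand-in for information_schema.columns.
# DROP VIEW IF EXISTS lets the script be re-run without an "already exists" error.
conn.executescript("""
    CREATE TABLE columns_catalog (table_name TEXT, column_name TEXT);
    INSERT INTO columns_catalog VALUES
        ('actor', 'actor_id'),
        ('actor', 'first_name'),
        ('actor', 'last_name'),
        ('film_actor', 'actor_id'),
        ('film_actor', 'film_id');

    DROP VIEW IF EXISTS table_columns;
    CREATE VIEW table_columns AS
    SELECT table_name,
           group_concat(column_name, ', ') AS columns
    FROM columns_catalog
    GROUP BY table_name;
""")

# The view is queried like any table; each row is (table_name, columns).
view_rows = dict(conn.execute("SELECT * FROM table_columns"))

for table_name, columns in view_rows.items():
    print(table_name, "->", columns)
```

One difference to expect on the real database: information_schema.columns also describes views, so once table_columns exists in the public schema its own two columns should appear in the result as well.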

In [63]:
%sql SELECT * FROM table_columns;


### The average length of films by category
The film and category tables hold the information needed to calculate the average film length for every category. They share a common field, film_id, which can be used to join them. Use this to query the average length for each category.
- Calculate the average length and return this column as average_length.
- Join the two tables film and category.
- Ensure that the result is in ascending order by the average length of each category.

In [66]:
%sql select * from category limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


film_id,category
19,Action


In [67]:
%sql select * from film limit 1;

 * postgresql://postgres:***@localhost:5432/postgres
1 rows affected.


film_id,title,description,release_year,language_id,rental_duration,rental_rate,length,replacement_cost,rating,special_features
1,ACADEMY DINOSAUR,A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies,2010,6,6,0.99,86,20.99,PG,"{""Deleted Scenes"",""Behind the Scenes""}"


In [71]:
%%sql
SELECT category,
       AVG(length) AS average_length
FROM film AS f
INNER JOIN category AS c
  ON f.film_id = c.film_id
GROUP BY category
ORDER BY average_length;

 * postgresql://postgres:***@localhost:5432/postgres
16 rows affected.


category,average_length
Sci-Fi,108.1967213114754
Documentary,108.75
Children,109.8
Animation,111.01515151515152
New,111.12698412698413
Action,111.609375
Classics,111.66666666666666
Horror,112.48214285714285
Travel,113.3157894736842
Music,113.6470588235294
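
The same join, average, and ascending sort can be checked on a tiny data set. This is a minimal sketch that uses sqlite3 as a stand-in for PostgreSQL. The film ids, titles, and lengths are taken from rows shown earlier in the notebook; the category assigned to each film is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three films from the outputs above; their categories are assumed.
conn.executescript("""
    CREATE TABLE film (film_id INTEGER, title TEXT, length INTEGER);
    CREATE TABLE category (film_id INTEGER, category TEXT);

    INSERT INTO film VALUES
        (1, 'ACADEMY DINOSAUR', 86),
        (2, 'ACE GOLDFINGER', 48),
        (4, 'AFFAIR PREJUDICE', 117);

    INSERT INTO category VALUES
        (1, 'Documentary'),
        (2, 'Horror'),
        (4, 'Horror');
""")

# Join on film_id, average per category, shortest average first.
avg_lengths = conn.execute("""
    SELECT c.category,
           AVG(f.length) AS average_length
    FROM film AS f
    INNER JOIN category AS c
      ON f.film_id = c.film_id
    GROUP BY c.category
    ORDER BY average_length
""").fetchall()

for category, average_length in avg_lengths:
    print(category, average_length)
```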


### Which films are most frequently rented?

In [75]:
%%html
<img src="rel_diagram.png" width="200" height="200">

In [78]:
%%sql
SELECT title, COUNT(title)
FROM film AS f
INNER JOIN inventory AS i
  ON f.film_id = i.film_id
INNER JOIN rental AS r
  ON i.inventory_id = r.inventory_id
GROUP BY title
ORDER BY count DESC
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


title,count
BUCKET BROTHERHOOD,34
ROCKETEER MOTHER,33
RIDGEMONT SUBMARINE,32
SCALAWAG DUCK,32
FORWARD TEMPLE,32


### Storing new data
Add a table to the database containing the movies that won an Oscar for best film.

In [79]:
%%sql
CREATE TABLE oscars (
    title varchar,
    award varchar
);

 * postgresql://postgres:***@localhost:5432/postgres
Done.


[]

In [81]:
%%sql
INSERT INTO oscars (title, award)
VALUES
('TRANSLATION SUMMER', 'Best Film'),
('DORADO NOTTING',     'Best Film'),
('MARS ROMAN',         'Best Film'),
('CUPBOARD SINNERS',   'Best Film'),
('LONELY ELEPHANT',    'Best Film');

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


[]

In [82]:
%sql SELECT * FROM oscars

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


title,award
TRANSLATION SUMMER,Best Film
DORADO NOTTING,Best Film
MARS ROMAN,Best Film
CUPBOARD SINNERS,Best Film
LONELY ELEPHANT,Best Film


### Using existing data
Identify and store information about films that are family-friendly. Create a new table family_films using the data from the film table. This new table will contain a subset of films that have either the rating G or PG.

In [87]:
%sql SELECT * FROM film WHERE rating IN ('G','PG') LIMIT 3

 * postgresql://postgres:***@localhost:5432/postgres
3 rows affected.


film_id,title,description,release_year,language_id,rental_duration,rental_rate,length,replacement_cost,rating,special_features
1,ACADEMY DINOSAUR,A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies,2010,6,6,0.99,86,20.99,PG,"{""Deleted Scenes"",""Behind the Scenes""}"
2,ACE GOLDFINGER,A Astounding Epistle of a Database Administrator And a Explorer who must Find a Car in Ancient China,2010,1,3,4.99,48,12.99,G,"{Trailers,""Deleted Scenes""}"
4,AFFAIR PREJUDICE,A Fanciful Documentary of a Frisbee And a Lumberjack who must Chase a Monkey in A Shark Tank,2009,4,5,2.99,117,26.99,G,"{Commentaries,""Behind the Scenes""}"


Store the results of this query in a new table.

In [89]:
%%sql
CREATE TABLE family_films AS
SELECT *
FROM film
WHERE rating IN ('G', 'PG');

 * postgresql://postgres:***@localhost:5432/postgres
372 rows affected.


[]
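
A table built with CREATE TABLE ... AS SELECT is a one-time copy of the query result, not a live link to film. That matters in this notebook because film is updated and trimmed in later cells. The sketch below shows the behaviour with sqlite3 as a stand-in for PostgreSQL. The first two rows reuse values shown above; the third film is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
    CREATE TABLE film (film_id INTEGER, title TEXT, rating TEXT);
    INSERT INTO film VALUES
        (1, 'ACADEMY DINOSAUR', 'PG'),
        (2, 'ACE GOLDFINGER', 'G'),
        (3, 'HYPOTHETICAL R FILM', 'R');

    CREATE TABLE family_films AS
    SELECT *
    FROM film
    WHERE rating IN ('G', 'PG');

    DELETE FROM film WHERE film_id = 1;
""")

# The copy keeps both family films even though one was deleted from film.
family_titles = [
    row[0]
    for row in conn.execute("SELECT title FROM family_films ORDER BY title")
]
film_count = conn.execute("SELECT COUNT(*) FROM film").fetchone()[0]

print(family_titles, film_count)
```

If the family-friendly list should always reflect the current contents of film, a VIEW over the same query is the usual alternative.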

### Update the price of rentals
Use the UPDATE command to increase rental_rate according to the following rules:

- All films now cost 50 cents more to rent.
- R-rated films go up by an additional 1 dollar.

In [91]:
%%sql
UPDATE film
SET rental_rate = rental_rate + 0.5

 * postgresql://postgres:***@localhost:5432/postgres
1000 rows affected.


[]

In [92]:
%%sql
UPDATE film
SET rental_rate = rental_rate + 1
WHERE rating = 'R'

 * postgresql://postgres:***@localhost:5432/postgres
195 rows affected.


[]
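
Taken together, the two updates raise R-rated films by 1.50 and every other film by 0.50. Neither statement is idempotent, so re-running either cell applies its increase again. The sketch below expresses the combined rule as a pure function, which makes the intended end state easy to check. `new_rental_rate` is a hypothetical helper, not something defined in the database.

```python
from decimal import Decimal


def new_rental_rate(rental_rate: str, rating: str) -> Decimal:
    """Apply both pricing rules once and return the resulting rate."""
    rate = Decimal(rental_rate) + Decimal("0.50")  # all films: +0.50
    if rating == "R":
        rate += Decimal("1.00")  # R-rated films: an additional +1.00
    return rate


# Original rates from the outputs above, paired with illustrative ratings.
examples = [("0.99", "PG"), ("4.99", "G"), ("2.99", "R")]
updated = [new_rental_rate(rate, rating) for rate, rating in examples]

for (rate, rating), result in zip(examples, updated):
    print(rate, rating, "->", result)
```

In SQL, the same rule could be applied in a single UPDATE using a CASE expression such as CASE WHEN rating = 'R' THEN 1.5 ELSE 0.5 END, which avoids a state where only one of the two changes has been applied.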

### Update based on other tables
Lower the rental rate by 1 dollar for films starring actors or actresses with the following last names: WILLIS, CHASE, WINSLET, GUINESS, HUDSON.

To UPDATE this data in the film table, you first need to identify the film_id values of the films these actors appear in.

In [95]:
%%sql
SELECT film_id 
FROM actor AS a
INNER JOIN film_actor AS f
   ON a.actor_id = f.actor_id
WHERE last_name IN ('WILLIS', 'CHASE', 'WINSLET', 'GUINESS', 'HUDSON')
LIMIT 5;

 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


film_id
1
23
25
106
140


In [96]:
%%sql
UPDATE film
SET rental_rate = rental_rate - 1
WHERE film_id IN
  (SELECT film_id FROM actor AS a
   INNER JOIN film_actor AS f
      ON a.actor_id = f.actor_id
   WHERE last_name IN ('WILLIS', 'CHASE', 'WINSLET', 'GUINESS', 'HUDSON'));

 * postgresql://postgres:***@localhost:5432/postgres
280 rows affected.


[]
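
Because the subquery is used with IN, a film is lowered only once even when several of the listed actors appear in it. The sketch below demonstrates this with sqlite3 as a stand-in for PostgreSQL. All ids and rates are hypothetical, and the rates are chosen to be exactly representable as floats so the comparison stays exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Film 1 stars both WILLIS and CHASE; film 2 stars only an unlisted actor.
conn.executescript("""
    CREATE TABLE film (film_id INTEGER, rental_rate REAL);
    CREATE TABLE actor (actor_id INTEGER, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);

    INSERT INTO film VALUES (1, 2.5), (2, 3.5), (3, 4.5);
    INSERT INTO actor VALUES (10, 'WILLIS'), (11, 'CHASE'), (12, 'OTHER');
    INSERT INTO film_actor VALUES (10, 1), (11, 1), (12, 2), (11, 3);
""")

cursor = conn.execute("""
    UPDATE film
    SET rental_rate = rental_rate - 1
    WHERE film_id IN
      (SELECT f.film_id
       FROM actor AS a
       INNER JOIN film_actor AS f
          ON a.actor_id = f.actor_id
       WHERE a.last_name IN ('WILLIS', 'CHASE'))
""")

updated_rows = cursor.rowcount  # number of film rows changed
rates = dict(conn.execute("SELECT film_id, rental_rate FROM film"))

print(updated_rows, rates)
```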

### Delete selected records

In [97]:
%%sql
DELETE FROM film
WHERE replacement_cost > 25;

 * postgresql://postgres:***@localhost:5432/postgres
236 rows affected.


[]

In [99]:
%%sql
/*Identify the film_id of all films that have a rating of R or NC-17*/
SELECT film_id
FROM film
WHERE rating IN ('R', 'NC-17')
LIMIT 5;


 * postgresql://postgres:***@localhost:5432/postgres
5 rows affected.


film_id
20
21
32
59
60


In [100]:
%%sql
/*Delete records from the `film` table that are either rated as R or NC-17.*/
DELETE FROM film
WHERE rating IN ('R', 'NC-17');

 * postgresql://postgres:***@localhost:5432/postgres
313 rows affected.


[]
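
Deletes are hard to undo, so it can help to count the matching rows first and commit only if the DELETE removes exactly that many. The sketch below shows the pattern with sqlite3 as a stand-in for PostgreSQL. The film ids appear in the outputs above, but the specific R and NC-17 ratings assigned to ids 20 and 21 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
    CREATE TABLE film (film_id INTEGER, rating TEXT);
    INSERT INTO film VALUES (20, 'R'), (21, 'NC-17'), (1, 'PG'), (2, 'G');
""")

# Use one predicate for both statements so the preview and the DELETE agree.
predicate = "rating IN ('R', 'NC-17')"

expected = conn.execute(
    f"SELECT COUNT(*) FROM film WHERE {predicate}"
).fetchone()[0]
deleted = conn.execute(f"DELETE FROM film WHERE {predicate}").rowcount

# Commit only when the delete touched exactly the rows that were previewed.
if deleted == expected:
    conn.commit()
else:
    conn.rollback()

remaining = [
    row[0] for row in conn.execute("SELECT film_id FROM film ORDER BY film_id")
]

print(expected, deleted, remaining)
```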