# SQL-3

## Set up the environment

In [1]:
pip install ipython-sql psycopg2

Note: you may need to restart the kernel to use updated packages.


In [2]:
%load_ext sql

In [3]:
%sql postgresql://postgres:070804@localhost:5432/testdb_1

---

## NULL values in SQL

* Tuples may have a null value for some of their attributes, denoted by **null**
* A null value can mean
    * a missing (unknown) value
    * a value that is not applicable (N/A)

**Q: Find return_date of rental with rental_id = 11496. (Note: Here return_date is null!)**

In [4]:
%sql select rental_id, return_date from rental where rental_id = 11496;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


rental_id,return_date
11496,


**Q: What does the where condition evaluate to? (True or False?)**

In [5]:
%%sql select customer.first_name 
from customer join rental on rental.customer_id = customer.customer_id 
where rental.return_date IN (select return_date from rental where rental.rental_id = 11496)

 * postgresql://postgres:***@localhost:5432/testdb_1
0 rows affected.


first_name


In [6]:
%%sql select rental_id, return_date 
from rental 
where rental_id = 11496 and return_date < '2005-5-30';

 * postgresql://postgres:***@localhost:5432/testdb_1
0 rows affected.


rental_id,return_date


**Q: In both of the following queries the where clause looks like it is always true, so why are the results different?**

In [7]:
%sql select count(*) from rental where true;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


count
16044


In [8]:
%%sql select count(*) 
from rental 
where return_date > '2005-05-30' or return_date <= '2005-05-30';

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


count
15861


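The results differ because return_date is null for some rows (16044 - 15861 = 183 of them). For those rows both comparisons evaluate to unknown rather than true, so the second query filters them out.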

### Testing for NULL
* value IS NULL
* value IS NOT NULL
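
The effect can be reproduced without the rental database. The sketch below is only a stand-in: it assumes Python's built-in `sqlite3` module and a hypothetical three-row `rental_demo` table instead of PostgreSQL and the real `rental` table, and `count_where` is a helper made up for this illustration. Comparisons with null and the `IS NULL` test behave the same way in both systems.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the PostgreSQL instance (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table rental_demo (rental_id integer, return_date text)")
conn.executemany(
    "insert into rental_demo values (?, ?)",
    [(1, "2005-05-26"), (2, "2005-06-02"), (3, None)],  # None is stored as NULL
)

def count_where(condition):
    """Count rows of the hypothetical rental_demo table that satisfy condition."""
    return conn.execute(
        f"select count(*) from rental_demo where {condition}"
    ).fetchone()[0]

all_rows = count_where("1 = 1")
compared = count_where("return_date > '2005-05-30' or return_date <= '2005-05-30'")
with_null = count_where(
    "return_date > '2005-05-30' or return_date <= '2005-05-30' "
    "or return_date is null"
)
equals_null = count_where("return_date = null")  # never TRUE, so no row matches

print(all_rows, compared, with_null, equals_null)  # 3 2 3 0
```

Note that `return_date = null` matches nothing, which is why the dedicated `IS NULL` and `IS NOT NULL` predicates are needed.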

In [9]:
%%sql select count(*) 
from rental 
where return_date > '2005-05-30' or return_date <= '2005-05-30' 
or return_date is NULL;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


count
16044


### When comparing with NULL

* SQL treats as **unknown** the result of any comparison involving a null value (other than predicates **is null** and **is not null**)

* SQL follows a 3-valued logic
    1. TRUE = 1
    2. FALSE = 0
    3. UNKNOWN = 0.5

In the example above, `return_date < '2005-5-30'` is unknown if return_date is NULL!

* A resulting tuple is produced **only** if the truth value of its where clause is **TRUE**
* Truth values of compound conditions:
    - truth_value(A AND B) = min(truth_value(A), truth_value(B))
    - truth_value(A OR B) = max(truth_value(A), truth_value(B))
    - truth_value(NOT A) = 1 - truth_value(A)

* Examples:
    - **WHERE** rental_id = 11496 and return_date < '2005-5-30'
        - min(1, 0.5) = 0.5, therefore the expression is unknown and the row is not returned!
    - **WHERE** return_date > '2005-05-30' or return_date <= '2005-05-30'
        - max(0.5, 0.5) = 0.5 when return_date is null, so rows with null are not counted! (see query above)
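
The min/max/1-x rules above can be tried out directly. The following is a minimal Python sketch of that numeric encoding, not how a database implements it; all function names (`sql_and`, `sql_or`, `sql_not`, `compare_less`, `keep_row`) are hypothetical and `None` plays the role of SQL null.

```python
# Truth values encoded as in the notes: TRUE = 1, FALSE = 0, UNKNOWN = 0.5.
TRUE, FALSE, UNKNOWN = 1.0, 0.0, 0.5
NAMES = {TRUE: "TRUE", FALSE: "FALSE", UNKNOWN: "UNKNOWN"}

def sql_and(a, b):
    return min(a, b)

def sql_or(a, b):
    return max(a, b)

def sql_not(a):
    return 1 - a

def compare_less(value, bound):
    """value < bound, where None stands for SQL NULL."""
    if value is None or bound is None:
        return UNKNOWN
    return TRUE if value < bound else FALSE

def keep_row(truth):
    """A WHERE clause keeps a row only when its condition is TRUE."""
    return truth == TRUE

# rental_id = 11496 is TRUE; return_date < '2005-5-30' is UNKNOWN for a NULL date.
first = sql_and(TRUE, compare_less(None, "2005-05-30"))
# Both comparisons are UNKNOWN for a NULL date, so their OR is UNKNOWN as well.
second = sql_or(UNKNOWN, UNKNOWN)

print(NAMES[first], keep_row(first))      # UNKNOWN False
print(NAMES[second], keep_row(second))    # UNKNOWN False
print(NAMES[sql_not(UNKNOWN)])            # UNKNOWN
print(NAMES[sql_or(TRUE, UNKNOWN)], NAMES[sql_and(FALSE, UNKNOWN)])  # TRUE FALSE
```

The last line shows the two cases where a null operand does not matter: TRUE OR unknown is TRUE, and FALSE AND unknown is FALSE.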
    

In [10]:
%sql select 1=1 and 2=2 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
True


In [16]:
%sql select 1=1 or 2=2 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
True


In [11]:
%sql select 1=1 and 1=0 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
False


In [17]:
%sql select 1=1 or 1=0 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
True


In [12]:
%sql select 1=2 and 1=3 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
False


In [18]:
%sql select 1=2 or 1=3 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
False


In [13]:
%sql select 1=1 and 1 = null result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
""


In [19]:
%sql select 1=1 or 1 = null result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
True


In [14]:
%sql select 1 = 2 and 1 = null result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
False


In [20]:
%sql select 1 = 2 or 1 = null result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
""


In [15]:
%sql select null and null = 1 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
""


In [21]:
%sql select null or null = 1 result;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


result
""


**EXERCISE**

**Q: Try out the truth table for `OR` and `NOT` (like above)**

---

## More on Outer joins

**Q: Find the number of films in each language. Note the behaviour of each join operator!**

In [26]:
%%sql select language.name, count(film.film_id) 
from language join film on film.language_id = language.language_id 
group by language.name;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


name,count
English,1000


In [22]:
%%sql select language.name, count(film.film_id) 
from language inner join film on film.language_id = language.language_id 
group by language.name;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


name,count
English,1000


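Difference between `inner join` and `join`? None: `join` without a qualifier is an inner join, so the two queries above return the same result.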

In [23]:
%%sql select language.name, count(film.film_id) 
from language 
right join film on language.language_id = film.language_id 
group by language.name; 

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


name,count
English,1000


In [24]:
%%sql select language.name, count(film.film_id) 
from language 
left outer join film on language.language_id = film.language_id 
group by language.name; 

 * postgresql://postgres:***@localhost:5432/testdb_1
6 rows affected.


name,count
English,1000
French,0
Mandarin,0
German,0
Japanese,0
Italian,0


In [25]:
%%sql select language.name, count(film.film_id) 
from language 
full outer join film on language.language_id = film.language_id 
group by language.name; 

 * postgresql://postgres:***@localhost:5432/testdb_1
6 rows affected.


name,count
English,1000
French,0
Mandarin,0
German,0
Japanese,0
Italian,0
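
In the outputs above, the inner and right joins return only English because every film has a language, while the left and full outer joins also keep the languages that have no film. The sketch below reproduces this on a small scale. It assumes Python's built-in `sqlite3` module and two hypothetical tables, `language_demo` and `film_demo`, rather than the real database. It also shows why the queries count `film.film_id` and not `*`.

```python
import sqlite3

# In-memory SQLite database with made-up demo tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table language_demo (language_id integer primary key, name text);
    create table film_demo (film_id integer primary key, language_id integer);
    insert into language_demo values (1, 'English'), (2, 'French');
    insert into film_demo values (10, 1), (11, 1), (12, 1);
""")

query = """
    select l.name, count(f.film_id), count(*)
    from language_demo l
    {join} film_demo f on f.language_id = l.language_id
    group by l.name
    order by l.name
"""

inner_rows = conn.execute(query.format(join="join")).fetchall()
left_rows = conn.execute(query.format(join="left outer join")).fetchall()

print(inner_rows)  # [('English', 3, 3)]
print(left_rows)   # [('English', 3, 3), ('French', 0, 1)]
```

The left outer join pads the unmatched French row with nulls on the film side. `count(f.film_id)` skips nulls and reports 0, whereas `count(*)` counts the padded row and reports 1.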


---

## WITH CLAUSE & COMMON TABLE EXPRESSIONS (CTE)
* Provides a temporary named result (table) for building complex yet "neat" SQL queries
* The CTE can be referred to like a table within the rest of the query
* Can be used to write **recursive queries**
* Syntax
    
    WITH [RECURSIVE]  CTE_NAME [(CTE_COLUMNS)] AS ( CTE_DEFINITION ) PRIMARY_STATEMENT;

**Q: Compare the number of films an actor has acted in to the average number of films that an actor acts in.**

In [27]:
%%sql with avg_film_actor as (
    select avg(a.num_of_films) average 
    from (
        select actor_id, count(film_id) num_of_films
        from film_actor 
        group by actor_id
    ) as a
)
select actor_id, count(film_id) num_of_films, average avg_per_actor
from film_actor 
cross join avg_film_actor
group by actor_id, average
limit 10;

 * postgresql://postgres:***@localhost:5432/testdb_1
10 rows affected.


actor_id,num_of_films,avg_per_actor
1,19,27.31
2,25,27.31
3,22,27.31
4,22,27.31
5,29,27.31
6,20,27.31
7,30,27.31
8,20,27.31
9,25,27.31
10,22,27.31


**Q: Compare a film's rent to the average rent of films that have the same rating as the film.**

In [28]:
%%sql with film_rating_avg as (
  select rating, avg(rental_rate) rating_average
  from film
  group by rating
)
select f.film_id, f.title, f.rating, f.rental_rate, a.rating_average
from film f, film_rating_avg a
where f.rating = a.rating
LIMIT 10;


 * postgresql://postgres:***@localhost:5432/testdb_1
10 rows affected.


film_id,title,rating,rental_rate,rating_average
133,Chamber Italian,NC-17,4.99,2.970952380952381
384,Grosse Wonderful,R,4.99,2.9387179487179487
8,Airport Pollock,R,4.99,2.9387179487179487
98,Bright Encounters,PG-13,4.99,3.0348430493273546
1,Academy Dinosaur,PG,0.99,3.0518556701030928
2,Ace Goldfinger,G,4.99,2.888876404494382
3,Adaptation Holes,NC-17,2.99,2.970952380952381
4,Affair Prejudice,G,2.99,2.888876404494382
5,African Egg,G,2.99,2.888876404494382
6,Agent Truman,PG,2.99,3.0518556701030928


### Using `RECURSIVE`, a `WITH` query can refer to its own output

### Basic form of recursive queries

    with recursive T as (
        base_query
        union all
        recursive_query involving T
    )
    query involving T;
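
A recursive query of this form is evaluated iteratively: the base query runs once, then the recursive query is re-run against only the rows produced in the previous step, until a step produces no new rows. The Python sketch below imitates that loop for the `union all` case. `evaluate_recursive_cte` and its parameters are hypothetical names for this illustration, not a database API.

```python
def evaluate_recursive_cte(base_rows, recursive_step):
    """Imitate the evaluation of `base_query UNION ALL recursive_query`.

    base_rows: rows produced by the base query.
    recursive_step: function mapping the current working table to new rows;
    it stands in for the recursive query (hypothetical helper).
    """
    result = list(base_rows)   # everything produced so far
    working = list(base_rows)  # rows the recursive query sees as T
    while working:
        new_rows = recursive_step(working)
        result.extend(new_rows)
        working = new_rows     # the next step sees only the newest rows
    return result

# with recursive t(n) as (select 1 union all select n + 1 from t where n < 10)
numbers = evaluate_recursive_cte(
    [(1,)],
    lambda rows: [(n + 1,) for (n,) in rows if n < 10],
)

print([n for (n,) in numbers])      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(sum(n for (n,) in numbers))   # 55
```

The `where n < 10` condition is what makes the loop stop: once the working table holds only the row 10, the recursive step returns nothing.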


**Q: sum of numbers from 1 to 10**

In [None]:
%%sql with recursive t(n) as (
    values (1)  -- equivalent to: select 1 as n
    union all  
    select n+1 from t where n <10
)
select sum(n) from t;

**Q: List numbers from 1 to 10**

In [29]:
%%sql with recursive list as (
    select 1 as n
    union all
    select n+1 from list where n < 10
)
select * from list;

 * postgresql://postgres:***@localhost:5432/testdb_1
10 rows affected.


n
1
2
3
4
5
6
7
8
9
10


**Q: Writing Fibonacci numbers in SQL**

In [30]:
%%sql with recursive fibonacci as (
    select 0 as n_1, 1 as n_2
    union all 
    select n_2, n_1 + n_2 
    from fibonacci
    where n_2 < 5
)
select n_2 as fibonacci_numbers  from fibonacci;


 * postgresql://postgres:***@localhost:5432/testdb_1
5 rows affected.


fibonacci_numbers
1
1
2
3
5


**Q: Write a recursive SQL query to compute the factorial (n!) of numbers up to N**

In [33]:
%%sql with recursive fact_t as (
	select 1 as n, 1 as fact
	union all
	select n+1, fact*(n+1) from fact_t where n < 5
)
select fact from fact_t;

 * postgresql://postgres:***@localhost:5432/testdb_1
5 rows affected.


fact
1
2
6
24
120


**Let us create a new table as follows.**

In [34]:
%sql drop table if exists new_category;


 * postgresql://postgres:***@localhost:5432/testdb_1
Done.


[]

In [35]:
%%sql create table new_category (
    category_id integer NOT NULL,
    name character varying(25) NOT NULL,
    parent_category_id integer,
    last_update timestamp without time zone DEFAULT now() NOT NULL,
    primary key(category_id),
    constraint foreign_key_new_category
        foreign key(parent_category_id) references new_category(category_id)

)

 * postgresql://postgres:***@localhost:5432/testdb_1
Done.


[]

In [36]:
%%sql insert into new_category (category_id, name, parent_category_id)
values 
(1, 'ALL', NULL),
(2, 'Action/Adventure', 1),
(3, 'Comedy/Musical', 1),
(4, 'Drama', 1),
(5, 'Action', 2),
(6, 'Adventure', 2),
(12, 'Disaster', 5),
(13, 'Sports', 5),
(14, 'Historial', 6),
(15, 'Wild', 6),
(7, 'Comedy', 3),
(8, 'Musical', 3),
(9, 'Social', 4),
(10, 'Political', 4),
(11, 'Court Room', 4),
(16, 'Slapstic', 7),
(17, 'Black', 7),
(18, 'Romantic', 7),
(19, 'Theater', 8),
(20, 'Dance', 8)

 * postgresql://postgres:***@localhost:5432/testdb_1
20 rows affected.


[]

**Q: Let's find all subcategories of films under the 'Comedy/Musical' genre.**

In [37]:
%%sql select category_id, name, parent_category_id, null::varchar as parent_name
from new_category 
where name = 'Comedy/Musical'

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


category_id,name,parent_category_id,parent_name
3,Comedy/Musical,1,


In [38]:
%%sql with recursive sub_categories as (
    select category_id, name, parent_category_id, null::varchar as parent_name
    from new_category 
    where name = 'Comedy/Musical'

    union all

    select c.category_id, c.name, c.parent_category_id, sc.name
    from new_category c join sub_categories sc on c.parent_category_id = sc.category_id
)
select name as category, parent_name as parent_category from sub_categories;

 * postgresql://postgres:***@localhost:5432/testdb_1
8 rows affected.


category,parent_category
Comedy/Musical,
Comedy,Comedy/Musical
Musical,Comedy/Musical
Slapstic,Comedy
Black,Comedy
Romantic,Comedy
Theater,Musical
Dance,Musical


**Q: Write a (recursive) query that counts the total number of subcategories under each category**

In [40]:
%%sql with recursive sub_categories as (
    select category_id, name, parent_category_id, null::varchar as parent_name
    from new_category 
    where name = 'ALL'

    union all

    select c.category_id, c.name, c.parent_category_id, sc.name
    from new_category c join sub_categories sc on c.parent_category_id = sc.category_id
)
select count(*) from sub_categories;

 * postgresql://postgres:***@localhost:5432/testdb_1
1 rows affected.


count
20
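
The count of 20 above is for the subtree rooted at 'ALL' and includes the 'ALL' row itself. One possible way to get a count for every category is to build all (ancestor, descendant) pairs recursively and then group by ancestor. The sketch below is only an illustration under assumptions: it uses Python's built-in `sqlite3` module and a hypothetical `category_demo` table holding five of the rows from `new_category`. The CTE name `descendants` is also made up.

```python
import sqlite3

# In-memory SQLite database with a small excerpt of the category tree (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table category_demo (
        category_id integer primary key,
        name text not null,
        parent_category_id integer references category_demo(category_id)
    );
    insert into category_demo values
        (1, 'ALL', null),
        (3, 'Comedy/Musical', 1),
        (7, 'Comedy', 3),
        (8, 'Musical', 3),
        (16, 'Slapstic', 7);
""")

rows = conn.execute("""
    with recursive descendants(ancestor_id, category_id) as (
        -- base query: every direct parent/child pair
        select parent_category_id, category_id
        from category_demo
        where parent_category_id is not null
        union all
        -- recursive query: extend each pair by one more level
        select d.ancestor_id, c.category_id
        from category_demo c
        join descendants d on c.parent_category_id = d.category_id
    )
    select c.name, count(d.category_id)
    from category_demo c
    left join descendants d on d.ancestor_id = c.category_id
    group by c.category_id, c.name
    order by c.category_id
""").fetchall()

print(rows)
# [('ALL', 4), ('Comedy/Musical', 3), ('Comedy', 1), ('Musical', 0), ('Slapstic', 0)]
```

The left join keeps leaf categories in the output with a count of 0, for the same reason the left outer join kept languages without films earlier.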


**Q: Extend the above query to also output the category names**

**Q: Using the film_actor and film relations, write a recursive query to find chains of actors who worked together via films. The output relation should have the attributes actor_1, actor_2, film_name.**