## Exploration 

- Look at the inventory 
    - How many films are in the inventory
    - How many films are missing from the inventory 
    - Which films are missing from the inventory 
    - Questions 
        - What is the total value of our inventory 
        - Which films have the largest stock 
- Look at rental 
    - How many rentals have we had
    - How many films have been returned 
    - What is the average duration between renting a film and returning it 
    - How many customers do we have 
    - Questions 
        - What is the distribution between films that are returned on time and not 
        - What is the spread of films that are returned on time, overdue and not returned 
        - Are there any customers who repeatedly return overdue films 
        - Are there customers who repeatedly don't return films 
        - Are there customers who both don't return films and don't return on time
        - How much is required to replace missing films
- Look at Payment 
    - Total number of payments made 
    - Have all of the films been paid for
    - How many films have not been paid for
    - Questions
        - How much money have we made 
        - Who has made us the most money 
        - Who has not paid for films 
        - Who has not returned films and not paid for the film

### Inventory

Let's look at the inventory table and see what information we can gather from it. We will start with the following:
- How many films are in the inventory 
- How many films are missing from the inventory 
- Which films are missing from the inventory 

Once we have looked at the table, we can decide which follow-up questions are worth investigating:
- What is the value of our stock 
- Are the high stock films rented the most
- Based on the rental numbers, which movies should have a higher stock

In [2]:
-- How many films are in the inventory 
SELECT COUNT(*) AS total_inventory
FROM inventory;

total_inventory
4581


In [7]:
-- What is the stock of each distinct film 
SELECT film_id, COUNT(*) AS in_stock
FROM inventory
GROUP BY film_id
ORDER BY in_stock DESC
LIMIT 10;

film_id,in_stock
745,8
378,8
91,8
220,8
382,8
350,8
266,8
764,8
638,8
127,8
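
One of the follow-up questions for this table is whether the films with the most copies are also the ones rented the most. The query below is a minimal sketch of that comparison. It assumes the rental table has a `rental_id` column and an `inventory_id` column that references the inventory table, as in the standard DVD rental sample schema; neither column appears in the queries in this notebook, so adjust the names if your schema differs.

```sql
-- Sketch: copies in stock vs. number of rentals per film
-- Assumption: rental.rental_id and rental.inventory_id exist (not shown elsewhere in this notebook)
SELECT i.film_id,
       COUNT(DISTINCT i.inventory_id) AS in_stock,   -- DISTINCT because the join repeats each copy once per rental
       COUNT(r.rental_id) AS times_rented            -- counts only matched rentals, so unrented copies add 0
FROM inventory AS i
LEFT JOIN rental AS r ON r.inventory_id = i.inventory_id
GROUP BY i.film_id
ORDER BY times_rented DESC
LIMIT 10;
```

Sorting by `times_rented` instead of `in_stock` makes it easy to spot heavily rented films that have only a few copies.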


In [11]:
-- Which films are not in the inventory
SELECT f.film_id, i.inventory_id
FROM film AS f 
LEFT JOIN inventory AS i ON f.film_id = i.film_id
WHERE i.film_id IS NULL
ORDER BY f.film_id
LIMIT 10;


film_id,inventory_id
14,
33,
36,
38,
41,
87,
108,
128,
144,
148,


In [15]:
-- How many films in total are not in the inventory 
SELECT COUNT(DISTINCT(f.film_id)) AS count_not_in_inventory
FROM film AS f 
LEFT JOIN inventory AS i ON f.film_id = i.film_id
WHERE i.inventory_id IS NULL;

count_not_in_inventory
42
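
The question about the value of our stock could be approached by pricing every copy in the inventory. This is only a sketch: it assumes the film table has a `replacement_cost` column (present in the standard DVD rental sample schema, but not used anywhere else in this notebook) and treats that cost as the value of one copy.

```sql
-- Sketch: estimated value of the stock
-- Assumption: film.replacement_cost exists and is a reasonable proxy for the value of one copy
SELECT COUNT(*) AS copies_in_stock,
       SUM(f.replacement_cost) AS total_inventory_value
FROM inventory AS i
JOIN film AS f ON f.film_id = i.film_id;
```

The join runs from inventory to film so that each physical copy contributes its own row, which means a film with eight copies is counted eight times.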


### Rental

  

We can take a look at the rental table and see what information can be gathered from this table. We can start off by looking at the following.

- How many rentals have we had
- How many films have been returned
- What is the average duration between renting a film and returning it
- How many customers do we have

  

Once we know how the data is structured, we can ask a few questions.

- What is the distribution between films that are returned on time and not
- What is the spread of films that are returned on time, overdue and not returned
- Are there any customers who repeatedly return overdue films
- Are there customers who repeatedly don't return films
- Are there customers who both don't return films and don't return on time
- How much is required to replace missing films
- Who is our most valuable customer and how much have they given us

In [17]:
-- How many rentals have we had
SELECT COUNT(*) AS total_number_of_rentals
FROM rental;

total_number_of_rentals
16044


In [20]:
-- Have all of the films been returned (count of rentals with no return date)
SELECT COUNT(*)
FROM rental
WHERE return_date IS NULL;

count
183
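
The same filter can be grouped by customer to see whether any customers repeatedly fail to return films. This sketch uses only columns already queried in this notebook; treating "repeatedly" as "more than once" is an assumption, so change the threshold in the `HAVING` clause as needed.

```sql
-- Sketch: customers with more than one unreturned rental
-- Assumption: "repeatedly" means at least two rentals with no return date
SELECT customer_id,
       COUNT(*) AS films_not_returned
FROM rental
WHERE return_date IS NULL
GROUP BY customer_id
HAVING COUNT(*) > 1
ORDER BY films_not_returned DESC, customer_id;
```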


In [25]:
-- Average duration between renting and returning a film
SELECT ROUND(CAST(AVG(date_part('day', return_date - rental_date)) AS NUMERIC) , 2) AS average_rental_duration
FROM rental

average_rental_duration
4.53


In [28]:
-- Average days a film is rented out vs. the average allowed rental duration per film
-- Each average is computed in its own subquery to avoid a cross join between film and rental
SELECT
    (SELECT ROUND(CAST(AVG(date_part('day', return_date - rental_date)) AS NUMERIC), 2)
     FROM rental) AS average_days_rented_out,
    (SELECT ROUND(CAST(AVG(rental_duration) AS NUMERIC), 2)
     FROM film) AS average_rental_duration;

average_days_rented_out,average_rental_duration
4.53,4.99


In [30]:
-- How many distinct customers have we had vs. the total number of rentals
SELECT COUNT(DISTINCT(customer_id)) AS total_customers,
       COUNT(*) AS total_rentals
FROM rental

total_customers,total_rentals
599,16044
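
With the rental dates and each film's `rental_duration` available, the spread of rentals returned on time, overdue and not returned could be sketched as below. The query assumes that `rental_duration` is measured in days and that rental rows link to films through `rental.inventory_id`, as in the standard DVD rental sample schema; that column is not shown in the queries above.

```sql
-- Sketch: classify each rental as on time, overdue or not returned
-- Assumptions: film.rental_duration is in days; rental.inventory_id references inventory
SELECT return_status,
       COUNT(*) AS total
FROM (
    SELECT CASE
               WHEN r.return_date IS NULL THEN 'Not returned'
               WHEN date_part('day', r.return_date - r.rental_date) > f.rental_duration THEN 'Overdue'
               ELSE 'On time'
           END AS return_status
    FROM rental AS r
    JOIN inventory AS i ON i.inventory_id = r.inventory_id
    JOIN film AS f ON f.film_id = i.film_id
) AS rental_status
GROUP BY return_status
ORDER BY total DESC;
```

Adding `r.customer_id` to the inner query and to the `GROUP BY` would give the same breakdown per customer, which helps identify customers who repeatedly return films late.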


### Payment 

We can take a look at the payment table and see what information we can gather from it. We will start with the following:
- Total number of payments made
- Have all of the films been paid for
- How many films have not been paid for

Once we understand how the data is being stored, we can ask a few questions.
- How much money have we made
- Who has made us the most money
- Who has not paid for films
- Who has not returned films and not paid for the film

In [32]:
-- Total number of payments made
SELECT COUNT(*)
FROM payment;

count
16049
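
The questions about how much money we have made and who has made us the most money can both be approached by summing the `amount` column. The following is a sketch that uses only the `amount` and `customer_id` columns already queried in this notebook; the limit of ten customers is an arbitrary choice.

```sql
-- Sketch: total revenue across all payments
SELECT SUM(amount) AS total_revenue
FROM payment;

-- Sketch: the ten customers who have paid the most in total
SELECT customer_id,
       SUM(amount) AS total_paid,
       COUNT(*) AS payments_made
FROM payment
GROUP BY customer_id
ORDER BY total_paid DESC
LIMIT 10;
```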


In [34]:
-- Have all of the films been paid for 
SELECT amount, COUNT(*)
FROM payment 
GROUP BY amount
ORDER BY amount
LIMIT 10;


amount,count
0.0,24
0.99,2979
1.98,1
1.99,640
2.99,3542
3.98,8
3.99,1109
4.99,3789
5.98,7
5.99,1299


In [38]:
-- Compare rentals that have been paid for and rentals that have not been paid for
SELECT payment_status, COUNT(*) AS total
FROM (
    SELECT CASE 
            WHEN amount = 0 THEN 'Not Paid'
            ELSE 'PAID'
            END
            AS payment_status,
            payment_id,
            rental_id,
            customer_id
    FROM payment
) AS temp_payment
GROUP BY payment_status

payment_status,total
PAID,16025
Not Paid,24
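
To see who is behind the unpaid rentals, and whether those films were also never returned, the zero-amount payments can be joined back to the rental table. This is a sketch that assumes `payment.rental_id` matches a `rental_id` column in the rental table (not shown in the queries above). Payments without a matching rental are dropped by the inner join.

```sql
-- Sketch: customers with zero-amount payments and whether the film came back
-- Assumption: rental.rental_id exists and payment.rental_id references it
SELECT p.customer_id,
       p.rental_id,
       r.return_date IS NULL AS not_returned   -- true when the film has not been returned
FROM payment AS p
JOIN rental AS r ON r.rental_id = p.rental_id
WHERE p.amount = 0
ORDER BY p.customer_id, p.rental_id;
```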
