# Functions for Manipulating Data in PostgreSQL
Here you can access the tables used in the course. To access a table, you will need to specify the `dvdrentals` schema in your queries (e.g., `dvdrentals.film` for the `film` table and `dvdrentals.country` for the `country` table).

--- 
_Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.)._

## Take Notes

Add notes about the concepts you've learned and SQL cells with queries you want to keep.

_Add your notes here_

In [None]:
-- Add your own queries here
SELECT *
FROM dvdrentals.film
LIMIT 10

## Explore Datasets
Use the different tables to explore the data and practice your skills!
- Select the `title`, `release_year`, and `rating` of films in the `film` table.
    - Add a `description_shortened` column which contains the first 50 characters of the `description` column, ending with "...".
    - Filter the `film` table for rows where the `special_features` column contains "Commentaries".
- Select the `customer_id`, `amount`, and `payment_date` from the `payment` table.
    - Extract date information from the `payment_date` column, creating new columns for the `day`, `month`, `quarter`, and `year` of transaction.
    - Use the `rental` table to include a column containing the number of days rented (i.e., time between the `rental_date` and the `return_date`).
- Update the title column so that titles with multiple words are reduced to the first word and the first letter of the second word followed by a period. 
    - For example:
        - "BEACH HEARTBREAKERS" becomes "BEACH H."
        - "BEAST HUNCHBACK" becomes "BEAST H."
    - Reformat your shortened title to title case (e.g., "BEACH H." becomes "Beach H.").
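
As a starting point for the prompts above, here is a minimal sketch of three of them. It rests on a few assumptions that the course material does not confirm: that "the first 50 characters ... ending with '...'" means 50 characters plus the ellipsis, that `special_features` can be searched as text, and that every title has at least two words. Because access is read-only, the title "update" is written as a `SELECT` that derives a new column; the aliases (`description_shortened` aside) are illustrative names.

```sql
-- 1. Films with a shortened description, filtered on a special feature
SELECT
  title,
  release_year,
  rating,
  -- First 50 characters of the description, followed by an ellipsis
  LEFT(description, 50) || '...' AS description_shortened
FROM dvdrentals.film
-- The ::text cast keeps LIKE usable whether the column is text or an array
WHERE special_features::text LIKE '%Commentaries%'
LIMIT 10;

-- 2. Date parts of each payment
SELECT
  customer_id,
  amount,
  payment_date,
  EXTRACT(day     FROM payment_date) AS payment_day,
  EXTRACT(month   FROM payment_date) AS payment_month,
  EXTRACT(quarter FROM payment_date) AS payment_quarter,
  EXTRACT(year    FROM payment_date) AS payment_year
FROM dvdrentals.payment
LIMIT 10;

-- 3. Shortened title in title case, e.g. 'BEACH HEARTBREAKERS' -> 'Beach H.'
SELECT
  title,
  INITCAP(
    SPLIT_PART(title, ' ', 1)               -- first word
    || ' '
    || LEFT(SPLIT_PART(title, ' ', 2), 1)   -- first letter of the second word
    || '.'
  ) AS short_title
FROM dvdrentals.film
LIMIT 10;
```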

### String operations: LEFT, RIGHT, SUBSTRING, ...

In [None]:
-- Syntax: SUBSTRING(full_string, start_char [, length_char])
SELECT SUBSTRING(title, 1, 10) AS title_start FROM dvdrentals.film LIMIT 5;
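
To see how `LEFT`, `RIGHT`, and `SUBSTRING` slice a string, it can help to run them on a literal first. The sketch below uses a film title as the literal; the column aliases are illustrative only. Positions are 1-based.

```sql
SELECT
  LEFT('ACADEMY DINOSAUR', 7)                AS left_part,    -- 'ACADEMY'
  RIGHT('ACADEMY DINOSAUR', 8)               AS right_part,   -- 'DINOSAUR'
  -- Start at character 9 and take 4 characters
  SUBSTRING('ACADEMY DINOSAUR' FROM 9 FOR 4) AS middle_part,  -- 'DINO'
  -- Without a length, SUBSTRING runs to the end of the string
  SUBSTRING('ACADEMY DINOSAUR', 9)           AS from_9;       -- 'DINOSAUR'
```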

### String operations: Concatenate, Case, Replace ...
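
A small sketch of the two concatenation styles and `REPLACE`, run on literals so it does not depend on any table. The main practical difference is how `NULL` is handled: the `||` operator returns `NULL` if any operand is `NULL`, while `CONCAT` skips `NULL` arguments. The aliases are illustrative.

```sql
SELECT
  'Academy' || ' ' || 'Dinosaur'     AS with_operator,  -- 'Academy Dinosaur'
  CONCAT('Academy', ' ', 'Dinosaur') AS with_function,  -- 'Academy Dinosaur'
  'Academy' || NULL::text            AS operator_null,  -- NULL
  CONCAT('Academy', NULL)            AS function_null,  -- 'Academy'
  REPLACE('a epic drama', ' ', '_')  AS replaced;       -- 'a_epic_drama'
```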


### Case: `UPPER`, `LOWER`, `INITCAP`...

In [4]:
SELECT 
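  -- Concatenate the category name (uppercase) with the film title (title case)
  UPPER(c.category)  || ': ' || INITCAP(f.title) AS film_category, 
  -- Convert the description to lowercase and replace spaces with underscores
  REPLACE(LOWER(f.description), ' ', '_') AS description
FROM 
  dvdrentals.film AS f 
  INNER JOIN dvdrentals.category AS fc 
  	ON f.film_id = fc.film_id 
  -- Note: this second join matches dvdrentals.category to itself on the non-unique
  -- category column, so each film repeats once per film in its category (hence the
  -- duplicate rows in the output); selecting fc.category directly would avoid it
  INNER JOIN dvdrentals.category AS c 
  	ON fc.category = c.category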
LIMIT 5;

Unnamed: 0,film_category,description
0,DOCUMENTARY: Academy Dinosaur,a_epic_drama_of_a_feminist_and_a_mad_scientist...
1,DOCUMENTARY: Academy Dinosaur,a_epic_drama_of_a_feminist_and_a_mad_scientist...
2,DOCUMENTARY: Academy Dinosaur,a_epic_drama_of_a_feminist_and_a_mad_scientist...
3,DOCUMENTARY: Academy Dinosaur,a_epic_drama_of_a_feminist_and_a_mad_scientist...
4,DOCUMENTARY: Academy Dinosaur,a_epic_drama_of_a_feminist_and_a_mad_scientist...


### Using `DATE_TRUNC` to truncate data at different precision levels. Especially useful for `GROUP BY`

Number of rentals per calendar day

In [8]:

SELECT 
  DATE_TRUNC('day', rental_date) AS rental_day,
  -- Count total number of rentals 
  COUNT(*) AS rentals 
FROM dvdrentals.rental
GROUP BY 1
ORDER BY rental_day;

Unnamed: 0,rental_day,rentals
0,2005-05-25 00:00:00+00:00,122
1,2005-05-26 00:00:00+00:00,166
2,2005-05-27 00:00:00+00:00,169
3,2005-05-28 00:00:00+00:00,193
4,2005-05-29 00:00:00+00:00,160
5,2005-05-30 00:00:00+00:00,159
6,2005-05-31 00:00:00+00:00,173
7,2005-06-01 00:00:00+00:00,14
8,2005-06-15 00:00:00+00:00,298
9,2005-06-16 00:00:00+00:00,335
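
To illustrate the "different precision levels", the sketch below first truncates a literal timestamp to several units and then reuses the same grouping pattern at month precision. The aliases are illustrative; the second query assumes the same `dvdrentals.rental` table as above.

```sql
-- Truncating one timestamp to different precisions
SELECT
  DATE_TRUNC('hour',  TIMESTAMP '2005-05-25 02:54:33') AS to_hour,   -- 2005-05-25 02:00:00
  DATE_TRUNC('day',   TIMESTAMP '2005-05-25 02:54:33') AS to_day,    -- 2005-05-25 00:00:00
  DATE_TRUNC('month', TIMESTAMP '2005-05-25 02:54:33') AS to_month,  -- 2005-05-01 00:00:00
  DATE_TRUNC('year',  TIMESTAMP '2005-05-25 02:54:33') AS to_year;   -- 2005-01-01 00:00:00

-- Same GROUP BY pattern as above, but one row per month instead of per day
SELECT
  DATE_TRUNC('month', rental_date) AS rental_month,
  COUNT(*) AS rentals
FROM dvdrentals.rental
GROUP BY 1
ORDER BY rental_month;
```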


### Built-in `timestamp` features, and using `CAST`

In [6]:
SELECT 
	-- Select the current date
	CURRENT_DATE,
    -- CAST the current time (at whole-second precision) to a time without time zone
    CAST( CURRENT_TIME(0) AS time )

Unnamed: 0,current_date,current_time
0,2023-10-10 00:00:00+00:00,2023-10-10 13:28:23





| CURRENT_DATE | CURRENT_TIME |
|--------------|--------------|
| 2023-10-10   | 15:27:48     |
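
For comparison, here is a hedged sketch of the `NOW()`-based variants of the same idea. It shows the standard `CAST` syntax next to PostgreSQL's `::` shorthand; the aliases are illustrative, and the actual values depend on when and in which session time zone the query runs.

```sql
SELECT
  NOW()                AS now_ts,       -- timestamp with time zone
  CAST(NOW() AS date)  AS today_cast,   -- same date as CURRENT_DATE
  NOW()::date          AS today_short,  -- PostgreSQL shorthand for the CAST above
  CURRENT_TIMESTAMP(0) AS now_seconds;  -- current timestamp rounded to whole seconds
```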


### Computing whether a rental is overdue or not:

In [1]:
SELECT 
  c.first_name || ' ' || c.last_name AS customer_name, -- OR CONCAT(c.first_name, ' ', c.last_name) AS customer_name
  f.title,
  r.rental_date,
  -- Extract the day of week date part from the rental_date
  EXTRACT(dow FROM r.rental_date) AS dayofweek,
  AGE(r.return_date, r.rental_date) AS rental_days,  
  -- Use DATE_TRUNC to get days from the AGE function
  CASE WHEN DATE_TRUNC('day', AGE(r.return_date, r.rental_date)) > 
  -- Convert rental_duration (a number of days) to an interval
    f.rental_duration * INTERVAL '1' day 
  THEN TRUE 
  ELSE FALSE END AS past_due 
  
FROM 
  dvdrentals.film AS f 
  INNER JOIN dvdrentals.inventory AS i 
  	ON f.film_id = i.film_id 
  INNER JOIN dvdrentals.rental AS r 
  	ON i.inventory_id = r.inventory_id 
  INNER JOIN dvdrentals.customer AS c 
  	ON c.customer_id = r.customer_id 
	
WHERE 
  -- Use an INTERVAL for the upper bound of the rental_date 
  r.rental_date BETWEEN CAST('2005-05-01' AS DATE) 
  AND CAST('2005-05-01' AS DATE) + INTERVAL '90 day'
LIMIT 30;

Unnamed: 0,customer_name,title,rental_date,dayofweek,rental_days,past_due
0,TOMMY COLLAZO,FREAKY POCUS,2005-05-25 02:54:33+00:00,3,"{'days': 3, 'hours': 20, 'minutes': 46}",False
1,MANUEL MURRELL,GRADUATE LORD,2005-05-25 03:03:39+00:00,3,"{'days': 7, 'hours': 23, 'minutes': 9}",False
2,ANDREW PURDY,LOVE SUICIDES,2005-05-25 03:04:41+00:00,3,"{'days': 9, 'hours': 2, 'minutes': 39}",True
3,DELORES HANSEN,IDOLS SNATCHERS,2005-05-25 03:05:21+00:00,3,"{'days': 8, 'hours': 5, 'minutes': 28}",True
4,NELSON CHRISTENSON,MYSTIC TRUMAN,2005-05-25 03:08:07+00:00,3,"{'days': 2, 'hours': 2, 'minutes': 24}",False
5,CASSANDRA WALTERS,SWARM GOLD,2005-05-25 03:11:53+00:00,3,"{'days': 4, 'hours': 21, 'minutes': 23}",False
6,MINNIE ROMERO,LAWLESS VISION,2005-05-25 03:31:46+00:00,3,"{'days': 3, 'minutes': 2}",False
7,ELLEN SIMPSON,MATRIX SNOWMAN,2005-05-25 04:00:40+00:00,3,"{'days': 3, 'minutes': 22}",False
8,DANNY ISOM,HANGING DEEP,2005-05-25 04:02:21+00:00,3,"{'days': 6, 'hours': 22, 'minutes': 42}",True
9,APRIL BURNS,WHALE BIKINI,2005-05-25 04:09:02+00:00,3,"{'days': 8, 'hours': 20, 'minutes': 47}",True
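
To make the `past_due` logic easier to follow, the sketch below traces it on literals taken from the first row of the output (rented 2005-05-25 02:54:33, returned 2005-05-28 23:40:33). The allowed duration of 3 days is a hypothetical value chosen for illustration, not the film's actual `rental_duration`. One consequence worth noticing: because the age is truncated to whole days, a return that is late by less than a day is not flagged.

```sql
SELECT
  AGE(TIMESTAMP '2005-05-28 23:40:33',
      TIMESTAMP '2005-05-25 02:54:33')      AS rental_age,  -- 3 days 20:46:00
  DATE_TRUNC('day',
    AGE(TIMESTAMP '2005-05-28 23:40:33',
        TIMESTAMP '2005-05-25 02:54:33'))   AS whole_days,  -- 3 days
  3 * INTERVAL '1' day                      AS allowed,     -- 3 days (hypothetical duration)
  -- 3 days > 3 days is false, so this rental would not count as past due
  DATE_TRUNC('day',
    AGE(TIMESTAMP '2005-05-28 23:40:33',
        TIMESTAMP '2005-05-25 02:54:33'))
    > 3 * INTERVAL '1' day                  AS past_due;    -- false
```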


### Doing operations with the DateTime datatype.

In [5]:
-- prepare a CTE in order to make these columns usable for calculations and operations
WITH joined_film_incl_expected_return AS (

    SELECT 
        f.title,
        r.rental_date,
        f.rental_duration,
         -- Add the rental duration to the rental date
        INTERVAL '1' day * f.rental_duration + r.rental_date AS expected_return_date,
        r.return_date
    FROM dvdrentals.film AS f
    INNER JOIN dvdrentals.inventory AS i ON f.film_id = i.film_id
    INNER JOIN dvdrentals.rental AS r ON i.inventory_id = r.inventory_id
)

SELECT
    *,
	-- compute the overdue_by interval
    J.return_date - expected_return_date AS overdue_by
FROM joined_film_incl_expected_return AS J
-- Sort here: an ORDER BY inside the CTE does not guarantee the order of the final result
ORDER BY J.title;

Unnamed: 0,title,rental_date,rental_duration,expected_return_date,return_date,overdue_by
0,ACADEMY DINOSAUR,2005-08-03 00:13:10+00:00,6,2005-08-09 00:13:10+00:00,2005-08-12 01:35:10+00:00,"{'days': 3, 'hours': 1, 'minutes': 22}"
1,ACADEMY DINOSAUR,2005-08-02 04:47:19+00:00,6,2005-08-08 04:47:19+00:00,2005-08-03 04:02:19+00:00,"{'days': -5, 'minutes': -45}"
2,ACADEMY DINOSAUR,2005-07-10 17:07:31+00:00,6,2005-07-16 17:07:31+00:00,2005-07-16 17:03:31+00:00,{'minutes': -4}
3,ACADEMY DINOSAUR,2005-05-31 00:21:07+00:00,6,2005-06-06 00:21:07+00:00,2005-06-06 04:36:07+00:00,"{'hours': 4, 'minutes': 15}"
4,ACADEMY DINOSAUR,2005-08-23 03:56:37+00:00,6,2005-08-29 03:56:37+00:00,2005-08-25 22:58:37+00:00,"{'days': -3, 'hours': -4, 'minutes': -58}"
...,...,...,...,...,...,...
16039,ZORRO ARK,2005-06-16 01:50:32+00:00,3,2005-06-19 01:50:32+00:00,2005-06-17 05:02:32+00:00,"{'days': -1, 'hours': -20, 'minutes': -48}"
16040,ZORRO ARK,2005-08-01 14:11:25+00:00,3,2005-08-04 14:11:25+00:00,2005-08-06 08:52:25+00:00,"{'days': 1, 'hours': 18, 'minutes': 41}"
16041,ZORRO ARK,2005-06-16 04:52:51+00:00,3,2005-06-19 04:52:51+00:00,2005-06-20 23:33:51+00:00,"{'days': 1, 'hours': 18, 'minutes': 41}"
16042,ZORRO ARK,2005-07-07 18:22:45+00:00,3,2005-07-10 18:22:45+00:00,2005-07-08 19:10:45+00:00,"{'days': -1, 'hours': -23, 'minutes': -12}"
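
Negative `overdue_by` values mean the film came back early. A possible follow-up, sketched here under the same joins and column names as the query above, keeps only late returns and lists the longest delays first. It is one way to filter, not the course's solution.

```sql
SELECT
    f.title,
    r.rental_date,
    r.return_date,
    -- Positive interval: how long after the expected return date the film came back
    r.return_date - (r.rental_date + f.rental_duration * INTERVAL '1 day') AS overdue_by
FROM dvdrentals.film AS f
INNER JOIN dvdrentals.inventory AS i ON f.film_id = i.film_id
INNER JOIN dvdrentals.rental AS r ON i.inventory_id = r.inventory_id
-- Rows with a NULL return_date (not yet returned) drop out of this comparison
WHERE r.return_date > r.rental_date + f.rental_duration * INTERVAL '1 day'
ORDER BY overdue_by DESC
LIMIT 10;
```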


### Filtering using the 'Contains' operator `@>`, an alternative to the `ANY` construct

In [35]:
SELECT 
  title, 
  special_features 
FROM dvdrentals.film 
-- Filter where special_features contains 'Deleted Scenes'.
-- Casting both sides with ::text[] lets @> compare them as text arrays.
WHERE special_features::text[] @> ARRAY['Deleted Scenes']::text[]
LIMIT 10;

Unnamed: 0,title,special_features
0,ACADEMY DINOSAUR,"{""Deleted Scenes"",""Behind the Scenes""}"
1,ACE GOLDFINGER,"{Trailers,""Deleted Scenes""}"
2,ADAPTATION HOLES,"{Trailers,""Deleted Scenes""}"
3,AFRICAN EGG,"{""Deleted Scenes""}"
4,AGENT TRUMAN,"{""Deleted Scenes""}"
5,AIRPLANE SIERRA,"{Trailers,""Deleted Scenes""}"
6,ALABAMA DEVIL,"{Trailers,""Deleted Scenes""}"
7,ALADDIN CALENDAR,"{Trailers,""Deleted Scenes""}"
8,ALASKA PHANTOM,"{Commentaries,""Deleted Scenes""}"
9,ALI FOREVER,"{""Deleted Scenes"",""Behind the Scenes""}"


### Filtering using the `ANY` operator

In [34]:
SELECT
    title, 
    special_features 
FROM 
    dvdrentals.film 
WHERE 
    'Trailers' = ANY (special_features::text[])
-- By adding ::text[] after special_features, we are casting it as an array of text, which allows the ANY operator to work correctly.
LIMIT 5;

Unnamed: 0,title,special_features
0,ACE GOLDFINGER,"{Trailers,""Deleted Scenes""}"
1,ADAPTATION HOLES,"{Trailers,""Deleted Scenes""}"
2,AIRPLANE SIERRA,"{Trailers,""Deleted Scenes""}"
3,AIRPORT POLLOCK,{Trailers}
4,ALABAMA DEVIL,"{Trailers,""Deleted Scenes""}"
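
The two filters agree when a single value is tested, but `@>` requires every element on its right-hand side to be present. The sketch below compares them on array literals and adds the overlap operator `&&` for the "at least one of these" case; `&&` is not used elsewhere in this notebook and is included only for contrast.

```sql
SELECT
  -- Left array contains every element of the right array
  ARRAY['Trailers', 'Deleted Scenes'] @> ARRAY['Deleted Scenes']           AS contains_one,  -- true
  ARRAY['Trailers', 'Deleted Scenes'] @> ARRAY['Trailers', 'Commentaries'] AS contains_all,  -- false
  -- The value equals at least one element of the array
  'Trailers' = ANY (ARRAY['Trailers', 'Deleted Scenes'])                   AS any_match,     -- true
  -- The two arrays share at least one element
  ARRAY['Trailers', 'Deleted Scenes'] && ARRAY['Trailers', 'Commentaries'] AS overlaps;      -- true
```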


### Accessing data in an array the simple way, using an index and `=`

Note: writing `special_features::text[][1]` fails with `malformed array literal: "Trailers"` because `[1]` is parsed as part of the type name (`text[][1]`) rather than as a subscript, so the whole array ends up compared with the string. Wrapping the cast in parentheses applies the index to the array itself. The filter then matches only films whose first special feature is "Trailers".

In [14]:
-- Select the title and special features column 
SELECT 
  title, 
  special_features 
FROM dvdrentals.film
-- Use the array index of the special_features column (parenthesize the cast before subscripting)
WHERE (special_features::text[])[1] = 'Trailers';
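
Array subscripts in PostgreSQL are 1-based, and a subscript applied to a cast needs the cast wrapped in parentheses. A small sketch on a literal written in the same format as the `special_features` values shown in the outputs (aliases are illustrative):

```sql
SELECT
  ('{Trailers,"Deleted Scenes"}'::text[])[1]             AS first_feature,   -- 'Trailers'
  ('{Trailers,"Deleted Scenes"}'::text[])[2]             AS second_feature,  -- 'Deleted Scenes'
  -- A subscript past the end of the array yields NULL rather than an error
  ('{Trailers,"Deleted Scenes"}'::text[])[3]             AS missing,         -- NULL
  ARRAY_LENGTH('{Trailers,"Deleted Scenes"}'::text[], 1) AS n_features;      -- 2
```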

### Reading and doing calculations with Datetime:


In [32]:
SELECT
 	-- Select the rental and return dates
	rental_date,
	return_date,
 	-- Calculate the expected_return_date
	rental_date + interval '3 days' AS expected_return_date
FROM dvdrentals.rental
LIMIT 5;


Unnamed: 0,rental_date,return_date,expected_return_date
0,2005-05-25 02:54:33+00:00,2005-05-28 23:40:33+00:00,2005-05-28 02:54:33+00:00
1,2005-05-25 03:03:39+00:00,2005-06-02 02:12:39+00:00,2005-05-28 03:03:39+00:00
2,2005-05-25 03:04:41+00:00,2005-06-03 05:43:41+00:00,2005-05-28 03:04:41+00:00
3,2005-05-25 03:05:21+00:00,2005-06-02 08:33:21+00:00,2005-05-28 03:05:21+00:00
4,2005-05-25 03:08:07+00:00,2005-05-27 05:32:07+00:00,2005-05-28 03:08:07+00:00


### Let's see what data types we are dealing with:

In [31]:
-- Select the column names and data types from the INFORMATION_SCHEMA.COLUMNS view
 SELECT column_name, data_type
 FROM INFORMATION_SCHEMA.COLUMNS
 -- Filter by table name and column name
 WHERE table_name in ('film') 
 	AND column_name in ('title', 'special_features');

Unnamed: 0,column_name,data_type
0,special_features,text
1,title,text


In [27]:
-- Select the column names and data types from the INFORMATION_SCHEMA.COLUMNS view
 SELECT column_name, data_type
 FROM INFORMATION_SCHEMA.COLUMNS
 -- Filter by table name
 WHERE table_name in ('rental');

Unnamed: 0,column_name,data_type
0,rental_id,integer
1,rental_date,timestamp with time zone
2,inventory_id,integer
3,customer_id,integer
4,return_date,timestamp with time zone
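
`INFORMATION_SCHEMA` reports the stored column types; to check the type of an expression, `pg_typeof` can be applied directly in a query. A minimal sketch, assuming the `rental` columns listed above (aliases are illustrative):

```sql
SELECT
  pg_typeof(rental_date)                     AS rental_date_type,  -- timestamp with time zone
  -- Subtracting two timestamps yields an interval
  pg_typeof(return_date - rental_date)       AS difference_type,   -- interval
  -- Adding an interval to a timestamp keeps the timestamp type
  pg_typeof(rental_date + INTERVAL '3 days') AS sum_type           -- timestamp with time zone
FROM dvdrentals.rental
LIMIT 1;
```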


### Establishing the Version of PostgreSQL:

In [7]:
SELECT version();

Unnamed: 0,version
0,"PostgreSQL 13.10 on aarch64-unknown-linux-gnu,..."
