# Advanced SQL I: Special Functions
_**Author**: Boom Devahastin Na Ayudhya_
***

Throughout this entire session, we'll be running the queries in PostgreSQL. This Jupyter Notebook will just be a written record of what we've learned so that you'll have all of these functions in one location.

Note that **THIS IS BY NO MEANS AN EXHAUSTIVE LIST** -- I have cherry-picked the ones that are commonly asked in interviews and/or useful on the job from my experience.

### Preparation

You should have already downloaded [PostgreSQL](https://www.enterprisedb.com/downloads/postgres-postgresql-downloads). Make sure you have **pgAdmin 4** set up and that you've loaded the `GoT Schemas`.

## Contents
**I. String Manipulation**
- [`LOWER()`](#LOWER())
- [`UPPER()`](#UPPER())
- [`INITCAP()`](#INITCAP())
- [`LENGTH()`](#LENGTH())
- [`TRIM()`](#TRIM())
- [`SUBSTRING()`](#SUBSTRING())
- [Concatenation Methods](#Concatenation)
- [`REPLACE()`](#REPLACE())
- [`COALESCE()`](#COALESCE())

**II. Conditionals**
- [Boolean Statements](#Boolean-Statements)
- [`CASE WHEN`](#CASE-WHEN)

**III. Date-Time Manipulation**
- [Type Conversion](#Type-Conversion)
- [`EXTRACT()`](#EXTRACT())

## I. String Manipulation

### `LOWER()`
This is the same as the `.lower()` method for strings in Python: it converts every letter in a string to lower case.

_Example_: Convert all letters of the string `HeLlO, wOrLd!` to lower case
```MySQL
SELECT LOWER('HeLlO, wOrLd!')
```

**DISCUSS:** Why do you think this can be useful? Does case matter in SQL?


**THINK:** Consider the following queries. Which of these will run? <br>
(A) `SELECT first_name FROM people WHERE first_name = 'eddard'` <br>
(B) `select first_name from people where first_name = 'eddard'` <br>
(C) `SELECT first_name FROM people WHERE first_name = 'Eddard'` <br>
(D) `select first_name from people where first_name = 'Eddard'`

**EXERCISE 1:** Write a query that returns the first name of all living members of the ruling family of winterfell, but make sure the letters are all in lower case.

_Answer:_

```MySQL
SELECT LOWER(p.first_name)
FROM people AS p
    INNER JOIN houses AS h ON p.house = h.name
WHERE h.domain = 'winterfell' AND p.alive = 1
```
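
One common use of `LOWER()` is case-insensitive matching. As a rough sketch (assuming, as in the exercises, that `people.first_name` is a text column), lowering the column before comparing lets the filter match no matter how the name was stored:

```MySQL
-- Matches 'eddard', 'Eddard', 'EDDARD', and so on
SELECT p.first_name
FROM people AS p
WHERE LOWER(p.first_name) = 'eddard'
```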

### `UPPER()`
For completeness, this is the same as the `.upper()` method for strings in Python: it capitalizes every letter in a string.

_Example_: Capitalize all letters of the string `Hello, world!`
```MySQL
SELECT UPPER('Hello, world!')
```

**EXERCISE 2:** Write a query that capitalizes every letter of every unique noble house's domain from the `houses` table.

_Answer:_

```MySQL
SELECT DISTINCT UPPER(h.domain)
FROM houses AS h
```

### `INITCAP()`
This is closest to the `.title()` method for strings in Python: it converts the first letter of each word to upper case and the remaining letters to lower case.

**EXERCISE 3:** Write a SQL query that returns the first name and house of all characters whose first name begins with the prefix "ae-" or "Ae-", but make sure that only the first letter is capitalized in both of those columns.

_Answer:_

```MySQL
SELECT INITCAP(c.first_name), INITCAP(c.house)
FROM people AS c
WHERE c.first_name ILIKE 'ae%'
```
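
To see what `INITCAP()` does with mixed-case, multi-word input, here is a minimal sketch on a string literal (the literal is just an illustration, not from the GoT data). Each word's first letter is capitalized and every other letter is lowered:

```MySQL
SELECT INITCAP('hello, wOrLd!')   -- returns 'Hello, World!'
```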

### `LENGTH()`
This is the same as the `len()` function in Python. However, since we don't have lists or tuples in SQL, it only applies to strings, where it returns the number of characters.
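
As a quick illustration on a string literal (spaces and punctuation count as characters):

```MySQL
SELECT LENGTH('Hello, world!')   -- returns 13
```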

**EXERCISE 4:** Write a query that displays the first name and house of characters that are alive, but only if their house is at least 6 characters long. (_Hint: You'll need to group by `id`_)

_Answer:_

```MySQL
SELECT p.first_name, p.house
FROM people AS p
WHERE p.alive = 1
GROUP BY p.id
HAVING LENGTH(p.house) >= 6
```

### `TRIM()`
This is the same as the `.strip()` method for strings in Python that eliminates leading and trailing white spaces.

_Example:_ Write a query that strips out the white space from the string `'     Hello, world!     '`

```MySQL
SELECT TRIM('     Hello, world!     ')
```
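
If you only want to strip one side, or strip a character other than a space, PostgreSQL also provides `LTRIM()`, `RTRIM()`, and the `TRIM(BOTH ... FROM ...)` form. A small sketch on string literals (the `*` padding is made up for illustration):

```MySQL
SELECT LTRIM('     Hello, world!     ') AS left_trimmed,            -- 'Hello, world!     '
       RTRIM('     Hello, world!     ') AS right_trimmed,           -- '     Hello, world!'
       TRIM(BOTH '*' FROM '***Hello, world!***') AS stars_removed   -- 'Hello, world!'
```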

### `SUBSTRING()`
Python doesn't have a dedicated substring function because we can just slice the string directly. If you're familiar with R though, you'll recognize this as similar to the `substr()` function. Note that positions in SQL strings start at 1, not 0.

Syntax for this function:

```MySQL
SELECT SUBSTRING(string_column FROM <start_position> FOR <num_characters_ahead>)
```
OR
```MySQL
SELECT SUBSTRING(string_column, <start_position>, <num_characters_ahead>)
```

**Example #1:**
```MySQL
SELECT SUBSTRING('Hello there, friend! Hehe.' FROM 1 FOR 5)
```
OR
```MySQL
SELECT SUBSTRING('Hello there, friend! Hehe.', 1, 5)
```
will return `'Hello'`

**Example #2:**
```MySQL
SELECT SUBSTRING('Hello there, friend! Hehe.' FROM 14)
```
OR
```MySQL
SELECT SUBSTRING('Hello there, friend! Hehe.', 14)
```
will return `'friend! Hehe.'`
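
The same syntax works on a column. As a sketch, assuming the `people` table from the `GoT Schemas` with a text column `first_name`, this pulls the first three letters of every first name (the alias `first_three` is just an illustrative name):

```MySQL
SELECT p.first_name,
       SUBSTRING(p.first_name FROM 1 FOR 3) AS first_three
FROM people AS p
```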

### Concatenation

This is the equivalent of string concatenation in Python using `+`. The `+` in Python is replaced by `||` in PostgreSQL. Alternatively, you can use the `CONCAT()` function.

_Example:_ Write a query that prints every character's full name (i.e. first name then house)
```MySQL
SELECT INITCAP(p.first_name) || ' ' || INITCAP(p.house)
FROM people p
```

**EXERCISE 5:** Write a query that automatically generates the sentence `<bannermen>'s army has <size> soldiers.`

_Answer:_
```MySQL
SELECT INITCAP(b.name) || '''s army has ' || size || ' soldiers.'
FROM bannermen b
```
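
The two concatenation methods differ in how they treat `NULL`: in PostgreSQL, `||` returns `NULL` if any operand is `NULL`, while `CONCAT()` skips `NULL` arguments. A minimal sketch on literals (the names are only for illustration):

```MySQL
SELECT CONCAT('Jon', ' ', NULL, 'Snow') AS with_concat,   -- 'Jon Snow'
       'Jon' || ' ' || NULL || 'Snow'   AS with_pipes     -- NULL
```

This matters when a column such as `house` or `nickname` may be missing for some rows.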

### `REPLACE()`

This is the equivalent of the `.replace()` method for strings in Python and the `gsub()` function in R.

_Example:_
```MySQL
SELECT house,
       REPLACE(house, 'lannister', 'Evil Ducks') AS new_house -- replace every 'lannister' with 'Evil Ducks' in the house column
FROM people
```

Does the function work when replacing `NULL` values though? Try this and let me know what you see
```MySQL
SELECT first_name,
       REPLACE(nickname, NULL, 'missing') AS new_nickname
FROM people
```

### `COALESCE()`
This is an extremely powerful function that lets us handle missing values on a column-by-column basis. It returns the first of its arguments that is not `NULL`.

The syntax is pretty straightforward for this one:
```MySQL
COALESCE(<column_name>, <fill_value>)
```
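
`COALESCE()` is not limited to two arguments: it checks them from left to right and returns the first one that is not `NULL`. A minimal sketch on literals:

```MySQL
SELECT COALESCE(NULL, NULL, 'third', 'fourth')   -- returns 'third'
```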

Alright, your turn!

**EXERCISE 6**: Write a query that prints every character's full name in one column and their nickname in another, but make sure to replace all `NULL` nicknames with `¯\_(ツ)_/¯`.

_Answer:_
```MySQL
SELECT INITCAP(first_name) || ' ' || INITCAP(house) AS full_name,
       COALESCE(nickname, '¯\_(ツ)_/¯') AS cleaned_nickname
FROM people
```

_____
## II. Conditionals

### Boolean Statements

**Review Discussion:** What is a Boolean statement? Can you think of an example where you've used this before?

We can also include Booleans to create dummy variables in SQL on the fly.

_Example:_
```MySQL
SELECT  b.name,
        b.size,
        b.size >= 30 AS "IsLarge"
FROM bannermen AS b
```
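
Because a Boolean can be cast to an integer (`TRUE` becomes 1, `FALSE` becomes 0), a dummy variable like this can also be summed to count matching rows. A sketch, assuming `bannermen.size` is numeric as in the example above (the aliases are illustrative):

```MySQL
SELECT SUM((b.size >= 30)::int) AS num_large,   -- rows where the condition is TRUE
       COUNT(*) AS num_total                    -- all rows
FROM bannermen AS b
```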

### `CASE WHEN`
This is the equivalent of if-elif-else statements, except embedded into a query. This takes Boolean Statements to the next level by allowing you to customize what happens on a case-by-case basis.

_Example_: Write a query that groups bannermen army sizes into 'yuge' (35+), 'medium' (25-34), 'smol' (< 25) 

```MySQL
SELECT  b.name,
        b.size,
        CASE WHEN b.size >= 35 THEN 'yuge'                 -- if
             WHEN b.size BETWEEN 25 AND 34 THEN 'medium'   -- elif
             ELSE 'smol'                                   -- else
             END AS "size_group"                           -- end it! (and rename if you want)
FROM bannermen AS b
```
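
A `CASE WHEN` column can also be used for grouping. As a sketch built on the same thresholds, this counts how many bannermen fall into each size group (PostgreSQL allows `GROUP BY` to refer to the output alias; `num_bannermen` is an illustrative name):

```MySQL
SELECT  CASE WHEN b.size >= 35 THEN 'yuge'
             WHEN b.size BETWEEN 25 AND 34 THEN 'medium'
             ELSE 'smol'
             END AS size_group,
        COUNT(*) AS num_bannermen
FROM bannermen AS b
GROUP BY size_group
```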

## III. Date-Time Manipulation

### Type Conversion
_(Complete documentation here: https://www.postgresql.org/docs/8.1/functions-formatting.html)_

#### `to_timestamp()`
If you have a string that contains both a date and a time and you want to convert it to a timestamp:
```MySQL
SELECT to_timestamp('2019 May 13 15:00:05', 'YYYY-MON-DD HH24:MI:SS')
```

#### `to_date()`
If you have a string that you want to convert to a date without the time component:
```MySQL
SELECT to_date('2019 May 13 14:00:58', 'YYYY-MON-DD')
```
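
When the string is already in ISO format (`YYYY-MM-DD`), a plain cast is often enough, and `to_char()` goes the other way, from a date to a formatted string. A small sketch on literals (the column aliases are illustrative):

```MySQL
SELECT '2019-05-13'::date AS cast_shorthand,                            -- PostgreSQL-specific cast syntax
       CAST('2019-05-13 15:00:05' AS timestamp) AS cast_standard,       -- standard SQL cast
       to_char(DATE '2019-05-13', 'DD Mon YYYY') AS formatted           -- '13 May 2019'
```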

#### `current_date`
You can use this to pull the current date from the database server's clock (your own computer, if you're running PostgreSQL locally) and manipulate it as desired.
```MySQL
SELECT current_date
```

**EXERCISE 7:** Write a query that returns what the date was 21 days ago

_Answer:_
    
```MySQL
SELECT current_date - 21
```
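
Adding or subtracting an integer from a date shifts it by that many days. A sketch of a few related forms in PostgreSQL, with the result type noted in each comment (the aliases and the reference date are illustrative):

```MySQL
SELECT current_date + 7 AS one_week_from_now,                      -- date
       current_date - INTERVAL '21 days' AS three_weeks_ago_ts,    -- timestamp (at midnight)
       current_date - DATE '2019-05-13' AS days_since_may_13       -- integer number of days
```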

### `EXTRACT()`
_(More datetime manipulation functions: https://www.postgresql.org/docs/9.1/functions-datetime.html)_

If you want to extract certain parts of a datetime object, this function is MAGICAL!

```MySQL
SELECT current_timestamp AS today,
	   EXTRACT(day from current_date) AS "Day",
	   EXTRACT(month from current_date) AS "Month",
	   EXTRACT(year from current_timestamp) AS "Year",
	   EXTRACT(hour from current_timestamp) AS "Hour",
	   EXTRACT(minute from current_timestamp) AS "Minute"
```
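
`EXTRACT()` supports other fields as well, such as `dow` (day of week, where Sunday is 0), and the related `DATE_TRUNC()` function rounds a timestamp down to the start of a unit. A minimal sketch on fixed literals so the output is reproducible:

```MySQL
SELECT EXTRACT(dow FROM DATE '2019-05-13') AS day_of_week,                    -- 1 (a Monday)
       DATE_TRUNC('month', TIMESTAMP '2019-05-13 15:00:05') AS month_start    -- 2019-05-01 00:00:00
```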

### Challenge: Interview Questions
Lyft recently acquired the rights to add CitiBike to its app as part of its Bikes & Scooters business. You are a Data Scientist studying a `rides` table containing data on completed trips taken by riders, and a `deployed_bikes` table which contains information on the locations where each unique bike is deployed (i.e. where it is stationed).

**`rides`** schema: 
- `ride_id`: int **[PRIMARY KEY]**
- `bike_id`: int
- `ride_datetime`: string
- `duration`: int

**`deployed_bikes`** schema:
- `bike_id`: int **[PRIMARY KEY]**
- `deploy_location`: string

**EXERCISE 8: For the last week, find the number of rides that occurred on each date, ordered from most recent to least recent**

_Answer:_
```MySQL
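SELECT  to_date(ride_datetime, 'YYYY-MON-DD') AS ride_date,  -- ride_datetime is stored as a string
        COUNT(ride_id) AS num_rides
FROM rides
WHERE to_date(ride_datetime, 'YYYY-MON-DD') BETWEEN (current_date - 7) AND current_date
GROUP BY to_date(ride_datetime, 'YYYY-MON-DD')
ORDER BY ride_date DESC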
```

**EXERCISE 9: Which deployment location did the best (i.e. had the most completed rides) over the past week?**

_Answer:_
    
```MySQL
SELECT  d.deploy_location,
        COUNT(r.ride_id)
FROM rides AS r
    INNER JOIN deployed_bikes AS d ON r.bike_id = d.bike_id
WHERE to_date(r.ride_datetime, 'YYYY-MON-DD') BETWEEN (current_date - 7) AND current_date
GROUP BY d.deploy_location
ORDER BY COUNT(ride_id) DESC
LIMIT 1
```