### PostgreSQL Fundamentals
### Relational Database

A relational database stores data in **tables** (the **Relational Data Model**). It's a common **misconception** that the word ```relational``` refers to relationships between tables. The word **relational** comes from **relational algebra**, where a **relation is a set of tuples**, i.e. a subset of the Cartesian product of the column domains. Thus, a table is a relation: each row is a tuple, and each column is an attribute with its own domain of allowed values.

**SQL** stands for Structured Query Language. It was designed for reading and modifying the data stored in tables.

### SELECT
```SELECT``` is the main statement for reading data. It is usually combined with clauses such as ```WHERE, GROUP BY, ORDER BY, LIMIT``` and others.

**Important**

- SQL keywords are **case-insensitive** (e.g. ```SELECT ~ select```, but it's recommended to use upper case). Unquoted identifiers are case-insensitive as well: PostgreSQL folds them to lower case
- ```;``` marks the end of an SQL statement and separates statements
- ```*``` selects all columns
- Calculations can be used in ```SELECT``` (e.g. ```SELECT salary/months```)
- To list all databases in the current db server use: ```SELECT datname FROM pg_database;```
- ```ALIASES``` can be defined. They exist only during the query execution (e.g. ```SELECT col_name AS alias```)
- ```ALIASES``` can't be used in ```WHERE``` or ```HAVING```, because these clauses are evaluated before the ```SELECT``` list
- In PostgreSQL ```ALIASES``` can be used in ```GROUP BY``` and ```ORDER BY```
- ```AS``` is optional
- ```DISTINCT``` - selects only unique rows (e.g. ```SELECT DISTINCT col_1, col_2, ...```)
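- ```NULL``` means an unknown value. **ANY** ordinary comparison with ```NULL``` (e.g. ```col = NULL```) returns ```NULL```, not ```FALSE```, and ```WHERE``` keeps only the rows whose condition is ```TRUE```. Use ```IS [NOT] NULL``` to test for it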
- ```NULL``` values are ignored in aggregation functions
- In ```GROUP BY``` ```NULL``` values are grouped into one group
- ```LIMIT n OFFSET m``` returns at most ```n``` rows after skipping the first ```m``` rows of the result
- ```FETCH FIRST n ROWS ONLY``` is similar to ```LIMIT```. However, ```FETCH``` is part of the SQL standard, so it is portable to other dbs
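
The points above about calculations, aliases, ```NULL``` and ```LIMIT ... OFFSET``` can be sketched in one query. The ```employees``` data is hypothetical and supplied inline with ```VALUES```, so no table has to exist:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH employees(name, salary, months) AS (
    VALUES ('Ann', 6000, 12),
           ('Bob', 3000, NULL),
           ('Cid', 3600, 9)
)
SELECT name,
       salary / months AS monthly_pay   -- calculation with an alias (integer division)
FROM employees
WHERE months IS NOT NULL                -- "months = NULL" would never be TRUE
ORDER BY monthly_pay DESC               -- the alias is allowed in ORDER BY
LIMIT 1 OFFSET 1;                       -- skip the first row, return the next one
```

With this data the query skips the highest ```monthly_pay``` and returns the second one.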


### WHERE 
This clause is used to provide conditions. Normally, the following operators are used inside ```WHERE```:
- ```col_name [NOT] IN (value_1, value_2, ...)``` - checks membership in a list; several columns can be compared at once with a row constructor, e.g. ```(col_1, col_2) IN ((value_1, value_2), ...)```
- ```col_name [NOT] BETWEEN low AND high``` - both bounds are inclusive
- ```col_name [NOT] LIKE 'ABC%'``` - ```%``` matches a sequence of zero or more characters, ```_``` matches a single character
- ```col_name [NOT] ILIKE 'pattern'``` - case-insensitive version of ```LIKE``` (a PostgreSQL extension). It uses the same ```%``` and ```_``` wildcards; regular expressions are matched with ```~``` or ```SIMILAR TO``` instead
- ```col_name IS [NOT] NULL``` - checks if a value is NULL

Logical operators have the following precedence (highest first):
1. ```NOT```
2. ```AND```
3. ```OR```

- ```WHERE``` is used to filter rows using conditions
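
A minimal sketch of several ```WHERE``` operators combined in one query. The ```products``` data is hypothetical and defined inline:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH products(name, category, price) AS (
    VALUES ('Apple juice', 'drinks', 3.50),
           ('apple pie',   'bakery', 6.00),
           ('Bread',       'bakery', 2.20),
           ('Cola',        'drinks', NULL)
)
SELECT name, category, price
FROM products
WHERE category IN ('drinks', 'bakery')
  AND price BETWEEN 2 AND 5        -- inclusive bounds; a NULL price never matches
  AND name ILIKE 'apple%';         -- matches 'Apple juice' and 'apple pie'

-- Precedence: AND binds tighter than OR, so
--   a OR b AND c   is read as   a OR (b AND c)
-- Use parentheses when the other grouping is intended.
```

Only ```Apple juice``` satisfies all three conditions here, because ```apple pie``` falls outside the price range.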

### ORDER BY
Allows sorting rows returned by a ```SELECT``` statement in ascending (```ASC```) or descending (```DESC```) order. By default ```ASC``` is used.

**Example**
```
...
ORDER BY col_name
OFFSET 10 ROWS
FETCH FIRST 10 ROWS ONLY
```

**Important**

- Alias can be used in ```ORDER BY```
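- If several columns are listed, rows are sorted by the first column, and each following column only breaks ties among rows with equal values in the previous ones
- ```NULL``` can be placed either first (```NULLS FIRST```) or last (```NULLS LAST```). By default PostgreSQL treats ```NULL``` as larger than any other value, so it comes last with ```ASC``` and first with ```DESC```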
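
As an illustration of multi-column sorting and ```NULL``` placement, here is a small sketch with hypothetical ```staff``` data defined inline:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH staff(name, dept, salary) AS (
    VALUES ('Ann', 'IT', 5000),
           ('Bob', 'IT', NULL),
           ('Cid', 'HR', 4000),
           ('Dee', 'HR', 4500)
)
SELECT name, dept, salary
FROM staff
ORDER BY dept ASC,                  -- first sort key
         salary DESC NULLS LAST;    -- tie-breaker inside each dept; NULLs pushed to the end
```

Without ```NULLS LAST``` the ```NULL``` salary would come first inside its department, because ```DESC``` puts ```NULL``` first by default.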

### JOINS
Joins are used to combine columns from one or more tables based on the values of common columns (e.g. a primary key in one table and a foreign key that references it in another). The following types of joins are distinguished:
- ```SELF-JOIN``` - joins a table with itself to compare rows within the same table (it is a technique, not a separate keyword)
- ```INNER JOIN``` - returns only the rows from the left table and the right table that match each other
- ```LEFT JOIN``` - returns all rows from the left table with the matching rows from the right table if available; where there is no match, the columns of the right table are ```NULL```
- ```RIGHT JOIN``` - the opposite of ```LEFT JOIN```: all rows from the right table are kept
- ```FULL JOIN``` - returns a result set that contains all rows from both left and right tables, with the matching rows from both sides if available
- ```CROSS JOIN``` - produces a ```Cartesian product```: every row of the left table (N rows) is paired with every row of the right table (M rows), giving N×M rows
- ```NATURAL JOIN``` - joins on all columns that have the same name in both tables. By default it is an ```INNER JOIN```, but it can be any of ```INNER, LEFT, RIGHT or FULL```

<br>
<img src="img/joins.png" alt="drawing" width="800"/>
<br>

**Cross Join Illustration**
<br>
<img src="img/cross-join.png" alt="drawing" width="300"/>
<br>

To define a join, the following syntax can be used:
```
SELECT ...
FROM ...
INNER JOIN table_name ON table_1.column_name = table_2.column_name
-- or, when the join column has the same name in both tables:
INNER JOIN table_name USING (column_name)
```
**Important**

- Column names are qualified with table names in ```JOIN``` because columns in both tables can have the same name
- ```USING``` can be used instead of ```ON``` when the join columns have the same name in both tables (e.g. ```INNER JOIN table_1 USING (customer_id)```)
- Word ```OUTER``` is optional and can be omitted (e.g. ```RIGHT OUTER JOIN ~ RIGHT JOIN```)
- ```SELF-JOIN``` is often used for hierarchical data or for comparing rows within the same table
- In a ```SELF-JOIN``` the two references to the table must have **different table aliases**
- ```FULL JOIN``` combines the results of both ```LEFT``` and ```RIGHT JOIN```
- In **MySQL** ```FULL JOIN``` doesn't exist. It is emulated with ```LEFT JOIN``` + ```UNION``` + ```RIGHT JOIN```
- Using ```NATURAL JOIN``` isn't recommended: it matches columns by name implicitly, which may lead to unexpected results
<br><br>
- More info: https://www.postgresqltutorial.com/postgresql-self-join/
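
A minimal sketch of a ```LEFT JOIN``` with ```USING``` and of a self-join with table aliases. All table and column names below are hypothetical, and the data is defined inline:

```sql
-- LEFT JOIN: every customer is kept, order columns are NULL where nothing matches.
WITH customers(customer_id, name) AS (
    VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid')
),
orders(order_id, customer_id, amount) AS (
    VALUES (10, 1, 25.00), (11, 1, 40.00), (12, 2, 15.00)
)
SELECT c.name, o.order_id, o.amount
FROM customers AS c
LEFT JOIN orders AS o USING (customer_id)
ORDER BY c.name, o.order_id;

-- Self-join: the same table is referenced twice under different aliases
-- to pair each employee with their manager.
WITH employees(id, name, manager_id) AS (
    VALUES (1, 'Ann', NULL), (2, 'Bob', 1), (3, 'Cid', 1)
)
SELECT e.name AS employee, m.name AS manager
FROM employees AS e
LEFT JOIN employees AS m ON m.id = e.manager_id
ORDER BY e.id;
```

In the first query ```Cid``` has no orders and still appears, with ```NULL``` in the order columns. Replacing ```LEFT JOIN``` with ```INNER JOIN``` would drop that row.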

### GROUP BY
```GROUP BY``` divides the rows returned by a query into groups. The following aggregate functions can be applied:
- ```MIN()``` - finds the min group value
- ```MAX()``` - finds the max group value
- ```COUNT(col_name)``` - counts the non-```NULL``` values of a column in a group
- ```COUNT(*)``` - counts all rows in a group, including rows with ```NULL``` values
- ```SUM()``` - returns the group sum
- ```AVG()``` - returns the group avg

```NULL``` values are ignored in aggregation functions (```COUNT(*)``` is the exception because it counts rows, not values)

**GROUPING SETS**

Allows defining multiple grouping sets in the same query. Let's define ```(col_1, col_2), (col_1), (col_2) and ()``` grouping sets:

```
GROUP BY GROUPING SETS (
    (col_1, col_2),
    (col_1),
    (col_2),
    ()
);
```

**GROUPING**

Returns bit 0 if the argument is a member of the current grouping set and 1 otherwise. Example:
```
SELECT GROUPING(col_name or expression),
GROUPING(col_name or expression),
...
```

- More info: https://www.postgresqltutorial.com/postgresql-grouping-sets/
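
To make ```GROUPING SETS``` and ```GROUPING``` more concrete, here is a small sketch. The ```sales``` data and its column names are hypothetical and defined inline:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH sales(brand, segment, quantity) AS (
    VALUES ('ABC', 'Premium', 100),
           ('ABC', 'Basic',   200),
           ('XYZ', 'Premium', 100),
           ('XYZ', 'Basic',   300)
)
SELECT brand,
       segment,
       GROUPING(brand)   AS brand_not_grouped,    -- 0 = brand is part of this grouping set
       GROUPING(segment) AS segment_not_grouped,  -- 1 = segment was aggregated away
       SUM(quantity)     AS total
FROM sales
GROUP BY GROUPING SETS (
    (brand, segment),   -- one row per brand and segment
    (brand),            -- subtotal per brand
    (segment),          -- subtotal per segment
    ()                  -- grand total
)
ORDER BY brand, segment;
```

A column that is not part of the current grouping set is shown as ```NULL```, and ```GROUPING``` is what distinguishes such a subtotal row from a row where the data itself is ```NULL```.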

**CUBE**

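Generates grouping sets for all possible combinations of the listed columns. For ```n``` columns it creates ```2^n combinations``` (e.g. ```CUBE (col_1, col_2)``` is equal to the grouping sets ```(col_1, col_2), (col_1), (col_2), ()```)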

- More info: https://www.postgresqltutorial.com/postgresql-cube/

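**ROLLUP**

Assumes a **hierarchy** among the columns and generates only the grouping sets that follow this hierarchy: ```ROLLUP (col_1, col_2)``` produces ```(col_1, col_2), (col_1), ()```. Commonly used to calculate the subtotals and the grand total of hierarchical data such as sales by year > quarter > month.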
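
A minimal sketch of ```ROLLUP``` on a year > quarter hierarchy. The ```sales``` data and column names are hypothetical and defined inline:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH sales(sale_year, sale_quarter, amount) AS (
    VALUES (2023, 1, 10),
           (2023, 2, 20),
           (2024, 1, 30)
)
SELECT sale_year, sale_quarter, SUM(amount) AS total
FROM sales
GROUP BY ROLLUP (sale_year, sale_quarter)
-- same as GROUPING SETS ((sale_year, sale_quarter), (sale_year), ())
ORDER BY sale_year, sale_quarter;
```

Rows with ```NULL``` in ```sale_quarter``` are the per-year subtotals, and the row with ```NULL``` in both columns is the grand total. Replacing ```ROLLUP``` with ```CUBE``` would add one more grouping set, ```(sale_quarter)```.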


**Important**

- ```GROUP BY``` allows applying aggregation functions on groups
- **NULL values** are combined into a single group 
- ```HAVING``` is normally used together with ```GROUP BY``` to **filter groups**
- Several columns can be defined in ```GROUP BY```
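
The following sketch shows how ```NULL``` behaves in grouping and aggregation, and how ```HAVING``` filters groups. The ```sales``` data is hypothetical and defined inline:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH sales(region, amount) AS (
    VALUES ('north', 100),
           ('north', NULL),
           ('south', 50),
           (NULL,    70),
           (NULL,    30)
)
SELECT region,
       COUNT(*)      AS n_rows,      -- counts rows, NULL amounts included
       COUNT(amount) AS n_amounts,   -- counts only non-NULL amounts
       SUM(amount)   AS total        -- NULL amounts are ignored
FROM sales
GROUP BY region                      -- the two NULL regions form one group
HAVING SUM(amount) >= 100            -- filters groups, not rows
ORDER BY region;
```

Here the ```south``` group is removed by ```HAVING```, while the ```north``` group and the ```NULL``` group both reach a total of 100 and are kept.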

### UNION
Combines the results of several queries. However, the queries must satisfy the following rules:
- The number of columns and their order must match
- Data types must be compatible

**Important**

- By default ```UNION``` removes all duplicates. To retain duplicates, use ```UNION ALL```
- ```ORDER BY``` can be used only once: it is placed after the last query and sorts the combined result
- ```UNION``` combines rows vertically 
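
A small sketch of the difference between ```UNION``` and ```UNION ALL```, using inline ```VALUES``` lists instead of real tables (the aliases ```a```, ```b``` and ```n``` are hypothetical):

```sql
-- UNION removes duplicates: the result is 1, 2, 3.
SELECT n FROM (VALUES (1), (2), (2)) AS a(n)
UNION
SELECT n FROM (VALUES (2), (3)) AS b(n)
ORDER BY n;   -- a single ORDER BY at the end sorts the combined result

-- UNION ALL keeps duplicates: the result is 1, 2, 2, 2, 3.
SELECT n FROM (VALUES (1), (2), (2)) AS a(n)
UNION ALL
SELECT n FROM (VALUES (2), (3)) AS b(n)
ORDER BY n;
```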

### INTERSECT
It combines result sets of two or more ```SELECT``` statements into a single result set. It returns any rows that are available in both result sets. Queries have the same restrictions as for ```UNION```. ```ORDER BY``` must be placed at the end of a query.

Normally, it's used to get the rows that are **present in both result sets**

### EXCEPT
Returns distinct rows from the first (left) query that are not in the output of the second (right) query. Queries have the same restrictions as for ```UNION```. ```ORDER BY``` must be placed at the end of a query.

Normally, it's used to get the rows from the first query that don't appear in the result set of the second query.
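
```INTERSECT``` and ```EXCEPT``` can be compared on the same two inputs. The sketch below uses hypothetical inline sets ```a``` and ```b```:

```sql
-- Rows present in both sets: 2, 3.
WITH a(n) AS (VALUES (1), (2), (3)),
     b(n) AS (VALUES (2), (3), (4))
SELECT n FROM a
INTERSECT
SELECT n FROM b
ORDER BY n;

-- Rows of the first set that are missing from the second: 1.
WITH a(n) AS (VALUES (1), (2), (3)),
     b(n) AS (VALUES (2), (3), (4))
SELECT n FROM a
EXCEPT
SELECT n FROM b
ORDER BY n;
```

```EXCEPT``` is not symmetric: swapping the two queries would return 4 instead of 1.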

**Visualization of UNION, INTERSECT and EXCEPT**
<br>
<img src="img/union-except-intersect.jpg" alt="drawing" width="600"/>
<br>

### SUBQUERY
It is a query nested inside another query. Normally, subqueries are used as values either to check a condition or to be passed as values for functions.

**ANY Operator**

Compares a value to a set of values returned by a subquery. It has the following syntax: ```expression operator ANY(subquery)```.
This operator follows these rules:
- A subquery must return **exactly one column**
- ```ANY``` operator must be preceded by one of the following operators: ```=, <>, <, <=, > and >=```
- ```ANY``` returns ```TRUE``` if any value of the subquery meets the condition

**ALL Operator**

Is similar to ```ANY``` but here a value must **meet a condition for all values in a subquery.** The syntax:

```expression comparison_operator ALL (subquery)```
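
The difference between ```ANY``` and ```ALL``` can be seen by applying both to the same subquery. The ```scores``` and ```thresholds``` data below is hypothetical and defined inline:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH scores(player, points) AS (
    VALUES ('Ann', 10), ('Bob', 20), ('Cid', 30)
),
thresholds(points) AS (
    VALUES (15), (25)
)
SELECT player,
       points > ANY (SELECT points FROM thresholds) AS beats_any,  -- above at least one threshold
       points > ALL (SELECT points FROM thresholds) AS beats_all   -- above every threshold
FROM scores
ORDER BY player;
```

With this data ```Ann``` beats no threshold, ```Bob``` beats only the lower one, and ```Cid``` beats both.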

**EXISTS**

Tests for the existence of rows in a subquery. If the subquery returns at least one row, the result of ```EXISTS``` **is true.** If the subquery returns no rows, the result of ```EXISTS``` **is false.** It is often used with a correlated subquery.

If the subquery returns a row that contains ```NULL```, ```EXISTS``` still returns true, because only the existence of rows is checked, not their values. The syntax:

```[NOT] EXISTS(subquery)```
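
A minimal sketch of ```EXISTS``` with a correlated subquery. The ```customers``` and ```orders``` data is hypothetical and defined inline:

```sql
-- Customers that have at least one order.
WITH customers(customer_id, name) AS (
    VALUES (1, 'Ann'), (2, 'Bob')
),
orders(order_id, customer_id) AS (
    VALUES (10, 1)
)
SELECT c.name
FROM customers AS c
WHERE EXISTS (
    SELECT 1                                  -- the selected value is irrelevant
    FROM orders AS o
    WHERE o.customer_id = c.customer_id       -- refers to the outer row (correlated)
);

-- A row that contains only NULL is still a row, so this is true.
SELECT EXISTS (SELECT NULL) AS null_row_counts;
```

Changing ```EXISTS``` to ```NOT EXISTS``` in the first query would return the customers without orders instead.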

**Important**

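- A non-correlated subquery is evaluated before the outer query uses its result; a correlated subquery is conceptually evaluated once for each row of the outer query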

**Order of SQL Commands Execution**

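Clauses that are evaluated before ```SELECT``` (e.g. ```WHERE``` and ```HAVING```) **can't use column aliases**. PostgreSQL makes an exception for ```GROUP BY```, where an alias is accepted.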
<br>
<img src="img/operators_order.png" alt="drawing" width="100"/>
<br>

**Example** (the order in which the clauses are written in a query)
```
SELECT ...
FROM ...
WHERE ...
GROUP BY ...
HAVING ... 
ORDER BY ...
LIMIT ...
```
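
Because ```WHERE``` is evaluated before the ```SELECT``` list, an alias defined in ```SELECT``` is not visible there. One common workaround, sketched below with hypothetical inline data, is to compute the alias in a subquery and filter in the outer query:

```sql
-- Hypothetical sample data, defined inline for illustration only.
WITH employees(name, salary, months) AS (
    VALUES ('Ann', 6000, 12),
           ('Bob', 2400, 6)
)
-- Writing "WHERE monthly_pay > 400" next to the expression itself would fail,
-- because the alias does not exist yet when WHERE is evaluated.
SELECT name, monthly_pay
FROM (
    SELECT name, salary / months AS monthly_pay
    FROM employees
) AS t
WHERE monthly_pay > 400;   -- here monthly_pay is a real column of the subquery
```

Repeating the expression in ```WHERE``` (```WHERE salary / months > 400```) is an alternative when the calculation is short.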