# PostgreSQL (PSQL) Database
We store "data models" as tables. A table can be thought of as a single CSV file: inventors live in one table and patents live in another. When a query needs both, we "join" the tables at query time.
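
As a rough illustration of such a join, here is a minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database instead of PostgreSQL, purely so it runs without a server. The `patents` table, its columns, and all row values are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used only so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two separate tables. The patents table and its columns are hypothetical.
cur.execute(
    "CREATE TABLE inventors (key_id VARCHAR PRIMARY KEY, first_name VARCHAR, last_name VARCHAR)"
)
cur.execute(
    "CREATE TABLE patents (patent_id VARCHAR PRIMARY KEY, title VARCHAR, inventor_id VARCHAR)"
)

cur.execute("INSERT INTO inventors VALUES ('1', 'test first name', 'test last name')")
cur.execute("INSERT INTO patents VALUES ('p1', 'test patent A', '1')")
cur.execute("INSERT INTO patents VALUES ('p2', 'test patent B', '1')")

# Join the two tables at query time on the shared inventor key.
cur.execute(
    """
    SELECT inventors.first_name, inventors.last_name, patents.title
    FROM inventors
    JOIN patents ON patents.inventor_id = inventors.key_id
    ORDER BY patents.patent_id
    """
)
joined_rows = cur.fetchall()
conn.close()

for row in joined_rows:
    print(row)
```

Each result row combines columns from both tables, even though the data is stored separately.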

For the database itself we use PostgreSQL, which supports complex queries and large-scale operations (in preparation for when the database is used extensively).


Read more about PostgreSQL vs MySQL here:

https://www.fivetran.com/blog/postgresql-vs-mysql

**Most of what we learn here applies to all SQL databases**

## Creating a table
```sql
CREATE TABLE IF NOT EXISTS inventors (
	key_id VARCHAR PRIMARY KEY,
	first_name VARCHAR NOT NULL,
	last_name VARCHAR NOT NULL,
	location_city VARCHAR NOT NULL,
	location_state VARCHAR NOT NULL,
	location_longitude FLOAT,
	location_latitude FLOAT
);
```
- `CREATE TABLE` indicates that we are creating a new table named `inventors`
- `IF NOT EXISTS` tells the database server to skip the creation if the table already exists (avoiding an error)
- Each comma-separated entry inside the parentheses defines one column, in the format:
	- `column_name` `data_type` `column_constraint`
		- `PRIMARY KEY` marks the column as the table's primary key. The database creates an index for it, which speeds up searching by the primary key
			- `PRIMARY KEY` also implies `NOT NULL` and `UNIQUE`
		- `UNIQUE` indicates that a value in the column must not appear more than once in that column.
		- `NOT NULL` indicates that the column must not be `NULL` (empty)
		- It's possible to include a length limit for the data type. For example, `VARCHAR(64)` limits the column to 64 characters. This is useful for rejecting over-long values when the maximum length is predictable. Note that in PostgreSQL the limit does not make the column any faster.
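
To see the constraints listed above in action, here is a minimal sketch. It runs the same `CREATE TABLE` statement against an in-memory SQLite database through Python's built-in `sqlite3` module. This is an assumption made so the example runs without a PostgreSQL server. SQLite is looser than PostgreSQL in some respects (for example, it does not enforce `VARCHAR(64)` lengths), so treat it as an illustration only.

```python
import sqlite3

# In-memory SQLite database so the sketch runs without a PostgreSQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

create_sql = """
CREATE TABLE IF NOT EXISTS inventors (
    key_id VARCHAR PRIMARY KEY,
    first_name VARCHAR NOT NULL,
    last_name VARCHAR NOT NULL,
    location_city VARCHAR NOT NULL,
    location_state VARCHAR NOT NULL,
    location_longitude FLOAT,
    location_latitude FLOAT
)
"""
cur.execute(create_sql)
cur.execute(create_sql)  # the second run is skipped thanks to IF NOT EXISTS

insert_sql = """
INSERT INTO inventors (key_id, first_name, last_name, location_city, location_state)
VALUES (?, ?, ?, ?, ?)
"""
cur.execute(insert_sql, ("1", "test first name", "test last name", "test city", "test state"))

# Rows that break a constraint are rejected with an IntegrityError.
attempts = {
    "duplicate key_id": ("1", "other first name", "other last name", "test city", "test state"),
    "missing first_name": ("2", None, "test last name", "test city", "test state"),
}
violations = []
for label, row in attempts.items():
    try:
        cur.execute(insert_sql, row)
    except sqlite3.IntegrityError:
        violations.append(label)

cur.execute("SELECT COUNT(*) FROM inventors")
row_count = cur.fetchone()[0]
conn.close()

print(violations, row_count)
```

Only the first row is stored. With PostgreSQL, a failed statement also aborts the current transaction, so a rollback is needed before running further statements on the same connection.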

More details can be found here:

https://www.postgresqltutorial.com/postgresql-create-table/

https://www.postgresql.org/docs/14/sql-createtable.html

## Inserting data to the table

```sql
INSERT INTO inventors(
    key_id, first_name, last_name, location_city, location_state, location_longitude, location_latitude
)
VALUES (
    '1', 'test first name', 'test last name', 'test city', 'test state', 0.0, 0.0
);
```
- `INSERT INTO` indicates that we are inserting a new row into the table named `inventors`
- The list after `inventors` names the columns that we want to fill
- The list after `VALUES` gives the value for each column, in the same order as the column list (order matters)
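
The sketch below illustrates why the order matters: values are matched to the column list by position, not to the order of the table definition. It again uses an in-memory SQLite database via `sqlite3` (an assumption, so it runs without a server) and a shortened, hypothetical version of the `inventors` table. SQLite uses `?` as the placeholder for values passed from Python, whereas `psycopg2` uses `%s`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Shortened version of the inventors table, for illustration only.
cur.execute(
    "CREATE TABLE inventors ("
    "key_id VARCHAR PRIMARY KEY, "
    "first_name VARCHAR NOT NULL, "
    "last_name VARCHAR NOT NULL)"
)

# The column list is in a different order than the table definition:
# each value goes to the column at the same position in the column list.
cur.execute(
    "INSERT INTO inventors (last_name, key_id, first_name) VALUES (?, ?, ?)",
    ("test last name", "1", "test first name"),
)

# executemany runs the same parameterized statement once per row.
more_rows = [
    ("2", "second first name", "second last name"),
    ("3", "third first name", "third last name"),
]
cur.executemany(
    "INSERT INTO inventors (key_id, first_name, last_name) VALUES (?, ?, ?)",
    more_rows,
)
conn.commit()

cur.execute("SELECT key_id, first_name, last_name FROM inventors ORDER BY key_id")
stored_rows = cur.fetchall()
conn.close()

print(stored_rows[0])
```

Passing values as parameters, rather than pasting them into the SQL string, also lets the driver handle quoting.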

More details can be found here:

https://www.postgresqltutorial.com/postgresql-insert/

https://www.postgresql.org/docs/14/sql-insert.html

## Fetching data from the table

```sql
SELECT * FROM inventors;
```

```sql
SELECT key_id, first_name, last_name FROM inventors;
```
- `SELECT` indicates that we will fetch the listed columns from a table
    - `*` indicates that we will fetch all columns from the table
- `FROM` defines the table to fetch from
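
A minimal sketch of how both queries look from Python, assuming an in-memory SQLite database (`sqlite3`) in place of PostgreSQL and a shortened, hypothetical `inventors` table. Rows come back as tuples. `cursor.description` is part of the Python DB-API, which `psycopg2` also implements, and can be used to pair each value with its column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Shortened version of the inventors table, for illustration only.
cur.execute(
    "CREATE TABLE inventors ("
    "key_id VARCHAR PRIMARY KEY, first_name VARCHAR, "
    "last_name VARCHAR, location_city VARCHAR)"
)
cur.execute(
    "INSERT INTO inventors VALUES ('1', 'test first name', 'test last name', 'test city')"
)

# SELECT * returns every column of every row.
cur.execute("SELECT * FROM inventors")
all_columns = cur.fetchall()

# Naming the columns returns only those columns, in the order given.
cur.execute("SELECT key_id, first_name, last_name FROM inventors")
column_names = [column[0] for column in cur.description]
records = [dict(zip(column_names, row)) for row in cur.fetchall()]
conn.close()

print(all_columns)
print(records)
```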

More details can be found here:

https://www.postgresqltutorial.com/postgresql-select/

https://www.postgresql.org/docs/14/sql-select.html

## psycopg2
`psycopg2` is a library that we can use in Python to manage our connection to the PSQL server.

To install it, use the following command:
```sh
pip install psycopg2-binary
```
- If there are issues during installation, consult me or Google because the issues vary for each environment.


More details can be found here:

https://www.postgresqltutorial.com/postgresql-python/connect/

```python
import psycopg2

conn = None
try:
    # Open a connection to the PostgreSQL server
    conn = psycopg2.connect(
        host="localhost",
        database="docker",
        user="docker",
        password="docker"
    )

    cur = conn.cursor()

    print('First row of inventors:')
    cur.execute('SELECT * FROM inventors')

    first_row = cur.fetchone()
    print(first_row)
finally:
    # Always close the connection, even if a query fails
    if conn:
        conn.close()
```

Output:

```
First row of inventors:
('1', 'test first name', 'test last name', 'test city', 'test state', 0.0, 0.0)
```


- `conn = psycopg2.connect(...)` opens a connection to the database and stores it in a variable named `conn`. Make sure the connection is closed once it is no longer needed, because every open connection takes up resources on the server
- `cur = conn.cursor()` creates a cursor, the object used to execute SQL commands and read their results, and stores it in `cur`
- `cur.execute(...)` executes the given SQL command
- `cur.fetchone()` fetches the next row of the result of the previous execution (the first row on the first call), or `None` when no rows are left
- `conn.close()` closes the connection, freeing up its resources on the database server
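
The same connect, cursor, execute, fetch, close sequence can be wrapped so that the cursor and connection are always closed. The sketch below is one possible way to do it, not the only one. The helper name `fetch_rows_one_by_one` is hypothetical, and the demonstration uses an in-memory SQLite connection as a stand-in. Both `sqlite3` and `psycopg2` follow the Python DB-API, so the helper itself only relies on `cursor()`, `execute()`, `fetchone()`, and `close()`.

```python
import sqlite3
from contextlib import closing


def fetch_rows_one_by_one(conn, query):
    """Run `query` on an open DB-API connection and collect rows via fetchone()."""
    rows = []
    # closing() calls cur.close() when the block ends, even on errors.
    with closing(conn.cursor()) as cur:
        cur.execute(query)
        row = cur.fetchone()
        while row is not None:  # fetchone() returns None when no rows are left
            rows.append(row)
            row = cur.fetchone()
    return rows


# Stand-in connection; with PostgreSQL this would come from psycopg2.connect(...).
with closing(sqlite3.connect(":memory:")) as conn:
    with closing(conn.cursor()) as setup:
        setup.execute("CREATE TABLE inventors (key_id VARCHAR PRIMARY KEY, first_name VARCHAR)")
        setup.execute("INSERT INTO inventors VALUES ('1', 'test first name')")
        setup.execute("INSERT INTO inventors VALUES ('2', 'second first name')")
    fetched = fetch_rows_one_by_one(conn, "SELECT key_id, first_name FROM inventors ORDER BY key_id")

print(fetched)
```

Each call to `fetchone()` advances to the next row, which is why the loop eventually ends on `None`.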

## .env
`.env` is a file that contains environment variables, which are configured outside the application code and set before the application starts using them. It can hold "secrets" or other sensitive information that must not be written into the application itself. It can also hold settings that let the application tell environments apart (like production and development) and change its behavior accordingly.

```python
import psycopg2
from dotenv import load_dotenv
from os import environ

load_dotenv()  # take environment variables from .env.

database_host = environ.get('DB_HOST')
database_name = environ.get('DB_NAME')
database_username = environ.get('DB_USERNAME')
database_password = environ.get('DB_PASSWORD')

conn = None
try:
    conn = psycopg2.connect(
        host=database_host,
        database=database_name,
        user=database_username,
        password=database_password,
    )

    cur = conn.cursor()

    print('PostgreSQL database version:')
    cur.execute('SELECT version()')

    db_version = cur.fetchone()
    print(db_version)
finally:
    if conn:
        conn.close()
```

Output:

```
PostgreSQL database version:
('PostgreSQL 12.9 (Ubuntu 12.9-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit',)
```
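
`environ.get(...)` returns `None` when a variable is missing, which can lead to a confusing connection error later. One possible way to fail early is sketched below. The helper name `read_db_settings` is hypothetical, and the example values mirror the hard-coded ones used earlier. The function accepts any mapping, so it can be tried with a plain dictionary instead of a real `.env` file.

```python
from os import environ

# Variable names used in the cell above.
REQUIRED_KEYS = ("DB_HOST", "DB_NAME", "DB_USERNAME", "DB_PASSWORD")


def read_db_settings(env=environ):
    """Return connection settings from `env`, or raise if any variable is missing."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": env["DB_HOST"],
        "database": env["DB_NAME"],
        "user": env["DB_USERNAME"],
        "password": env["DB_PASSWORD"],
    }


# Example values only; in practice these would come from the .env file.
example_env = {
    "DB_HOST": "localhost",
    "DB_NAME": "docker",
    "DB_USERNAME": "docker",
    "DB_PASSWORD": "docker",
}
settings = read_db_settings(example_env)
print(settings["host"], settings["database"])
```

The returned dictionary uses the same keyword names as the `psycopg2.connect(...)` call above, so it could be passed as `psycopg2.connect(**settings)`.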
