## Base SQL

To effectively work with SQL in a scripting environment we must understand the following concepts.

### Transaction
A sequence of one or more operations that are treated as a single unit of work. These operations can include inserting, updating, or deleting data in one or more tables. This includes the keywords:

* BEGIN; … COMMIT;  (Begin a transaction, and commit it if successful)
* SAVEPOINT name;   (Set a save-point to roll back to)
* ROLLBACK TO name; (Roll back to that save-point; a bare ROLLBACK; undoes the whole transaction)

Transactions are used to ensure that a set of operations is executed atomically, meaning that either all of the operations are completed successfully, or none of them are. This helps to maintain the integrity and consistency of the data in the database.

In [None]:
BEGIN;
	SAVEPOINT undo_change;
	UPDATE inventory SET store_id = 2 WHERE inventory_id = 4581;
	-- oops, undo the update above
	ROLLBACK TO undo_change;
COMMIT;
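
The same save-point pattern can be tried from a script. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for PostgreSQL (so it runs without a server) and a simplified, hypothetical `inventory` table with only two columns. The first update is kept, while the update made after the save-point is undone.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption: no PostgreSQL server available).
# isolation_level=None turns off the module's implicit transactions, so the
# BEGIN / SAVEPOINT / ROLLBACK TO / COMMIT statements below are sent exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, store_id INTEGER)")
conn.execute("INSERT INTO inventory VALUES (4581, 1)")

conn.execute("BEGIN")
conn.execute("UPDATE inventory SET store_id = 3 WHERE inventory_id = 4581")  # kept
conn.execute("SAVEPOINT undo_change")
conn.execute("UPDATE inventory SET store_id = 2 WHERE inventory_id = 4581")  # undone below
conn.execute("ROLLBACK TO undo_change")
conn.execute("COMMIT")

store_id = conn.execute(
    "SELECT store_id FROM inventory WHERE inventory_id = 4581"
).fetchone()[0]
print(store_id)
conn.close()
```

Only the work done after `SAVEPOINT undo_change` is discarded; everything before it is committed as usual.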

### Cursor
An object that encapsulates (contains) a query. While a basic query returns its entire result at once, a cursor reads a few rows at a time instead. This is great for queries that return a large amount of data. In PostgreSQL, cursors are most often used inside `PL/pgSQL` functions, although they can also be declared in plain SQL with `DECLARE` inside a transaction.

In [None]:
CREATE OR REPLACE FUNCTION cursor_func () RETURNS VARCHAR AS $$
    DECLARE
        curs1 CURSOR FOR SELECT * FROM film; -- this is a cursor object
        film_row film%ROWTYPE;  -- variable that stores one row of data
        last_title VARCHAR;     -- title of the most recently fetched row
    BEGIN
        OPEN curs1; -- opening the cursor runs its query so that rows can be fetched

        LOOP -- fetch all rows from the cursor, one row at a time
            FETCH curs1 INTO film_row;
            EXIT WHEN NOT FOUND; -- a fetch past the last row sets film_row to NULL, so stop here
            last_title := film_row.title;
        END LOOP;

        CLOSE curs1; -- release the resources held by the cursor

        RETURN last_title; -- return the title of the last film fetched
    END; $$
LANGUAGE plpgsql;

SELECT cursor_func();

### Desired Qualities of a Database

**ACID**
* Atomicity: All changes are performed or NONE.
* Consistency: Every transaction takes the database from one valid state to another, so data is not lost or unpredictably changed
* Isolation: Intermediate state of transaction is not visible to other transactions
* Durability: After a transaction completes, changes to data are persistent 
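
Atomicity is the easiest of these properties to observe from a script. The sketch below is illustrative only: it assumes an in-memory SQLite database as a stand-in, and the `accounts` table, the `transfer` helper, and the sample balances are hypothetical. The second update violates a `CHECK` constraint, so the first update is rolled back as well.

```python
import sqlite3

# Hypothetical accounts table; the CHECK constraint forbids negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# alice only has 100, so debiting 500 fails and the credit to bob is undone too
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
conn.close()
```

After the failed transfer both balances are unchanged, which is the "all or none" behavior described above.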

## Psycopg2

The `psycopg2` package is an interface for interacting with a PostgreSQL database from Python. The data that we pull and write shouldn't be limited to CSV files (insecure!) but should instead span the whole breadth of APIs, CSVs, and databases.

In the `psycopg2` package, the `SQL` concepts above are expanded into the following objects.

### Connection

Before we begin manipulating or reading a database, we must understand how to connect to it. By calling the `connect()` function from `psycopg2`, we establish a network connection to the PostgreSQL server using the PostgreSQL protocol.

Think back to how we connected to an API service. We loaded in the `requests` module and then passed in some URL which contained a string of relevant information that we then used to connect to some service. Same concept here.

In [None]:
import psycopg2

params = {
    "host"      : "localhost",
    "dbname"    : "postgres",
    "user"      : "postgres",
    "password"  : "password",
    "port" : "5432"     
}

# establish network connection (simple enough)
conn = psycopg2.connect(**params)
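
Hard-coding a password in a notebook is convenient for a demo but easy to leak. One possible alternative, sketched below, builds the same `params` dictionary from environment variables. The `load_db_params` helper is hypothetical, and the sample host `db.internal` and password are made up; the variable names `PGHOST`, `PGDATABASE`, `PGUSER`, `PGPASSWORD`, and `PGPORT` follow the convention used by PostgreSQL client tools.

```python
import os


def load_db_params(env=None):
    """Build keyword arguments for psycopg2.connect() from environment variables."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "localhost"),
        "dbname": env.get("PGDATABASE", "postgres"),
        "user": env.get("PGUSER", "postgres"),
        "password": env.get("PGPASSWORD", ""),
        "port": env.get("PGPORT", "5432"),
    }


# A plain dict stands in for os.environ here so the example is reproducible.
params = load_db_params({"PGHOST": "db.internal", "PGPASSWORD": "s3cret"})
print({key: value for key, value in params.items() if key != "password"})
```

The resulting dictionary can be passed to `psycopg2.connect(**params)` exactly as in the cell above.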

### Cursor

A cursor object in `psycopg2` is a way to interact with a PostgreSQL database. This is like the base cursor object from `SQL`, except for the fact that it allows you to **modify** data as well as read it.

In addition, you are not limited to just executing one query. After completing a query, you can always pass in a new sql statement to implement some new behavior.

When you execute a query with a `psycopg2` cursor, the database returns a result set, which is a collection of rows that match the query criteria. A cursor object provides a way to navigate through this result set and extract individual rows of data.

In [None]:
import psycopg2

# create a cursor to the database connection (and open)
cur = conn.cursor()

# pass in a SQL query to execute
cur.execute("SELECT * FROM mytable")

# fetch all results from the query and loop
for row in cur.fetchall():
    print(row)

# close the cursor, then the connection
# (call psycopg2.connect(**params) again before reusing conn in later cells)
cur.close()
conn.close()

In [None]:
# this is better expressed with a context manager, which closes the cursor for us
with conn.cursor() as cur:
    # query 1
    cur.execute("SELECT * FROM mytable")
    row1 = cur.fetchone()

    # query 2 (the same cursor is reused for a new statement)
    cur.execute("SELECT * FROM othertable")
    row2 = cur.fetchone()
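
Besides `fetchall()` and `fetchone()`, DB-API cursors also offer `fetchmany(size)` for walking through a result set in batches. The sketch below assumes the standard-library `sqlite3` module as a stand-in, since it exposes the same three fetch methods and needs no server; the table contents are made up for illustration.

```python
import sqlite3

# In-memory stand-in database with seven sample rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO mytable (label) VALUES (?)", [(f"row {i}",) for i in range(7)]
)

cur = conn.cursor()
cur.execute("SELECT id, label FROM mytable ORDER BY id")

first = cur.fetchone()  # a single row, returned as a tuple

batch_sizes = []
while True:
    batch = cur.fetchmany(3)  # up to three of the remaining rows per call
    if not batch:  # an empty list means the result set is exhausted
        break
    batch_sizes.append(len(batch))

print(first, batch_sizes)
cur.close()
conn.close()
```

Fetching in batches keeps memory use bounded when a query returns more rows than you want to hold at once.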

### Transaction

Let's say you are interested in using a cursor not only for reading data, but also for creating or modifying it. In this case, you must use the functions related to `SQL` transactions.

In `psycopg2`, when you perform a sequence of operations using a cursor, these operations are executed as part of a **transaction**. 

However, this transaction is not committed to the database automatically when the operations complete. Instead, you must explicitly commit it via `conn.commit()`. Keep in mind that transactions allow us to roll back previous changes in case an error occurs.

This rollback does not happen automatically either. If a statement fails, PostgreSQL marks the transaction as aborted and rejects every further command on that connection until you call `conn.rollback()`. Therefore, this is a good opportunity for a `try-except` block.

In [None]:
with conn.cursor() as cur:
    # try except block:  https://docs.python.org/3/tutorial/errors.html
    try:
        cur.execute("INSERT INTO mytable (column1, column2) VALUES (1, 'hello')")
        cur.execute("UPDATE mytable SET column2 = 'world' WHERE column1 = 1")

        # commit the transaction so both statements take effect together
        conn.commit()
    except Exception as e:
        conn.rollback()
        print(e)

## SQLAlchemy

Lastly, `sqlalchemy` is a package that builds on a database driver such as `psycopg2` so that we can manipulate and read a database without writing raw SQL queries. Keep in mind, programmers hate context-switching. Staying within one language in our script makes bugs easier to find and reduces cognitive load.

### Connection

The `create_engine()` function is used to set up a connection to a database. It takes a connection string as its argument, which specifies the location of the database and the credentials needed to access it.

The connection string is usually in the form `dialect+driver://username:password@host:port/database`, where `dialect` is the name of the database system you are using (e.g., "postgresql" for PostgreSQL), `driver` is the name of the driver you are using (e.g., "psycopg2" for PostgreSQL), and the other parameters provide the authentication details.

If you recall, we've utilized a connection string in the following format to connect to our Amazon database: `postgresql+psycopg2://postgres:...@rds-pg-jobs.chfavwsr5bmp.us-east-1.rds.amazonaws.com:5432/postgres`.

In [None]:
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

engine = create_engine("dialect+driver://username:password@host:port/database")
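
When the pieces of a connection string come from separate variables, it can help to assemble them in one place. The sketch below uses a hypothetical helper, `build_connection_string`, and made-up credentials. It percent-encodes the username and password with `urllib.parse.quote_plus`, so that characters such as `@` or `/` in a password are not mistaken for URL separators.

```python
from urllib.parse import quote_plus


def build_connection_string(user, password, host, port, database,
                            dialect="postgresql", driver="psycopg2"):
    """Assemble dialect+driver://username:password@host:port/database."""
    return (
        f"{dialect}+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Made-up credentials; the '@' and '/' in the password are percent-encoded.
url = build_connection_string("postgres", "p@ss/word", "localhost", 5432, "postgres")
print(url)
```

The resulting string can then be passed to `create_engine()` in place of the placeholder above.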

### AutoMapper

The `automap_base()` function is used to generate Python classes that correspond to the tables in your database. It returns a base class; calling `prepare()` on that base with a database engine reflects the schema and creates a mapped class for each table in the database.

In [None]:
# prepare base (automap for reflection, declarative for creation)
Base = automap_base()
# reflect tables from database into Base object
# (SQLAlchemy 1.4+; older versions use Base.prepare(engine, reflect=True))
Base.prepare(autoload_with=engine)

# save these objects as variables in Python
t1 = Base.classes.t1
t2 = Base.classes.t2
t3 = Base.classes.t3

We can also go in reverse! If we wanted to, we could define a class in Python and create the database schema from it. This is a useful feature for web development.

In [None]:
from sqlalchemy import create_engine, Column, Integer, String
# declarative_base lives in sqlalchemy.orm as of SQLAlchemy 1.4
from sqlalchemy.orm import declarative_base, sessionmaker

engine = create_engine("dialect+driver://username:password@host:port/database")

# Define a table using a Python class
Base = declarative_base()

# create the table Object using sqlalchemy objects
class MyTable(Base):
    __tablename__ = 'mytable'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)

# prepare base for creation & insertion (https://docs.sqlalchemy.org/en/20/core/metadata.html#creating-and-dropping-database-tables)
Base.metadata.create_all(engine)

# create and bind a session to the engine
Session = sessionmaker(bind=engine)
session = Session()

# insert data via your objects
row1 = MyTable(name='Alice', age=25)
row2 = MyTable(name='Bob', age=30)
session.add_all([row1, row2])
session.commit()

### Session

The `Session` class is used to interact with the database. While a connection is simply a "pointer" to the database, the session is the object that actually manipulates and queries a database at a "higher level of abstraction."

The session object automatically handles the creation of transactions, cursors, and other vital objects.

In more general terms, a session can be thought of as a container for your interaction with the database: it keeps track of the objects you have loaded or added, along with their pending changes, until you commit or roll back.

In [None]:
session = Session(engine)

### Simple Querying

Using SQLAlchemy, we can then construct SQL queries using function chaining. This includes DQL (data query language) statements:

https://docs.sqlalchemy.org/en/14/orm/query.html

In [None]:
# SELECT * FROM t1
results = session.query(t1).all()

# SELECT c1, c2 FROM t1
results = session.query(t1.c1, t1.c2).all()

# (get first row of) SELECT c1, c2 FROM t1
# .first() returns None when there are no rows; .one() raises unless exactly one row matches
results = session.query(t1.c1, t1.c2).first()

# SELECT * FROM t1 WHERE c1='red'
results = session.query(t1).filter(t1.c1 == 'red').all()

### DDL & DML

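We can express data-changing statements the same way: the DML statements below, as well as DDL such as the `create_all()` call shown earlier. Keep in mind that we need to commit these changes and should be prepared for a transactional error using a `try-except` block!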

In [None]:
try:
    # INSERT INTO t1 VALUES ('red', 3, 4)
    new_row = t1(c1='red', c2=3, c3=4)
    session.add(new_row)  # add the new instance, not the mapped class

    # UPDATE t1 SET c1='green' WHERE c1='red'
    # update() returns the number of rows matched by the filter
    rows_updated = session.query(t1).filter(t1.c1 == 'red').update({t1.c1: 'green'})

    # DELETE FROM t1 WHERE c1 = 'green'
    session.query(t1).filter(t1.c1 == 'green').delete()

    # commit 
    session.commit()
except Exception as e:
    session.rollback()
    print('An error occurred:', e)

# consider: how can we use a context manager for our session?
session.close()