# Writing SQL the Python way with SQLAlchemy

![SQLAlchemy Logo](images/sqlalchemy-logo.png)

SQLAlchemy is the de facto standard toolkit for writing SQL in Python. Its Core layer maps SQL constructs onto Python code, and a higher-level ORM (Object Relational Mapper) builds on Core to abstract away the SQL, turning rows into Python objects.

The power of SQLAlchemy lies in its philosophy: rather than trying to hide the database from the programmer, which inevitably results in a leaky abstraction, it lets the programmer switch between Core and ORM at will, getting the best of both worlds.


![SQLAlchemy layers](images/sqlalchemy_layers.jpg)

In this course, we will start by writing Core statements, followed by a look at the ORM. We'll cover the components of each layer along the way.

# What's the DBAPI? - A bit of history

[PEP-249](https://peps.python.org/pep-0249/) was introduced in 2001 to standardize the API of the various database provider libraries. This meant that all the database libraries now used the same commands to connect, execute and return rows from the database, at a low level. 
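As a rough sketch of what that shared interface looks like, the snippet below walks through the connect / cursor / execute / fetch cycle described in PEP 249, using the standard-library `sqlite3` driver and an in-memory database. The `greetings` table and its contents are made up for illustration.

```python
import sqlite3

# Module-level attributes that PEP 249 asks every driver to expose
print(sqlite3.apilevel)    # DB-API version implemented by the driver
print(sqlite3.paramstyle)  # placeholder style the driver expects

conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
cur = conn.cursor()
cur.execute("CREATE TABLE greetings (id integer, word text)")
# Values are passed separately from the SQL text, never formatted into it
cur.execute("INSERT INTO greetings VALUES (?, ?)", (1, "hello"))
conn.commit()

cur.execute("SELECT id, word FROM greetings")
rows = cur.fetchall()
print(rows)

cur.close()
conn.close()
```

Other drivers expose the same `connect`, `cursor`, `execute`, `fetchall` and `commit` names, though the placeholder style reported by `paramstyle` can differ between them.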

In [None]:
# SQLite is built into Python
import sqlite3

SQLite is a single-file embedded database that ships with Python, so we can run a database without installing anything.

In [None]:
db_file = "./local.db"

Before we can store any data, we need to create a table

In [None]:
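# SQLite is a single-file database, so let's store it in a `local.db` file.
# `col2` is declared as `text`: SQLite gives an unrecognized type name such
# as `string` NUMERIC affinity, which would convert numeric-looking strings.
with sqlite3.connect(db_file) as conn:
    conn.execute("""CREATE TABLE IF NOT EXISTS test (
                 col1 integer,
                 col2 text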
                )
                """)

In [None]:
with sqlite3.connect(db_file) as conn:
    # We can parametrize the query to avoid SQL injection attacks
    conn.execute("""INSERT INTO test VALUES (:val_1, :val_2)""", {"val_1": 1, "val_2": "text"})
    result = conn.execute("SELECT * FROM test").fetchall()
result

In [None]:
type(result[0])

This code would look very similar in `cx_Oracle` or `pyodbc`, thanks to the standardization introduced by the DB-API spec.

There are a few issues with this code:
- The queries are raw strings
- Hardcoded to a specific database
- Not using any of the Power of Python™

# Moving to SQLAlchemy

SQLAlchemy was created to address these issues and "upgrade" the experience of working with SQL from inside Python. 

First, let's review the Core SQLAlchemy objects we need in order to get started

In [None]:
import sqlalchemy as sa

# The MetaData

MetaData is SQLAlchemy's registry of all the defined tables. It lets SQLAlchemy understand how Tables are connected, handle foreign keys and issue DDL (CREATE, DROP, ALTER etc.).

The metadata object should be a global object, and all Tables should use the same metadata object.

In [None]:
meta = sa.MetaData()
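To make the registry idea a little more concrete, here is a small sketch using a separate, hypothetical `demo_meta` object and two made-up tables, `child` and `parent`. Tables are covered in the next section; here they only serve to populate the registry, and no database is involved.

```python
import sqlalchemy as sa

demo_meta = sa.MetaData()  # hypothetical registry, separate from `meta`

# `child` is declared first and refers to `parent` by name only
child = sa.Table(
    "child",
    demo_meta,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("parent_id", sa.Integer, sa.ForeignKey("parent.id")),
)
parent = sa.Table(
    "parent",
    demo_meta,
    sa.Column("id", sa.Integer, primary_key=True),
)

# Every Table registers itself on the MetaData it was given
registered = sorted(demo_meta.tables)
print(registered)

# The foreign key lets the registry order tables by dependency
creation_order = [table.name for table in demo_meta.sorted_tables]
print(creation_order)
```

Because both tables share one MetaData, the string `"parent.id"` can be resolved even though `parent` was declared after `child`.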

# The Table

To interact with a database, we need to represent the Table in Python code. 

This is what SQLAlchemy will use to generate correct SQL when selecting data

In [None]:
test_table = sa.Table("test", 
                      meta, 
                      sa.Column("col1", sa.Integer), 
                      sa.Column("col2", sa.String))

The table doesn't know about the database; we're just declaring a Python object that SQLAlchemy can use later.
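One way to see that a Table is only a description is to inspect it and render its DDL as text without any engine or connection. The sketch below rebuilds the same two-column table under the hypothetical names `sketch_meta` and `sketch_table` so that it does not interfere with the objects defined above.

```python
import sqlalchemy as sa
from sqlalchemy.schema import CreateTable

sketch_meta = sa.MetaData()
sketch_table = sa.Table(
    "test",
    sketch_meta,
    sa.Column("col1", sa.Integer),
    sa.Column("col2", sa.String),
)

# Column objects are available under `.c` (short for `.columns`)
column_names = [column.name for column in sketch_table.c]
print(column_names)

# Render the CREATE TABLE statement as text; nothing is executed
ddl = str(CreateTable(sketch_table))
print(ddl)
```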

# The Engine

The Engine is what talks to the underlying DB-API library. To create an Engine, we need a properly formatted connection string, so the Engine knows what DB-API it needs to talk to.

Creating an engine doesn't connect to the database; it merely validates the URL and prepares the correct dialect. This is also where we can set various connection options.

In [None]:
import sqlalchemy as sa

# `future=True` opts in to 2.0-style behaviour on SQLAlchemy 1.4; on 2.0 it is already the default
engine = sa.create_engine("sqlite:///local.db", future=True)
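The following sketch illustrates both points: how a connection URL breaks down into a backend and a driver, and that creating an engine does not open a connection. The PostgreSQL host, user and database names are hypothetical placeholders, and `lazy_engine` points at a temporary file that exists only for this example.

```python
import os
import tempfile

import sqlalchemy as sa
from sqlalchemy.engine.url import make_url

# Parse a URL without creating an engine; no driver needs to be installed
url = make_url("postgresql+psycopg2://app_user:secret@db.example.com:5432/shop")
print(url.get_backend_name())  # database backend, which selects the dialect
print(url.get_driver_name())   # DB-API driver the dialect would use
print(url.host, url.port, url.database)

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "lazy.db")
    lazy_engine = sa.create_engine(f"sqlite:///{db_path}")
    # No connection has been opened yet, so SQLite has not created the file
    file_exists_before_connect = os.path.exists(db_path)
    dialect_name = lazy_engine.dialect.name

print(file_exists_before_connect, dialect_name)
```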

# The SQL

Now we're ready to write some SQL - SQLAlchemy style.

In [None]:
sql = sa.select(test_table)
sql

We can print the SQL that will be emitted

In [None]:
print(sql)
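Printing a statement uses a generic default dialect. As a hedged illustration of how the rendered SQL can differ per backend, the sketch below compiles one statement for two dialects; the `orders` table, with a column deliberately named after a reserved word, is made up for this example.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql, postgresql

# Hypothetical table with a column named after a reserved word
orders = sa.Table("orders", sa.MetaData(), sa.Column("order", sa.Integer))
stmt = sa.select(orders.c.order)

# `compile(dialect=...)` renders the statement for a specific backend
pg_sql = str(stmt.compile(dialect=postgresql.dialect()))
mysql_sql = str(stmt.compile(dialect=mysql.dialect()))
print(pg_sql)
print(mysql_sql)
```

Each dialect quotes the reserved word in its own way, and no database driver is needed just to render the text.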

SQLAlchemy overloads operators to generate SQL statements, for example the `==` operator

In [None]:
test_table.c.col1 == "test"

In [None]:
print(test_table.c.col1 == "test")

That makes it simple to add a `where` clause

In [None]:
print(sql.where(test_table.c.col1 == "test"))
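Other Python operators and column methods work the same way. The sketch below shows a few of them on a stand-in table (the hypothetical `ops_table`, mirroring the columns used above) and prints the SQL fragment each one produces.

```python
import sqlalchemy as sa

ops_table = sa.Table(
    "test",
    sa.MetaData(),
    sa.Column("col1", sa.Integer),
    sa.Column("col2", sa.String),
)

# Each comparison returns a SQL expression object, not a bool
greater_than = ops_table.c.col1 > 1
combined = (ops_table.c.col1 > 1) & (ops_table.c.col2 == "text")
pattern = ops_table.c.col2.like("te%")

for expression in (greater_than, combined, pattern):
    print(expression)

# `literal_binds` inlines the values, which is handy for debugging only
inlined = str(greater_than.compile(compile_kwargs={"literal_binds": True}))
print(inlined)
```

Note the parentheses around each comparison when using `&`: Python's operator precedence would otherwise bind `&` before `>` and `==`.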

To run the SQL, we need to establish a connection - this is the first time we're actually doing anything outside our Python process

In [None]:
with engine.connect() as conn:
    result = conn.execute(sql).all()

`result` looks the same

In [None]:
result

But it's actually an "upgraded" version of what we had before

In [None]:
type(result[0])

In [None]:
result[0].col1

In [None]:
# `._mapping` is the dictionary-like view of a Row; it works on both 1.4 and 2.0
dict(result[0]._mapping)

Note that here we used `.all` to fetch all the results at once - the result of `.execute` won't produce anything until we ask it to. 

- `all` returns all the results in a list
- `one` returns a single result and raises an exception if there's not exactly one
- `one_or_none` returns a single result or None, and raises an exception if there is more than one result
- `first` returns the first result, or None if there are no rows
- `partitions(size)` yields chunks of length `size`
- `yield_per(num)` fetches rows in batches of `num`; generally only useful if the DBAPI driver supports streaming results
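The fetch methods listed above can be tried out side by side. This is a minimal sketch against an in-memory SQLite database; the `demo` table, its single column `n` and the five inserted rows are assumptions made for the example.

```python
import sqlalchemy as sa
from sqlalchemy.exc import MultipleResultsFound

fetch_engine = sa.create_engine("sqlite://", future=True)  # in-memory database
fetch_meta = sa.MetaData()
demo = sa.Table("demo", fetch_meta, sa.Column("n", sa.Integer))

with fetch_engine.begin() as conn:
    fetch_meta.create_all(conn)
    conn.execute(sa.insert(demo), [{"n": i} for i in range(5)])
    query = sa.select(demo).order_by(demo.c.n)

    all_rows = conn.execute(query).all()
    first_row = conn.execute(query).first()
    only_row = conn.execute(query.where(demo.c.n == 3)).one()
    missing = conn.execute(query.where(demo.c.n == 99)).one_or_none()
    chunk_sizes = [len(chunk) for chunk in conn.execute(query).partitions(2)]

    too_many = False
    try:
        conn.execute(query).one()  # five rows match, so this raises
    except MultipleResultsFound:
        too_many = True

print([row.n for row in all_rows], first_row.n, only_row.n, missing)
print(chunk_sizes, too_many)
```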

# Behind the scenes

SQLAlchemy did a few different things for us here out of the box. 

- It generated the correct SQL for our database (SQLite in this case).
- It wrapped the result of the query in a `Row` object, allowing us to write `.col1` or convert the row to a dictionary.
- It converts the DB results to the Python types declared in the table.
- Behind the scenes, SQLAlchemy also creates a pool of connections, depending on the backend. Each time you open a connection, it is checked out from the pool instead of being created from scratch.
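The `Row` and type-conversion points above can be sketched as follows, again with an in-memory SQLite database. The `events` table and its `created` column are hypothetical; a `DateTime` column is used because SQLite has no native datetime type, which makes the conversion visible.

```python
import datetime

import sqlalchemy as sa

row_engine = sa.create_engine("sqlite://", future=True)  # in-memory database
row_meta = sa.MetaData()
events = sa.Table(
    "events",
    row_meta,
    sa.Column("id", sa.Integer),
    sa.Column("created", sa.DateTime),
)

with row_engine.begin() as conn:
    row_meta.create_all(conn)
    conn.execute(
        sa.insert(events),
        {"id": 1, "created": datetime.datetime(2023, 1, 2, 3, 4, 5)},
    )
    row = conn.execute(sa.select(events)).one()

print(row.id, row[0])      # attribute and positional access
print(dict(row._mapping))  # dictionary view of the row
# The declared DateTime type turns the stored value back into a datetime
print(type(row.created))
print(type(row_engine.pool).__name__)  # pool class chosen for this URL
```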