# CRUD
This notebook demonstrates basic CRUD (Create, Read, Update, Delete) operations with SQLAlchemy Core.

Core means using table definitions, and *not* using the ORM.

A lot of queries are different in the ORM syntax.<br>
**However**, any sufficiently complex query will inevitably need to use Core features.

This notebook covers the following topics:

- Tables
- Transactions
- Insert
- Select
  - Scalar(s)
  - Order By
  - Offset/Limit
  - Where
  - Transformation
  - Aggregate/Group By
  - Union
- Update
- Delete

> For better readability, consider turning on "Table of Contents" or "Show Line Numbers" in the 'View' menu.

## Logging
This is an explicit logger.

In future notebooks, this will be loaded using ``from utils import logs``.

In [None]:
from contextlib import contextmanager
import logging
import sys

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.WARNING)

logger = logging.getLogger('sqlalchemy.engine')
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

@contextmanager
def logs(level=logging.INFO):
    state = handler.level
    handler.setLevel(level)
    try:
        yield
    finally:
        handler.setLevel(state)

## Tables
Although SQLAlchemy is perfectly capable of running *raw* text queries, it can also use Table definitions.

Table definitions serve a few purposes:
1. Simplifying the renaming/refactoring of table names and column definitions.
2. Creating tables in live databases, or specifically for testing.

These table definitions are set up by creating instances of the Table class.<br>
Tables share a MetaData object. This metadata describes the database: it acts as a registry of the tables defined on it.<br>
For servers with multi-database setups, this metadata can allow a single network connection to be used for multiple databases at the same time.

The signature of the constructor is roughly this (slightly edited for clarity).
```python
class Table(...):
    def __init__(self, tablename: str, metadata: MetaData, *columns: Column, **kwargs):
        ...
```
It is also possible to create table definitions which do not use a MetaData object.<br>
For the purposes of an introduction, this will be introduced only in later notebooks.

In [None]:
import sqlalchemy as sa

In [None]:
metadata = sa.MetaData()
Products = sa.Table(
    'products',
    metadata,  # remember: the metadata groups tables together in a registry, useful to create a (mock) database with.
    sa.Column('id', sa.Integer, primary_key=True, autoincrement=True),
    sa.Column('name', sa.VARCHAR(255), default=None, nullable=True),
    sa.Column('category', sa.VARCHAR(255), default=None, nullable=True)
)
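
As a small sketch of the registry idea, the snippet below defines a stand-alone table and looks it up again through `MetaData.tables`. The names `demo_metadata` and `Customers` are made up for this illustration and are not used elsewhere in the notebook.

```python
import sqlalchemy as sa

# Hypothetical names, used only for this illustration.
demo_metadata = sa.MetaData()
Customers = sa.Table(
    'customers',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.VARCHAR(255), nullable=True),
)

# Every Table registers itself on the MetaData object it was given.
print(list(demo_metadata.tables))
print(demo_metadata.tables['customers'] is Customers)
```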

-----
Creating tables in this way can feel chaotic or unmanaged.<br>
The usage of a capital 'P' in the variable name was intentional.<br>
In later notebooks we might use 'product' or 'products' in lowercase as a pure variable.<br>
By giving it a name styled like that of a class, it automatically discourages reassignment.

In [None]:
print('Table:', Products.name)
for column in Products.columns:
    print(f'> {column.key:<10s} NULLABLE={str(column.nullable):<8s}  PK={str(column.primary_key):<8s} {str(column.type):<10s}')

-----
Tables have their own properties, but most queries need access to their columns.<br>
Columns can be accessed by attribute name or by key.

In [None]:
print(Products.name, Products.fullname)
print('\nUsing .columns')
print(repr(Products.columns.name))
print(repr(Products.columns['name']))
print('\nUsing the .c shorthand')
print(repr(Products.c.name))
print(repr(Products.c['name']))

In [None]:
engine = sa.create_engine('sqlite://')
con = engine.connect()

In [None]:
# This is where we actually create the tables.
# The MetaData knows the tables; the engine tells it where to create them.
# Running this multiple times will not recreate tables (data will stay).
metadata.create_all(engine)

## Basic Query
Here are a few 'raw' queries to demonstrate that everything is in working order.<br>
This means using `sa.text(query:str)` to perform a literal query.<br>
This function is used to supply SQL code that might be very specific, or already optimized.<br>
Avoid string formatting combined with `sa.text` as that usually creates code vulnerable to SQL Injection.
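
A minimal sketch of the safer alternative, assuming a throwaway in-memory database: `sa.text` accepts named placeholders such as `:name`, and the values are passed separately as parameters. The `demo` table, `demo_engine` and `demo_con` are hypothetical names used only here.

```python
import sqlalchemy as sa

# Throwaway in-memory database, separate from the notebook's own connection.
demo_engine = sa.create_engine('sqlite://')
with demo_engine.connect() as demo_con:
    demo_con.execute(sa.text('CREATE TABLE demo (name VARCHAR(255))'))

    # Untrusted input is passed as a parameter, never formatted into the SQL string.
    user_input = "jack'); DROP TABLE demo; --"
    demo_con.execute(
        sa.text('INSERT INTO demo (name) VALUES (:name)'),
        {'name': user_input},
    )

    # The value is stored literally; the table still exists.
    names = demo_con.execute(sa.text('SELECT name FROM demo')).scalars().all()
    print(names)
```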

SQLAlchemy likes using [method chaining](https://en.wikipedia.org/wiki/Method_chaining) to build queries.<br>
This means queries can be written as `select(...).where(...).order_by(...)`.<br>
Do note that each method call returns a new object. Queries are not edited in place.
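
The following sketch illustrates that chaining does not modify the original query. The `Items` table is a hypothetical stand-in; no database is needed because the queries are only rendered as strings.

```python
import sqlalchemy as sa

# Hypothetical table, used only to build queries.
demo_metadata = sa.MetaData()
Items = sa.Table(
    'items',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.VARCHAR(255)),
)

base = sa.select(Items)
filtered = base.where(Items.c.id == 1)

# `base` is left untouched; `.where()` produced a separate object.
print(filtered is base)
print('WHERE' in str(base), 'WHERE' in str(filtered))
```

Forgetting to assign the result of a chained call is therefore a silent mistake: the filter is simply never applied.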

When the connection is used to execute SQL, it returns a Result object (or a subclass of it).
```python
connection.execute(...) -> Result
```
Any data returned will be put in the Result object.<br>
This Result object holds the returned rows and the row count, and provides some convenience methods.

In [None]:
# Raw Insert
result = con.execute(sa.text("INSERT INTO products (name) VALUES ('jack');"))
print(result.rowcount)

In [None]:
# Raw Select
result = con.execute(sa.text('SELECT * FROM products;'))
for row in result:
    print(row)

In [None]:
# This is how to use a Table for this.
result = con.execute(Products.select())
for row in result:
    print(row)

## Transactions
It's important to talk about Transactions because SQLAlchemy uses them a lot, and a lack of understanding will make bugs very difficult to solve.

Database transactions are sets of database operations (select, insert, update, etc.) which should be applied or reverted as a whole.<br>
If one operation causes an error, all the preceding operations should be reverted.

Remember these things:
1. Other database connections cannot read from your transactions unless they're doing it on purpose.
2. Transactions tend to lock database rows, which can make things slow.
3. SQLAlchemy will try to start transactions every chance it gets. Even when it's just reading, it will start a transaction implicitly.

----------
To start a transaction properly, call `.begin()` and use it as a context manager.

```python
with connection.begin() as transaction:
    ...
```

Transactions *should* automatically commit upon leaving the indented block.<br>
If any exception occurs, it will try to roll back instead.<br>
This notebook uses a lot of rollbacks to keep the demos clean.
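
The commit-or-rollback behaviour can be sketched as follows. This uses a separate in-memory database and a hypothetical `Notes` table so it does not touch the notebook's own `con`.

```python
import sqlalchemy as sa

# Separate throwaway database with a hypothetical table.
demo_engine = sa.create_engine('sqlite://')
demo_metadata = sa.MetaData()
Notes = sa.Table(
    'notes',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('body', sa.VARCHAR(255)),
)
demo_metadata.create_all(demo_engine)
count_query = sa.select(sa.func.count()).select_from(Notes)

with demo_engine.connect() as demo_con:
    # Leaving the block normally commits the insert.
    with demo_con.begin():
        demo_con.execute(Notes.insert(), {'body': 'kept'})

    # An exception inside the block triggers a rollback before it propagates.
    try:
        with demo_con.begin():
            demo_con.execute(Notes.insert(), {'body': 'discarded'})
            raise RuntimeError('something went wrong')
    except RuntimeError:
        pass

    with demo_con.begin():
        remaining = demo_con.scalar(count_query)
    print(remaining)
```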

In [None]:
# Quick check-up, also demonstrating how one might need to diagnose things.
if con.closed:
    print("Connection is closed.")
if con.in_transaction():
    print("Connection is in transaction.")
    con.rollback()
    print('Rolled back, changes discarded')

### InvalidRequestError: A common Exception
Calling `connection.begin()` while a transaction is already running will result in the following exception:

> InvalidRequestError: This connection has already initialized a SQLAlchemy Transaction() object via begin() or autobegin; can't call begin() here unless rollback() or commit() is called first.

This usually happens for one of two reasons:
1. Something is using `with engine.begin()`, which starts both a connection and a transaction. (Generally a bad idea).
2. A piece of code dirtied the connection (`sqlalchemy.inspect(...).whatever()` can be the cause of this).

If it's on the connection, this can be solved by using `connection.commit()` or `connection.rollback()`.

## Insert
There are many ways to insert a record.
They're similar, but not quite the same.
When a record has been inserted, SQLAlchemy will return the primary key that has been created/calculated (when applicable).

In [None]:
# Passing the new row as a dictionary.
result = con.execute(Products.insert(), {'name': 'Record 1'})
print(f'Inserted {result.rowcount:d} row(s).')
if con.in_transaction():
    print('commit!')
    con.commit()

In [None]:
with con.begin() as transaction:
    # Connection.execute(query, parameters)
    result = con.execute(Products.insert(), {'name': 'Record 2'})
    print(f'Inserted {result.rowcount:d} row(s).')
    transaction.rollback()

with con.begin() as transaction:
    # Prepared object by using `.values()`
    query = sa.insert(Products).values({'name': 'Record 3'})
    result = con.execute(query)
    print(f'Inserted {result.rowcount:d} row(s).')
    transaction.rollback()

----------
**Primary Keys:** When data gets inserted into a table, it's *possible* to get the newly created primary key without having to make another query.

A key thing to note is multi-row inserts.
Any time that multiple rows are inserted, the return value will usually omit any and all inserted primary keys.

**PostgreSQL** still returns *all* the inserted keys in the expected way.
This is the exception, not the rule.

In [None]:
with con.begin() as t:
    result = con.execute(Products.insert(), {'name': 'Record 2'})
    print(f'Inserted {result.rowcount:d} row(s)')
    print('Newly created primary key:', result.inserted_primary_key)  # yes, that's a tuple.
    t.rollback()

----------
The `inserted_primary_key` (phrased in the singular) returns a tuple for technical consistency.

Relational Database Systems can have a primary key consisting of multiple fields.<br>
SQLAlchemy accounts for this by always returning the inserted key as a tuple.
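
To illustrate why a tuple is used, here is a minimal sketch with a hypothetical `OrderLines` table whose primary key spans two columns. Both key values are supplied explicitly, and the inserted key comes back with one element per key column.

```python
import sqlalchemy as sa

demo_engine = sa.create_engine('sqlite://')
demo_metadata = sa.MetaData()
# Hypothetical table whose primary key consists of two columns.
OrderLines = sa.Table(
    'order_lines',
    demo_metadata,
    sa.Column('order_id', sa.Integer, primary_key=True, autoincrement=False),
    sa.Column('line_no', sa.Integer, primary_key=True, autoincrement=False),
    sa.Column('note', sa.VARCHAR(255)),
)
demo_metadata.create_all(demo_engine)

with demo_engine.connect() as demo_con:
    with demo_con.begin():
        result = demo_con.execute(
            OrderLines.insert(),
            {'order_id': 10, 'line_no': 1, 'note': 'first line'},
        )
        # One element per primary key column, in table definition order.
        key = tuple(result.inserted_primary_key)

print(key)
```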

Things change a bit when inserting multiple records:

In [None]:
with con.begin() as t:
    # Multiple inserts: SQLite and others will return a list of empty tuples.
    as_list = [{'name': 'Record 3'}, {'name': 'Record 4'}]
    result = con.execute(Products.insert(), as_list)
    print(f'Inserted {result.rowcount} records')
    print(result.inserted_primary_key_rows)

    # A list holding a single record behaves like a single-row insert.
    as_list = [{'name': 'Record 5'},]
    result = con.execute(Products.insert(), as_list)
    print(f'Inserted {result.rowcount} records')
    print(result.inserted_primary_key_rows)
    
    t.rollback()

----------
The above demonstrates that multi-row inserts will not return primary keys.<br>
If this feature is needed, look at the `returning` feature in the next notebook file.

Although the `execute` call is effectively the same (both passing a list), it is the number of records that determines whether or not a primary key is returned.

**Note:** When writing tests, it's important to remember this detail and write for multiple records if that is a possible situation.
Otherwise the code might be tested on the assumption that primary keys are always returned.

## Select
This is the bread and butter of most queries.

Queries can be built using Table or Column objects.<br>
This query is fed into an `.execute` function, which returns a CursorResult (subclass of Result) object.

The 'Cursor' in CursorResult refers to a database cursor, which is a server-side mechanism to iterate over the found rows.<br>
In SQLAlchemy this is also a facade, as it provides database cursor features even though all of the data might already be in memory.

**CursorResult/Row**<br>
When iterating over a CursorResult, it will return Row objects.<br>
These objects can be treated like a named tuple.<br>
This means a value can be accessed by numeric index (in the order specified by the Select statement),<br>
or accessing the attribute name corresponding to the column name.

**MappingResult/RowMapping**<br>
The CursorResult can also be converted to a MappingResult.<br>
Iterating over this yields RowMapping objects, which provide dictionary-like access and also accept column objects as keys.<br>
The advantage of this approach is its abstraction of literal column names.

In [None]:
# Querying for ALL Columns
style_a = Products.select()
style_b = sa.select(Products)

print('SQL A:', str(style_a).replace('\n', ''))
print('SQL B:', str(style_b).replace('\n', ''))
print('All columns are spelled out by SQLAlchemy, it will not use a wildcard when using this syntax.\n')

result = con.execute(style_a)
print(result)  # CursorResult
for row in result:  # Row (tuple-styled)
    print(row, type(row))
    # Access like a 'named tuple'
    print('Named->', row.id, row.name, row.category)

print('\n---\n')

result = con.execute(style_b).mappings()  
print(type(result))  # MappingResult
for row in result:  # RowMapping (dict-styled)
    print(row, type(row))


In [None]:
# Selecting specific columns
# The columns contain metadata about the table(s) to query.
# Note that '.columns' and '.c' are the same thing, it is just a writing aid.
query = sa.select(Products.c.id, Products.c.name)

# These can be accessed using the bracket syntax as well. 
query = sa.select(Products.c['id'], Products.c['name'])
# unpack it like a tuple (it stays in order).
for pk, name in con.execute(query):
    print(pk, name)

----------
The above shows rows as tuples and dictionaries when printed.<br>
The underlying object is usually a `Row` or `RowMapping` object.

These allow array-like access with some extras.

**Example:** Queries sometimes get created dynamically, and that includes columns.<br>
The system that adds columns also has to read them. Doing this by a textual key can be a bit iffy.<br>


In [None]:
# After using .mappings(), fields can be accessed using column definitions.
# This comes in handy for 'calculated columns' later on.
column_id = Products.c.id
column_name = Products.c.name

query = sa.select(column_id, column_name)
for entry in con.execute(query).mappings():
    print(entry[column_id], entry[column_name])

In [None]:
query = sa.select(Products.c.id)
print('A:', query)
query = query.add_columns(Products.c.name)
print('B:', query)

In [None]:
# Appending `scalars()` will return the first column for each row.
# The query will still fetch the entire set.
for pk in con.execute(Products.select()).scalars():
    print(pk)

In [None]:
# Appending `scalar()` will return the first column of the first row.
# The query will still fetch the entire set.
pk = con.execute(Products.select()).scalar()
print(pk)

In [None]:
# In order to select individual columns, `sa.select(*columns)` is used.
query = sa.select(Products.columns.name)
for entry in con.execute(query).mappings():
    print(entry)

### Scalar(s)
The 'scalar' is the value of the first column.<br>
In SQLAlchemy, `CursorResult.scalar` (singular) returns the first column of the first row (any additional data becomes inaccessible).<br>
`CursorResult.scalars` (plural) returns the first column of *every* row.

This conversion happens in the Python application, not on the server.<br>
All the data is still transferred across the network before it is reduced to a single column.<br>
It may be wise to add `limit(1)` to singular scalar queries.

Description | method
-----|-----
First Row, First Column | .scalar()
All Rows, First Column | .scalars()

The example below has logs turned on to show the query being executed.

In [None]:
with logs():
    query = sa.select(Products)
    print(con.scalar(query))
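
As a sketch of the `limit(1)` suggestion, the example below selects only the column that is needed and lets the database stop after one row. It uses a throwaway database and a hypothetical `Fruits` table.

```python
import sqlalchemy as sa

demo_engine = sa.create_engine('sqlite://')
demo_metadata = sa.MetaData()
# Hypothetical table for this illustration.
Fruits = sa.Table(
    'fruits',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.VARCHAR(255)),
)
demo_metadata.create_all(demo_engine)

with demo_engine.connect() as demo_con:
    with demo_con.begin():
        demo_con.execute(Fruits.insert(), [{'name': 'apple'}, {'name': 'pear'}])

        # Select only the needed column and let the database stop after one row.
        query = sa.select(Fruits.c.name).order_by(Fruits.c.id).limit(1)
        first_name = demo_con.scalar(query)

        # `.scalars()` keeps the first column of every row.
        all_names = demo_con.execute(
            sa.select(Fruits.c.name).order_by(Fruits.c.id)
        ).scalars().all()

print(first_name, all_names)
```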

### Order By
The standard SQL `order by` is alive and well in SQLAlchemy.<br>
It is always recommended to be explicit about ascending and descending ordering.<br>
Most DBMSs sort ascending by default, but SQLAlchemy will not provide a client-side default.

The `order_by(*columns)` method accepts any number of column-like expressions.<br>
Most column-like objects offer a `.asc()` or `.desc()` for ascending or descending expression respectively.<br>
If the sorting is configurable/dynamic, the `sqlalchemy.asc(column)` and `sqlalchemy.desc(column)` functions are available.

In [None]:
# Order by Name, descending:
print(Products.select().order_by(Products.c.name.desc()))
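
For configurable sorting, a sketch along these lines may help. The `ordered` helper and the `Books` table are hypothetical; the helper picks `sa.asc` or `sa.desc` at runtime so the direction is always explicit in the generated SQL.

```python
import sqlalchemy as sa

# Hypothetical table, used only to render queries.
demo_metadata = sa.MetaData()
Books = sa.Table(
    'books',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('title', sa.VARCHAR(255)),
)

def ordered(query, column, descending=False):
    """Hypothetical helper: apply an explicit sort direction chosen at runtime."""
    direction = sa.desc if descending else sa.asc
    return query.order_by(direction(column))

sql_asc = str(ordered(sa.select(Books), Books.c.title))
sql_desc = str(ordered(sa.select(Books), Books.c.title, descending=True))
print(sql_asc)
print(sql_desc)
```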

### Offset/Limit
Offset and limit are well supported in most DBMSs.<br>
The 'limit' can be used without any issues at all.

The offset parameter can be a bit tricky.<br>
While some systems work without effort, Microsoft SQL Server usually demands an ORDER BY clause to be present.<br>
Testing for support in different dialects is shown in notebook #30.

In [None]:
print(Products.select().offset(10).limit(5))
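
A minimal pagination sketch, assuming a 1-based page number. The `page` helper and the `Numbers` table are hypothetical. The query is ordered explicitly so the pages stay stable and the statement remains portable to systems that require ORDER BY with an offset.

```python
import sqlalchemy as sa

demo_engine = sa.create_engine('sqlite://')
demo_metadata = sa.MetaData()
# Hypothetical single-column table holding the numbers 1..10.
Numbers = sa.Table(
    'numbers',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
)
demo_metadata.create_all(demo_engine)

def page(query, page_number, page_size):
    """Hypothetical helper: 1-based pagination on an already ordered query."""
    return query.offset((page_number - 1) * page_size).limit(page_size)

with demo_engine.connect() as demo_con:
    with demo_con.begin():
        demo_con.execute(Numbers.insert(), [{'id': n} for n in range(1, 11)])

        # An explicit ORDER BY keeps the pages stable.
        ordered_query = sa.select(Numbers.c.id).order_by(Numbers.c.id.asc())
        second_page = demo_con.execute(page(ordered_query, 2, 3)).scalars().all()

print(second_page)
```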

### Where
The WHERE clause is set in a relatively intuitive manner, using `.where(*conditions)`.<br>
When multiple conditions are provided, it is an implied `AND`.<br>
When writing a comparison, always put the SQLAlchemy object on the left.

**Note:** Subqueries and joins are shown in the 'READ' notebook.<br>
Many more transformations and operations can be found in the 'READ' and 'Transform' notebooks.

In [None]:
query = sa.select(Products).where(Products.c.id <= 2)
with logs():
    for product in con.execute(query):
        print(product)

In [None]:
query = sa.select(Products).where(Products.columns.id.in_([4,5,6]))
with logs():
    for product in con.execute(query):
        print(product)
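
The implied `AND` can be sketched as below, next to an explicit `sa.or_` for comparison. The `Parts` table and its rows are hypothetical and live in a throwaway database.

```python
import sqlalchemy as sa

demo_engine = sa.create_engine('sqlite://')
demo_metadata = sa.MetaData()
# Hypothetical table for this illustration.
Parts = sa.Table(
    'parts',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.VARCHAR(255)),
    sa.Column('category', sa.VARCHAR(255)),
)
demo_metadata.create_all(demo_engine)

rows = [
    {'name': 'bolt', 'category': 'metal'},
    {'name': 'nut', 'category': 'metal'},
    {'name': 'plug', 'category': 'plastic'},
]
with demo_engine.connect() as demo_con:
    with demo_con.begin():
        demo_con.execute(Parts.insert(), rows)

        # Two conditions in one `.where()` call are joined with AND.
        both = sa.select(Parts.c.name).where(
            Parts.c.category == 'metal',
            Parts.c.id >= 2,
        )
        # For OR, the conditions are wrapped explicitly.
        either = sa.select(Parts.c.name).where(
            sa.or_(Parts.c.category == 'plastic', Parts.c.id == 1)
        ).order_by(Parts.c.id)

        and_names = demo_con.execute(both).scalars().all()
        or_names = demo_con.execute(either).scalars().all()

print(and_names, or_names)
```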

### Transformation

In [None]:
# Multiply the id by two and expose the calculated column under the name 'double'.
query = sa.select((Products.columns.id * 2).label('double'))

In [None]:
for product in con.execute(query).mappings():
    print(product)

## Update
The update system was made to apply modifications to one table at a time.<br>
This is the most basic setup.

In [None]:
# The preceding 'select' section did not use transactions, so it might be dirty.
if con.in_transaction():
    con.rollback()

In [None]:
with con.begin():
    for row in con.execute(Products.select()):
        print(row)

In [None]:
with con.begin() as t:
    query = (
        Products.update()
        .where(Products.c.id == 1)
        .values(name='Record X')
    )
    con.execute(query)
    for row in con.execute(Products.select()):
        print(row)    
    t.rollback()

## Delete
Deleting rows is always an action to take with care.<br>
For the sake of experimentation, it's a good idea to count the rows before and after.

In [None]:
# Count on the primary key: counting the nullable 'name' column would skip rows where it is NULL.
query_count = sa.select(sa.func.count(Products.c.id).label('count'))
with con.begin() as t:
    r = con.scalar(query_count)
    print(f'The {Products.name} table contains {r} rows.')

In [None]:
with con.begin() as t:
    result = con.execute(Products.delete())
    print(f'deleted {result.rowcount} rows')
    print(f'{con.scalar(query_count)} rows in database')
    # Rollback, do not actually delete things, this notebook is an experiment.
    t.rollback()

**One Last Thing:**
SQLAlchemy allows queries to copy one another's WHERE clause.<br>
This means a delete query can take the filtering of a select query.<br>
This can create a system where it's possible to see what is being deleted before actually doing it.

Additionally, systems supporting 'returning' can also return the deleted rows as if it were a select.

In [None]:
with con.begin() as t:
    q_select = sa.select(Products).where(Products.c.id == 1).limit(1)
    result = con.execute(q_select)
    for row in result.mappings():
        print(row)
        
    q_delete = sa.delete(Products).where(q_select.whereclause).returning(Products)
    print('Delete Query:', q_delete)
    result = con.execute(q_delete)
    print('Deleted Records:')
    for row in result:
        print(row)
    t.rollback()