# Core Create

This Notebook focuses on the process of inserting records.

The following topics are covered:
- Insert Basics
- Insert with default values
- Insert from another table (insert into ... select ...)
- Returning
- Upsert / Update or Insert / ON CONFLICT
- Common Table Expressions / WITH (select)

Further Reading:
- [INSERT syntax](https://www.sqlite.org/lang_insert.html) by SQLite
- ["Using INSERT Statements"](https://docs.sqlalchemy.org/en/20/tutorial/data_insert.html) by SQLAlchemy

# Setup

Although a foreign key *should* enforce the existence of the referenced row in the other table, SQLite does not enforce this by default.<br>
Enforcement is enabled per connection with `PRAGMA foreign_keys = ON`.
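
As a rough illustration of what the PRAGMA changes, here is a minimal sketch using Python's built-in `sqlite3` module instead of SQLAlchemy.<br>
The two tables are simplified stand-ins for the ones in this notebook, and `fk_insert_fails` is a hypothetical helper name, not part of `utils`.

```python
import sqlite3


def fk_insert_fails(enforce: bool) -> bool:
    """Return True when inserting a product with an unknown category is rejected."""
    conn = sqlite3.connect(':memory:')
    try:
        if enforce:
            # Must be issued on every new connection, outside a transaction.
            conn.execute('PRAGMA foreign_keys = ON')
        conn.execute('CREATE TABLE categories (id INTEGER PRIMARY KEY)')
        conn.execute(
            'CREATE TABLE products ('
            ' id INTEGER PRIMARY KEY,'
            ' category_id INTEGER NOT NULL REFERENCES categories (id))'
        )
        try:
            # Category 999 does not exist.
            conn.execute('INSERT INTO products (category_id) VALUES (999)')
        except sqlite3.IntegrityError:
            return True
        return False
    finally:
        conn.close()


print('rejected without PRAGMA:', fk_insert_fails(enforce=False))
print('rejected with PRAGMA:', fk_insert_fails(enforce=True))
```

The PRAGMA is a per-connection setting, so it has to be repeated for every new connection.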

In [None]:
import sqlalchemy as sa
from utils import logs, rollback, try_fix_quirks

engine = sa.create_engine('sqlite://')
con = engine.connect()
try_fix_quirks(con)

In [None]:
metadata = sa.MetaData()
Categories = sa.Table(
    'categories', 
    metadata,
    sa.Column('id', sa.Integer, primary_key=True),  # Not using autoincrement for demonstration purposes.
    sa.Column('name', sa.VARCHAR(255), nullable=False, unique=True),
)

Products = sa.Table(
    'products', 
    metadata, 
    sa.Column('id', sa.Integer, primary_key=True, autoincrement=True),
    sa.Column('name', sa.VARCHAR(255), default=None, nullable=True),
    sa.Column('category_id', sa.INTEGER, sa.ForeignKey(Categories.c.id), nullable=False),
    sa.Column('price', sa.DOUBLE, nullable=False)
)

metadata.create_all(engine)

with con.begin():
    result = con.execute(Categories.insert(), [{'id': 1, 'name': 'No Category'}, {'id': 2, 'name': 'Smartphones'}])
    default_category: int = 1
    
print('Default category:', default_category)

# Insert Basics

These are the basic insert queries, a quick refresher from the core CRUD notebook.<br>
After calling `connection.execute(insert)`, the returned result is quite useful.

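`result.returns_rows` is a boolean that is `True` when the statement produces rows (a `SELECT`, or an `INSERT` with `RETURNING`) and `False` for a plain `INSERT`.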

`result.rowcount` provides an integer with the number of rows inserted.


In [None]:
with rollback(con):
    result = con.execute(Products.insert(), {'name': 'Android', 'price': 100, 'category_id': default_category})
    print('rows affected:', result.rowcount)
    print('returns rows:', result.returns_rows)
    for row in con.execute(Products.select()):
        print(row.id)

## Inserted Primary Key

Tables can be configured to generate a primary key when a record is added.<br>
SQLAlchemy can return this value.

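`result.inserted_primary_key` provides the primary key of a single inserted row as a tuple.<br>
`result.inserted_primary_key_rows` provides a list of such tuples, intended for statements that insert several rows.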


In [None]:
with rollback(con):
    result = con.execute(Products.insert(), {'name': 'Android', 'price': 100, 'category_id': default_category})
    print('primary key:', result.inserted_primary_key)
    print('primary key rows:', result.inserted_primary_key_rows)

# Insert with Defaults
Data might have a few constant values which keep getting reused.<br>
In this example, several products are added with the same category.

`con.execute(Products.insert(), data)`

This is a straightforward piece of code, but what if *some* data didn't have a category ID assigned yet?

```python
data = [{}]
for d in data:
    d.setdefault('category_id', default_category)
```

*Although* this is simple code, SQLAlchemy has a built-in way of doing this.<br>
When creating the insert query, `.values()` can be used to set a default for missing values.<br>
When running and logging the query, no special 'default' syntax appears.<br>
SQLAlchemy adds the value whenever it is missing.

In [None]:
with rollback(con):
    query = (
        Products.insert()
        # Set the default for category_id.
        .values(category_id=default_category)
    )
    data = [
        {'name': 'Android', 'price': 100},
        {'name': 'iPhone', 'price': 200, 'category_id': 2},
    ]
    with logs():
        con.execute(query, data)
    print('---')
    for row in con.execute(Products.select()):
        print(row)

# Insert ... Select
This is an insert that copies its data from a select statement.<br>
In SQL, this expression would be:
```SQL
INSERT INTO MyTable (name) SELECT name FROM OtherTable;
```

SQLAlchemy is still very much capable of this.
1. Create a Select query with the data you want.
2. Make sure the columns in the Select query match the order of the Insert columns.

`insert.from_select(names, select)`: `names` lists the columns to fill in, and `select` must produce the same number of columns, in the same order.

**Note**
> This will not provide a Primary Key, even if only 1 record was inserted.
> 
> The `rowcount` member should still be available as usual.
>
> As always, some DBMS may implement this while others will not.


In [None]:
with con.begin() as t:
    con.execute(Products.insert(), {'name': 'Android', 'category_id': default_category, 'price': 100})
    for row in con.execute(Products.select()):
        print(row)
    
    original = sa.select(Products.c['name'], Products.c['price'], Products.c['category_id'])
    inserting = Products.insert()
    with logs():
        query = inserting.from_select([Products.c['name'], Products.c['price'], Products.c['category_id']], original)
        result = con.execute(query)
    print('--')
    print('rowcount:', result.rowcount)
    print('returns_rows:', result.returns_rows)
    print('inserted_primary_keys:', result.inserted_primary_key_rows)
    print('closed(cursor):', result.closed)
    print('lastrowid:', result.lastrowid)
    print('--')
    for row in con.execute(Products.select()):
        print(row)
    
    t.rollback()

# Returning

The `RETURNING` clause is a relatively recent addition to databases, and is not part of the official SQL standard.<br>
For an `INSERT` statement, this means returning row data immediately after the insert.<br>
This can make it much easier to query columns for server-calculated default values.

**Note:** As a non-standard feature, not all DBMS may support this.<br>
PostgreSQL and SQLite (since 2021, version 3.35.0) should support this feature.

In [None]:
with con.begin() as t:
    # Multiple Inserts
    as_list = [{'name': 'Blackberry', 'price': 50}, {'name': 'Nokia', 'price': 60}]
    query = Products.insert().values(category_id=default_category).returning(Products.c['id'], Products.c['name'])
    with logs():
        result = con.execute(query, as_list)
    print(f'Inserted {result.rowcount} records')
    print(f'Has rows: {result.returns_rows}')
    for entry in result:
        print(entry)
    
    t.rollback()

**Note:** Apparently `RETURNING` makes SQLite return the primary keys for multiple rows, where it first could not.

# Upsert (on_conflict)
Upsert, also known as "update or insert", is a feature which *tries* to insert, but will do an update if there's a conflict.<br>
SQLAlchemy names this "on_conflict_do_update" or "on_conflict_do_nothing".

These features are available for SQLite and PostgreSQL, **but** they require the dialect-specific version of 'insert'.<br>
This has to be imported explicitly in Python code (if there is a fix for this, let me know).

When using 'returning' with "do_nothing", it will only return a row when an insert is successful.<br>
When using 'returning' with "do_update", it will return the inserted or updated row.

MySQL instead provides the [on_duplicate_key_update](https://docs.sqlalchemy.org/en/20/dialects/mysql.html#insert-on-duplicate-key-update-upsert) method (demo not shown here yet).
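
With "do_update", the values of the row that failed to insert are available through `excluded`.<br>
The sketch below is self-contained and uses its own copy of the categories table (the `demo_` names are made up for this example); it names the conflict target explicitly with `index_elements`.

```python
import sqlalchemy as sa
from sqlalchemy.dialects.sqlite import insert as sqlite_insert

demo_engine = sa.create_engine('sqlite://')
demo_metadata = sa.MetaData()
demo_categories = sa.Table(
    'categories',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.String(255), nullable=False, unique=True),
)
demo_metadata.create_all(demo_engine)

with demo_engine.begin() as demo_con:
    demo_con.execute(demo_categories.insert(), [{'id': 1, 'name': 'No Category'}])

    stmt = sqlite_insert(demo_categories).values(id=1, name='General')
    stmt = stmt.on_conflict_do_update(
        # Conflict target: the primary key column.
        index_elements=[demo_categories.c.id],
        # 'excluded' refers to the row that could not be inserted.
        set_={'name': stmt.excluded.name},
    )
    demo_con.execute(stmt)

    names = demo_con.execute(sa.select(demo_categories.c.name)).scalars().all()

print(names)
```

Using `excluded` avoids repeating the new value as a constant inside `set_`.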

In [None]:
from sqlalchemy.dialects.sqlite import insert

with con.begin() as t:
    data = [{'id': 1, 'name': 'General'}]
    query = insert(Categories).on_conflict_do_nothing()
    result = con.execute(query, data)
    # on_conflict_do_nothing
    t.rollback()

In [None]:
from sqlalchemy.dialects.sqlite import insert

with con.begin() as t:
    data = [{'id': 1, 'name': 'General'}]
    # Name the conflict target (the primary key) explicitly.
    query = insert(Categories).on_conflict_do_update(index_elements=[Categories.c.id], set_={'name': 'General'})
    result = con.execute(query, data)
    # on_conflict_do_update
    for row in con.execute(sa.select(Categories)):
        print(row)
    t.rollback()

*((Any examples for MySQL's "on_duplicate_key_update" should go here))*
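
As a starting point, here is a compile-only sketch of `on_duplicate_key_update`; it only renders the SQL with the MySQL dialect and does not connect to a MySQL server, so it is untested against a real database.<br>
The `demo_` names are made up for this example.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql
from sqlalchemy.dialects.mysql import insert as mysql_insert

demo_metadata = sa.MetaData()
demo_categories = sa.Table(
    'categories',
    demo_metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.String(255), nullable=False, unique=True),
)

stmt = mysql_insert(demo_categories).values(id=1, name='General')
# 'inserted' refers to the row that was proposed for insertion.
stmt = stmt.on_duplicate_key_update(name=stmt.inserted.name)

# Render the statement for MySQL without executing it.
sql = str(stmt.compile(dialect=mysql.dialect()))
print(sql)
```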

# WITH (common table expressions)
The "Common Table Expression" (CTE) is similar to a subquery, but works slightly differently.<br>
Where a subquery is single use, a CTE is defined once and can be referenced several times within the same statement without having to be repeated.
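
A minimal sketch of a CTE that is defined once and referenced twice, run against its own in-memory SQLite engine.<br>
The CTE name `numbers` and the `demo_engine` variable are made up for this example.

```python
import sqlalchemy as sa

demo_engine = sa.create_engine('sqlite://')

# Define the CTE once...
numbers = sa.select(
    sa.literal(1).label('one'),
    sa.literal(2).label('two'),
).cte('numbers')

# ...and reference it from both halves of a UNION ALL.
query = sa.select(numbers.c.one).union_all(sa.select(numbers.c.two))

sql = str(query)
print(sql)

with demo_engine.connect() as demo_con:
    values = demo_con.execute(query).scalars().all()
print(values)
```

The rendered SQL contains a single `WITH` clause, even though `numbers` appears in two `SELECT` statements.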

In [None]:
q = sa.select(
    sa.literal(1).label('one'), 
    sa.literal(2).label('two')
)


In [None]:
cte_1 = q.cte('CTE')

In [None]:
query = (
    Products.insert()
    .from_select(
        # The CTE only provides two columns, so only two target columns are listed.
        [
            Products.c.price,
            Products.c.category_id
        ],
        sa.select(cte_1.c.two, cte_1.c.one))
)
print(str(query))


In [None]:
from sqlalchemy.dialects import mssql
print(str(query.compile(dialect=mssql.dialect())))

## Caching
The CTE could be considered a view, but does it cache data?

Long story short: there's a chance.<br>
In the end, it's the specific server implementation which decides if data is cached or not.<br>
If a CTE is used only once, it might not cache anything and instead load on demand like a subquery.

SQLite does perform some caching, as demonstrated by the following query (random number generation).<br>
If both printed numbers are equal, the CTE was evaluated only once.

In [None]:
cte = sa.select(sa.func.random().label('value')).cte('randoms')
query = sa.select(cte.c.value).union_all(sa.select(cte.c.value))
for row in con.execute(query).scalars():
    print(row)