## Data Manipulation with the ORM

The previous section `Working with Data` remained focused on the _SQL Expression Language_ from a _Core perspective_, in order to __provide continuity__ across the _major SQL statement constructs_. This section will then build out the lifecycle of the `Session` and __how it interacts with these constructs__.

**Prerequisite Sections** - the ORM focused part builds upon three previous _ORM-centric sections_ in this document:

* __Executing with an ORM Session__ - introduces how to make an ORM `Session` object.

* __Defining Table Metadata with the ORM__ - where we set up our _ORM mappings_ of the `User` and `Address` entities.

* __Selecting ORM Entities and Columns__ - a few examples on _how to run SELECT statements_ for entities like `User`.

#### Inserting Rows with the ORM

When using the ORM, the `Session` object is responsible for __constructing Insert constructs__ and __emitting__ them for us in a _transaction_. The way we instruct the `Session` to do so is by __adding object entries__ to it; the `Session` then makes sure these _new entries_ will be _emitted to the database_ when they are needed, using a process known as a __`flush`__.

In [1]:
from sqlalchemy import (
    Column, Integer, String, ForeignKey,
    create_engine, select, update, delete
)
from sqlalchemy.orm import Session, registry, relationship

In [2]:
engine = create_engine("sqlite+pysqlite:///:memory:", echo=True, future=True)

In [3]:
mapped_registry = registry()
Base = mapped_registry.generate_base()

In [4]:
class User(Base):
    __tablename__ = "user_account"
    
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)
    
    addresses = relationship("Address", back_populates="user")
    
    def __repr__(self):
        return f"User(id={self.id!r}, name={self.name!r}, fullname={self.fullname!r})"

In [5]:
class Address(Base):
    __tablename__ = "address"
    
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    
    user = relationship("User", back_populates="addresses")
    
    def __repr__(self):
        return f"Address(id={self.id!r}, email_address={self.email_address!r})"

In [6]:
mapped_registry.metadata.create_all(engine)

2022-10-08 12:22:14,664 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-08 12:22:14,666 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("user_account")
2022-10-08 12:22:14,667 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 12:22:14,668 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("user_account")
2022-10-08 12:22:14,669 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 12:22:14,671 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("address")
2022-10-08 12:22:14,672 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 12:22:14,672 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("address")
2022-10-08 12:22:14,673 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 12:22:14,675 INFO sqlalchemy.engine.Engine 
CREATE TABLE user_account (
	id INTEGER NOT NULL, 
	name VARCHAR(30), 
	fullname VARCHAR, 
	PRIMARY KEY (id)
)


2022-10-08 12:22:14,676 INFO sqlalchemy.engine.Engine [no key 0.00065s] ()
2022-10-08 12:22:14,677 INFO sqlalchemy.engine.Engine 
C

##### Instances of Classes represent Rows

Whereas in the previous example we __emitted an `INSERT`__ using Python dictionaries to indicate the _data we wanted to add_, with the ORM we make _direct use of the custom Python classes_ we defined back at `Defining Table Metadata with the ORM`. At the _class level_, the `User` and `Address` classes served as a place to define what the __corresponding database tables__ should look like. These classes also serve as __extensible data objects__ that we use to _create and manipulate rows within a transaction_. Below we will create four `User` objects, each representing a _potential database row_ to be `INSERT`ed.

In [7]:
sandy = User(name="sandy", fullname="Sandy Cheeks")
patrick = User(name="patrick", fullname="Patrick Star")
squidward = User(name="squidward", fullname="Squidward Tentacles")
krabs = User(name="ehkrabs", fullname="Eugene H. Krabs")

We are able to construct these objects using the _names of the mapped columns as keyword arguments_ in the constructor. This is possible as the `User` class includes an __automatically generated__ `__init__()` constructor that was _provided by the ORM mapping_ so that we could create each object using _column names as keys in the constructor_.

In a similar manner as in our _Core examples_ of `Insert`, we __did not include a primary key__ (i.e. an entry for the `id` column), since we would like to make use of the _auto-incrementing primary key feature of the database_, `SQLite` in this case, which the ORM also integrates with. The value of the `id` attribute on the above objects, if we were to view it, displays itself as `None`.

In [8]:
print(f"{squidward=}")

squidward=User(id=None, name='squidward', fullname='Squidward Tentacles')


The `None` value is provided by `SQLAlchemy` to indicate that the __attribute has no value__ as of yet. _SQLAlchemy-mapped attributes_ always __return a value in Python__ and _don't raise AttributeError if they're missing_, when dealing with a new object that has not had a value assigned.

At the moment, our four objects above are said to be in a _state_ called `transient` - they are __`not associated` with any database state__ and are __yet to be associated with a `Session` object__ that can _generate_ `INSERT` statements for them.

##### Adding objects to a Session

To illustrate the _addition_ process step by step, we will create a `Session` without using a _context manager_ (and hence we must make sure we close it later!).

In [9]:
session = Session(engine)

The __objects are then added to the `Session`__ using the `Session.add()` method. Once this is called, the objects are in a _state_ known as __`pending`__ and _have not been inserted_ yet.

In [10]:
session.add(sandy)
session.add(patrick)
session.add(squidward)
session.add(krabs)

When we have _pending objects_, we can see this _state_ by looking at a _collection_ on the `Session` called `Session.new`.

In [11]:
session.new

IdentitySet([User(id=None, name='sandy', fullname='Sandy Cheeks'), User(id=None, name='patrick', fullname='Patrick Star'), User(id=None, name='squidward', fullname='Squidward Tentacles'), User(id=None, name='ehkrabs', fullname='Eugene H. Krabs')])

The above view uses a __collection__ called `IdentitySet`, which is _essentially_ a `Python set` that __hashes on object identity__ in all cases (i.e., using Python's built-in `id()` function, rather than the Python `hash()` function).
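
The difference between hashing on identity and hashing on value can be sketched in plain Python. The `Point` class and the `IdentitySetSketch` container below are hypothetical names invented for this illustration; they are not SQLAlchemy's `IdentitySet` implementation, only a minimal stand-in that keys its members by `id()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Value-style class: instances with equal fields compare and hash as equal."""
    x: int
    y: int


class IdentitySetSketch:
    """Hypothetical identity-keyed set (illustration only, not SQLAlchemy code)."""

    def __init__(self):
        # Maps id(obj) -> obj; holding the object keeps its id() stable.
        self._members = {}

    def add(self, obj):
        self._members[id(obj)] = obj

    def __contains__(self, obj):
        return id(obj) in self._members

    def __len__(self):
        return len(self._members)


a = Point(1, 2)
b = Point(1, 2)  # equal to `a` by value, but a distinct object in memory

plain = {a, b}  # built-in set: relies on __hash__() and __eq__()

by_identity = IdentitySetSketch()
by_identity.add(a)
by_identity.add(b)

print(len(plain))        # 1, the two equal values collapse into one entry
print(len(by_identity))  # 2, each distinct object is tracked separately
```

Tracking by identity suits a `Session`, since two distinct Python objects should be treated as two separate pending rows even if a user-defined `__eq__()` would consider them equal.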

##### Flushing

The `Session` makes use of a _pattern_ known as __`unit of work`__. This generally means it __accumulates changes one at a time__, but _does not actually communicate them_ to the database __until needed__. This allows it to __make better decisions__ about _how SQL DML should be emitted_ in the `transaction` based on a given __set of `pending` changes__. When it does _emit SQL to the database to push out the current set of changes_, the process is known as a `flush`. We can illustrate the _flush process_ __manually__ by calling the `Session.flush()` method.

In [12]:
session.flush()

2022-10-08 12:22:15,375 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-08 12:22:15,378 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-08 12:22:15,380 INFO sqlalchemy.engine.Engine [generated in 0.00223s] ('sandy', 'Sandy Cheeks')
2022-10-08 12:22:15,381 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-08 12:22:15,382 INFO sqlalchemy.engine.Engine [cached since 0.005007s ago] ('patrick', 'Patrick Star')
2022-10-08 12:22:15,384 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-08 12:22:15,385 INFO sqlalchemy.engine.Engine [cached since 0.00717s ago] ('squidward', 'Squidward Tentacles')
2022-10-08 12:22:15,386 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-08 12:22:15,387 INFO sqlalchemy.engine.Engine [cached since 0.009419s ago] ('ehkrabs', 'Eugene H. Krabs')


Above we observe the `Session` was _first called_ upon to __emit SQL__, so it _created a new transaction_ and __emitted__ the _appropriate INSERT statements_ for the four objects. The `transaction` now __remains open__ _until_ we call any of the `Session.commit()`, `Session.rollback()`, or `Session.close()` methods of `Session`.

While `Session.flush()` may be used to _manually_ __push out pending changes__ to the _current transaction_, it is _usually unnecessary_ as the `Session` features a behavior known as __`autoflush`__, which we will illustrate later. It also _flushes out changes_ whenever `Session.commit()` is called.

##### Autogenerated primary key attributes

Once the _rows_ are __inserted__, the four _Python objects_ we've created are in a _state_ known as __`persistent`__, where they are _associated_ with the `Session` object in which they were __added or loaded__, and _feature lots of other behaviors_ that will be covered later.

Another effect of the `INSERT` that occurred was that the __ORM has retrieved the `new primary key identifiers` for each new object__; _internally_ it normally uses the same `CursorResult.inserted_primary_key` accessor we introduced previously. The `squidward` and `krabs` objects now have these _new primary key identifiers_ __associated with them__ and we can view them by _accessing_ the `id` attribute.

In [13]:
print(f"{squidward.id = }")
print(f"{krabs.id = }")

squidward.id = 3
krabs.id = 4
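
The `transient`, `pending` and `persistent` states described so far can also be observed programmatically. The following is a minimal sketch, assuming a fresh in-memory SQLite database and a cut-down `User` mapping; the `snapshot()` helper and the `lifecycle` dictionary are names made up for this illustration. It relies on `sqlalchemy.inspect()`, which returns the `InstanceState` for a mapped object.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, registry

mapped_registry = registry()
Base = mapped_registry.generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
mapped_registry.metadata.create_all(engine)


def snapshot(obj):
    """Hypothetical helper: collect the state flags of a mapped object."""
    state = inspect(obj)
    return {
        "transient": state.transient,
        "pending": state.pending,
        "persistent": state.persistent,
    }


squidward = User(name="squidward", fullname="Squidward Tentacles")
lifecycle = {"constructed": snapshot(squidward)}

with Session(engine) as session:
    session.add(squidward)
    lifecycle["after add()"] = snapshot(squidward)

    session.flush()  # emits the INSERT inside the open transaction
    lifecycle["after flush()"] = snapshot(squidward)

for step, flags in lifecycle.items():
    print(step, flags)
```

Exactly one flag is true at each step: `transient` for the newly constructed object, `pending` after `Session.add()`, and `persistent` once the flush has inserted the row.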


> ##### Tip
> 
> __Why did the ORM emit `four separate INSERT statements` when it could have used `executemany`?__
> As we'll see in the next section, the `Session` when __flushing objects__ always _needs to know the primary key of newly inserted objects_. If a feature such as _SQLite's autoincrement_ is used (other examples include __PostgreSQL__ `IDENTITY` or `SERIAL`, `using sequences`, etc.), the `CursorResult.inserted_primary_key` feature usually requires that __each `INSERT`__ is __emitted one row at a time__. If we had _provided values for the primary keys ahead of time_, the ORM would have been __able to optimize the operation better__. Some database drivers, such as `psycopg2`, __can__ also `INSERT` _many rows at once_ while still being __able to retrieve the primary key values__.

##### Getting Objects by Primary Key from the Identity Map

The `primary key identity` of the objects is __significant__ to the `Session`, as the objects are now _linked to this identity in memory_ using a feature known as the `identity map`. The `identity map` is an __in-memory store__ that __links__ all objects _currently loaded_ __`in memory`__ to their `primary key identity`. We can observe this by _retrieving_ one of the above objects using the `Session.get()` method, which will _return an entry from the identity map_ __if locally present__, _otherwise_ __emitting a `SELECT`__.

In [14]:
some_squidward = session.get(User, 3)
print(f"{some_squidward = }")

some_squidward = User(id=3, name='squidward', fullname='Squidward Tentacles')


The _important_ thing to note about the `identity map` is that it __maintains a unique instance__ of a _particular Python object per a particular database identity_, __within the scope__ of a particular `Session` object. We may observe that the `some_squidward` refers to the __same object__ as that of `squidward` previously.

In [15]:
print(f"{some_squidward is squidward = }")

some_squidward is squidward = True


The `identity map` is a _critical feature_ that __allows complex sets of objects__ to be __manipulated__ _within a transaction without things getting out of sync_.
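
The phrase "within the scope of a particular `Session`" can be illustrated with two sessions. This is a minimal sketch under the assumption of a fresh in-memory SQLite database and a cut-down `User` mapping; the variable names are chosen for this example only. Each `Session` keeps its own identity map, so the same row loaded through two sessions yields two distinct Python objects.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, registry

mapped_registry = registry()
Base = mapped_registry.generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
mapped_registry.metadata.create_all(engine)

with Session(engine) as first_session:
    first_session.add(User(name="squidward", fullname="Squidward Tentacles"))
    first_session.commit()

    # Both lookups go through the identity map of the same Session.
    first_lookup = first_session.get(User, 1)
    second_lookup = first_session.get(User, 1)
    same_within_session = first_lookup is second_lookup

with Session(engine) as second_session:
    # A different Session has its own identity map, so it builds a new object.
    other_lookup = second_session.get(User, 1)
    same_across_sessions = other_lookup is first_lookup
    same_primary_key = other_lookup.id == first_lookup.id

print(f"{same_within_session = }")
print(f"{same_across_sessions = }")
print(f"{same_primary_key = }")
```

The two objects from different sessions represent the same database identity, but they are not the same instance, which is why objects are generally best used with the `Session` that loaded them.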

##### Committing

There's much more to say about how the `Session` works which will be discussed further. For now we will __commit the transaction__ so that we can build up knowledge on how to `SELECT` rows before examining more ORM behaviors and features.

In [16]:
session.commit()

2022-10-08 12:22:15,772 INFO sqlalchemy.engine.Engine COMMIT


The above operation will __commit the transaction__ that was _in progress_. The `objects` which we've dealt with are __still attached__ to the `Session`, which is a _state_ they stay in __until the `Session` is closed__ (which is introduced at `Closing a Session`).

> ##### Tips
> 
> An _important_ thing to note is that `attributes on the objects` that we just worked with have been __`expired`__, _meaning_, when we next _access any attributes_ on them, the `Session` will __start a new transaction and re-load their state__. This _option_ is __sometimes problematic__, either for _performance reasons_ or if one wishes to _use the objects after closing_ the __`Session`__ (which is known as the `detached state`), as they _will not have any state_ and _will have no_ `Session` with which _to load that state_, leading to __`"detached instance"` errors__. The behavior is __controllable__ using a parameter called `Session.expire_on_commit`. More on this is at `Closing a Session`.
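
The effect of `expire_on_commit` can be sketched as follows. This example assumes a fresh in-memory SQLite database and a cut-down `User` mapping; the `create_and_close()` helper and the `outcome` variable are hypothetical names used only for this illustration. It compares an object committed with the default setting against one committed with `expire_on_commit=False`, after both sessions have been closed.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, registry
from sqlalchemy.orm.exc import DetachedInstanceError

mapped_registry = registry()
Base = mapped_registry.generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
mapped_registry.metadata.create_all(engine)


def create_and_close(expire_on_commit):
    """Hypothetical helper: insert a row, commit, close, return the detached object."""
    session = Session(engine, expire_on_commit=expire_on_commit)
    user = User(name="sandy", fullname="Sandy Cheeks")
    session.add(user)
    session.commit()
    session.close()
    return user


# Default behavior: attributes are expired at commit, and the detached
# object has no Session left with which to reload them.
expired_user = create_and_close(expire_on_commit=True)
try:
    expired_user.fullname
    outcome = "loaded"
except DetachedInstanceError:
    outcome = "detached instance error"

# With expire_on_commit=False the already-loaded values stay on the object.
kept_user = create_and_close(expire_on_commit=False)

print(f"{outcome = }")
print(f"{kept_user.fullname = }")
```

The trade-off is that values kept this way are not re-read from the database after the commit, so they can become stale if the row changes elsewhere.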

#### Updating ORM Objects

In the preceding section `Updating and Deleting Rows with Core`, we introduced the `Update` construct that represents a _SQL UPDATE statement_. When using the ORM, there are __two ways__ in which this construct is used. The _primary way_ is that it is __emitted automatically__ as part of the _unit of work process_ used by the `Session`, where an `UPDATE` statement is _emitted on a per-primary key basis corresponding to individual objects_ that have changes on them. A _second_ form of `UPDATE` is called an __`"ORM enabled UPDATE"`__ and allows us to use the `Update` construct with the `Session` __explicitly__; this is described in the next section.

Suppose we load the `User` object for the _username_ `sandy` into a _transaction_ (also showing off the `Select.filter_by()` method as well as the `Result.scalar_one()` method).

In [17]:
sandy = session.execute(select(User).filter_by(name="sandy")).scalar_one()

2022-10-08 12:22:15,873 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-08 12:22:15,876 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account 
WHERE user_account.name = ?
2022-10-08 12:22:15,877 INFO sqlalchemy.engine.Engine [generated in 0.00067s] ('sandy',)


The _Python object_ `sandy` as mentioned before _acts as a_ __proxy__ for the _row in the database_, more specifically the _database row in terms of the_ __current transaction__, that has the primary key identity of 1.

In [18]:
print(f"{sandy = }")

sandy = User(id=1, name='sandy', fullname='Sandy Cheeks')


If we __alter__ the _attributes_ of this object, the `Session` __tracks this change__.

In [19]:
sandy.fullname = "Sandy Squirrel"

The _object_ __appears in a collection__ called `Session.dirty`, indicating the object is __`"dirty"`__.

In [20]:
sandy in session.dirty

True

When the `Session` _next_ __emits a flush__, an `UPDATE` will be _emitted_ that __updates__ this _value in the database_. As mentioned previously, a `flush` occurs __automatically__ _before we emit any SELECT_, using a behavior known as `autoflush`. We can __query directly__ for the `User.fullname` column from this row and we will get our _updated value back_.

In [21]:
sandy_fullname = session.execute(select(User.fullname).where(User.id == 1)).scalar_one()
print(f"{sandy_fullname = }")

2022-10-08 12:22:16,337 INFO sqlalchemy.engine.Engine UPDATE user_account SET fullname=? WHERE user_account.id = ?
2022-10-08 12:22:16,339 INFO sqlalchemy.engine.Engine [generated in 0.00150s] ('Sandy Squirrel', 1)
2022-10-08 12:22:16,341 INFO sqlalchemy.engine.Engine SELECT user_account.fullname 
FROM user_account 
WHERE user_account.id = ?
2022-10-08 12:22:16,343 INFO sqlalchemy.engine.Engine [generated in 0.00170s] (1,)
sandy_fullname = 'Sandy Squirrel'


We can see above that we requested that the `Session` __execute__ a _single_ `select()` statement. However, the _SQL emitted_ shows that an `UPDATE` was emitted as well, which was the `flush process` __pushing out pending changes__. The `sandy` _Python object_ is now __no longer considered `dirty`__.

In [22]:
sandy in session.dirty

False

However note we are __still in a `transaction`__ and our _changes_ __have not been pushed__ to the _database's permanent storage_. Since _Sandy's last name_ is in fact `"Cheeks"` not `"Squirrel"`, we will __repair__ _this mistake_ later when we __roll back the transaction__. But first we'll make some more data changes.
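
To see that it is `autoflush` which pushes the `UPDATE` out ahead of the `SELECT`, the behavior can be temporarily suspended with the `Session.no_autoflush` context manager. The following is a minimal sketch, assuming a fresh in-memory SQLite database and a cut-down `User` mapping; the variable names are specific to this example.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, registry

mapped_registry = registry()
Base = mapped_registry.generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
mapped_registry.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="sandy", fullname="Sandy Cheeks"))
    session.commit()

    sandy = session.execute(select(User).filter_by(name="sandy")).scalar_one()
    sandy.fullname = "Sandy Squirrel"  # change is tracked, not yet flushed

    fullname_query = select(User.fullname).where(User.name == "sandy")

    with session.no_autoflush:
        # No flush happens here, so the database still holds the old value.
        value_without_flush = session.execute(fullname_query).scalar_one()
        dirty_without_flush = sandy in session.dirty

    # Autoflush is active again: the UPDATE is emitted before this SELECT.
    value_with_flush = session.execute(fullname_query).scalar_one()
    dirty_with_flush = sandy in session.dirty

print(f"{value_without_flush = }, {dirty_without_flush = }")
print(f"{value_with_flush = }, {dirty_with_flush = }")
```

Inside the `no_autoflush` block the query still sees `'Sandy Cheeks'` and the object stays in `Session.dirty`; once autoflush applies again, the query returns `'Sandy Squirrel'` and the object is no longer dirty.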

##### ORM-enabled UPDATE statements

As previously mentioned, there's a _second way_ to __emit `UPDATE` statements__ in terms of the ORM, which is known as an __`ORM enabled UPDATE statement`__. This allows the use of a _generic SQL UPDATE statement_ that __can affect many rows at once__. For example, we can _emit_ an `UPDATE` that will _change_ the `User.fullname` column _based on a value_ in the `User.name` column.

In [23]:
session.execute(
    update(User).
    where(User.name == "sandy").
    values(fullname="Sandy Squirrel Extraordinaire")
)

2022-10-08 12:22:16,590 INFO sqlalchemy.engine.Engine UPDATE user_account SET fullname=? WHERE user_account.name = ?
2022-10-08 12:22:16,592 INFO sqlalchemy.engine.Engine [generated in 0.00122s] ('Sandy Squirrel Extraordinaire', 'sandy')


<sqlalchemy.engine.cursor.CursorResult at 0x2579b59c3d0>

When _invoking_ the __ORM-enabled `UPDATE` statement__, _special logic_ is used to __locate__ objects in the `current session` that _match the given criteria_, so that they are __refreshed__ _with the new data_. Above, the `sandy` _object identity_ was __located in memory and refreshed__.

In [24]:
sandy.fullname

'Sandy Squirrel Extraordinaire'

The _refresh logic_ is known as the `synchronize_session` option, and is described in detail in the section `UPDATE and DELETE with arbitrary WHERE clause`.
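
As a rough illustration of what the option controls, the sketch below turns synchronization off by passing `synchronize_session=False` as an execution option. It assumes a fresh in-memory SQLite database and a cut-down `User` mapping, and the variable names are specific to this example. With synchronization disabled, the in-memory object keeps its previously loaded value until it is refreshed explicitly.

```python
from sqlalchemy import Column, Integer, String, create_engine, select, update
from sqlalchemy.orm import Session, registry

mapped_registry = registry()
Base = mapped_registry.generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
mapped_registry.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="sandy", fullname="Sandy Cheeks"))
    session.commit()

    sandy = session.execute(select(User).filter_by(name="sandy")).scalar_one()

    session.execute(
        update(User)
        .where(User.name == "sandy")
        .values(fullname="Sandy Squirrel Extraordinaire")
        # Skip the step that locates and refreshes matching in-memory objects.
        .execution_options(synchronize_session=False)
    )

    stale_value = sandy.fullname      # still the value loaded before the UPDATE
    session.refresh(sandy)            # explicitly re-read the row
    refreshed_value = sandy.fullname

print(f"{stale_value = }")
print(f"{refreshed_value = }")
```

Skipping synchronization avoids the extra matching work, at the cost of leaving already-loaded objects out of date until they are expired or refreshed.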

#### Deleting ORM Objects

To __round out__ the _basic persistence operations_, an _individual ORM object_ __may be marked for deletion__ by using the `Session.delete()` method. Let's load up `patrick` from the database.

In [25]:
patrick = session.get(User, 2)

2022-10-08 12:22:16,844 INFO sqlalchemy.engine.Engine SELECT user_account.id AS user_account_id, user_account.name AS user_account_name, user_account.fullname AS user_account_fullname 
FROM user_account 
WHERE user_account.id = ?
2022-10-08 12:22:16,845 INFO sqlalchemy.engine.Engine [generated in 0.00173s] (2,)


If we mark `patrick` for __deletion__, as is the case with other operations, _nothing actually happens yet until a flush proceeds_.

In [26]:
session.delete(patrick)

Current ORM behavior is that `patrick` __stays in the `Session` until the `flush` proceeds__, which as mentioned before _occurs if we emit a query_.

In [27]:
session.execute(select(User).where(User.name == "patrick")).first()

2022-10-08 12:22:17,046 INFO sqlalchemy.engine.Engine SELECT address.id AS address_id, address.email_address AS address_email_address, address.user_id AS address_user_id 
FROM address 
WHERE ? = address.user_id
2022-10-08 12:22:17,048 INFO sqlalchemy.engine.Engine [generated in 0.00245s] (2,)
2022-10-08 12:22:17,051 INFO sqlalchemy.engine.Engine DELETE FROM user_account WHERE user_account.id = ?
2022-10-08 12:22:17,052 INFO sqlalchemy.engine.Engine [generated in 0.00094s] (2,)
2022-10-08 12:22:17,055 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account 
WHERE user_account.name = ?
2022-10-08 12:22:17,056 INFO sqlalchemy.engine.Engine [cached since 1.18s ago] ('patrick',)


Above, the `SELECT` we asked to _emit_ was __preceded by a `DELETE`__, which indicated the __pending deletion__ for `patrick` __proceeded__. There was also a `SELECT` against the `address` table, which was _prompted_ by the ORM _looking for rows in this table which may be related to the target row_; this is part of a behavior known as __cascade__, and __can be tailored__ to work __more efficiently__ by _allowing the database_ to _handle related rows_ in `address` _automatically_; the section on delete cascades has all the detail on this.

Beyond that, the `patrick` object instance now being _deleted_ is __no longer__ considered to be __persistent within the `Session`__, as is shown by the _containment check_.

In [28]:
patrick in session

False

However just like the `UPDATE`s we made to the `sandy` object, _every change_ we've made here is __local to an ongoing transaction__, which __won't become `permanent` if we don't `commit` it__. As _rolling the transaction back_ is actually more interesting at the moment, we will do that in the next section.

##### ORM-enabled DELETE Statements

Like `UPDATE` operations, there is also an _ORM-enabled version_ of `DELETE` which we can illustrate by using the `delete()` construct with `Session.execute()`. It also has a feature by which _non-expired objects_ that __match the given deletion criteria__ will be __automatically marked__ as `"deleted"` in the `Session`.

In [29]:
# refresh the target object for demonstration purposes
# only, not needed for the DELETE
squidward = session.get(User, 3)
session.execute(delete(User).where(User.name == "squidward"))

2022-10-08 12:22:17,269 INFO sqlalchemy.engine.Engine SELECT user_account.id AS user_account_id, user_account.name AS user_account_name, user_account.fullname AS user_account_fullname 
FROM user_account 
WHERE user_account.id = ?
2022-10-08 12:22:17,270 INFO sqlalchemy.engine.Engine [cached since 0.4267s ago] (3,)
2022-10-08 12:22:17,273 INFO sqlalchemy.engine.Engine DELETE FROM user_account WHERE user_account.name = ?
2022-10-08 12:22:17,276 INFO sqlalchemy.engine.Engine [generated in 0.00300s] ('squidward',)


<sqlalchemy.engine.cursor.CursorResult at 0x2579d801dc0>

The __`squidward` identity__, like that of `patrick`, is now also __in a `deleted` state__. Note that we had to _re-load_ `squidward` above in order to demonstrate this; if the object were _expired_, the `DELETE` operation __would not take the time to refresh expired objects__ just to see that they had been deleted.

In [30]:
squidward in session

False

#### Rolling Back

The `Session` has a `Session.rollback()` method that as expected __emits a `ROLLBACK` on the SQL connection in progress__. However, it also has an effect on the _objects_ that are __currently associated__ with the `Session`, in our previous example the Python object `sandy`. While we changed the `.fullname` of the `sandy` object to read `"Sandy Squirrel"`, we want to __roll back__ this change. Calling `Session.rollback()` will not only __roll back the transaction__ but also __expire all objects currently associated__ with this `Session`, which will have the effect that they will __refresh themselves__ when _next accessed_ using a process known as __`lazy loading`__.

In [31]:
session.rollback()

2022-10-08 12:22:17,528 INFO sqlalchemy.engine.Engine ROLLBACK


_To view_ the `"expiration"` process __more closely__, we may observe that the Python object `sandy` has __no state left__ within its Python `__dict__`, with the exception of a _special SQLAlchemy internal state object_.

In [32]:
sandy.__dict__

{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState at 0x2579c3bea60>}

This is the `"expired"` state; _accessing the attribute again_ will __autobegin__ a `new transaction` and `refresh` sandy with the __current database row__.

In [33]:
sandy.fullname

2022-10-08 12:22:17,793 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-08 12:22:17,794 INFO sqlalchemy.engine.Engine SELECT user_account.id AS user_account_id, user_account.name AS user_account_name, user_account.fullname AS user_account_fullname 
FROM user_account 
WHERE user_account.id = ?
2022-10-08 12:22:17,797 INFO sqlalchemy.engine.Engine [cached since 0.953s ago] (1,)


'Sandy Cheeks'

We may now observe that the _full database row_ was also __populated__ into the `__dict__` of the sandy object.

In [34]:
sandy.__dict__

{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState at 0x2579c3bea60>,
 'fullname': 'Sandy Cheeks',
 'id': 1,
 'name': 'sandy'}

For _deleted_ objects such as `patrick`, which we earlier noted was __no longer in the session__, the _object's identity_ is also __restored__.

In [35]:
patrick in session

True

And of course the _database data_ is __present__ again as well.

In [36]:
session.execute(select(User).where(User.name == "patrick")).scalar_one() is patrick

2022-10-08 12:25:03,670 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account 
WHERE user_account.name = ?
2022-10-08 12:25:03,672 INFO sqlalchemy.engine.Engine [cached since 167.8s ago] ('patrick',)


True
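
The rollback behavior walked through above can be condensed into one self-contained sketch. It assumes a fresh in-memory SQLite database and a cut-down `User` mapping, and the variable names are specific to this example. It uses the `InstanceState.unloaded` accessor, obtained via `sqlalchemy.inspect()`, to list attributes that currently have no loaded value.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, registry

mapped_registry = registry()
Base = mapped_registry.generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
mapped_registry.metadata.create_all(engine)

with Session(engine) as session:
    sandy = User(name="sandy", fullname="Sandy Cheeks")
    patrick = User(name="patrick", fullname="Patrick Star")
    session.add_all([sandy, patrick])
    session.commit()

    # Make changes inside a new transaction and flush them to the database.
    sandy.fullname = "Sandy Squirrel"
    session.delete(patrick)
    session.flush()
    patrick_present_before = patrick in session

    session.rollback()

    # After the rollback the object is expired: nothing is loaded yet.
    unloaded_after_rollback = set(inspect(sandy).unloaded)

    # Attribute access lazily reloads the row as it exists in the database.
    restored_fullname = sandy.fullname
    patrick_present_after = patrick in session

print(f"{patrick_present_before = }")
print(f"{'fullname' in unloaded_after_rollback = }")
print(f"{restored_fullname = }")
print(f"{patrick_present_after = }")
```

Both the flushed `UPDATE` and the flushed `DELETE` are undone: `sandy.fullname` reloads as `'Sandy Cheeks'`, and `patrick` is once again contained in the `Session`.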