## Data Manipulation with the ORM

The previous section `Working with Data` remained focused on the _SQL Expression Language_ from a _Core perspective_, in order to __provide continuity__ across the _major SQL statement constructs_. This section will then build out the lifecycle of the `Session` and __how it interacts with these constructs__.

**Prerequisite Sections** - the ORM-focused part of this document builds upon three previous _ORM-centric sections_:

* __Executing with an ORM Session__ - introduces how to make an ORM `Session` object.

* __Defining Table Metadata with the ORM__ - where we set up our _ORM mappings_ of the `User` and `Address` entities.

* __Selecting ORM Entities and Columns__ - a few examples on _how to run SELECT statements_ for entities like `User`.

#### Inserting Rows with the ORM

When using the ORM, the `Session` object is responsible for __constructing Insert constructs__ and __emitting__ them for us in a _transaction_. The way we instruct the `Session` to do so is by __adding object entries__ to it; the `Session` then makes sure these _new entries_ will be _emitted to the database_ when they are needed, using a process known as a __`flush`__.

In [1]:
from sqlalchemy import Column, Integer, String, ForeignKey, create_engine
from sqlalchemy.orm import Session, registry, relationship

In [2]:
engine = create_engine("sqlite+pysqlite:///:memory:", echo=True, future=True)

In [3]:
mapped_registry = registry()
Base = mapped_registry.generate_base()

In [4]:
class User(Base):
    __tablename__ = "user_account"
    
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)
    
    addresses = relationship("Address", back_populates="user")
    
    def __repr__(self):
        return f"User(id={self.id!r}, name={self.name!r}, fullname={self.fullname!r})"

In [5]:
class Address(Base):
    __tablename__ = "address"
    
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    
    user = relationship("User", back_populates="addresses")
    
    def __repr__(self):
        return f"Address(id={self.id!r}, email_address={self.email_address!r})"

In [6]:
mapped_registry.metadata.create_all(engine)

2022-10-08 00:30:28,652 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-08 00:30:28,654 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("user_account")
2022-10-08 00:30:28,655 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 00:30:28,657 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("user_account")
2022-10-08 00:30:28,658 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 00:30:28,660 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("address")
2022-10-08 00:30:28,661 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 00:30:28,663 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("address")
2022-10-08 00:30:28,664 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-08 00:30:28,666 INFO sqlalchemy.engine.Engine 
CREATE TABLE user_account (
	id INTEGER NOT NULL, 
	name VARCHAR(30), 
	fullname VARCHAR, 
	PRIMARY KEY (id)
)


2022-10-08 00:30:28,667 INFO sqlalchemy.engine.Engine [no key 0.00126s] ()
2022-10-08 00:30:28,669 INFO sqlalchemy.engine.Engine 
C

##### Instances of Classes represent Rows

Whereas in the previous example we __emitted an `INSERT`__ using Python dictionaries to indicate the _data we wanted to add_, with the ORM we make _direct use of the custom Python classes_ we defined back in `Defining Table Metadata with the ORM`. At the _class level_, the `User` and `Address` classes served as a place to define what the __corresponding database tables__ should look like. These classes also serve as __extensible data objects__ that we use to _create and manipulate rows within a transaction_. Below we create two `User` objects, each representing a _potential database row_ to be inserted.

In [7]:
squidward = User(name="squidward", fullname="Squidward Tentacles")
krabs = User(name="ehkrabs", fullname="Eugene H. Krabs")

We are able to construct these objects using the _names of the mapped columns as keyword arguments_ in the constructor. This is possible as the `User` class includes an __automatically generated__ `__init__()` constructor that was _provided by the ORM mapping_ so that we could create each object using _column names as keys in the constructor_.

In a similar manner as in our _Core examples_ of `Insert`, we __did not include a primary key__ (i.e. an entry for the `id` column), since we would like to make use of the _auto-incrementing primary key feature of the database_, `SQLite` in this case, which the ORM also integrates with. The value of the `id` attribute on the above objects, if we were to view it, displays itself as `None`.

In [8]:
print(f"{squidward=}")

squidward=User(id=None, name='squidward', fullname='Squidward Tentacles')


The `None` value is provided by `SQLAlchemy` to indicate that the __attribute has no value__ as of yet. _SQLAlchemy-mapped attributes_ always __return a value in Python__ and _don't raise `AttributeError`_ when accessed on a new object that has not had a value assigned.

At the moment, our two objects above are said to be in a _state_ called `transient` - they are __`not associated` with any database state__ and are __yet to be associated with a `Session` object__ that can _generate_ `INSERT` statements for them.
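
The transient state described above can be checked at runtime. The following is a minimal sketch, assuming the same `registry`-based declarative setup used in this section; the `spongebob` object is a hypothetical example and is not part of the tutorial data. It uses `sqlalchemy.inspect()` to obtain the object's state and read its `transient`, `pending` and `persistent` flags.

```python
from sqlalchemy import Column, Integer, String, inspect
from sqlalchemy.orm import registry

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


# Hypothetical object used only for this illustration.
spongebob = User(name="spongebob", fullname="Spongebob Squarepants")

# inspect() on a mapped instance returns its state-tracking object.
state = inspect(spongebob)

print(spongebob.id)      # unset mapped attribute reads as None
print(state.transient)   # True: no Session, no database identity
print(state.pending, state.persistent)  # False False
```

No engine or `Session` is needed here, since a transient object has not yet been connected to either.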

##### Adding objects to a Session

To illustrate the _addition_ process step by step, we will create a `Session` without using a _context manager_ (and hence we must make sure we close it later!).

In [9]:
session = Session(engine)

The __objects are then added to the `Session`__ using the `Session.add()` method. When this is called, the objects are in a _state_ known as __`pending`__ and _have not been inserted_ yet.

In [10]:
session.add(squidward)
session.add(krabs)

When we have _pending objects_, we can see this _state_ by looking at a _collection_ on the `Session` called `Session.new`.

In [11]:
session.new

IdentitySet([User(id=None, name='squidward', fullname='Squidward Tentacles'), User(id=None, name='ehkrabs', fullname='Eugene H. Krabs')])

The above view is using a __collection__ called `IdentitySet` that is _essentially_ a `Python set` that __hashes on object identity__ in all cases (i.e., using Python built-in `id()` function, rather than the Python `hash()` function).
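
To make the difference between identity hashing and ordinary hashing concrete, here is a small sketch that compares a plain Python `set` with `IdentitySet`, imported from `sqlalchemy.util`. The `Point` class is a hypothetical helper introduced only for this illustration; it defines `__eq__()` and `__hash__()` so that two distinct objects compare as equal.

```python
from sqlalchemy.util import IdentitySet


class Point:
    """Hypothetical value-like class: equal coordinates compare as equal."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


a = Point(1, 2)
b = Point(1, 2)  # equal to a, but a different object

plain = {a, b}                     # collapses equal objects into one entry
by_identity = IdentitySet([a, b])  # keeps both, since id(a) != id(b)

print(len(plain), len(by_identity))
```

This is why two pending objects with identical column values would still show up as two separate entries in `Session.new`.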

##### Flushing

The `Session` makes use of a _pattern_ known as __`unit of work`__. This generally means it __accumulates changes one at a time__, but _does not actually communicate them_ to the database __until needed__. This allows it to __make better decisions__ about _how SQL DML should be emitted_ in the `transaction` based on a given __set of `pending` changes__. When it does _emit SQL to the database to push out the current set of changes_, the process is known as a `flush`. We can illustrate the _flush process_ __manually__ by calling the `Session.flush()` method.

In [12]:
session.flush()

2022-10-08 00:30:29,233 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-08 00:30:29,236 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-08 00:30:29,237 INFO sqlalchemy.engine.Engine [generated in 0.00133s] ('squidward', 'Squidward Tentacles')
2022-10-08 00:30:29,238 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-08 00:30:29,240 INFO sqlalchemy.engine.Engine [cached since 0.004461s ago] ('ehkrabs', 'Eugene H. Krabs')


Above we observe the `Session` was _first called_ upon to __emit SQL__, so it _created a new transaction_ and __emitted__ the _appropriate INSERT statements_ for the two objects. The `transaction` now __remains open__ _until_ we call any of the `Session.commit()`, `Session.rollback()`, or `Session.close()` methods of `Session`.

While `Session.flush()` may be used to _manually_ __push out pending changes__ to the _current transaction_, it is _usually unnecessary_ as the `Session` features a behavior known as __`autoflush`__, which we will illustrate later. It also _flushes out changes_ whenever `Session.commit()` is called.
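
As a hedged illustration of the point that a flush emits SQL but leaves the transaction open, the sketch below flushes a new object and then rolls back instead of committing. It is self-contained and uses its own in-memory database; the `patrick` row and the `count_stmt` name are assumptions made for this example only.

```python
from sqlalchemy import Column, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, registry

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

session = Session(engine)
session.add(User(name="patrick", fullname="Patrick Star"))
session.flush()  # INSERT is emitted, but nothing is committed yet

# The transaction started by the flush is still in progress.
open_after_flush = session.in_transaction()

count_stmt = select(func.count()).select_from(User)
visible_inside = session.execute(count_stmt).scalar_one()

# Rolling back discards the flushed row.
session.rollback()
count_after_rollback = session.execute(count_stmt).scalar_one()
session.close()

print(open_after_flush, visible_inside, count_after_rollback)
```

The row is visible inside the transaction that inserted it, and it disappears once that transaction is rolled back.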

##### Autogenerated primary key attributes

Once the _rows_ are __inserted__, the two _Python objects_ we've created are in a _state_ known as __`persistent`__, where they are _associated_ with the `Session` object in which they were __added or loaded__, and _feature lots of other behaviors_ that will be covered later.

Another effect of the `INSERT` that occurred was that the __ORM has retrieved the `new primary key identifiers` for each new object__; _internally_ it normally uses the same `CursorResult.inserted_primary_key` accessor we introduced previously. The `squidward` and `krabs` objects now have these _new primary key identifiers_ __associated with them__ and we can view them by _accessing_ the `id` attribute.

In [13]:
print(f"{squidward.id = }")
print(f"{krabs.id = }")

squidward.id = 1
krabs.id = 2
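
The move from `pending` to `persistent`, together with the arrival of the primary key, can be observed in one place. This is a minimal sketch on a separate in-memory database; the `sandy` object is hypothetical and the state flags are read through `sqlalchemy.inspect()`.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, registry

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

sandy = User(name="sandy", fullname="Sandy Cheeks")
state = inspect(sandy)

with Session(engine) as session:
    session.add(sandy)
    was_pending = state.pending      # added, but no INSERT emitted yet
    id_before = sandy.id             # still None

    session.flush()
    is_persistent = state.persistent  # row exists in the open transaction
    id_after = sandy.id               # primary key fetched by the ORM

print(was_pending, id_before, is_persistent, id_after)
```

Because the block never commits, leaving the `with` statement closes the `Session` and the inserted row is discarded.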


> ##### Tip
> 
> __Why did the ORM emit `two separate INSERT statements` when it could have used `executemany`?__
> As we'll see in the next section, the `Session` when __flushing objects__ always _needs to know the primary key of newly inserted objects_. If a feature such as _SQLite's autoincrement_ is used (other examples include __PostgreSQL__ `IDENTITY` or `SERIAL`, `using sequences`, etc.), the `CursorResult.inserted_primary_key` feature usually requires that __each `INSERT`__ is __emitted one row at a time__. If we had _provided values for the primary keys ahead of time_, the ORM would have been __able to optimize the operation better__. Some database drivers such as `psycopg2` __can__ also `INSERT` _many rows at once_ while still being __able to retrieve the primary key values__.
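
One way to explore the tip above is to supply primary keys ourselves and record what the engine sends to the driver. The sketch below is an illustration under assumptions: the `pearl` and `plankton` rows and the `record_inserts` listener are hypothetical, and how the statements are batched depends on the installed SQLAlchemy version and driver, so the recorded statements are printed rather than asserted.

```python
from sqlalchemy import Column, Integer, String, create_engine, event, select
from sqlalchemy.orm import Session, registry

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_inserts(conn, cursor, statement, parameters, context, executemany):
    # Keep only INSERT statements, along with the executemany flag.
    if statement.lstrip().upper().startswith("INSERT"):
        statements.append((statement, executemany))


with Session(engine) as session:
    # Primary keys are given up front, so none need to be fetched back.
    session.add_all(
        [
            User(id=10, name="pearl", fullname="Pearl Krabs"),
            User(id=11, name="plankton", fullname="Sheldon J. Plankton"),
        ]
    )
    session.flush()
    stored_ids = session.execute(select(User.id).order_by(User.id)).scalars().all()

for statement, executemany in statements:
    print(executemany, statement)
```

Running the same listener against the autoincrement example from this section gives a direct comparison of how the two cases are emitted.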

##### Getting Objects by Primary Key from the Identity Map

The `primary key identity` of the objects is __significant__ to the `Session`, as the objects are now _linked to this identity in memory_ using a feature known as the `identity map`. The `identity map` is an __in-memory store__ that __links__ all objects _currently loaded_ __`in memory`__ to their `primary key identity`. We can observe this by _retrieving_ one of the above objects using the `Session.get()` method, which will _return an entry from the identity map_ __if locally present__, _otherwise_ __emitting a `SELECT`__.

In [14]:
some_squidward = session.get(User, 1)
print(f"{some_squidward = }")

some_squidward = User(id=1, name='squidward', fullname='Squidward Tentacles')


The _important_ thing to note about the `identity map` is that it __maintains a unique instance__ of a _particular Python object per database identity_, __within the scope__ of a particular `Session` object. We may observe that `some_squidward` refers to the __same object__ as `squidward` did previously.

In [15]:
print(f"{some_squidward is squidward = }")

some_squidward is squidward = True


The `identity map` is a _critical feature_ that __allows complex sets of objects__ to be __manipulated__ _within a transaction without things getting out of sync_.
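
The phrase "within the scope of a particular `Session`" is worth illustrating, since the uniqueness guarantee does not extend across sessions. The following is a minimal sketch on its own in-memory database, with a hypothetical `sandy` row: the same `Session` returns the same object for a given primary key, while a second `Session` builds its own object for the same row.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, registry

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

sandy = User(name="sandy", fullname="Sandy Cheeks")

with Session(engine) as session:
    session.add(sandy)
    session.commit()
    pk = sandy.id

    # Same Session, same primary key: the identity map hands back sandy.
    same_within_session = session.get(User, pk) is sandy

with Session(engine) as other_session:
    # A different Session has its own identity map and its own object.
    from_other = other_session.get(User, pk)
    same_across_sessions = from_other is sandy
    same_row = from_other.name == "sandy"

print(same_within_session, same_across_sessions, same_row)
```

Both objects represent the same database row, but only objects obtained from the same `Session` are guaranteed to be the same Python object.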

##### Committing

There's much more to say about how the `Session` works which will be discussed further. For now we will __commit the transaction__ so that we can build up knowledge on how to `SELECT` rows before examining more ORM behaviors and features.

In [16]:
session.commit()

2022-10-08 00:32:06,267 INFO sqlalchemy.engine.Engine COMMIT


The above operation will __commit the transaction__ that was _in progress_. The `objects` which we've dealt with are __still attached__ to the `Session`, which is a _state_ they stay in __until the `Session` is closed__ (which is introduced at `Closing a Session`).

> ##### Tip
> 
> An _important_ thing to note is that `attributes on the objects` that we just worked with have been __`expired`__, _meaning_ that when we next _access any attributes_ on them, the `Session` will __start a new transaction and re-load their state__. This _behavior_ is __sometimes problematic__, either for _performance reasons_ or if one wishes to _use the objects after closing_ the __`Session`__ (which is known as the `detached state`), as they _will not have any state_ and _will have no_ `Session` with which _to load that state_, leading to __`"detached instance"` errors__. The behavior is __controllable__ using a parameter called `Session.expire_on_commit`. More on this is at `Closing a Session`.
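
The effect of `Session.expire_on_commit` can be sketched by comparing the default behavior with `expire_on_commit=False`. This example is an illustration under assumptions: the `gary` and `larry` objects are hypothetical, and `DetachedInstanceError` is imported from `sqlalchemy.orm.exc` to catch the "detached instance" error mentioned above.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, registry
from sqlalchemy.orm.exc import DetachedInstanceError

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"

    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

# Default behavior: attributes are expired when the transaction commits.
session = Session(engine)
gary = User(name="gary", fullname="Gary the Snail")
session.add(gary)
session.commit()
expired_after_commit = "name" in inspect(gary).unloaded
session.close()  # gary is now detached, with no loaded state

try:
    gary.name
    raised = False
except DetachedInstanceError:
    raised = True

# With expire_on_commit=False the loaded state survives the commit.
session = Session(engine, expire_on_commit=False)
larry = User(name="larry", fullname="Larry the Lobster")
session.add(larry)
session.commit()
session.close()
name_after_close = larry.name  # no reload needed, so no error

print(expired_after_commit, raised, name_after_close)
```

Turning expiration off trades freshness for convenience: the detached object stays usable, but its attributes reflect what was known at commit time rather than the current database contents.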