## Working with Related Objects

In this section, we will cover one more _essential ORM concept_, which is __how the ORM interacts with mapped classes that refer to other objects__. In the section `Declaring Mapped Classes`, the _mapped class_ examples made use of a construct called `relationship()`. This construct __defines a linkage__ between _two different mapped classes_, or from a _mapped class to itself_, the _latter_ of which is called a __`self-referential relationship`__. To describe the _basic idea_ of `relationship()`, first we'll review the mapping in short form, omitting the _Column mappings_ and other directives.

In [1]:
from sqlalchemy import (
    Column, Integer, String, ForeignKey, create_engine,
    select, insert, bindparam,
)
from sqlalchemy.orm import (
    Session, registry, aliased, with_parent, relationship,
    selectinload, joinedload, contains_eager,
)

In [2]:
engine = create_engine("sqlite+pysqlite:///:memory:", echo=True, future=True)
mapped_registry = registry()
Base = mapped_registry.generate_base()
session = Session(engine)

In [3]:
class User(Base):
    __tablename__ = "user_account"
    
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)
    
    addresses = relationship("Address", back_populates="user", lazy="selectin")
    
    def __repr__(self):
        return f"User(id={self.id!r}, name={self.name!r}, fullname={self.fullname!r})"

In [4]:
class Address(Base):
    __tablename__ = "address"
    
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    
    user = relationship("User", back_populates="addresses")
    
    def __repr__(self):
        return f"Address(id={self.id!r}, email_address={self.email_address!r})"

In [5]:
mapped_registry.metadata.create_all(engine)

2022-10-12 11:05:33,013 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:33,014 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("user_account")
2022-10-12 11:05:33,016 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:33,017 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("user_account")
2022-10-12 11:05:33,018 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:33,021 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("address")
2022-10-12 11:05:33,024 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:33,025 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("address")
2022-10-12 11:05:33,026 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:33,028 INFO sqlalchemy.engine.Engine 
CREATE TABLE user_account (
	id INTEGER NOT NULL, 
	name VARCHAR(30), 
	fullname VARCHAR, 
	PRIMARY KEY (id)
)


2022-10-12 11:05:33,031 INFO sqlalchemy.engine.Engine [no key 0.00241s] ()
2022-10-12 11:05:33,033 INFO sqlalchemy.engine.Engine 
C

In [6]:
spongebob = User(name="spongebob", fullname="Spongebob Squarepants")
sandy = User(name="sandy", fullname="Sandy Cheeks")
patrick = User(name="patrick", fullname="Patrick Star")
squidward = User(name="squidward", fullname="Squidward Tentacles")
krabs = User(name="ehkrabs", fullname="Eugene H. Krabs")

In [7]:
session.add(spongebob)
session.add(sandy)
session.add(patrick)
session.add(squidward)
session.add(krabs)

In [8]:
session.flush()
session.commit()

2022-10-12 11:05:33,384 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:33,389 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-12 11:05:33,390 INFO sqlalchemy.engine.Engine [generated in 0.00145s] ('spongebob', 'Spongebob Squarepants')
2022-10-12 11:05:33,393 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-12 11:05:33,395 INFO sqlalchemy.engine.Engine [cached since 0.005768s ago] ('sandy', 'Sandy Cheeks')
2022-10-12 11:05:33,396 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-12 11:05:33,399 INFO sqlalchemy.engine.Engine [cached since 0.009727s ago] ('patrick', 'Patrick Star')
2022-10-12 11:05:33,400 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-12 11:05:33,400 INFO sqlalchemy.engine.Engine [cached since 0.01144s ago] ('squidward', 'Squidward Tentacles')
2022-10-12 11:05:33,402 INFO sqla

In [9]:
addresses = [
    {"username": "spongebob", "email_address": "spongebob@sqlalchemy.org"},
    {"username": "sandy", "email_address": "sandy@sqlalchemy.org"},
    {"username": "sandy", "email_address": "sandy@squirrelpower.org"},
]
scalar_subq = (
    select(User.id).
    where(User.name == bindparam("username")).
    scalar_subquery()
)

with engine.connect() as conn:
    result = conn.execute(
        insert(Address).values(user_id=scalar_subq),
        addresses,  # the list of parameter dictionaries defined above
    )
    conn.commit()

2022-10-12 11:05:33,521 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:33,523 INFO sqlalchemy.engine.Engine INSERT INTO address (email_address, user_id) VALUES (?, (SELECT user_account.id 
FROM user_account 
WHERE user_account.name = ?))
2022-10-12 11:05:33,524 INFO sqlalchemy.engine.Engine [generated in 0.00365s] (('spongebob@sqlalchemy.org', 'spongebob'), ('sandy@sqlalchemy.org', 'sandy'), ('sandy@squirrelpower.org', 'sandy'))
2022-10-12 11:05:33,526 INFO sqlalchemy.engine.Engine COMMIT


Above, the `User` class now has an _attribute_ `User.addresses` and the `Address` class has an _attribute_ `Address.user`. The `relationship()` construct will be used to __inspect the table relationships__ between the `Table` objects that are __mapped__ to the `User` and `Address` classes. As the `Table` object representing the _address table_ has a `ForeignKeyConstraint` which refers to the *user_account table*, the `relationship()` can _determine_ __unambiguously__ that there is a __one to many relationship__ _from_ the `User` class _to_ the `Address` class, along the `User.addresses` relationship; _one particular row_ in the `user_account` table __may be referenced__ by _many rows_ in the `address` table.

_All one-to-many relationships_ __naturally correspond__ to a _many to one relationship_ in the _other direction_, in this case the one noted by `Address.user`. The `relationship.back_populates` parameter, seen above configured on both `relationship()` objects __referring__ to the _other name_, establishes that each of these two `relationship()` constructs should be __considered to be complementary__ to _each other_; we will see how this plays out in the next section.
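
The direction that `relationship()` derives from the foreign key can be checked through runtime inspection. The following is a minimal sketch, assuming SQLAlchemy 1.4 or later; the trimmed-down mapping (without `fullname` or `__repr__`) and the variable names `user_rel` and `address_rel` are illustrative only.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, inspect
from sqlalchemy.orm import configure_mappers, registry, relationship

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


# Resolve the string class names and work out each relationship's direction.
configure_mappers()

# inspect() on a mapped class returns its Mapper; Mapper.relationships
# is keyed by attribute name.
user_rel = inspect(User).relationships["addresses"]
address_rel = inspect(Address).relationships["user"]

print(user_rel.direction.name, user_rel.uselist)        # ONETOMANY True
print(address_rel.direction.name, address_rel.uselist)  # MANYTOONE False
```

The `uselist` flag is what makes `User.addresses` a collection and `Address.user` a scalar attribute.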

#### Persisting and Loading Relationships

We can start by _illustrating_ what `relationship()` does to instances of objects. If we make a new `User` object, we can note that there is a _Python list_ when we access the `.addresses` element.

In [10]:
u1 = User(name="pkrabs", fullname="Pearl Krabs")
print(f"{u1.addresses = }")

u1.addresses = []


This object is a _SQLAlchemy-specific version_ of a _Python list_ which has the __ability to track and respond to changes__ made to it. The _collection_ also appeared __automatically__ when we _accessed the attribute_, even though we __never assigned__ it to the object. This is similar to the behavior noted at `Inserting Rows with the ORM` where it was observed that _column-based attributes_ to which we _don't explicitly assign a value_ also display as `None` __automatically__, rather than raising an `AttributeError` as would be Python's usual behavior.

As the `u1` object is still __transient__ and the _list_ that we got from `u1.addresses` has __not been mutated__ (i.e. _appended or extended_), it's __not actually `associated` with the object__ yet, but as we _make changes_ to it, it will become _part of the state_ of the `User` object.

The _collection_ is specific to the `Address` class which is the _only type of Python object_ that may be __persisted within it__. Using the `list.append()` method we may __add an Address object__.

In [11]:
a1 = Address(email_address="pearl.krabs@gmail.com")
u1.addresses.append(a1)

At this point, the `u1.addresses` _collection_ as expected _contains_ the __new `Address` object__.

In [12]:
print(f"{u1.addresses = }")

u1.addresses = [Address(id=None, email_address='pearl.krabs@gmail.com')]


As we _associated_ the `Address` object with the `User.addresses` _collection_ of the `u1` instance, another behavior also occurred, which is that the `User.addresses` _relationship_ __synchronized itself__ with the `Address.user` _relationship_, such that we __can navigate__ not only from the _User object to the Address object_, we __can also navigate__ from the _Address object back to the "parent" User object_.

In [13]:
print(f"{a1.user = }")

a1.user = User(id=None, name='pkrabs', fullname='Pearl Krabs')


This _synchronization_ occurred as a result of our _use of the_ `relationship.back_populates` parameter between the two `relationship()` objects. This parameter __names__ _another_ `relationship()` for which _complementary attribute_ __assignment/list mutation__ should occur. It will __work equally well__ in the _other direction_, which is that if we _create another_ `Address` _object_ and assign to its `Address.user` attribute, that `Address` becomes part of the `User.addresses` _collection_ on that `User` object.

In [14]:
a2 = Address(email_address="pearl@aol.com", user=u1)
print(f"{u1.addresses = }")

u1.addresses = [Address(id=None, email_address='pearl.krabs@gmail.com'), Address(id=None, email_address='pearl@aol.com')]


We actually made use of the `user` parameter as a _keyword argument_ in the `Address` constructor, which is __accepted just like any other mapped attribute__ that was _declared_ on the `Address` class. It is __equivalent__ to _assignment_ of the `Address.user` attribute after the fact.

In [15]:
# equivalent effect as a2 = Address(user=u1)
a2.user = u1
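
The same synchronization applies when items are removed or moved, not only when they are added. The sketch below works purely in memory, with no `Session` or database involved; it assumes SQLAlchemy 1.4 or later and uses a trimmed-down copy of the mapping, with `u2` as an extra illustrative object.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import registry, relationship

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


u1 = User(name="pkrabs")
u2 = User(name="sandy")
a1 = Address(email_address="pearl.krabs@gmail.com")

# Appending to the collection sets the scalar side.
u1.addresses.append(a1)
linked_after_append = a1.user is u1

# Removing from the collection clears the scalar side again.
u1.addresses.remove(a1)
user_after_remove = a1.user

# Re-pointing the scalar side moves the object between collections.
a1.user = u1
a1.user = u2
in_u1 = a1 in u1.addresses
in_u2 = a1 in u2.addresses

print(linked_after_append, user_after_remove, in_u1, in_u2)
```

Note that only the Python-side attributes change here; the `user_id` column attribute is not populated until a flush takes place.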

##### Cascading Objects into the Session

We now have a `User` and two `Address` objects that are _associated_ in a __bidirectional__ structure __in memory__, but as noted previously in `Inserting Rows with the ORM`, these objects are said to be in the __`transient state`__ until they are _associated_ with a `Session` object.

We make use of the `Session` that's __still ongoing__, and note that when we apply the `Session.add()` method to the lead `User` object, the _related_ `Address` objects __also get added__ to that _same_ `Session`.

In [16]:
session.add(u1)
print(f"{u1 in session = }")
print(f"{a1 in session = }")
print(f"{a2 in session = }")

u1 in session = True
a1 in session = True
a2 in session = True


The above behavior, where the `Session` _received_ a `User` object, and followed along the `User.addresses` _relationship_ to _locate a related_ `Address` object, is known as the __`save-update cascade`__.

The _three objects_ are now in the __pending state__; this means they are __ready__ to be the _subject_ of an `INSERT` operation but this has __not yet proceeded__; all _three objects_ have __no `primary key` assigned__ yet, and in addition, the `a1` and `a2` objects have an _attribute_ called `user_id` which refers to the `Column` that has a `ForeignKeyConstraint` referring to the `user_account.id` column; these are also `None` as the _objects_ are __not yet associated__ with a real database row.

In [17]:
print(f"{u1.id = }")
print(f"{a1.user_id = }")

u1.id = None
a1.user_id = None


It's at this _stage_ that we can see the very _great utility_ that the `unit of work process` provides; recall that in the section `INSERT usually generates the "values" clause automatically`, rows were inserted into the `user_account` and `address` tables using _some elaborate syntaxes_ in order to __automatically associate__ the `address.user_id` columns with those of the `user_account` rows. Additionally, it was __necessary__ that we __emit__ `INSERT` for `user_account` rows __first__, _before those of_ `address`, since rows in `address` are __dependent__ on their _parent row_ in `user_account` for a value in their `user_id` column.

When using the `Session`, all this _tedium_ is handled for us and even the most _die-hard_ SQL purist can __benefit from automation__ of `INSERT`, `UPDATE` and `DELETE` statements. When we `Session.commit()` the transaction _all steps invoke_ in the __correct order__, and furthermore the __newly generated `primary key`__ of the `user_account` row is applied to the `address.user_id` column _appropriately_.

In [18]:
session.commit()

2022-10-12 11:05:34,743 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:34,744 INFO sqlalchemy.engine.Engine INSERT INTO user_account (name, fullname) VALUES (?, ?)
2022-10-12 11:05:34,745 INFO sqlalchemy.engine.Engine [cached since 1.356s ago] ('pkrabs', 'Pearl Krabs')
2022-10-12 11:05:34,746 INFO sqlalchemy.engine.Engine INSERT INTO address (email_address, user_id) VALUES (?, ?)
2022-10-12 11:05:34,747 INFO sqlalchemy.engine.Engine [generated in 0.00073s] ('pearl.krabs@gmail.com', 6)
2022-10-12 11:05:34,748 INFO sqlalchemy.engine.Engine INSERT INTO address (email_address, user_id) VALUES (?, ?)
2022-10-12 11:05:34,749 INFO sqlalchemy.engine.Engine [cached since 0.002269s ago] ('pearl@aol.com', 6)
2022-10-12 11:05:34,751 INFO sqlalchemy.engine.Engine COMMIT
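
The object states mentioned in this section (transient, pending, and so on) can be observed with `inspect()`. This is a hedged sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; `state_of()` is a hypothetical helper written for this illustration, not a SQLAlchemy function.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, registry, relationship

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)


def state_of(obj):
    """Hypothetical helper: name the lifecycle state reported by inspect()."""
    insp = inspect(obj)
    for name in ("transient", "pending", "persistent", "detached"):
        if getattr(insp, name):
            return name


u1 = User(name="pkrabs", addresses=[Address(email_address="pearl@aol.com")])
a1 = u1.addresses[0]
states = [state_of(a1)]           # not yet in any Session

with Session(engine) as session:
    session.add(u1)               # a1 follows via the save-update cascade
    states.append(state_of(a1))
    session.flush()               # INSERT user_account first, then address
    states.append(state_of(a1))
    fk_matches = a1.user_id == u1.id
    session.commit()

states.append(state_of(a1))       # the Session has been closed
print(states, fk_matches)
```

The flush is the point at which the new primary key of the `user_account` row is copied into `a1.user_id`.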


##### Loading Relationships

In the last step, we called `Session.commit()` which emitted a `COMMIT` for the _transaction_, and then per the `Session.expire_on_commit` setting _expired all objects_ so that they __refresh__ for the _next transaction_.

When we __next__ _access an attribute_ on these objects, we'll see the `SELECT` __emitted__ for the _primary attributes_ of the row, such as when we view the __newly generated `primary key`__ for the `u1` object.

In [19]:
print(f"{u1.id = }")

2022-10-12 11:05:34,866 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:34,869 INFO sqlalchemy.engine.Engine SELECT user_account.id AS user_account_id, user_account.name AS user_account_name, user_account.fullname AS user_account_fullname 
FROM user_account 
WHERE user_account.id = ?
2022-10-12 11:05:34,871 INFO sqlalchemy.engine.Engine [generated in 0.00147s] (6,)
2022-10-12 11:05:34,875 INFO sqlalchemy.engine.Engine SELECT address.id AS address_id, address.email_address AS address_email_address, address.user_id AS address_user_id 
FROM address 
WHERE ? = address.user_id
2022-10-12 11:05:34,876 INFO sqlalchemy.engine.Engine [generated in 0.00131s] (6,)
u1.id = 6
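
To see the effect of expiration without reading the SQL log, one can ask the instance state which attributes are currently unloaded. A minimal sketch, assuming SQLAlchemy 1.4 or later; `name_is_unloaded_after_commit()` is a hypothetical helper and the one-table mapping is reduced for brevity.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, registry

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)


def name_is_unloaded_after_commit(expire_on_commit):
    """Hypothetical helper: commit a new User and report whether its
    'name' attribute must be re-fetched on next access."""
    with Session(engine, expire_on_commit=expire_on_commit) as session:
        user = User(name="sandy")
        session.add(user)
        session.commit()
        # InstanceState.unloaded lists attribute keys with no loaded value.
        return "name" in inspect(user).unloaded


expired_by_default = name_is_unloaded_after_commit(True)
kept_when_disabled = name_is_unloaded_after_commit(False)
print(expired_by_default, kept_when_disabled)
```

With expiration disabled, the objects keep their last known values after the commit, which avoids the refresh `SELECT` at the cost of possibly stale data.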


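The _u1_ `User` object now has a _persistent collection_ `User.addresses` that we may also access. As this _collection_ consists of an _additional set of rows_ from the `address` table, it requires its own `SELECT` in order to __retrieve the objects__. With the default loading behavior, that `SELECT` would be a __lazy load emitted__ when we first _access this collection_; in this notebook, `User.addresses` is configured with `lazy="selectin"`, so the `address` rows were already fetched alongside the refresh in the previous step, and accessing the collection below emits no further SQL.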

In [20]:
print(f"{u1.addresses = }")

u1.addresses = [Address(id=4, email_address='pearl.krabs@gmail.com'), Address(id=5, email_address='pearl@aol.com')]


_Collections_ and _related attributes_ in the SQLAlchemy ORM are __persistent in memory__; once the _collection_ or _attribute_ is `populated`, _SQL_ is __no longer emitted__ until that _collection_ or _attribute_ is `expired`. We __may access__ `u1.addresses` again as well as _add_ or _remove_ items and this __will not incur__ any new SQL calls.

In [21]:
print(f"{u1.addresses = }")

u1.addresses = [Address(id=4, email_address='pearl.krabs@gmail.com'), Address(id=5, email_address='pearl@aol.com')]


While the loading _emitted_ by __`lazy loading`__ can _quickly become expensive_ if we __don't take explicit steps__ to _optimize_ it, the _network of lazy loading_ at least is __fairly well optimized__ to __not perform redundant work__; as the `u1.addresses` collection was __refreshed__, per the _identity map_ these are in fact the __same `Address` instances__ as the `a1` and `a2` objects we've been dealing with already, so we're _done loading all attributes_ in this particular __object graph__.

In [22]:
print(f"{a1 = }")
print(f"{a2 = }")

a1 = Address(id=4, email_address='pearl.krabs@gmail.com')
a2 = Address(id=5, email_address='pearl@aol.com')
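
The identity map behavior described above can be verified with Python's `is` operator. The following sketch assumes SQLAlchemy 1.4 or later with an in-memory SQLite database and the default lazy loading; the flag names are illustrative.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, registry, relationship

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    u1 = User(name="pkrabs")
    a1 = Address(email_address="pearl.krabs@gmail.com", user=u1)
    session.add(u1)
    session.commit()  # expires u1 and a1

    # Each access path below returns the object already present in the
    # Session's identity map rather than building a second copy.
    via_collection = u1.addresses[0] is a1
    via_get = session.get(Address, a1.id) is a1
    via_select = session.execute(select(Address)).scalars().one() is a1

print(via_collection, via_get, via_select)
```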


The issue of how `relationships` load, or not, is an entire subject unto itself. Some additional introduction to these concepts is later in this section at `Loader Strategies`.

#### Using Relationships in Queries

The previous section introduced the _behavior_ of the `relationship()` construct when working with _instances of a mapped class_, namely the `u1`, `a1` and `a2` instances of the `User` and `Address` classes above. In this section, we introduce the _behavior_ of `relationship()` as it applies to _class level behavior of a mapped class_, where it _serves_ in several ways to help __automate the construction of SQL queries__.

##### Using Relationships to Join

The sections `Explicit FROM clauses and JOINs` and `Setting the ON Clause` introduced the usage of the `Select.join()` and `Select.join_from()` methods to __compose__ _SQL JOIN clauses_. In order to describe __how to join between tables__, these methods either __infer__ the `ON clause` based on the presence of a _single unambiguous_ `ForeignKeyConstraint` object within the _table metadata structure_ that links the two tables, or otherwise we may _provide_ an __explicit__ `SQL Expression construct` that __indicates a specific ON clause__.

When _using ORM entities_, an _additional mechanism_ is available to help us set up the `ON clause` of a _join_, which is to make use of the `relationship()` objects that we set up in our user _mapping_, as was demonstrated at `Declaring Mapped Classes`. The _class-bound attribute_ corresponding to the `relationship()` may be passed as the __single argument__ to `Select.join()`, where it serves to __indicate both the `right side of the join` as well as the `ON clause` `at once`__.

In [23]:
print(select(Address.email_address).select_from(User).join(User.addresses))

SELECT address.email_address 
FROM user_account JOIN address ON user_account.id = address.user_id


The presence of an ORM `relationship()` on a _mapping_ is __not used__ by `Select.join()` or `Select.join_from()` _if we don't specify it_; it is __not used__ for `ON clause` _inference_. This means, if we _join from User to Address without an ON clause_, it __works__ because of the `ForeignKeyConstraint` _between the two mapped_ `Table` _objects_, __not because__ of the `relationship()` objects on the `User` and `Address` classes.

In [24]:
print(select(Address.email_address).join_from(User, Address))

SELECT address.email_address 
FROM user_account JOIN address ON user_account.id = address.user_id


##### Joining between Aliased targets

In the section `ORM Entity Aliases` we introduced the `aliased()` construct, which is used to apply a _SQL alias_ to an _ORM entity_. When using a `relationship()` to help _construct SQL JOIN_, the use case where the _target of the join_ is to be an `aliased()` is handled by the `PropComparator.of_type()` modifier. To demonstrate, we will construct the same join illustrated at `ORM Entity Aliases`, using the `relationship()` attributes to join instead.

In [25]:
address_alias_1 = aliased(Address)
address_alias_2 = aliased(Address)

In [26]:
print(
    select(User).
    join(User.addresses.of_type(address_alias_1)).
    where(address_alias_1.email_address == "patrick@aol.com").
    join(User.addresses.of_type(address_alias_2)).
    where(address_alias_2.email_address == "patrick@gmail.com")
)

SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account JOIN address AS address_1 ON user_account.id = address_1.user_id JOIN address AS address_2 ON user_account.id = address_2.user_id 
WHERE address_1.email_address = :email_address_1 AND address_2.email_address = :email_address_2


To make use of a `relationship()` to _construct a join_ from an __aliased entity__, the _attribute_ is __available__ from the `aliased()` construct directly.

In [27]:
user_alias_1 = aliased(User)
print(select(user_alias_1.name).join(user_alias_1.addresses))

SELECT user_account_1.name 
FROM user_account AS user_account_1 JOIN address ON user_account_1.id = address.user_id


##### Augmenting the ON Criteria

The `ON clause` generated by the `relationship()` construct may also be __augmented with additional criteria__. This is __useful__ both for _quick ways to limit the scope of a particular join over a relationship path_, and also for use cases like _configuring loader strategies_, introduced below at `Loader Strategies`. The `PropComparator.and_()` method accepts a _series of SQL expressions_ __positionally__ that will be joined to the `ON clause` of the _JOIN_ __via `AND`__. For example, we can _JOIN from User to Address_ while also _limiting the ON criteria_ to only _certain email addresses_.

In [28]:
stmt = select(User.fullname).join(
    User.addresses.and_(Address.email_address == "pearl.krabs@gmail.com")
)
session.execute(stmt).all()

2022-10-12 11:05:36,053 INFO sqlalchemy.engine.Engine SELECT user_account.fullname 
FROM user_account JOIN address ON user_account.id = address.user_id AND address.email_address = ?
2022-10-12 11:05:36,055 INFO sqlalchemy.engine.Engine [generated in 0.00170s] ('pearl.krabs@gmail.com',)


[('Pearl Krabs',)]
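
With an inner join, criteria in the `ON clause` and criteria in `WHERE` select the same rows; the difference shows up with an outer join. The sketch below illustrates this under the assumption of SQLAlchemy 1.4 or later and an in-memory SQLite database; the sample rows and the names `on_stmt` and `where_stmt` are made up for the illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, registry, relationship

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

wanted = Address.email_address == "sandy@sqlalchemy.org"

# Criteria inside the ON clause: unmatched users survive the outer join.
on_stmt = (
    select(User.name, Address.email_address)
    .join(User.addresses.and_(wanted), isouter=True)
    .order_by(User.id)
)
# Same criteria in WHERE: rows with a NULL address are filtered out.
where_stmt = (
    select(User.name, Address.email_address)
    .join(User.addresses, isouter=True)
    .where(wanted)
    .order_by(User.id)
)

with Session(engine) as session:
    session.add_all([
        User(name="sandy", addresses=[Address(email_address="sandy@sqlalchemy.org")]),
        User(name="patrick", addresses=[Address(email_address="patrick@aol.com")]),
    ])
    session.commit()
    on_rows = [tuple(row) for row in session.execute(on_stmt)]
    where_rows = [tuple(row) for row in session.execute(where_stmt)]

print(on_rows)
print(where_rows)
```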

##### EXISTS forms: has()/any()

In the section `EXISTS subqueries`, we introduced the `Exists` object that provides for the _SQL EXISTS keyword_ in _conjunction_ with a `scalar subquery`. The `relationship()` construct provides for some _helper methods_ that may be used to __generate some common EXISTS styles of queries in terms of the relationship__.

For a _one-to-many relationship_ such as `User.addresses`, an `EXISTS` against the `address` table that __correlates back__ to the `user_account` table can be produced using `PropComparator.any()`. This method _accepts_ an __optional `WHERE` criteria__ to __limit the rows matched by the subquery__.

In [29]:
stmt = select(User.fullname).where(
    User.addresses.any(Address.email_address == "pearl.krabs@gmail.com")
)
session.execute(stmt).all()

2022-10-12 11:05:36,193 INFO sqlalchemy.engine.Engine SELECT user_account.fullname 
FROM user_account 
WHERE EXISTS (SELECT 1 
FROM address 
WHERE user_account.id = address.user_id AND address.email_address = ?)
2022-10-12 11:05:36,194 INFO sqlalchemy.engine.Engine [generated in 0.00107s] ('pearl.krabs@gmail.com',)


[('Pearl Krabs',)]

As `EXISTS` tends to be _more efficient_ __for `negative lookups`__, a _common query_ is to _locate entities_ where there are __no related entities present__. This is __succinct__ using a phrase such as `~User.addresses.any()`, to select for `User` entities that have __no related `Address` rows__.

In [30]:
stmt = select(User.fullname).where(~User.addresses.any())
session.execute(stmt).all()

2022-10-12 11:05:36,373 INFO sqlalchemy.engine.Engine SELECT user_account.fullname 
FROM user_account 
WHERE NOT (EXISTS (SELECT 1 
FROM address 
WHERE user_account.id = address.user_id))
2022-10-12 11:05:36,374 INFO sqlalchemy.engine.Engine [generated in 0.00130s] ()


[('Patrick Star',), ('Squidward Tentacles',), ('Eugene H. Krabs',)]
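
For comparison, the same "no related rows" question can be phrased as an outer join that keeps only the unmatched rows. This is a sketch for illustration, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database with made-up sample data; it is not meant to suggest that one form is always preferable.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, registry, relationship

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

# NOT EXISTS form generated from the relationship.
exists_stmt = select(User.name).where(~User.addresses.any()).order_by(User.id)

# LEFT OUTER JOIN form: unmatched users have a NULL address.id.
outer_stmt = (
    select(User.name)
    .outerjoin(User.addresses)
    .where(Address.id.is_(None))
    .order_by(User.id)
)

with Session(engine) as session:
    session.add_all([
        User(name="sandy", addresses=[Address(email_address="sandy@sqlalchemy.org")]),
        User(name="patrick"),
        User(name="squidward"),
    ])
    session.commit()
    via_exists = session.execute(exists_stmt).scalars().all()
    via_outer_join = session.execute(outer_stmt).scalars().all()

print(via_exists, via_outer_join)
```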

The `PropComparator.has()` method works in mostly the same way as `PropComparator.any()`, _except_ that it's __used for `many-to-one` relationships__, such as if we wanted to __locate all `Address` objects which belonged to `"pearl"`__.

In [31]:
stmt = select(Address.email_address).where(Address.user.has(User.name == "pkrabs"))
session.execute(stmt).all()

2022-10-12 11:05:36,494 INFO sqlalchemy.engine.Engine SELECT address.email_address 
FROM address 
WHERE EXISTS (SELECT 1 
FROM user_account 
WHERE user_account.id = address.user_id AND user_account.name = ?)
2022-10-12 11:05:36,496 INFO sqlalchemy.engine.Engine [generated in 0.00209s] ('pkrabs',)


[('pearl.krabs@gmail.com',), ('pearl@aol.com',)]
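
Because `any()` belongs to collections and `has()` to scalar references, using the wrong one for a given relationship is rejected when the expression is built. A small sketch, assuming SQLAlchemy 1.4 or later; the `errors` dictionary and its labels are illustrative only.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, select
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.orm import configure_mappers, registry, relationship

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


configure_mappers()

errors = {}
attempts = {
    "has() on a collection": lambda: User.addresses.has(),
    "any() on a scalar": lambda: Address.user.any(),
}
for label, build in attempts.items():
    try:
        build()
    except InvalidRequestError as err:
        errors[label] = str(err)

for label, message in errors.items():
    print(f"{label}: {message}")

# The matching forms build an EXISTS subquery as expected.
valid_sql = str(select(User.name).where(User.addresses.any()))
```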

##### Common Relationship Operators

There are some _additional varieties_ of __SQL generation helpers__ that come with `relationship()`, including:

* __many to one equals comparison__ - a specific _object instance_ can be compared to a __many-to-one relationship__, to select rows where the `foreign key` of the _target entity_ __matches__ the _primary key_ value of the _object_ given.

In [32]:
print(select(Address).where(Address.user == u1))

SELECT address.id, address.email_address, address.user_id 
FROM address 
WHERE :param_1 = address.user_id


* __many to one not equals comparison__ - the _not equals operator_ may also be used.

In [33]:
print(select(Address).where(Address.user != u1))

SELECT address.id, address.email_address, address.user_id 
FROM address 
WHERE address.user_id != :user_id_1 OR address.user_id IS NULL


* __object is contained in a one-to-many collection__ - this is _essentially_ the __one-to-many version__ of the `"equals" comparison`, select rows where the _primary key_ __equals__ the value of the _foreign key_ in a _related object_.

In [34]:
print(select(User).where(User.addresses.contains(a1)))

SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account 
WHERE user_account.id = :param_1


* __An object has a particular parent from a one-to-many perspective__ - the `with_parent()` function __produces a comparison__ that _returns rows_ which are __referenced__ by a _given parent_; this is _essentially_ the __same as__ using the `"==" operator` with the __`"many-to-one"`__ side.

In [35]:
print(select(Address).where(with_parent(u1, User.addresses)))

SELECT address.id, address.email_address, address.user_id 
FROM address 
WHERE :param_1 = address.user_id


#### Loader Strategies

In the section `Loading Relationships` we introduced the concept that when we work with _instances of mapped objects_, _accessing the attributes_ that are mapped using `relationship()` in the _default case_ will __emit a lazy load__ when the _collection_ is __not populated__ in order to `load the objects` that _should be present_ in this _collection_.

`Lazy loading` is one of the __most famous__ _ORM patterns_, and is also the one that is __most controversial__. When _several dozen ORM objects in memory_ each refer to a _handful of unloaded attributes_, `routine manipulation` of these objects can __spin off many additional queries__ that __`can add up`__ (otherwise known as the __`N plus one problem`__), and to make matters __`worse`__ they are __emitted implicitly__. These _implicit queries_ `may not be noticed`, may __cause errors__ when they are _attempted_ __after__ there's _no longer a database transaction available_, or when using __alternative concurrency patterns__ such as `asyncio`, they actually __won't work__ at all.
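
One of the failure modes named above, a lazy load attempted after the `Session` is gone, can be reproduced in a few lines. This sketch assumes SQLAlchemy 1.4 or later, an in-memory SQLite database, and the default lazy loading on `User.addresses`; the flag `lazy_load_failed` is illustrative.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, registry, relationship
from sqlalchemy.orm.exc import DetachedInstanceError

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(
        User(name="sandy", addresses=[Address(email_address="sandy@sqlalchemy.org")])
    )
    session.commit()

with Session(engine) as session:
    user = session.execute(select(User)).scalars().one()
# The Session is closed here; `user` is detached and its collection
# was never loaded.

try:
    user.addresses
    lazy_load_failed = False
except DetachedInstanceError:
    lazy_load_failed = True

# Column attributes that were already loaded remain readable.
print(user.name, lazy_load_failed)
```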

At the same time, _lazy loading_ is a __vastly `popular` and `useful` pattern__ when it is _compatible_ with the __concurrency approach__ in use and __isn't otherwise causing problems__. For these reasons, SQLAlchemy's ORM places a _lot of emphasis_ on being able to `control` and `optimize` this _loading behavior_.

Above all, the __first step__ in _using ORM lazy loading_ __`effectively`__ is to `test the application`, `turn on SQL echoing`, and `watch the SQL` that is __emitted__. If there seem to be _lots of redundant_ `SELECT statements` that _look very much like_ they could be __`rolled into one` much more efficiently__, or if there are `loads` __occurring inappropriately__ for objects that have been __detached__ from their `Session`, that's when to _look into using_ __`loader strategies`__.

_Loader strategies_ are represented as objects that __may be associated__ with a `SELECT statement` using the `Select.options()` method. They may also be __configured__ as _defaults_ for a `relationship()` using the `relationship.lazy` option.

In [36]:
for user_obj in session.execute(
    select(User).options(selectinload(User.addresses))
).scalars():
    print(f"{user_obj.addresses = }")

2022-10-12 11:05:37,197 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account
2022-10-12 11:05:37,198 INFO sqlalchemy.engine.Engine [generated in 0.00154s] ()
2022-10-12 11:05:37,203 INFO sqlalchemy.engine.Engine SELECT address.user_id AS address_user_id, address.id AS address_id, address.email_address AS address_email_address 
FROM address 
WHERE address.user_id IN (?, ?, ?, ?, ?, ?)
2022-10-12 11:05:37,206 INFO sqlalchemy.engine.Engine [generated in 0.00351s] (1, 2, 3, 4, 5, 6)
user_obj.addresses = [Address(id=1, email_address='spongebob@sqlalchemy.org')]
user_obj.addresses = [Address(id=2, email_address='sandy@sqlalchemy.org'), Address(id=3, email_address='sandy@squirrelpower.org')]
user_obj.addresses = []
user_obj.addresses = []
user_obj.addresses = []
user_obj.addresses = [Address(id=4, email_address='pearl.krabs@gmail.com'), Address(id=5, email_address='pearl@aol.com')]


Each _loader strategy_ object __adds some kind of information to the statement__ that will be __used later__ by the `Session` when it is _deciding how various attributes should be loaded and/or behave_ when they are accessed.

##### Selectin Load

The __most useful loader__ in modern SQLAlchemy is the `selectinload()` _loader option_. This option _solves the most common form_ of the __`"N plus one"`__ problem which is that of a _set of objects_ that __refer to related collections__. `selectinload()` will _ensure_ that _a particular collection_ for a full series of objects is __loaded up front using a single query__. It does this using a `SELECT` form that in most cases can be __emitted__ _against the related table alone_, __without the introduction__ of `JOIN`s or `subqueries`, and __only queries__ for those _parent objects_ for which the __collection isn't already loaded__. Below we illustrate `selectinload()` by _loading all_ of the __`User` objects__ and _all_ of their __related `Address` objects__; while we invoke `Session.execute()` __only once__, given a `select()` construct, when the database is accessed, there are in fact _two_ `SELECT` statements __emitted__, the _second_ one being to _fetch_ the __related `Address` objects__.

In [37]:
stmt = select(User).options(selectinload(User.addresses)).order_by(User.id)

for row in session.execute(stmt):
    print(f"{row.User.name} ({', '.join(a.email_address for a in row.User.addresses)})")

2022-10-12 11:05:37,324 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account ORDER BY user_account.id
2022-10-12 11:05:37,326 INFO sqlalchemy.engine.Engine [generated in 0.00156s] ()
2022-10-12 11:05:37,329 INFO sqlalchemy.engine.Engine SELECT address.user_id AS address_user_id, address.id AS address_id, address.email_address AS address_email_address 
FROM address 
WHERE address.user_id IN (?, ?, ?, ?, ?, ?)
2022-10-12 11:05:37,330 INFO sqlalchemy.engine.Engine [cached since 0.1269s ago] (1, 2, 3, 4, 5, 6)
spongebob (spongebob@sqlalchemy.org)
sandy (sandy@sqlalchemy.org, sandy@squirrelpower.org)
patrick ()
squidward ()
ehkrabs ()
pkrabs (pearl.krabs@gmail.com, pearl@aol.com)
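
Rather than reading the echoed log, the number of `SELECT` statements can be counted with an engine event listener. The sketch below compares default lazy loading against `selectinload()`; it assumes SQLAlchemy 1.4 or later and an in-memory SQLite database, and `count_selects()` plus the sample data are hypothetical names introduced for this illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event, select
from sqlalchemy.orm import Session, registry, relationship, selectinload

Base = registry().generate_base()


class User(Base):
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    addresses = relationship("Address", back_populates="user")  # default lazy load


class Address(Base):
    __tablename__ = "address"
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    user = relationship("User", back_populates="addresses")


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)
statements = []


@event.listens_for(engine, "before_cursor_execute")
def record(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)


with Session(engine) as session:
    session.add_all([
        User(name="spongebob", addresses=[Address(email_address="spongebob@sqlalchemy.org")]),
        User(name="sandy", addresses=[Address(email_address="sandy@sqlalchemy.org")]),
        User(name="patrick"),
    ])
    session.commit()


def count_selects(stmt):
    """Hypothetical helper: run stmt in a fresh Session, touch every
    User.addresses collection, and count the SELECTs that were sent."""
    statements.clear()
    with Session(engine) as session:
        for user in session.execute(stmt).scalars().all():
            len(user.addresses)
    return sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))


lazy_count = count_selects(select(User))
selectin_count = count_selects(select(User).options(selectinload(User.addresses)))
print(lazy_count, selectin_count)  # 4 2
```

With three users, lazy loading issues one `SELECT` for the users plus one per collection, while `selectinload()` stays at two statements regardless of how many users are returned.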


##### Joined Load

The `joinedload()` _eager load strategy_ is the __oldest__ _eager loader_ in SQLAlchemy, which __augments__ the `SELECT` statement that's being _passed to the database_ with a `JOIN` (which _may be an outer or an inner join_ depending on options), which can then __load in related objects__.

The `joinedload()` strategy is _best suited_ towards __loading related `many-to-one` objects__, as this __`only requires`__ that _additional columns_ are __added to a primary entity row__ that would be __fetched in any case__. For _greater efficiency_, it also accepts an option `joinedload.innerjoin` so that an __`inner join` instead of an `outer join`__ may be used for a case such as below where we know that all `Address` objects have an _associated_ `User`.

In [38]:
stmt = (
    select(Address).options(joinedload(Address.user, innerjoin=True)).
    order_by(Address.id)
)

for row in session.execute(stmt):
    print(f"{row.Address.email_address} {row.Address.user.name}")

2022-10-12 11:05:37,463 INFO sqlalchemy.engine.Engine SELECT address.id, address.email_address, address.user_id, user_account_1.id AS id_1, user_account_1.name, user_account_1.fullname 
FROM address JOIN user_account AS user_account_1 ON user_account_1.id = address.user_id ORDER BY address.id
2022-10-12 11:05:37,465 INFO sqlalchemy.engine.Engine [generated in 0.00138s] ()
spongebob@sqlalchemy.org spongebob
sandy@sqlalchemy.org sandy
sandy@squirrelpower.org sandy
pearl.krabs@gmail.com pkrabs
pearl@aol.com pkrabs


`joinedload()` also __works for collections__, meaning __`one-to-many`__ _relationships_. However, it has the _effect of multiplying out primary rows per related item in a recursive way_ that __grows the amount of data__ sent for a result set by _orders of magnitude_ for _nested collections_ and/or _larger collections_, so _its use vs. another option_ such as `selectinload()` should be __evaluated on a per-case basis__.
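
The row multiplication can be seen without SQLAlchemy at all. The sketch below uses the standard-library `sqlite3` module and a hypothetical nested collection, `address_tag`, which is not part of this tutorial's schema. It joins one user to three addresses that carry four tags each and counts how often the parent row is repeated. It only illustrates the shape of the joined result, under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_account (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE address (
        id INTEGER PRIMARY KEY,
        email_address TEXT,
        user_id INTEGER REFERENCES user_account (id)
    );
    -- hypothetical nested collection, not part of the tutorial's schema
    CREATE TABLE address_tag (
        id INTEGER PRIMARY KEY,
        label TEXT,
        address_id INTEGER REFERENCES address (id)
    );
    INSERT INTO user_account (id, name) VALUES (1, 'sandy');
    """
)

# One user, 3 addresses, 4 tags on every address.
for address_id in range(1, 4):
    conn.execute(
        "INSERT INTO address (id, email_address, user_id) VALUES (?, ?, 1)",
        (address_id, f"sandy{address_id}@example.com"),
    )
    conn.executemany(
        "INSERT INTO address_tag (label, address_id) VALUES (?, ?)",
        [(f"tag{n}", address_id) for n in range(4)],
    )

# The shape a joined load of both collection levels has to fetch.
joined_rows = conn.execute(
    """
    SELECT u.id, u.name, a.id, a.email_address, t.id, t.label
    FROM user_account AS u
    LEFT OUTER JOIN address AS a ON a.user_id = u.id
    LEFT OUTER JOIN address_tag AS t ON t.address_id = a.id
    """
).fetchall()

# How many times the single user row was sent back by the database.
user_copies = sum(1 for row in joined_rows if row[0] == 1)
# How many rows repeat the columns of the first address.
first_address_copies = sum(1 for row in joined_rows if row[2] == 1)

print(len(joined_rows), user_copies, first_address_copies)  # 12 12 4
```

The single user row comes back twelve times and every address row four times. The repetition is the product of the collection sizes at each level, which is why the cost rises quickly as collections get larger or more deeply nested.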

It's _important_ to note that the `WHERE` and `ORDER BY` criteria of the _enclosing_ `Select` statement __do not target the table rendered by `joinedload()`__. Above, it can be seen in the SQL that an __anonymous alias__ is applied to the `user_account` table such that it is __not directly addressable__ in the query. This concept is discussed in more detail in the section `The Zen of Joined Eager Loading`.

The `ON clause` rendered by `joinedload()` __may be affected directly__ by using the `PropComparator.and_()` method described previously at `Augmenting the ON Criteria`; examples of this technique with _loader strategies_ are further below at `Augmenting Loader Strategy Paths`. However, more generally, `"joined eager loading"` may be applied to a `Select` that uses `Select.join()` using the approach described in the next section, `Explicit Join + Eager load`.

> ##### Tip
> 
> It's important to note that _many-to-one_ `eager loads` are __often not necessary__, as the __`"N plus one"`__ problem is _much less prevalent_ in the common case. When _many objects_ all __refer to the same related object__, such as _many_ `Address` objects that each refer to the _same_ `User`, `SQL` will be __emitted only once__ for that `User` object using _normal lazy loading_. The `lazy load routine` will _look up the related object_ __by primary key__ in the _current_ `Session` __without emitting any SQL when possible__.
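
The effect described in the tip can be sketched with a plain dictionary standing in for the `Session`'s primary-key lookup. This is a conceptual illustration written against the standard-library `sqlite3` module, not SQLAlchemy's identity map. The names `identity_map`, `get_user` and `user_selects` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_account (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE address (
        id INTEGER PRIMARY KEY,
        email_address TEXT NOT NULL,
        user_id INTEGER REFERENCES user_account (id)
    );
    INSERT INTO user_account (id, name)
        VALUES (1, 'spongebob'), (2, 'sandy'), (3, 'pkrabs');
    INSERT INTO address (id, email_address, user_id) VALUES
        (1, 'spongebob@sqlalchemy.org', 1),
        (2, 'sandy@sqlalchemy.org', 2),
        (3, 'sandy@squirrelpower.org', 2),
        (4, 'pearl.krabs@gmail.com', 3),
        (5, 'pearl@aol.com', 3);
    """
)

identity_map = {}  # primary key -> already-loaded user (stand-in for the Session)
user_selects = 0   # how many SELECTs against user_account were needed


def get_user(user_id):
    """Return the user for a primary key, emitting SQL only on a cache miss."""
    global user_selects
    if user_id not in identity_map:
        user_selects += 1
        row = conn.execute(
            "SELECT id, name FROM user_account WHERE id = ?", (user_id,)
        ).fetchone()
        identity_map[user_id] = {"id": row[0], "name": row[1]}
    return identity_map[user_id]


addresses = conn.execute(
    "SELECT email_address, user_id FROM address ORDER BY id"
).fetchall()

# "Lazy load" the many-to-one side for every address row.
pairs = [(email, get_user(user_id)["name"]) for email, user_id in addresses]

print(len(pairs), user_selects)  # 5 3
```

Five addresses refer to three distinct users, so only three lookups reach the database. The remaining two are answered from the in-memory mapping.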

##### Explicit Join + Eager load

If we were to _load_ `Address` rows while _joining_ to the `user_account` table using a method such as `Select.join()` to __render the `JOIN`__, we _could also leverage_ that `JOIN` in order to __eagerly load__ the contents of the `Address.user` _attribute_ on each `Address` object returned. In essence, we are using __`"joined eager loading"`__ but __rendering the `JOIN` ourselves__. This common use case is achieved by using the `contains_eager()` option. This option is __very similar__ to `joinedload()`, except that it _assumes_ we have __set up the `JOIN` ourselves__, and it instead _only indicates_ that _additional columns_ in the `COLUMNS` clause __should be loaded into related attributes__ on _each_ returned object.

In [39]:
stmt = (
    select(Address).
    join(Address.user).
    where(User.name == "pkrabs").
    options(contains_eager(Address.user)).
    order_by(Address.id)
)

for row in session.execute(stmt):
    print(f"{row.Address.email_address} {row.Address.user.name}")

2022-10-12 11:05:37,574 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname, address.id AS id_1, address.email_address, address.user_id 
FROM address JOIN user_account ON user_account.id = address.user_id 
WHERE user_account.name = ? ORDER BY address.id
2022-10-12 11:05:37,576 INFO sqlalchemy.engine.Engine [generated in 0.00170s] ('pkrabs',)
pearl.krabs@gmail.com pkrabs
pearl@aol.com pkrabs


Above, we __both__ _filtered the rows_ on `user_account.name` and also __loaded rows__ from `user_account` into the `Address.user` attribute of the returned rows. If we had __applied `joinedload()` separately__, we would get a SQL query that __unnecessarily__ _joins twice_.

In [40]:
stmt = (
    select(Address).
    join(Address.user).
    where(User.name == "pkrabs").
    options(joinedload(Address.user)).
    order_by(Address.id)
)
print(stmt)

SELECT address.id, address.email_address, address.user_id, user_account_1.id AS id_1, user_account_1.name, user_account_1.fullname 
FROM address JOIN user_account ON user_account.id = address.user_id LEFT OUTER JOIN user_account AS user_account_1 ON user_account_1.id = address.user_id 
WHERE user_account.name = :name_1 ORDER BY address.id


##### Augmenting Loader Strategy Paths

In `Augmenting the ON Criteria` we illustrated __how to add arbitrary criteria__ to the `ON clause` of a `JOIN` _rendered with_ `relationship()`. The `PropComparator.and_()` method is in fact __generally available__ for _most loader options_. For example, if we wanted to __re-load__ the _names of users_ and _their email addresses_, but __omitting__ the _email addresses_ with the `sqlalchemy.org` domain, we can apply `PropComparator.and_()` to the argument passed to `selectinload()` to _add_ this criteria to the load.

In [41]:
stmt = (
    select(User).
    options(
        selectinload(User.addresses.and_(~Address.email_address.endswith("sqlalchemy.org")))
    ).order_by(User.id).execution_options(populate_existing=True)
)

for row in session.execute(stmt):
    print(f"{row.User.name} ({', '.join(a.email_address for a in row.User.addresses)})")

2022-10-12 11:05:37,902 INFO sqlalchemy.engine.Engine SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account ORDER BY user_account.id
2022-10-12 11:05:37,903 INFO sqlalchemy.engine.Engine [generated in 0.00152s] ()
2022-10-12 11:05:37,910 INFO sqlalchemy.engine.Engine SELECT address.user_id AS address_user_id, address.id AS address_id, address.email_address AS address_email_address 
FROM address 
WHERE address.user_id IN (?, ?, ?, ?, ?, ?) AND (address.email_address NOT LIKE '%' || ?)
2022-10-12 11:05:37,912 INFO sqlalchemy.engine.Engine [generated in 0.00149s] (1, 2, 3, 4, 5, 6, 'sqlalchemy.org')
spongebob ()
sandy (sandy@squirrelpower.org)
patrick ()
squidward ()
ehkrabs ()
pkrabs (pearl.krabs@gmail.com, pearl@aol.com)


A __very important__ thing to note above is that a `special option` is added with `.execution_options(populate_existing=True)`. This option, which _takes effect when_ `rows` are _being fetched_, __indicates__ that the `loader option` we are using __should `replace` the `existing contents` of `collections` on the objects__, _if_ they are _already loaded_. As we are working with a _single_ `Session` __repeatedly__, the _objects_ we see __being loaded__ above are the __same `Python` instances__ as those that were _first persisted_ at the start of the ORM section of this notebook.
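
The following self-contained sketch isolates the effect of `populate_existing`. It assumes SQLAlchemy 1.4 or later and uses two hypothetical mapped classes, `Parent` and `Child`, instead of the notebook's `User` and `Address`, so that it can run on its own. The same filtered `selectinload()` is executed twice against a `Session` that already holds the fully loaded collection, once without and once with the option.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, registry, relationship, selectinload

Base = registry().generate_base()


class Parent(Base):  # hypothetical mapping, for illustration only
    __tablename__ = "parent"

    id = Column(Integer, primary_key=True)
    children = relationship("Child", order_by="Child.id")


class Child(Base):  # hypothetical mapping, for illustration only
    __tablename__ = "child"

    id = Column(Integer, primary_key=True)
    tag = Column(String)
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(
        Parent(id=1, children=[Child(id=1, tag="keep"), Child(id=2, tag="skip")])
    )
    session.commit()

    # First load: the complete collection ends up on the Parent instance.
    parent = session.execute(
        select(Parent).options(selectinload(Parent.children))
    ).scalars().one()
    first_load = [child.tag for child in parent.children]

    filtered = select(Parent).options(
        selectinload(Parent.children.and_(Child.tag != "skip"))
    )

    # Same Session, same instance: the loaded collection is left as it is.
    session.execute(filtered).scalars().one()
    without_option = [child.tag for child in parent.children]

    # populate_existing=True lets the loader overwrite the collection.
    session.execute(
        filtered.execution_options(populate_existing=True)
    ).scalars().one()
    with_option = [child.tag for child in parent.children]

print(first_load, without_option, with_option)
```

In this sketch the filtered criteria only become visible on the existing instance once `populate_existing=True` is set. Without it, the collection that was already present is kept.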

##### Raiseload

One additional _loader strategy_ worth mentioning is `raiseload()`. This option is used to __completely block an application__ from having the `"N plus one"` problem __at all__ by causing what would _normally_ be a `lazy load` to __raise an error instead__. It has _two variants_, _controlled_ via the `raiseload.sql_only` option, which block either only __lazy loads that require SQL__ or __all `"load"` operations__, including those which __only__ need to __consult the current `Session`__.

One way to use `raiseload()` is to _configure_ it on `relationship()` __itself__, by _setting_ `relationship.lazy` to the value `"raise_on_sql"`, so that for a particular _mapping_, a _certain relationship_ will __never try to emit SQL__.

In [42]:
class UserStrict(Base):
    __tablename__ = "user_strict"
    
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)
    
    addresses = relationship("AddressStrict", back_populates="user", lazy="raise_on_sql")
    
    def __repr__(self):
        return f"User(id={self.id!r}, name={self.name!r}, fullname={self.fullname!r})"

In [43]:
class AddressStrict(Base):
    __tablename__ = "address_strict"
    
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_strict.id"))
    
    user = relationship("UserStrict", back_populates="addresses", lazy="raise_on_sql")
    
    def __repr__(self):
        return f"Address(id={self.id!r}, email_address={self.email_address!r})"

In [44]:
mapped_registry.metadata.create_all(engine)

2022-10-12 11:05:38,341 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:38,343 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("user_account")
2022-10-12 11:05:38,344 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:38,346 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("address")
2022-10-12 11:05:38,347 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:38,348 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("user_strict")
2022-10-12 11:05:38,349 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:38,351 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("user_strict")
2022-10-12 11:05:38,352 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:38,354 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("address_strict")
2022-10-12 11:05:38,355 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-10-12 11:05:38,358 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("address_strict")
2022-10-12 11:05:38,359 INFO sqlalchemy

Using such a _mapping_, the `application` is __blocked from `lazy loading`__, indicating that a particular query would __need to specify a loader strategy__.

In [45]:
spongebob_strict = UserStrict(name="spongebob", fullname="Spongebob Squarepants")
sandy_strict = UserStrict(name="sandy", fullname="Sandy Cheeks")
patrick_strict = UserStrict(name="patrick", fullname="Patrick Star")
squidward_strict = UserStrict(name="squidward", fullname="Squidward Tentacles")
krabs_strict = UserStrict(name="ehkrabs", fullname="Eugene H. Krabs")

session.add(spongebob_strict)
session.add(sandy_strict)
session.add(patrick_strict)
session.add(squidward_strict)
session.add(krabs_strict)

# commit() flushes pending objects first, so a separate flush() is not needed
session.commit()

2022-10-12 11:05:38,469 INFO sqlalchemy.engine.Engine INSERT INTO user_strict (name, fullname) VALUES (?, ?)
2022-10-12 11:05:38,471 INFO sqlalchemy.engine.Engine [generated in 0.00173s] ('spongebob', 'Spongebob Squarepants')
2022-10-12 11:05:38,473 INFO sqlalchemy.engine.Engine INSERT INTO user_strict (name, fullname) VALUES (?, ?)
2022-10-12 11:05:38,476 INFO sqlalchemy.engine.Engine [cached since 0.005309s ago] ('sandy', 'Sandy Cheeks')
2022-10-12 11:05:38,476 INFO sqlalchemy.engine.Engine INSERT INTO user_strict (name, fullname) VALUES (?, ?)
2022-10-12 11:05:38,477 INFO sqlalchemy.engine.Engine [cached since 0.008398s ago] ('patrick', 'Patrick Star')
2022-10-12 11:05:38,479 INFO sqlalchemy.engine.Engine INSERT INTO user_strict (name, fullname) VALUES (?, ?)
2022-10-12 11:05:38,480 INFO sqlalchemy.engine.Engine [cached since 0.01087s ago] ('squidward', 'Squidward Tentacles')
2022-10-12 11:05:38,481 INFO sqlalchemy.engine.Engine INSERT INTO user_strict (name, fullname) VALUES (?, ?)

In [46]:
addresses = [
    {"username": "spongebob", "email_address": "spongebob@sqlalchemy.org"},
    {"username": "sandy", "email_address": "sandy@sqlalchemy.org"},
    {"username": "sandy", "email_address": "sandy@squirrelpower.org"},
]
scalar_subq = (
    select(UserStrict.id).
    where(UserStrict.name == bindparam("username")).
    scalar_subquery()
)

with engine.connect() as conn:
    # Reuse the parameter sets defined above; each dict fills the
    # "username" bind parameter of the scalar subquery and "email_address".
    result = conn.execute(
        insert(AddressStrict).values(user_id=scalar_subq),
        addresses,
    )
    conn.commit()

2022-10-12 11:05:38,643 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:38,644 INFO sqlalchemy.engine.Engine INSERT INTO address_strict (email_address, user_id) VALUES (?, (SELECT user_strict.id 
FROM user_strict 
WHERE user_strict.name = ?))
2022-10-12 11:05:38,646 INFO sqlalchemy.engine.Engine [generated in 0.00363s] (('spongebob@sqlalchemy.org', 'spongebob'), ('sandy@sqlalchemy.org', 'sandy'), ('sandy@squirrelpower.org', 'sandy'))
2022-10-12 11:05:38,647 INFO sqlalchemy.engine.Engine COMMIT


In [47]:
try:
    u1 = session.execute(select(UserStrict)).scalars().first()
    print(f"{u1.addresses = }")
except Exception as e:
    print(f"{type(e)}")
    print(f"{str(e)}")

2022-10-12 11:05:38,770 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-10-12 11:05:38,773 INFO sqlalchemy.engine.Engine SELECT user_strict.id, user_strict.name, user_strict.fullname 
FROM user_strict
2022-10-12 11:05:38,775 INFO sqlalchemy.engine.Engine [generated in 0.00201s] ()
<class 'sqlalchemy.exc.InvalidRequestError'>
'UserStrict.addresses' is not available due to lazy='raise_on_sql'


The `exception` would _indicate_ that this collection should be __loaded up front instead__.

In [48]:
u1 = session.execute(select(UserStrict).options(selectinload(UserStrict.addresses))).scalars().first()
print(f"{u1 = }")

2022-10-12 11:07:15,158 INFO sqlalchemy.engine.Engine SELECT user_strict.id, user_strict.name, user_strict.fullname 
FROM user_strict
2022-10-12 11:07:15,160 INFO sqlalchemy.engine.Engine [generated in 0.00208s] ()
2022-10-12 11:07:15,165 INFO sqlalchemy.engine.Engine SELECT address_strict.user_id AS address_strict_user_id, address_strict.id AS address_strict_id, address_strict.email_address AS address_strict_email_address 
FROM address_strict 
WHERE address_strict.user_id IN (?, ?, ?, ?, ?)
2022-10-12 11:07:15,167 INFO sqlalchemy.engine.Engine [generated in 0.00155s] (1, 2, 3, 4, 5)
u1 = User(id=1, name='spongebob', fullname='Spongebob Squarepants')


The `lazy="raise_on_sql"` option __tries to be smart__ about `many-to-one relationships` as well. With the mapping above, if the `AddressStrict.user` attribute of an `AddressStrict` object were _not loaded_, but that `UserStrict` object _were locally present_ in the __same `Session`__, the `"raiseload"` strategy __would not raise an error__.
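
The two variants can also be applied per query with the `raiseload()` loader option instead of on the mapping. The sketch below assumes SQLAlchemy 1.4 or later and uses hypothetical `Parent` and `Child` classes so that it runs on its own. The helper `try_parent` is likewise made up for illustration. It accesses a many-to-one attribute under `sql_only=True` and under the default `sql_only=False`, with and without the related object already present in the `Session`.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.orm import Session, raiseload, registry, relationship

Base = registry().generate_base()


class Parent(Base):  # hypothetical mapping, for illustration only
    __tablename__ = "parent"

    id = Column(Integer, primary_key=True)
    name = Column(String)
    children = relationship("Child", back_populates="parent")


class Child(Base):  # hypothetical mapping, for illustration only
    __tablename__ = "child"

    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


engine = create_engine("sqlite+pysqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Parent(id=1, name="sandy", children=[Child(id=1)]))
    session.commit()


def try_parent(sql_only, preload_parent):
    """Access Child.parent under raiseload() and report what happened."""
    with Session(engine) as session:
        if preload_parent:
            # Keep a reference so the Parent stays in this Session.
            parent = session.get(Parent, 1)
        child = session.execute(
            select(Child).options(raiseload(Child.parent, sql_only=sql_only))
        ).scalars().one()
        try:
            return child.parent.name
        except InvalidRequestError:
            return "raised"


# sql_only=True: a lookup satisfied by the Session alone is allowed.
local_lookup = try_parent(sql_only=True, preload_parent=True)
# sql_only=True: the load would need SQL, so it is blocked.
needs_sql = try_parent(sql_only=True, preload_parent=False)
# sql_only=False (the default): every load operation is blocked.
block_all = try_parent(sql_only=False, preload_parent=True)

print(local_lookup, needs_sql, block_all)
```

Under these assumptions, only the first call returns the related object's name. The other two raise `InvalidRequestError`, which the helper reports as `"raised"`.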