# SQLAlchemy ORM (1.4) Joins & Relationships
This notebook covers the following:
- Foreign Key Relationships
  - One to Many (1:N)
  - One to One (1:1)
  - Many to Many (N:M)
- Lazy loading (and how to disable it)
- Self referential foreign keys
- Joins on columns, without keys


# Setup


In [None]:
import sqlalchemy as sa
from sqlalchemy import orm
from utils import rollback, logs, DEBUG

In [None]:
# First 'connect' to a database.
# The engine takes care of the Dialect (language) and Driver (communication protocol).
# This line creates an in-memory database (using SQLite)
# This is nice, because it means the notebooks won't cross-contaminate.
engine = sa.create_engine('sqlite:///')
con = engine.connect()

In [None]:
# Creating a metadata object, which is usually shared between objects belonging to the same database.
# More advanced features allow Metadata to be tied to a specific database.
meta = sa.MetaData()

In [None]:
# Then create a base class from which all model implementations will inherit.
# This also allows child classes to be 'registered', letting them be discovered for migrations more easily.
class Base(orm.DeclarativeBase):
    metadata = meta

# Relationships

The *relationship* is a pure ORM feature that tells a Session how to treat the Model.<br>
This includes:

- Loading type: eager or lazy (default)
- Loading strategy: left join / separate query
- Custom cascade behavior.

This uses `sqlalchemy.orm.relationship`.

**Alembic:** Anyone using Alembic for database migrations needs to know:<br>
Relationships are application-level constructs that do not influence database design.<br>
Modifying a relationship will not be picked up by Alembic.
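
A minimal standalone sketch of that point, assuming a hypothetical pair of models that mirrors the ones defined later in this notebook. The relationship attributes never appear in the table metadata, so schema tooling that works from the metadata has nothing to detect. `orm.declarative_base()` is used here only so that the sketch runs on both 1.4 and 2.0.

```python
import sqlalchemy as sa
from sqlalchemy import orm
from sqlalchemy.schema import CreateTable

# Assumption: a throwaway Base, separate from the notebook's own Base.
Base = orm.declarative_base()

class Author(Base):
    __tablename__ = 'authors'
    id = sa.Column(sa.Integer, primary_key=True)
    posts = orm.relationship('Post', back_populates='author')

class Post(Base):
    __tablename__ = 'posts'
    id = sa.Column(sa.Integer, primary_key=True)
    author_id = sa.Column(sa.Integer, sa.ForeignKey('authors.id'))
    author = orm.relationship('Author', back_populates='posts')

# Only real columns end up in the table; 'author' is an ORM-only attribute.
post_columns = [column.name for column in Post.__table__.columns]
print(post_columns)

# The generated DDL contains the foreign key, but nothing about the relationship.
ddl = str(CreateTable(Post.__table__))
print(ddl)
```

Changing or removing either `relationship(...)` line leaves both printed results untouched, whereas changing the `ForeignKey` would alter the DDL.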

## One-to-Many (1:N)
The one-to-many relationship is the easiest one to define in SQLAlchemy.

See also: [SQLAlchemy Docs](https://docs.sqlalchemy.org/en/20/orm/basic_relationships.html#one-to-many)

In [None]:
class Author(Base):
    __tablename__ = 'authors'
    id = sa.Column(sa.Integer, primary_key=True, autoincrement=True)
    name = sa.Column(sa.VARCHAR(255), default=None, nullable=True)
    posts = orm.relationship('Post', back_populates='author')
    
class Post(Base):
    __tablename__ = 'posts'
    id = sa.Column(sa.Integer, primary_key=True, autoincrement=True)
    author_id = sa.Column(Author.id.type, sa.ForeignKey(Author.id))
    author = orm.relationship(Author, back_populates='posts')
    
    title = sa.Column(sa.VARCHAR(255), default=None, nullable=True)
    text = sa.Column(sa.TEXT)

There is a lot going on in this piece of Python code.

1. Both Author and Post define a relationship to the other.
2. They mention one another's attribute name in 'back_populates'.
3. They do *not* show cardinality (1:1, 1:N, N:M).
4. The ForeignKey is not referenced by the relationships.

The relation between ForeignKey and relationship is determined when the class gets used.<br>
The relationship objects get resolved on the table level.<br>
As long as one table references the other, this should work (some exceptions notwithstanding).

The tables defined are a little funky because there are many ways to define a behavior.
```python
def relationship(target: str | Type | Callable, **kwargs):
    ...
```
When the target is a `str`, it is expecting the name of a class.

When the target is a `Type`, it is expecting an ORM class.

When the target is a Callable, it is expecting a function that returns a class.<br>
This can help with Forward Declaration.<br>
When declaring classes in order, a parent might not have a *nice* reference to the child class.<br>
In the code above, `orm.relationship('Post', ...)` could be replaced by `orm.relationship(lambda: Post, ...)`.<br>
This works because the body of the lambda is only evaluated when SQLAlchemy configures the mappers, by which time `Post` has been defined.
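
The three target forms can be compared side by side. The sketch below is standalone and uses hypothetical `Parent` and `Child` models; `viewonly=True` is an assumption made here purely to avoid having several writable relationships on the same foreign key.

```python
import sqlalchemy as sa
from sqlalchemy import orm

# Assumption: a throwaway Base, separate from the notebook's own Base.
Base = orm.declarative_base()

class Parent(Base):
    __tablename__ = 'parents'
    id = sa.Column(sa.Integer, primary_key=True)
    # 1. A string: the class name is looked up in the declarative registry.
    by_name = orm.relationship('Child')
    # 2. A callable: evaluated later, once Child exists.
    by_callable = orm.relationship(lambda: Child, viewonly=True)

class Child(Base):
    __tablename__ = 'children'
    id = sa.Column(sa.Integer, primary_key=True)
    parent_id = sa.Column(sa.Integer, sa.ForeignKey('parents.id'))
    # 3. A class: possible here because Parent is already defined.
    parent = orm.relationship(Parent, viewonly=True)

# Resolve all pending relationship targets now instead of on first use.
orm.configure_mappers()

targets = [
    Parent.by_name.property.mapper.class_,
    Parent.by_callable.property.mapper.class_,
    Child.parent.property.mapper.class_,
]
print(targets)
```

The string and the callable both end up pointing at `Child`, which is why they are interchangeable for forward references.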

In [None]:
Base.metadata.create_all(engine)

In [None]:
with rollback(con):
    with logs(), orm.Session(con) as session, session.begin():
        author = Author(name='jack')
        # Assigning author= also appends each Post to author.posts (via back_populates)
        Post(title='Hello World', text='Lorem Ipsum', author=author)
        Post(title='Doing SQLAlchemy', text='Lorem Ipsum', author=author)
        # Add the author to the session; the posts come along through the relationship.
        session.add(author)

    with orm.Session(con) as session, session.begin(), logs():
        for author in session.execute(sa.select(Author)).scalars():
            print('> Posts by user:', author.name)
            for post in author.posts:
                print('>', post.title)

---
The code above demonstrates how relationships have implications across objects.<br>
Both posts reference the author, but were never explicitly added to the session; they were saved through the relationship's default `save-update` cascade.

The logs reveal 3 things (assuming SQLite):
1. Posts were never explicitly added to the session, they only referenced the new Author, and they still got saved to the database.
2. The Post objects are inserted on a 1-by-1 basis
3. Iterating over `author.posts` caused another query for on-demand/lazy loading
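
The third observation is the classic N+1 pattern. It can be made visible without reading logs by counting statements. The following standalone sketch assumes hypothetical models equivalent to the ones above and uses SQLAlchemy's `before_cursor_execute` engine event to record every statement.

```python
import sqlalchemy as sa
from sqlalchemy import event, orm

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Author(Base):
    __tablename__ = 'authors'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255))
    posts = orm.relationship('Post', back_populates='author')

class Post(Base):
    __tablename__ = 'posts'
    id = sa.Column(sa.Integer, primary_key=True)
    author_id = sa.Column(sa.Integer, sa.ForeignKey('authors.id'))
    title = sa.Column(sa.String(255))
    author = orm.relationship('Author', back_populates='posts')

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

statements = []

@event.listens_for(engine, 'before_cursor_execute')
def record(conn, cursor, statement, parameters, context, executemany):
    # Keep the raw SQL text of everything sent to the database.
    statements.append(statement)

with orm.Session(engine) as session:
    session.add_all([
        Author(name='jack', posts=[Post(title='a'), Post(title='b')]),
        Author(name='jill', posts=[Post(title='c')]),
    ])
    session.commit()

statements.clear()
with orm.Session(engine) as session:
    authors = session.execute(sa.select(Author)).scalars().all()
    # Each access to .posts triggers its own lazy-load SELECT.
    titles = {a.name: sorted(p.title for p in a.posts) for a in authors}

selects = [s for s in statements if s.lstrip().upper().startswith('SELECT')]
print(len(selects), titles)
```

With two authors, this records one query for the authors plus one per author for their posts.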


## One-to-One (1:1)
The classic Foreign Key in database schemas does not *enforce* 1-to-1 on its own.<br>
In SQLAlchemy, this also boils down to using the `relationship` system to make this explicit.

In effect, it will tell the ORM not to use a list when resolving the value.<br>
This is done via the aptly named 'uselist' parameter.

See also: [SQLAlchemy Docs](https://docs.sqlalchemy.org/en/20/orm/basic_relationships.html#one-to-one)

In [None]:
class Person(Base):
    __tablename__ = 'persons'
    id = sa.Column(sa.INTEGER, primary_key=True, autoincrement=True)
    name = sa.Column(sa.VARCHAR(255))
    address = orm.relationship(lambda: Address, back_populates='resident', uselist=False)
    
class Address(Base):
    __tablename__ = 'addresses'
    id = sa.Column(sa.INTEGER,  primary_key=True, autoincrement=True)
    resident_id = sa.Column(Person.id.type, sa.ForeignKey(Person.id))
    street_name = sa.Column(sa.VARCHAR(255))
    resident = orm.relationship(Person, back_populates='address')

In [None]:
Base.metadata.create_all(engine)

In [None]:
with rollback(con):
    with orm.Session(con) as session, session.begin():
        person = Person(name='jack')
        address = Address(street_name='Lorem Ipsum', resident=person)
        session.add(person)

    with orm.Session(con) as session, session.begin():
        for person in session.execute(sa.select(Person)).scalars():
            print(person.name, person.address)

## Many-to-Many (N:M)
As with regular SQL, the many-to-many relationship requires a linking table.<br>
This is where a little bit of Core functionality comes into play.

See also: [SQLAlchemy Docs](https://docs.sqlalchemy.org/en/20/orm/basic_relationships.html#many-to-many)


In [None]:
class User(Base):
    __tablename__ = 'users'
    id = sa.Column(sa.INTEGER, primary_key=True, autoincrement=True)
    name = sa.Column(sa.VARCHAR(255))
    folders = orm.relationship(lambda: Folder, secondary=lambda: sharing_table, back_populates='users')

class Folder(Base):
    __tablename__ = 'folders'
    id = sa.Column(sa.INTEGER, primary_key=True, autoincrement=True)
    name = sa.Column(sa.VARCHAR(255))
    users = orm.relationship(lambda: User, secondary=lambda: sharing_table, back_populates='folders')
    
sharing_table = sa.Table(
    "shares",
    Base.metadata,
    sa.Column("user_id", User.id.type, sa.ForeignKey(User.id), primary_key=True),
    sa.Column("folder_id", Folder.id.type, sa.ForeignKey(Folder.id), primary_key=True),
)



In [None]:
Base.metadata.create_all(engine)

In [None]:
with rollback(con):
    with orm.Session(con) as session, session.begin():
        alice = User(name='Alice')
        bob = User(name='Bob')
        memes = Folder(name='memes')
        work = Folder(name='work')
        session.add_all([alice, bob, memes, work])
        work.users.extend((alice, bob))
        memes.users.append(alice)
        
        print('Alice:')
        for entry in alice.folders:
            print('>', entry.name)
            
        print('Bob:')
        for entry in bob.folders:
            print('>', entry.name)
        

## Relationship with Self
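
This section has no example yet. As a placeholder, here is a minimal sketch of a self-referential foreign key using the adjacency-list pattern. The `Category` model and its column names are hypothetical; the `remote_side` parameter is what tells SQLAlchemy which direction is the many-to-one side.

```python
import sqlalchemy as sa
from sqlalchemy import orm

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Category(Base):
    __tablename__ = 'categories'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255))
    # The foreign key points back at the same table.
    parent_id = sa.Column(sa.Integer, sa.ForeignKey('categories.id'))
    # remote_side marks 'id' as the far end, making 'parent' many-to-one.
    parent = orm.relationship('Category', remote_side=[id], back_populates='children')
    children = orm.relationship('Category', back_populates='parent')

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

with orm.Session(engine) as session:
    root = Category(name='root')
    root.children.extend([Category(name='left'), Category(name='right')])
    session.add(root)
    session.commit()

with orm.Session(engine) as session:
    # The root is the only row without a parent.
    root = session.execute(
        sa.select(Category).where(Category.parent_id.is_(None))
    ).scalars().one()
    child_names = sorted(child.name for child in root.children)
    parent_names = {child.parent.name for child in root.children}
    print(root.name, child_names, parent_names)
```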

# Eager/Lazy Loading

SQLAlchemy provides four core types of loading:
1. Eager, loading data with the initial query.
2. Lazy, loading data only when it is accessed.
3. Error, raising an error when it is accessed.<br>Some prefer errors to lazy loading for testing or performance reasons like N+1.
4. Not Loading, the data is returned as `None` or an empty list.

Each of these comes in several variants, so there are *many* different loading options.

See also: [SQLAlchemy Docs](https://docs.sqlalchemy.org/en/14/orm/loading_relationships.html#relationship-loading-techniques)

## Types of Loading
Before jumping into demos, it might (for once) be better to know what each load does ahead of time.<br>
Here is a rough summary of the loading types and their uses.<br>
This will make it clear what to look out for when logging is turned on.

Understand that all of the following examples will create queries which are specific to the ORM.<br>
The 'select' clause may be manipulated by the Session object, but it will always return the columns/objects you expect.<br>
This means an ORM 'joinedload' will not return data like a standard 'join' would.

See also: [*The Zen of Joined Eager Loading*](https://docs.sqlalchemy.org/en/20/orm/queryguide/relationships.html#joined-eager-loading) for a few implementation details.

All of the following objects show a Design and Options example.<br>
The *Design* is part of the class, while the *Options* are used for the query builder.<br>
Below is a quick example for lazyload.

```python
select(Model).options(orm.lazyload(Model.children))
```
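
To make the Design/Options split concrete, the standalone sketch below sets a loading style on the class and then overrides it for a single query. The models are hypothetical, and statements are counted through the `before_cursor_execute` engine event rather than read from logs.

```python
import sqlalchemy as sa
from sqlalchemy import event, orm

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Author(Base):
    __tablename__ = 'authors'
    id = sa.Column(sa.Integer, primary_key=True)
    # Design: every query for Author eager-loads posts with a second SELECT.
    posts = orm.relationship('Post', lazy='selectin')

class Post(Base):
    __tablename__ = 'posts'
    id = sa.Column(sa.Integer, primary_key=True)
    author_id = sa.Column(sa.Integer, sa.ForeignKey('authors.id'))

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

statements = []

@event.listens_for(engine, 'before_cursor_execute')
def record(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)

with orm.Session(engine) as session:
    session.add(Author(posts=[Post(), Post()]))
    session.commit()

def count_selects(stmt):
    """Run stmt in a fresh session and count the SELECTs it causes."""
    statements.clear()
    with orm.Session(engine) as session:
        session.execute(stmt).scalars().all()
    return sum(1 for s in statements if s.lstrip().upper().startswith('SELECT'))

# Uses the class-level design: authors + posts.
design_count = count_selects(sa.select(Author))
# Options: override the design for this query only; posts stay unloaded.
options_count = count_selects(sa.select(Author).options(orm.lazyload(Author.posts)))
print(design_count, options_count)
```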

### Lazy
This is the default behavior for ORM models.<br>
When code tries to access a member that isn't loaded, the associated session will have to fetch the data.

Design: `relationship(..., lazy='select')`<br>
Options: `orm.lazyload(Model.children)`


### Join
This is one of the more common eager load behaviors.<br>
In short, it performs a left-join to resolve child elements.<br>
The advantage is that the server still only executes one query, so only a single round-trip.

Design(Left Join): `relationship(..., lazy='joined')`<br>
Design(Inner Join): `relationship(..., lazy='joined', innerjoin=True)`<br>
Options(Left Join): `orm.joinedload(Model.children)`<br>
Options(Inner Join): `orm.joinedload(Model.children, innerjoin=True)`
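
The difference between the two join variants is easiest to see with a parent that has no children. This is a standalone sketch with hypothetical models; the SQL is captured through the `before_cursor_execute` engine event.

```python
import sqlalchemy as sa
from sqlalchemy import event, orm

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Author(Base):
    __tablename__ = 'authors'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255))
    posts = orm.relationship('Post')

class Post(Base):
    __tablename__ = 'posts'
    id = sa.Column(sa.Integer, primary_key=True)
    author_id = sa.Column(sa.Integer, sa.ForeignKey('authors.id'))

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

statements = []

@event.listens_for(engine, 'before_cursor_execute')
def record(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)

with orm.Session(engine) as session:
    # 'jill' deliberately has no posts.
    session.add_all([Author(name='jack', posts=[Post(), Post()]), Author(name='jill')])
    session.commit()

def load_authors(option):
    """Return (name, post count) pairs and the SQL that was executed."""
    statements.clear()
    with orm.Session(engine) as session:
        stmt = sa.select(Author).options(option).order_by(Author.id)
        # unique() is required when a joined eager load targets a collection.
        authors = session.execute(stmt).unique().scalars().all()
        names = [(a.name, len(a.posts)) for a in authors]
    return names, list(statements)

left_names, left_sql = load_authors(orm.joinedload(Author.posts))
inner_names, inner_sql = load_authors(orm.joinedload(Author.posts, innerjoin=True))
print(left_names, inner_names)
```

Both variants run a single statement, but the inner join silently drops the author without posts, so `innerjoin=True` is best kept for relationships that are known to be non-empty.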

### IN
This is another popular way to eager-load items.<br>
The disadvantage is that the session has to execute two queries, and therefore two round-trips.<br>
Query 1 will get all the parents,<br>
Query 2 will get all the child objects with a matching parent_id.

The benefit here is that it does not duplicate data like a join might.

Design: `relationship(..., lazy='selectin')`<br>
Options: `orm.selectinload(Model.children)`
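
A short standalone sketch of the two-query behavior, again with hypothetical models and with statements recorded through the `before_cursor_execute` engine event. The number of queries stays at two regardless of how many parents are returned.

```python
import sqlalchemy as sa
from sqlalchemy import event, orm

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Author(Base):
    __tablename__ = 'authors'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255))
    posts = orm.relationship('Post')

class Post(Base):
    __tablename__ = 'posts'
    id = sa.Column(sa.Integer, primary_key=True)
    author_id = sa.Column(sa.Integer, sa.ForeignKey('authors.id'))

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

statements = []

@event.listens_for(engine, 'before_cursor_execute')
def record(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)

with orm.Session(engine) as session:
    session.add_all([
        Author(name='jack', posts=[Post(), Post()]),
        Author(name='jill', posts=[Post()]),
        Author(name='joan', posts=[Post()]),
    ])
    session.commit()

statements.clear()
with orm.Session(engine) as session:
    stmt = sa.select(Author).options(orm.selectinload(Author.posts))
    authors = session.execute(stmt).scalars().all()
    # The posts are already loaded, so this causes no further queries.
    post_counts = {a.name: len(a.posts) for a in authors}

selects = [s for s in statements if s.lstrip().upper().startswith('SELECT')]
print(len(selects), post_counts)
```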

### Subquery
The Subquery reinterprets the original select statement to get the right data.<br>
In general, it is better to use `selectinload` unless you have special requirements.<br>
Microsoft SQL Server *can* have [issues](https://docs.sqlalchemy.org/en/20/orm/queryguide/relationships.html#subquery-eager-loading) with `selectinload`, notably when the model uses composite primary keys.

Design: `relationship(..., lazy='subquery')`<br>
Options: `orm.subqueryload(Model.children)`

### Raise
There will be moments where lazy loading is completely undesirable.<br>
This type of loading is applied on returned instances, as opposed to the query.<br>
This might be for testing or performance reasons.

When using SQLAlchemy with an *async* engine, a lazy load might raise a `MissingGreenlet` error.<br>
SQLAlchemy's asyncio support is built on greenlets, so the same error can also appear for other reasons.<br>
It is recommended to add `lazy='raise'` to have a separate exception for **any** unexpected lazy load.

Design: `relationship(..., lazy='raise')` (or `lazy='raise_on_sql'`)<br>
Options: `orm.raiseload(Model.children)`<br>
Options: `orm.raiseload('*')`
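
The sketch below shows both option forms in a standalone setting with hypothetical models. The first query forbids loading `posts` and catches the resulting error; the second eager-loads `posts` explicitly and uses the wildcard to forbid every other lazy load.

```python
import sqlalchemy as sa
from sqlalchemy import exc, orm

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Author(Base):
    __tablename__ = 'authors'
    id = sa.Column(sa.Integer, primary_key=True)
    posts = orm.relationship('Post')

class Post(Base):
    __tablename__ = 'posts'
    id = sa.Column(sa.Integer, primary_key=True)
    author_id = sa.Column(sa.Integer, sa.ForeignKey('authors.id'))

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

with orm.Session(engine) as session:
    session.add(Author(posts=[Post(), Post()]))
    session.commit()

with orm.Session(engine) as session:
    stmt = sa.select(Author).options(orm.raiseload(Author.posts))
    author = session.execute(stmt).scalars().first()
    try:
        author.posts  # would be a lazy load, so it raises instead
        raised = False
    except exc.InvalidRequestError:
        raised = True

with orm.Session(engine) as session:
    # Eager-load what is needed, and turn every other lazy load into an error.
    stmt = sa.select(Author).options(orm.selectinload(Author.posts), orm.raiseload('*'))
    author = session.execute(stmt).scalars().first()
    loaded_posts = len(author.posts)

print(raised, loaded_posts)
```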

### Noload
In some cases, lazy loading isn't desirable at all.<br>
This method will fake it by returning None or an empty list.

Design: `relationship(..., lazy='noload')`<br>
Options: `orm.noload(Model.children)`

### DefaultLoad
Finally, there is 'defaultload'.<br>
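It leaves the relationship's configured loading style unchanged, and is mainly used as a link in a chain to set options further down the path, e.g. `orm.defaultload(Model.children).selectinload(Child.items)`, where `Child.items` is a hypothetical nested relationship.<br>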
This can be useful for situations where part of the query was made by a different system.

Design: `N/A`<br>
Options: `orm.defaultload(Model.children)`

## Examples

### JoinedLoad

In [None]:
query = (
    sa.select(User).options(orm.joinedload(User.folders))
)
with rollback(con), logs(), orm.Session(con) as s, s.begin():
    result = s.execute(query).unique().scalars()
    for user in result:
        print(user.name, [f.name for f in user.folders])

In [None]:
con.commit()

### Eager Load: Select In

In [None]:
query = (
    sa.select(User).options(orm.selectinload(User.folders))
)
with rollback(con), logs(), orm.Session(con) as s, s.begin():
    result = s.execute(query).unique().scalars()
    for user in result:
        print(user.name, [f.name for f in user.folders])

In [None]:
class Order(Base):
    ...



In [None]:
x = sa.inspect(stmt)

In [None]:
for e in tuple(stmt.get_children()):
    print(e.name)
    

# Join without Relationship

In [None]:
joined = sa.select(OrderLine.id, OrderLine.description).join(Product, Product.id == OrderLine.product_id)

with orm.Session(engine) as s:
    for entry in s.execute(joined):
        print(entry)
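
The cell above relies on `Product` and `OrderLine` models that are not defined in this notebook. The following standalone sketch assumes one possible shape for them: `product_id` is a plain integer column with no `ForeignKey` and no `relationship`, so the join condition has to be spelled out explicitly.

```python
import sqlalchemy as sa
from sqlalchemy import orm

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Product(Base):
    __tablename__ = 'products'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255))

class OrderLine(Base):
    __tablename__ = 'order_lines'
    id = sa.Column(sa.Integer, primary_key=True)
    description = sa.Column(sa.String(255))
    # Deliberately no ForeignKey: the ORM knows nothing about this link.
    product_id = sa.Column(sa.Integer)

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

with orm.Session(engine) as session:
    session.add_all([
        Product(id=1, name='widget'),
        OrderLine(description='2x widget', product_id=1),
        OrderLine(description='unknown item', product_id=99),
    ])
    session.commit()

# Without keys, the ON clause must be passed as the second argument to join().
joined = (
    sa.select(OrderLine.description, Product.name)
    .join(Product, Product.id == OrderLine.product_id)
    .order_by(OrderLine.id)
)

with orm.Session(engine) as session:
    rows = [tuple(row) for row in session.execute(joined)]
print(rows)
```

Because this is an inner join, the order line that points at a non-existent product is left out of the result.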

# Cascade: ForeignKey vs relationship
SQLAlchemy can deal with cascades in two ways:
1. ForeignKey definition
2. Relationship definition
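
A minimal standalone sketch contrasting the two, using hypothetical `Parent` and `Child` models. The `ondelete='CASCADE'` argument only changes the generated DDL and is enforced by the database, while `cascade='all, delete-orphan'` makes the Session delete the children itself. SQLite does not enforce foreign keys unless they are switched on, so in this sketch the ORM cascade is what removes the rows.

```python
import sqlalchemy as sa
from sqlalchemy import orm
from sqlalchemy.schema import CreateTable

# Assumption: a throwaway Base and engine, separate from the notebook's own.
Base = orm.declarative_base()

class Parent(Base):
    __tablename__ = 'parents'
    id = sa.Column(sa.Integer, primary_key=True)
    # 2. Relationship definition: the Session deletes the children itself.
    children = orm.relationship('Child', cascade='all, delete-orphan')

class Child(Base):
    __tablename__ = 'children'
    id = sa.Column(sa.Integer, primary_key=True)
    # 1. ForeignKey definition: rendered into the DDL for the database to enforce.
    parent_id = sa.Column(sa.Integer, sa.ForeignKey('parents.id', ondelete='CASCADE'))

ddl = str(CreateTable(Child.__table__))
print(ddl)

engine = sa.create_engine('sqlite://')
Base.metadata.create_all(engine)

def count_children(session):
    return session.execute(sa.select(sa.func.count()).select_from(Child)).scalar_one()

with orm.Session(engine) as session:
    session.add(Parent(children=[Child(), Child()]))
    session.commit()
    before = count_children(session)

    parent = session.execute(sa.select(Parent)).scalars().one()
    session.delete(parent)  # the ORM cascade also deletes both children
    session.commit()
    remaining = count_children(session)

print(before, remaining)
```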



In [None]:
SQL