# SQLAlchemy ORM

![ORM Layer](./images/orm_layer.png)

ORM stands for Object Relational Mapper: a layer that maps database rows to Python objects. When programming, we often prefer to work with objects rather than primitive types; the price is some flexibility and some transparency into the underlying SQL.

As a general rule, Core is better suited for analytical queries where we expect to get back many rows, while the ORM is better suited for applications that usually work with one row or a handful of rows at a time.

## Defining the tables

As we're using the ORM, we need to define the objects that will map to the database. There are a few different ways to do this [mapping](https://docs.sqlalchemy.org/en/14/orm/mapping_styles.html#mapping-python-classes) in SQLAlchemy. The classic way is to create a Base class, and inherit from that.

In [1]:
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class MyClass(Base):
    __tablename__ = "demo_table"
    
    # Note that ORM classes must define at least one primary_key
    class_id: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.String)

One of the changes introduced in SQLAlchemy 1.4, on the road to 2.0, is the ability to register classes through a decorator, which can feel more in line with `dataclass` and `attrs` based classes.

In [2]:
from sqlalchemy.orm import registry
import enum

mapper_registry = registry()

In [3]:
class StatusEnum(str, enum.Enum):
    gold = "gold"
    silver = "silver"
    bronze = "bronze"

In [4]:
@mapper_registry.mapped
class Address:
    __tablename__ = "addresses"
    
    address_id: int = sa.Column(sa.Integer, primary_key=True)
    street_name: str = sa.Column(sa.VARCHAR(50))
    street_number: int = sa.Column(sa.Integer)
    postnr: str = sa.Column(sa.VARCHAR(4))
    
    def __repr__(self):
        return f"<Address street_name={self.street_name} street_number={self.street_number} postnr={self.postnr}>"

The ORM layer autogenerates a SQLAlchemy Table, the same one we saw in Core

In [5]:
Address.__table__

Table('addresses', MetaData(), Column('address_id', Integer(), table=<addresses>, primary_key=True, nullable=False), Column('street_name', VARCHAR(length=50), table=<addresses>), Column('street_number', Integer(), table=<addresses>), Column('postnr', VARCHAR(length=4), table=<addresses>), schema=None)

Note that in both instances, we're not defining an `__init__` - SQLAlchemy will automatically generate one, though we can always add one if we want to run some extra logic.
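
As a minimal sketch of the generated constructor: the default `__init__` accepts mapped attribute names as keyword arguments and rejects anything else, while a hand-written `__init__` simply takes its place. The `Note` and `LoudNote` classes and the `sketch_registry` name are hypothetical and not part of this notebook's models; type annotations are left out for brevity.

```python
import sqlalchemy as sa
from sqlalchemy.orm import registry

# Hypothetical registry, kept separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Note:
    __tablename__ = "sketch_notes"

    note_id = sa.Column(sa.Integer, primary_key=True)
    text = sa.Column(sa.String)


@sketch_registry.mapped
class LoudNote:
    __tablename__ = "sketch_loud_notes"

    note_id = sa.Column(sa.Integer, primary_key=True)
    text = sa.Column(sa.String)

    # A custom constructor replaces the generated one
    def __init__(self, text):
        self.text = text.upper()


note = Note(text="hello")  # generated constructor: keyword arguments only
loud = LoudNote("hello")   # our own constructor: positional argument

try:
    Note(colour="red")     # "colour" is not a mapped attribute
    rejected = False
except TypeError:
    rejected = True
```

Attributes that were not passed in, such as `note.note_id`, stay `None` until the database assigns a value.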

Let's finish our models - we can add a Purchase object and a Customer object and relate them:

In [6]:
import decimal
from sqlalchemy.orm import relationship

@mapper_registry.mapped
class Purchase:
    __tablename__ = "purchases"
    
    purchase_id: int = sa.Column(sa.Integer, primary_key=True)
    item_name: str = sa.Column(sa.VARCHAR(200))
    price: decimal.Decimal = sa.Column(sa.Numeric(19, 4))
    user_id: int = sa.Column(sa.Integer, sa.ForeignKey("customers.customer_id"))
    
    def __repr__(self):
        return f"<Purchase item_name={self.item_name}>"

In [7]:
@mapper_registry.mapped
class Customer:
    __tablename__ = "customers"
    
    customer_id: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.VARCHAR(50), unique=True)
    status: str = sa.Column(sa.Enum(StatusEnum))
    address_id: int = sa.Column(sa.Integer, sa.ForeignKey("addresses.address_id"))
    
    # Many-to-one by default (the foreign key lives on customers); we treat it as one address per customer
    address: Address = relationship("Address", backref="customer")
    
    # One-to-many
    purchases: list[Purchase] = relationship("Purchase", backref="customer")
    
    def __repr__(self):
        return f"<Customer name={self.name}>"

A relationship allows us to use attributes to select a related collection - essentially selecting the relevant rows from the other table.
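
The direction of a relationship decides whether the attribute holds a single object or a list. The sketch below uses hypothetical `Team`, `Locker` and `Player` classes (not part of this notebook's models) to show that the side holding the foreign key gets a single object, that a plain `backref` produces a list on the other side, and that passing `uselist=False` to `backref()` is one way to get a single object on both sides.

```python
import sqlalchemy as sa
from sqlalchemy.orm import backref, registry, relationship

# Hypothetical registry, kept separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Team:
    __tablename__ = "sketch_teams"
    team_id = sa.Column(sa.Integer, primary_key=True)


@sketch_registry.mapped
class Locker:
    __tablename__ = "sketch_lockers"
    locker_id = sa.Column(sa.Integer, primary_key=True)


@sketch_registry.mapped
class Player:
    __tablename__ = "sketch_players"
    player_id = sa.Column(sa.Integer, primary_key=True)
    team_id = sa.Column(sa.Integer, sa.ForeignKey("sketch_teams.team_id"))
    locker_id = sa.Column(sa.Integer, sa.ForeignKey("sketch_lockers.locker_id"))

    # Default backref: the reverse side is a list (one team, many players)
    team = relationship("Team", backref="players")
    # uselist=False on the backref makes the reverse side a single object
    locker = relationship("Locker", backref=backref("player", uselist=False))


team, locker = Team(), Locker()
# Setting one side of the relationship also populates the other side in memory
player = Player(team=team, locker=locker)
```

No database is involved here: the two sides are kept in sync in Python as soon as the attribute is set.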

To demonstrate, let's start by creating the tables and inserting some data

In [8]:
# If you have Docker installed and haven't already started the container, uncomment the next line
# !docker run -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres
conn_string = "postgresql://postgres:postgres@localhost:5432"
# Otherwise, use the sqlite conn_string
# conn_string = "sqlite:///parking.db"

In [9]:
engine = sa.create_engine(conn_string, future=True)

Since ORM builds on top of Core, we still use the engine and the metadata as we did before

In [10]:
mapper_registry.metadata.create_all(engine)
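
A small sketch of what `create_all` does, assuming an in-memory SQLite engine and a hypothetical `Widget` class so that it runs without the Postgres container: tables that already exist are skipped, so calling it more than once is harmless.

```python
import sqlalchemy as sa
from sqlalchemy.orm import registry

# Hypothetical registry, kept separate from mapper_registry
sketch_registry = registry()


@sketch_registry.mapped
class Widget:
    __tablename__ = "sketch_widgets"
    widget_id = sa.Column(sa.Integer, primary_key=True)


# In-memory SQLite is an assumption to keep the sketch self-contained
sketch_engine = sa.create_engine("sqlite://")

sketch_registry.metadata.create_all(sketch_engine)
# Calling it again does nothing new: existing tables are skipped
sketch_registry.metadata.create_all(sketch_engine)

# The inspector reports which tables actually exist in the database
table_names = sa.inspect(sketch_engine).get_table_names()
```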

In [11]:
john = Customer(name="John", status="gold")
jane = Customer(name="Jane", status="bronze")

When working with the ORM, we use a Session instead of a connection. The session knows how to work with ORM-enabled classes, and serves as a local map of the various instances, keeping track of which instances have changes to be sent to the database, which instances are new and which are current. 

In [12]:
from sqlalchemy.orm import Session

In [13]:
with Session(engine) as session:
    session.add(john)
    session.add(jane)
    # Still have to actively commit
    session.commit()
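
To make the bookkeeping visible, the sketch below inspects `session.new` and `session.dirty`, the collections a session uses for pending inserts and pending updates. The `Item` class and the in-memory SQLite engine are assumptions made to keep the example self-contained.

```python
import sqlalchemy as sa
from sqlalchemy.orm import Session, registry

# Hypothetical registry and model, separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Item:
    __tablename__ = "sketch_items"
    item_id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


sketch_engine = sa.create_engine("sqlite://")
sketch_registry.metadata.create_all(sketch_engine)

with Session(sketch_engine) as session:
    item = Item(name="potion")
    session.add(item)
    pending = item in session.new        # waiting for an INSERT
    session.commit()

    still_pending = item in session.new  # already written, so no longer "new"
    item.name = "elixir"
    modified = item in session.dirty     # waiting for an UPDATE
    session.commit()

    # Read the value back from the database to confirm the UPDATE happened
    stored_name = session.execute(sa.select(Item.name)).scalar_one()
```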

Let's add an address to John's account

In [14]:
# postnr is a VARCHAR column, so we pass it as a string
address = Address(street_name="Bogholder Allè", street_number=15, postnr="2720")

In [15]:
john.address = address

In [16]:
with Session(engine) as session:
    session.add(john)
    session.commit()

John now goes shopping

In [17]:
potion = Purchase(item_name="Magic Potion", price=20.00, customer=john)

In [18]:
with Session(engine) as session:
    session.add(potion)
    session.commit()
    print(f"{potion.customer.name} bought {potion.item_name}")
    print(f"{potion.customer.name} lives at {potion.customer.address.street_name}")

John bought Magic Potion
John lives at Bogholder Allè


Let's add one more purchase:

In [19]:
magic_hat = Purchase(item_name="Magic Hat", price=100)

In [20]:
with Session(engine) as session:
    # Re-attach john to this session so that changes to him are tracked
    session.add(john)
    # purchases is a one-to-many relationship, so SQLAlchemy represents it as a list.
    # Appending cascades magic_hat into the session, so no second add() is needed
    john.purchases.append(magic_hat)
    session.commit()
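
To check that related rows really end up in the database, we can read the collection back in a fresh session. The sketch below does this with hypothetical `Shopper` and `Order` classes on an in-memory SQLite engine (both assumptions, so that it runs on its own); only the parent is added to the session and the children follow through the relationship.

```python
import sqlalchemy as sa
from sqlalchemy.orm import Session, registry, relationship

# Hypothetical registry and models, separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Order:
    __tablename__ = "sketch_orders"
    order_id = sa.Column(sa.Integer, primary_key=True)
    item_name = sa.Column(sa.String)
    shopper_id = sa.Column(sa.Integer, sa.ForeignKey("sketch_shoppers.shopper_id"))


@sketch_registry.mapped
class Shopper:
    __tablename__ = "sketch_shoppers"
    shopper_id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    orders = relationship("Order", backref="shopper")


sketch_engine = sa.create_engine("sqlite://")
sketch_registry.metadata.create_all(sketch_engine)

with Session(sketch_engine) as session:
    shopper = Shopper(name="John")
    shopper.orders.append(Order(item_name="Magic Potion"))
    shopper.orders.append(Order(item_name="Magic Hat"))
    # Only the parent is added; the orders are saved along with it
    session.add(shopper)
    session.commit()

with Session(sketch_engine) as session:
    loaded = session.execute(sa.select(Shopper).filter_by(name="John")).scalar_one()
    # Accessing the attribute loads the related rows from sketch_orders
    order_names = sorted(order.item_name for order in loaded.orders)
```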

Now that we have some data, how do we select it from the database? The same way as in Core!

In [21]:
sql = sa.select(Customer).filter_by(name="Jane")
print(sql)

SELECT customers.customer_id, customers.name, customers.status, customers.address_id 
FROM customers 
WHERE customers.name = :name_1


In [22]:
with Session(engine) as session:
    jane = session.execute(sql).one_or_none()

In [23]:
jane

(<Customer name=Jane>,)

The result of our query is a `Row` object, the same as in Core. In ORM mode, though, we're usually interested in the scalar result: the value in the first column of each row.

SQLAlchemy supports this through the `scalars()` modifier, as well as `scalar_*` helpers such as `scalar_one_or_none()`.

In [24]:
with Session(engine) as session:
    jane = session.execute(sql).scalars().one_or_none()

In [25]:
jane

<Customer name=Jane>

In [26]:
with Session(engine) as session:
    jane = session.execute(sql).scalar_one_or_none()

In [27]:
jane

<Customer name=Jane>
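
The difference between rows and scalars, and the behaviour of the `one_or_none` family, can be sketched side by side. The `Pet` class and the in-memory SQLite engine are assumptions made so the example is self-contained.

```python
import sqlalchemy as sa
from sqlalchemy.exc import MultipleResultsFound
from sqlalchemy.orm import Session, registry

# Hypothetical registry and model, separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Pet:
    __tablename__ = "sketch_pets"
    pet_id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


sketch_engine = sa.create_engine("sqlite://")
sketch_registry.metadata.create_all(sketch_engine)

with Session(sketch_engine) as session:
    session.add_all([Pet(name="Rex"), Pet(name="Rex"), Pet(name="Tiger")])
    session.commit()

    all_rex = sa.select(Pet).filter_by(name="Rex")
    rows = session.execute(all_rex).all()            # list of one-element Row objects
    pets = session.execute(all_rex).scalars().all()  # list of Pet instances

    # No match: the helper returns None instead of raising
    nobody = session.execute(
        sa.select(Pet).filter_by(name="Nobody")
    ).scalar_one_or_none()

    # More than one match: the helper raises
    try:
        session.execute(all_rex).scalar_one_or_none()
        too_many = False
    except MultipleResultsFound:
        too_many = True
```

In other words, `one_or_none` style helpers are meant for queries that should match at most one row; for anything else, `scalars().all()` is the safer choice.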

If we know the primary key, `Session.get` offers an efficient lookup: it returns the instance straight from the session if it is already loaded there, and only queries the database otherwise.

In [28]:
with Session(engine) as session:
    jane2 = session.get(Customer, jane.customer_id)

In [29]:
jane2

<Customer name=Jane>
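
A rough way to see why the primary key lookup is cheap is to record the statements sent to the database. In the sketch below, the `Gadget` class, the in-memory SQLite engine and the `record_statement` listener are all assumptions for illustration; the second `get` for the same key is answered by the session without a new statement.

```python
import sqlalchemy as sa
from sqlalchemy import event
from sqlalchemy.orm import Session, registry

# Hypothetical registry and model, separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Gadget:
    __tablename__ = "sketch_gadgets"
    gadget_id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


sketch_engine = sa.create_engine("sqlite://")
sketch_registry.metadata.create_all(sketch_engine)

with Session(sketch_engine) as session:
    session.add(Gadget(gadget_id=1, name="lamp"))
    session.commit()

# Record every statement sent to the database from here on
statements = []


@event.listens_for(sketch_engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)


with Session(sketch_engine) as session:
    first = session.get(Gadget, 1)
    sent_after_first = len(statements)
    second = session.get(Gadget, 1)     # same key: answered by the session itself
    sent_after_second = len(statements)
    missing = session.get(Gadget, 999)  # unknown primary key
```

Both lookups return the very same Python object, and an unknown key yields `None` rather than an exception.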

## ORMs are classes

The nice thing about ORM models is that they're just classes: you can add whatever methods you want, share columns through mixins, and use other familiar patterns.

In [30]:
import datetime as dt
from sqlalchemy.ext.hybrid import hybrid_property

class CreatedMixin:
    last_updated: dt.datetime = sa.Column(sa.DateTime, default=sa.func.now(), onupdate=sa.func.now())
    created_at: dt.datetime = sa.Column(sa.DateTime, default=sa.func.now())

@mapper_registry.mapped
class User(CreatedMixin):
    __tablename__ = "users"
    __table_args__ = {"extend_existing": True} # A workaround for us being in a Notebook
    
    primary: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.String)
    role: str = sa.Column(sa.String)
    purchases: int = sa.Column(sa.Integer, default=0)

In [31]:
list(User.__table__.columns)

[Column('last_updated', DateTime(), table=<users>, onupdate=ColumnDefault(<sqlalchemy.sql.functions.now at 0x27631496190; now>), default=ColumnDefault(<sqlalchemy.sql.functions.now at 0x27631496250; now>)),
 Column('created_at', DateTime(), table=<users>, default=ColumnDefault(<sqlalchemy.sql.functions.now at 0x276314921c0; now>)),
 Column('primary', Integer(), table=<users>, primary_key=True, nullable=False),
 Column('name', String(), table=<users>),
 Column('role', String(), table=<users>),
 Column('purchases', Integer(), table=<users>, default=ColumnDefault(0))]

The User table has inherited all the columns, as we expected. This pattern can reduce boilerplate when using ORMs
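
The saving becomes clearer once the mixin is shared by more than one model. A minimal sketch, using a hypothetical `TimestampMixin` with `Article` and `Comment` classes that are not part of this notebook: each table receives its own copy of the mixin's column.

```python
import sqlalchemy as sa
from sqlalchemy.orm import registry

# Hypothetical registry, kept separate from the tutorial's tables
sketch_registry = registry()


class TimestampMixin:
    # Hypothetical mixin, mirroring CreatedMixin above
    created_at = sa.Column(sa.DateTime, default=sa.func.now())


@sketch_registry.mapped
class Article(TimestampMixin):
    __tablename__ = "sketch_articles"
    article_id = sa.Column(sa.Integer, primary_key=True)


@sketch_registry.mapped
class Comment(TimestampMixin):
    __tablename__ = "sketch_comments"
    comment_id = sa.Column(sa.Integer, primary_key=True)


# Each table lists the mixin column next to its own columns
article_columns = set(Article.__table__.columns.keys())
comment_columns = set(Comment.__table__.columns.keys())
```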

In [32]:
user = User(name="Jade", role="admin")

Since `User` is a regular class, we can add a `classmethod` as an alternative constructor alongside the regular one.

In [33]:
@mapper_registry.mapped
class User(CreatedMixin):
    __tablename__ = "users"
    __table_args__ = {"extend_existing": True} 
    
    primary: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.String)
    role: str = sa.Column(sa.String)
    purchases: int = sa.Column(sa.Integer, default=0)
    
    @classmethod
    def from_dict(cls, data):
        return cls(name=data["UserName"], role="public", purchases=0)


In [34]:
user = User.from_dict({"UserName": "Jarvis"})

In [35]:
@mapper_registry.mapped
class User(CreatedMixin):
    __tablename__ = "users"
    __table_args__ = {"extend_existing": True} 
    
    primary: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.String)
    role: str = sa.Column(sa.String)
    purchases: int = sa.Column(sa.Integer, default=0)
    
    @classmethod
    def from_dict(cls, data):
        return cls(name=data["UserName"], role="public", purchases=0)
    
    @property
    def is_admin(self):
        return self.role == "admin"


The instances have the defined property, just like we're used to

In [36]:
user = User(name="Jade", role="public")
user.is_admin

False

If we want to, we can also use the property in our queries by defining it as a `hybrid_property`. This lets us write `User.is_admin` to generate a SQL expression.

In [37]:
@mapper_registry.mapped
class User(CreatedMixin):
    __tablename__ = "users"
    __table_args__ = {"extend_existing": True} 
    
    primary: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.String)
    role: str = sa.Column(sa.String)
    purchases: int = sa.Column(sa.Integer, default=0)
    
    @classmethod
    def from_dict(cls, data):
        return cls(name=data["UserName"], role="public")
    
    @hybrid_property
    def is_admin(self):
        return self.role == "admin"


Let's create the tables and try it out with some SQL.

In [38]:
mapper_registry.metadata.create_all(engine)

First, we create and add a user we can play with

In [39]:
user = User(name="Jade", role="admin")

In [40]:
session = Session(engine)
session.add(user)
session.commit()

In [41]:
sql = sa.select(User).where(User.is_admin)

In [42]:
print(sql)

SELECT users.last_updated, users.created_at, users."primary", users.name, users.role, users.purchases 
FROM users 
WHERE users.role = :role_1


Notice that the SQL statement includes the condition from our property in the WHERE clause.

Let's also verify the mixin defaults, while we're at it

In [43]:
admin_user = session.execute(sql).scalar_one_or_none()
print(f"Last updated: {admin_user.last_updated:%H:%M:%S}")

Last updated: 08:05:37


In [48]:
admin_user.name = "Jade Smith"
session.add(admin_user)
session.commit()
print(f"Last updated: {admin_user.last_updated:%H:%M:%S}")

Last updated: 08:07:18


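Sometimes the SQL logic and the Python logic differ and need to be written in two different ways. Python's `in` operator, for example, cannot be turned into SQL, so the query side has to call the column's `in_()` method instead. Each `hybrid_property` can define an `expression` that is used whenever the property appears inside a SQL statement.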

In [49]:
@mapper_registry.mapped
class User(CreatedMixin):
    __tablename__ = "users"
    __table_args__ = {"extend_existing": True} 
    
    primary: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.String)
    role: str = sa.Column(sa.String)
    purchases: int = sa.Column(sa.Integer, default=0)
    
    @classmethod
    def from_dict(cls, data):
        return cls(name=data["UserName"], role="public")
    
    @hybrid_property
    def is_admin(self):
        return self.role == "admin"
    
    @hybrid_property
    def is_validated(self):
        return self.role in ["public", "admin"]
        
    @is_validated.expression
    def is_validated(cls):
        return cls.role.in_(["public", "admin"])



In [50]:
user = User(name="Jade", role="public")

In [51]:
user.is_validated

True

In [52]:
sql = sa.select(User).where(User.is_validated)
print(sql)

SELECT users.last_updated, users.created_at, users."primary", users.name, users.role, users.purchases 
FROM users 
WHERE users.role IN (__[POSTCOMPILE_role_1])
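
The `__[POSTCOMPILE_role_1]` placeholder stands for the list of values, which is filled in when the statement is executed. For inspection only, the statement can be compiled with `literal_binds` to see the values inline. The sketch below assumes a hypothetical `Account` class with the same hybrid as above.

```python
import sqlalchemy as sa
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import registry

# Hypothetical registry and model, separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Account:
    __tablename__ = "sketch_accounts"
    account_id = sa.Column(sa.Integer, primary_key=True)
    role = sa.Column(sa.String)

    @hybrid_property
    def is_validated(self):
        # Instance level: plain Python membership test
        return self.role in ["public", "admin"]

    @is_validated.expression
    def is_validated(cls):
        # Class level: SQL IN expression
        return cls.role.in_(["public", "admin"])


stmt = sa.select(Account.account_id).where(Account.is_validated)
# literal_binds inlines parameter values; meant for inspection, not for execution
rendered = str(stmt.compile(compile_kwargs={"literal_binds": True}))
instance_check = Account(role="public").is_validated
```

Printing `rendered` shows the `IN` clause with the two role names written out, which makes it easier to compare the SQL side with the Python side of the hybrid.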


In [53]:
validated_users = session.execute(sql).scalars().all()

In [54]:
validated_users[0].name

'Jade Smith'

So we now have Python logic mapped to both SQL and our local Python instance. So far it's been a simple property; what about methods that take arguments?

In [55]:
from sqlalchemy.ext.hybrid import hybrid_method

@mapper_registry.mapped
class User(CreatedMixin):
    __tablename__ = "users"
    __table_args__ = {"extend_existing": True} 
    
    primary: int = sa.Column(sa.Integer, primary_key=True)
    name: str = sa.Column(sa.String)
    role: str = sa.Column(sa.String)
    purchases: int = sa.Column(sa.Integer, default=0)
    
    @classmethod
    def from_dict(cls, data):
        return cls(name=data["UserName"], role="public")
    
    @hybrid_property
    def is_admin(self):
        return self.role == "admin"
    
    @hybrid_property
    def is_validated(self):
        return self.role in ["public", "admin"]
        
    @is_validated.expression
    def is_validated(cls):
        return cls.role.in_(["public", "admin"])

    def purchase(self, session: Session, item_cost: int) -> int:
        self.purchases += item_cost
        session.add(self)
        return self.purchases
    
    @hybrid_method
    def calculate_roi(self, total_cost: int) -> float:
        return (self.purchases - total_cost) / total_cost


We've added a regular `purchase` method, so let's try that first:

In [56]:
user = User(name="Jane", role="admin", purchases=0)

In [57]:
user.purchase(session, 2_000)
session.commit()

In [58]:
user.purchases

2000

What if we want to use a calculation inside our SQL query? That's what `hybrid_method` is for: it follows the same rules as `hybrid_property`, but accepts arguments.

In [59]:
user.calculate_roi(total_cost=1000)

1.0

In [60]:
sql = sa.select(User).where(User.calculate_roi(total_cost=1000) >= 1)

print(sql)

SELECT users.last_updated, users.created_at, users."primary", users.name, users.role, users.purchases 
FROM users 
WHERE (users.purchases - :purchases_1) / :param_1 >= :param_2


In [61]:
print([(user.name, user.calculate_roi(total_cost=1000)) for user in session.execute(sql).scalars()])

[('Jane', 1.0)]
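
One thing to watch for with arithmetic hybrids: Python's `/` is always true division, while dividing two integers in SQL may truncate, depending on the backend and the SQLAlchemy version. A possible safeguard, sketched below with a hypothetical `Shop` class on in-memory SQLite (both assumptions), is to give the `hybrid_method` its own `expression` that casts to a float before dividing.

```python
import sqlalchemy as sa
from sqlalchemy.ext.hybrid import hybrid_method
from sqlalchemy.orm import Session, registry

# Hypothetical registry and model, separate from the tutorial's tables
sketch_registry = registry()


@sketch_registry.mapped
class Shop:
    __tablename__ = "sketch_shops"
    shop_id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    revenue = sa.Column(sa.Integer, default=0)

    @hybrid_method
    def calculate_roi(self, total_cost):
        # Python side: "/" is always true division
        return (self.revenue - total_cost) / total_cost

    @calculate_roi.expression
    def calculate_roi(cls, total_cost):
        # SQL side: cast first so the database does not use integer division
        return sa.cast(cls.revenue - total_cost, sa.Float) / total_cost


sketch_engine = sa.create_engine("sqlite://")
sketch_registry.metadata.create_all(sketch_engine)

with Session(sketch_engine) as session:
    session.add_all([Shop(name="north", revenue=1500), Shop(name="south", revenue=1200)])
    session.commit()

    stmt = sa.select(Shop).where(Shop.calculate_roi(total_cost=1000) >= 0.5)
    matches = [
        (shop.name, shop.calculate_roi(total_cost=1000))
        for shop in session.execute(stmt).scalars()
    ]
```

With the cast in place, the filter in SQL and the value computed in Python agree on fractional results such as 0.5.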


In [62]:
session.close()

## Exercise

Jane, our customer from the first example, has just updated her profile to add her address, `Copenhagen Midtown, 2100`. She then bought a `Dungeons and Dragons` item for 200, followed by some `Dice` for 50. Record these changes in the database.

Finally, find out how much our shop has sold in total, as well as the average purchase price per customer.