# SQLAlchemy: Part 4

| Key              | Value                                                                                                                                                                                                                                                                                                                                                          |
|:-----------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Course Codes** | DAT 2201, DAT 3103, BBT 3104, MCS 8104, MIT 8107, BBT 4106                                                                                                                                                                                                                                                                                                     |
| **Course Names** | DAT 2201: Database Design and SQL (Week 1-3 of 13), <br/>DAT 3103: Principles of Data Engineering (Week 1-3 of 13), <br/>BBT 3104: Advanced Database Systems (Week 7-9 of 13), <br/>MCS 8104: Database Management Systems (Week 1-3 of 13), <br/>MIT 8107: Advanced Database Systems (Week 1-3 of 13), <br/>BBT 4106: Business Intelligence I (Week 4-6 of 13) |
| **Semester**     | May to July 2026                                                                                                                                                                                                                                                                                                                                               |
| **Lecturer**     | Allan Omondi                                                                                                                                                                                                                                                                                                                                                   |
| **Contact**      | aomondi@strathmore.edu                                                                                                                                                                                                                                                                                                                                         |
| **Note**         | The lecture contains both theory and practice.<br/>This notebook forms part of the practice.<br/>It is intended for educational purposes only.<br/>Recommended citation: [BibTeX](https://raw.githubusercontent.com/course-files/ObjectRelationalMapping/refs/heads/main/RecommendedCitation.bib)                                                               |


## The Declarative Base and Sessionmaker
- `declarative_base` takes over the role of the explicit metadata object used in SQLAlchemy Core. It returns the base class for all ORM-mapped classes. This base class maintains a catalog of the classes and tables mapped to it, and the underlying metadata remains available as `Base.metadata`.
- `sessionmaker` takes over the role of the connection object used in SQLAlchemy Core. It is a factory for creating new `Session` objects, which are used to interact with the database in an ORM context.
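
The two roles described above can be sketched in a few lines. This is a minimal illustration only: the `Note` class, the `DemoBase` and `DemoSession` names, and the in-memory SQLite database are assumptions made for the sketch and are not part of the notebook's schema.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

# A separate base so that this sketch does not touch the notebook's own models
DemoBase = declarative_base()


class Note(DemoBase):  # hypothetical table, used only for illustration
    __tablename__ = "note"
    id = Column(Integer, primary_key=True)
    text = Column(String(100), nullable=False)


# The base keeps a catalog of every table mapped to it
registered_tables = list(DemoBase.metadata.tables)
print(registered_tables)

# An in-memory SQLite database (an assumption that keeps the sketch self-contained)
demo_engine = create_engine("sqlite+pysqlite:///:memory:")
DemoBase.metadata.create_all(demo_engine)

# sessionmaker returns a factory; calling the factory returns a Session
DemoSession = sessionmaker(bind=demo_engine)

with DemoSession() as demo_session:
    demo_session.add(Note(text="hello"))
    demo_session.commit()
    note_count = demo_session.query(Note).count()

print(note_count)
```

Using the session as a context manager closes it when the block ends, which is the same pattern the later cells in this notebook follow.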

In [202]:
from sqlalchemy.orm import declarative_base, sessionmaker, relationship

In [203]:
from sqlalchemy import create_engine
from sqlalchemy import Column, Integer, String, Float
from sqlalchemy import ForeignKey, func
import pandas as pd

### An SQLite Engine

In [204]:
engine = create_engine('sqlite+pysqlite:///mydatabase.db', echo=True)
Session = sessionmaker(bind=engine)
session = Session()

### A MySQL Engine

In [205]:
# engine = create_engine('mysql+pymysql://root:5trathm0re@127.0.0.1:3307/siwaka_dishes', echo=True)
# Session = sessionmaker(bind=engine)
# session = Session()

### A PostgreSQL Engine

- We can set the schema when creating the engine rather than editing the declarative base later.
- The `%3d` in the URL is the URL-encoded `=` sign.

In [206]:
# engine = create_engine('postgresql+psycopg2://postgres:5trathm0re@127.0.0.1:5433/postgres?options=-csearch_path%3dsiwaka_dishes', echo=True)
# Session = sessionmaker(bind=engine)
# session = Session()
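
The encoding of the `options` value can be checked with the standard library. This is a small sketch, not part of the original notebook; the `user:password` credentials are placeholders, and no database connection is attempted.

```python
from urllib.parse import quote, unquote

schema_name = "siwaka_dishes"

# The raw value that PostgreSQL should receive through the `options` parameter
raw_options = f"-csearch_path={schema_name}"

# safe="" forces every reserved character, including "=", to be percent-encoded
encoded_options = quote(raw_options, safe="")
print(encoded_options)

# Placeholder credentials; the URL is only built here, never used to connect
demo_url = (
    "postgresql+psycopg2://user:password@127.0.0.1:5433/postgres"
    f"?options={encoded_options}"
)
print(demo_url)

# Percent-encoding is case-insensitive, so %3d and %3D both decode to "="
print(unquote("%3d"), unquote("%3D"))
```

`quote` emits the upper-case form `%3D`; the lower-case `%3d` used in the cell above decodes to the same character.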

- When you use `back_populates`, SQLAlchemy understands that two specific `relationship()` definitions are mirror images of each other.
- If you append an object to one side of the relationship, SQLAlchemy automatically adds (or "populates") the object to the collection on the other side in memory.
- This ensures that the state of your Python objects remains consistent before they are committed to the database.
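
The in-memory synchronisation described above can be observed without a database. The sketch below uses simplified, hypothetical models (`Dish` and `DishIngredientLink`) rather than the notebook's own classes, and no engine or session is involved.

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

DemoBase = declarative_base()


class Dish(DemoBase):  # hypothetical, simplified stand-in for SideDish
    __tablename__ = "dish"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    links = relationship("DishIngredientLink", back_populates="dish")


class DishIngredientLink(DemoBase):  # hypothetical association class
    __tablename__ = "dish_ingredient_link"
    id = Column(Integer, primary_key=True)
    dish_id = Column(Integer, ForeignKey("dish.id"), nullable=False)
    dish = relationship("Dish", back_populates="links")


dish = Dish(name="Avocado")

# Appending on the "one" side populates the "many" side's scalar attribute
first_link = DishIngredientLink()
dish.links.append(first_link)
print(first_link.dish is dish)

# Setting the scalar attribute populates the collection on the other side
second_link = DishIngredientLink(dish=dish)
print(second_link in dish.links)
```

Nothing has been flushed at this point, so the foreign key column `dish_id` is still unset; only the Python objects have been kept consistent.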

In [207]:
# Instead of creating a metadata object, we create a declarative base for the SQLAlchemy ORM
Base = declarative_base()

class SideDish(Base):
    __tablename__ = 'side_dish'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    price = Column(Float)
    side_dish_and_side_dish_ingredients = relationship(
        "SideDishAndSideDishIngredient",
        back_populates="side_dish"
    )

class SideDishIngredient(Base):
    __tablename__ = 'side_dish_ingredient'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    side_dish_and_side_dish_ingredients = relationship(
        "SideDishAndSideDishIngredient",
        back_populates="side_dish_ingredient"
    )

class SideDishAndSideDishIngredient(Base):
    __tablename__ = 'side_dish_and_side_dish_ingredient'
    side_dish_id = Column(Integer, ForeignKey('side_dish.id'), nullable=False, primary_key=True)
    side_dish_ingredient_id = Column(Integer, ForeignKey('side_dish_ingredient.id'), nullable=False, primary_key=True)
    side_dish = relationship("SideDish", back_populates="side_dish_and_side_dish_ingredients")
    side_dish_ingredient = relationship("SideDishIngredient", back_populates="side_dish_and_side_dish_ingredients")

Base.metadata.create_all(engine)

2025-12-10 16:51:25,494 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,496 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("side_dish")
2025-12-10 16:51:25,497 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-12-10 16:51:25,498 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("side_dish_ingredient")
2025-12-10 16:51:25,500 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-12-10 16:51:25,502 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("side_dish_and_side_dish_ingredient")
2025-12-10 16:51:25,502 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-12-10 16:51:25,504 INFO sqlalchemy.engine.Engine COMMIT


## Truncate the Data in the Tables to Start Afresh

In [208]:
with Session() as session:
    session.query(SideDishAndSideDishIngredient).delete()
    session.commit()

with Session() as session:
    session.query(SideDish).delete()
    session.commit()

with Session() as session:
    session.query(SideDishIngredient).delete()
    session.commit()

2025-12-10 16:51:25,526 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,527 INFO sqlalchemy.engine.Engine DELETE FROM side_dish_and_side_dish_ingredient
2025-12-10 16:51:25,528 INFO sqlalchemy.engine.Engine [generated in 0.00105s] ()
2025-12-10 16:51:25,530 INFO sqlalchemy.engine.Engine COMMIT
2025-12-10 16:51:25,536 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,538 INFO sqlalchemy.engine.Engine DELETE FROM side_dish
2025-12-10 16:51:25,538 INFO sqlalchemy.engine.Engine [generated in 0.00055s] ()
2025-12-10 16:51:25,540 INFO sqlalchemy.engine.Engine COMMIT
2025-12-10 16:51:25,544 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,545 INFO sqlalchemy.engine.Engine DELETE FROM side_dish_ingredient
2025-12-10 16:51:25,546 INFO sqlalchemy.engine.Engine [generated in 0.00080s] ()
2025-12-10 16:51:25,548 INFO sqlalchemy.engine.Engine COMMIT


## Confirm that the Tables are Empty

### `side_dish`

In [209]:
with Session() as session:
    result = session.query(SideDish).all()
    df = pd.DataFrame([{
        "id": r.id, "name": r.name, "price": r.price
    } for r in result])
    display(df)
    # print (result)
    session.commit()

2025-12-10 16:51:25,577 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,579 INFO sqlalchemy.engine.Engine SELECT side_dish.id AS side_dish_id, side_dish.name AS side_dish_name, side_dish.price AS side_dish_price 
FROM side_dish
2025-12-10 16:51:25,580 INFO sqlalchemy.engine.Engine [generated in 0.00076s] ()


2025-12-10 16:51:25,586 INFO sqlalchemy.engine.Engine COMMIT


### `side_dish_ingredient`

In [210]:
with Session() as session:
    result = session.query(SideDishIngredient).all()
    df = pd.DataFrame([{
        "id": r.id, "name": r.name
    } for r in result])
    display(df)
    # print(result)
    session.commit()

2025-12-10 16:51:25,648 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,649 INFO sqlalchemy.engine.Engine SELECT side_dish_ingredient.id AS side_dish_ingredient_id, side_dish_ingredient.name AS side_dish_ingredient_name 
FROM side_dish_ingredient
2025-12-10 16:51:25,650 INFO sqlalchemy.engine.Engine [generated in 0.00078s] ()


2025-12-10 16:51:25,655 INFO sqlalchemy.engine.Engine COMMIT


### `side_dish_and_side_dish_ingredient`

In [211]:
with Session() as session:
    result = session.query(SideDishAndSideDishIngredient).all()
    df = pd.DataFrame([{
        "side_dish_id": r.side_dish_id,
        "side_dish_ingredient_id": r.side_dish_ingredient_id
    } for r in result])
    display(df)
    print(result)
    session.commit()

2025-12-10 16:51:25,727 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,728 INFO sqlalchemy.engine.Engine SELECT side_dish_and_side_dish_ingredient.side_dish_id AS side_dish_and_side_dish_ingredient_side_dish_id, side_dish_and_side_dish_ingredient.side_dish_ingredient_id AS side_dish_and_side_dish_ingredient_side_dish_ingredient_id 
FROM side_dish_and_side_dish_ingredient
2025-12-10 16:51:25,729 INFO sqlalchemy.engine.Engine [generated in 0.00085s] ()


[]
2025-12-10 16:51:25,733 INFO sqlalchemy.engine.Engine COMMIT


## Insert Data into Related Tables

**Option 1:** The option below shows how to insert data into related tables using explicit IDs.

In [212]:
with Session() as session:
    side_dishes = [
        SideDish(id=10, name='Kachumbari na Pilipili', price=45),
        SideDish(id=11, name='Kenyan Guacamole', price=45),
        SideDish(id=12, name='Sukuma Wiki (Sautéed Collard Greens)', price=55),
        SideDish(id=13, name='Mukimo Plain', price=130),
        SideDish(id=14, name='Avocado', price=50)
    ]
    session.add_all(side_dishes)

    ingredients = [
        SideDishIngredient(id=10, name='Tomatoes'),
        SideDishIngredient(id=11, name='Onions'),
        SideDishIngredient(id=12, name='Chilli'),
        SideDishIngredient(id=13, name='Irish Potatoes'),
        SideDishIngredient(id=14, name='Green Peas'),
        SideDishIngredient(id=15, name='Soft Green Maize'),
        SideDishIngredient(id=16, name='Pumpkin Leaves'),
        SideDishIngredient(id=17, name='Garlic'),
        SideDishIngredient(id=18, name='Salt'),
        SideDishIngredient(id=19, name='Avocado'),
        SideDishIngredient(id=20, name='Dhania (Coriander)')
    ]
    session.add_all(ingredients)

    recipes = [
        SideDishAndSideDishIngredient(side_dish_id=10, side_dish_ingredient_id=10),
        SideDishAndSideDishIngredient(side_dish_id=10, side_dish_ingredient_id=11),
        SideDishAndSideDishIngredient(side_dish_id=10, side_dish_ingredient_id=12),
        SideDishAndSideDishIngredient(side_dish_id=11, side_dish_ingredient_id=18),
        SideDishAndSideDishIngredient(side_dish_id=11, side_dish_ingredient_id=10),
        SideDishAndSideDishIngredient(side_dish_id=11, side_dish_ingredient_id=11),
        SideDishAndSideDishIngredient(side_dish_id=11, side_dish_ingredient_id=19),
        SideDishAndSideDishIngredient(side_dish_id=13, side_dish_ingredient_id=13),
        SideDishAndSideDishIngredient(side_dish_id=13, side_dish_ingredient_id=14),
        SideDishAndSideDishIngredient(side_dish_id=13, side_dish_ingredient_id=15),
        SideDishAndSideDishIngredient(side_dish_id=13, side_dish_ingredient_id=16),
        SideDishAndSideDishIngredient(side_dish_id=13, side_dish_ingredient_id=18),
        SideDishAndSideDishIngredient(side_dish_id=13, side_dish_ingredient_id=19),
        SideDishAndSideDishIngredient(side_dish_id=14, side_dish_ingredient_id=19)
    ]
    session.add_all(recipes)

    session.commit()

2025-12-10 16:51:25,823 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,825 INFO sqlalchemy.engine.Engine INSERT INTO side_dish (id, name, price) VALUES (?, ?, ?)
2025-12-10 16:51:25,826 INFO sqlalchemy.engine.Engine [generated in 0.00120s] [(10, 'Kachumbari na Pilipili', 45.0), (11, 'Kenyan Guacamole', 45.0), (12, 'Sukuma Wiki (Sautéed Collard Greens)', 55.0), (13, 'Mukimo Plain', 130.0), (14, 'Avocado', 50.0)]
2025-12-10 16:51:25,830 INFO sqlalchemy.engine.Engine INSERT INTO side_dish_ingredient (id, name) VALUES (?, ?)
2025-12-10 16:51:25,830 INFO sqlalchemy.engine.Engine [generated in 0.00076s] [(10, 'Tomatoes'), (11, 'Onions'), (12, 'Chilli'), (13, 'Irish Potatoes'), (14, 'Green Peas'), (15, 'Soft Green Maize'), (16, 'Pumpkin Leaves'), (17, 'Garlic')  ... displaying 10 of 11 total bound parameter sets ...  (19, 'Avocado'), (20, 'Dhania (Coriander)')]
2025-12-10 16:51:25,833 INFO sqlalchemy.engine.Engine INSERT INTO side_dish_and_side_dish_ingredient (side_dish_i

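**Option 2:** The option below shows how to insert data into related tables without hard-coding the primary key values. Each `session.flush()` sends the pending `INSERT` to the database so that the new row receives its generated ID, and the association rows then reuse those IDs as foreign keys.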

In [213]:
with Session() as session:
    # Create and add a new side dish
    new_side_dish = SideDish(name='Fried Plantain with Cinnamon', price=160)
    session.add(new_side_dish)
    session.flush()  # Assigns an ID to new_side_dish

    # Create and add a new ingredient
    new_ingredient_plantain = SideDishIngredient(name='Plantain')
    session.add(new_ingredient_plantain)
    session.flush()  # Assigns an ID to new_ingredient_plantain

    new_ingredient_spice = SideDishIngredient(name='Cinnamon')
    session.add(new_ingredient_spice)
    session.flush()  # Assigns an ID to new_ingredient_spice

    # Link them using the assigned IDs
    association = SideDishAndSideDishIngredient(
        side_dish_id=new_side_dish.id,
        side_dish_ingredient_id=new_ingredient_plantain.id
    )
    session.add(association)

    # Link them using the assigned IDs
    association2 = SideDishAndSideDishIngredient(
        side_dish_id=new_side_dish.id,
        side_dish_ingredient_id=new_ingredient_spice.id
    )
    session.add(association2)
    session.commit()

2025-12-10 16:51:25,896 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,898 INFO sqlalchemy.engine.Engine INSERT INTO side_dish (name, price) VALUES (?, ?)
2025-12-10 16:51:25,899 INFO sqlalchemy.engine.Engine [generated in 0.00103s] ('Fried Plantain with Cinnamon', 160.0)
2025-12-10 16:51:25,901 INFO sqlalchemy.engine.Engine INSERT INTO side_dish_ingredient (name) VALUES (?)
2025-12-10 16:51:25,902 INFO sqlalchemy.engine.Engine [generated in 0.00097s] ('Plantain',)
2025-12-10 16:51:25,903 INFO sqlalchemy.engine.Engine INSERT INTO side_dish_ingredient (name) VALUES (?)
2025-12-10 16:51:25,904 INFO sqlalchemy.engine.Engine [cached since 0.002969s ago] ('Cinnamon',)
2025-12-10 16:51:25,905 INFO sqlalchemy.engine.Engine INSERT INTO side_dish_and_side_dish_ingredient (side_dish_id, side_dish_ingredient_id) VALUES (?, ?)
2025-12-10 16:51:25,906 INFO sqlalchemy.engine.Engine [cached since 0.07295s ago] [(15, 21), (15, 22)]
2025-12-10 16:51:25,908 INFO sqlalchemy.engine.Eng
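
Option 2 still passes the generated IDs to the association rows explicitly. A possible further variation, sketched below under stated assumptions, links the objects through their `relationship()` attributes and lets SQLAlchemy fill in the foreign keys at flush time. The models (`Dish`, `Ingredient`, `DishIngredientLink`) and the in-memory SQLite database are hypothetical stand-ins for the notebook's schema.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

DemoBase = declarative_base()


class Dish(DemoBase):  # hypothetical stand-in for SideDish
    __tablename__ = "dish"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    price = Column(Float)
    links = relationship("DishIngredientLink", back_populates="dish")


class Ingredient(DemoBase):  # hypothetical stand-in for SideDishIngredient
    __tablename__ = "ingredient"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    links = relationship("DishIngredientLink", back_populates="ingredient")


class DishIngredientLink(DemoBase):  # hypothetical association class
    __tablename__ = "dish_ingredient_link"
    dish_id = Column(Integer, ForeignKey("dish.id"), primary_key=True)
    ingredient_id = Column(Integer, ForeignKey("ingredient.id"), primary_key=True)
    dish = relationship("Dish", back_populates="links")
    ingredient = relationship("Ingredient", back_populates="links")


demo_engine = create_engine("sqlite+pysqlite:///:memory:")
DemoBase.metadata.create_all(demo_engine)
DemoSession = sessionmaker(bind=demo_engine)

with DemoSession() as demo_session:
    dish = Dish(name="Fried Plantain with Cinnamon", price=160)
    for ingredient_name in ("Plantain", "Cinnamon"):
        # No IDs are passed: the link only references the related objects
        link = DishIngredientLink(ingredient=Ingredient(name=ingredient_name))
        dish.links.append(link)

    # Adding the dish cascades to its links and their ingredients
    demo_session.add(dish)
    demo_session.commit()

    stored_pairs = [
        (row.dish_id, row.ingredient_id)
        for row in demo_session.query(DishIngredientLink)
        .order_by(DishIngredientLink.ingredient_id)
        .all()
    ]

print(stored_pairs)
```

With this approach no intermediate `flush()` calls are needed, because the unit of work inserts the parent rows first and copies their keys into the association rows.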

## Confirm that the Data has been Inserted

### `side_dish`

In [214]:
with Session() as session:
    result = session.query(SideDish).all()
    df = pd.DataFrame([{
        "id": r.id, "name": r.name, "price": r.price
    } for r in result])
    display(df)
    # print (result)
    session.commit()

2025-12-10 16:51:25,941 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:25,942 INFO sqlalchemy.engine.Engine SELECT side_dish.id AS side_dish_id, side_dish.name AS side_dish_name, side_dish.price AS side_dish_price 
FROM side_dish
2025-12-10 16:51:25,942 INFO sqlalchemy.engine.Engine [cached since 0.3632s ago] ()


|   | id | name | price |
|:--|:---|:-----|:------|
| 0 | 10 | Kachumbari na Pilipili | 45.0 |
| 1 | 11 | Kenyan Guacamole | 45.0 |
| 2 | 12 | Sukuma Wiki (Sautéed Collard Greens) | 55.0 |
| 3 | 13 | Mukimo Plain | 130.0 |
| 4 | 14 | Avocado | 50.0 |
| 5 | 15 | Fried Plantain with Cinnamon | 160.0 |


2025-12-10 16:51:25,949 INFO sqlalchemy.engine.Engine COMMIT


### `side_dish_ingredient`

In [215]:
with Session() as session:
    result = session.query(SideDishIngredient).all()
    df = pd.DataFrame([{
        "id": r.id, "name": r.name
    } for r in result])
    display(df)
    # print(result)
    session.commit()

2025-12-10 16:51:26,026 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:26,027 INFO sqlalchemy.engine.Engine SELECT side_dish_ingredient.id AS side_dish_ingredient_id, side_dish_ingredient.name AS side_dish_ingredient_name 
FROM side_dish_ingredient
2025-12-10 16:51:26,028 INFO sqlalchemy.engine.Engine [cached since 0.3791s ago] ()


|   | id | name |
|:--|:---|:-----|
| 0 | 10 | Tomatoes |
| 1 | 11 | Onions |
| 2 | 12 | Chilli |
| 3 | 13 | Irish Potatoes |
| 4 | 14 | Green Peas |
| 5 | 15 | Soft Green Maize |
| 6 | 16 | Pumpkin Leaves |
| 7 | 17 | Garlic |
| 8 | 18 | Salt |
| 9 | 19 | Avocado |


2025-12-10 16:51:26,036 INFO sqlalchemy.engine.Engine COMMIT


### `side_dish_and_side_dish_ingredient`

In [216]:
with Session() as session:
    result = session.query(SideDishAndSideDishIngredient).all()
    df = pd.DataFrame([{
        "side_dish_id": r.side_dish_id,
        "side_dish_ingredient_id": r.side_dish_ingredient_id
    } for r in result])
    display(df)
    # print(result)
    session.commit()

2025-12-10 16:51:26,207 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:26,208 INFO sqlalchemy.engine.Engine SELECT side_dish_and_side_dish_ingredient.side_dish_id AS side_dish_and_side_dish_ingredient_side_dish_id, side_dish_and_side_dish_ingredient.side_dish_ingredient_id AS side_dish_and_side_dish_ingredient_side_dish_ingredient_id 
FROM side_dish_and_side_dish_ingredient
2025-12-10 16:51:26,209 INFO sqlalchemy.engine.Engine [cached since 0.4807s ago] ()


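|   | side_dish_id | side_dish_ingredient_id |
|:--|:-------------|:------------------------|
| 0 | 10 | 10 |
| 1 | 10 | 11 |
| 2 | 10 | 12 |
| 3 | 11 | 18 |
| 4 | 11 | 10 |
| 5 | 11 | 11 |
| 6 | 11 | 19 |
| 7 | 13 | 13 |
| 8 | 13 | 14 |
| 9 | 13 | 15 |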


2025-12-10 16:51:26,214 INFO sqlalchemy.engine.Engine COMMIT


## Querying Data using Joins

### Equijoin

- Select all side dishes and their matching ingredients

In [217]:
with Session() as session:
    result = (
        session.query(SideDish.name, SideDishIngredient.name)
        .join(SideDishAndSideDishIngredient, SideDish.id == SideDishAndSideDishIngredient.side_dish_id)
        .join(SideDishIngredient, SideDishIngredient.id == SideDishAndSideDishIngredient.side_dish_ingredient_id)
        .order_by(SideDish.name)
        .all()
    )
    df = pd.DataFrame(result, columns=["Side Dish", "Ingredients"])
    display(df)
    session.commit()

2025-12-10 16:51:26,314 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:26,316 INFO sqlalchemy.engine.Engine SELECT side_dish.name AS side_dish_name, side_dish_ingredient.name AS side_dish_ingredient_name 
FROM side_dish JOIN side_dish_and_side_dish_ingredient ON side_dish.id = side_dish_and_side_dish_ingredient.side_dish_id JOIN side_dish_ingredient ON side_dish_ingredient.id = side_dish_and_side_dish_ingredient.side_dish_ingredient_id ORDER BY side_dish.name
2025-12-10 16:51:26,317 INFO sqlalchemy.engine.Engine [generated in 0.00102s] ()


|   | Side Dish | Ingredients |
|:--|:----------|:------------|
| 0 | Avocado | Avocado |
| 1 | Fried Plantain with Cinnamon | Plantain |
| 2 | Fried Plantain with Cinnamon | Cinnamon |
| 3 | Kachumbari na Pilipili | Tomatoes |
| 4 | Kachumbari na Pilipili | Onions |
| 5 | Kachumbari na Pilipili | Chilli |
| 6 | Kenyan Guacamole | Salt |
| 7 | Kenyan Guacamole | Tomatoes |
| 8 | Kenyan Guacamole | Onions |
| 9 | Kenyan Guacamole | Avocado |


2025-12-10 16:51:26,323 INFO sqlalchemy.engine.Engine COMMIT


### Left Outer Join

- Select all side dishes and their ingredients, including those without ingredients

In [218]:
with Session() as session:
    result = (
        session.query(SideDish.name, SideDishIngredient.name)
        .outerjoin(
            SideDishAndSideDishIngredient,
            SideDish.id == SideDishAndSideDishIngredient.side_dish_id
        )
        .outerjoin(
            SideDishIngredient,
            SideDishIngredient.id == SideDishAndSideDishIngredient.side_dish_ingredient_id
        )
        .order_by(SideDish.name)
        .all()
    )
    df = pd.DataFrame(result, columns=["Side Dish", "Ingredients"])
    display(df)
    session.commit()

2025-12-10 16:51:26,401 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:26,403 INFO sqlalchemy.engine.Engine SELECT side_dish.name AS side_dish_name, side_dish_ingredient.name AS side_dish_ingredient_name 
FROM side_dish LEFT OUTER JOIN side_dish_and_side_dish_ingredient ON side_dish.id = side_dish_and_side_dish_ingredient.side_dish_id LEFT OUTER JOIN side_dish_ingredient ON side_dish_ingredient.id = side_dish_and_side_dish_ingredient.side_dish_ingredient_id ORDER BY side_dish.name
2025-12-10 16:51:26,403 INFO sqlalchemy.engine.Engine [generated in 0.00069s] ()


|   | Side Dish | Ingredients |
|:--|:----------|:------------|
| 0 | Avocado | Avocado |
| 1 | Fried Plantain with Cinnamon | Plantain |
| 2 | Fried Plantain with Cinnamon | Cinnamon |
| 3 | Kachumbari na Pilipili | Tomatoes |
| 4 | Kachumbari na Pilipili | Onions |
| 5 | Kachumbari na Pilipili | Chilli |
| 6 | Kenyan Guacamole | Tomatoes |
| 7 | Kenyan Guacamole | Onions |
| 8 | Kenyan Guacamole | Salt |
| 9 | Kenyan Guacamole | Avocado |


2025-12-10 16:51:26,409 INFO sqlalchemy.engine.Engine COMMIT
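
The difference between the equijoin and the left outer join is easiest to see on a very small data set. The sketch below is an illustration under assumptions: the models (`Dish`, `Ingredient`, `DishIngredientLink`), the two sample dishes, and the in-memory SQLite database are hypothetical and kept separate from the notebook's tables.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

DemoBase = declarative_base()


class Dish(DemoBase):  # hypothetical stand-in for SideDish
    __tablename__ = "dish"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)


class Ingredient(DemoBase):  # hypothetical stand-in for SideDishIngredient
    __tablename__ = "ingredient"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)


class DishIngredientLink(DemoBase):  # hypothetical association class
    __tablename__ = "dish_ingredient_link"
    dish_id = Column(Integer, ForeignKey("dish.id"), primary_key=True)
    ingredient_id = Column(Integer, ForeignKey("ingredient.id"), primary_key=True)


demo_engine = create_engine("sqlite+pysqlite:///:memory:")
DemoBase.metadata.create_all(demo_engine)
DemoSession = sessionmaker(bind=demo_engine)

with DemoSession() as demo_session:
    demo_session.add_all([
        Dish(id=1, name="Avocado"),
        Dish(id=2, name="Sukuma Wiki"),  # deliberately left without ingredients
        Ingredient(id=1, name="Avocado"),
        DishIngredientLink(dish_id=1, ingredient_id=1),
    ])
    demo_session.commit()

    # Inner join: only dishes that have at least one matching ingredient
    inner_rows = [tuple(row) for row in (
        demo_session.query(Dish.name, Ingredient.name)
        .join(DishIngredientLink, Dish.id == DishIngredientLink.dish_id)
        .join(Ingredient, Ingredient.id == DishIngredientLink.ingredient_id)
        .order_by(Dish.name)
        .all()
    )]

    # Left outer join: every dish, with None where no ingredient matches
    outer_rows = [tuple(row) for row in (
        demo_session.query(Dish.name, Ingredient.name)
        .outerjoin(DishIngredientLink, Dish.id == DishIngredientLink.dish_id)
        .outerjoin(Ingredient, Ingredient.id == DishIngredientLink.ingredient_id)
        .order_by(Dish.name)
        .all()
    )]

print(inner_rows)
print(outer_rows)
```

The dish without ingredients is absent from the inner join result and appears in the outer join result paired with `None`.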


- Select all ingredients and the side dishes they are used to make, including those that are not used to make any side dish

In [219]:
with Session() as session:
    result = (
        session.query(SideDishIngredient.name, SideDish.name)
        .outerjoin(
            SideDishAndSideDishIngredient,
            SideDishIngredient.id == SideDishAndSideDishIngredient.side_dish_ingredient_id
        )
        .outerjoin(
            SideDish,
            SideDish.id == SideDishAndSideDishIngredient.side_dish_id
        )
        .order_by(SideDishIngredient.name)
        .all()
    )
    df = pd.DataFrame(result, columns=["Ingredients", "Side Dish"])
    display(df)
    session.commit()

2025-12-10 16:51:26,524 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:26,526 INFO sqlalchemy.engine.Engine SELECT side_dish_ingredient.name AS side_dish_ingredient_name, side_dish.name AS side_dish_name 
FROM side_dish_ingredient LEFT OUTER JOIN side_dish_and_side_dish_ingredient ON side_dish_ingredient.id = side_dish_and_side_dish_ingredient.side_dish_ingredient_id LEFT OUTER JOIN side_dish ON side_dish.id = side_dish_and_side_dish_ingredient.side_dish_id ORDER BY side_dish_ingredient.name
2025-12-10 16:51:26,527 INFO sqlalchemy.engine.Engine [generated in 0.00084s] ()


|   | Ingredients | Side Dish |
|:--|:------------|:----------|
| 0 | Avocado | Kenyan Guacamole |
| 1 | Avocado | Mukimo Plain |
| 2 | Avocado | Avocado |
| 3 | Chilli | Kachumbari na Pilipili |
| 4 | Cinnamon | Fried Plantain with Cinnamon |
| 5 | Dhania (Coriander) |  |
| 6 | Garlic |  |
| 7 | Green Peas | Mukimo Plain |
| 8 | Irish Potatoes | Mukimo Plain |
| 9 | Onions | Kachumbari na Pilipili |


2025-12-10 16:51:26,532 INFO sqlalchemy.engine.Engine COMMIT


## Aggregate within Groups

![img.png](assets/images/pexels-antonio-filigno-159809-8538296.jpg)

_Source: [link](https://www.pexels.com/photo/sliced-avocado-on-yellow-green-surface-8538296/)_

- Example of how to use GROUP BY to count the number of side dishes that use each ingredient

In [220]:
with Session() as session:
    result = (
        session.query(
            SideDishIngredient.name,
            func.count(SideDishAndSideDishIngredient.side_dish_id).label("Number of Side Dishes")
        )
        .outerjoin(
            SideDishAndSideDishIngredient,
            SideDishIngredient.id == SideDishAndSideDishIngredient.side_dish_ingredient_id
        )
        .group_by(SideDishIngredient.name)
        .order_by(func.count(SideDishAndSideDishIngredient.side_dish_id).desc())
        .all()
    )
    df = pd.DataFrame(result, columns=["Ingredient", "Number of Side Dishes"])
    display(df)
    session.commit()

2025-12-10 16:51:26,684 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:26,686 INFO sqlalchemy.engine.Engine SELECT side_dish_ingredient.name AS side_dish_ingredient_name, count(side_dish_and_side_dish_ingredient.side_dish_id) AS "Number of Side Dishes" 
FROM side_dish_ingredient LEFT OUTER JOIN side_dish_and_side_dish_ingredient ON side_dish_ingredient.id = side_dish_and_side_dish_ingredient.side_dish_ingredient_id GROUP BY side_dish_ingredient.name ORDER BY count(side_dish_and_side_dish_ingredient.side_dish_id) DESC
2025-12-10 16:51:26,687 INFO sqlalchemy.engine.Engine [generated in 0.00098s] ()


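|   | Ingredient | Number of Side Dishes |
|:--|:-----------|:----------------------|
| 0 | Avocado | 3 |
| 1 | Tomatoes | 2 |
| 2 | Salt | 2 |
| 3 | Onions | 2 |
| 4 | Soft Green Maize | 1 |
| 5 | Pumpkin Leaves | 1 |
| 6 | Plantain | 1 |
| 7 | Irish Potatoes | 1 |
| 8 | Green Peas | 1 |
| 9 | Cinnamon | 1 |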


2025-12-10 16:51:26,692 INFO sqlalchemy.engine.Engine COMMIT


- Example of how to use GROUP BY with HAVING to filter ingredients used in more than one side dish

In [222]:
with Session() as session:
    result = (
        session.query(
            SideDishIngredient.name,
            func.count(SideDishAndSideDishIngredient.side_dish_id).label("Number of Side Dishes")
        )
        .outerjoin(
            SideDishAndSideDishIngredient,
            SideDishIngredient.id == SideDishAndSideDishIngredient.side_dish_ingredient_id
        )
        .group_by(SideDishIngredient.name)
        .having(func.count(SideDishAndSideDishIngredient.side_dish_id) > 1)
        .order_by(func.count(SideDishAndSideDishIngredient.side_dish_id).desc())
        .all()
    )
    df = pd.DataFrame(result, columns=["Ingredient", "Number of Side Dishes"])
    display(df)
    session.commit()

2025-12-10 16:51:26,963 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:26,965 INFO sqlalchemy.engine.Engine SELECT side_dish_ingredient.name AS side_dish_ingredient_name, count(side_dish_and_side_dish_ingredient.side_dish_id) AS "Number of Side Dishes" 
FROM side_dish_ingredient LEFT OUTER JOIN side_dish_and_side_dish_ingredient ON side_dish_ingredient.id = side_dish_and_side_dish_ingredient.side_dish_ingredient_id GROUP BY side_dish_ingredient.name 
HAVING count(side_dish_and_side_dish_ingredient.side_dish_id) > ? ORDER BY count(side_dish_and_side_dish_ingredient.side_dish_id) DESC
2025-12-10 16:51:26,966 INFO sqlalchemy.engine.Engine [generated in 0.00121s] (1,)


|   | Ingredient | Number of Side Dishes |
|:--|:-----------|:----------------------|
| 0 | Avocado | 3 |
| 1 | Tomatoes | 2 |
| 2 | Salt | 2 |
| 3 | Onions | 2 |


2025-12-10 16:51:26,974 INFO sqlalchemy.engine.Engine COMMIT


## Database Transactions using SQLAlchemy ORM

In [223]:
with Session() as session:
    with session.begin():
        # Create
        dish = SideDish(name='Mango Salsa', price=45)
        session.add(dish)
        session.flush()  # Assigns an ID

        # Savepoint 1
        savepoint1 = session.begin_nested()

        # Read
        read_dish = session.query(SideDish).filter_by(name='Mango Salsa').first()

        # Update
        read_dish.price = 58
        session.flush()

        # Savepoint 2
        savepoint2 = session.begin_nested()

        # Delete
        session.delete(read_dish)
        session.flush()

        # Rollback to savepoint2 (undo the deletion)
        savepoint2.rollback()

        # Rollback to savepoint1 (undo the price update and the deletion)
        savepoint1.rollback()

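        # At this point, the side dish is only created, not updated or deleted.
        # No explicit commit is needed here: leaving the `with session.begin():`
        # block commits the transaction automatically.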

2025-12-10 16:51:27,113 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:27,114 INFO sqlalchemy.engine.Engine INSERT INTO side_dish (name, price) VALUES (?, ?)
2025-12-10 16:51:27,115 INFO sqlalchemy.engine.Engine [cached since 1.217s ago] ('Mango Salsa', 45.0)
2025-12-10 16:51:27,117 INFO sqlalchemy.engine.Engine SAVEPOINT sa_savepoint_1
2025-12-10 16:51:27,118 INFO sqlalchemy.engine.Engine [no key 0.00093s] ()
2025-12-10 16:51:27,120 INFO sqlalchemy.engine.Engine SELECT side_dish.id AS side_dish_id, side_dish.name AS side_dish_name, side_dish.price AS side_dish_price 
FROM side_dish 
WHERE side_dish.name = ?
 LIMIT ? OFFSET ?
2025-12-10 16:51:27,121 INFO sqlalchemy.engine.Engine [generated in 0.00099s] ('Mango Salsa', 1, 0)
2025-12-10 16:51:27,123 INFO sqlalchemy.engine.Engine UPDATE side_dish SET price=? WHERE side_dish.id = ?
2025-12-10 16:51:27,124 INFO sqlalchemy.engine.Engine [generated in 0.00083s] (58.0, 16)
2025-12-10 16:51:27,126 INFO sqlalchemy.engine.Engine 
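
The effect of rolling back to a savepoint can also be verified directly in Python. The following is a reduced sketch with a single savepoint; the `Dish` model and the in-memory SQLite database are assumptions made so that the example does not modify the notebook's tables.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

DemoBase = declarative_base()


class Dish(DemoBase):  # hypothetical stand-in for SideDish
    __tablename__ = "dish"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    price = Column(Float)


demo_engine = create_engine("sqlite+pysqlite:///:memory:")
DemoBase.metadata.create_all(demo_engine)
DemoSession = sessionmaker(bind=demo_engine)

with DemoSession() as demo_session:
    # Create the row inside the outer transaction
    dish = Dish(name="Mango Salsa", price=45)
    demo_session.add(dish)
    demo_session.flush()

    # Everything after this point can be undone without losing the insert
    savepoint = demo_session.begin_nested()
    dish.price = 58
    demo_session.flush()

    # Undo the update; the object is refreshed from the database on next access
    savepoint.rollback()
    price_after_rollback = dish.price

    # Commit the outer transaction: only the insert survives
    demo_session.commit()
    stored_price = (
        demo_session.query(Dish.price)
        .filter(Dish.name == "Mango Salsa")
        .scalar()
    )

print(price_after_rollback, stored_price)
```

Both values reflect the original price, which matches the behaviour the transaction cell above relies on: the insert is kept while the work done after the savepoint is discarded.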

### Confirm the Successful Processing of the Transaction

In [224]:
with Session() as session:
    result = session.query(SideDish).all()
    df = pd.DataFrame([{
        "id": r.id, "name": r.name, "price": r.price
    } for r in result])
    display(df)
    # print (result)
    session.commit()

2025-12-10 16:51:27,292 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-12-10 16:51:27,293 INFO sqlalchemy.engine.Engine SELECT side_dish.id AS side_dish_id, side_dish.name AS side_dish_name, side_dish.price AS side_dish_price 
FROM side_dish
2025-12-10 16:51:27,293 INFO sqlalchemy.engine.Engine [cached since 1.714s ago] ()


|   | id | name | price |
|:--|:---|:-----|:------|
| 0 | 10 | Kachumbari na Pilipili | 45.0 |
| 1 | 11 | Kenyan Guacamole | 45.0 |
| 2 | 12 | Sukuma Wiki (Sautéed Collard Greens) | 55.0 |
| 3 | 13 | Mukimo Plain | 130.0 |
| 4 | 14 | Avocado | 50.0 |
| 5 | 15 | Fried Plantain with Cinnamon | 160.0 |
| 6 | 16 | Mango Salsa | 45.0 |


2025-12-10 16:51:27,299 INFO sqlalchemy.engine.Engine COMMIT


## References
SQLAlchemy Project. (2025, December 5). _SQLAlchemy 2.0 Documentation._ SQLAlchemy. https://docs.sqlalchemy.org/en/20/