# Homework 10: SQLAlchemy

Total questions: 5<br/>
Total points: 8

### FYI

In class, we persisted data to files on your local machine, outside of the notebook. Here, we'll instead pass SQLAlchemy the special database URL `'sqlite:///:memory:'`, which stores the database in memory. In other words, the database will live only as long as you're running the notebook; afterwards, it will be thrown away. This URL is usually most useful when experimenting, since, as we discussed, persistence is often what makes databases useful. We use it here simply to avoid needing to upload multiple files to Courseworks: everything you need is here.

If for any reason you need to discard the contents of your database to start over, you may therefore do so by simply restarting the "kernel" of this notebook, which you can do by clicking the restart button in the toolbar (it's 2 buttons to the right of the "Run" button) or in the Kernel menu.
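
To make "in-memory" concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module rather than SQLAlchemy (SQLAlchemy's SQLite support uses that module by default), and the `demo` table is made up purely for illustration. It shows that each `:memory:` connection gets its own private database, so nothing is shared and nothing is written to disk.

```python
import sqlite3

# ":memory:" is the SQLite-level name behind the "sqlite:///:memory:" URL.
first = sqlite3.connect(":memory:")
first.execute("CREATE TABLE demo (note TEXT)")  # hypothetical table, for illustration only
first.execute("INSERT INTO demo VALUES ('hello')")
rows_in_first = first.execute("SELECT note FROM demo").fetchall()

# A second connection gets its own, empty in-memory database:
# it cannot see the table created through the first connection.
second = sqlite3.connect(":memory:")
tables_in_second = second.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

first.close()   # the first database is discarded here
second.close()
```

Restarting the notebook kernel has the same effect as closing the connection: the in-memory database is gone, and you start from a clean slate.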

## Question 1

Using `SQLAlchemy`'s ORM (object relational mapping) layer, we will be creating a new database with some new tables.

For the first table, create a table named `students` by defining a `Student` class. A `Student` should have the following:

* a `uni` (`String`)
* a `fullname` (`String`)
* a `nickname` (`String`)
* a planned `graduation_date` (`Date`)
* a tuition `balance` (`Float`)

as well as an `id` as a primary key.

Be sure to define the `__tablename__` attribute within the `Student` class.

[2.5 points]

In [14]:
# Solution
from sqlalchemy import Column, Date, Integer, Float, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Student(Base):
    __tablename__ = "students"
    
    id = Column(Integer, primary_key=True)
    uni = Column(String)
    fullname = Column(String)
    nickname = Column(String)
    graduation_date = Column(Date)
    balance = Column(Float)

In [15]:
# autograder tests
assert Student.__table__.name == "students"

In [16]:
assert hasattr(Student, "id")
assert isinstance(Student.id.type, Integer)

from sqlalchemy.inspection import inspect
actual_primary_key = inspect(Student).primary_key[0]
assert "id" == actual_primary_key.name

In [17]:
assert hasattr(Student, "uni")
assert isinstance(Student.uni.type, String)
assert hasattr(Student, "fullname")
assert isinstance(Student.fullname.type, String)
assert hasattr(Student, "nickname")
assert isinstance(Student.nickname.type, String)

In [18]:
assert hasattr(Student, "graduation_date")
assert isinstance(Student.graduation_date.type, Date)

In [19]:
assert hasattr(Student, "balance")
assert isinstance(Student.balance.type, Float)

## Question 2

Next, let's define a `classes` table with a `Class` class. A `Class` should have the following columns:

* a `name` (`String`) of the class
* a `days` (`String`) value giving the days of the week on which the class meets (e.g. `M` for a Monday schedule, `MW` for a Monday + Wednesday schedule, `TR` for a Tuesday + Thursday schedule, etc.)
* a `professor` (`String`) name

as well as an `id` as a primary key.

Again, be sure to define the `__tablename__` attribute within the `Class` class.

[1 point]

In [20]:
# Solution
class Class(Base):
    __tablename__ = "classes"
    
    id = Column(Integer, primary_key=True)
    name = Column(String)
    days = Column(String)
    professor = Column(String)

In [21]:
# autograder tests
assert Class.__table__.name == "classes"

assert hasattr(Class, "id")
assert isinstance(Class.id.type, Integer)

actual_primary_key = inspect(Class).primary_key[0]
assert "id" == actual_primary_key.name

In [22]:
assert hasattr(Class, "name")
assert isinstance(Class.name.type, String)
assert hasattr(Class, "days")
assert isinstance(Class.days.type, String)
assert hasattr(Class, "professor")
assert isinstance(Class.professor.type, String)

## Question 3

It makes sense that a class has students enrolled in it, and that a student can be enrolled in one or more classes. We consider this a [many-to-many relationship](https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html#many-to-many). (It may also be helpful to read [this part](https://docs.sqlalchemy.org/en/14/orm/tutorial.html#building-a-many-to-many-relationship) of the SQLAlchemy tutorial.)

Defined for you is an association table, `student_classes`. You are to re-define both the `Student` and `Class` classes:

Re-define your `Student` class (using the same table name and columns already defined) to include a `classes` attribute using SQLAlchemy's [`relationship()` function](https://docs.sqlalchemy.org/en/14/orm/relationship_api.html#sqlalchemy.orm.relationship). Call `relationship()` with 3 arguments: one for the Python class that we're creating a relationship to; one for the [`back_populates`](https://docs.sqlalchemy.org/en/14/orm/relationship_api.html#sqlalchemy.orm.relationship.params.back_populates) argument; and one for [`secondary`](https://docs.sqlalchemy.org/en/14/orm/relationship_api.html#sqlalchemy.orm.relationship.params.secondary) to map to the association table.

Do the same with your `Class` class, adding a `students` attribute using `relationship()`.


[1.5 points]

In [23]:
# solution
from sqlalchemy.schema import ForeignKey
from sqlalchemy import Table
from sqlalchemy.orm import relationship


Base = declarative_base()
student_class = Table(
    "student_classes",
    Base.metadata,
    Column("student_id", ForeignKey("students.id"), primary_key=True),
    Column("class_id", ForeignKey("classes.id"), primary_key=True),
)


class Student(Base):
    __tablename__ = "students"
    
    id = Column(Integer, primary_key=True)
    uni = Column(String)
    fullname = Column(String)
    nickname = Column(String)
    graduation_date = Column(Date)
    balance = Column(Float)
    
    classes = relationship(
        "Class", back_populates="students", secondary=student_class
    )
    

class Class(Base):
    __tablename__ = "classes"
    
    id = Column(Integer, primary_key=True)
    name = Column(String)
    days = Column(String)
    professor = Column(String)
    
    students = relationship(
        "Student", back_populates="classes", secondary=student_class
    )

In [24]:
# autograder tests
from sqlalchemy import create_engine
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

In [25]:
tables = Base.metadata.tables

table_names = tables.keys()
exp_tables = ["classes", "students", "student_classes"]

assert sorted(table_names) == sorted(exp_tables)

In [26]:
from sqlalchemy.orm.relationships import RelationshipProperty

assert isinstance(Student.classes.property, RelationshipProperty)
assert isinstance(Class.students.property, RelationshipProperty)
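
If you'd like to see what `back_populates` buys you, here is a minimal sketch using two hypothetical classes, `Author` and `Book`. They are not part of this homework, and they use their own `DemoBase` so they won't touch your `Base`. Assuming SQLAlchemy 1.4 or later, appending to one side of the relationship should update the other side in Python, even before anything is added to a session.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table
from sqlalchemy.orm import declarative_base, relationship

# A separate base so this demo does not interfere with the homework's Base.
DemoBase = declarative_base()

# Hypothetical association table linking authors and books.
author_books = Table(
    "author_books",
    DemoBase.metadata,
    Column("author_id", ForeignKey("authors.id"), primary_key=True),
    Column("book_id", ForeignKey("books.id"), primary_key=True),
)


class Author(DemoBase):
    __tablename__ = "authors"

    id = Column(Integer, primary_key=True)
    name = Column(String)

    # back_populates names the matching attribute on the other class.
    books = relationship("Book", back_populates="authors", secondary=author_books)


class Book(DemoBase):
    __tablename__ = "books"

    id = Column(Integer, primary_key=True)
    title = Column(String)

    authors = relationship("Author", back_populates="books", secondary=author_books)


author = Author(name="Example Author")
book = Book(title="Example Book")

# Only one side is modified here...
author.books.append(book)

# ...but the other side of the relationship is kept in sync.
other_side_synced = book.authors == [author]
```

This is why, in the next question, it should be enough to append classes to a student's `classes` list: the matching `students` lists on each `Class` follow along.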

## Question 4

Let's now add some data into our two tables.

With the data provided below, create 3 students using your `Student` class, create 4 classes using your `Class` class, and make sure the students are "enrolled" in the listed classes. You'll then use `session.add_all` and `session.commit` to add the 3 students and 4 classes to your database.

**Students:**

| UNI    | Fullname          | Nickname | Graduation Date   | Balance | Classes   |
|--------|-------------------|----------|-------------------|---------|-----------|
| ab1234 | Elizabeth Rose    | Liz      | May 1, 2023       | 0       | Stochastic Models |
| cd5678 | Jon-Paul Phillips | JP       | December 15, 2022 | 526.50  | Stochastic Models, Data Analytics & Machine Learning |
| ef0987 | Idris Sanders     |          | May 1, 2024       | 392.12  | Data Analytics & Machine Learning, Foundations of Data Science |

**Classes:**

| Name                              | Days | Professor       |
|-----------------------------------|------|-----------------|
| Stochastic Models                 | TR   | Marcus Brown    |
| Data Analytics & Machine Learning | MW   | Grace Robbinson |
| Foundations of Data Science       | T    | Shannon Wells   |
| Reinforcement Learning            | WF   | Joe Greene      |


Name your variables like so: `student1`, `student2`, `class1`, etc.

[2 points]

In [27]:
# solution
import datetime 
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
session = Session()


student1 = Student(
    uni="ab1234", 
    fullname="Elizabeth Rose", 
    nickname="Liz",
    graduation_date=datetime.date(2023, 5, 1),
    balance=0,
)

student2 = Student(
    uni="cd5678",
    fullname="Jon-Paul Phillips",
    nickname="JP",
    graduation_date=datetime.date(2022, 12, 15),
    balance=526.50,
)

student3 = Student(
    uni="ef0987",
    fullname="Idris Sanders",
    graduation_date=datetime.date(2024, 5, 1),
    balance=392.12
)

class1 = Class(
    name="Stochastic Models",
    days="TR",
    professor="Marcus Brown"
)
class2 = Class(
    name="Data Analytics & Machine Learning",
    days="MW",
    professor="Grace Robbinson"
)
class3 = Class(
    name="Foundations of Data Science",
    days="T",
    professor="Shannon Wells"
)
class4 = Class(
    name="Reinforcement Learning",
    days="WF",
    professor="Joe Greene"
)

student1.classes.append(class1)
student2.classes.extend([class1, class2])
student3.classes.extend([class2, class3])


session.add_all([
    student1, student2, student3,
    class1, class2, class3, class4
])
session.commit()

In [28]:
# autograder tests
results = session.query(Student).filter_by(uni="ab1234").all()

assert results[0] == student1
assert results[0].classes == [class1]

In [29]:
results = session.query(Student).filter_by(uni="cd5678").all()

assert results[0] == student2
assert results[0].classes == [class1, class2]

In [30]:
results = session.query(Student).filter_by(uni="ef0987").all()

assert results[0] == student3
assert results[0].classes == [class2, class3]

In [31]:
results = session.query(Class).filter_by(students=None).all()
assert results == [class4]

## Question 5

Recall the types of joins we reviewed in Monday's lecture. Define a query against your database using SQLAlchemy's ORM with the `session.query` method that returns all classes (just the class `name`) and the number of students enrolled in each. Include classes that do not have any students enrolled.

Assign your query to a variable called `result`.

Be sure that the query is executed, so that `result` is a `list` of `tuple`s containing two items each: the name of the class and the number of students enrolled. The order of the list does not matter.

You may need to use [SQLAlchemy's `func` construct](https://docs.sqlalchemy.org/en/14/core/functions.html), so it has been imported for you; feel free to import other SQLAlchemy modules as needed.

[1 point]

In [32]:
from sqlalchemy import func

# Solution
result = session.query(Class.name, func.count(Student.id)).\
    outerjoin(Class.students).group_by(Class.name).all()

In [33]:
# autograder tests
expected = sorted([
    ('Reinforcement Learning', 0),
    ('Foundations of Data Science', 1),
    ('Data Analytics & Machine Learning', 2),
    ('Stochastic Models', 2)
])
assert expected == sorted(result)
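
To see why the type of join matters for this question, here is a plain-SQL sketch of the same idea. It uses Python's built-in `sqlite3` module, not the ORM, with simplified versions of the `classes` and `student_classes` tables. The numeric ids are assumptions chosen to mirror the enrollments from Question 4. An inner join silently drops the class with no enrolled students, while a left outer join keeps it with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified tables; the ids below are assumed for illustration.
    CREATE TABLE classes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student_classes (student_id INTEGER, class_id INTEGER);

    INSERT INTO classes (id, name) VALUES
        (1, 'Stochastic Models'),
        (2, 'Data Analytics & Machine Learning'),
        (3, 'Foundations of Data Science'),
        (4, 'Reinforcement Learning');

    INSERT INTO student_classes (student_id, class_id) VALUES
        (1, 1), (2, 1), (2, 2), (3, 2), (3, 3);
""")

# Same query twice; only the join type changes.
query = """
    SELECT c.name, COUNT(sc.student_id)
    FROM classes AS c
    {join} student_classes AS sc ON sc.class_id = c.id
    GROUP BY c.name
"""

inner_counts = dict(conn.execute(query.format(join="JOIN")).fetchall())
outer_counts = dict(conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall())
conn.close()

# The inner join has no row at all for the empty class.
missing_from_inner = "Reinforcement Learning" not in inner_counts
```

Note that the count is taken over a column from the joined table (`sc.student_id`), not `COUNT(*)`: for a class with no enrollments that column is `NULL`, so the count comes out as 0 rather than 1.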