# 3. SQLAlchemy ORM

The main objective of SQLAlchemy's Object Relational Mapper (ORM) API is to associate user-defined Python classes with database tables, and objects of those classes with rows in the corresponding tables. Changes to the state of an object and to its row are kept synchronized. SQLAlchemy also lets you express database queries in terms of the user-defined classes and the relationships defined between them.

## 3.1 Declare mapping

With the ORM, the configuration process starts with two tasks:
- describing the database tables
- defining the classes that will be mapped to those tables.

In SQLAlchemy, these two tasks are performed together using the Declarative system: each class you create includes directives that describe the database table it is mapped to.

In [1]:
from sqlalchemy import create_engine
base_path = "../../../data/orm_test.db"
db_url = f"sqlite:///{base_path}"
# create_engine() returns an Engine object. It does not connect yet:
# the Engine opens a real DBAPI connection only when a method such as
# Engine.connect() is first called, and SQLite creates the database
# file at that point if it does not already exist.
# echo (default False), when set to True, logs the SQL the engine emits.
engine = create_engine(db_url, echo=True)
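
The lazy connection behaviour described in the comments above can be checked with a small sketch. It assumes SQLAlchemy 1.4 or newer and uses an in-memory SQLite URL (`sqlite://`) instead of the notebook's file path, so nothing is written to disk. The names `demo_engine` and `one` are illustrative only.

```python
from sqlalchemy import create_engine, text

# In-memory SQLite database (assumption: chosen so the sketch leaves no file behind).
demo_engine = create_engine("sqlite://")

# Creating the engine opened no connection; connect() opens the first one.
with demo_engine.connect() as conn:
    # text() wraps a raw SQL string so it can be executed.
    one = conn.execute(text("SELECT 1")).scalar()

print(one)
```

Using a separate engine name keeps this sketch from replacing the `engine` object that the rest of the notebook relies on.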

The code below creates a `base class`, which maintains a catalog of classes and mapped tables in the Declarative system. This is called the declarative base class. There is usually just one such base, defined in a commonly imported module. The declarative_base() function creates the base class. It is defined in the sqlalchemy.ext.declarative module; since SQLAlchemy 1.4 it can also be imported from sqlalchemy.orm, which is the preferred location in newer releases.

In [2]:
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

A mapped class in Declarative must have a **__tablename__** attribute and **at least one Column** that is part of a primary key. Declarative replaces all the Column objects with special Python accessors known as `descriptors`. The classes defined in the cells that follow use two kinds of mapped attributes:
- column
- relationship

The Table objects that Declarative generates for these classes are collected in **Base.metadata**.
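
To make the roles of `__tablename__`, the primary key column, the descriptors, and the metadata concrete, here is a minimal sketch. It assumes SQLAlchemy 1.4 or newer; `SketchBase` and `Sample` are hypothetical names that do not appear elsewhere in this notebook.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base
from sqlalchemy.orm.attributes import InstrumentedAttribute

# A separate base, so this sketch does not register anything on the notebook's Base.
SketchBase = declarative_base()


class Sample(SketchBase):  # hypothetical mapped class
    __tablename__ = "sample"

    id = Column(Integer, primary_key=True)
    label = Column(String)


# The generated Table is registered in the base's metadata under its table name.
table_names = sorted(SketchBase.metadata.tables)
# After mapping, the class attribute is an instrumented descriptor, not a plain Column.
label_is_descriptor = isinstance(Sample.label, InstrumentedAttribute)
# The primary key columns can be read back from the generated Table.
pk_columns = [col.name for col in Sample.__table__.primary_key.columns]

print(table_names, label_is_descriptor, pk_columns)
```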

In [3]:
from sqlalchemy.orm import backref, relationship
from sqlalchemy import Column, Integer, String, SmallInteger, Text, DateTime, ForeignKey

# One-to-many relation
# The ForeignKey on Dataset.cohort_id establishes the link between Cohort and
# Dataset.
# The code below defines a parent-child collection. The plural name of the datasets
# attribute (a convention, not a requirement) indicates that it is a collection.
# The first parameter of relationship() is the class name "Dataset" (not the table
# name "dataset"): the class to which the datasets attribute is related. It tells
# SQLAlchemy that the Cohort and Dataset classes are related. SQLAlchemy works out
# how to join them from the ForeignKey in the Dataset class definition (cohort_id).
# The backref parameter creates a cohort attribute on each Dataset instance. This
# attribute refers to the parent Cohort that the Dataset instance is related to.

class Cohort(Base):
    __tablename__ = "cohort"

    id = Column(Integer, primary_key=True)
    cname = Column(String)
    datasets = relationship("Dataset", backref=backref("cohort"))

In [4]:
class Dataset(Base):
    __tablename__ = "dataset"

    id = Column(Integer, primary_key=True)
    cohort_id = Column(Integer, ForeignKey("cohort.id"))
    year = Column(Integer)
    name = Column(String)
    location = Column(String)
    status = Column(SmallInteger)
    validation_tasks = relationship("ValidationTask", backref=backref("dataset"))

In [5]:
class Descriptor(Base):
    __tablename__ = "descriptor"

    id = Column(Integer, primary_key=True)
    dataset_id = Column(Integer, ForeignKey("dataset.id"))
    name = Column(String)
    location = Column(String)

In [6]:
class ValidationRule(Base):
    __tablename__ = "validation_rule"

    id = Column(Integer, primary_key=True)
    name = Column(String)
    description = Column(Text)
    args = Column(String)
    kwargs = Column(String)
    validation_tasks = relationship("ValidationTask", backref=backref("validation_rule"))

In [7]:
class ValidationTask(Base):
    __tablename__ = "validation_task"

    id = Column(Integer, primary_key=True)
    start_date = Column(DateTime)
    end_date = Column(DateTime)
    dataset_id = Column(Integer, ForeignKey("dataset.id"))
    validation_rule_id = Column(Integer, ForeignKey("validation_rule.id"))
    task_status = Column(SmallInteger)
    output = Column(Text)
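
The one-to-many pattern used by the classes above (a `ForeignKey` on the child plus `relationship()` with a `backref` on the parent) can be tried in isolation with the sketch below. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database. `SketchBase`, `SketchCohort`, and `SketchDataset` are hypothetical, trimmed-down stand-ins, named differently so they do not shadow the notebook's own classes.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, backref, declarative_base, relationship

SketchBase = declarative_base()


class SketchCohort(SketchBase):  # hypothetical parent class
    __tablename__ = "sketch_cohort"

    id = Column(Integer, primary_key=True)
    cname = Column(String)
    # Collection of children; backref adds a .cohort attribute on each child.
    datasets = relationship("SketchDataset", backref=backref("cohort"))


class SketchDataset(SketchBase):  # hypothetical child class
    __tablename__ = "sketch_dataset"

    id = Column(Integer, primary_key=True)
    cohort_id = Column(Integer, ForeignKey("sketch_cohort.id"))
    name = Column(String)


sketch_engine = create_engine("sqlite://")  # in-memory database
SketchBase.metadata.create_all(sketch_engine)

with Session(sketch_engine) as session:
    cohort = SketchCohort(cname="demo")
    cohort.datasets.append(SketchDataset(name="ds-2020"))
    cohort.datasets.append(SketchDataset(name="ds-2021"))
    session.add(cohort)  # the children are saved along with the parent
    session.commit()

    # Load one child and navigate back to the parent through the backref.
    stored = session.query(SketchDataset).filter_by(name="ds-2021").one()
    parent_name = stored.cohort.cname
    sibling_names = sorted(d.name for d in stored.cohort.datasets)

print(parent_name, sibling_names)
```

Appending to `cohort.datasets` is enough to set `cohort_id` on each child when the session flushes; the foreign key value never has to be assigned by hand.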


In [8]:
Base.metadata.create_all(engine)

2022-12-12 09:40:26,176 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-12-12 09:40:26,177 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("cohort")
2022-12-12 09:40:26,197 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-12-12 09:40:26,200 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("cohort")
2022-12-12 09:40:26,201 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-12-12 09:40:26,201 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("dataset")
2022-12-12 09:40:26,202 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-12-12 09:40:26,203 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("dataset")
2022-12-12 09:40:26,217 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-12-12 09:40:26,217 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("descriptor")
2022-12-12 09:40:26,228 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-12-12 09:40:26,229 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("descriptor")
2022-12-12 09:40:26,233 INFO sqlalchemy.engine.Engine [raw sql