## Working with Database Metadata

With engines and SQL execution covered, we are ready to begin some `Alchemy`. The central element of both `SQLAlchemy Core` and `ORM` is the `SQL Expression Language`, which allows for fluent, composable construction of SQL queries. The foundation for these queries is _Python objects that represent database concepts like tables and columns_. These objects are known collectively as __database metadata__.

The most common foundational objects for database metadata in SQLAlchemy are `MetaData`, `Table`, and `Column`. The sections below illustrate how these objects are used in both a Core-oriented style and an ORM-oriented style.

#### Setting up MetaData with Table objects

When we work with a relational database, the basic structure that we create and query from is known as a `table`. In SQLAlchemy, the `table` is represented by a Python class similarly named `Table`.

To start using the `SQLAlchemy Expression Language`, we will want to have `Table` objects constructed that represent all of the database tables we are interested in working with. Each `Table` may be __declared__, meaning we _explicitly spell out in source code what the table looks like_, or may be __reflected__, which means we _generate the object based on what’s already present in a particular database_. The two approaches can also be blended in many ways.

Whether we _declare_ or _reflect_ our tables, we start with a collection in which to place them, known as the `MetaData` object. This object is essentially a `facade around a Python dictionary` that stores a series of `Table` objects keyed to their string names.
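As a rough sketch of the _reflected_ approach and of the dictionary-like behaviour of `MetaData`, the snippet below creates a table with plain SQL and then asks SQLAlchemy to build the `Table` object from the database. The table name `some_table`, its columns, and the throwaway in-memory engine are assumptions made only for this illustration.

```python
from sqlalchemy import MetaData, Table, create_engine, text

# Throwaway in-memory SQLite database, used only for this illustration.
engine = create_engine("sqlite://")

# Create a table with plain SQL so that there is something to reflect.
# "some_table" and its columns are hypothetical names.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE some_table (x INTEGER, y INTEGER)"))

metadata_obj = MetaData()

# Reflected: the Table is generated from what the database reports,
# rather than being spelled out in source code.
some_table = Table("some_table", metadata_obj, autoload_with=engine)

# Column names discovered by reflection.
print(some_table.c.keys())

# The MetaData collection stores the Table under its string name.
print(metadata_obj.tables["some_table"] is some_table)
```

The declared approach, used throughout the rest of this section, replaces `autoload_with` with explicit `Column` objects.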

In [14]:
from sqlalchemy import MetaData, Table, Column, Integer, String, ForeignKey, create_engine
from sqlalchemy.orm import registry, relationship

In [2]:
engine = create_engine("sqlite+pysqlite:///:memory:", echo=True, future=True)

In [3]:
metadata_obj = MetaData()

Having `a single MetaData object` for an entire application is the most common case, represented as a _module-level variable_ in a single place in an application, often in a __models__ or __dbschema__ type of package. There can be `multiple MetaData collections` as well; however, it is typically _most helpful if a series of Table objects that are related to each other belong to a single MetaData collection_.

_Once we have a `MetaData` object, we can declare some `Table` objects_. This tutorial will start with the classic SQLAlchemy tutorial model, that of the table `user_account`, which would for example _represent the users of a website_, and the table `address`, representing a list of _email addresses associated with rows in the `user_account` table_. We normally assign each `Table` object to a variable, which is how we refer to the table in application code.

In [4]:
user_table = Table(
    "user_account", metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
    Column("fullname", String),
)

We can observe that the above `Table` construct looks a lot like a `SQL CREATE TABLE statement`, starting with the _table name, then listing out each column, where each column has a name and a datatype_. The objects we use above are:

* __`Table`__ - represents a database table and assigns itself to a `MetaData` collection.

* __`Column`__ - represents a column in a database table, and assigns itself to a `Table` object. The `Column` usually includes _a string name and a type object_. The collection of `Column` objects belonging to the parent `Table` is typically accessed via an associative array located at `Table.c`.

In [5]:
user_table.c.name

Column('name', String(length=30), table=<user_account>)

In [6]:
user_table.c.keys()

['id', 'name', 'fullname']

* __`Integer, String`__ - these classes represent `SQL datatypes` and can be passed to a `Column` _with or without being instantiated_. Above, we want to give a length of `30` to the `name` column, so we instantiated `String(30)`. For `id` and `fullname` we did not need to pass any arguments, so we can send the class itself.
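The relationships between the three objects listed above can be checked directly. The following is a minimal, self-contained sketch that repeats the `user_account` declaration and then inspects it; nothing here touches a database.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table

metadata_obj = MetaData()
user_table = Table(
    "user_account",
    metadata_obj,
    Column("id", Integer, primary_key=True),  # type passed as a class
    Column("name", String(30)),               # type passed as an instance
    Column("fullname", String),
)

# The Table registered itself with the MetaData under its string name.
print(metadata_obj.tables["user_account"] is user_table)

# Table.c supports both attribute access and dictionary-style access.
print(user_table.c.name is user_table.c["name"])

# A type passed as a class is instantiated with default arguments.
print(isinstance(user_table.c.id.type, Integer))
print(user_table.c.name.type.length)
print(user_table.c.fullname.type.length)
```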

<!-- create table model with declaration strategy -->

#### Declaring Simple Constraints

The first Column in the above `user_table` includes the `Column.primary_key` parameter which is a shorthand technique of indicating that this `Column` should be part of the _primary key_ for this table. The _primary key_ itself is normally declared implicitly and is represented by the `PrimaryKeyConstraint` construct, which we can see on the `Table.primary_key` attribute on the `Table` object.

In [7]:
user_table.primary_key

PrimaryKeyConstraint(Column('id', Integer(), table=<user_account>, primary_key=True, nullable=False))
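Because `Column.primary_key` marks a column as _part of_ the primary key, flagging more than one column should yield a single composite `PrimaryKeyConstraint`. The sketch below assumes a hypothetical `order_item` table that is not part of this tutorial's model.

```python
from sqlalchemy import Column, Integer, MetaData, PrimaryKeyConstraint, Table

metadata_obj = MetaData()

# Hypothetical table: both flagged columns join one primary key.
order_item = Table(
    "order_item",
    metadata_obj,
    Column("order_id", Integer, primary_key=True),
    Column("item_id", Integer, primary_key=True),
    Column("quantity", Integer),
)

pk = order_item.primary_key
print(isinstance(pk, PrimaryKeyConstraint))

# Names of the columns that make up the implicit constraint.
pk_column_names = sorted(col.name for col in pk.columns)
print(pk_column_names)

# Primary key columns default to NOT NULL; other columns stay nullable.
print(order_item.c.order_id.nullable, order_item.c.quantity.nullable)
```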

The constraint that is most typically declared explicitly is the `ForeignKeyConstraint` object that corresponds to a database _foreign key constraint_. When we declare tables that are related to each other, SQLAlchemy uses the presence of these _foreign key constraint_ declarations not only so that they are emitted within `CREATE statements` to the database, but also to assist in `constructing SQL expressions`.

A `ForeignKeyConstraint` that involves only a single column on the target table is typically declared using a _column-level shorthand notation_ via the `ForeignKey` object. Below we declare a second table `address` that will have a _foreign key constraint_ referring to the `user_account` table.

In [8]:
address_table = Table(
    "address", metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("user_id", ForeignKey("user_account.id"), nullable=False),
    Column("email_address", String, nullable=False)
)

The table above also features a third kind of _constraint_, which in SQL is the _`NOT NULL` constraint_, indicated above using the `Column.nullable` parameter.

When using the `ForeignKey` object within a `Column` definition, _we can omit the datatype for that `Column`_; it is __automatically inferred__ from that of the related column, in the above example the `Integer` datatype of the `user_account.id` column.
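The effects described above (datatype inference, the `NOT NULL` flag, and the use of foreign keys when constructing SQL expressions) can be observed without a database. This is a small sketch that redeclares both tables so it runs on its own; the variable names `fk` and `join_sql` are introduced only for illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table

metadata_obj = MetaData()
user_table = Table(
    "user_account",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
)
address_table = Table(
    "address",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    # No datatype given: it is taken from user_account.id.
    Column("user_id", ForeignKey("user_account.id"), nullable=False),
    Column("email_address", String, nullable=False),
)

# The ForeignKey resolves to the actual Column object it refers to.
fk = next(iter(address_table.c.user_id.foreign_keys))
print(fk.column is user_table.c.id)

# The omitted datatype was inferred from the referenced column.
print(isinstance(address_table.c.user_id.type, Integer))

# Column.nullable=False corresponds to NOT NULL.
print(address_table.c.user_id.nullable)

# The foreign key lets join() work out the ON clause by itself.
join_sql = str(user_table.join(address_table))
print(join_sql)
```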

#### Emitting DDL to the Database

We've constructed a fairly elaborate object hierarchy to represent _two_ database tables, starting at the root `MetaData` object, then into _two_ `Table` objects, each of which holds onto _a collection of `Column` and `Constraint` objects_. This object structure will be at the center of most operations we perform with both _Core_ and _ORM_ going forward.

The first useful thing we can do with this structure is to emit `CREATE TABLE statements`, or `DDL`, to our `SQLite database` so that we can _insert_ and _query_ data from the tables. We already have all the tools needed to do so: we invoke the `MetaData.create_all()` method on our `MetaData`, sending it the `Engine` that refers to the target database.

In [9]:
metadata_obj.create_all(engine)

2022-09-20 06:31:08,879 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-09-20 06:31:08,881 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("user_account")
2022-09-20 06:31:08,882 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-09-20 06:31:08,884 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("user_account")
2022-09-20 06:31:08,886 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-09-20 06:31:08,887 INFO sqlalchemy.engine.Engine PRAGMA main.table_info("address")
2022-09-20 06:31:08,888 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-09-20 06:31:08,890 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info("address")
2022-09-20 06:31:08,891 INFO sqlalchemy.engine.Engine [raw sql] ()
2022-09-20 06:31:08,894 INFO sqlalchemy.engine.Engine 
CREATE TABLE user_account (
	id INTEGER NOT NULL, 
	name VARCHAR(30), 
	fullname VARCHAR, 
	PRIMARY KEY (id)
)


2022-09-20 06:31:08,894 INFO sqlalchemy.engine.Engine [no key 0.00069s] ()
2022-09-20 06:31:08,896 INFO sqlalchemy.engine.Engine 
C

The `DDL` create process by default includes some `SQLite-specific PRAGMA statements` that _test for the existence of each table before emitting a CREATE_. The full series of steps is also included _within a BEGIN/COMMIT pair_ to accommodate `transactional DDL` _(`SQLite` does actually support `transactional DDL`; however, the `sqlite3` database driver historically runs DDL in `autocommit mode`)_.
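One practical consequence of the existence check is that `MetaData.create_all()` can be called more than once without raising an error for tables that already exist. A minimal sketch, assuming a fresh in-memory SQLite engine and a cut-down `user_account` table:

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table, create_engine, inspect
)

# Throwaway in-memory database for this illustration only.
engine = create_engine("sqlite://")

metadata_obj = MetaData()
Table(
    "user_account",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
)

# First call: the table is missing, so CREATE TABLE is emitted.
metadata_obj.create_all(engine)

# Second call: the existence check finds the table and skips it.
# checkfirst=True is the default and is spelled out here for clarity.
metadata_obj.create_all(engine, checkfirst=True)

table_names = inspect(engine).get_table_names()
print(table_names)
```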

The create process also takes care of emitting `CREATE statements` in the correct order; above, the `FOREIGN KEY constraint` is dependent on the `user_account` table existing, so the `address` table is created second. In more complicated dependency scenarios the `FOREIGN KEY constraints` may also be applied to tables after the fact using `ALTER`.

The `MetaData` object also features a `MetaData.drop_all()` method that emits `DROP statements` in the __reverse order__ of the `CREATE` statements in order to _drop schema elements_.
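The dependency ordering can be inspected through `MetaData.sorted_tables`, which lists tables so that referenced tables come before the tables that refer to them. In the sketch below the tables are deliberately declared in the "wrong" order to show that declaration order does not matter; the engine and the variable names `create_order`, `created` and `remaining` are assumptions for illustration.

```python
from sqlalchemy import (
    Column, ForeignKey, Integer, MetaData, Table, create_engine, inspect
)

engine = create_engine("sqlite://")
metadata_obj = MetaData()

# Declared first, even though it depends on user_account.
address_table = Table(
    "address",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("user_id", Integer, ForeignKey("user_account.id"), nullable=False),
)
user_table = Table(
    "user_account",
    metadata_obj,
    Column("id", Integer, primary_key=True),
)

# Tables sorted by foreign key dependency.
create_order = [table.name for table in metadata_obj.sorted_tables]
print(create_order)

metadata_obj.create_all(engine)
created = sorted(inspect(engine).get_table_names())
print(created)

# drop_all() removes the same tables, dependents first.
metadata_obj.drop_all(engine)
remaining = inspect(engine).get_table_names()
print(remaining)
```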

##### Migration tools are usually appropriate

Overall, the `CREATE/DROP` feature of `MetaData` is useful for test suites, small and/or new applications, and applications that use short-lived databases. For management of an application database schema over the long term however, a _schema management tool_ such as [`Alembic`](https://alembic.sqlalchemy.org/en/latest/), which builds upon `SQLAlchemy`, is likely a better choice, as it can __manage and orchestrate__ the process of incrementally _altering a fixed database schema over time_ as the design of the application changes.

#### Defining Table Metadata with the ORM

This `ORM-only section` will provide an example declaring the _same database structure illustrated in the previous section_, using a more __ORM-centric configuration paradigm__. When using the ORM, the process by which we declare `Table` metadata is usually combined with the process of _declaring mapped classes_. The mapped class is any `Python class` we'd like to create, which will then have _attributes on it that will be linked to the columns in a database table_. While there are _a few varieties_ of how this is achieved, the most common style is known as __declarative__, and allows us to declare our _user-defined classes_ and `Table` metadata at once.

##### Setting up the Registry

When using the `ORM`, the `MetaData` collection remains present; however, it is itself contained within an `ORM-only object` known as the `registry`. We create a `registry` by constructing it.

In [11]:
mapped_registry = registry()

The above `registry`, when constructed, _automatically_ includes a `MetaData` object that will store a collection of `Table` objects.

In [12]:
mapped_registry.metadata

MetaData()

Instead of declaring `Table` objects directly, we will now declare them indirectly through directives applied to our mapped classes. In the most common approach, each mapped class _descends from a common base class_ known as the `declarative base`. We get a new declarative base from the `registry` using the `registry.generate_base()` method.

In [13]:
Base = mapped_registry.generate_base()

The steps of creating the `registry` and `declarative base` classes can be _combined_ into one step using the _historically_ familiar `declarative_base()` function:

```
from sqlalchemy.orm import declarative_base
Base = declarative_base()
```
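To make the relationship between the `registry`, its `MetaData`, and the declarative base concrete, the following sketch compares the two ways of obtaining a base class. The names `LegacyBase`, `shared_metadata` and `shared_registry` are hypothetical and exist only for this illustration.

```python
from sqlalchemy import MetaData
from sqlalchemy.orm import declarative_base, registry

# Two-step form: build the registry, then ask it for a base class.
mapped_registry = registry()
Base = mapped_registry.generate_base()
print(Base.registry is mapped_registry)
print(Base.metadata is mapped_registry.metadata)

# One-step form: declarative_base() creates a registry behind the scenes.
LegacyBase = declarative_base()
print(isinstance(LegacyBase.registry, registry))
print(LegacyBase.metadata is LegacyBase.registry.metadata)

# A registry can also be handed an existing MetaData collection,
# for example one that already holds Core Table objects.
shared_metadata = MetaData()
shared_registry = registry(metadata=shared_metadata)
print(shared_registry.metadata is shared_metadata)
```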

##### Declaring Mapped Classes

The `Base` object above is a Python class which will serve as the _base class for the ORM mapped classes_ we declare. We can now define _ORM mapped classes_ for the `user_account` and `address` tables in terms of new classes `User` and `Address`.

In [15]:
class User(Base):
    __tablename__ = "user_account"
    
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)
    
    addresses = relationship("Address", back_populates="user")
    
    def __repr__(self):
        return f"User(id={self.id!r}, name={self.name!r}, fullname={self.fullname!r})"

In [16]:
class Address(Base):
    __tablename__ = "address"
    
    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey("user_account.id"))
    
    user = relationship("User", back_populates="addresses")
    
    def __repr__(self):
        return f"Address(id={self.id!r}, email_address={self.email_address!r})"

The above two classes are now our mapped classes, and are available for use in `ORM persistence and query operations`, which will be described later. But they also include `Table` objects that were generated as part of the _declarative mapping process_, and are equivalent to the ones that we declared directly in the previous Core section. We can see these `Table` objects from a declarative mapped class using the `.__table__` attribute.

In [17]:
User.__table__

Table('user_account', MetaData(), Column('id', Integer(), table=<user_account>, primary_key=True, nullable=False), Column('name', String(length=30), table=<user_account>), Column('fullname', String(), table=<user_account>), schema=None)

This `Table` object was generated from the _declarative process_ based on the `.__tablename__` attribute defined on each of our classes, as well as through the use of `Column` objects assigned to _class-level attributes_ within the classes. These `Column` objects can usually be declared _without an explicit `name` field_ inside the constructor, as the _Declarative process_ will name them __automatically based on the attribute name__ that was used.
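The automatic naming can be verified on a cut-down mapped class. The sketch below is self-contained and omits the `relationship()` attributes for brevity, so it is a simplified stand-in for the `User` class above rather than a copy of it.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "user_account"

    # No name is passed to Column(); the attribute name is used.
    id = Column(Integer, primary_key=True)
    name = Column(String(30))
    fullname = Column(String)


user_table = User.__table__

# The table name comes from __tablename__.
print(user_table.name)

# Column names were taken from the class-level attribute names.
print(user_table.c.keys())

# The generated Table lives in the MetaData attached to the base class.
print(Base.metadata.tables["user_account"] is user_table)
```

Since the generated `Table` belongs to `Base.metadata`, the same `create_all()` call shown in the Core section can be applied to that collection.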