ORM Models and Schema Models
ORM Models describe the structure of entity objects that are persisted to a database or other persistent storage. SQLAlchemy defines the syntax for ORM Models.
Schema Models describe how entity objects are represented. Schema models also provide validation rules and serialization/deserialization. Pydantic defines the syntax for Schema models.
You can populate an ORM Model from a Schema Model and vice versa.
Validation can be done by Pydantic or SQLAlchemy.
The SQLAlchemy 2.0 official style is:

    from sqlalchemy import ForeignKey, String
    from sqlalchemy.ext.asyncio import AsyncAttrs
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

    class Base(AsyncAttrs, DeclarativeBase):
        pass

    class User(Base):
        __tablename__ = "users"
        id: Mapped[int] = mapped_column(primary_key=True)
        username: Mapped[str] = mapped_column(String(50), nullable=False)

    class UserPassword(Base):
        __tablename__ = "user_passwords"
        # every mapped class needs a primary key
        id: Mapped[int] = mapped_column(primary_key=True)
        hashed_password: Mapped[str] = mapped_column(String, nullable=False)
        user_id: Mapped[int] = mapped_column(ForeignKey("users.id", ondelete="CASCADE"), nullable=False)

Inferred values can be omitted, such as Integer for the primary key in the code above.
You can omit mapped_column entirely and accept the inferred or default values, as in:

    username: Mapped[str]  # inferred column type is `String`

Explanation

- `Mapped[int]` is a generic marker meaning "this is an ORM-mapped attribute".
- `mapped_column(...)` creates a Column object under the hood, but is designed to work with Python typing.
- In `mapped_column` it is not necessary to specify the datatype if it can be inferred from the type hint.
- The entire `mapped_column` can be omitted if you want the default datatype and inferred properties, e.g. `username: Mapped[str]` is equivalent to `mapped_column(String, nullable=False)`. Nullability follows the type hint: use `Mapped[Optional[str]]` for a nullable column.
- Specify the data type if you need to add detail (`String(50)` or `DateTime(timezone=True)`), disambiguate, or add options: `created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False)`
Prior to SQLAlchemy 2.0, ORM table models were written as:

    from sqlalchemy import Column, Integer, String
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class User(Base):
        __tablename__ = "users"
        id = Column(Integer, primary_key=True, nullable=False)
        username = Column(String(50), unique=True, nullable=False)

Reasons for the change in SQLAlchemy 2.0:
- Type hints to enable type checking tools
- "Clarity" (ha!)
You can have multiple Pydantic schema models for one persistence model. Each schema provides a subset of attributes for a particular purpose.
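    from datetime import datetime, timezone
    from typing import Optional

    import pydantic
    from pydantic import ConfigDict, EmailStr, Field

    class UserCreate(pydantic.BaseModel):
        """User attributes that are given to a service endpoint to create a new User entity."""
        email: EmailStr
        username: Optional[str] = None

    class User(UserCreate):
        """The complete User schema."""
        id: int
        # In model classes, these default to current time.
        # default_factory is evaluated per instance; a plain datetime.now(...)
        # default would be evaluated only once, when the class is defined.
        created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
        updated_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
        # model_config replaces the Config inner-class in Pydantic 2.0
        model_config = ConfigDict(from_attributes=True)

or:

        model_config = ConfigDict(from_attributes=True, model_class="User")

Note that `model_class` does not appear among the `ConfigDict` options documented for Pydantic v2, so the first form is the safer choice.

Any of these techniques can be used to populate an ORM model from a schema model.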
1. Use `**user_data.model_dump()` (Pydantic v2) or `**user_data.dict()` (Pydantic v1; deprecated in v2)

    def save_user(user_data: schemas.UserCreate):
        user = models.User(**user_data.model_dump())

- This assumes attribute names in the `UserCreate` schema match those in the `User` model.
2. Explicit Assignment of Attributes

    def save_user(user_data: schemas.UserCreate):
        user = models.User(
            username=user_data.username,
            email=user_data.email
        )

- Requires manual updating if new attributes are added to the model and schema.
3. Factory method in User model or UserCreate schema

Define your own method to perform the conversion. Models should not depend on schemas, so put the method in the UserCreate schema or a separate factory class:

    def save_user(user_data: schemas.UserCreate):
        user = user_data.as_model()

    # schemas class
    class UserCreate(BaseModel):
        def as_model(self) -> models.User:
            return models.User(username=self.username, email=self.email)

Validation is done by Pydantic.
- Schema classes automatically apply validation rules when you create a new schema object, but by default not when you assign a new value to an attribute of an existing object.

      import schemas
      user = schemas.UserCreate(username="Santa", email="santa@xmas.org")
      # but doesn't validate email here:
      user.email = "santa@"

- **`model_validate(obj)`** class method validates the parameter and returns a new Pydantic model instance. `obj` can be a model, a dict, or another schema object. Reading from a model or other object relies on `from_attributes=True` in the schema's `model_config`.

      import schemas
      user_in = schemas.UserCreate(username="Santa", email="santa@xmas.org")
      # validate & create a different schema object (User);
      # User also requires `id`, so it is supplied here
      user = schemas.User.model_validate({**user_in.model_dump(), "id": 1})
      data = {'id': 2, 'username': 'harry', 'email': 'hackers@com'}
      user = schemas.User.model_validate(data)  # raises ValidationError because 'email' is malformed

- **`model_validate_json(json_data)`** validates JSON data against a schema class and returns an instance of the schema class.
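The three points above can be tried out with a small, self-contained sketch. The `Signup` and `StrictSignup` schemas are hypothetical; `validate_assignment=True` is the Pydantic v2 config option that turns on validation for attribute assignment, which is off by default.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Signup(BaseModel):
    """Hypothetical schema: validated on creation only."""
    username: str
    age: int


class StrictSignup(Signup):
    """Same fields, but assignments are validated too."""
    model_config = ConfigDict(validate_assignment=True)


# Validation runs at construction time.
signup = Signup(username="santa", age=42)
# By default, assignment is not validated, so a bad value slips through.
signup.age = "not a number"

strict = StrictSignup(username="santa", age=42)
try:
    strict.age = "not a number"
    assignment_rejected = False
except ValidationError:
    assignment_rejected = True

# model_validate takes a dict (or object); model_validate_json takes a JSON string.
from_dict = Signup.model_validate({"username": "harry", "age": 7})
from_json = Signup.model_validate_json('{"username": "harry", "age": 7}')
```

Whether to enable `validate_assignment` is a design choice: it catches bad updates early at the cost of running validators on every assignment.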
Suppose user is an existing SQLAlchemy model (models.User) and user_data is a schema instance (schemas.User).
To update only explicitly set fields:

    update_data = user_data.model_dump(exclude_unset=True)
    for field, value in update_data.items():
        setattr(user, field, value)

The difference between back_populates and backref in SQLAlchemy ORM relationships is in how the bidirectional relationship is declared and controlled.
Both are used to create two-way navigation between ORM objects, but they differ in:
| Feature | back_populates | backref |
|---|---|---|
| Explicit declaration | You must declare relationships on both sides | Declares both sides in one place |
| Control | Greater control; each side can have its own config | Less control; both sides share config |
| Clarity | More explicit; easier to read in large schemas | More concise; good for simple relationships |
| Customization | Each side can have independent cascade, lazy, etc. | Must use backref() to customize both sides |
    from sqlalchemy import ForeignKey, Integer
    from sqlalchemy.orm import mapped_column, relationship

    class User(Base):
        __tablename__ = 'users'
        id = mapped_column(Integer, primary_key=True)
        user_password = relationship("UserPassword", back_populates="user")

    class UserPassword(Base):
        __tablename__ = 'user_passwords'
        id = mapped_column(Integer, primary_key=True)
        user_id = mapped_column(ForeignKey("users.id"))
        user = relationship("User", back_populates="user_password")

This requires you to define the relationship on both sides, and each side can have its own configuration (e.g., lazy, cascade, uselist, etc.).
    from sqlalchemy import Integer
    from sqlalchemy.orm import backref, mapped_column, relationship

    class User(Base):
        __tablename__ = 'users'
        id = mapped_column(Integer, primary_key=True)
        user_password = relationship("UserPassword", backref="user")

This automatically creates the user attribute on UserPassword behind the scenes.
If you want to customize the other side, use backref(...) with parameters:

    user_password = relationship(
        "UserPassword",
        backref=backref("user", lazy="joined", cascade="all, delete")
    )

| Use back_populates when... |
|---|
| You want fine-grained control over each side |
| You need to customize lazy, cascade, or uselist differently |
| You want explicit clarity in model definitions |
| Use backref when... |
|---|
| You want concise code for simple bidirectional relationships |
| Both sides can share the same configuration |
- `back_populates` is explicit, flexible, and preferred in complex or production-grade code.
- `backref` is concise and automatic, suitable for simpler cases.
- Do not use both together on the same relationship. Use one or the other.