Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[proposal] Models to Schemas #17

Closed
vitalik opened this issue Aug 28, 2020 · 9 comments
Closed

[proposal] Models to Schemas #17

vitalik opened this issue Aug 28, 2020 · 9 comments
Labels
design API design

Comments

@vitalik
Copy link
Owner

vitalik commented Aug 28, 2020

Read full proposal here - http://django-ninja.rest-framework.com/proposals/models/

To make it possible to convert django models to ninja Schemas:

from ninja import ModelSchema

class UserOut(ModelSchema):
    class Meta:
        model = User
        fields = ['email', 'first_name', 'last_name']

This feature will definitely land soon to production code

But there is just one open question:

configuration class - should it be Meta (django way) or Config (pydantic way) ?

@vitalik vitalik added the design API design label Aug 28, 2020
@vitalik vitalik pinned this issue Aug 28, 2020
@LowerDeez
Copy link

https://github.com/jordaneremieff/pydantic-django

@jordaneremieff
Copy link

Hi @vitalik,

I'm the maintainer of https://github.com/jordaneremieff/djantic (previously pydantic-django). There was previously an issue where someone brought up collaborating with this project, but at that time it was still very new/experimental so I had not pursued it further.

However, I've recently switched from experimental status to a more stable release, and I noticed you're doing some very similar work in this repo so I wanted to float the idea of collaboration at this time. I think the ModelSchema implementation provided in Djantic would work seamlessly with the other features of this library.

Would you like to share the effort here and become a co-maintainer of Djantic and it could serve as an upstream dependency to Django Ninja? My reasoning for proposing it as a dependency (rather than including it directly in this package itself) is because I think a package specific to the ORM could be used more generically in multiple frameworks and use-cases that aren't specifically related to JSON APIs.

Please let me know what you think, cheers.

@jdzero
Copy link

jdzero commented Apr 16, 2021

Hi @vitalik, @jordaneremieff, and everyone,

Thank you both for your respective efforts -- great stuff! As I have ebbed and flowed between Django/REST/Channels, FastAPI, Pydantic, Celery, and Kombu on the backend, and then embraced more use of async (e.g. websockets) on both GUI and non-GUI (e.g. CLI) frontends, the glaring problem is that "I want to declare my schema and core validation one time but I am not sure where to do it." Specifically, I would characterize the problem further as follows:

  1. Specification formats like OpenAPI, OpenRPC, and AsyncAPI are amazing, but they are often under-specified. What I mean there is you have a field with complex validation that is only performed on the backend (e.g. requires membership testing across a relationship, requires complex calculation/limit testing, etc.). The best thing they can do is be complete enough to tell a front-end "here is the schema and the fields, here are the relationship, here are the basic validations you can do pre-submission, and here are is the URL you can for a pre-submission backend validation test dry run."
  2. Database models (e.g. SQL Alchemy, Django ORM) are often over-specified. They have hyper-particular implementation characteristics that are no longer agnostic to data storage backend.

Thus, I think the Schema "Models" introduced by Pydantic are the best place to have the "single source of truth" exist, and we reverse the directionality of meta-inheritance.

Consider the following:

  • I have a frontend SPA consuming OpenAPI and AsyncAPI spec to auto-generate its forms/actions.
  • I have a backend serving REST APIs and websockets endpoint(s) advertising their specification via OpenAPI and AsyncAPI which uses a relational database and defers asynchronous tasking to worker microservices.
  • I have a microservice worker that interacts with the same relational database and receives its tasking over a message queue which also uses ~AsyncAPI for specification.
  • I have a microservice worker that interacts with a NoSQL database and receives its tasking over a message queue which uses ~AsyncAPI for specification.

From a DRY perspective, I would propose building as follows:

  • "schemas" - pydantic, can be used to easily generate OpenAPI and AsyncAPI specifications when implemented as an API, can also be used to implement any complex validation items in the abstract... checking membership, complex math... since it is backend code it can all be abstract implementation and tested. The schemas would be used by all three backend services. Critically, they have the core business logic defined one time. The concept of membership testing a post belongs to a blog can be abstractly implemented here, and then to use the abstract method one must implement it fully in models.
  • "models" - orm... django, sqlalchemy, nosql adapter, implementations of the schemas for data persistence purposes and concrete implementation-specific aspects of the abstract implementations from schemas... in this example, there would be two sets of models... one shared by the web server and first worker, a second only used by the second worker.
  • "controllers" - where API endpoints are built whether they talk HTTP or AMQP... inherit the business logic and data models from the other two and provide binding logic to realize all of it... think as ViewSets or ModelAdmin classes in DRF/Admin.

All put together:

schema.py

from __future__ import annotations
from typing import Any, List, Optional
from pydantic import BaseModel, Field, conint


class Pet(BaseModel):
    name: Optional[str] = None
    age: Optional[conint(ge=0)] = Field(None, description='Age in years.')


class Person(BaseModel):
    first_name: str = Field(..., description="The person's first name.")
    last_name: str = Field(..., description="The person's last name.")
    age: Optional[conint(ge=0)] = Field(None, description='Age in years.')
    pets: Optional[List[Pet]] = None
    comment: Optional[Any] = None

models.py

class Pet(Model):
    class Meta:
        schema = schemas.Pet


class Person(Model):
    class Meta:
        schema = schemas.Person

models.py equivalent (sorry for lazy psuedo-code)

class Pet(Model):
    name = CharField(max_length=100, empty=True)
    age = PositiveIntegerField(null=True)
    person = ForeignKey(to=Person, related_name='pets')


class Person(Model):
    first_name = CharField(max_length=100)
    last_name= CharField(max_length=100)
    age = PositiveIntegerField(null=True)
    comment = CharField(max_length=1000, empty=True)

I look forward to helping with the development of these efforts. Thank you all for your consideration and feedback!

@vitalik
Copy link
Owner Author

vitalik commented Apr 18, 2021

Hi @jdzero
Thats interesting approach !

But in my practice (and what I see from other people feedbacks) it's usually the other way round

one django model - can produce multiple pydantic schemes:

  • for list (where you need maybe only title + id)
  • for details - where you need ALL fields (and maybe even deep related ones)
  • for creation (where you need to have at least id to be readonly field)
  • path (where all fields are optional)

on the other hand your approach can just a plugin/external library - pretty easy implementable..

@vitalik
Copy link
Owner Author

vitalik commented Apr 18, 2021

Hi @vitalik,

I'm the maintainer of https://github.com/jordaneremieff/djantic (previously pydantic-django). There was previously an issue where someone brought up collaborating with this project, but at that time it was still very new/experimental so I had not pursued it further.
...
... Djantic and it could serve as an upstream dependency to Django Ninja?

Well that sounds good, but there is one slight problem that I do not see how to overcome

a very "core" of the django ninja is the Schema class inherited from pydantic basemodel and tied to a lot of parts of django ninja

Even the naming (schema) is pretty important to avoid confusion between django models and pydantic models...

maybe the class can be come dependency, but alias/naming not sure...

@jdzero
Copy link

jdzero commented Apr 19, 2021

@vitalik

Thanks for the feedback! I see what you are saying.

Some terms (maybe helps with other topic as well):

  • (Data storage) schemas are the fundamental fields (required and optional) and validation which form the basis for data models as implemented by an ORM. This is what I was trying to propose is used to automate data model creation.
  • (Data/Database) models are implementations of data storage schemas. Today, we do this concurrently by making a data model and specifying the fields and validation on it, but what pydantic provides that Django and SQLAlchemy do not is a data storage-agnostic way to do the field/validation part.
  • (API/view/presentation) schemas (e.g. DRF serializers) are per-action data expectations. This is the focus of your effort is to make this quicker/DRY-er by inheriting from model.
  • (API) specification is a roll-up of a particular set of schemas, addresses, and actions for a server and technology class (e.g. REST API, Websockets, Message Queue).

The goal of your effort is to alleviate the effort of going from data models to API schemas, whereas what I am discussing is a new body of work upstream that gets you from data storage schemas into data models.

Thanks again for your time, and please let me know if I can assist if you move out on any of these efforts!

@jordaneremieff
Copy link

jordaneremieff commented Apr 20, 2021

Hi @vitalik

a very "core" of the django ninja is the Schema class inherited from pydantic basemodel and tied to a lot of parts of django ninja

Yeah, this is an important detail, and I've given it a lot of thought previously - eventually I landed on ModelSchema. I think ModelSchema would be a clear distinction between the Schema class and the BaseModel classes and be easy to understand in this context. A similar naming convention and use-case would be the ModelSerializer class used in DRF where the primary difference is that one of these classes has a model configured specifically in a custom subclass and the other does not.

Another idea might also be to move the implementation of the existing Schema here into Djantic to provide a common interface, and the ModelSchema that handles specific model configuration would be implemented on top of it as subclass - maybe we can do a PoC in Djantic to see how this might look?

@blazing-gig
Copy link

blazing-gig commented May 28, 2021

Hi @vitalik,

I've been working extensively with various Django based backend architectures and finally decided to give django-ninja a try. Seems pretty simple and intuitive and resembles a lot of FastAPI paradigms as quoted. After writing a bit of code, I found that having a pydantic schema centric approach kinda made more sense.

Upon reading this discussion thread, I found that it was cited by @jdzero as well. To add to that, I was also considering making the ORM queries natively serialize into Pydantic schema instances rather than instances of models.Model. This is roughly illustrated below:

class Pet:
    name = CharField(max_length=100, empty=True)
    age = PositiveIntegerField(null=True)

class PetSchema(BaseModel):
    name: Optional[str] = None
    age: Optional[conint(ge=0)] = Field(None, description='Age in years.')
    class Config:
        objects = Pet.objects
...

qs: QuerySet[PetSchema] = PetSchema.objects.all()

The advantages of doing something like the above are as follows:

  1. Only one data container to pass/reference DB data in code. Otherwise, there are models.Model instances and ModelSchema instances which can both act as data containers for passing around DB data.
  2. The overhead of serialising data to model.Model instances and then to ModelSchema instances is avoided since there is a significant performance loss in doing both these steps as opposed to just one level of serialisation (from raw DB dicts to ModelSchema instances).

But in order to implement the above, support for serialising to a different type should be baked into the core Django ORM code.


On the other hand, considering that the Django convention over the years has been to have models.Model as the source of truth and serializers slice out different views of the same (as pointed out by @vitalik) , I don't think something like the above can be adopted that easily by the community.

To answer the main topic of discussion, personally I'd prefer explicitly declaring the ModelSchema over auto-generation since I have more fine grain control of how it should behave and also the type hints that any IDE might provide. Also, the Meta key seems to fit the use-case well unless developers are allowed to pass in keys of the Config property of pydantic models directly.

Also, since both model.Model and ModelSchema classes will co-exist, serialization from one type to the other has to be super fast. For this reason, I've submitted a feature request to pydantic to include a skip_validation flag in the .from_orm method since this might significantly boost performance while converting from an already existing models.Model instance to the ModelSchema instance.

Let me know your thoughts on the same. Thanks.

@radusuciu
Copy link

radusuciu commented Jul 9, 2021

I'm looking into alternatives to DRF and just wanted to add my 2 cents. I definitely like the idea of leveraging the work already done in the djantic project -- seems like a very natural fit that would avoid duplicating efforts and make it easier to achieve full support for the many features of Django's models (eg. jordaneremieff/djantic#17, and jordaneremieff/djantic#3).

As to the question posed in this issue, I would opt for Config, to add another hint that these classes are not Django models.

@vitalik vitalik closed this as completed Sep 19, 2021
@vitalik vitalik unpinned this issue Sep 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design API design
Projects
None yet
Development

No branches or pull requests

6 participants