Generate pydantic model python code from jsonschema #643

Closed
timothyjlaurent opened this issue Jul 5, 2019 · 19 comments

@timothyjlaurent

timothyjlaurent commented Jul 5, 2019

Feature Request

Ability to generate pydantic models from jsonschema. swaggerapi/openapi-codegen libraries don't do a very good job creating client models that validate themselves -- I would like to be able to generate pydantic models from a jsonschema.


This issue relates specifically to pydantic model code generation, see this comment below.

@koxudaxi
Contributor

koxudaxi commented Jul 6, 2019

@timothyjlaurent
I agree too. It's a good idea for speeding up API development.
I had the same idea, and I'm writing a code generator. The repository was private until now.
I have opened the project as an example.
https://github.com/koxudaxi/datamodel_code_generator

But we should first add a feature that the code generator needs: #507
I'm making a pull request for that feature: #628

@samuelcolvin A code generator would be a great advantage for pydantic.
I will implement the code generator and upload it to PyPI, so it can be installed with pip, after the custom root type feature is merged.
However, I think it would be best if you built and distributed the code generator as an official project.

@samuelcolvin
Member

Since it's a standalone feature and won't affect anyone not using it, I'd be happy to accept a PR implementing a model generator.

@haizaar
Contributor

haizaar commented Jul 8, 2019

I'd love to have this feature!

/cc @motinani @worroc

@tiangolo
Member

I started a local experiment to generate Pydantic models from JSON Schema using Jinja2 templates, and used it in a recent project. It does not yet generate models that reference other models.

But I realized that a better approach would probably be to have two separate features (or packages, if they are not suitable for Pydantic itself):

  • From a Pydantic model in memory, e.g. created with create_model, generate the equivalent Python code as a text string e.g.:
from pydantic import BaseModel, Schema

class SomeDynamicModel(BaseModel):
    some_str_field: str
    some_int_field: int = Schema(..., title="Some title")
  • And from a JSON Schema input, generate a dynamic Pydantic model.

The two features combined would result in being able to generate Pydantic models from JSON Schema.

But the separated components could be extended to, e.g.:

  • Generate dynamic Pydantic models from DB (e.g. SQLAlchemy) models and then generate the Python code for those Pydantic models.
  • Generate dynamic Pydantic models (without Pydantic model code) directly from JSON Schema (might be useful for someone?)
  • Probably other use cases...

I'm just brainstorming here, but some of these ideas might be useful in case someone ( 😉 @koxudaxi ) implements it before me...

@koxudaxi
Contributor

@tiangolo
Thank you for sharing your idea.
I had not thought about using dynamic model creation with create_model.
I may use that approach to develop it.

Generate dynamic Pydantic models from DB (e.g. SQLAlchemy) models and then generate the Python code Pydantic models.

I agree. I guess the code generator needs a separate parser for each input type: OpenAPI, JSON Schema, SQL DDL, or SQLAlchemy models.

I will implement this experimental idea next time :)
parser --> pydantic.create_model --> code_generator
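
The parser stage of that pipeline could be sketched with a small intermediate representation that every parser targets. This is only an illustration under stated assumptions: `FieldSpec`, `ModelSpec`, `parse_json_schema`, `parse_openapi`, and `PARSERS` are hypothetical names, not part of pydantic or any existing generator, and only flat schemas with primitive types are handled. The later stages (create_model or code rendering) are left out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FieldSpec:
    """One model field, independent of the input format (hypothetical type)."""
    name: str
    type_name: str
    required: bool


@dataclass
class ModelSpec:
    """One model to generate (hypothetical type)."""
    name: str
    fields: List[FieldSpec] = field(default_factory=list)


# Illustrative mapping from JSON Schema primitive types to Python type names.
TYPE_NAMES = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def parse_json_schema(schema: Dict[str, Any]) -> List[ModelSpec]:
    """Parse one flat JSON Schema object into a single ModelSpec."""
    required = set(schema.get("required", []))
    spec = ModelSpec(name=schema.get("title", "Model"))
    for name, prop in schema.get("properties", {}).items():
        json_type = prop.get("type")
        # Anything that is not a known primitive type name falls back to Any.
        type_name = TYPE_NAMES.get(json_type, "Any") if isinstance(json_type, str) else "Any"
        spec.fields.append(FieldSpec(name, type_name, name in required))
    return [spec]


def parse_openapi(document: Dict[str, Any]) -> List[ModelSpec]:
    """Parse every schema under components/schemas of an OpenAPI 3 document."""
    schemas = document.get("components", {}).get("schemas", {})
    specs: List[ModelSpec] = []
    for name, schema in schemas.items():
        # The component key is used as the model name.
        specs.extend(parse_json_schema({**schema, "title": name}))
    return specs


# Each input format registers a parser that returns the same ModelSpec list.
PARSERS: Dict[str, Callable[[Dict[str, Any]], List[ModelSpec]]] = {
    "jsonschema": parse_json_schema,
    "openapi": parse_openapi,
}

specs = PARSERS["openapi"](
    {"components": {"schemas": {"Pet": {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
        "required": ["id"],
    }}}}
)
print(specs)
```

With this split, supporting SQL DDL or SQLAlchemy would mean adding another entry to `PARSERS` without touching whatever consumes the `ModelSpec` list.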

@samuelcolvin samuelcolvin added the schema JSON schema label Jul 19, 2019
@samuelcolvin
Member

samuelcolvin commented Jul 19, 2019

Okay, having thought about this more I think it should be outside pydantic, a separate package.

I'd be happy to link to it, but given the amount of interest in pydantic I'm already struggling to deal with the load. Since this is easily split, best to keep it completely separate.

Also if you're using jinja you'll have completely separate dependencies.

@samuelcolvin samuelcolvin removed the help wanted Pull Request welcome label Jul 19, 2019
@koxudaxi
Contributor

@samuelcolvin

I'd be happy to link to it, but given the amount of interest in pydantic I'm already struggling to deal with the load. Since this is easily split, best to keep it completely separate.

If I create the code generator, where should I upload it on GitHub?
To the pydantic repository?

Also if you're using jinja you'll have completely separate dependencies.

Do you think the code generator should not import the pydantic module?
We need a structure for building pydantic code, one that holds the class name, fields, and schema.
I think we can use pydantic.BaseModel and pydantic.fields.Field as that structure.

@samuelcolvin
Member

If I create the code generator, where should I upload it on GitHub?

Your own repo, and upload to pypi under your own account.

Do you think the code generator should not import the pydantic module?

I'm afraid I don't know what you mean.

There are two possible approaches, I have no preference on which (or both) to use:

  • Create a python object of a pydantic model representing a jsonschema using create_model
  • Create a file containing python code that defines a valid pydantic model - this might (or might not) use jinja to generate the code
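
The second approach could look roughly like the sketch below, which uses plain string formatting instead of jinja. It is a minimal illustration, not an existing tool: `render_model` and `PYTHON_TYPES` are hypothetical names, only flat schemas with primitive types are handled, and the schema title and property names are assumed to be valid Python identifiers. The generated text is only checked for valid syntax with `ast.parse`, so the sketch itself does not need pydantic installed.

```python
import ast
from typing import Any, Dict

# Illustrative mapping from JSON Schema primitive types to Python type names.
PYTHON_TYPES = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def render_model(schema: Dict[str, Any]) -> str:
    """Return Python source for a pydantic model matching a flat JSON Schema."""
    required = set(schema.get("required", []))
    lines = [
        "from typing import Any, Optional",
        "",
        "from pydantic import BaseModel",
        "",
        "",
        f"class {schema.get('title', 'Model')}(BaseModel):",
    ]
    properties = schema.get("properties", {})
    for name, prop in properties.items():
        json_type = prop.get("type")
        py_type = PYTHON_TYPES.get(json_type, "Any") if isinstance(json_type, str) else "Any"
        if name in required:
            lines.append(f"    {name}: {py_type}")
        else:
            # Fields that are not required get a None default in the generated code.
            lines.append(f"    {name}: Optional[{py_type}] = None")
    if not properties:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


source = render_model(
    {
        "title": "Pet",
        "type": "object",
        "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
        "required": ["id"],
    }
)
ast.parse(source)  # raises SyntaxError if the generated text is not valid Python
print(source)
```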

@koxudaxi
Contributor

Thank you for your reply.
I understand.

@elijahbenizzy

elijahbenizzy commented Feb 6, 2020

Hi folks! Has there been any progress on this?
Maybe my use-case is simpler -- using schema we can easily serialize a model's definition. Is there a way to deserialize it into a model class as well, e.g. a BaseModel.from_schema method?

@samuelcolvin
Member

No progress as far as I know.

Might be possible using create_model(), I would review a PR if someone worked on it.

@samuelcolvin samuelcolvin reopened this Feb 6, 2020
@afiorillo

Hi! I stumbled on this while researching for a project at work. I've built a proof of concept before (also using Jinja, but targeting marshmallow instead) and could maybe put out some kind of beta in the next week or so. If @koxudaxi has made substantial progress I would rather contribute to that... 😄 ?

@joshbode

joshbode commented Feb 7, 2020

I'm using @koxudaxi 's tool for an internal project (an openapi client) and it's been very useful!

@koxudaxi
Contributor

koxudaxi commented Feb 7, 2020

I implemented a JsonSchema parser in my code generator.
This means the tool has been able to generate pydantic models from JsonSchema since version 0.3.0.
That version was released on Jan 10, 2020.

https://github.com/koxudaxi/datamodel-code-generator
(I use the tool in a few internal projects)

@afiorillo
@joshbode has contributed a lot of PRs to my code-generator project. 😉
Your contributions are welcome too. They will help the pydantic community.

@samuelcolvin
Member

samuelcolvin commented Feb 7, 2020

There are two potential routes here:

  1. generating python code. This is done by https://github.com/koxudaxi/datamodel-code-generator and should remain outside the pydantic library. @koxudaxi feel free to add a note to the pydantic docs linking to this
  2. creating a dynamic pydantic model class, similar to what can currently be done via create_model(). This could potentially be added to the pydantic library if someone wants to work on it.

The original question here doesn't explain which approach they're looking for, but many of the comments indicate interest in 1. Therefore let's leave this issue to relate to code generation.

If you'd like to create dynamic pydantic models based on JsonSchema, please create a new issue.

@samuelcolvin samuelcolvin changed the title Feature Request - generate Pydantic models from jsonschema Generate pydantic model python code from jsonschema Feb 7, 2020
@elijahbenizzy

@samuelcolvin Thanks for following up! Those solutions make sense. The code generation solves a variety of use-cases (including ours). I'll think more about it and see how useful (2) would be, and create another issue if it's something we'd want to implement/need.

@muruges95

muruges95 commented Oct 30, 2023

Unsure if there is still any interest in the second route proposed by @samuelcolvin, but I managed to find a package that seems to do just that here: https://github.com/c32168/dyntamic, though it probably only supports very simple schemas for now.

@pietz

pietz commented Nov 4, 2023

Unsure if there is still any interest in the second route proposed by @samuelcolvin, but I managed to find a package that seems to do just that here: https://github.com/c32168/dyntamic, though it probably only supports very simple schemas for now.

Would love to see this integrated in pydantic. I'm working on something where users define models in a GUI. I want to store the model definition in a NoSQL database, which I can simply do by generating the schema. Going the other way, I also want to be able to create a pydantic model from that schema at runtime. I'm surprised this is such a niche feature.
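
A rough sketch of that round trip, assuming pydantic is installed and that every property is a required primitive: `model_from_schema` and `TYPES` are hypothetical names, and the JSON string stands in for whatever the database would store. Nested models, optional fields, and `$ref` are deliberately not handled.

```python
import json
from typing import Any, Dict

from pydantic import BaseModel, create_model

# Illustrative mapping from JSON Schema primitive types to Python types.
TYPES = {"string": str, "integer": int, "number": float, "boolean": bool}


class User(BaseModel):
    name: str
    age: int


# model_json_schema() is the pydantic v2 name; schema() is the v1 equivalent.
to_schema = getattr(User, "model_json_schema", None) or User.schema
stored = json.dumps(to_schema())  # the text that would be written to the database


def model_from_schema(schema: Dict[str, Any]):
    """Rebuild a model from a flat schema whose properties are all required primitives."""
    fields = {
        name: (TYPES[prop["type"]], ...)  # `...` marks the field as required
        for name, prop in schema["properties"].items()
    }
    return create_model(schema["title"], **fields)


Rebuilt = model_from_schema(json.loads(stored))
user = Rebuilt(name="Ada", age=36)
print(user)
```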

alexdrydew pushed a commit to alexdrydew/pydantic that referenced this issue Dec 23, 2023
@kaavee315

kaavee315 commented Mar 15, 2024

Gemini's code:

```python
from typing import Any, Dict, List, Optional, Type

from pydantic import BaseModel, Field, create_model


def json_schema_to_model(json_schema: Dict[str, Any]) -> Type[BaseModel]:
    """
    Converts a JSON schema to a Pydantic BaseModel class.

    Args:
        json_schema: The JSON schema to convert.

    Returns:
        A Pydantic BaseModel class.
    """

    # Extract the model name from the schema title. Nested objects often have
    # no title, and create_model() needs a string, so fall back to a generic name.
    model_name = json_schema.get('title', 'DynamicModel')

    # Extract the field definitions from the schema properties.
    required = json_schema.get('required', [])
    field_definitions = {
        name: json_schema_to_pydantic_field(name, prop, required)
        for name, prop in json_schema.get('properties', {}).items()
    }

    # Create the BaseModel class using create_model().
    return create_model(model_name, **field_definitions)


def json_schema_to_pydantic_field(name: str, json_schema: Dict[str, Any], required: List[str]) -> Any:
    """
    Converts a JSON schema property to a Pydantic field definition.

    Args:
        name: The field name.
        json_schema: The JSON schema property.
        required: Names of the required properties of the parent schema.

    Returns:
        A (type, Field) tuple as accepted by create_model().
    """

    # Get the field type.
    type_ = json_schema_to_pydantic_type(json_schema)

    # Get the field description.
    description = json_schema.get('description')

    # Get the field examples.
    examples = json_schema.get('examples')

    # Create a Field object with the description and examples.
    # Required fields get `...` as the default; all others default to None.
    return (type_, Field(description=description, examples=examples, default=... if name in required else None))


def json_schema_to_pydantic_type(json_schema: Dict[str, Any]) -> Any:
    """
    Converts a JSON schema type to a Pydantic type.

    Args:
        json_schema: The JSON schema to convert.

    Returns:
        A Pydantic type.
    """

    type_ = json_schema.get('type')

    if type_ == 'string':
        return str
    elif type_ == 'integer':
        return int
    elif type_ == 'number':
        return float
    elif type_ == 'boolean':
        return bool
    elif type_ == 'array':
        items_schema = json_schema.get('items')
        if items_schema:
            item_type = json_schema_to_pydantic_type(items_schema)
            return List[item_type]
        else:
            return List
    elif type_ == 'object':
        # Handle nested models.
        properties = json_schema.get('properties')
        if properties:
            nested_model = json_schema_to_model(json_schema)
            return nested_model
        else:
            return Dict
    elif type_ == 'null':
        return Optional[Any]  # Use Optional[Any] for nullable fields
    else:
        raise ValueError(f'Unsupported JSON schema type: {type_}')
```
Had to make a one line change
