# Schema to Model - Dynamic Pydantic Model Generation

The `load_pydantic_model_from_schema` function dynamically creates Pydantic models from JSON schemas using `datamodel-code-generator`.

**Core Features:**
- **Schema Flexibility**: Accept JSON schema as string or dict
- **Automatic Model Naming**: Extract model name from schema title or use default
- **Type Safety**: Generate fully-typed Pydantic v2 models
- **Python Version Support**: Target specific Python versions (3.11-3.14)
- **Dynamic Loading**: Import and instantiate generated models at runtime
- **Validation Ready**: Models support full Pydantic validation
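The "Dynamic Loading" feature means importing generated source code at runtime. The sketch below shows that general technique with the standard library only. It is not the `lionherd-core` implementation: `load_class_from_source` is a hypothetical helper, and `GENERATED_SOURCE` is a hand-written stand-in (a plain class rather than a Pydantic model) for what a code generator would emit.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hand-written stand-in for generated code (assumption: a real generator
# would emit a Pydantic model here instead of a plain class).
GENERATED_SOURCE = '''
class User:
    def __init__(self, name, email, age=None):
        self.name = name
        self.email = email
        self.age = age
'''


def load_class_from_source(source: str, class_name: str) -> type:
    """Write source to a temporary module file, import it, return one class."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        module_path = Path(tmp_dir) / "generated_module.py"
        module_path.write_text(source, encoding="utf-8")

        spec = importlib.util.spec_from_file_location("generated_module", module_path)
        if spec is None or spec.loader is None:
            raise ImportError(f"Cannot build an import spec for {module_path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

    # The class object stays usable after the temporary file is removed.
    return getattr(module, class_name)


User = load_class_from_source(GENERATED_SOURCE, "User")
user = User(name="Alice", email="alice@example.com")
print(user.name)  # Alice
```

Writing to a temporary directory keeps generated files out of the project tree, and the loaded class does not depend on the file once the module has been executed.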

In [None]:
# Check for required dependency
try:
    import datamodel_code_generator  # noqa: F401

    DEPENDENCY_AVAILABLE = True
    print("✓ datamodel-code-generator is installed")
except ImportError:
    DEPENDENCY_AVAILABLE = False
    print("⚠️  datamodel-code-generator not installed")
    print("   Install with: pip install 'lionherd-core[schema-gen]'")
    print("   Or: pip install datamodel-code-generator")
    print("Notebook cells below will be skipped.")

✓ datamodel-code-generator is installed


In [2]:
import json

from pydantic import ValidationError

from lionherd_core.libs.schema_handlers import load_pydantic_model_from_schema

## 1. Basic Schema to Model Conversion

Convert a simple JSON schema to a working Pydantic model.

In [3]:
# Simple user schema
user_schema = {
    "title": "User",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        # "format": "email" is omitted to avoid the email-validator dependency
        "email": {"type": "string"},
    },
    "required": ["name", "email"],
}

# Generate model from schema
UserModel = load_pydantic_model_from_schema(user_schema)
print(f"Model name: {UserModel.__name__}")
print(f"Model fields: {list(UserModel.model_fields.keys())}")

Model name: User
Model fields: ['name', 'age', 'email']


In [4]:
# Create instance with valid data
user = UserModel(name="Alice", age=30, email="alice@example.com")
print(f"User: {user}")
print(f"Name: {user.name}")
print(f"Age: {user.age}")
print(f"Email: {user.email}")

User: name='Alice' age=30 email='alice@example.com'
Name: Alice
Age: 30
Email: alice@example.com


In [5]:
# Validation works automatically
try:
    invalid_user = UserModel(name="Bob")  # Missing required 'email'
except ValidationError as e:
    print(f"✓ Validation error caught: {e.error_count()} error(s)")
    print(f"  Missing field: {e.errors()[0]['loc'][0]}")

✓ Validation error caught: 1 error(s)
  Missing field: email
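When a validation failure involves several fields, it can help to flatten the error list into readable lines. The sketch below assumes the dict shape returned by Pydantic v2's `ValidationError.errors()` (keys such as `loc` and `msg`). `summarize_errors` is a hypothetical helper, and the sample data is hand-written so that the block runs without Pydantic installed.

```python
def summarize_errors(errors: list[dict]) -> list[str]:
    """Turn Pydantic-style error dicts into 'field.path: message' strings."""
    lines = []
    for err in errors:
        # "loc" is a tuple of keys/indices leading to the offending value.
        path = ".".join(str(part) for part in err.get("loc", ())) or "<root>"
        lines.append(f"{path}: {err.get('msg', 'unknown error')}")
    return lines


# Hand-written sample shaped like one entry of ValidationError.errors().
sample_errors = [
    {"type": "missing", "loc": ("email",), "msg": "Field required", "input": {"name": "Bob"}},
]
print(summarize_errors(sample_errors))  # ['email: Field required']
```

With a real model, the same helper could be called as `summarize_errors(e.errors())` inside the `except ValidationError as e:` branch.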


## 2. Schema Title Extraction

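The model name is extracted automatically from the schema's `title` field. If the title is not a valid Python identifier, it is sanitized; if it is missing or unusable, the function falls back to the provided `model_name`. Note that in the recorded run below, the schema without a title yields a class whose `__name__` is `Model` rather than `Inventory`, so the fallback name does not necessarily become the generated class's name.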
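One way such name resolution could work is sketched below. This is an assumption for illustration, not the library's actual code: `resolve_model_name` is a hypothetical helper that strips characters that cannot appear in an identifier and otherwise returns the fallback. It only resolves a name; what the generated class ends up being called is decided by the generator.

```python
import keyword
import re


def resolve_model_name(schema: dict, fallback: str) -> str:
    """Pick a class name: the sanitized schema title if usable, else the fallback."""
    title = schema.get("title")
    if isinstance(title, str):
        # Drop everything that cannot appear in an identifier, e.g. spaces and "!".
        candidate = re.sub(r"[^0-9A-Za-z_]", "", title)
        if candidate.isidentifier() and not keyword.iskeyword(candidate):
            return candidate
    return fallback


print(resolve_model_name({"title": "Product"}, "Fallback"))          # Product
print(resolve_model_name({"title": "My Cool Schema!"}, "Fallback"))  # MyCoolSchema
print(resolve_model_name({"type": "object"}, "Inventory"))           # Inventory
```

Checking `str.isidentifier()` after sanitizing also rejects titles that start with a digit or become empty, and the `keyword` check avoids names such as `class`.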

In [6]:
# Schema with title
schema_with_title = {
    "title": "Product",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
}

ProductModel = load_pydantic_model_from_schema(schema_with_title)
print(f"Model name from title: {ProductModel.__name__}")

Model name from title: Product


In [7]:
# Schema without title - uses default
schema_no_title = {
    "type": "object",
    "properties": {
        "quantity": {"type": "integer"},
    },
}

InventoryModel = load_pydantic_model_from_schema(schema_no_title, "Inventory")
print(f"Model name from parameter: {InventoryModel.__name__}")

Model name from parameter: Model


In [8]:
# Schema with invalid title - sanitizes or falls back
schema_invalid_title = {
    "title": "My Cool Schema!",  # Invalid Python identifier
    "type": "object",
    "properties": {
        "data": {"type": "string"},
    },
}

SanitizedModel = load_pydantic_model_from_schema(schema_invalid_title, "Fallback")
print(f"Sanitized model name: {SanitizedModel.__name__}")

Sanitized model name: MyCoolSchema


## 3. Dictionary vs String Schema Input

The function accepts the schema either as a dictionary or as a JSON string.
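Accepting both input forms usually comes down to normalizing to a dictionary before any further processing. The following is a minimal sketch of that step, assuming nothing about the library's internals: `normalize_schema` is a hypothetical helper, and its exception types and messages simply mirror the ones recorded in the error-handling section of this notebook.

```python
import json
from typing import Any


def normalize_schema(schema: Any) -> dict:
    """Return the schema as a dict, accepting either a dict or a JSON string."""
    if isinstance(schema, dict):
        return schema
    if isinstance(schema, str):
        try:
            parsed = json.loads(schema)
        except json.JSONDecodeError as exc:
            raise ValueError("Invalid JSON schema string provided") from exc
        if not isinstance(parsed, dict):
            # Valid JSON such as "[1, 2]" or "42" is still not an object schema.
            raise ValueError("JSON schema string must decode to an object")
        return parsed
    raise TypeError("Schema must be a JSON string or a dictionary")


print(normalize_schema('{"title": "Contact", "type": "object"}'))
# {'title': 'Contact', 'type': 'object'}
```

Chaining the original `JSONDecodeError` with `from exc` keeps the parse position available in the traceback while presenting a simpler message to the caller.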

In [9]:
# Dictionary input
dict_schema = {
    "title": "Address",
    "type": "object",
    "properties": {
        "street": {"type": "string"},
        "city": {"type": "string"},
        "zipcode": {"type": "string"},
    },
}

AddressModel = load_pydantic_model_from_schema(dict_schema)
address = AddressModel(street="123 Main St", city="Springfield", zipcode="12345")
print(f"From dict: {address}")

From dict: street='123 Main St' city='Springfield' zipcode='12345'


In [10]:
# JSON string input
json_schema = json.dumps(
    {
        "title": "Contact",
        "type": "object",
        "properties": {
            "phone": {"type": "string"},
            "email": {"type": "string"},
        },
    }
)

ContactModel = load_pydantic_model_from_schema(json_schema)
contact = ContactModel(phone="555-1234", email="contact@example.com")
print(f"From JSON string: {contact}")

From JSON string: phone='555-1234' email='contact@example.com'


## 4. Nested and Complex Schemas

Generate models with nested objects, arrays, and complex types.
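Before generating a model, it can be useful to see which parts of a schema are themselves objects, since those are the candidates for nested model classes. The sketch below walks a schema with the standard library only. `iter_nested_objects` is a hypothetical helper, and how a generator actually names or splits nested classes is up to the generator.

```python
from collections.abc import Iterator


def iter_nested_objects(schema: dict, path: str = "") -> Iterator[str]:
    """Yield dotted paths of object-typed properties, marking array items with []."""
    for name, prop in schema.get("properties", {}).items():
        prop_path = f"{path}.{name}" if path else name
        if isinstance(prop, dict) and prop.get("type") == "array":
            # Array element schemas live under "items".
            prop, prop_path = prop.get("items", {}), f"{prop_path}[]"
        if isinstance(prop, dict) and prop.get("type") == "object":
            yield prop_path
            yield from iter_nested_objects(prop, prop_path)


example_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {"type": "object", "properties": {"city": {"type": "string"}}},
        "employees": {
            "type": "array",
            "items": {"type": "object", "properties": {"role": {"type": "string"}}},
        },
    },
}
print(list(iter_nested_objects(example_schema)))  # ['address', 'employees[]']
```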

In [11]:
# Complex nested schema
company_schema = {
    "title": "Company",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "founded": {"type": "integer"},
        "address": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
            },
        },
        "employees": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "role": {"type": "string"},
                },
            },
        },
    },
}

CompanyModel = load_pydantic_model_from_schema(company_schema)
print(f"Model: {CompanyModel.__name__}")
print(f"Fields: {list(CompanyModel.model_fields.keys())}")

Model: Company
Fields: ['name', 'founded', 'address', 'employees']


In [12]:
# Create nested instance
company = CompanyModel(
    name="TechCorp",
    founded=2020,
    address={"street": "456 Tech Blvd", "city": "San Francisco"},
    employees=[
        {"name": "Alice", "role": "Engineer"},
        {"name": "Bob", "role": "Designer"},
    ],
)

print(f"Company: {company.name}")
print(f"City: {company.address.city}")
print(f"Employees: {len(company.employees)}")
print(f"First employee: {company.employees[0].name} - {company.employees[0].role}")

Company: TechCorp
City: San Francisco
Employees: 2
First employee: Alice - Engineer


## 5. Enums and Constraints

JSON Schema keywords such as `enum`, `minimum`, `maximum`, and `maxLength` translate to Pydantic validation rules on the generated model.
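As a rough guide to that translation, the sketch below maps common JSON Schema keywords to the Pydantic v2 `Field` arguments with the same meaning. The mapping table and `field_args` are illustrative helpers, not part of any library, and they assume the numeric form of `exclusiveMinimum`/`exclusiveMaximum` (JSON Schema draft 6 and later). The code a generator emits may express constraints differently.

```python
# JSON Schema keyword -> Pydantic v2 `Field` argument with the same meaning.
KEYWORD_TO_FIELD_ARG = {
    "minimum": "ge",
    "maximum": "le",
    "exclusiveMinimum": "gt",
    "exclusiveMaximum": "lt",
    "minLength": "min_length",
    "maxLength": "max_length",
    "pattern": "pattern",
    "multipleOf": "multiple_of",
}


def field_args(property_schema: dict) -> dict:
    """Collect the Field keyword arguments implied by one property schema."""
    return {
        KEYWORD_TO_FIELD_ARG[key]: value
        for key, value in property_schema.items()
        if key in KEYWORD_TO_FIELD_ARG
    }


print(field_args({"type": "integer", "minimum": 1, "maximum": 5}))
# {'ge': 1, 'le': 5}
print(field_args({"type": "string", "maxLength": 100}))
# {'max_length': 100}
```

Keywords without a direct counterpart in the table, such as `type` and `enum`, are handled through the field's type annotation rather than through `Field` arguments.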

In [13]:
# Schema with enums and constraints
order_schema = {
    "title": "Order",
    "type": "object",
    "properties": {
        "status": {
            "type": "string",
            "enum": ["pending", "processing", "shipped", "delivered"],
        },
        "priority": {
            "type": "integer",
            "minimum": 1,
            "maximum": 5,
        },
        "notes": {
            "type": "string",
            "maxLength": 100,
        },
    },
    "required": ["status"],
}

OrderModel = load_pydantic_model_from_schema(order_schema)
order = OrderModel(status="pending", priority=3, notes="Handle with care")
print(f"Order: {order}")

Order: status=<Status.pending: 'pending'> priority=3 notes='Handle with care'


In [14]:
# Enum validation
try:
    invalid_order = OrderModel(status="invalid_status")
except ValidationError as e:
    print(f"✓ Enum validation: {e.errors()[0]['msg'][:50]}...")

# Constraint validation
try:
    invalid_priority = OrderModel(status="pending", priority=10)  # Exceeds maximum
except ValidationError as e:
    print(f"✓ Constraint validation: {e.errors()[0]['msg'][:50]}...")

✓ Enum validation: Input should be 'pending', 'processing', 'shipped'...
✓ Constraint validation: Input should be less than or equal to 5...


## 6. Model Serialization

Generated models support full Pydantic serialization.

In [15]:
# Create model instance
user = UserModel(name="Charlie", age=25, email="charlie@example.com")

# Pydantic dict serialization
user_dict = user.model_dump()
print(f"Dict: {user_dict}")

# JSON serialization
user_json = user.model_dump_json()
print(f"JSON: {user_json}")

# Deserialization
restored = UserModel.model_validate(user_dict)
print(f"Restored: {restored}")
print(f"✓ Roundtrip successful: {restored.name == user.name}")

Dict: {'name': 'Charlie', 'age': 25, 'email': 'charlie@example.com'}
JSON: {"name":"Charlie","age":25,"email":"charlie@example.com"}
Restored: name='Charlie' age=25 email='charlie@example.com'
✓ Roundtrip successful: True


## 7. Error Handling

The function validates inputs and provides clear error messages.

In [16]:
# Invalid schema type
try:
    load_pydantic_model_from_schema(123)  # Not a dict or string
except TypeError as e:
    print(f"✓ Type error: {e!s}")

✓ Type error: Schema must be a JSON string or a dictionary


In [17]:
# Invalid JSON string
try:
    load_pydantic_model_from_schema("{invalid json}")
except ValueError as e:
    print(f"✓ Value error: {e!s}")

✓ Value error: Invalid JSON schema string provided


In [18]:
# Missing datamodel-code-generator
# Uncomment to test in an environment where the library is not installed:
# try:
#     load_pydantic_model_from_schema({"type": "object"})
# except ImportError as e:
#     print(f"Import error: {e!s}")

print("✓ Error handling demonstrates clear failure modes")

✓ Error handling demonstrates clear failure modes


## 8. Real-World Example: API Response Schema

Generate a model for a typical REST API response.

In [19]:
# API response schema
api_response_schema = {
    "title": "APIResponse",
    "type": "object",
    "properties": {
        "success": {"type": "boolean"},
        "data": {
            "type": "object",
            "properties": {
                "id": {"type": "string", "format": "uuid"},
                "timestamp": {"type": "string", "format": "date-time"},
                "payload": {"type": "object"},
            },
        },
        "errors": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "code": {"type": "string"},
                    "message": {"type": "string"},
                },
            },
        },
    },
    "required": ["success"],
}

APIResponseModel = load_pydantic_model_from_schema(api_response_schema)
print(f"Model: {APIResponseModel.__name__}")

Model: APIResponse
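The `format` keywords in the schema above (`uuid`, `date-time`) describe strings that code generators typically map to richer Python types such as `uuid.UUID` and `datetime`. Whether the generated model does so depends on the generator version and settings. As an illustration only, the standard library can parse the same kind of values directly:

```python
from datetime import datetime, timedelta
from uuid import UUID

# Raw values as they might arrive in a JSON response body.
raw = {
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "timestamp": "2025-11-09T12:00:00Z",
}

record_id = UUID(raw["id"])
# Replace the trailing "Z" so the string also parses on Python versions before 3.11.
timestamp = datetime.fromisoformat(raw["timestamp"].replace("Z", "+00:00"))

print(record_id)                              # 123e4567-e89b-12d3-a456-426614174000
print(timestamp.utcoffset() == timedelta(0))  # True
```

A malformed UUID or timestamp raises `ValueError` in both calls, which mirrors the kind of failure a typed field would report during validation.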


In [20]:
# Success response
success_response = APIResponseModel(
    success=True,
    data={
        "id": "123e4567-e89b-12d3-a456-426614174000",
        "timestamp": "2025-11-09T12:00:00Z",
        "payload": {"result": "operation completed"},
    },
)
print(f"Success: {success_response.success}")
print(f"Data ID: {success_response.data.id}")

Success: True
Data ID: 123e4567-e89b-12d3-a456-426614174000


In [21]:
# Error response
error_response = APIResponseModel(
    success=False,
    errors=[
        {"code": "AUTH_001", "message": "Invalid credentials"},
        {"code": "RATE_LIMIT", "message": "Too many requests"},
    ],
)
print(f"Success: {error_response.success}")
print(f"Errors: {len(error_response.errors)}")
print(f"First error: {error_response.errors[0].code} - {error_response.errors[0].message}")

Success: False
Errors: 2
First error: AUTH_001 - Invalid credentials


## Summary Checklist

**Schema to Model Essentials:**
- ✅ Convert JSON schema (dict or string) to Pydantic models
- ✅ Auto-extract model name from schema title with sanitization
- ✅ Support nested objects, arrays, and complex types
- ✅ Validate enums, constraints, and required fields
- ✅ Full Pydantic v2 serialization/deserialization support
- ✅ Clear error messages for invalid inputs
- ✅ Runtime model generation for dynamic schemas
- ✅ Fully typed models (field types are inspectable at runtime via `model_fields`)

**Use Cases:**
- API response/request models from OpenAPI specs
- Configuration schemas from JSON Schema
- Dynamic data validation from external sources
- Code generation from schema definitions

**Requirements:**
- Install with: `pip install 'lionherd-core[schema-gen]'`
- Or directly: `pip install datamodel-code-generator`

**Next Steps:**
- See `ln` utilities for JSON/dict handling
- See `Operable` for structured output integration
- See `Spec` for validation patterns