### Examples of Using Pydantic for Swiss Re Sustainability Report Data

The two examples below show how Pydantic models can validate data drawn from the Swiss Re Sustainability Report. Each model declares field types and constraints, so an entry that violates them raises a validation error instead of being silently accepted.

1. **Green Bond Investments**:
   A Pydantic model can validate records of green, social, and sustainability bond investments. It declares the bond type, investment amount, and investment date, and rejects entries with a missing field, a non-positive amount, or a date that cannot be parsed.

   - **Fields**:
     - `bond_type`: Type of bond, e.g., "Green", "Social", "Sustainability". Required; any string is accepted.
     - `amount`: Investment amount in USD. Required; must be greater than zero.
     - `investment_date`: Date of investment. Required; parsed into a `datetime`.

2. **GHG Emissions Reductions**:
   This model tracks progress toward carbon intensity reduction targets. Its fields are the target year, the base-year emission level, and the current reduction percentage, each with a bound that rejects out-of-range values.

   - **Fields**:
     - `target_year`: The year the target should be achieved, must be 2023 or later.
     - `base_year_emission`: Emission level in the base year, must be greater than zero.
     - `current_reduction`: Current reduction percentage relative to the base year, must be 100 or below.

### Code Implementation

Below is the Python code for creating these Pydantic models.


In [5]:
from pydantic import BaseModel, Field
from datetime import datetime

# Green Bond Investment Model
class GreenBondInvestment(BaseModel):
    bond_type: str = Field(..., description="Type of bond, e.g., Green, Social, Sustainability")
    amount: float = Field(..., gt=0, description="Investment amount in USD")
    investment_date: datetime = Field(..., description="Date of investment")

# GHG Emissions Reduction Model
class GHGReduction(BaseModel):
    target_year: int = Field(..., ge=2023, description="The year the target should be achieved, must be 2023 or later")
    base_year_emission: float = Field(..., gt=0, description="Base year emissions in tonnes CO2e")
    current_reduction: float = Field(..., le=100, description="Current reduction percentage relative to base year")

# Example usage:
# Green Bond Investment example
green_bond_data = GreenBondInvestment(
    bond_type="Green",
    amount=4.4e9,
    investment_date="2023-03-01"
)

# GHG Reduction example
ghg_reduction_data = GHGReduction(
    target_year=2030,
    base_year_emission=500.0,
    current_reduction=45.0
)

from tabulate import tabulate

# Define a function to print each model as a table with a title
def print_model_as_table(title, model):
    print(f"\n{title}")
    # model_dump() is the Pydantic v2 replacement for the deprecated .dict()
    print(tabulate(model.model_dump().items(), headers=["Field", "Value"], tablefmt="grid"))

# Displaying the created data models in separate tables with titles
print_model_as_table("Green Bond Investment Data", green_bond_data)
print_model_as_table("GHG Reduction Data", ghg_reduction_data)






Green Bond Investment Data
+-----------------+---------------------+
| Field           | Value               |
+=================+=====================+
| bond_type       | Green               |
+-----------------+---------------------+
| amount          | 4400000000.0        |
+-----------------+---------------------+
| investment_date | 2023-03-01 00:00:00 |
+-----------------+---------------------+

GHG Reduction Data
+--------------------+---------+
| Field              |   Value |
+====================+=========+
| target_year        |    2030 |
+--------------------+---------+
| base_year_emission |     500 |
+--------------------+---------+
| current_reduction  |      45 |
+--------------------+---------+
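
The field list gives `bond_type` three example values, but a plain `str` field accepts any string. Below is a minimal sketch of one way to tighten this, assuming those three values form the complete set (the report may define others). `StrictBondInvestment` and `try_build` are hypothetical names introduced for illustration, and `date` is used instead of `datetime` on the assumption that no time of day is needed.

```python
from datetime import date
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class StrictBondInvestment(BaseModel):
    # Assumption: only these three bond types are valid
    bond_type: Literal["Green", "Social", "Sustainability"]
    amount: float = Field(gt=0, description="Investment amount in USD")
    investment_date: date


def try_build(**kwargs):
    """Return (model, []) on success or (None, [(loc, error_type), ...]) on failure."""
    try:
        return StrictBondInvestment(**kwargs), []
    except ValidationError as exc:
        return None, [(err["loc"], err["type"]) for err in exc.errors()]


ok, ok_errors = try_build(bond_type="Green", amount=4.4e9, investment_date="2023-03-01")
bad, bad_errors = try_build(bond_type="Blue", amount=-5, investment_date="2023-03-01")

print(ok)
print(bad_errors)
```

Collecting `(loc, type)` pairs from `ValidationError.errors()` shows every violated constraint at once, since Pydantic reports all failing fields rather than stopping at the first.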


### Example: `before` and `after` Validators for Swiss Re Sustainability Report Data

In the context of Swiss Re's sustainability initiatives, we might need to sanitize and validate data related to environmental, social, and governance (ESG) scores. For instance, ESG data might need to be stripped of extra whitespace before validation, and formatted consistently after validation.

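Below is an example where a Pydantic v2 model uses a `before` and an `after` field validator (Pydantic v1 called these `pre` and post validators):
- **Before validation** (`mode='before'`): Trims whitespace from the raw `company_name` input.
- **After validation** (the default mode): Converts the validated `esg_score` to uppercase.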

### Code Example


In [7]:
from pydantic import BaseModel, field_validator

class ESGData(BaseModel):
    company_name: str
    esg_score: str

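    # Before-mode validator: runs on the raw input, before type validation
    @field_validator('company_name', mode='before')
    @classmethod
    def sanitize_company_name(cls, value):
        # Only strings can be stripped; other inputs are left for type validation to reject
        return value.strip() if isinstance(value, str) else value

    # After-mode validator (the default): runs on the already-validated str
    @field_validator('esg_score')
    @classmethod
    def uppercase_esg_score(cls, value):
        return value.upper()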

# Example usage
esg_data_example = ESGData(company_name="  Swiss Re  ", esg_score="aaa")
print(esg_data_example)



company_name='Swiss Re' esg_score='AAA'
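
A before validator and an after validator can also be combined on the same field: the first cleans the raw input, the second checks the cleaned value. The sketch below assumes a seven-step letter rating scale; `ALLOWED_RATINGS` and `RatedESGData` are hypothetical names, and the scale is an illustrative assumption rather than something taken from the report.

```python
from pydantic import BaseModel, ValidationError, field_validator

# Assumption: an illustrative letter scale, not taken from the report
ALLOWED_RATINGS = {"AAA", "AA", "A", "BBB", "BB", "B", "CCC"}


class RatedESGData(BaseModel):
    company_name: str
    esg_score: str

    # Before mode: strip whitespace from raw string inputs on both fields
    @field_validator("company_name", "esg_score", mode="before")
    @classmethod
    def strip_strings(cls, value):
        return value.strip() if isinstance(value, str) else value

    # After mode: normalize case, then reject ratings outside the assumed scale
    @field_validator("esg_score")
    @classmethod
    def normalize_and_check(cls, value: str) -> str:
        rating = value.upper()
        if rating not in ALLOWED_RATINGS:
            raise ValueError(f"unknown ESG rating: {rating}")
        return rating


rated = RatedESGData(company_name="  Swiss Re  ", esg_score=" aaa ")
print(rated)

rejected = None
try:
    RatedESGData(company_name="Swiss Re", esg_score="zzz")
except ValidationError as exc:
    rejected = exc.errors()[0]["type"]
print(rejected)
```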


### Feature 3: Model Validator in Risk Modeling

In risk modeling, consistency checks are crucial to ensure valid data across related fields. For instance, in a sustainability report, a model tracking climate-related risks might need to enforce that `risk_level` and `mitigation_score` are logically consistent. If the `risk_level` is "High", then `mitigation_score` must be at least 50 on a 0 to 100 scale.

The example below uses a model validator (the Pydantic v2 replacement for v1's `root_validator`) to check that a high risk level corresponds to a sufficiently high mitigation score.


In [10]:
from pydantic import BaseModel, model_validator, ValidationError

class ClimateRiskModel(BaseModel):
    risk_level: str  # e.g., "High", "Medium", "Low"
    mitigation_score: int  # Scale from 0 to 100

    # Model validator to check consistency between risk_level and mitigation_score.
    # In 'after' mode it runs on the constructed instance, so it takes `self`.
    @model_validator(mode='after')
    def check_risk_mitigation(self):
        if self.risk_level == "High" and self.mitigation_score < 50:
            raise ValueError("Mitigation score must be >= 50 for high risk level.")
        return self

# Example usage with error handling
try:
    climate_risk = ClimateRiskModel(risk_level="High", mitigation_score=40)
except ValidationError as e:
    print(e)

# Valid example
climate_risk_valid = ClimateRiskModel(risk_level="High", mitigation_score=60)
print(climate_risk_valid)


1 validation error for ClimateRiskModel
  Value error, Mitigation score must be >= 50 for high risk level. [type=value_error, input_value={'risk_level': 'High', 'mitigation_score': 40}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/value_error
risk_level='High' mitigation_score=60
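
Printing the exception is fine for a notebook, but a data pipeline usually needs the failure in structured form. Below is a minimal sketch using `ValidationError.errors()`, which returns one dictionary per failure. `RiskRecord` and `collect_errors` are hypothetical names; the model repeats the rule from the example above so the block runs on its own.

```python
from pydantic import BaseModel, ValidationError, model_validator


class RiskRecord(BaseModel):  # hypothetical stand-in for ClimateRiskModel
    risk_level: str
    mitigation_score: int

    @model_validator(mode="after")
    def check_risk_mitigation(self):
        if self.risk_level == "High" and self.mitigation_score < 50:
            raise ValueError("Mitigation score must be >= 50 for high risk level.")
        return self


def collect_errors(payload: dict) -> list:
    """Return the structured error list for a payload, or [] if it validates."""
    try:
        RiskRecord(**payload)
    except ValidationError as exc:
        return exc.errors()
    return []


errors = collect_errors({"risk_level": "High", "mitigation_score": 40})
for err in errors:
    # Each entry carries an error type, a location tuple, and a message
    print(err["type"], err["loc"], err["msg"])

no_errors = collect_errors({"risk_level": "High", "mitigation_score": 60})
```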


### Feature 4: Recursive Models for Sustainable Finance Analysis

In sustainable finance, a recursive model can be useful for analyzing hierarchical data, like the structure of an investment portfolio where each investment can contain sub-investments. Here’s an example of a model representing a portfolio with nested investments.


In [11]:
from pydantic import BaseModel
from typing import List

class Investment(BaseModel):
    name: str
    amount: float
    sub_investments: List['Investment'] = []

Investment.model_rebuild()  # Update forward references for recursion

# Example data
portfolio = Investment(
    name="Green Fund",
    amount=1000000,
    sub_investments=[
        Investment(
            name="Solar Project",
            amount=500000,
            sub_investments=[
                Investment(name="Solar Panel Manufacturer", amount=250000),
                Investment(name="Installation Service", amount=250000),
            ]
        ),
        Investment(name="Wind Project", amount=500000)
    ]
)

print(portfolio)


name='Green Fund' amount=1000000.0 sub_investments=[Investment(name='Solar Project', amount=500000.0, sub_investments=[Investment(name='Solar Panel Manufacturer', amount=250000.0, sub_investments=[]), Investment(name='Installation Service', amount=250000.0, sub_investments=[])]), Investment(name='Wind Project', amount=500000.0, sub_investments=[])]
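
A recursive model pairs naturally with a recursive function. The sketch below walks the tree and reports every investment whose sub-investments do not add up to its own amount. That children must sum to the parent is an assumption made for illustration, and `find_mismatches` is a hypothetical helper. The `Investment` model is repeated so the block runs on its own.

```python
from typing import List

from pydantic import BaseModel


class Investment(BaseModel):
    name: str
    amount: float
    sub_investments: List["Investment"] = []


def find_mismatches(investment: Investment, tolerance: float = 0.01) -> List[str]:
    """Return names of investments whose children do not sum to the parent amount."""
    mismatches = []
    if investment.sub_investments:
        child_sum = sum(child.amount for child in investment.sub_investments)
        if abs(child_sum - investment.amount) > tolerance:
            mismatches.append(investment.name)
        # Recurse into each child so nested levels are checked too
        for child in investment.sub_investments:
            mismatches.extend(find_mismatches(child, tolerance))
    return mismatches


consistent = Investment(
    name="Green Fund",
    amount=1000000,
    sub_investments=[
        Investment(name="Solar Project", amount=500000),
        Investment(name="Wind Project", amount=500000),
    ],
)

inconsistent = Investment(
    name="Green Fund",
    amount=1000000,
    sub_investments=[
        Investment(
            name="Solar Project",
            amount=500000,
            sub_investments=[
                Investment(name="Solar Panel Manufacturer", amount=250000),
                Investment(name="Installation Service", amount=200000),
            ],
        ),
        Investment(name="Wind Project", amount=500000),
    ],
)

print(find_mismatches(consistent))
print(find_mismatches(inconsistent))
```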


A recursive model can also log a threaded chat between a sustainability analyst and a product manager, where each message holds its own list of replies.

In [12]:
class Message(BaseModel):
    sender: str  # e.g., "Analyst", "Product Manager"
    content: str
    replies: List['Message'] = []

Message.model_rebuild()  # Enable recursion

# Example threaded chat
chat = Message(
    sender="Analyst",
    content="Can we review the ESG risk metrics for Q2?",
    replies=[
        Message(
            sender="Product Manager",
            content="Yes, let's prioritize renewable investments first.",
            replies=[
                Message(
                    sender="Analyst",
                    content="Noted. Also, do we need a risk reassessment for wind projects?",
                ),
                Message(
                    sender="Product Manager",
                    content="Yes, especially after recent regulatory changes."
                )
            ]
        )
    ]
)

print(chat)

sender='Analyst' content='Can we review the ESG risk metrics for Q2?' replies=[Message(sender='Product Manager', content="Yes, let's prioritize renewable investments first.", replies=[Message(sender='Analyst', content='Noted. Also, do we need a risk reassessment for wind projects?', replies=[]), Message(sender='Product Manager', content='Yes, especially after recent regulatory changes.', replies=[])])]
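
The default `print` output of a nested model is hard to read. As a sketch of an alternative, the block below builds a thread from a plain nested dictionary with `model_validate` and renders it with one level of indentation per reply depth. `render_thread` and `raw_thread` are hypothetical names; the `Message` model is repeated so the block runs on its own.

```python
from typing import List

from pydantic import BaseModel


class Message(BaseModel):
    sender: str
    content: str
    replies: List["Message"] = []


def render_thread(message: Message, depth: int = 0) -> List[str]:
    """Flatten a thread into lines, indenting each reply level by two spaces."""
    lines = [f"{'  ' * depth}{message.sender}: {message.content}"]
    for reply in message.replies:
        lines.extend(render_thread(reply, depth + 1))
    return lines


# Nested dicts are converted into Message instances at every level
raw_thread = {
    "sender": "Analyst",
    "content": "Can we review the ESG risk metrics for Q2?",
    "replies": [
        {
            "sender": "Product Manager",
            "content": "Yes, let's prioritize renewable investments first.",
            "replies": [
                {"sender": "Analyst", "content": "Noted."},
            ],
        }
    ],
}

thread = Message.model_validate(raw_thread)
lines = render_thread(thread)
print("\n".join(lines))
```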


In [15]:
from pydantic import BaseModel, model_validator

# Unified company record; each source database names the same fields differently
class CompanyData(BaseModel):
    company_name: str
    industry: str
    revenue: float

    # Before-mode model validator: receives the raw input before field validation
    @model_validator(mode='before')
    @classmethod
    def map_aliases(cls, values):
        # Map database-specific keys onto the canonical field names
        aliases = {
            "company_name": ["company-name-dbA", "company-name-dbB", "Company_Name_dbC"],
            "industry": ["industry-vertical-dbA", "IndustryType-dbB", "Sector-dbC"],
            "revenue": ["annual-revenue-dbA", "AnnualRevenue-dbB", "RevenueInMillions-dbC"],
        }

        if not isinstance(values, dict):
            return values  # Leave non-dict input for Pydantic to handle

        values = dict(values)  # Copy so the caller's dict is not mutated
        for field, alias_list in aliases.items():
            for alias in alias_list:
                if alias in values:
                    values[field] = values[alias]
                    break  # Stop after finding the first matching alias
        return values

# Example data from different databases
data_dbA = {
    "company-name-dbA": "Swiss Re",
    "industry-vertical-dbA": "Insurance",
    "annual-revenue-dbA": 4500.5
}

data_dbB = {
    "company-name-dbB": "Swiss Re",
    "IndustryType-dbB": "Insurance",
    "AnnualRevenue-dbB": 4500.5
}

data_dbC = {
    "Company_Name_dbC": "Swiss Re",
    "Sector-dbC": "Insurance",
    "RevenueInMillions-dbC": 4500.5
}

# Parsing data from each database into a unified model
company_dbA = CompanyData(**data_dbA)
company_dbB = CompanyData(**data_dbB)
company_dbC = CompanyData(**data_dbC)

# Print results
print("From Database A:", company_dbA)
print("From Database B:", company_dbB)
print("From Database C:", company_dbC)


From Database A: company_name='Swiss Re' industry='Insurance' revenue=4500.5
From Database B: company_name='Swiss Re' industry='Insurance' revenue=4500.5
From Database C: company_name='Swiss Re' industry='Insurance' revenue=4500.5
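
Pydantic v2 also offers a declarative route to the same result: `AliasChoices` lets a field accept any of several input keys without a hand-written validator. The sketch below is an alternative to the cell above, not a drop-in replacement. `CompanyRecord` is a hypothetical name, and the canonical field name is listed among the choices so that already-normalized input still validates.

```python
from pydantic import AliasChoices, BaseModel, Field


class CompanyRecord(BaseModel):
    # Each field accepts its canonical name or any database-specific key
    company_name: str = Field(
        validation_alias=AliasChoices(
            "company_name", "company-name-dbA", "company-name-dbB", "Company_Name_dbC"
        )
    )
    industry: str = Field(
        validation_alias=AliasChoices(
            "industry", "industry-vertical-dbA", "IndustryType-dbB", "Sector-dbC"
        )
    )
    revenue: float = Field(
        validation_alias=AliasChoices(
            "revenue", "annual-revenue-dbA", "AnnualRevenue-dbB", "RevenueInMillions-dbC"
        )
    )


record_a = CompanyRecord.model_validate({
    "company-name-dbA": "Swiss Re",
    "industry-vertical-dbA": "Insurance",
    "annual-revenue-dbA": 4500.5,
})
record_c = CompanyRecord.model_validate({
    "Company_Name_dbC": "Swiss Re",
    "Sector-dbC": "Insurance",
    "RevenueInMillions-dbC": 4500.5,
})

print(record_a)
print(record_a == record_c)
```

The trade-off is flexibility: the validator approach can transform values while remapping keys, whereas `AliasChoices` only selects which key to read.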
