feat: Add Pydantic models for stricter YAML card validation


**Is your feature request related to a problem? Please describe.**  
Current YAML card loading in scripts/ (primarily in convert.py and related scripts) uses yaml.safe_load without any schema validation or type checking.  

This leads to:
- Silent failures or hard-to-debug crashes if cards miss required fields (e.g. title, description), have wrong types (e.g. mappings as string instead of dict), or contain extra/unknown fields.
- Poor error messages for contributors/translators (no clear indication of which field or card is invalid).
- Risk of invalid data propagating to the website/card browser, causing UI issues or incorrect threat modeling output.
- Difficulty maintaining consistency as new suits/editions or languages are added.
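
To illustrate the first point, here is a minimal sketch of how a malformed card loads without complaint. The card content is made up for illustration and is not taken from the repository:

```python
import yaml

# Hypothetical card: `title` is missing and `mappings` was written as a string.
raw = """
description: Attacker can bypass input checks
mappings: "capec: 153"
"""

card = yaml.safe_load(raw)  # no error is raised here

print(type(card["mappings"]).__name__)  # str, not dict
print(card.get("title"))                # None
```

The problem only surfaces later, for example when downstream code calls `card["mappings"].items()` and gets an `AttributeError` with no hint about which file was at fault.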

**Describe the solution you'd like**  
Introduce Pydantic models to define and enforce a strict schema for card YAML structures at load time.

Key aspects:
- Create a base Card model with required/typed fields (title: str, description: str, mappings: Dict, etc.)
- Use Pydantic's ValidationError for clear, field-specific errors
- Integrate into convert.py (or central loading function) to validate before processing
- Start with common fields across suits, later extend for suit-specific ones
- Add unit tests for valid/invalid cases

Example:

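```python
from typing import Dict, List, Optional

import yaml
from pydantic import BaseModel, Field, ValidationError


class CardMapping(BaseModel):
    capec: Optional[str] = None
    asvs: Optional[List[str]] = None


class Card(BaseModel):
    title: str = Field(..., min_length=1)
    description: str = Field(..., min_length=10)
    mappings: Dict[str, CardMapping] = Field(default_factory=dict)
    # Suit-specific fields can be added later, e.g. as Optional[str] = None


# In convert.py (load_card is a hypothetical helper name)
def load_card(filename: str) -> Card:
    """Load one card file and validate it before further processing."""
    with open(filename, encoding="utf-8") as f:
        card_data = yaml.safe_load(f)
    if not isinstance(card_data, dict):
        raise ValueError(f"Invalid card in {filename}: expected a mapping")
    try:
        return Card(**card_data)  # raises ValidationError if invalid
    except ValidationError as e:
        raise ValueError(f"Invalid card in {filename}: {e.errors()}") from e
```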
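
Two details are not covered by the example above: rejecting extra/unknown fields, and turning `ValidationError` into one readable line per problem. A rough sketch follows, assuming Pydantic v2 (`ConfigDict`, `model_validate`). `StrictCard`, `describe_errors`, and `check_card` are hypothetical names, and `mappings` is loosened to `Dict[str, Any]` to keep the sketch short:

```python
from typing import Any, Dict, List

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class StrictCard(BaseModel):
    # Hypothetical strict variant of Card; assumes the Pydantic v2 config API.
    model_config = ConfigDict(extra="forbid")  # unknown keys become errors

    title: str = Field(..., min_length=1)
    description: str = Field(..., min_length=10)
    mappings: Dict[str, Any] = Field(default_factory=dict)


def describe_errors(filename: str, exc: ValidationError) -> List[str]:
    """Format each validation problem as 'file: field.path: message'."""
    lines = []
    for err in exc.errors():
        # `loc` is a tuple such as ("mappings",) or ("mappings", "x", "asvs")
        path = ".".join(str(part) for part in err["loc"]) or "<root>"
        lines.append(f"{filename}: {path}: {err['msg']}")
    return lines


def check_card(filename: str, data: Any) -> List[str]:
    """Return a list of problems; an empty list means the card is valid."""
    try:
        StrictCard.model_validate(data)
    except ValidationError as exc:
        return describe_errors(filename, exc)
    return []
```

Calling `check_card("card.yaml", {"description": "A long enough description", "mappings": "oops", "colour": "red"})` would report three separate problems: the missing `title`, the wrongly typed `mappings`, and the unknown `colour` key. The same function could back the unit tests for valid and invalid cases.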

**Benefits**:
- Clear errors (e.g. "title required", "mappings must be dict")
- Enforces data integrity early
- Complements #2406 YAML hardening with runtime checks
- Incremental: start with convert.py

**Describe alternatives you've considered**  
1. Keep current yaml.safe_load + manual checks  
   → Works for basic cases, but is repetitive, misses nested validation, and produces unclear error messages.

2. Use jsonschema or cerberus  
   → Both are solid options, but Pydantic offers type hints, IDE support, clearer error messages, and a more Pythonic API.

3. Do nothing  
   → Risks invalid data propagating to converter/website.
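
For comparison, this is roughly what alternative 1 looks like for the three common fields alone. It is a stdlib-only sketch; `manual_check` is a hypothetical helper, not existing code in scripts/:

```python
from typing import Any, List


def manual_check(card: Any) -> List[str]:
    """Hand-rolled validation: each field needs its own presence and type check."""
    if not isinstance(card, dict):
        return ["card must be a mapping"]
    errors = []
    for field in ("title", "description"):
        if field not in card:
            errors.append(f"{field}: missing")
        elif not isinstance(card[field], str):
            errors.append(f"{field}: expected str")
    mappings = card.get("mappings", {})
    if not isinstance(mappings, dict):
        errors.append("mappings: expected dict")
    else:
        # Nested values need yet another loop, and their inner keys are still unchecked.
        for name, value in mappings.items():
            if not isinstance(value, dict):
                errors.append(f"mappings.{name}: expected dict")
    return errors
```

Every new field or nesting level adds another branch like these, which is the repetition a declarative model avoids.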

**Additional context**  
- Builds directly on #2406 (FAILSAFE_SCHEMA) by adding schema enforcement
- Low-risk: Pydantic would be a dev-only dependency (Pipfile), so there is no runtime impact on production
- Migration incremental: prototype in convert.py first + tests
- Aligns with OWASP data integrity goals for threat modeling cards
- Happy to open a PR and make further changes based on feedback

feat: Add Pydantic models for stricter YAML card validation #2430
