Merged
13 changes: 13 additions & 0 deletions .gitignore
@@ -28,3 +28,16 @@ docs/superpowers/

# Detailed internal blocklist (committed config is .gitleaks.toml)
.gitleaks-internal.toml

# rp1:start
!.rp1/
.rp1/*
!.rp1/context/
!.rp1/context/**
!.rp1/config/
!.rp1/config/**
!.rp1/work/
!.rp1/work/**
.rp1/context/meta.json
.rp1/settings.toml
# rp1:end
28 changes: 28 additions & 0 deletions .pre-commit-config.yaml
@@ -1,4 +1,32 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-json
- id: check-toml
- id: check-added-large-files
- id: check-merge-conflict
- id: debug-statements
- id: mixed-line-ending

- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.11.6
hooks:
- id: ruff
args: [--fix]
- id: ruff-format

- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.15.0
hooks:
- id: mypy
args: [--config-file=pyproject.toml, src/]
additional_dependencies: [pydantic>=2.0, httpx>=0.28]
pass_filenames: false

- repo: https://github.com/gitleaks/gitleaks
rev: v8.24.3
hooks:
219 changes: 219 additions & 0 deletions .rp1/context/architecture.md
@@ -0,0 +1,219 @@
# System Architecture

**Project**: model-ledger
**Architecture Pattern**: Event-Sourced, Protocol-First, Layered with Tool-Shaped API
**Last Updated**: 2026-04-16

## High-Level Architecture

```mermaid
graph TB
subgraph Presentation["Presentation Layer"]
CLI["CLI<br/>(Typer)"]
MCP["MCP Server<br/>(FastMCP)"]
REST["REST API<br/>(FastAPI)"]
end

subgraph Tools["Tool Layer"]
T_record["record"]
T_query["query"]
T_investigate["investigate"]
T_trace["trace"]
T_changelog["changelog"]
T_discover["discover"]
end

subgraph SDK["SDK Layer"]
Ledger["Ledger SDK<br/>(register, record, tag,<br/>add, connect, trace,<br/>members, groups)"]
end

subgraph Backends["Backend Layer"]
SQLite["SQLiteLedgerBackend"]
Snowflake["SnowflakeLedgerBackend"]
HTTP_BE["HttpLedgerBackend"]
JSON["JsonFileLedgerBackend"]
Memory["InMemoryLedgerBackend"]
end

subgraph Connectors["Connector Layer"]
SQL_C["sql_connector"]
REST_C["rest_connector"]
GitHub_C["github_connector"]
Prefect_C["prefect_connector"]
end

Agent["AI Agent"] -->|MCP stdio| MCP
User["User / Script"] -->|HTTP| REST
User -->|CLI| CLI

CLI --> Ledger
MCP --> T_record
MCP --> T_query
MCP --> T_investigate
MCP --> T_trace
MCP --> T_changelog
MCP --> T_discover
REST --> T_record
REST --> T_query
REST --> T_investigate
REST --> T_trace
REST --> T_changelog
REST --> T_discover
T_record --> Ledger
T_query --> Ledger
T_investigate --> Ledger
T_trace --> Ledger
T_changelog --> Ledger
T_discover --> Ledger

Ledger --> SQLite
Ledger --> Snowflake
Ledger --> HTTP_BE
Ledger --> JSON
Ledger --> Memory

Connectors --> Ledger

HTTP_BE -.->|pass-through| REST

SQL_C -->|queries| ExtDB[("External DB")]
REST_C -->|fetches| ExtAPI["External API"]
GitHub_C -->|reads| GitHubAPI["GitHub API"]
Prefect_C -->|discovers| PrefectCloud["Prefect Cloud"]
```

## Component Architecture

### Presentation Layer
**Purpose**: User-facing interfaces — CLI, MCP server for AI agents, REST API for programmatic access
**Components**:
- `src/model_ledger/cli/app.py` — Typer CLI (list, show, validate, audit-log, export, introspect, mcp, serve)
- `src/model_ledger/mcp/server.py` — FastMCP server with 6 tools + 3 resources, stdio transport
- `src/model_ledger/rest/app.py` — FastAPI with 6 endpoints mirroring tools, uvicorn

### Tool Layer
**Purpose**: Six agent-protocol tool functions with Pydantic I/O contracts — the canonical API surface
**Components**:
- `src/model_ledger/tools/schemas.py` — Pydantic I/O models (single source of truth)
- `src/model_ledger/tools/{record,query,investigate,trace,changelog,discover}.py`
**Pattern**: Pure functions with signature `(Input, Ledger) -> Output`; all outputs are JSON-serializable

### SDK Layer
**Purpose**: Core business logic — Ledger class orchestrates registration, recording, tagging, dependency linking, membership, and change propagation
**Components**:
- `src/model_ledger/sdk/ledger.py` — Ledger (v0.3.0+ event-log paradigm)
- `src/model_ledger/sdk/inventory.py` — Inventory (v0.2.0 legacy with DraftVersion context manager)

### Backend Layer (Storage)
**Purpose**: Pluggable persistence implementing LedgerBackend protocol
**Components**:
- `SQLiteLedgerBackend` — WAL mode, zero-dep, stdlib sqlite3
- `SnowflakeLedgerBackend` — Production, batched writes with pandas/SQL MERGE fallback
- `HttpLedgerBackend` — REST API pass-through via httpx
- `JsonFileLedgerBackend` — Git-friendly directory tree (models/, snapshots/, tags/)
- `InMemoryLedgerBackend` — Testing and demo
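The pluggable-backend idea rests on structural typing: any class exposing the protocol's methods is a valid backend, with no inheritance required. A minimal sketch, assuming hypothetical method names (`write_snapshot` / `read_snapshots` are illustrative, not the real `LedgerBackend` surface):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class LedgerBackend(Protocol):
    """Illustrative slice of a storage protocol; real method names may differ."""
    def write_snapshot(self, model_id: str, payload: dict) -> None: ...
    def read_snapshots(self, model_id: str) -> list[dict]: ...


class TinyMemoryBackend:
    """Satisfies the protocol structurally, without subclassing it."""
    def __init__(self) -> None:
        self._store: dict[str, list[dict]] = {}

    def write_snapshot(self, model_id: str, payload: dict) -> None:
        self._store.setdefault(model_id, []).append(payload)

    def read_snapshots(self, model_id: str) -> list[dict]:
        return list(self._store.get(model_id, []))


backend = TinyMemoryBackend()
backend.write_snapshot("credit_scorer", {"event": "registered"})
print(isinstance(backend, LedgerBackend))  # → True (runtime structural check)
```

`@runtime_checkable` makes the `isinstance` check work at runtime, while type checkers verify conformance statically, which is the "duck typing with type-checker support" trade-off noted in the patterns table below.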

### Connector Layer (Discovery)
**Purpose**: Config-driven factories that discover models from external data sources
**Components**:
- `sql_connector` — SQL query to DataNode mapping with table parsing
- `rest_connector` — REST API pagination and JSON field mapping
- `github_connector` — GitHub Contents API, repo config scanning
- `prefect_connector` — Prefect Cloud deployment discovery
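A config-driven factory layer of this kind is commonly a registry keyed by connector type. The sketch below is an assumption about the shape, not the package's actual wiring; the `"static"` connector, `register`, and `make_connector` names are invented for illustration:

```python
from typing import Callable

# A connector, once built, is just a callable that yields discovered items
ConnectorFn = Callable[[], list[dict]]
_FACTORIES: dict[str, Callable[[dict], ConnectorFn]] = {}


def register(kind: str):
    """Decorator registering a factory under its connector-type key."""
    def deco(factory: Callable[[dict], ConnectorFn]):
        _FACTORIES[kind] = factory
        return factory
    return deco


@register("static")
def make_static_connector(config: dict) -> ConnectorFn:
    # Trivial connector: "discovers" exactly the names listed in its config
    return lambda: [{"name": n} for n in config.get("names", [])]


def make_connector(config: dict) -> ConnectorFn:
    """Dispatch on config['type'] and return a fully wired connector."""
    return _FACTORIES[config["type"]](config)


connector = make_connector({"type": "static", "names": ["risk_model"]})
print(connector())  # → [{'name': 'risk_model'}]
```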

## Data Flow

### Agent Tool Invocation (MCP)
```mermaid
sequenceDiagram
participant Agent as AI Agent
participant MCP as FastMCP Server
participant Tool as Tool Function
participant SDK as Ledger SDK
participant BE as LedgerBackend

Agent->>MCP: Tool call (stdio)
MCP->>Tool: Construct Pydantic Input
Tool->>SDK: Call Ledger methods
SDK->>BE: Read/Write via protocol
BE-->>SDK: Data
SDK-->>Tool: Results
Tool-->>MCP: Pydantic Output (JSON)
MCP-->>Agent: Tool response
```

### Model Discovery via Connectors
```mermaid
sequenceDiagram
participant Conn as Connector Factory
participant Ext as External Source
participant SDK as Ledger.add()
participant Linker as Ledger.connect()
participant BE as LedgerBackend

Conn->>Ext: Query (SQL/REST/GitHub/Prefect)
Ext-->>Conn: Raw rows/items
Conn->>Conn: Map to DataNodes with DataPorts
Conn->>SDK: Register DataNodes as ModelRefs
SDK->>BE: Content-hash dedup + append snapshots
Linker->>Linker: Match DataPort I/O across all nodes
Linker->>BE: Create dependency links
```
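The "content-hash dedup + append snapshots" step in the diagram can be illustrated with canonical JSON hashing, so that two payloads with the same content (regardless of key order) produce the same digest. The helper names below are hypothetical:

```python
import hashlib
import json


def content_hash(payload: dict) -> str:
    """Hash a canonical JSON rendering so identical content always dedupes."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_if_new(log: list[dict], payload: dict) -> bool:
    """Append-only write with content-hash dedup; returns True if appended."""
    digest = content_hash(payload)
    if any(entry["hash"] == digest for entry in log):
        return False  # already recorded: re-discovery is a no-op
    log.append({"hash": digest, "payload": payload})
    return True


log: list[dict] = []
print(append_if_new(log, {"name": "scorer", "inputs": ["features"]}))  # → True
print(append_if_new(log, {"inputs": ["features"], "name": "scorer"}))  # → False
```

This is what makes connector runs idempotent: re-scanning an unchanged source appends nothing.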

### Composite Change Propagation
```mermaid
sequenceDiagram
participant Caller as record()
participant SDK as Ledger SDK
participant Groups as groups()
participant Parent as Parent Composite

Caller->>SDK: record(member_model, event)
SDK->>SDK: Check event not in _INTERNAL_EVENTS
SDK->>Groups: Find parent composites
Groups-->>SDK: List of parent ModelRefs
SDK->>Parent: record(parent, member_changed, metadata)
Note over Parent: _propagating=True prevents cascading
```
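The guard-flag mechanics in the note above can be sketched in a few lines. `MiniLedger` and its string-based events are a simplification invented for illustration; the real SDK works with `ModelRef`s and snapshot metadata:

```python
class MiniLedger:
    """Sketch: propagate member changes one level up, never cascade further."""
    _INTERNAL_EVENTS = {"member_changed"}

    def __init__(self, parents: dict[str, list[str]]):
        self._parents = parents      # member -> parent composites
        self._propagating = False
        self.log: list[tuple[str, str]] = []

    def record(self, model: str, event: str) -> None:
        self.log.append((model, event))
        if self._propagating or event in self._INTERNAL_EVENTS:
            return  # internal events never trigger another propagation round
        self._propagating = True
        try:
            for parent in self._parents.get(model, []):
                self.record(parent, "member_changed")
        finally:
            self._propagating = False


ledger = MiniLedger(parents={"scorer": ["credit_suite"],
                             "credit_suite": ["portfolio"]})
ledger.record("scorer", "retrained")
print(ledger.log)  # → [('scorer', 'retrained'), ('credit_suite', 'member_changed')]
```

Note that `portfolio` never receives an event: the flag stops grandparent cascades, keeping propagation bounded even in deep composite hierarchies.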

## Integration Points

### External Services
| Service | Purpose | Integration Type |
|---------|---------|-----------------|
| Snowflake | Production storage | Database (MERGE, write_pandas) |
| SQLite | Local persistent storage | Embedded database (stdlib) |
| GitHub API | Discover models from config files | REST API (v3 Contents) |
| Prefect Cloud | Discover orchestration deployments | Python SDK (async) |
| PyPI | Package distribution | CI/CD (GitHub Actions) |
| FastMCP | Expose tools to AI agents | MCP protocol (stdio) |

### MCP-to-REST Pass-Through
The MCP server supports `HttpLedgerBackend`, which enables a pass-through mode: all tool calls are forwarded as HTTP requests to a remote REST API deployment.

## Architectural Patterns

| Pattern | Evidence | Description |
|---------|----------|-------------|
| Event Sourcing | ModelRef + Snapshot | All state changes are immutable Snapshots. History replayed for current state. |
| Protocol-First | `@runtime_checkable Protocol` | All extension points use Protocols, not ABCs. Duck typing with type-checker support. |
| Tool-Shaped SDK | 6 tool functions | Every method has clear inputs, JSON-serializable outputs, no side effects beyond the ledger. |
| Factory Pattern | Connector + Backend factories | Config-driven factories return fully-wired instances. |
| Plugin Architecture | Entry points | `importlib.metadata.entry_points()` for introspectors and scanners. |
| Composite Governance | Member tracking via events | Business composites aggregate technical nodes. Membership tracked via snapshots, not FK. |
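The Event Sourcing row deserves one concrete illustration: current state is never stored directly, only derived by replaying the append-only snapshot log. The field names and `replay` helper below are illustrative, not the package's API:

```python
def replay(snapshots: list[dict]) -> dict:
    """Derive current state by folding the append-only log, oldest first."""
    state: dict = {}
    for snap in snapshots:
        state.update(snap.get("fields", {}))  # later snapshots win
    return state


history = [
    {"event": "registered", "fields": {"owner": "mrm", "tier": "high"}},
    {"event": "retrained",  "fields": {"version": "2.1"}},
    {"event": "reassigned", "fields": {"owner": "fraud-team"}},
]
print(replay(history))
# → {'owner': 'fraud-team', 'tier': 'high', 'version': '2.1'}
```

Since snapshots are immutable and ordered, the same replay always yields the same state, which is precisely what makes the history auditable.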

## Deployment Architecture

**Distribution**: Python package via PyPI (Apache-2.0)
**Build System**: hatchling
**Python**: >=3.10
**Core Dependencies**: pydantic, httpx (minimal)

**Runtime Modes**:
- CLI: `model-ledger <command>`
- MCP server: `model-ledger mcp [--backend sqlite|snowflake|json|http|memory]`
- REST API: `model-ledger serve [--backend ...] [--port 8000]`
- Python SDK: `from model_ledger.sdk.ledger import Ledger`
- Embedded: `Ledger(backend=SQLiteLedgerBackend('db.sqlite'))`

**Optional Dependency Groups**: cli, mcp, rest-api, snowflake, github, excel, introspect-*
80 changes: 80 additions & 0 deletions .rp1/context/charter.md
@@ -0,0 +1,80 @@
# Project Charter: model-ledger

**Version**: 1.0.0
**Status**: Complete
**Created**: 2026-04-16

### Problem & Context

Financial institutions operate hundreds to thousands of ML models and rules-based systems across 10+ platforms, yet typically track only a fraction in any formal inventory. Regulatory mandates — SR 11-7, EU AI Act (enforcement August 2026), OSFI E-23, PRA SS1/23 — require comprehensive, auditable model inventories with full lineage and change trails.

Existing model registries (MLflow, SageMaker, Weights & Biases) are single-platform silos. They track what was trained in their environment but cannot provide a unified, cross-platform view of every deployed model, its dependencies, or its governance status. The result: MRM teams resort to spreadsheets, coverage gaps go undetected, and regulatory audits surface material findings.

**Why now**: Regulatory frameworks like SR 11-7 and the EU AI Act are raising the bar for model governance. Organizations increasingly need a solution that can discover and govern models across all platforms — this is a growing industry need, not a single-deadline event.

### Target Users

| Segment | Role | Primary Need |
|---------|------|--------------|
| Model Risk Management (MRM) teams | Govern and inventory all models | Complete, living inventory satisfying regulatory requirements |
| ML Engineers | Build and deploy models | Discover dependencies, trace impact of changes, understand lineage |
| AI Agents (via MCP) | Query inventory conversationally | Tool-shaped API for natural-language governance queries |
| Regulators / Auditors | Examine compliance posture | Audit trails, compliance documentation, coverage reports |

### Business Rationale

model-ledger is the missing governance layer that sits above platform-specific registries and provides a unified, cross-platform model inventory.

**Core value delivered**:
1. **Unified discovery**: Discovers models across all platforms as one connected graph, eliminating blind spots from platform silos
2. **Immutable audit trail**: Every change is tracked as a content-addressed, append-only event — satisfying regulatory auditability requirements
3. **Dependency mapping**: Maps upstream/downstream dependencies so teams can trace the blast radius of any change
4. **Regulatory compliance**: Validates inventory against SR 11-7, EU AI Act, and NIST AI RMF compliance profiles out of the box
5. **Agent-native interface**: Exposes everything through a tool-shaped API (MCP, REST, CLI) so AI agents can query and manage the inventory conversationally
6. **Composite governance**: Aggregates technical components into business-level composite models, letting regulators examine governable entities rather than raw artifacts

**Differentiation**: Unlike MLflow/SageMaker/W&B (single-platform training registries), model-ledger is a cross-platform governance framework. Unlike GRC tools, it is code-native, event-sourced, and agent-accessible. Apache-2.0 licensing enables adoption without vendor lock-in.

### Scope Guardrails

**In Scope (v0.7.x)**:
- Model registration with content-addressed identity (ModelRef, Snapshot, Tag)
- Append-only event-log paradigm with immutable audit trail
- Dependency graph construction and traversal (add, connect, trace)
- Composite model governance (groups, members, automatic change propagation)
- 6 agent tools: record, query, investigate, trace, changelog, discover
- Three transport surfaces: MCP server, REST API, CLI
- 5 pluggable backends: InMemory, SQLite, Snowflake, HTTP pass-through, JSON files
- 4 source connectors: SQL, REST, GitHub, Prefect
- 3 regulatory compliance profiles: SR 11-7, EU AI Act, NIST AI RMF
- ML model introspection plugins (sklearn, xgboost, lightgbm)
- Audit pack export (HTML, JSON, Markdown)
- Observations, validation runs, and feedback lifecycle
- Scanner protocol for platform-level model discovery
- Plugin discovery via entry_points

**Out of Scope (by design)**:
- Model training / experiment tracking (MLflow/W&B territory)
- Real-time monitoring / alerting
- Automated remediation of findings
- Model serving / deployment
- Feature stores
- Data quality monitoring
- UI / dashboard frontend (REST API exists but no bundled frontend)
- Organization-specific connectors, auth, backends (separate companion packages)
- Model comparison / A/B testing

### Success Criteria

**Success looks like**:
1. **Regulatory readiness**: Model inventory is comprehensive enough to satisfy SR 11-7 and EU AI Act (August 2026 deadline) audit requirements — complete coverage of deployed models with audit trails
2. **Coverage at scale**: Organizations move from partial tracking (~15%) to >90% coverage across all platforms
3. **OSS adoption**: External organizations (banks, fintechs) adopt model-ledger as their model inventory solution — evidenced by PyPI downloads, GitHub stars, and external contributions
4. **Agent-native usage**: AI agents (via MCP) become the primary interface for querying and managing the inventory — model governance becomes conversational
5. **Composite governance**: Business-level composites successfully aggregate thousands of technical nodes with automatic change propagation, enabling regulators to examine governable entities

**Failure looks like**:
- Regulatory audit finds significant gaps in model inventory coverage
- Framework is too complex for MRM teams to adopt — they fall back to spreadsheets
- OSS project stays internal-only with no external adoption
- Event-log paradigm creates performance bottlenecks at scale (>10K models)