GitHub - ferhatvonkaplan/truceprotocol: TRUCE Agent Trust Framework (TATF) — Open standard for autonomous agent trust scoring. Apache 2.0.

                                    ████████╗██████╗ ██╗   ██╗ ██████╗███████╗
                                    ╚══██╔══╝██╔══██╗██║   ██║██╔════╝██╔════╝
                                       ██║   ██████╔╝██║   ██║██║     █████╗
                                       ██║   ██╔══██╗██║   ██║██║     ██╔══╝
                                       ██║   ██║  ██║╚██████╔╝╚██████╗███████╗
                                       ╚═╝   ╚═╝  ╚═╝ ╚═════╝  ╚═════╝╚══════╝

                                                AGENT  TRUST  FRAMEWORK

An open standard for scoring the trustworthiness of autonomous AI agents.

The Problem

MCP solved agent communication. x402 solved agent payments. But neither answers the question that matters when money is on the line:

"Should I trust this agent to fulfill THIS specific transaction?"

TATF answers that question — protocol-agnostically, deterministically, and without a central authority.

How It Works

                  ┌─────────────────────────────────────────┐
                  │           TATF  SCORING  MODEL          │
                  ├─────────────────────────────────────────┤
                  │                                         │
                  │  Layer 4   ADVERSARIAL TESTING          │  Optional
                  │  ─────────────────────────────────────  │
                  │  Layer 3   COMMUNITY SIGNALS            │  Recommended
                  │  ─────────────────────────────────────  │
                  │  Layer 2   BEHAVIORAL BASELINES  ◄──────│─── EMA, 6 dimensions
                  │  ─────────────────────────────────────  │
                  │  Layer 1   OBSERVABLE METRICS    ◄──────│─── Hard numbers
                  │                                         │
                  └────────────────┬────────────────────────┘
                                   │
                                   ▼
                  ┌─────────────────────────────────────────┐
                  │         ALPHA  TRUST  SCORE             │
                  │                                         │
                  │   Score: 0.0 ─────────────────── 1.0    │
                  │   Confidence: Wald interval             │
                  │   Tier: LOW / MEDIUM / HIGH             │
                  └────────────────┬────────────────────────┘
                                   │
                  ┌────────────────┼────────────────────────┐
                  │                │                        │
                  ▼                ▼                        ▼
           ┌──────────┐   ┌──────────────┐         ┌────────────┐
           │AUTO_PASS │   │  SOFT_HOLD   │         │ HARD_BLOCK │
           │  < 50    │   │   50 - 119   │         │   ≥ 120    │
           └──────────┘   └──────────────┘         └────────────┘

Key insight: Agents are scored against their own behavioral baseline — not a global standard. A 24/7 trading bot and a 9-to-5 procurement agent have different normals. TATF respects that.
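
To make the per-agent baseline idea concrete, here is a minimal sketch of an exponential moving average (EMA) baseline, the technique named for Layer 2 in the diagram. The class `EmaBaseline`, the smoothing factor `alpha=0.2`, and the relative-deviation measure are illustrative assumptions, not the TATF reference implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmaBaseline:
    """Hypothetical per-agent baseline for one numeric signal (e.g. price)."""

    alpha: float = 0.2            # assumed smoothing factor, not taken from the spec
    mean: Optional[float] = None  # None until the first observation arrives

    def update(self, value: float) -> None:
        # Standard EMA update: new = alpha * value + (1 - alpha) * old
        if self.mean is None:
            self.mean = value
        else:
            self.mean = self.alpha * value + (1 - self.alpha) * self.mean

    def relative_deviation(self, value: float) -> float:
        # Distance of a new value from this agent's own normal, as a ratio.
        if self.mean is None or self.mean == 0:
            return 0.0
        return abs(value - self.mean) / abs(self.mean)


trading_bot = EmaBaseline()
procurement_agent = EmaBaseline()
for _ in range(30):
    trading_bot.update(50_000.0)       # large trades are normal for this agent
    procurement_agent.update(1_000.0)  # small orders are normal for this one

# The same 50,000 transaction is routine for one agent and far outside
# the other agent's baseline.
print(trading_bot.relative_deviation(50_000.0))        # close to 0
print(procurement_agent.relative_deviation(50_000.0))  # close to 49
```

Because each agent carries its own baseline, the identical transaction yields very different deviations, which is the behavior the key insight describes.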

Quick Start

pip install tatf

from truce import TATFScorer, Transaction
from datetime import datetime, timezone

scorer = TATFScorer()

# Ingest 30 days of transaction history
transactions = [
    Transaction(
        timestamp=datetime(2026, 1, i, 10, 0, tzinfo=timezone.utc),
        price=1000.0,
        category="electronics",
        counterparty_id=f"cp-{i % 10:03d}",
    )
    for i in range(1, 31)
]
scorer.ingest("agent-123", transactions)

# Score the agent
result = scorer.score("agent-123", market_stability=0.8)
print(result.score)        # 0.7925
print(result.routing)      # AUTO_PASS
print(result.confidence)   # (0.6795, 0.9055)
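
The confidence value is described as a Wald interval. As a rough sketch, the textbook Wald interval for a proportion looks like the following. How TATF chooses the sample size `n` and the critical value `z` is not stated here, so both are assumptions, and this sketch is not expected to reproduce the exact numbers above.

```python
import math


def wald_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Textbook Wald interval for a proportion, clamped to [0, 1]."""
    if n <= 0:
        raise ValueError("n must be positive")
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Same point estimate, different amounts of evidence (illustrative values).
low_30, high_30 = wald_interval(0.8, n=30)
low_300, high_300 = wald_interval(0.8, n=300)
print(round(low_30, 3), round(high_30, 3))
print(round(low_300, 3), round(high_300, 3))
```

The interval narrows as more observations accumulate, so a long transaction history yields a tighter confidence range than a short one.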

Detect anomalies in real time:

suspicious = scorer.compute_anomaly(
    "agent-123",
    transaction=Transaction(
        timestamp=datetime(2026, 2, 1, 3, 0, tzinfo=timezone.utc),  # 3 AM
        price=50000.0,          # 50x normal
        category="agriculture", # never seen before
        counterparty_id="cp-999",
        concurrent_sessions=28,
    ),
)
print(suspicious.routing)     # HARD_BLOCK
print(suspicious.composite)   # 150+
print(suspicious.dimensions)  # Per-dimension breakdown

Six Scoring Dimensions

  Dimension                  Cap    Signal
  ──────────────────────────────────────────────────────
  1. Time anomaly             35    Operating outside normal hours
  2. Concurrent sessions      45    Abnormal parallel activity
  3. Price deviation          40    Unusual pricing behavior
  4. Category anomaly         30    New product category
  5. Negotiation rounds       25    Excessive bargaining
  6. Counterparty conc.       25    Sudden relationship shift
  ──────────────────────────────────────────────────────
  Composite range: 0 - 200

  0 ━━━━━━━━━━━━━ 50 ━━━━━━━━━━━━━ 120 ━━━━━━━━━━━━━ 200
     AUTO_PASS        SOFT_HOLD          HARD_BLOCK

No single dimension can trigger HARD_BLOCK alone: the largest cap is 45, well below the 120 threshold.
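
The caps and thresholds above can be sketched in a few lines. The caps and cut-offs are copied from the table and scale; the dictionary keys, the helper names, and the assumption that the composite is a plain sum of capped dimension scores are illustrative (the sum is suggested by the caps adding up to the 0 - 200 composite range, but is not confirmed here).

```python
# Caps and thresholds copied from the table and scale above.
# Key names are hypothetical; "counterparty_concentration" expands "Counterparty conc."
CAPS = {
    "time_anomaly": 35,
    "concurrent_sessions": 45,
    "price_deviation": 40,
    "category_anomaly": 30,
    "negotiation_rounds": 25,
    "counterparty_concentration": 25,
}
SOFT_HOLD_AT = 50
HARD_BLOCK_AT = 120


def composite(raw_scores: dict[str, float]) -> float:
    """Clamp each dimension to [0, cap] and sum (summation is an assumption)."""
    return sum(
        min(max(raw_scores.get(name, 0.0), 0.0), cap) for name, cap in CAPS.items()
    )


def route(score: float) -> str:
    """Map a composite score to a routing decision."""
    if score >= HARD_BLOCK_AT:
        return "HARD_BLOCK"
    if score >= SOFT_HOLD_AT:
        return "SOFT_HOLD"
    return "AUTO_PASS"


print(sum(CAPS.values()))                                # 200
print(route(composite({"concurrent_sessions": 999.0})))  # AUTO_PASS (capped at 45)
print(route(composite({name: 999.0 for name in CAPS})))  # HARD_BLOCK
```

Under this reading, even an extreme value in one dimension is clamped to its cap, so escalation requires several dimensions to fire together.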

Market Stress (AVX)

The Agent Volatility Index measures sector-level stress with k-anonymity protection:

from truce import AVXCalculator, AVXEvent

avx = AVXCalculator(k_anonymity_min=5)

events = [
    AVXEvent(firm_id=f"FIRM-{i}", price=100 + i, quantity=50)
    for i in range(10)
]
avx.ingest("electronics", events)

result = avx.compute("electronics")
print(result.avx_score)              # 0-100 stress level
print(result.dimensions.pd_score)    # Panic Diversification
print(result.dimensions.pv_score)    # Price Volatility

Four dimensions: PD (0.40) | PV (0.30) | DA (0.20) | CR (0.10)

AVX is only published when ≥ 5 unique firms contribute data.
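
A minimal sketch of the two rules above: the k-anonymity gate and the weighted combination of the four dimensions. The weights and the minimum of 5 firms come from this section; the function name, the 0-100 sub-score scale, and the linear weighted sum are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Weights copied from the line above; DA and CR are left unexpanded as in the source.
WEIGHTS = {"pd": 0.40, "pv": 0.30, "da": 0.20, "cr": 0.10}
K_ANONYMITY_MIN = 5


def publishable_avx(
    firm_ids: Iterable[str], sub_scores: dict[str, float]
) -> Optional[float]:
    """Return a weighted score, or None when too few unique firms contributed."""
    if len(set(firm_ids)) < K_ANONYMITY_MIN:
        return None  # suppress: the aggregate could be traced back to single firms
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


scores = {"pd": 60.0, "pv": 40.0, "da": 20.0, "cr": 10.0}  # made-up sub-scores

# Four events but only three unique firms: nothing is published.
print(publishable_avx(["FIRM-0", "FIRM-1", "FIRM-0", "FIRM-2"], scores))  # None

# Ten unique firms: the weighted score is returned.
print(publishable_avx([f"FIRM-{i}" for i in range(10)], scores))
```

Counting unique firm IDs rather than events matters: many events from a handful of firms would still expose individual behavior.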

Specification

  #    Document               Description
  ──────────────────────────────────────────────────────────────────
  00   Introduction           Motivation, scope, terminology
  01   Scoring Model          ALPHA composite score, four-layer model
  02   Behavioral Baselines   EMA baselines, six scoring dimensions
  03   Anomaly Detection      ATBF zones, routing, review queue
  04   Trust Attestation      Ed25519 signing, W3C VC mapping
  05   Adversarial Testing    Resilience evaluation
  06   Market Stress          AVX indicator, k-anonymity

Benchmarks

  1,000 synthetic agents  ·  97,370 transactions  ·  5 archetypes  ·  97% accuracy

  Archetype        Count    Accuracy    Routing Distribution
  ─────────────────────────────────────────────────────────────
  normal            500      99.8%      AUTO_PASS
  cautious          100     100.0%      AUTO_PASS
  volatile          224      92.4%      AUTO_PASS / SOFT_HOLD
  anomalous         117      91.5%      SOFT_HOLD / HARD_BLOCK
  cold_start         59     100.0%      AUTO_PASS (cold-start fallback)

Generate your own:

cd benchmarks
python generate_benchmark.py --agents 1000
python evaluate.py --dataset datasets/benchmark_v0.1.jsonl --verbose
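
As a quick consistency check on the benchmark table above, the per-archetype rows can be combined into a count-weighted mean. That the headline figure is computed this way is an assumption; the counts and accuracies are copied from the table.

```python
# (agent count, accuracy in %) per archetype, copied from the benchmark table.
results = {
    "normal": (500, 99.8),
    "cautious": (100, 100.0),
    "volatile": (224, 92.4),
    "anomalous": (117, 91.5),
    "cold_start": (59, 100.0),
}

total_agents = sum(count for count, _ in results.values())
weighted_accuracy = (
    sum(count * accuracy for count, accuracy in results.values()) / total_agents
)

print(total_agents)                 # 1000
print(round(weighted_accuracy, 1))  # 97.2
```

The counts add up to the stated 1,000 agents, and the weighted mean of about 97.2% agrees with the rounded 97% headline.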

Regulatory Alignment

  Regulation              TATF Coverage
  ──────────────────────────────────────────────────────────────────────────
  EU AI Act (Aug 2026)    Trust scoring for high-risk AI agent supervision
  NIST AI RMF             Maps to Govern, Map, Measure, Manage functions
  ISO/IEC 42001           AI management system requirements
  CSA ATF (Feb 2026)      Agentic Trust Framework maturity levels

Design Principles

Protocol-agnostic — works with MCP, A2A, ACP, or any agent protocol
Relative scoring — agents scored against their OWN baseline
Privacy-preserving — k-anonymity on aggregate metrics
Incrementally adoptable — implement layers independently
Deterministic — same inputs, same output, every time
Apache 2.0, forever — trust infrastructure must be a public good

Documentation

  Document            Description
  ──────────────────────────────────────────────────────────────────────────────────────────────────
  Why TATF?           Positioning, comparisons, and design rationale
  Integration Guide   Step-by-step developer guide with code examples
  Examples            6 runnable Python examples (scoring, anomaly, AVX, attestation, MCP pattern)
  FAQ                 Frequently asked questions
  Changelog           Release history
  Security Policy     Vulnerability reporting and responsible disclosure
  Governance          Decision process, roles, versioning
  Contributing        How to contribute, RFC process

Contributing

We welcome contributions. See CONTRIBUTING.md.

Significant changes follow an RFC process with a 14-day community review period.

License

Specification: Apache 2.0
Documentation: CC BY 4.0

Trust is the missing layer in agent commerce. TATF is the open standard.
