VenturFlow/Assay

assay

Assay every AI agent decision before money moves. A safety and validation library for AI agentic workflows in finance, contributed by VenturFlow to the open-source community.

Apache 2.0 · Python 3.11+ · Status: early


About this project

This library is an open-source goodwill release from VenturFlow, a separate, commercial Agentic AI platform built for venture-capital firms. While building VenturFlow's platform we hit the same set of agent-safety problems again and again, and decided that teams in this field shouldn't each have to solve them from scratch. So we pulled out the layer we think is generally useful and ship it here under Apache 2.0.

This repository is not the VenturFlow platform. It's an ad-hoc, focused library: just the validation and safety layer that sits between an AI agent and a downstream action — a trade, a wire, a filing, a report.

Use this if you're shipping AI agents that act on financial workflows and you want guardrails you can read, version, and audit.

Visit VenturFlow if you want the full agentic stack purpose-built for VC firms (research, sourcing, diligence, portfolio, LP reporting, …). The two interoperate but neither is required for the other.


What it does

Sits between an AI agent and the action it's about to take, at four boundaries (and more to come):

                ┌─────────────────────────────────────────┐
                │            YOUR AI AGENT                │
                │   (Claude / GPT / Gemini / any LLM)     │
                └────────────────┬────────────────────────┘
                                 │
   ┌─────────────────────────────┼─────────────────────────────┐
   │ 1. OUTPUT VALIDATION        │  before the agent's          │
   │    schema dispatch +        │  recommendation is acted on  │
   │    regulatory rule packs    │                              │
   ├─────────────────────────────┼─────────────────────────────┤
   │ 2. TOOL-CALL GATE           │  before any side-effectful   │
   │    typed args + policies    │  tool runs                   │
   │    + approval thresholds    │                              │
   ├─────────────────────────────┼─────────────────────────────┤
   │ 3. TRAJECTORY VALIDATOR     │  during / after a whole      │
   │    predecessors + budget    │  agent session               │
   │    + loop detection         │                              │
   ├─────────────────────────────┼─────────────────────────────┤
   │ 4. ENTITY RESOLVER          │  any time a ticker/CUSIP/    │
   │    ground every named       │  CIK/ISIN/counterparty is    │
   │    identifier               │  mentioned                   │
   └─────────────────────────────┼─────────────────────────────┘
                                 │
                          Downstream action
                  (order / wire / filing / report)

Every check writes to an append-only audit log. Every regulatory rule pack is stamped with its citation and effective date.


Status & disclaimers

  • Early. API is stabilising; we'll keep the slash-command and validate_tool_call / AssayValidator surfaces stable as v2.0 lands. Internals will keep moving.
  • Scaffolding, not legal advice. The shipped regulatory packs (SEC 15c3-1, Reg T, Volcker, FINRA 4210, MiFID II suitability, OFAC sanctions) are machine-checkable subsets of named requirements with citations. They are not a substitute for compliance counsel and they do not auto-track regulatory amendments.
  • BYOD (bring your own data). Your firm's data (restricted securities, sanctions lists, sector allowlists, portfolio state) stays local. Nothing is sent to VenturFlow or any third party unless you configure a remote LLM provider for the optional semantic-consistency check.

Install

Requirements

  • Python 3.11+
  • pip install pydantic pyyaml structlog requests
  • Optional, only for the semantic-consistency layer:
    • pip install anthropic (default provider) or pip install openai
    • ANTHROPIC_API_KEY or OPENAI_API_KEY

From source

git clone https://github.com/VenturFlow/Assay.git
cd Assay
pip install -e ".[dev]"

Configure your environment

cp .env.example .env
# edit .env — at minimum set FIRM_DATA_PATH to your BYOD JSON

Quick start — one example per layer

1. Validate an agent output

from assay.validator import AssayValidator

v = AssayValidator(
    firm_data_path="data/my_firm.json",
    enable_semantic_check=False,
)

result = v.validate({
    "ticker": "MSFT",
    "asset_class": "equity",
    "action": "BUY",
    "confidence": 0.82,
    "position_size_pct": 0.03,
    "reasoning": "Liquid name, stable bid-ask, clean fit for the rotation.",
    "risk_score": 4.0,
    "time_horizon": "medium",
    "flags": ["liquidity_checked"],
})

print(result["passed"], result["action"])         # True passed
print(result["workflow_type"])                    # trade_recommendation (auto-detected)
print(result["active_packs"])                     # [{"name": "firm_base", "version": "...", ...}]

2. Gate a tool call before it runs

from assay.tools import ToolGate

decision = ToolGate().validate_tool_call(
    "wire_transfer",
    {
        "source_account_id": "A1",
        "beneficiary_name": "OK Counterparty",
        "beneficiary_account": "US-12345",
        "beneficiary_country": "US",
        "amount_usd": 2_000_000,
        "purpose": "vendor payment",
    },
)

print(decision.action)        # require_approval
print(decision.reason)        # amount_usd=2000000.0 crosses approval threshold 1000000

3. Validate a whole agent trajectory

from assay.trajectory import AgentSession, TrajectoryValidator

session = AgentSession(agent_id="trade-bot-1", goal="rotate into MSFT")
session.record_plan("read state, check market data, place a small order")
session.record_tool_call("read_position", {"account_id": "A1"})
session.record_tool_call("read_market_data", {"ticker": "MSFT"})
session.record_tool_call("place_order", {
    "account_id": "A1", "ticker": "MSFT", "side": "buy",
    "quantity": 100, "order_type": "limit", "limit_price": 412.20,
})

result = TrajectoryValidator().check(session)
print(result.passed)                  # True
print(result.budget_consumed)         # {"tool_calls": 3, "cost_usd": 0.0, ...}

4. Resolve a named entity

from assay.entities import MockResolver

print(MockResolver().resolve("AAPL", "ticker").found)            # True
print(MockResolver().resolve("FAKETICKER123", "ticker").found)   # False

The four layers in detail

1. Output validation (AssayValidator)

Dispatches to one of 10 typed workflow schemas (auto-detected if workflow_type is absent), then runs six sub-layers, the last two of which are optional:

  1. Schema — Pydantic model validation per workflow.
  2. Business rules — the active rule packs filtered by workflow type.
  3. Risk guardrails — position-size / concentration / confidence checks for trade-like workflows.
  4. Firm BYOD — restricted securities, watchlists, approved sectors.
  5. Semantic consistency (optional, AI-powered) — does the agent's reasoning actually support its conclusion?
  6. Entity check (optional) — ticker/CUSIP/CIK/ISIN/counterparty grounding.

Every call appends a structured entry to logs/audit.jsonl. Outputs that violate hard rules trigger escalation through the configured channel (Slack today; PagerDuty/email are pluggable).
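
As a rough illustration of how sub-layers like these can be chained, the sketch below runs each layer in order and collects every violation into one result. The names check_schema, check_risk and run_layers, and the 10% position-size cap, are assumptions made for this example; they are not Assay's internals or defaults.

```python
from typing import Callable

# Each layer takes the agent output and returns a list of violation strings.
Layer = Callable[[dict], list]

MAX_POSITION_SIZE = 0.10  # assumed cap, for illustration only


def check_schema(output: dict) -> list:
    """Hypothetical schema layer: a few required keys must be present."""
    required = ("ticker", "action", "confidence", "position_size_pct")
    return [f"missing field: {key}" for key in required if key not in output]


def check_risk(output: dict) -> list:
    """Hypothetical risk layer: reject oversized positions."""
    size = output.get("position_size_pct", 0.0)
    if size > MAX_POSITION_SIZE:
        return [f"position_size_pct={size} exceeds the assumed cap"]
    return []


def run_layers(output: dict, layers: list) -> dict:
    """Run every layer in order and collect all violations."""
    violations = []
    for layer in layers:
        violations.extend(layer(output))
    return {"passed": not violations, "all_violations": violations}


LAYERS = [check_schema, check_risk]
ok = run_layers(
    {"ticker": "MSFT", "action": "BUY", "confidence": 0.82, "position_size_pct": 0.03},
    LAYERS,
)
bad = run_layers(
    {"ticker": "MSFT", "action": "BUY", "confidence": 0.82, "position_size_pct": 0.25},
    LAYERS,
)
print(ok["passed"])   # True
print(bad["passed"])  # False
```

Running every layer instead of stopping at the first failure means a single audit entry can list all violations at once.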

Workflow schemas

| workflow_type | Pydantic model | Typical agent |
|---|---|---|
| trade_recommendation | TradeRecommendation | Single-name trade or position-sizing bot |
| portfolio_summary | PortfolioSummary | Portfolio analyst |
| portfolio_rebalance | PortfolioRebalance | Rebalancer / tax-loss harvester |
| nav_reconciliation | NavReconciliation | Fund-ops agent |
| due_diligence_report | DueDiligenceReport | DD bot / IC pre-read |
| kyc_flag | KycFlag | AML / sanctions screening |
| margin_call | MarginCall | Margin monitor |
| credit_memo | CreditMemo | Credit underwriting agent |
| options_strategy | OptionsStrategy | Options structurer |
| ocio_recommendation | OcioRecommendation | OCIO / IPS-aligned allocator |
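
This README does not describe how auto-detection works. One plausible approach, sketched below, is to match the output's keys against a set of "signature" fields per workflow and refuse to guess when the match is ambiguous. The signature fields for margin_call and kyc_flag are invented for this example, and detect_workflow_type is a hypothetical helper, not part of the library.

```python
from typing import Optional

# Hypothetical signature fields per workflow; the real Pydantic models may differ.
SIGNATURES = {
    "trade_recommendation": frozenset({"ticker", "action", "position_size_pct"}),
    "margin_call": frozenset({"account_id", "margin_deficit_usd"}),
    "kyc_flag": frozenset({"entity_name", "risk_level"}),
}


def detect_workflow_type(output: dict) -> Optional[str]:
    """Return the explicit workflow_type, or infer one from the fields present."""
    explicit = output.get("workflow_type")
    if explicit is not None:
        return explicit
    keys = set(output)
    matches = [name for name, sig in SIGNATURES.items() if sig <= keys]
    # Refuse to guess when zero or several workflows match.
    return matches[0] if len(matches) == 1 else None


print(detect_workflow_type({"ticker": "MSFT", "action": "BUY", "position_size_pct": 0.03}))
# trade_recommendation
print(detect_workflow_type({"unrelated": 1}))
# None
```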

Shipped rule packs (limited testing for now)

| Pack | Citation | Applies to |
|---|---|---|
| firm_base | (internal default) | trade, rebalance, OCIO |
| sec_15c3_1 | 17 CFR 240.15c3-1 | trade, rebalance, margin call |
| reg_t_margin | 12 CFR 220 | trade, options, margin call |
| volcker | 12 USC 1851 / Reg VV | trade, rebalance |
| finra_4210 | FINRA Rule 4210 | margin, options, trade |
| mifid_ii_suitability | Dir 2014/65/EU Art. 25 | trade, OCIO, rebalance |
| ofac_sanctions | 31 CFR Ch. V; SDN List | trade, KYC, credit memo, rebalance |

Each pack carries version, effective_date, regulator, citation. Compose them in your firm BYOD file:

{
  "firm_name": "Acme Capital",
  "rule_packs": ["firm_base", "sec_15c3_1", "ofac_sanctions"]
}

Add a firm-specific pack by writing one more YAML under assay/rules/packs/ and referencing it. See assay/rules/packs/_README.md for the rule grammar.
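
To make the composition concrete, here is a minimal sketch of how a list of pack names from the firm file could be narrowed to the packs that apply to one workflow. RulePack, REGISTRY and active_packs are hypothetical names. The version and effective_date values are placeholders, and the mapping of "trade" to trade_recommendation (and so on) is an assumption based on the tables above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RulePack:
    name: str
    version: str         # "x.y.z" below is a placeholder, not a real version
    effective_date: str  # "YYYY-MM-DD" below is a placeholder, not a real date
    regulator: str
    citation: str
    applies_to: tuple


REGISTRY = {
    pack.name: pack
    for pack in (
        RulePack("firm_base", "x.y.z", "YYYY-MM-DD", "internal", "(internal default)",
                 ("trade_recommendation", "portfolio_rebalance", "ocio_recommendation")),
        RulePack("sec_15c3_1", "x.y.z", "YYYY-MM-DD", "SEC", "17 CFR 240.15c3-1",
                 ("trade_recommendation", "portfolio_rebalance", "margin_call")),
        RulePack("ofac_sanctions", "x.y.z", "YYYY-MM-DD", "OFAC", "31 CFR Ch. V; SDN List",
                 ("trade_recommendation", "kyc_flag", "credit_memo", "portfolio_rebalance")),
    )
}


def active_packs(firm_data: dict, workflow_type: str) -> list:
    """Return the firm's configured packs that apply to this workflow type."""
    names = firm_data.get("rule_packs", [])
    unknown = [name for name in names if name not in REGISTRY]
    if unknown:
        # Fail loudly instead of silently skipping a misspelled pack.
        raise KeyError(f"unknown rule packs: {unknown}")
    return [REGISTRY[name] for name in names if workflow_type in REGISTRY[name].applies_to]


firm = {"firm_name": "Acme Capital", "rule_packs": ["firm_base", "sec_15c3_1", "ofac_sanctions"]}
print([p.name for p in active_packs(firm, "kyc_flag")])     # ['ofac_sanctions']
print([p.name for p in active_packs(firm, "margin_call")])  # ['sec_15c3_1']
```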

2. Tool-call gate (ToolGate)

Pre-execution validation for any agent tool call. Returns allow, deny, or require_approval.

Registered tools out of the box: place_order, cancel_order, wire_transfer, submit_filing, request_quote, read_position, read_market_data. Read-only tools auto-allow once typed args pass. Side-effectful tools go through:

  1. Typed-args validation (Pydantic per tool).
  2. Tool-policy packs in assay/tools/policies/.
  3. Firm rule packs with applies_to: ["tool:<name>"] (e.g. OFAC auto-applies to tool:wire_transfer).
  4. Approval thresholds. Defaults: wire_transfer.amount_usd >= $1M and place_order.estimated_notional_usd >= $5M → require_approval.
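
The threshold step in particular is simple enough to sketch. The example below covers only that last step, using the two default thresholds quoted above; Decision and check_threshold are hypothetical stand-ins, not the ToolGate implementation, and the typed-args and policy steps are omitted.

```python
from dataclasses import dataclass

# Default thresholds quoted above: (tool, argument) -> USD amount.
APPROVAL_THRESHOLDS = {
    ("wire_transfer", "amount_usd"): 1_000_000,
    ("place_order", "estimated_notional_usd"): 5_000_000,
}


@dataclass
class Decision:
    action: str  # "allow" or "require_approval" in this sketch
    reason: str


def check_threshold(tool: str, args: dict) -> Decision:
    """Require approval when a monitored argument reaches its threshold."""
    for (name, field), limit in APPROVAL_THRESHOLDS.items():
        if name == tool and field in args and args[field] >= limit:
            return Decision(
                "require_approval",
                f"{field}={float(args[field])} crosses approval threshold {limit}",
            )
    return Decision("allow", "below approval threshold")


decision = check_threshold("wire_transfer", {"amount_usd": 2_000_000})
print(decision.action)  # require_approval
print(decision.reason)  # amount_usd=2000000.0 crosses approval threshold 1000000
```

The comparison is inclusive (>=), so a wire of exactly $1M also requires approval.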

Add a new tool by:

  1. Defining a Pydantic args model in assay/tools/schemas.py.
  2. Registering it in assay/tools/registry.py.
  3. (Optional) adding policy YAML in assay/tools/policies/.

3. Trajectory validator (TrajectoryValidator)

Catches failures that only show up at the session level — not a single bad output, but a bad sequence.

Built-in policy types:

  • RequiredPredecessor: place_order must come after read_position and read_market_data.
  • BudgetCap — caps on tool_calls, cost_usd, retries_per_tool, elapsed_seconds.
  • LoopDetection — flags N consecutive identical (tool, args) calls.
  • ToolFrequencyCap — e.g. request_quote ≤ 50/session.

Default policies are tuned for trade-execution agents; override per-session via the policies argument.
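
Two of these policy types can be illustrated on a plain list of (tool, args) pairs. The sketch below is a simplified stand-in for the idea, not the TrajectoryValidator code: missing_predecessors and has_loop are hypothetical helpers, and the loop length of 3 is an assumed value for N.

```python
def missing_predecessors(calls: list, tool: str, required: list) -> list:
    """Return the required tools not yet seen before the first call to `tool`."""
    seen = set()
    for name, _args in calls:
        if name == tool:
            return sorted(set(required) - seen)
        seen.add(name)
    return []  # `tool` was never called, so there is nothing to enforce


def has_loop(calls: list, n: int = 3) -> bool:
    """True if n consecutive calls share the same tool and the same arguments."""
    if n < 2:
        raise ValueError("n must be at least 2")
    run = 1
    for prev, cur in zip(calls, calls[1:]):
        run = run + 1 if prev == cur else 1
        if run >= n:
            return True
    return False


good = [
    ("read_position", {"account_id": "A1"}),
    ("read_market_data", {"ticker": "MSFT"}),
    ("place_order", {"account_id": "A1", "ticker": "MSFT", "side": "buy", "quantity": 100}),
]
skipped = [("read_market_data", {"ticker": "MSFT"}), ("place_order", {"ticker": "MSFT"})]
stuck = [("request_quote", {"ticker": "MSFT"})] * 3

required = ["read_position", "read_market_data"]
print(missing_predecessors(good, "place_order", required))     # []
print(missing_predecessors(skipped, "place_order", required))  # ['read_position']
print(has_loop(stuck))                                         # True
```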

4. Entity resolver (EntityChecker)

Catches one of the most common agent failures in finance: invented tickers, CUSIPs, CIKs, ISINs, and fund or counterparty names.

Pluggable resolvers:

  • MockResolver — small built-in map, for tests and offline runs.
  • LocalRegistryResolver — reads CSVs from a directory you control (BYOD). Validates symbol shape (CUSIP-9, ISIN-12, etc.).
  • OpenFigiResolver — stub for the OpenFIGI API (drop in an API key + a small POST).
  • ChainedResolver — try registries in order; return the first hit.

default_resolver() builds a chain from env + firm data: LocalRegistry (if configured) → OpenFIGI (if key set) → Mock.
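
The chaining idea ("try registries in order; return the first hit") and the shape check can be sketched without the library. Everything below is illustrative: table_resolver, chained and has_valid_shape are hypothetical names, the lookup tables are toy data, and the patterns only check length and character classes, not check digits.

```python
import re
from typing import Callable, Optional

# A resolver maps (symbol, kind) to a record, or None when it has no match.
Resolver = Callable[[str, str], Optional[dict]]

# Simplified shape patterns; check digits are not verified in this sketch.
SHAPES = {
    "cusip": re.compile(r"[0-9A-Z]{8}[0-9]"),
    "isin": re.compile(r"[A-Z]{2}[0-9A-Z]{9}[0-9]"),
}


def has_valid_shape(symbol: str, kind: str) -> bool:
    """Kinds without a known pattern (e.g. ticker) are accepted as-is."""
    pattern = SHAPES.get(kind)
    return pattern is None or pattern.fullmatch(symbol.upper()) is not None


def table_resolver(table: dict) -> Resolver:
    """Build a resolver backed by an in-memory {(kind, symbol): record} table."""
    def resolve(symbol: str, kind: str) -> Optional[dict]:
        return table.get((kind, symbol.upper()))
    return resolve


def chained(resolvers: list) -> Resolver:
    """Try each resolver in order and return the first hit."""
    def resolve(symbol: str, kind: str) -> Optional[dict]:
        if not has_valid_shape(symbol, kind):
            return None
        for resolver in resolvers:
            hit = resolver(symbol, kind)
            if hit is not None:
                return hit
        return None
    return resolve


local = table_resolver({("ticker", "MSFT"): {"source": "local"}})
mock = table_resolver({("ticker", "MSFT"): {"source": "mock"}, ("ticker", "AAPL"): {"source": "mock"}})
resolve = chained([local, mock])

print(resolve("MSFT", "ticker"))           # {'source': 'local'}
print(resolve("AAPL", "ticker"))           # {'source': 'mock'}
print(resolve("FAKETICKER123", "ticker"))  # None
```

Order matters: placing the firm-controlled registry first lets local data take precedence over the fallback.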


Bring your own data (BYOD)

Your firm's data stays in your local environment unless you configure a remote LLM provider for the optional semantic-consistency check. The example file data/my_firm_example.json shows the supported keys:

| Key | Used by | Purpose |
|---|---|---|
| rule_packs | rules engine | Which packs are active |
| restricted_securities | firm-data validator + tool gate | Tickers that can never be recommended or ordered |
| watchlist | firm-data validator | Tickers that require human sign-off |
| approved_sectors | firm-data validator | Sector allowlist |
| sanctioned_entities | OFAC pack + tool gate | Names blocked everywhere (load your SDN list here) |
| sanctioned_jurisdictions | OFAC pack + tool gate | Country blocklist |
| portfolio_positions | risk guardrails | Current weights for concentration checks |
| sector_weights | risk guardrails | Current sector exposure |
| custom_rules | rules engine | Extra YAML-style rules merged at runtime |
| risk_overrides | risk guardrails | Per-firm threshold overrides |
| entity_registry_dir | entity resolver | Directory of CSVs for the LocalRegistry resolver |

Get started:

cp data/my_firm_example.json data/my_firm.json
echo "FIRM_DATA_PATH=./data/my_firm.json" >> .env
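
For a sense of how two of these keys might be consumed, the sketch below loads a firm file from FIRM_DATA_PATH and classifies a ticker against restricted_securities and watchlist. It assumes both keys hold flat lists of ticker strings, which this README does not confirm; load_firm_data and classify_ticker are hypothetical helpers, not library functions.

```python
import json
import os
import tempfile


def load_firm_data(path=None) -> dict:
    """Load the firm JSON from an explicit path or from FIRM_DATA_PATH."""
    path = path or os.environ.get("FIRM_DATA_PATH")
    if not path:
        raise RuntimeError("FIRM_DATA_PATH is not set")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def classify_ticker(firm: dict, ticker: str) -> str:
    """Return 'restricted', 'watchlist' or 'clear' (assumes flat ticker lists)."""
    symbol = ticker.upper()
    if symbol in {t.upper() for t in firm.get("restricted_securities", [])}:
        return "restricted"
    if symbol in {t.upper() for t in firm.get("watchlist", [])}:
        return "watchlist"
    return "clear"


# Demo with a throwaway file standing in for data/my_firm.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "my_firm.json")
    with open(demo_path, "w", encoding="utf-8") as handle:
        json.dump({"restricted_securities": ["XYZ"], "watchlist": ["ABC"]}, handle)
    firm = load_firm_data(demo_path)

print(classify_ticker(firm, "xyz"))   # restricted
print(classify_ticker(firm, "ABC"))   # watchlist
print(classify_ticker(firm, "MSFT"))  # clear
```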

Audit log

Every output validation appends one JSON line to logs/audit.jsonl (pretty-printed here for readability):

{
  "timestamp": "2025-01-15T18:11:14Z",
  "agent_id": "trade-bot-1",
  "firm": "Acme Capital",
  "output": { ...validated output... },
  "validation": {
    "passed": true,
    "action": "passed",
    "workflow_type": "trade_recommendation",
    "active_packs": [...],
    "all_violations": [],
    "all_warnings": []
  },
  "action_taken": "passed"
}

Drop it into Splunk, Datadog, or any append-only log pipeline. For SOC 2 / regulated firms we recommend adding hash-chained or signed entries (see roadmap below).
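
Until tamper-evident logging ships, hash chaining can be layered on top of the existing entries. The sketch below shows the general technique using only the standard library: each entry stores the hash of the previous one, so editing or removing an earlier line breaks verification. The prev_hash and hash field names and the all-zero genesis value are assumptions for this example, not a format the library defines.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def chain_entry(entry: dict, prev_hash: str) -> dict:
    """Return a copy of the entry with prev_hash and its own hash attached."""
    body = dict(entry, prev_hash=prev_hash)
    return dict(body, hash=_digest(body))


def verify_chain(entries: list) -> bool:
    """Recompute every hash and check that each entry links to the one before."""
    prev = GENESIS
    for entry in entries:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body.get("prev_hash") != prev or entry.get("hash") != _digest(body):
            return False
        prev = entry["hash"]
    return True


first = chain_entry({"agent_id": "trade-bot-1", "action_taken": "passed"}, GENESIS)
second = chain_entry({"agent_id": "trade-bot-1", "action_taken": "escalated"}, first["hash"])
print(verify_chain([first, second]))     # True

tampered = dict(first, action_taken="escalated")
print(verify_chain([tampered, second]))  # False
```

A hash chain only detects edits; it does not prevent someone who can rewrite the whole file from rebuilding the chain, which is why signing or external anchoring is usually added on top.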


Claude Code plugin

Install:

claude plugin add ./claude-code-plugin

Slash commands you get:

| Command | Boundary |
|---|---|
| /vf-validate <json-or-path> | output validation |
| /vf-tool-check <tool> <args> | pre-execution tool gate |
| /vf-trajectory <session-or-path> | whole-session check |
| /vf-entity <kind> <values...> | entity grounding |
| /vf-rules, /vf-packs, /vf-audit, /vf-firm-init | inspection / setup |

Plus an assay-validation skill (auto-suggests the right layer) and an assay-validator subagent (runs the check, refuses workarounds). Details in claude-code-plugin/README.md.


Cowork plugin

Install:

Claude → Settings → Integrations → Cowork Plugins → Add local plugin
→ point at: cowork-plugin/manifest.json

Exposes six commands over the Cowork stdin/stdout JSON protocol: validate, validate_tool_call, validate_trajectory, resolve_entity, rules, audit. Details in cowork-plugin/README.md.


Provider configuration (semantic check)

The optional semantic-consistency layer runs an independent LLM call to ask "does this reasoning actually support this conclusion?" It's model-agnostic via a provider abstraction in assay/providers/.

Claude (Anthropic) — default

export ASSAY_PROVIDER=claude
export ANTHROPIC_API_KEY=sk-ant-...

OpenAI-compatible

Works with OpenAI, Azure OpenAI, Together AI, Groq, Mistral, and local LM Studio:

export ASSAY_PROVIDER=openai        # or: azure | together | groq | mistral | local
export OPENAI_API_KEY=sk-...

Project structure

assay/
├── assay/
│   ├── validator.py             # main orchestrator
│   ├── schema/                  # 10 Pydantic workflow models + registry
│   ├── rules/
│   │   ├── engine.py            # pack-aware rule evaluator
│   │   └── packs/               # YAML rule packs (firm_base + 6 regulatory)
│   ├── risk/                    # numeric guardrails
│   ├── byod/                    # firm-data loader + validator
│   ├── entities/                # resolver ABC + Mock/Local/OpenFIGI/Chained
│   ├── tools/                   # tool gate + typed schemas + policies
│   ├── trajectory/              # AgentSession + TrajectoryValidator + policies
│   ├── semantic/                # AI consistency checker
│   ├── audit/                   # append-only logger
│   ├── escalation/              # Slack / log-only / email
│   ├── providers/               # Claude + OpenAI-compatible
│   └── cowork/                  # legacy Cowork plugin shim
├── claude-code-plugin/          # Claude Code v2 plugin (commands/skill/agent/scripts)
├── cowork-plugin/               # Cowork v2 plugin (manifest + entry script)
├── data/
│   └── my_firm_example.json
├── examples/
│   ├── run_demo_agent.py        # output-validation demo
│   └── run_agentic_loop.py      # tool gate + trajectory + entity demo
├── tests/                       # 40 tests covering all four layers
└── pyproject.toml

Testing

pip install -e ".[dev]"
PYTHONPATH=. pytest tests/ -v

Current state: 40 tests passing across schema dispatch, rule packs, tool gate, trajectory, entity resolver.

For UI/agent integration, run the demos:

PYTHONPATH=. FIRM_DATA_PATH=data/my_firm_example.json python examples/run_demo_agent.py
PYTHONPATH=. FIRM_DATA_PATH=data/my_firm_example.json python examples/run_agentic_loop.py

Roadmap

These are things we have working internally at VenturFlow and want to upstream once they stabilise:

  • Adversarial eval harness — red-team trajectory suite (prompt-injected term sheets, restricted-ticker traps, sycophancy probes). Wires into eval frameworks like inspect-ai.
  • Tamper-evident audit log — hash-chained or signed entries with provenance fields (prompt template version, model snapshot, retrieved-doc hashes, firm-data revision).
  • Citation grounding — when an agent claims "per Q3 2025 10-K, revenue grew 12%," verify the claim against the cited document.
  • Determinism + replay — backtest historical agent outputs against today's rule packs to find what would have been blocked.
  • Mid-trajectory HITL — pause-and-approve gates with timeouts, not just post-hoc escalation.
  • Confidence calibration tracking — log claimed confidence vs. realised outcome; auto-tune per (agent, asset class).
  • Policy-as-code — optional OPA/Rego/Cedar/CEL backend for complex multi-field conditions.
  • More regulatory packs — Form PF, SEC Marketing Rule, EU AI Act high-risk classification, GDPR-for-financial-data, Basel III concentration limits.

We're not in a hurry — better to ship one correct pack than ten approximate ones.


Contributing

This is an early, ad-hoc project. The bar for contributions:

  1. Tests. Anything you add should have a test that fails without your change.
  2. Citations. New regulatory packs must reference an authority and stamp version + effective_date. No "vibes-based compliance."
  3. No silent rule weakening. Don't make a recommendation pass by editing a pack. Either fix the recommendation or escalate.
  4. Backwards-compat for slash commands and the AssayValidator / ToolGate public API. Internals are fair game.

Open issues / PRs at github.com/VenturFlow/Assay.


Who is VenturFlow?

VenturFlow is an Agentic AI platform purpose-built for venture-capital firms — sourcing, diligence, portfolio support, LP reporting. We're a commercial product. This validation library is something we hit the need for repeatedly while building VenturFlow's agentic stack, and we believe other teams shipping AI agents into financial workflows shouldn't have to re-solve it.

If this library is useful to you, great — we're glad. If you want the full agentic platform on top, come talk to us.


License

Apache License 2.0 — see LICENSE.

Copyright (c) 2026 VenturFlow
Licensed under the Apache License, Version 2.0.

Disclaimer

This library and the regulatory rule packs it ships are not legal, compliance, tax, or investment advice. The packs encode named requirements as machine-checkable rules; they do not substitute for review by qualified counsel and they do not track regulatory amendments automatically. You are responsible for the rules you choose to apply to your agents.


An open-source contribution from VenturFlow.
