10 research projects. 4 published papers. One mission.
2.4 million AI agents deployed in production. Zero standard methodology for verifying they work correctly. Agents leak data, exceed budgets, drift from intended behavior, and fail in ways no one predicted.
Every framework helps you build agents. We're working on making them reliable.
An open-source research initiative spanning the complete agent development lifecycle:
- Behavioral specification & verification
- Security validation & supply chain analysis
- Token-efficient statistical testing
- Privacy-preserving memory architectures
- Communication fidelity benchmarks
- Chaos engineering & resilience testing
10 projects. Each backed by formal methods, mathematical proofs, and published research.
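To make "behavioral specification & verification" concrete, here is a minimal sketch of what runtime contract enforcement for an agent action could look like. Every name below (`Contract`, `enforce`, `ContractViolation`, the result-dict shape) is a hypothetical illustration, not the initiative's actual API.

```python
# Sketch: declare a behavioral contract for an agent action and enforce it at runtime.
# All names and the result format are illustrative assumptions.
from dataclasses import dataclass
from functools import wraps


class ContractViolation(Exception):
    """Raised when an agent action breaks its declared contract."""


@dataclass
class Contract:
    max_tokens: int                 # hard per-action token budget
    forbidden_substrings: tuple     # crude guard against leaking marked data


def enforce(contract):
    """Wrap an agent action so every call is checked against the contract."""
    def decorator(action):
        @wraps(action)
        def wrapper(prompt):
            result = action(prompt)  # expected: {"text": str, "tokens_used": int}
            if result["tokens_used"] > contract.max_tokens:
                raise ContractViolation(
                    f"budget exceeded: {result['tokens_used']} > {contract.max_tokens}")
            for s in contract.forbidden_substrings:
                if s in result["text"]:
                    raise ContractViolation(f"forbidden content in output: {s!r}")
            return result
        return wrapper
    return decorator


@enforce(Contract(max_tokens=500, forbidden_substrings=("ssn:",)))
def summarize(prompt):
    # Stand-in for a real model call.
    return {"text": f"summary of {prompt}", "tokens_used": 42}
```

A real enforcement layer would also cover temporal properties (ordering of tool calls, retries) rather than per-call predicates alone; this sketch shows only the simplest case.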
| Paper | arXiv | Year |
|---|---|---|
| Agent Behavioral Contracts: Formal Specification and Runtime Enforcement | arXiv:2602.22302 | 2026 |
| SkillFortify: Formal Verification for AI Agent Skill Security | arXiv:2603.00195 | 2026 |
| Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows | arXiv:2603.02601 | 2026 |
| Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense | arXiv:2603.02240 | 2026 |
More papers in preparation. Conference targets: ASE 2026, NeurIPS 2026, AAMAS 2027, ICSE 2027.
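The idea behind token-efficient regression testing can be illustrated with a sequential probability ratio test: instead of running a fixed (and token-expensive) number of trials against a non-deterministic agent, stop as soon as the evidence for "healthy" or "regressed" crosses a threshold. The pass rates and error bounds below are illustrative assumptions, not the method from the paper above.

```python
# Sketch: SPRT-style early stopping for regression tests on a stochastic agent.
# p0/p1/alpha/beta values are illustrative assumptions.
import math
import random


def sprt(sample, p0=0.9, p1=0.7, alpha=0.05, beta=0.05, max_trials=200):
    """Decide whether the agent's per-run pass rate is still >= p0 ("pass")
    or has regressed to <= p1 ("fail"), calling sample() (one agent
    run -> bool) as few times as possible."""
    upper = math.log((1 - beta) / alpha)   # cross this: accept regression
    lower = math.log(beta / (1 - alpha))   # cross this: accept healthy
    llr = 0.0
    for n in range(1, max_trials + 1):
        ok = sample()
        # Log-likelihood ratio of "regressed" (rate p1) vs "healthy" (rate p0).
        llr += math.log(p1 / p0) if ok else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "fail", n
        if llr <= lower:
            return "pass", n
    return "inconclusive", max_trials


random.seed(0)
# Simulated healthy agent that passes ~95% of runs.
verdict, runs = sprt(lambda: random.random() < 0.95)
```

With a clearly healthy or clearly broken agent, the test typically terminates after a dozen or so runs rather than the full trial budget, which is where the token savings come from.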
- Every project is backed by published research — not blog posts.
- Every tool is open-source and framework-agnostic.
- We solve problems with math and proofs — not marketing.
An independent research initiative by Varun Pratap Bhardwaj
We don't just identify problems in agent development. We solve them.