v0.1.0 (2026-06-13)
This release is published under the MIT License.
Bug Fixes
- Add parallel-safe Annotated reducers to BaseAgentState (04ed6b8)
  All state fields now use Annotated reducers for LangGraph parallel fan-out compatibility:
  - metadata: dict merge (last-writer-wins per key)
  - response/agent_type/routing_decision/error: last-writer-wins
  - suggestions/citations/steps_taken: list concatenation (operator.add)
  - messages: add_messages (existing)
  This fixes 'Can receive only one value per step' errors when parallel graph branches write to the same state field.
- ci: Fix codecov param on v7, add pr-title check filter, and import Callable (490583c)
- ci: Fix semantic release token and upgrade deprecated Node.js 20 actions (7dfea0d)
- release: Disable semantic-release build command and pypi upload (2b1c427)
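The reducer behaviour described for 04ed6b8 can be illustrated without LangGraph itself. The sketch below is only an approximation under stated assumptions: `SketchState`, `merge_dicts`, `last_writer_wins`, and `apply_updates` are hypothetical names, not the package's code, and the folding loop merely imitates how a graph runtime might combine writes from two parallel branches.

```python
import operator
from typing import Annotated, Any, TypedDict, get_type_hints


def merge_dicts(left: dict, right: dict) -> dict:
    """Dict merge: for a shared key, the later write wins."""
    return {**left, **right}


def last_writer_wins(left: Any, right: Any) -> Any:
    """Keep whichever value was written last."""
    return right


class SketchState(TypedDict):
    # Hypothetical stand-in for BaseAgentState; field names follow the release note.
    metadata: Annotated[dict, merge_dicts]
    response: Annotated[str, last_writer_wins]
    steps_taken: Annotated[list, operator.add]


def apply_updates(state: dict, updates: list[dict]) -> dict:
    """Fold several branch updates into one state using each field's reducer."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for update in updates:
        for key, value in update.items():
            reducer = hints[key].__metadata__[0]  # the reducer attached via Annotated
            merged[key] = reducer(merged[key], value) if key in merged else value
    return merged


# Two parallel branches write to the same fields in the same step.
branch_a = {"metadata": {"a": 1}, "response": "from A", "steps_taken": ["a"]}
branch_b = {"metadata": {"b": 2}, "response": "from B", "steps_taken": ["b"]}
result = apply_updates({"metadata": {}, "steps_taken": []}, [branch_a, branch_b])
```

Because every field carries a reducer, two writes in one step are combined instead of conflicting, which is the situation the 'Can receive only one value per step' error signals.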
Chores
- Fix workflows, docker, setup-telemetry, and resolve all type/mypy errors for first release (54cd42d)
- Run pre-commit quality gate autofix, add importlib.util, resolve mypy and formatting (1140c7b)
Documentation
- Comprehensive v0.1.0 release documentation (06c34b1)
  - Create root CHANGELOG.md with detailed release notes
  - Update docs/changelog.md with full feature breakdown
  - Update docs/index.md with badges, expanded features, architecture diagram
  - Update docs/getting-started/installation.md with extras table, pip/uv/poetry tabs
  - Update docs/getting-started/quickstart.md with expected outputs, next steps
  - Update docs/cli/commands.md with click-style help for all 7 commands
  - Update docs/guide/optimization.md with all 7 strategies, GEval, per-agent pattern
  - Fix noqa WPS433 → F811 in optimize/report.py
  - Format cli/commands.py
  - mkdocs build --strict passes with 0 errors
- Mkdocs Material + CI/CD + semantic release + pre-commit (2f672e8)
  Documentation (MkDocs Material):
  - 16 documentation pages (getting-started, user guide, CLI, architecture)
  - mike versioning (dev/latest aliases)
  - Dark/light theme with deep purple accent
  - Mermaid diagrams, tabbed content, code copy
  CI/CD (GitHub Actions):
  - ci.yml: lint (ruff), test (Python 3.11/3.12/3.13 matrix), typecheck (mypy)
  - release.yml: python-semantic-release + PyPI publish + GitHub Release
  - docs.yml: auto-deploy docs via mike on push/tag
  Pre-commit:
  - trailing whitespace, end-of-file-fixer, check-yaml/toml/json
  - ruff lint + format
  - mypy type checking
  - conventional-pre-commit (commit message enforcement)
  Author: UnicoLab
  - Updated pyproject.toml, LICENSE, README, all URLs
  - Added semantic-release config, coverage thresholds
  - Added docs and cli optional deps
  - Cleaned up 12 stale test files, removed conflicting pytest.ini
  - Enhanced Makefile with docs-serve, docs-build, docs-deploy, check-all
  88/88 tests passing, docs build clean in strict mode
Features
- Add Pixar logo, fix 130 lint errors, format all files (902072d)
  - Add assets/logo.png (Pixar-style 3D robot mascot)
  - Update README.md with centered logo
  - Fix 130 ruff lint errors (import sorting, unused imports, etc.)
  - Reformat 51 files with ruff format
  - All 161 tests pass, all linting clean
  - Verify CI workflows reference UnicoLab/agentomatic
- Add PromptOptimizationLoop: local-first prompt optimizer (3b4143f)
  Generic, framework-agnostic iterative prompt optimization engine:
  - PromptOptimizationLoop: evaluate → analyse failures → LLM rewrite → repeat
  - 4 rewrite strategies: iterative, adversarial, structured, minimal
  - Built-in scorers: keyword_overlap, contains_score
  - Pluggable scoring: sync or async (LLM-as-a-judge)
  - Pluggable rewrite LLM: any LangChain model or raw callable
  - Early stopping with patience
  - Rich HTML reports with SVG evolution charts
  - JSON experiment tracking for reproducibility
  - Zero project-specific deps: works with any agent
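The evaluate → analyse failures → rewrite → repeat cycle with patience-based early stopping might look roughly like the sketch below. It is not the PromptOptimizationLoop implementation: `optimize_prompt`, `toy_agent`, and `toy_rewrite` are hypothetical, the scorer is one plausible reading of "keyword overlap", and a plain callable stands in for the rewrite LLM.

```python
from typing import Callable


def keyword_overlap(expected: str, actual: str) -> float:
    """Fraction of expected words that also appear in the actual output."""
    expected_words = set(expected.lower().split())
    if not expected_words:
        return 0.0
    return len(expected_words & set(actual.lower().split())) / len(expected_words)


def optimize_prompt(
    prompt: str,
    dataset: list[tuple[str, str]],            # (question, expected answer) pairs
    run_agent: Callable[[str, str], str],      # (prompt, question) -> answer
    rewrite: Callable[[str, list[str]], str],  # (prompt, failed questions) -> new prompt
    max_iterations: int = 10,
    patience: int = 2,
) -> tuple[str, float, list[float]]:
    best_prompt, best_score, history = prompt, -1.0, []
    stale = 0  # iterations since the last improvement
    for _ in range(max_iterations):
        # Evaluate the current prompt on every example.
        scores = [keyword_overlap(exp, run_agent(prompt, q)) for q, exp in dataset]
        score = sum(scores) / len(scores)
        history.append(score)
        if score > best_score:
            best_prompt, best_score, stale = prompt, score, 0
        else:
            stale += 1
            if stale >= patience:  # early stopping
                break
        # Analyse failures and ask the rewriter for a new candidate.
        failures = [q for (q, _), s in zip(dataset, scores) if s < 1.0]
        if not failures:
            break
        prompt = rewrite(best_prompt, failures)
    return best_prompt, best_score, history


def toy_agent(prompt: str, question: str) -> str:
    # Toy behaviour: answers correctly only once the prompt asks for concision.
    return "Paris" if "concise" in prompt else "I am not sure"


def toy_rewrite(prompt: str, failures: list[str]) -> str:
    return prompt + " Be concise."


best_prompt, best_score, history = optimize_prompt(
    "You are a helper.", [("Capital of France?", "Paris")], toy_agent, toy_rewrite
)
```

Rewriting from the best prompt so far, instead of the latest candidate, is a design choice of this sketch; it keeps a bad rewrite from compounding.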
- Complete framework with SQLAlchemy storage, full example, and integration tests (c358ed1)
  New modules:
  - storage/models.py: SQLAlchemy ORM (ThreadModel, MessageModel, FeedbackModel)
  - storage/sqlalchemy.py: Async SQLAlchemyStore with connection pooling
  - examples/full_agent/: Full weather agent demonstrating ALL overwrite options (manifest, graph, nodes, config, schemas, tools, api, prompts, langgraph.json)
  Fixes:
  - platform.py: Mount routers for programmatic agents at build-time (previously only mounted during lifespan, breaking TestClient)
  Tests:
  - test_integration.py: 20 integration tests (platform + agent endpoints)
  - 41/41 tests passing (unit + integration)
- Deep DeepEval + HolySheet integration (fc2dc44)
  Metrics (metrics.py):
  - GEvalMetric: DeepEval GEval with custom criteria + eval_steps (chain-of-thought LLM-as-judge, Ollama fallback)
  - DeepEvalMetric: wrap ANY deepeval metric instance as BaseMetric
  - RedTeamMetric: adversarial safety scoring (bias+toxicity)
  - resolve_metrics() now supports 'geval:criteria' shorthand syntax
  - All DeepEval imports gracefully degrade via try/except
  Reports (report.py):
  - HolySheet integration for interactive React dashboards
  - KPI cards (baseline, best, improvement, duration, iteration)
  - LineChart (score vs iteration per metric)
  - DataTable (full iteration history)
  - Markdown (prompt diffs via difflib)
  - Falls back to inline SVG/HTML if HolySheet not installed
  Synthesizer (synthesizer.py):
  - generate_from_docs(): DeepEval Synthesizer.generate_goldens_from_docs()
  - red_team(): DeepEval RedTeamer with 40+ vulnerability scans
  - to_deepeval_dataset() / from_deepeval_dataset(): bidirectional bridge
  - Convenience: generate_from_docs(), red_team() at module level
  Dependencies:
  - optimize extra: deepeval>=2.0, holysheet>=0.1
  Tests: 139/139 passing | Docs: builds clean
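The report notes for fc2dc44 mention prompt diffs produced with difflib. As a rough illustration, and not the code in report.py, a unified diff between two prompt versions can be built as follows; `prompt_diff` and the file labels are hypothetical.

```python
import difflib


def prompt_diff(old: str, new: str) -> str:
    """Return a unified diff between two prompt versions as one string."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="prompt@iteration-0",  # illustrative labels only
        tofile="prompt@iteration-1",
        lineterm="",                    # inputs carry no trailing newlines
    )
    return "\n".join(lines)


before = "You are a helpful assistant.\nAnswer briefly."
after = "You are a helpful assistant.\nAnswer briefly and cite sources."
diff_text = prompt_diff(before, after)
```

The resulting text can be placed in a fenced block of a Markdown report so that removed and added prompt lines are easy to scan.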
- Make skip_paths customizable in AuthMiddleware (acc6de2)
- Optimization endpoints, feedback collection, OpenTelemetry (3dcfc8c)
  API Endpoints:
  - POST /{agent}/optimize/invoke: full pipeline context (retrieval_context, tool_calls, reasoning, citations) for DeepEval metrics
  - POST /{agent}/feedback: async user feedback (thumbs, rating, correction)
  - GET /{agent}/feedback: list feedback entries
  - GET /{agent}/feedback/export: JSONL export for optimization datasets
  Feedback System (middleware/feedback.py):
  - FeedbackCollector: async collector with in-memory buffer + BaseStore backend
  - @collect_feedback decorator: auto-record agent I/O for dataset building
  - Module-level singleton via get_collector()/set_collector()
  - Corrections feed back as expected_answers for optimization
  OpenTelemetry (observability/telemetry.py):
  - setup_telemetry(app): auto-configures from env vars
  - Auto-instruments FastAPI + httpx
  - @Traced decorator: creates spans for any sync/async function
  - OTLP exporter (gRPC → HTTP fallback) or ConsoleSpanExporter
  - Graceful no-op when opentelemetry not installed
  Runner Context (optimize/runner.py):
  - RunResult now carries retrieval_context, tool_calls, reasoning, steps_taken
  - Auto-tries /optimize/invoke first, falls back to /invoke
  - submit_feedback() method for feeding optimization results back
  Platform (core/platform.py):
  - enable_feedback=True (default): auto-attaches feedback endpoints
  - enable_telemetry=True (default): auto-configures OTEL on build()
  Dependencies:
  - telemetry = [opentelemetry-api, opentelemetry-sdk, fastapi+httpx instrumentors]
  - all extra now includes telemetry
  Tests: 161/161 passing | Docs: builds clean
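A tracing decorator that covers both sync and async functions and degrades to a no-op when opentelemetry is absent, as listed for 3dcfc8c, could be sketched like this. The name `traced_sketch` and the helper `_span` are hypothetical and this is not the code in observability/telemetry.py; only `trace.get_tracer` and `start_as_current_span` come from the OpenTelemetry API.

```python
import asyncio
import functools
import inspect
from contextlib import nullcontext

try:  # opentelemetry is optional; fall back to a no-op when it is missing
    from opentelemetry import trace

    _tracer = trace.get_tracer(__name__)
except ImportError:
    _tracer = None


def _span(name: str):
    """Return a span context manager, or a do-nothing context without OTEL."""
    return _tracer.start_as_current_span(name) if _tracer else nullcontext()


def traced_sketch(func):
    """Wrap a sync or async function in a span named after the function."""
    if inspect.iscoroutinefunction(func):

        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            with _span(func.__qualname__):
                return await func(*args, **kwargs)

        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        with _span(func.__qualname__):
            return func(*args, **kwargs)

    return sync_wrapper


@traced_sketch
def add(a: int, b: int) -> int:
    return a + b


@traced_sketch
async def double(x: int) -> int:
    await asyncio.sleep(0)
    return x * 2


sync_result = add(2, 3)
async_result = asyncio.run(double(21))
```

Checking `inspect.iscoroutinefunction` at decoration time keeps the wrapper's type aligned with the wrapped function, so callers can still `await` an async function after decorating it.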
- Production-ready pluggable architecture (f096a9d)
  Storage:
  - BaseStore ABC defining the universal storage protocol
  - MemoryStore and SQLAlchemyStore both inherit BaseStore
  - Users can implement RedisStore, MongoStore, etc. by subclassing
  - Platform accepts store= param with auto-init/close lifecycle
  Middleware pipeline:
  - AuthMiddleware: API key auth via header/query param
  - RateLimitMiddleware: sliding-window per-IP rate limiter
  - MetricsMiddleware: Prometheus counters + histograms
  - All toggleable via platform constructor params
  - Custom middleware via middleware=[] param
  Platform integration:
  - Storage wired into router_factory (threads, messages, feedback)
  - Health endpoint includes storage health check
  - /api/v1/storage/stats and /api/v1/feedback endpoints
  - Lifecycle hooks properly init/close storage
  Tests:
  - 63/63 passing (21 unit + 42 integration)
  - Coverage: invoke, chat, stream, A2A, threads, lifecycle, middleware, storage, feedback, programmatic reg, custom routers
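The sliding-window per-IP limiting mentioned for RateLimitMiddleware can be pictured with a small standalone class. This is a sketch of the general technique only: `SlidingWindowLimiter` and its parameters are hypothetical, and the real middleware may differ in how it stores timestamps and reads the clock. Time is passed in explicitly here to keep the example deterministic.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str, now: float) -> bool:
        hits = self._hits[client_ip]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
decisions = [
    limiter.allow("10.0.0.1", now=0.0),   # first request in the window
    limiter.allow("10.0.0.1", now=1.0),   # second request, still within the limit
    limiter.allow("10.0.0.1", now=2.0),   # third request inside 10 s is rejected
    limiter.allow("10.0.0.2", now=2.0),   # a different IP has its own window
    limiter.allow("10.0.0.1", now=10.5),  # the hit at t=0.0 has expired
]
```

In a middleware, `now` would typically come from `time.monotonic()` and a rejected request would map to an HTTP 429 response.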
- Prompt optimization engine with DSPy-inspired strategies (c12f00e)
  Optimization Module (agentomatic.optimize):
  - PromptOptimizer: like model.fit() for prompts
  - Separate rewrite_llm and eval_llm for full model control
  - 7 optimization strategies, including iterative_rewrite, few_shot, chain_of_thought, mipro, bootstrap_random_search, and ensemble
  - Rich progress display with per-iteration metrics table
  - Auto HTML reports with SVG charts + prompt diffs
  - Experiment tracking (JSON logs)
  - Prompt versioning (branching in prompts.json)
  - Early stopping with patience + target score
  - Cross-prompt A/B comparison
  Data Synthesis (agentomatic.optimize.synthesizer):
  - DataSynthesizer: generate eval datasets from descriptions
  - 5 augmentation strategies (paraphrase, perturbation, expansion, adversarial, formality_shift)
  - generate_dataset() + augment_dataset() convenience functions
  - Generate from system prompts or agent descriptions
  Metrics (agentomatic.optimize.metrics):
  - Built-in: ExactMatchMetric, ContainsMetric (no LLM needed)
  - DeepEval: answer_relevancy, faithfulness, hallucination, etc.
  - LLMJudgeMetric: custom criteria with LLM-as-judge
  - CustomMetric: wrap any sync/async callable
  CLI: agentomatic optimize --dataset qa.jsonl
  Docs: guide/optimization.md with full usage reference
  Tests: 51 new tests (139/139 total passing)
  Install: pip install agentomatic[optimize]
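The LLM-free built-in metrics and the idea of wrapping any sync or async callable can be pictured with the sketch below. The class names carry a `Sketch` suffix to mark them as hypothetical; the `score(expected, actual)` signature is an assumption and may not match the package's BaseMetric interface.

```python
import asyncio
import inspect


class ExactMatchSketch:
    name = "exact_match"

    async def score(self, expected: str, actual: str) -> float:
        return 1.0 if expected.strip() == actual.strip() else 0.0


class ContainsSketch:
    name = "contains"

    async def score(self, expected: str, actual: str) -> float:
        return 1.0 if expected.lower() in actual.lower() else 0.0


class CustomSketch:
    """Wrap a plain or async scoring callable behind the same interface."""

    def __init__(self, name: str, fn) -> None:
        self.name = name
        self._fn = fn

    async def score(self, expected: str, actual: str) -> float:
        result = self._fn(expected, actual)
        if inspect.isawaitable(result):  # async callables return an awaitable
            result = await result
        return float(result)


def length_ratio(expected: str, actual: str) -> float:
    """Toy sync scorer: the shorter length divided by the longer one."""
    return min(len(expected), len(actual)) / max(len(expected), len(actual), 1)


async def evaluate(metrics, expected: str, actual: str) -> dict:
    return {m.name: await m.score(expected, actual) for m in metrics}


metrics = [ExactMatchSketch(), ContainsSketch(), CustomSketch("length_ratio", length_ratio)]
scores = asyncio.run(evaluate(metrics, "Paris", "The capital is Paris"))
```

Giving every metric an async `score` lets deterministic checks and LLM-as-judge calls share one evaluation loop.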
- Rich CLI/TUI + Chainlit debug UI + 5 scaffolding templates (642cdf7)
  CLI (agentomatic command):
  - init: Interactive agent scaffolding with 5 templates (basic, full, rag, chatbot, custom)
  - run: Start platform with --with-ui flag
  - list: Rich table of discovered agents
  - test: Interactive terminal testing against running agents
  - inspect: Show agent structure, manifest, config
  - doctor: Environment health check with dep status
  - ui: Launch Chainlit debug UI standalone
  Templates generate ALL agent files:
  - __init__.py, graph.py, nodes.py (always)
  - config.py, schemas.py, tools.py, api.py (full template)
  - prompts.json, langgraph.json, .env.example, README.md
  Chainlit Debug UI (pip install agentomatic[ui]):
  - Mounts at /chat inside the same FastAPI app
  - Agent selector, real-time streaming, tool call visualization
  - Chain-of-thought display, feedback collection
  - Dark theme with agentomatic branding
  Dependencies:
  - cli extra: rich>=13.0, questionary>=2.0
  - ui extra: chainlit>=2.0
  Tests: 88/88 passing (21 unit + 42 integration + 25 CLI)
- 🚀 agentomatic v0.1.0: zero-code multi-agent API framework (ba9d6ae)
  Complete rewrite of lm_agents_api into a reusable Python package:
  Core Framework:
  - AgentPlatform: one-liner app factory with auto-discovery
  - AgentRegistry: auto-discovers agents from folder structure
  - AgentManifest: frozen dataclass identity card per agent
  - BaseAgentState: LangGraph-compatible default state
  - RouterFactory: auto-generates 12+ REST endpoints per agent
  Features:
  - Auto-generated endpoints: invoke, stream, chat, health, config, prompts, A2A
  - Framework-agnostic: LangGraph, LangChain, or plain async functions
  - Circuit breakers & concurrency limiting
  - Prometheus metrics with graceful fallback
  - Pluggable storage (memory/SQLAlchemy)
  - Prompt versioning with JSON-based management
  - A2A protocol with auto-generated agent cards
  - Feature flags via env vars (streaming, auth, metrics, etc.)
  - CLI: init, run, list commands
  - LoggingMiddleware with X-Request-ID and timing
  Package:
  - hatchling build system
  - Optional extras: langgraph, ollama, openai, azure, vertex, metrics, db
  - PEP 561 py.typed marker
  - Comprehensive test suite (21 tests, all passing)
  - MIT license
  Example:
  - hello_agent: 3-line main.py demonstrating the framework
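The "frozen dataclass identity card" idea behind AgentManifest can be shown with a minimal stand-in. The fields below are assumptions chosen for illustration; `ManifestSketch` is hypothetical and the real AgentManifest may define different attributes.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ManifestSketch:
    # Hypothetical fields; only the frozen-dataclass pattern is taken from the notes.
    name: str
    description: str = ""
    version: str = "0.1.0"
    tags: tuple = ()


manifest = ManifestSketch(name="hello_agent", description="Greets the user")

try:
    manifest.name = "renamed"  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False

# Frozen dataclasses with default eq are hashable, so they can key a registry.
registry = {manifest: "agents/hello_agent"}
```

Immutability means an agent's identity cannot drift after registration, and hashability lets a manifest serve as a dictionary key or set member.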
- prompts: Export PromptManager from root and support prompts_file parameter (bb3cbf2)
Refactoring
- Cleaning the implementation (36f035b)
- Improving and testing (af37c0a)
- Improving dockers (5f65ae3)
- Migrate CLI from argparse to click, replace print with loguru (6b00b9e)
  - Replace argparse with click decorators for all 8 commands
  - All print() → loguru logger.info/success/warning/error
  - Add click>=8.1.0 to dependencies
  - Update entry point to cli click group
  - 161/161 tests pass
- Saving state (0e73eaf)
- Simplification (32e5987)
- Simplifying (fabdc64)
Testing
- Adding some first tests (ce21191)