Skip to content

v0.1.0

Choose a tag to compare

@github-actions github-actions released this 13 Jun 20:04
· 15 commits to main since this release

v0.1.0 (2026-06-13)

This release is published under the MIT License.

Bug Fixes

  • Add parallel-safe Annotated reducers to BaseAgentState (04ed6b8)

All state fields now use Annotated reducers for LangGraph parallel fan-out compatibility:

  • metadata: dict merge (last-writer-wins per key)
  • response/agent_type/routing_decision/error: last-writer-wins
  • suggestions/citations/steps_taken: list concatenation (operator.add)
  • messages: add_messages (existing)

This fixes 'Can receive only one value per step' errors when parallel graph branches write to the same state field.

  • ci: Fix codecov param on v7, add pr-title check filter, and import Callable (490583c)

  • ci: Fix semantic release token and upgrade deprecated Node.js 20 actions (7dfea0d)

  • release: Disable semantic-release build command and pypi upload (2b1c427)

Chores

  • Fix workflows, docker, setup-telemetry, and resolve all type/mypy errors for first release (54cd42d)

  • Run pre-commit quality gate autofix, add importlib.util, resolve mypy and formatting (1140c7b)

Documentation

  • Comprehensive v0.1.0 release documentation (06c34b1)

  • Create root CHANGELOG.md with detailed release notes

  • Update docs/changelog.md with full feature breakdown

  • Update docs/index.md with badges, expanded features, architecture diagram

  • Update docs/getting-started/installation.md with extras table, pip/uv/poetry tabs

  • Update docs/getting-started/quickstart.md with expected outputs, next steps

  • Update docs/cli/commands.md with click-style help for all 7 commands

  • Update docs/guide/optimization.md with all 7 strategies, GEval, per-agent pattern

  • Fix noqa WPS433 → F811 in optimize/report.py

  • Format cli/commands.py

  • mkdocs build --strict passes with 0 errors

  • Mkdocs Material + CI/CD + semantic release + pre-commit (2f672e8)

Documentation (MkDocs Material): - 16 documentation pages (getting-started, user guide, CLI, architecture) - mike versioning (dev/latest aliases) - Dark/light theme with deep purple accent - Mermaid diagrams, tabbed content, code copy

CI/CD (GitHub Actions): - ci.yml: lint (ruff), test (Python 3.11/3.12/3.13 matrix), typecheck (mypy) - release.yml: python-semantic-release + PyPI publish + GitHub Release - docs.yml: auto-deploy docs via mike on push/tag

Pre-commit: - trailing whitespace, end-of-file-fixer, check-yaml/toml/json - ruff lint + format - mypy type checking - conventional-pre-commit (commit message enforcement)

Author: UnicoLab - Updated pyproject.toml, LICENSE, README, all URLs - Added semantic-release config, coverage thresholds - Added docs and cli optional deps - Cleaned up 12 stale test files, removed conflicting pytest.ini - Enhanced Makefile with docs-serve, docs-build, docs-deploy, check-all

88/88 tests passing, docs build clean in strict mode

Features

  • Add Pixar logo, fix 130 lint errors, format all files (902072d)

  • Add assets/logo.png (Pixar-style 3D robot mascot)

  • Update README.md with centered logo

  • Fix 130 ruff lint errors (import sorting, unused imports, etc.)

  • Reformat 51 files with ruff format

  • All 161 tests pass, all linting clean

  • Verify CI workflows reference UnicoLab/agentomatic

  • Add PromptOptimizationLoop — local-first prompt optimizer (3b4143f)

Generic, framework-agnostic iterative prompt optimization engine:

  • PromptOptimizationLoop: evaluate → analyse failures → LLM rewrite → repeat

  • 4 rewrite strategies: iterative, adversarial, structured, minimal

  • Built-in scorers: keyword_overlap, contains_score

  • Pluggable scoring: sync or async (LLM-as-a-judge)

  • Pluggable rewrite LLM: any LangChain model or raw callable

  • Early stopping with patience

  • Rich HTML reports with SVG evolution charts

  • JSON experiment tracking for reproducibility

  • Zero project-specific deps — works with any agent

  • Complete framework with SQLAlchemy storage, full example, and integration tests (c358ed1)

New modules:

  • storage/models.py: SQLAlchemy ORM (ThreadModel, MessageModel, FeedbackModel)
  • storage/sqlalchemy.py: Async SQLAlchemyStore with connection pooling
  • examples/full_agent/: Full weather agent demonstrating ALL overwrite options
    (manifest, graph, nodes, config, schemas, tools, api, prompts, langgraph.json)

Fixes:

  • platform.py: Mount routers for programmatic agents at build-time
    (previously only mounted during lifespan, breaking TestClient)

Tests:

  • test_integration.py: 20 integration tests (platform + agent endpoints)

  • 41/41 tests passing (unit + integration)

  • Deep DeepEval + HolySheet integration (fc2dc44)

Metrics (metrics.py): - GEvalMetric — DeepEval GEval with custom criteria + eval_steps (chain-of-thought LLM-as-judge, Ollama fallback) - DeepEvalMetric — wrap ANY deepeval metric instance as BaseMetric - RedTeamMetric — adversarial safety scoring (bias+toxicity) - resolve_metrics() now supports 'geval:criteria' shorthand syntax - All DeepEval imports gracefully degrade via try/except

Reports (report.py): - HolySheet integration for interactive React dashboards - KPI cards (baseline, best, improvement, duration, iteration) - LineChart (score vs iteration per metric) - DataTable (full iteration history) - Markdown (prompt diffs via difflib) - Falls back to inline SVG/HTML if HolySheet not installed

Synthesizer (synthesizer.py): - generate_from_docs() — DeepEval Synthesizer.generate_goldens_from_docs() - red_team() — DeepEval RedTeamer with 40+ vulnerability scans - to_deepeval_dataset() / from_deepeval_dataset() — bidirectional bridge - Convenience: generate_from_docs(), red_team() at module level

Dependencies: - optimize extra: deepeval>=2.0, holysheet>=0.1

Tests: 139/139 passing | Docs: builds clean

  • Make skip_paths customizable in AuthMiddleware (acc6de2)

  • Optimization endpoints, feedback collection, OpenTelemetry (3dcfc8c)

API Endpoints: - POST /{agent}/optimize/invoke — full pipeline context (retrieval_context, tool_calls, reasoning, citations) for DeepEval metrics - POST /{agent}/feedback — async user feedback (thumbs, rating, correction) - GET /{agent}/feedback — list feedback entries - GET /{agent}/feedback/export — JSONL export for optimization datasets

Feedback System (middleware/feedback.py): - FeedbackCollector — async collector with in-memory buffer + BaseStore backend - @collect_feedback decorator — auto-record agent I/O for dataset building - Module-level singleton via get_collector()/set_collector() - Corrections feed back as expected_answers for optimization

OpenTelemetry (observability/telemetry.py): - setup_telemetry(app) — auto-configures from env vars - Auto-instruments FastAPI + httpx - @Traced decorator — creates spans for any sync/async function - OTLP exporter (gRPC → HTTP fallback) or ConsoleSpanExporter - Graceful no-op when opentelemetry not installed

Runner Context (optimize/runner.py): - RunResult now carries retrieval_context, tool_calls, reasoning, steps_taken - Auto-tries /optimize/invoke first, falls back to /invoke - submit_feedback() method for feeding optimization results back

Platform (core/platform.py): - enable_feedback=True (default) — auto-attaches feedback endpoints - enable_telemetry=True (default) — auto-configures OTEL on build()

Dependencies: - telemetry = [opentelemetry-api, opentelemetry-sdk, fastapi+httpx instrumentors] - all extra now includes telemetry

Tests: 161/161 passing | Docs: builds clean

  • Production-ready pluggable architecture (f096a9d)

Storage: - BaseStore ABC defining the universal storage protocol - MemoryStore and SQLAlchemyStore both inherit BaseStore - Users can implement RedisStore, MongoStore, etc. by subclassing - Platform accepts store= param with auto-init/close lifecycle

Middleware pipeline: - AuthMiddleware: API key auth via header/query param - RateLimitMiddleware: sliding-window per-IP rate limiter - MetricsMiddleware: Prometheus counters + histograms - All toggleable via platform constructor params - Custom middleware via middleware=[] param

Platform integration: - Storage wired into router_factory (threads, messages, feedback) - Health endpoint includes storage health check - /api/v1/storage/stats and /api/v1/feedback endpoints - Lifecycle hooks properly init/close storage

Tests: - 63/63 passing (21 unit + 42 integration) - Coverage: invoke, chat, stream, A2A, threads, lifecycle, middleware, storage, feedback, programmatic reg, custom routers

  • Prompt optimization engine with DSPy-inspired strategies (c12f00e)

Optimization Module (agentomatic.optimize): - PromptOptimizer — like model.fit() for prompts - Separate rewrite_llm and eval_llm for full model control - 7 optimization strategies (iterative_rewrite, few_shot, chain_of_thought, mipro, bootstrap_random_search, ensemble) - Rich progress display with per-iteration metrics table - Auto HTML reports with SVG charts + prompt diffs - Experiment tracking (JSON logs) - Prompt versioning (branching in prompts.json) - Early stopping with patience + target score - Cross-prompt A/B comparison

Data Synthesis (agentomatic.optimize.synthesizer): - DataSynthesizer — generate eval datasets from descriptions - 5 augmentation strategies (paraphrase, perturbation, expansion, adversarial, formality_shift) - generate_dataset() + augment_dataset() convenience functions - Generate from system prompts or agent descriptions

Metrics (agentomatic.optimize.metrics): - Built-in: ExactMatchMetric, ContainsMetric (no LLM needed) - DeepEval: answer_relevancy, faithfulness, hallucination, etc. - LLMJudgeMetric — custom criteria with LLM-as-judge - CustomMetric — wrap any sync/async callable

CLI: agentomatic optimize --dataset qa.jsonl

Docs: guide/optimization.md with full usage reference

Tests: 51 new tests (139/139 total passing)

Install: pip install agentomatic[optimize]

  • Rich CLI/TUI + Chainlit debug UI + 5 scaffolding templates (642cdf7)

CLI (agentomatic command): - init: Interactive agent scaffolding with 5 templates (basic, full, rag, chatbot, custom) - run: Start platform with --with-ui flag - list: Rich table of discovered agents - test: Interactive terminal testing against running agents - inspect: Show agent structure, manifest, config - doctor: Environment health check with dep status - ui: Launch Chainlit debug UI standalone

Templates generate ALL agent files: - init.py, graph.py, nodes.py (always) - config.py, schemas.py, tools.py, api.py (full template) - prompts.json, langgraph.json, .env.example, README.md

Chainlit Debug UI (pip install agentomatic[ui]): - Mounts at /chat inside the same FastAPI app - Agent selector, real-time streaming, tool call visualization - Chain-of-thought display, feedback collection - Dark theme with agentomatic branding

Dependencies: - cli extra: rich>=13.0, questionary>=2.0 - ui extra: chainlit>=2.0

Tests: 88/88 passing (21 unit + 42 integration + 25 CLI)

  • 🚀 agentomatic v0.1.0 — zero-code multi-agent API framework (ba9d6ae)

Complete rewrite of lm_agents_api into a reusable Python package:

Core Framework: - AgentPlatform: one-liner app factory with auto-discovery - AgentRegistry: auto-discovers agents from folder structure - AgentManifest: frozen dataclass identity card per agent - BaseAgentState: LangGraph-compatible default state - RouterFactory: auto-generates 12+ REST endpoints per agent

Features: - Auto-generated endpoints: invoke, stream, chat, health, config, prompts, A2A - Framework-agnostic: LangGraph, LangChain, or plain async functions - Circuit breakers & concurrency limiting - Prometheus metrics with graceful fallback - Pluggable storage (memory/SQLAlchemy) - Prompt versioning with JSON-based management - A2A protocol with auto-generated agent cards - Feature flags via env vars (streaming, auth, metrics, etc.) - CLI: init, run, list commands - LoggingMiddleware with X-Request-ID and timing

Package: - hatchling build system - Optional extras: langgraph, ollama, openai, azure, vertex, metrics, db - PEP 561 py.typed marker - Comprehensive test suite (21 tests, all passing) - MIT license

Example: - hello_agent: 3-line main.py demonstrating the framework

  • prompts: Export PromptManager from root and support prompts_file parameter (bb3cbf2)

Refactoring

  • Cleaning the implementation (36f035b)

  • Improving and testing (af37c0a)

  • Improving dockers (5f65ae3)

  • Migrate CLI from argparse to click, replace print with loguru (6b00b9e)

  • Replace argparse with click decorators for all 8 commands

  • All print() → loguru logger.info/success/warning/error

  • Add click>=8.1.0 to dependencies

  • Update entry point to cli click group

  • 161/161 tests pass

  • Saving state (0e73eaf)

  • Simplification (32e5987)

  • Simplifying (fabdc64)

Testing

  • Adding some first tests (ce21191)