Releases: SoundMindsAI/relyloop
RelyLoop v0.1.3 — MVP1 backlog fully drained
Docs-only milestone release. The MVP1 actionable backlog is now fully drained — the 01_mvp1/ planned-features bucket is empty.
What landed
- PR #310 — the two remaining deferred-by-design MVP1 folders were reclassified out of
01_mvp1/:chore_demo_reseed_stale_recovery_atomic_cas→99_backlog/(defense-in-depth; already Priority: Backlog) andinfra_agent_sibling_worktree_isolation→99_backlog/(phases 1+2 shipped; only phase3 remains, defer-until-incident). - PR #311 — refreshed the compressed-context docs for the post-MVP1 reality:
state.md: updated Last-5-merges, rewrote In-flight/Queued (next stop is the02_mvp2/bucket), and markedinfra_ci_smoke_makeup+chore_starlette_422_deprecationresolved.CLAUDE.md: fixed a stale Next.js 14 → 16 reference in the Frontend Conventions stack.
Notes
- No code or schema changes since v0.1.2. Alembic head remains
0020_studies_baseline_trial. - Next stop: MVP2 / v0.2 — "Three-Engine + Real Signals" (Apache Solr adapter + UBI judgments).
🤖 Generated with Claude Code
RelyLoop v0.1.1 — MVP1 alpha feature-complete
Patch release on top of v0.1.0. The MVP1 dashboard reads Path to MVP1: 0, 36/36 scoped done (100%) — the MVP1 alpha is feature-complete. The two remaining held items are correctly classified for MVP2 and visible on the new MVP2 dashboard.
This release lands the post-launch polish that accumulated since the v0.1.0 alpha cut: one new user-visible feature, two new backend surfaces, a wave of CI / tooling hardening, and three idea drops that drained the backlog to zero actionable items.
What's new since v0.1.0
Features
- feat_query_inline_crud (PR #101) — PATCH / DELETE / GET endpoints on
/api/v1/query-sets/{id}/queries+ inline editable table on/query-sets/[id]page. 12 stories, zero migrations. - feat_judgments_periodic_resume_sweep (PR #104) — in-worker
resume_stuck_judgment_listsArq cron that re-enqueues everyjudgment_lists.status='generating'row every 15 minutes via deterministic Arq_job_iddedup. Replaces the old boot-time-only sweep with continuous coverage. - chore_chat_last_message_preview (PR #117) — adds
last_message_preview(truncated to 120 chars) andlast_message_attoConversationSummary./chatlist page renders the preview under the title and shows last-touch time instead of created-at. LATERAL JOIN against the existingmessages_conversation_idx— no migration.
Tooling + infrastructure
- Per-release dashboards + top-level roadmap roll-up (PR #119) —
docs/00_overview/DASHBOARD.md+dashboard.htmlindex over the canonical release matrix (MVP1 → MVP2 → MVP3 → MVP4 → GA v1 → v2+). Per-release dashboards auto-discovered via_mvpNfolder suffix; bidirectional navigation between roadmap and per-release detail. - env-defense workflow + gitleaks (PR #94, PR #99) —
.env*filename guard CI workflow + content-scan step, surfaced after a local.envcorruption incident. make backend-*sub-targets (PR #110) —make backend-fmt/backend-lint/backend-typecheckfor Node-18 contributors who can't run the bundledmake fmt(which gates on pnpm's Node ≥20.18 engine).- structlog test helpers (PR #114) —
backend/tests/_log_helpers.pyfactorsassert_log_level,find_log_events, andRecordingLoggerafter a two-CI-run debugging arc on PR #112. - Dashboard regen idempotency + relative-link rewriting (PR #108) — pre-commit hook no longer churns on no-op writes; one-liner links extracted from idea files resolve correctly when embedded in the dashboard.
Bug fixes
- bug_query_inline_crud_since_filter_uuidv7_ms_collision (PR #106) — 10ms sleep in
_seed_sethelpers to avoid UUIDv7 ms-collision flake. - chore_digest_worker_narrow_except (PR #112) — narrowed
except Exceptionin the digest worker'soptuna.importance.get_param_importancescall so future dep-regressions like the PR #92 sklearnImportErrorsurface at ERROR level on day one instead of silently shipping empty importance maps for days.
Idea drops (won't-do)
- chore_cluster_run_query_history (PR #103) — superseded by
feat_chat_agent'srun_querytool. - chore_studies_ui_shadcn_polish (PR #116) —
ClusterFilterSelectprecedent established native<select>as the project's standard for page-level filter/control surfaces; F1 inconsistency claim retired. - chore_demo_recording_mvp3 (PR #119) — single-maintainer alpha base rates make a 4–6 hour record-edit-upload-embed task unlikely to execute;
tutorial-first-study.mdserves the demo's discovery role; any pre-MVP4 recording would need re-shooting once MVP4 auth UI lands.
Held for MVP2
Two ideas explicitly held for MVP2 on the new MVP2 dashboard:
bug_chat_long_conversation_truncation_mvp2— chat agent silently drops load-bearing context past 100-message cap. Latent bug; no operator has hit it. MVP2 timing aligns with Langfuse trace tooling for summarization prompt calibration.infra_arq_subprocess_test_mvp2— subprocess-driven Arq worker test for narrow Arq-version-regression guard. Trigger-locked at three conditions (arq pin bump, 3rd cron, MVP3 hardening opt-in).
Stack
Unchanged from v0.1.0:
- Python 3.13 + FastAPI · Next.js 16 (React 19, TypeScript App Router, Turbopack) · Tailwind 4 · Vitest 4
- Postgres 16 + SQLAlchemy 2.0 async + Alembic
- Redis 7 + Arq workers
- Optuna with TPE sampler + RDBStorage · pytrec_eval
openaiSDK pointed at any OpenAI-compatible endpoint viaOPENAI_BASE_URL- ElasticAdapter handling both ES 8.11+/9.x and OpenSearch 2.x/3.x
- Single-tenant, no auth, Docker Compose-only deployment
Try it
git clone https://github.com/SoundMindsAI/relyloop
cd relyloop
git checkout v0.1.1
make up
# Then follow docs/08_guides/tutorial-first-study.mdLicense: Apache 2.0. Status: alpha. Multi-tenant + SSO arrive at MVP4 / v0.4.
RelyLoop v0.1.0 — MVP1 alpha
RelyLoop v0.1.0 — MVP1 alpha
What's in MVP1
The full Karpathy loop end-to-end on Elasticsearch and OpenSearch, single-tenant, no auth, Docker Compose:
- Engine adapter — one
SearchAdapterProtocol covering both ES 8.11+/9.x and OpenSearch 2.x/3.x. Cluster registration via UI or API. - Optuna optimizer — TPE sampler against a parametrized query template; up to N trials per study; per-trial budget guard;
pytrec_evalmetrics (ndcg@k,map,precision,recall,mrr,err). - LLM-as-judge —
POST /api/v1/judgments/generaterates query-document pairs against a rubric. ~$0.01–$0.05 per query set withgpt-4o-mini. Provider-agnostic: works against any OpenAI-compatible endpoint (Ollama / LM Studio / vLLM / TGI) viaOPENAI_BASE_URL. - Digest — LLM-generated narrative summary of each completed study, plus parameter-importance chart and recommended config.
- GitHub PR worker — winning configs land as Pull Requests against a central search-config Git repo. Operator's CI deploys.
- Chat agent — describe the problem in chat; the agent introspects the cluster, proposes a search-space, and queues the study after operator confirmation. 19-tool surface.
- Operator tutorial + sample data — 1,000 curated Amazon ESCI products + 48 queries + canonical Jinja2 query template.
docs/08_guides/tutorial-first-study.mdwalksgit clone → Open PRin under 30 minutes on a fresh laptop. - CI smoke gate — every PR runs the full Karpathy loop end-to-end against a fresh stack with a budgeted OpenAI key. Same operator path as the tutorial; no degraded variants.
Full feature list: see docs/02_product/mvp1-user-stories.md.
Audience
Technical evaluators, Relevance Engineers, and search-platform teams considering an open-source query-tuning tool. Not yet production-deployable — see docs/01_architecture/deployment.md for the MVP1 → MVP3 → GA v1 deployment maturity ramp.
How to install
Follow the tutorial: docs/08_guides/tutorial-first-study.md.
Operators build images locally via make up. Pre-built GHCR images ship at MVP3 per the canonical release matrix; until then, make up triggers a local Docker build of relyloop/api and relyloop/ui on first run.
Known limitations
This is alpha. Three operator-visible issues are tracked but not blocking:
- Long chat sessions silently drop context after 100 messages. The agent prompt-window cap is brute-force; smarter context management ships in MVP2. Tracked:
bug_chat_long_conversation_truncation. - Query templates created via the API with declared params can't be used for LLM judgment generation. Workaround: use one template with
declared_params={}for judgment generation, a separate template with declared params for the optimization study (this is what the tutorial does). Tracked:bug_judgment_template_default_params_contract. - Worker may need a manual restart after first-run
make migrate. If youmake upand immediately fire a study beforemake migratecompletes, the Arq worker dies on Optuna schema init and stays down. Workaround:docker compose restart workeraftermake migrate. Tracked:bug_worker_optuna_init_race.
How to provide feedback
- GitHub Discussions: https://github.com/SoundMindsAI/relyloop/discussions
- Issues (bug reports, feature requests): https://github.com/SoundMindsAI/relyloop/issues/new/choose
Roadmap
| Release | Theme | Adds |
|---|---|---|
| MVP1 / v0.1.0 (you are here) | "The Loop" | ES + OpenSearch adapter, OpenAI-compatible LLM, GitHub provider, single-tenant, no auth, Docker Compose, 80% coverage gate |
| MVP2 / v0.2 | "Observable" | Langfuse + ClickHouse + SigNoz; canonical event catalog; audit_log + immutability trigger; lineage columns; PII redaction; trace propagation |
| MVP3 / v0.3 | "Production Stacks" | Lucidworks Fusion adapter; multi-Git-provider abstraction (GitLab, Bitbucket); production install (TLS via Caddy + Let's Encrypt, managed Postgres/Redis); AWS managed OpenSearch |
| MVP4 / v0.4 | "Multi-tenant, Multi-LLM" | tenants + tenant_memberships + users + api_keys; tenant_id columns + backfill; SSO via reverse proxy; native non-OpenAI provider SDKs |
| GA v1 | "Production-ready" | LangGraph orchestrator + PostgresSaver; full RFC 7807 errors; Idempotency-Key; Helm chart; container scanning; image signing |
Canonical release matrix: docs/01_architecture/tech-stack.md.