Document production autoresearch patterns for ZeroAPI policy tuning #8

Merged
dorukardahan merged 3 commits into dorukardahan:main from AytuncYildizli:docs/mahmory-autoresearch-usage
Apr 10, 2026

@AytuncYildizli
Contributor

Summary

  • add a detailed reference doc describing how Mahobrain uses autoresearch across Mahmory, skill routing, and tweet quality
  • explain which parts map well to ZeroAPI and which should stay out of scope
  • link the new doc from the main README

Why

ZeroAPI already has a clean runtime routing boundary. This doc gives a concrete, production-grounded pattern for how offline policy tuning can evolve threshold and provider-bias logic without coupling runtime routing to a live optimizer.
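One way to picture the separation described above is a minimal sketch in which an offline step decides whether a candidate policy replaces the baseline, and the runtime only reads the already-promoted values. Every name here (`RoutingPolicy`, `EvalResult`, `selectPolicy`, `shouldSwitch`, `threshold`, `providerBias`, `minGain`) is a hypothetical illustration, not an API from ZeroAPI or Mahobrain.

```typescript
// All names below are illustrative assumptions, not ZeroAPI or Mahobrain APIs.
interface RoutingPolicy {
  threshold: number; // minimum biased gain required before switching providers
  providerBias: Record<string, number>; // additive bias per provider
}

interface EvalResult {
  policy: RoutingPolicy;
  score: number; // higher is better, measured offline against logged traffic
}

/** Offline step: keep the baseline unless the candidate beats it by at least `minGain`. */
function selectPolicy(
  baseline: EvalResult,
  candidate: EvalResult,
  minGain: number,
): RoutingPolicy {
  return candidate.score - baseline.score >= minGain
    ? candidate.policy
    : baseline.policy;
}

/** Runtime step: a pure check against the promoted policy; no optimizer is called. */
function shouldSwitch(
  policy: RoutingPolicy,
  currentScore: number,
  candidateScore: number,
  candidateProvider: string,
): boolean {
  const bias = policy.providerBias[candidateProvider] ?? 0;
  return candidateScore + bias - currentScore >= policy.threshold;
}
```

In this sketch the promoted policy would be written to a file by the offline job and loaded at startup, so the routing path never waits on a live optimizer.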

Testing

  • reviewed README link locally
  • reviewed new markdown doc locally

Add a concrete reference for how Mahobrain runs offline autoresearch across Mahmory, skill routing, and tweet quality. The goal is to show where ZeroAPI can borrow the same discipline for threshold and provider-bias tuning without coupling runtime routing to background optimization.

Constraint: ZeroAPI should describe the pattern without taking a runtime dependency on Mahobrain internals
Rejected: Inline a short README note only | too little detail for future policy work
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep autoresearch offline and file-backed; do not make ZeroAPI routing depend on a live optimizer
Tested: README link added and new reference doc reviewed locally
Not-tested: No automated tests in docs-only change
@dorukardahan
Owner

Thanks for the reference. I'm taking the autoresearch concept and implementing it directly in the routing policy layer instead of keeping it as a standalone doc.

@dorukardahan
Owner

Hey, sorry for closing this too fast — I misread the intent. You're right that this pattern is directly relevant to ZeroAPI.

While reviewing your doc, I went ahead and made the plugin constants configurable so they can actually be tuned without code changes:

  • vision_keywords and risk_levels are now config-driven in zeroapi-config.json (previously hardcoded in TypeScript)
  • Added scripts/eval.ts — reads routing logs and reports category distribution, risk rate, provider diversity, and gives tuning suggestions
  • Added a "Policy Tuning" section to SKILL.md with the measure → experiment → promote workflow
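As a rough illustration of the metrics listed above (category distribution, risk rate, provider diversity), the sketch below summarizes a list of already-parsed routing log entries. The thread does not show the real log format or the internals of scripts/eval.ts, so the `RoutingLogEntry` shape, its field names, and the `summarize` function are assumptions made for this example.

```typescript
// Hypothetical log entry shape; the real routing log format is not shown in this thread.
interface RoutingLogEntry {
  category: string;
  provider: string;
  highRisk: boolean;
}

interface EvalSummary {
  categoryCounts: Record<string, number>; // how many requests fell into each category
  riskRate: number; // fraction of entries flagged high-risk
  providerDiversity: number; // number of distinct providers seen
}

/** Aggregate parsed log entries into the three headline metrics. */
function summarize(entries: RoutingLogEntry[]): EvalSummary {
  const categoryCounts: Record<string, number> = {};
  const providers = new Set<string>();
  let risky = 0;

  for (const entry of entries) {
    categoryCounts[entry.category] = (categoryCounts[entry.category] ?? 0) + 1;
    providers.add(entry.provider);
    if (entry.highRisk) risky += 1;
  }

  return {
    categoryCounts,
    // Guard against division by zero on an empty log.
    riskRate: entries.length === 0 ? 0 : risky / entries.length,
    providerDiversity: providers.size,
  };
}
```

A skewed category distribution or a provider diversity of 1 would be the kind of signal that suggests revisiting keyword lists before promoting a config change.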

Your doc would pair well with these changes. Could you trim it down to focus on the parts most relevant to ZeroAPI? Specifically:

  1. The skill-routing lane section is the closest sibling to what ZeroAPI does — keep that detailed
  2. The framework shape and guardrails sections are useful as-is
  3. The tweet-quality and Mahmory internals sections could be shortened or dropped since they're less applicable here
  4. The hardcoded date ("On April 10, 2026, the latest Mahmory run completed") should come out

Also if you want, you could reference the new scripts/eval.ts and the tunable config fields in your doc's "What ZeroAPI can borrow" section — since those now exist, it's no longer hypothetical.

Let me know if this works for you or if you'd rather take a different direction.

@dorukardahan dorukardahan reopened this Apr 10, 2026
@dorukardahan
Owner

Update: I also ran this through Codex CLI (codex review --uncommitted) for a second opinion. It caught several real bugs that I've now fixed on main:

Fixed (from Codex review):

  • P1: Regex crash on special chars — Category keywords like c++ would crash routing (Invalid regular expression). Now escaped with the same escapeRegex() used for high-risk keywords.
  • P1: Config validation: vision_keywords and risk_levels had no runtime validation. A malformed config like vision_keywords: "image" (string instead of array) would crash on .some(). Added type guards in isValidConfig().
  • P2: Eval no-override metric — The eval script lumped high-risk blocks, no-keyword-match, and no-switch-needed into one bucket. Now reports them separately so tuning suggestions are accurate.
  • P2: Reason field parsing — Changed from \S+ to .+$ so multi-word keywords aren't truncated in the eval report.
  • P3: Node formatting — Replaced %-15s %4d (not supported in Node) with proper padStart/padEnd.
  • Vision false positives: the keyword UI matched as a substring in "build a CLI tool" → switched to word-boundary regex matching.
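The regex-related fixes above can be sketched as follows. This is not the code on main: `escapeRegex` is named in the thread but its body here is an assumption, and `matchesKeyword` is a hypothetical helper. The sketch also swaps plain `\b` for lookarounds, because `\b` does not match after a trailing non-word character such as the `+` in `c++`.

```typescript
// Hypothetical helpers; bodies are assumptions, not ZeroAPI's actual code.

/** Escape regex metacharacters so a keyword such as "c++" is matched literally. */
function escapeRegex(keyword: string): string {
  return keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/** True when `keyword` appears in `prompt` as a whole token, case-insensitively. */
function matchesKeyword(prompt: string, keyword: string): boolean {
  // Lookarounds stand in for \b: the keyword must not be preceded or
  // followed by a letter, digit, or underscore.
  const pattern = new RegExp(
    `(?<![A-Za-z0-9_])${escapeRegex(keyword)}(?![A-Za-z0-9_])`,
    "i",
  );
  return pattern.test(prompt);
}
```

With this shape, `matchesKeyword("build a CLI tool", "UI")` is false because the `ui` inside `build` is preceded by a letter, while a keyword like `c++` no longer throws when the pattern is compiled.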

All 51 tests pass. The policy tuning infrastructure is ready — your doc would be a great companion to explain the pattern behind it.
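For readers who want a feel for the config validation fix mentioned above, here is a minimal type-guard sketch. Only the names `vision_keywords`, `risk_levels`, and `isValidConfig` come from the thread. The shape of `risk_levels` (a map from level name to keyword list) and the `TunableConfig` and `isStringArray` names are assumptions, and the real zeroapi-config.json may contain other fields.

```typescript
// Assumed config shape: only the two field names come from the thread.
interface TunableConfig {
  vision_keywords: string[];
  risk_levels: Record<string, string[]>; // assumption: level name -> keywords
}

/** Narrow an unknown value to an array whose items are all strings. */
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

/** Reject malformed configs before any routing code calls array methods on them. */
function isValidConfig(value: unknown): value is TunableConfig {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;

  if (!isStringArray(candidate.vision_keywords)) return false;

  const levels = candidate.risk_levels;
  if (typeof levels !== "object" || levels === null || Array.isArray(levels)) {
    return false;
  }
  return Object.values(levels).every((keywords) => isStringArray(keywords));
}
```

Under these assumptions, a config with `vision_keywords: "image"` is rejected up front instead of failing later on `.some()`.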

Shorten the reference doc by removing the tweet-quality deep dive and expanding the skill-routing explanation. Keep the document focused on the parts that map directly onto ZeroAPI policy tuning.

Constraint: The doc should stay useful to ZeroAPI maintainers without reading like a generic framework essay
Rejected: Keep all three targets equally detailed | routing relevance is uneven for this repo
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep future edits biased toward routing-policy lessons, not unrelated eval lanes
Tested: Markdown diff reviewed locally
Not-tested: No automated tests in docs-only change

Refocus the reference doc on routing autoresearch patterns and reduce unrelated Mahmory emphasis so the document stays aligned with ZeroAPI's scope.

Constraint: The doc should read as routing guidance, not a memory-system case study
Rejected: Keep the old title and framing | still over-signals unrelated internals
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep future doc edits scoped to routing-policy lessons that apply to ZeroAPI
Tested: Markdown diff reviewed locally
Not-tested: No automated tests in docs-only change
@dorukardahan dorukardahan merged commit bcd406c into dorukardahan:main Apr 10, 2026

2 participants