Mathodology · MathModelAgent-Pro

👉 Active work: open issues · project board · roadmap — every task is tracked as an issue and labeled by phase, so contributors always see the current plan, not yesterday's.

Production-grade AI agent system for mathematical modeling competitions (MCM / ICM / CUMCM / 华数杯).

One problem statement → a full submission-ready paper (PDF / DOCX / TeX / Markdown) in ≈ 30 minutes. Five agents (Analyzer → Searcher → Modeler → Coder → Writer) drive a live Jupyter kernel, stream tokens through a Rust gateway, and produce a 15–20 page paper with 5+ award-grade figures (sensitivity tornado, Monte-Carlo box-plot, 2-D heatmap, convergence curve, residual diagnostic) that follows COMAP Outstanding / CUMCM 一等奖 conventions.

Why Mathodology

Math modeling competitions reward a narrow, non-obvious skill stack: restate an ambiguous problem crisply, pick a model that actually fits, run the solve with the right diagnostics, and write it up in the style judges skim-read. The 72-hour contest format compresses this into one very stressful weekend.

Off-the-shelf LLM chat interfaces don't cut it — they hallucinate citations, forget they already picked a model two turns ago, and produce walls of text without figures. Mathodology is built specifically for this workflow:

First-principles model derivation, not boilerplate method lookups
Live Jupyter kernel execution with reproducible numerical results
Figures are first-class — every chart is saved, captioned, and linked by id
Award-mode prompts internalize COMAP official tips + CUMCM 评阅规范 + 2016 MCM B judges' commentary
Multi-format export — paper goes out as submission-ready PDF with real \begin{figure} blocks, not screenshots

Features

5-agent linear pipeline — Analyzer (scope + sub-questions + approaches) → Searcher (arXiv + web) → Modeler (ModelSpec) → Coder (Jupyter cells + figures) → Writer (structured paper draft)
3 competition paper templates — cumcm (中文 ctexart · xelatex · Fandol fonts) · huashu (华数杯) · mcm (MCM/ICM English article)
4 export formats — PDF via Tectonic · DOCX via Pandoc · raw LaTeX · Markdown
20 canonical chart types injected into the Coder prompt — tornado, heatmap, Monte-Carlo box-plot, Pareto front, convergence, residual, QQ, ROC, confusion matrix, network graph, radar, contour, 3D surface, etc.
MCP web search via open-webSearch — Bing · Baidu · DuckDuckGo · CSDN · Juejin · Brave · Exa · Startpage, no API keys needed
Robust LLM routing — OpenAI-compatible / Anthropic / Ollama / vLLM multi-protocol gateway with automatic fallback, per-call cost accounting, transport-error retry with exponential backoff
Streaming everything — tokens fan out via Redis Streams + WebSocket; frontend renders KaTeX math and shiki-highlighted code in real time
Deterministic figure pipeline — Writer uses [[FIG:<id>]] placeholders; pipeline substitutes them against the Coder's registered Figure list so figure references never break
197 tests — 45 gateway (Rust) + 197 worker (Python; catalog + MCP client + pipeline) + vue-tsc + vite build

Quick start

Pick the path that matches your platform — each guide covers prerequisites, three install options (portable archive / native installer / source build), and troubleshooting.

Platform	Guide	Native installer
macOS (Intel + Apple Silicon)	docs/install/macos.md	`.pkg`
Linux (Debian/Ubuntu/Fedora/Arch/Alpine; x86_64 + aarch64)	docs/install/linux.md	`.deb`
Windows 10/11 + Server 2022	docs/install/windows.md	`.msi`
Production / Docker / systemd	docs/install/server.md	Multi-arch GHCR images

Three-line version, if you already have a working dev environment:

git clone https://github.com/ymylive/mathodology.git && cd mathodology
cp .env.example .env                                     # edit; add at least one *_API_KEY
./scripts/install.sh && just bootstrap && just migrate && just dev

./scripts/install.sh (and its Windows twin scripts\install.ps1) auto-detects your package manager (brew / apt / dnf / pacman / zypper / apk / winget / scoop) and installs Postgres, Redis, Python 3.11, uv, pandoc, and tectonic. Run ./scripts/preflight.sh any time to verify everything is in place.

A typical CUMCM-style problem run takes about 28 minutes end-to-end and costs around ¥2 on a mid-tier reasoning model.

Pre-built releases

Every release ships portable archives (.tar.gz / .zip) for 5 targets — Linux x86_64 + aarch64, macOS x86_64 + aarch64, Windows x86_64 — plus native installers (.deb, .pkg, .msi) and multi-arch Docker images on GHCR. See the latest release and docs/install/server.md for production deployment.

How it works

 ┌─────────┐  ProblemInput   ┌──────────┐   AgentEvent stream   ┌─────────┐
 │ Web UI  │ ──────────────▶ │ Gateway  │ ◀──────────────────── │ Worker  │
 │  (Vue)  │                 │  (Rust)  │ XADD mm:events:<run>  │(Python) │
 │         │ ◀── WS replay ─ │          │                       │         │
 └─────────┘                 └──────────┘                       └─────────┘
                                  │                                  │
                                  │ XADD mm:jobs                     │ spawns
                                  ▼                                  ▼
                            ┌──────────┐                   ┌──────────────┐
                            │  Redis   │                   │  Jupyter     │
                            │  Streams │                   │  kernel      │
                            └──────────┘                   └──────────────┘

Pipeline event sequence (one run)

stage.start(analyzer) → token*N → cost → agent.output(AnalyzerOutput)  → stage.done
stage.start(searcher) → log(queries) → log(papers) → agent.output(SearchFindings) → stage.done
stage.start(modeler)  → token*N → cost → agent.output(ModelSpec)       → stage.done
stage.start(coder)    → token*N → cost → log("executing cell N")*7
                                      → kernel.stdout*M
                                      → agent.output(CoderOutput)      → stage.done
stage.start(writer)   → token*N → cost → agent.output(PaperDraft)      → stage.done
done(status=success, notebook_path, paper_path, meta_path)

Agent responsibilities

Agent	Job	Output contract
Analyzer	Restate the problem, enumerate sub-questions, list assumptions, propose 2-6 candidate approaches, identify data requirements	`AnalyzerOutput`
Searcher	Derive 4-5 focused queries, hit arXiv + (via MCP) Bing/Baidu/DuckDuckGo/CSDN/Juejin in parallel, dedupe by URL/DOI, LLM-synthesize key findings	`SearchFindings` (papers, key_findings, datasets_mentioned)
Modeler	Consult HMML library of canonical methods, produce ONE fully-specified modeling approach with first-principles-derived equations, variables with units, numbered algorithm outline, validation strategy with dimensional / boundary / baseline checks, sensitivity plan for ≥ 3 key parameters	`ModelSpec`
Coder	Iterate up to 7 turns in a persistent Jupyter kernel; each turn picks a chart type from the catalog, runs one focused cell, saves PNG+SVG figures, registers them in `figures_saved`	`CoderOutput` (cells, figures, notebook_path, summary)
Writer	Produce a 12-section paper — abstract with ≥ 2 numerical results, Problem Analysis, Assumptions with Justification+Impact, Symbol table, Model, Algorithm, Results, Sensitivity Analysis, Strengths & Weaknesses, Conclusion, References ≥ 15 — embedding figures via `[[FIG:<id>]]` placeholders	`PaperDraft`

Paper export — 4 formats × 3 templates

The gateway exposes GET /runs/:run_id/export/:format?template=<t> which produces:

Format	Pipeline	Typical size
`pdf`	Tera template → LaTeX → Tectonic (xelatex)	900 KB – 1.5 MB
`docx`	`paper.md` → Pandoc → .docx with embedded PNGs	600-800 KB
`tex`	Rendered Tera template	30-40 KB
`md`	Pre-substituted `paper.md`	25-35 KB

Templates (crates/gateway/templates/):

Template	Class	Cover	CJK
`cumcm`	`ctexart`	国赛摘要页 + 目录	Fandol Song/Hei bundled in Tectonic
`huashu`	`ctexart`	华数杯封面 + 页眉	Fandol
`mcm`	`article`	Summary sheet with Team Control Number / Problem / Year	`ctex` fallback for accidental Chinese

Frontend side, the ExportPanel component shows a template selector + 5 format buttons (PDF / DOCX / LaTeX / Markdown / Notebook) with error mapping:

404 — 论文还未生成 (run not finished)
503 — 服务器缺少 tectonic/pandoc (binary not on PATH)
500 — displays ≤ 4 KB of the compile stderr

Error codes stay in the response body as a clean JSON { code, error }. Dev token auth via Authorization: Bearer <DEV_AUTH_TOKEN> or ?token=....

Chart catalog — 20 canonical types

apps/agent-worker/src/agent_worker/chart_catalog.py ships 20 vetted chart types the Coder picks from. Every entry has: id (slug), display name (中英双语), when-to-use, when-NOT-to-use, 2-3 typical pitfalls, and a runnable matplotlib template 17-32 lines long.

Grouped by purpose:

Purpose	Types
Distribution	`histogram_kde`, `boxplot_grouped`, `violinplot`
Correlation / sensitivity	`heatmap_correlation`, `heatmap_sensitivity`, `tornado_sensitivity`
Optimization landscape	`contour_2d`, `surface_3d`, `pareto_front`
Time series / trend	`line_plot`, `line_with_ci`, `convergence_curve`
Regression diagnostics	`scatter_regression`, `residual_plot`, `qq_plot`
Classification diagnostics	`roc_curve`, `confusion_matrix`
Category comparison	`bar_grouped_stacked`, `radar_chart`
Relational	`network_graph`

The Coder's prompt carries only a compact markdown index (~1 KB / 20 rows). Snippets stay in the module — the LLM doesn't read them; the humans + future maintainers do.

Helpers styled_figure(), save_figure(fig, fig_id, caption, width), and annotate_peak(ax, x, y, label) are inlined into the Jupyter kernel bootstrap so they're globally available without imports.

Award-mode prompts

Writer / Modeler / Coder prompts are rewritten around 19 actionable rules distilled from 16 authoritative sources — COMAP official MCM/ICM Procedures and Tips PDF, 2016 MCM Problem B judges' commentary, Pitt MCM Guide, CUMCM 赛区评阅工作规范, Tsinghua 清风数学建模 materials. See memory/project_award_mode.md for the full distillation.

Highlights:

Writer (hard rules)

Abstract ≤ 1 page, must include ≥ 2 concrete numerical results in the first paragraph (e.g., "prediction error 3.2%", "fuel saving 17%"). No vague "significant improvement".
Four-element abstract order: context → method → results (with units) → conclusion. Forbidden opening: restating the problem.
No school / student names / region anywhere. Signatures in memo problems: Sincerely, Team #<ID>.
Assumptions as numbered triples {Assumption, Justification, Impact} — every assumption tied to literature / data / domain reasoning.
References ≥ 15, each inline-cited ≥ 1x. English sources preferred for MCM; GB/T 7714 for CUMCM.
Mandatory sections: Sensitivity Analysis · Strengths & Weaknesses (≥ 3 strengths, ≥ 2 weaknesses — honest limitations, not "time constraints").
Pipeline self-check at end: abstract ≤ 1 page + ≥ 2 numbers · no names · every sub-question answered · ≥ 15 references · Sensitivity + S&W present · every [[FIG:xxx]] discussed in prose · no filler phrases.

Modeler

Must propose ≥ 2 candidate models with explicit selection criterion (AIC / BIC / CV error / robustness / interpretability for the decision-maker). No "effect was better".
Prefer first-principles derivation from the problem's physics / economics / business logic. Escalating complexity (LP → MILP → DL) requires justifying why the simpler method is insufficient (Occam's razor guardrail).
validation_strategy MUST include all three of: dimensional analysis, boundary / limit-case test, baseline comparison.
Sensitivity plan: ≥ 3 key parameters × ±10% / ±20% + Monte Carlo with N ≥ 1000 when parameters are highly uncertain. Tornado plot + heatmap figures requested explicitly.

Coder

Chart captions must be independently readable — include variable names AND a key number ("残差直方图（均值 0.01，标准差 0.08，N=200）").
Honor the Modeler's sensitivity plan: produce the requested tornado / heatmap / Monte-Carlo figures across iterations.
np.random.seed(...) fixed before any stochastic call; constants named at the top of the first cell. Every printed numerical result must be reproducible from the notebook alone.

Raise MAX_ITERATIONS to 7 lets the Coder spread analyses across turns — baseline, tornado, heatmap, Monte-Carlo, scan, convergence, diagnostic — instead of cramming all 5 figures into one cell.

Multi-engine web search (MCP)

Searcher fans out across sources in parallel:

Source	Transport	Typical hit rate	Auth
arXiv	HTTP Atom API	High for methods / English papers	none
Bing / DuckDuckGo / Brave / Exa / Startpage	MCP stdio → open-webSearch	Medium	none (scraping)
Baidu / CSDN / Juejin	MCP stdio → open-webSearch	High for Chinese competition context	none

For CUMCM / 华数杯 (Chinese competitions), the Searcher auto-generates an extra Chinese methodology query to hit Baidu / CSDN / Juejin — which cover 国赛论文精选 blog posts and solution walkthroughs that arXiv misses entirely.

Graceful degradation paths:

OPEN_WEBSEARCH_DISABLED=1 → disable MCP, fall back to arXiv-only
MCP subprocess spawn fails (Node missing, binary not in PATH) → log warning, return empty web results, arXiv continues
Per-engine rate limit (captcha / 429) → that engine returns empty for the run, others continue
Per-query timeout (30s) → drop that query, keep others

Shared _stream_with_retry helper covers transport errors (httpx.RemoteProtocolError, ReadTimeout, ConnectError, PoolTimeout) AND silent empty 200 OK responses with exponential backoff 5s / 15s / 30s across 4 attempts. Observed on cornna/gpt-5.4 upstream; retry logic prevents the Writer / Coder from failing on random upstream hiccups.

Benchmarks

Representative run (bus-scheduling CUMCM problem, gpt-5.4 via cornna provider, reasoning=low):

Stage	Duration	Notes
Analyzer	132 s	12 sub-questions, 5 proposed approaches
Searcher	4 s	arXiv returned 0 (niche problem); MCP disabled in this run
Modeler	312 s	28 variables, 18 equations, 13-step algorithm, full sensitivity plan
Coder	758 s	7 cells, 5 figures (baseline / tornado / heatmap / MC box-plot / headway scan)
Writer	501 s	12 sections, 16 references, abstract with 4 numerical results
Total	28 min 30 s	¥2.22

Exported paper: 17 pages · 921 KB PDF · 5 figures embedded · 10 image streams. All section headings, equations (LaTeX), and references rendered in CUMCM template with Fandol CJK fonts via Tectonic.

Test suite baseline:

cargo test -p gateway — 45 passed, 3 #[ignore] (tectonic/pandoc real-compile, opt-in via -- --ignored)
uv run pytest apps/agent-worker — 197 passed (figure pipeline + chart catalog + MCP stdio client)
pnpm --filter web typecheck && build — vue-tsc clean, bundle 171 KB gzip (shiki ~282 KB gzip lazy-loaded)

Configuration

`.env` (examples of relevant keys)

# Gateway
GATEWAY_HOST=127.0.0.1
GATEWAY_PORT=8080
DEV_AUTH_TOKEN=dev-local-insecure-token

# Infra
REDIS_URL=redis://127.0.0.1:6379/0
DATABASE_URL=postgres://mm:mm@127.0.0.1:5432/mm
RUNS_DIR=/absolute/path/to/runs             # MUST be absolute

# LLM provider keys (fill in at least one)
DEEPSEEK_API_KEY=
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
MOONSHOT_API_KEY=
CORNNA_API_KEY=sk-...

# Web search (MCP)
OPEN_WEBSEARCH_CMD=open-websearch            # or absolute path
OPEN_WEBSEARCH_ENGINES=bing,duckduckgo,baidu,csdn,juejin
OPEN_WEBSEARCH_DISABLED=false                # set to true to skip web search entirely

`config/providers.toml` (LLM router)

[[providers]]
name = "deepseek"
kind = "openai_compat"
base_url = "https://api.deepseek.com/v1"
api_key_env = "DEEPSEEK_API_KEY"
models = ["deepseek-chat", "deepseek-reasoner"]
price_input_per_1m = 1.0                     # RMB per 1M tokens
price_output_per_1m = 2.0

[router]
default_model = "deepseek-chat"
fallback = ["deepseek-chat", "moonshot-v1-32k"]

Per-run model override via ProblemInput.model_override — e.g., force claude-sonnet-4-6 for one particularly hard problem while the rest default to deepseek-chat.

Architecture

Repo layout

math_agent/
├── apps/
│   ├── agent-worker/              # Python worker
│   │   └── src/agent_worker/
│   │       ├── agents/            # analyzer / searcher / modeler / coder / writer
│   │       ├── prompts/           # v1.toml per agent
│   │       ├── tools/             # arxiv client + web_search_mcp stdio client
│   │       ├── kernel/            # Jupyter session manager (inline chart helpers)
│   │       ├── chart_catalog.py   # 20 canonical chart types
│   │       ├── _chart_helpers.py  # styled_figure / save_figure / annotate_peak
│   │       ├── pipeline.py        # 5-agent orchestration + paper.meta.json
│   │       └── main.py            # XREADGROUP consumer on mm:jobs
│   └── web/                       # Vue 3 SPA
│       └── src/
│           ├── views/             # Showcase / Dashboard / Workbench
│           ├── components/        # ExportPanel, PaperDraft, AgentOutputCard…
│           ├── stores/            # Pinia: run, settings
│           └── api/               # figures + export + runs clients
├── crates/
│   └── gateway/
│       ├── src/
│       │   ├── routes/            # runs · figures · export · llm · stats · ws_run
│       │   ├── llm/               # OpenAI-compat + Anthropic streaming adapters
│       │   ├── auth.rs            # dev-token middleware
│       │   ├── cost.rs            # per-call cost accounting
│       │   └── app.rs             # axum router wiring
│       ├── templates/             # cumcm / huashu / mcm .tex.tera
│       └── tests/                 # integration tests (including #[ignore] real-compile)
├── packages/
│   ├── contracts/                 # OpenAPI + event JSON schema (source of truth)
│   ├── py-contracts/              # pydantic v2 codegen target
│   └── ts-contracts/              # TypeScript codegen target
├── config/
│   └── providers.toml             # LLM router
├── docs/
│   ├── demo.gif                   # README hero
│   └── promo/                     # polished promo HTML + MP4 sources
├── scripts/                       # bootstrap + smoke_e2e.sh
├── justfile                       # task runner
└── Procfile.dev                   # overmind dev orchestration

Data contracts

Layer	Format	Generated from
Event envelope (WebSocket)	`AgentEvent { run_id, agent, kind, seq, ts, payload }`	`packages/contracts/events.schema.json`
LLM completion	OpenAI-compatible SSE	gateway translates from Anthropic / others
Agent I/O	Pydantic v2 (Python) + TS interface	`packages/py-contracts/` + `packages/ts-contracts/`
`paper.meta.json`	Structured paper + figure list with `[[FIG:]]` placeholders	Writer emits, gateway export consumes

Seq counter

One Redis INCR mm:seq:<run_id> per event, shared between gateway and worker. WS replay filters by payload.seq. XADD uses * (time-based id), seq is the authoritative monotonic per-run counter.

Developer setup

For end-user install (any of the 5 supported platforms / 4 install paths), see Quick start and the per-platform guides under docs/install/. This section covers source-build setup for contributors.

git clone https://github.com/ymylive/mathodology.git && cd mathodology
./scripts/install.sh --with-source    # adds node + pnpm on top of runtime tools
cp .env.example .env                  # at least one *_API_KEY
just bootstrap                        # cargo fetch + uv sync + pnpm install
just migrate                          # sqlx migrations
just dev                              # overmind: gateway + worker + vite dev :5173

scripts/install.sh auto-detects brew / apt / dnf / pacman / zypper / apk; on Windows use scripts\install.ps1 (winget / scoop). Both are idempotent and prompt before any sudo action — pass --yes / -Yes for CI. scripts/preflight.{sh,ps1} is the verifier: [OK] / [MISS] / [STALE] per tool with install hints inline.

Notes that bite if skipped:

Tectonic downloads a ~200 MB TeXLive bundle into ~/Library/Caches/Tectonic (macOS) or ~/.cache/Tectonic (Linux) on first PDF render. CI must cache this directory.
Pandoc + Tectonic are runtime dependencies, called as subprocesses from the gateway during paper export. The .deb declares them; the portable archive doesn't.
just dev uses overmind (Linux/macOS) which depends on tmux. On Windows, run the three processes in separate terminals — see docs/install/windows.md §4.

For docker-compose Redis + Postgres instead of system services: just infra-up. Tectonic and Pandoc are still required on the host because they run inside the gateway/worker process, not the infra containers.

Testing

just lint        # cargo clippy + ruff + vue-tsc
just test        # cargo test + pytest + vitest
just smoke       # end-to-end ping through the live stack

# Subset runs:
cargo test -p gateway                                    # 45 passed + 3 #[ignore]
cargo test -p gateway --test export_paper -- --ignored   # real tectonic + pandoc compile
uv run pytest apps/agent-worker                          # 197 passed
uv run pytest apps/agent-worker -m mcp                   # real open-websearch subprocess (network)
pnpm --filter web typecheck && pnpm --filter web build

CI runs the equivalent on every PR — see .github/workflows/ci.yml.

Roadmap & milestones

Phase	Status	Issues	Summary
Phase 1 — MVP	✅ Shipped (M1–M8)	predates tracker	4-agent linear pipeline, gateway, streaming UI, local Jupyter
Phase 2 — Knowledge base	✅ Shipped (M9–M11)	#1 · #2 · #3	HMML method library + BM25 retrieval · Searcher agent · hybrid search
Phase 3 — Award-grade output	✅ Shipped (v0.3.0)	#4 · #5	Award-mode prompts · 20-type chart catalog · MAX_ITER=7 · transport retry
Phase 3.5 — Export + MCP	✅ Shipped (v0.3.0)	#6 · #7 · #8	4 formats × 3 templates · Tectonic + Pandoc · open-webSearch MCP
Phase 4 — Critic loop	🟡 In branch	#9	Multi-role Critic · score/checklist thresholds · 2-round self-refine · Coder rerun cap
Phase 5 — Productionization	📋 Planned	#10 · #11 · #12	E2B / Daytona cloud sandbox · multi-tenant JWT · usage metering
Phase 6 — Vision + multi-lang	📋 Planned	#13 · #14 · #15	GPT-4V for chart QA · R + MATLAB kernels · HITL review gates

Detailed milestone log

Milestone	Commit	Summary
M1	`97c1609`	Monorepo skeleton, hello-world ping
M2	`926fc5f`	Gateway + Postgres persistence + run lifecycle
M3	`f7aca8f`	Full LLM gateway + streaming UI
M4	`8151668`	`BaseAgent` + Analyzer calling real LLM gateway
M5	`98d3e1e`	Jupyter kernel + CoderAgent + figure artifact serving
M6	`c578b66`	Modeler + Writer → full 4-agent pipeline
M7	`91833ba`	shadcn-vue + KaTeX + shiki UI polish
M8	`3d21ad0`	Anthropic adapter + provider fallback + CI
M9	`7ec8e50`	HMML knowledge base + Modeler integration
M10	`4ebf7b4`	Searcher agent + 5-agent pipeline
M11	`b7e8b08`	Hybrid BM25 + vector retrieval via fastembed
v0.2.0	`10a06c6`	Editorial UI rebuild + reasoning effort + long context
v0.3.0	`b18df8d`	Competition paper export + award-mode pipeline + MCP search

Known limitations

LLM API keys are loaded from env. Put at least one of DeepSeek / Anthropic / OpenAI / Moonshot / cornna keys in .env (see .env.example). Absent keys, pipelines fail at the first real LLM call.
RUNS_DIR must be an absolute path when the worker and gateway are launched from different working directories, otherwise figure URLs (/runs/:id/figures/...) won't resolve. The default ./runs only works when both processes share cwd — which just dev (overmind) guarantees.
Tectonic + Pandoc required for PDF/DOCX export. PDF/DOCX export returns HTTP 503 with a clear message if either binary is missing. TeX and Markdown export work without them.
open-webSearch is a scraper, not an API. Rate limiting and captcha from Bing/Baidu are regular. Per-engine failures are isolated — others keep working. Set OPEN_WEBSEARCH_DISABLED=1 to skip it entirely.
Rust 1.83 is pinned via rust-toolchain.toml. Several crates in Cargo.lock are held to 1.83-compatible versions; bumping the toolchain is an explicit future task.
Contract codegen (just gen) requires network. datamodel-code-generator is not in uv.lock; it is installed ad-hoc. The contracts-drift CI job is non-blocking.
Upstream LLM flakiness. Some OpenAI-compat proxies occasionally return empty 200 OK responses mid-run. The _stream_with_retry helper (base.py) retries on this with 5s / 15s / 30s backoff, but a persistently unavailable provider will still cause the run to fail after 4 attempts.

Contributing

See CONTRIBUTING.md. Short version: feature branch → PR to main → CI green → squash merge. All new agents / prompts / chart types should have corresponding unit tests; real-external-service tests go behind #[ignore] (Rust) or @pytest.mark.slow|mcp (Python).

License

MIT (core). See LICENSE. Commercial-grade modules (specialized templates, enterprise HITL, multi-tenant billing) may be relicensed under a commercial license in Phase 5+.

Name		Name	Last commit message	Last commit date
Latest commit History 72 Commits
.cargo		.cargo
.github/workflows		.github/workflows
apps		apps
config		config
crates/gateway		crates/gateway
data/mcm		data/mcm
docs		docs
installer		installer
packages		packages
scripts		scripts
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
.nvmrc		.nvmrc
.python-version		.python-version
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile.gateway		Dockerfile.gateway
Dockerfile.worker		Dockerfile.worker
LICENSE		LICENSE
Procfile.dev		Procfile.dev
README.md		README.md
README_zh.md		README_zh.md
docker-compose.prod.yml		docker-compose.prod.yml
docker-compose.yml		docker-compose.yml
justfile		justfile
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
pyproject.toml		pyproject.toml
rust-toolchain.toml		rust-toolchain.toml
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

Mathodology · MathModelAgent-Pro

Table of contents

Why Mathodology

Features

Quick start

Pre-built releases

How it works

Pipeline event sequence (one run)

Agent responsibilities

Paper export — 4 formats × 3 templates

Chart catalog — 20 canonical types

Award-mode prompts

Writer (hard rules)

Modeler

Coder

Multi-engine web search (MCP)

Benchmarks

Configuration

.env (examples of relevant keys)

config/providers.toml (LLM router)

Architecture

Repo layout

Data contracts

Seq counter

Developer setup

Testing

Roadmap & milestones

Detailed milestone log

Known limitations

Contributing

License

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 10

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`.env` (examples of relevant keys)

`config/providers.toml` (LLM router)

Packages