Map a psychology or cognitive science study as a graph, hand each stage to a focused agent, and move from question to hypothesis with everything traceable on your machine.
Mimir is an active personal project for exploring a more structured way to run research with AI agents. If you try it, fork it, or have ideas, use GitHub Issues to share feedback.
Most AI research workflows still feel like isolated chat threads. I wanted to see the full study pipeline on one canvas, see how each step connects to the last, run one agent at a time with clear inputs and outputs, and keep important decisions reviewable without the whole thing turning into prompt spaghetti.
I also wanted the graph to be the memory. Each agent reads one or more artifacts, produces exactly one new artifact, and stops. No wandering agents, no hidden state in a chat window—just artifacts, parent links, and agent-run records I can inspect later.
The target domain is psychology and cognitive science: questions, literature, theory, hypotheses, confounds, methods, experiments, data, analysis, findings, and reports. This repo is the first slice of that vision—a local-first desktop app that proves the pattern on a small pipeline before expanding to the full chain.
Research OS is a thinking and documentation OS for structured empirical research—not a chatbot that replaces your judgment. Each stage becomes a durable artifact you can edit, challenge, and cite later.
Typical users
- Grad students and postdocs shaping a thesis study or a first preregistered project
- PIs and lab leads who want a reviewable trail of how a study was reasoned through—not just a pile of ChatGPT threads
- Solo researchers who want AI help without losing control: one step at a time, on their machine
What it is not
- A replacement for scientific judgment, ethics review, or running participants
- A single endless conversation where context lives only in the chat window
- Fully automated “AI runs my study”—you trigger each agent when the previous artifact is ready
How you use it (vision)
- Create a project and refine the root Question.
- Run agents stage by stage (literature → hypothesis → skeptic → methods → …).
- Edit every artifact in the inspector before advancing.
- End with a Report linked to the whole graph, plus AgentRun records that show what each step read and wrote.
Research OS lets me run a research project more like a structured lab notebook than a single long conversation. I create a project, get a root Question artifact, edit content in an inspector, and advance the study by running agents from the graph. Each run is stored as an event linked to its inputs and outputs, so I can always answer: what did this agent read, what did it write, and when?
- Create and delete local research projects.
- View the study as a stage-ordered graph (Question → Literature → Hypothesis, expanding later).
- Select artifacts from the graph or an artifact list and edit Markdown content with version tracking.
- Run the full research pipeline from the workspace (stub agents by default; Literature and Hypothesis can use an LLM when configured).
- Keep all data on disk in SQLite—no cloud required for the MVP.
- Artifact model — every node has id, type, title, content, version, timestamps, and parent links
- Research graph — React Flow canvas with stage-based layout and lineage edges
- Artifact inspector — read and edit content; version increments on save
- Bounded agents — Literature and Hypothesis agents read parents, write one child, record an `AgentRun` trace
- OpenAI-compatible LLM — Groq, OpenAI, LM Studio, or other compatible APIs via `.env`
- Offline stubs — full UI flow works without an API key
- Local-first — single-user SQLite database in the app data directory
- Project management — create, open, and delete projects from the home screen
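
The artifact model in the list above can be pictured as a small TypeScript type. This is only a sketch: the field names (`parentIds`, `createdAt`, `updatedAt`), the starting version of 1, and the `makeArtifact` helper are assumptions for illustration, not the app's actual schema.

```typescript
// Illustrative sketch only: field names and the helper are assumptions, not the app's schema.
type ArtifactType =
  | "Question" | "LiteratureReview" | "Hypothesis" | "ConfoundAnalysis"
  | "ExperimentDesign" | "ExperimentRun" | "Dataset" | "Analysis"
  | "Finding" | "Report" | "AgentRun";

interface Artifact {
  id: string;
  type: ArtifactType;
  title: string;
  content: string;     // Markdown body edited in the inspector
  version: number;     // assumed to start at 1; bumped on every save
  createdAt: string;   // ISO 8601 timestamps
  updatedAt: string;
  parentIds: string[]; // lineage edges back to the artifacts this one was derived from
}

let artifactCounter = 0;

// Create a version-1 artifact linked to zero or more parents.
function makeArtifact(
  type: ArtifactType,
  title: string,
  content: string,
  parentIds: string[] = [],
): Artifact {
  const now = new Date().toISOString();
  artifactCounter += 1;
  return {
    id: `artifact-${artifactCounter}`,
    type, title, content,
    version: 1,
    createdAt: now,
    updatedAt: now,
    parentIds,
  };
}

const rootQuestion = makeArtifact("Question", "Root question", "Does sleep loss impair working memory?");
const literatureReview = makeArtifact("LiteratureReview", "Literature review", "Stub synthesis.", [rootQuestion.id]);
```

Keeping parent links on the artifact itself is one possible design; the app stores them in a separate links table, which serves the same purpose.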
Implemented
- Tauri desktop shell (macOS dev path verified)
- SQLite schema: projects, artifacts, artifact links, agent runs
- Project list with create / delete
- Workspace: graph, artifact chips, inspector, agent action buttons
- Full pipeline (stub agents + manual steps): Question → Literature → Hypothesis → Skeptic → Methods → Experiment → Dataset (manual) → Stats → Finding (manual) → Writing → Report
- LLM optional for Literature and Hypothesis only (stub fallback; rest of pipeline needs no API key)
Planned
- Theory stage and agent
- Real literature search (e.g. Semantic Scholar / OpenAlex) before synthesis
- Re-run agents and branching graphs (multiple hypotheses per review)
The repo is evolving quickly; the core pattern—graph as memory, one artifact per agent step—is stable, but agent coverage and verification are still early.
Two views: (1) the research agent workflow—what the graph in the app represents—and (2) the desktop app stack—how the software is built.
Each artifact is a node. Each agent reads one or more parent artifacts, writes exactly one new artifact, records an AgentRun trace, and stops. You trigger agents manually from the UI (no background wandering).
Solid arrows = implemented (stub content unless noted). Dashed = planned (Theory optional branch).
```mermaid
flowchart TB
    Q["Question<br/><i>manual · bootstrap</i>"]
    Q -->|"Literature Agent ✅"| LR["LiteratureReview"]
    LR -->|"Hypothesis Agent ✅"| H["Hypothesis"]
    H -->|"Skeptic Agent ✅"| CA["ConfoundAnalysis"]
    CA -->|"Methods Agent ✅"| ED["ExperimentDesign"]
    ED -->|"Experiment Agent ✅"| ER["ExperimentRun"]
    ER -->|"manual import ✅"| DS["Dataset"]
    DS -->|"Stats Agent ✅"| AN["Analysis"]
    AN -->|"manual edit ✅"| FI["Finding"]
    FI -->|"Writing Agent ✅"| RP["Report"]
    LR -.-> T["Theory<br/><i>optional · edit or future agent</i>"]
    T -.-> H
    classDef done fill:#ecfdf5,stroke:#059669,stroke-width:2px
    classDef todo fill:#f8fafc,stroke:#94a3b8,stroke-width:1px,stroke-dasharray:4
    classDef root fill:#eef2ff,stroke:#4f46e5,stroke-width:2px
    classDef side fill:#fff7ed,stroke:#ea580c,stroke-width:1px,stroke-dasharray:4
    class Q root
    class LR,H,CA,ED,ER,DS,AN,FI,RP done
    class T side
```
Traceability: Every agent click also creates an `AgentRun` artifact (linked to inputs and output). These are stored for audit but hidden on the main graph in v0.
| Agent | Reads | Writes | Status |
|---|---|---|---|
| Literature | Question | `LiteratureReview` | ✅ Implemented (LLM or stub) |
| Hypothesis | LiteratureReview | `Hypothesis` | ✅ Implemented (LLM or stub) |
| Skeptic | Hypothesis | `ConfoundAnalysis` | ✅ Stub (LLM later) |
| Methods | ConfoundAnalysis | `ExperimentDesign` | ✅ Stub |
| Experiment | ExperimentDesign | `ExperimentRun` | ✅ Stub |
| Stats | Dataset | `Analysis` | ✅ Stub |
| Writing | Finding | `Report` | ✅ Stub |
Manual artifact steps
| Stage | How it enters the graph |
|---|---|
| `Question` | Created when you create a project |
| `Dataset` | Add Dataset after ExperimentRun |
| `Finding` | Add Finding after Analysis |
| `Theory` | Not wired yet; optional future stage |
Rules (every agent)
- Read parent artifact(s) from SQLite.
- Call LLM (if configured) or stub template.
- Insert one child artifact + `artifact_links` + `agent_runs` row + `AgentRun` event artifact.
- Stop — no further steps until you click again.
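
The rules above can be sketched as a single bounded function. The app's agents are Rust modules writing to SQLite; the TypeScript below uses in-memory arrays instead, and every name in it (`ResearchStore`, `runAgentOnce`, `ContentGenerator`, the id scheme) is a hypothetical stand-in chosen for illustration.

```typescript
// In-memory stand-ins for the SQLite tables. All names here are hypothetical.
interface StoredArtifact { id: string; type: string; content: string; }
interface ArtifactLink { parentId: string; childId: string; }
interface AgentRunRow { id: string; agent: string; inputIds: string[]; outputId: string; finishedAt: string; }
interface ResearchStore { artifacts: StoredArtifact[]; links: ArtifactLink[]; agentRuns: AgentRunRow[]; }

// Either an LLM call or a stub template; modeled here as a plain synchronous function.
type ContentGenerator = (parents: StoredArtifact[]) => string;

function runAgentOnce(
  store: ResearchStore,
  agent: string,
  outputType: string,
  parentIds: string[],
  generate: ContentGenerator,
): StoredArtifact {
  // 1. Read the parent artifact(s).
  const parents = store.artifacts.filter((a) => parentIds.indexOf(a.id) !== -1);
  if (parents.length !== parentIds.length) throw new Error("missing parent artifact");

  // 2. Produce content from the LLM (if configured) or a stub template.
  const content = generate(parents);

  // 3. Insert one child artifact, its parent links, a run row, and an AgentRun event artifact.
  const child: StoredArtifact = { id: `a${store.artifacts.length + 1}`, type: outputType, content };
  store.artifacts.push(child);
  for (const parent of parents) store.links.push({ parentId: parent.id, childId: child.id });
  const run: AgentRunRow = {
    id: `run${store.agentRuns.length + 1}`,
    agent,
    inputIds: parents.map((p) => p.id),
    outputId: child.id,
    finishedAt: new Date().toISOString(),
  };
  store.agentRuns.push(run);
  store.artifacts.push({ id: `a${store.artifacts.length + 1}`, type: "AgentRun", content: `${agent} run ${run.id}` });

  // 4. Stop: return without triggering any further agent.
  return child;
}

const demoStore: ResearchStore = {
  artifacts: [{ id: "a1", type: "Question", content: "Does sleep loss impair working memory?" }],
  links: [],
  agentRuns: [],
};
const stubReview = runAgentOnce(
  demoStore, "Literature", "LiteratureReview", ["a1"],
  (parents) => `Stub review for: ${parents.map((p) => p.content).join("; ")}`,
);
```

The function never calls another agent, so chaining can only happen when a person clicks again.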
```mermaid
flowchart TB
    L["Literature Agent ✅"]
    Hy["Hypothesis Agent ✅"]
    Sk["Skeptic Agent"]
    Me["Methods Agent"]
    Ex["Experiment Agent"]
    St["Stats Agent"]
    Wr["Writing Agent"]
    L --> Hy --> Sk --> Me --> Ex --> St --> Wr
```
That column is the intended agent order along the pipeline—not autonomous chaining. You still run one agent at a time from the workspace.
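
One way to picture "one step at a time" is a lookup from the selected artifact type to the single action it unlocks. The table below is derived from the agent and manual-step tables in this README; the `NEXT_STEP` structure and the function names are illustrative assumptions, not code from the app.

```typescript
// Lookup derived from the README's agent and manual-step tables; the structure is an assumption.
type Step =
  | { kind: "agent"; agent: string; writes: string }
  | { kind: "manual"; action: string; writes: string };

const NEXT_STEP: Record<string, Step> = {
  Question: { kind: "agent", agent: "Literature", writes: "LiteratureReview" },
  LiteratureReview: { kind: "agent", agent: "Hypothesis", writes: "Hypothesis" },
  Hypothesis: { kind: "agent", agent: "Skeptic", writes: "ConfoundAnalysis" },
  ConfoundAnalysis: { kind: "agent", agent: "Methods", writes: "ExperimentDesign" },
  ExperimentDesign: { kind: "agent", agent: "Experiment", writes: "ExperimentRun" },
  ExperimentRun: { kind: "manual", action: "Add Dataset", writes: "Dataset" },
  Dataset: { kind: "agent", agent: "Stats", writes: "Analysis" },
  Analysis: { kind: "manual", action: "Add Finding", writes: "Finding" },
  Finding: { kind: "agent", agent: "Writing", writes: "Report" },
};

// Return the single step available from the selected artifact, or null at the end of the chain.
function nextStepFor(artifactType: string): Step | null {
  return NEXT_STEP[artifactType] ?? null;
}

// Follow the lookup from a starting type to list the stage order. Nothing is executed here.
function pipelineFrom(start: string): string[] {
  const order = [start];
  let step = nextStepFor(start);
  while (step !== null) {
    order.push(step.writes);
    step = nextStepFor(step.writes);
  }
  return order;
}

const stageOrder = pipelineFrom("Question");
```

`pipelineFrom` only describes the order; running a stage would still be a separate, user-triggered call.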
This is the Tauri / React / Rust layout—the diagram that shows where code runs, not which agent comes next:
```text
┌────────────────────────────────────────────────────────────┐
│                Research OS (Tauri desktop)                 │
│ ┌──────────────────────┐  ┌──────────────────────────────┐ │
│ │ React UI             │  │ Rust backend                 │ │
│ │ · Project list       │◄─┤ · SQLite (artifacts, links)  │ │
│ │ · React Flow graph   │  │ · Agent runners (LLM / stub) │ │
│ │ · Artifact inspector │  │ · IPC commands               │ │
│ └──────────────────────┘  └──────────────────────────────┘ │
└────────────────────────────────────────────────────────────┘
                              │
                              ▼
       ~/Library/Application Support/.../research-os.db
```
| Layer | Role |
|---|---|
| Graph (UI) | Renders artifact nodes and parent→child edges from SQLite |
| Artifacts (DB) | Source of truth for content, type, version, timestamps |
| Agent runs (DB) | Trace records + AgentRun artifacts for auditability |
| Agents | Rust modules invoked by IPC; one run per button click |
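
As a rough illustration of the Graph (UI) layer, the sketch below turns artifact rows and links into node and edge objects of the shape React Flow accepts (`id`, `position`, `data` for nodes; `id`, `source`, `target` for edges). The stage list, the spacing values, and the names `toFlowGraph`, `GraphArtifact`, and `GraphLink` are assumptions; the app's real layout code may differ.

```typescript
// Stage order follows the pipeline; spacing values and field names are illustrative assumptions.
const STAGE_ORDER = [
  "Question", "LiteratureReview", "Hypothesis", "ConfoundAnalysis", "ExperimentDesign",
  "ExperimentRun", "Dataset", "Analysis", "Finding", "Report",
];

interface GraphArtifact { id: string; type: string; title: string; }
interface GraphLink { parentId: string; childId: string; }
// Minimal node and edge shapes in the form React Flow accepts.
interface FlowNode { id: string; position: { x: number; y: number }; data: { label: string }; }
interface FlowEdge { id: string; source: string; target: string; }

function toFlowGraph(
  artifacts: GraphArtifact[],
  links: GraphLink[],
): { nodes: FlowNode[]; edges: FlowEdge[] } {
  // AgentRun event artifacts are kept for audit but left off the main graph.
  const visible = artifacts.filter((a) => a.type !== "AgentRun");
  const placedPerStage: Record<number, number> = {};

  const nodes = visible.map((a) => {
    const index = STAGE_ORDER.indexOf(a.type);
    const stage = index === -1 ? STAGE_ORDER.length : index; // unknown types go last
    const column = placedPerStage[stage] ?? 0;
    placedPerStage[stage] = column + 1;
    // One row per stage (top to bottom); siblings in a stage spread left to right.
    return { id: a.id, position: { x: column * 260, y: stage * 140 }, data: { label: a.title } };
  });

  const visibleIds = visible.map((a) => a.id);
  const edges = links
    .filter((l) => visibleIds.indexOf(l.parentId) !== -1 && visibleIds.indexOf(l.childId) !== -1)
    .map((l) => ({ id: `${l.parentId}->${l.childId}`, source: l.parentId, target: l.childId }));

  return { nodes, edges };
}

const demoGraph = toFlowGraph(
  [
    { id: "a1", type: "Question", title: "Root question" },
    { id: "a2", type: "LiteratureReview", title: "Literature review" },
    { id: "a3", type: "AgentRun", title: "Literature run" },
  ],
  [{ parentId: "a1", childId: "a2" }],
);
```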
- One agent, one artifact, then stop — no autonomous multi-step wandering
- The graph is the memory — not chat history buried in a sidebar
- Full traceability — agent runs link inputs, outputs, status, and prompt version
- Domain-shaped pipeline — artifact types match a real study lifecycle, not generic “documents”
- Local-first — projects and data stay on my machine in v0
- Provider-flexible — any OpenAI-compatible API; stubs when no key is set
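
To make the traceability principle concrete, here is a minimal sketch of a trace record and two lookups over it. The record fields mirror the list above (inputs, output, status, prompt version), but the exact names, the status values, and the sample rows are assumptions made for illustration.

```typescript
// Hypothetical trace record; the status values and field names are assumptions.
interface AgentRunRecord {
  id: string;
  agent: string;
  inputIds: string[];
  outputId: string;
  runStatus: "succeeded" | "failed";
  promptVersion: string;
  finishedAt: string; // ISO 8601
}

// "What did this agent read, what did it write, and when?" for one output artifact.
function traceFor(outputId: string, runs: AgentRunRecord[]): AgentRunRecord | null {
  const matches = runs.filter((r) => r.outputId === outputId);
  return matches[0] ?? null;
}

// Walk upstream through the runs to list every artifact that fed into the given one.
function lineageOf(outputId: string, runs: AgentRunRecord[]): string[] {
  const upstream: string[] = [];
  const queue: string[] = [outputId];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === undefined) break;
    const run = traceFor(current, runs);
    if (run === null) continue; // manual artifacts such as the root Question have no run
    for (const inputId of run.inputIds) {
      if (upstream.indexOf(inputId) === -1) {
        upstream.push(inputId);
        queue.push(inputId);
      }
    }
  }
  return upstream;
}

// Made-up sample rows for the demo.
const demoRuns: AgentRunRecord[] = [
  { id: "run1", agent: "Literature", inputIds: ["q"], outputId: "lit",
    runStatus: "succeeded", promptVersion: "v1", finishedAt: "2025-01-01T10:00:00Z" },
  { id: "run2", agent: "Hypothesis", inputIds: ["lit"], outputId: "hyp",
    runStatus: "succeeded", promptVersion: "v1", finishedAt: "2025-01-01T10:05:00Z" },
];
const hypothesisLineage = lineageOf("hyp", demoRuns);
```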
- Skeptic agent after Hypothesis (confounds, alternatives, threats to validity)
- Literature search — retrieve real papers, then LLM synthesis with citations
- Methods and Experiment agents — design and protocol artifacts
- Stats and Writing agents — analysis plan and report draft
- Re-run and branch — new artifact versions without losing history
- Better graph UX — layout, agent-run history panel, clearer selection
- Packaged releases — `.app` / installer for non-developer use
If you want to try the project, report a bug, or share an idea, open a GitHub issue. This repo is public so people can inspect the work, fork it, and follow the project as it develops.
- Tauri 2 + Rust — desktop shell, SQLite, agents, IPC
- React + TypeScript + Tailwind CSS — UI
- Vite — frontend build and dev server
- React Flow (`@xyflow/react`) — research graph
- SQLite (`rusqlite`, bundled) — local persistence
- OpenAI-compatible APIs — Literature and Hypothesis agents (optional)
- Node.js (LTS recommended)
- Rust (`rustup`) — https://rustup.rs
- Xcode Command Line Tools (macOS) — `xcode-select --install`
- Git — for cloning and version control
```bash
git clone git@github.com:Yaph123/Research-OS.git
cd Research-OS
npm install
```

Rust dependencies are fetched on the first `cargo tauri dev` or `cargo build`.

```bash
source "$HOME/.cargo/env"   # if Rust was just installed
npm run tauri dev
```

Use the Research OS desktop window that opens, not the `localhost:5173` browser tab (the Rust backend only runs inside Tauri).

Useful commands:

```bash
npm run dev          # frontend only (no agents / DB)
npm run build        # production frontend build
npm run tauri build  # packaged desktop app
```

Agents use an OpenAI-compatible chat API when configured. Without a key, stub templates still run so you can explore the UI.

- Copy the example env file:

  ```bash
  cp .env.example .env
  ```

- Edit `.env` in the project root. Never commit `.env`.

  Groq (free tier, no credit card):

  ```env
  OPENAI_API_KEY=gsk_your_groq_key_here
  OPENAI_API_BASE=https://api.groq.com/openai/v1
  OPENAI_MODEL=llama-3.3-70b-versatile
  ```

  OpenAI (paid API credits, separate from ChatGPT Plus):

  ```env
  OPENAI_API_KEY=sk-proj-your-key-here
  OPENAI_API_BASE=https://api.openai.com/v1
  OPENAI_MODEL=gpt-4o-mini
  ```

- Restart the app:

  ```bash
  npm run tauri dev
  ```

The workspace shows a green LLM connected bar when the key loads, or an amber bar for stub mode.
| Variable | Default |
|---|---|
| `OPENAI_API_BASE` | `https://api.openai.com/v1` |
| `OPENAI_MODEL` | `gpt-4o-mini` |
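
The defaults and the stub fallback described above can be sketched as follows. The app reads `.env` on the Rust side, so this TypeScript is only an illustration: `resolveLlmConfig`, `buildChatRequest`, and the config shapes are hypothetical names. The `/chat/completions` path is the standard chat endpoint of OpenAI-compatible APIs.

```typescript
// Hypothetical config shapes; only the variable names and defaults come from the README.
interface LlmConfig { mode: "llm"; apiKey: string; apiBase: string; model: string; }
interface StubConfig { mode: "stub"; }

const DEFAULT_API_BASE = "https://api.openai.com/v1";
const DEFAULT_MODEL = "gpt-4o-mini";

// Read one variable, treating missing and blank values the same way.
function readVar(env: Record<string, string | undefined>, key: string): string {
  return (env[key] ?? "").trim();
}

// No key means stub mode; otherwise fill in the documented defaults.
function resolveLlmConfig(env: Record<string, string | undefined>): LlmConfig | StubConfig {
  const apiKey = readVar(env, "OPENAI_API_KEY");
  if (apiKey === "") return { mode: "stub" };
  return {
    mode: "llm",
    apiKey,
    apiBase: readVar(env, "OPENAI_API_BASE") || DEFAULT_API_BASE,
    model: readVar(env, "OPENAI_MODEL") || DEFAULT_MODEL,
  };
}

// Build (but do not send) the request an OpenAI-compatible chat endpoint expects.
function buildChatRequest(config: LlmConfig, prompt: string) {
  return {
    url: `${config.apiBase.replace(/\/+$/, "")}/chat/completions`,
    headers: { Authorization: `Bearer ${config.apiKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({ model: config.model, messages: [{ role: "user", content: prompt }] }),
  };
}

const stubConfig = resolveLlmConfig({});
const groqConfig = resolveLlmConfig({
  OPENAI_API_KEY: "gsk_your_groq_key_here",
  OPENAI_API_BASE: "https://api.groq.com/openai/v1",
  OPENAI_MODEL: "llama-3.3-70b-versatile",
});
```

Returning a tagged `mode` keeps the stub path explicit, which matches the green or amber status bar in the workspace.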
- Open Research OS from `npm run tauri dev`.
- Create a project on the home screen (each project starts with a Question artifact).
- Open the project, write your research question in the inspector, and Save content.
- Select the Question and click Run Literature Agent →.
- Select the Literature review and click Run Hypothesis Agent →.
- Edit any artifact content; version increments on each save.
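
The last step, version tracking on save, can be sketched as a pure function. The names `EditableArtifact` and `saveContent` are hypothetical, and the choice to bump the version on every save, even when the content is unchanged, is an assumption based on the wording above.

```typescript
// Hypothetical slice of an artifact record, limited to the fields a save touches.
interface EditableArtifact { id: string; content: string; version: number; updatedAt: string; }

// Return a new record with the edited content and version + 1; the input is not mutated.
function saveContent(
  artifact: EditableArtifact,
  newContent: string,
  now: Date = new Date(),
): EditableArtifact {
  return {
    ...artifact,
    content: newContent,
    version: artifact.version + 1,
    updatedAt: now.toISOString(),
  };
}

const firstDraft: EditableArtifact = {
  id: "a1",
  content: "First draft",
  version: 1,
  updatedAt: "2025-01-01T00:00:00.000Z",
};
const secondDraft = saveContent(firstDraft, "Second draft", new Date("2025-01-02T00:00:00.000Z"));
```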
Data is stored at:
~/Library/Application Support/com.research-os.desktop/research-os.db
- docs/ASSUMPTIONS.md — MVP design decisions and constraints
This repo uses SSH for Git:
```bash
git remote -v   # should show git@github.com:Yaph123/Research-OS.git
git push
```

If HTTPS tokens fail with 403, switch to SSH:

```bash
git remote set-url origin git@github.com:Yaph123/Research-OS.git
```