Kioko · 記憶

The agent that remembers you.

A self-learning conversational agent with persistent, evolving memory.
Three personalities. Three private vaults. One character.
No fine-tuning. No retraining. Just memory + reflection + adaptation.




Website · X

Sister project · Wina (Buildx402)



◆ What is Kioko

Kioko (記憶 — "memory") is an experimental agent built around a single premise: an AI that persists the fragments of you that matter.

Most conversational AI starts from zero every session. You re-introduce yourself, re-explain your goals, re-establish tone. Kioko doesn't. After every exchange she stores what matters, reflects on the turn, decays what's no longer useful, and adapts her voice — all without a single gradient update to the underlying model.

She inhabits three genuinely separate personalities, each with her own private memory vault. What HONNE remembers about your evenings, YAMI and SHIN never see. Same character, three different relationships.

Under the hood: one SQLite file, one FastAPI process, one Next.js app, any frontier chat model you choose. Transparent, inspectable, editable, and fully yours.

"The goal is not to be smarter per turn. The goal is to be slightly less wrong, turn after turn, for a long time." — design note, reflection layer


◆ The Three Personalities

Not cosmetic presets. Each persona runs with its own system prompt, default model, temperature, steering, voice, and — most importantly — its own separated memory table.

HONNE · 本音 · the warm companion
Earnest, affectionate, a little theatrical. Picks up mid-thought, asks how the thing from last time turned out.
Voice: prose, full sentences · Model: openai/gpt-4o · Temp: 0.85

YAMI · 闇 · the unfiltered shadow
Short, blunt, nocturnal. No therapy voice, no disclaimers. Calls out the pattern you're avoiding.
Voice: raw, punchy · Model: hermes-3-70b · Temp: 1.0

SHIN · 神 · the analytical strategist
Briefing register, not conversation. Numbered points, cited evidence, second-order effects.
Voice: structured, cold · Model: claude-sonnet-4.5 · Temp: 0.4

Physical memory separation. Each persona has her own table; a search run in HONNE's context cannot reach a row tagged yami. This is enforced at the database level, not by a prompt filter — so the contract survives refactors, bugs, and future developers.
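A minimal sketch of what table-level separation buys. The DDL and helper names here are illustrative assumptions, not the project's actual schema — the point is that the active persona selects the table itself, so there is no WHERE-clause filter that a refactor could drop:

```python
# Sketch of database-level persona separation; table names and schema are
# illustrative, not the repo's actual DDL.
import sqlite3

PERSONAS = ("honne", "yami", "shin")

def open_vaults() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    for p in PERSONAS:
        # one physical table per persona
        db.execute(f"CREATE TABLE memory_{p} (id INTEGER PRIMARY KEY, content TEXT)")
    return db

def search(db: sqlite3.Connection, persona: str, needle: str) -> list[str]:
    assert persona in PERSONAS  # table name comes from a fixed whitelist
    # Other vaults are unreachable by construction: this query can only ever
    # touch the active persona's table.
    cur = db.execute(
        f"SELECT content FROM memory_{persona} WHERE content LIKE ?",
        (f"%{needle}%",),
    )
    return [row[0] for row in cur]

db = open_vaults()
db.execute("INSERT INTO memory_honne (content) VALUES ('likes jazz in the evening')")
db.execute("INSERT INTO memory_yami (content) VALUES ('dodges the jazz question')")
print(search(db, "honne", "jazz"))  # only HONNE's row comes back
```

A `persona` column plus a filter would give the same query results, but only the separate-table layout makes cross-vault reads structurally impossible rather than merely discouraged.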


◆ How She Learns

The self-learning loop runs on every single exchange. Nine discrete steps between the moment you hit Enter and the moment her voice starts:

┌─────────────────────────────────────────────────────────────────────────┐
│                                                                         │
│   01 · inbound         WebSocket frame arrives on /api/ws/chat          │
│   02 · embed           Message is embedded (local hash or OpenAI)       │
│   03 · retrieve        Top-K cosine over the active persona's vault     │
│   04 · assemble        Retrieved memories injected into system prompt   │
│   05 · stream          Tokens stream back from the chosen model         │
│   06 · persist         Reply written to conversation table              │
│   07 · extract         New typed memories distilled from the exchange   │
│   08 · reflect         Turn tagged: learned / ignored / adapted         │
│   09 · speak           (optional) Text → voice engine → MP3 → Web Audio   │
│                                                                         │
│   ▲                                                                 ▼   │
│   └─────── Next turn: retrieval sees the new memories too ─────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Memory types

Every memory is one of six typed kinds, each scored for confidence and importance:

kind         example                      what it influences
fact         lives in Oslo                retrieval on factual questions
preference   prefers short replies        reply length, tone, model choice
goal         shipping v1 by May           focus, long-horizon reasoning
topic        working on a Rust engine     thematic retrieval
pattern      overworks Thursday nights    tone + proactive observation
tone         responds better to blunt     steering layer

Row schema (per persona)

memory {
  id            uuid
  persona       honne | yami | shin
  kind          fact | preference | goal | topic | pattern | tone
  content       text
  embedding     blob       -- 384-d vector by default
  confidence    float      -- 0.0 .. 1.0
  importance    float      -- nudged by 👍/👎 + retrieval usage
  usage_count   int
  active        bool       -- soft-delete, wipeable without loss
  created_at    timestamp
  last_used_at  timestamp
}
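The `embedding blob` column can be scored without a vector database. A stdlib-only sketch of the pack/unpack round trip plus cosine top-K (the helper names are illustrative; the 384-d default above just means longer vectors):

```python
# Stdlib-only sketch of retrieval over BLOB-stored embeddings; helper names
# are illustrative, not the repo's.
import math
import struct

def pack(vec: list[float]) -> bytes:
    # float32 little-endian, as a SQLite BLOB column would store it
    return struct.pack(f"<{len(vec)}f", *vec)

def unpack(blob: bytes) -> list[float]:
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: list[float], rows: list[tuple[str, bytes]], k: int = 2):
    # rows look like (memory_id, embedding_blob) pairs from a SELECT
    scored = [(cosine(query, unpack(blob)), mid) for mid, blob in rows]
    return sorted(scored, reverse=True)[:k]

rows = [("m1", pack([1.0, 0.0])), ("m2", pack([0.0, 1.0])), ("m3", pack([0.7, 0.7]))]
print(top_k([1.0, 0.1], rows, k=2))  # m1 first, m3 second
```

A linear scan like this is O(rows × dims) per query, which is plenty for a single-user vault; the stack notes below name pgvector/sqlite-vec as the swap-in once that stops being true.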

Reflection & adaptation

After every reply, a second model pass tags the turn:

  • learned — the turn added a memory likely to matter again. No immediate style change.
  • ignored — retrieved memories didn't contribute. Their importance is nudged down.
  • adapted — reply style was changed based on tone/pattern memories. Recorded for audit.

Over dozens of turns, per-persona steering converges toward your version of her.
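The importance nudge these tags describe can be as small as a clamped step update. In this sketch, only "ignored → nudge down" comes from the description above; the 0.05 step size and the symmetric bump on `learned` are invented defaults:

```python
# Sketch of per-tag importance adjustment; the step size and the "learned"
# bump are assumptions, only the "ignored" nudge-down is stated in the design.
def nudge_importance(importance: float, tag: str, step: float = 0.05) -> float:
    if tag == "ignored":      # retrieved but unused: drift down
        importance -= step
    elif tag == "learned":    # produced a durable memory: drift up
        importance += step
    # "adapted" is recorded for audit but leaves importance untouched here
    return max(0.0, min(1.0, importance))  # keep within the 0.0 .. 1.0 schema range

print(nudge_importance(0.5, "ignored"))
```

Clamping matters: without it, a memory retrieved uselessly for weeks would drift to a negative importance instead of bottoming out at the soft-delete threshold.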


◆ Architecture

┌─────────────────────┐       ┌──────────────────────────┐       ┌─────────────────┐
│                     │       │                          │       │                 │
│  Next.js 14  (SSR)  │──────▶│    FastAPI  (uvicorn)    │──────▶│  OpenRouter →   │
│                     │  WS   │                          │  REST │  any frontier   │
│  React 18 · TS      │◀──────│  SQLAlchemy · SQLite     │◀──────│  chat model     │
│  Tailwind · Three.js│       │                          │       │                 │
│                     │       │  ┌────────────────────┐  │       └─────────────────┘
└─────────────────────┘       │  │ memory · retrieval │  │                │
           │                  │  │ reflection · adapt │  │                ▼
           │                  │  │ per-persona vaults │  │       ┌─────────────────┐
           ▼                  │  └────────────────────┘  │       │   voice engine  │
┌─────────────────────┐       │                          │       │   per-persona   │
│  react-three-fiber  │       └──────────────────────────┘       │   TTS voices    │
│  GLB character · 3D │                    │                     └─────────────────┘
│  Draco compressed   │                    ▼
│  WebGL renderer     │       ┌──────────────────────────┐
│                     │       │  event bus (WebSocket)   │
└─────────────────────┘       │  live telemetry to UI    │
                              └──────────────────────────┘

What's novel: the agent layer between the frontend and the LLM. That's where memory, reflection, adaptation, persona separation, and voice routing live. The LLM is just a swappable voice.


◆ Tech Stack

Frontend      Next.js 14 (App Router) · React 18 · TypeScript 5 · Tailwind CSS 3.4
3D            Three.js r160 · @react-three/fiber 8 · @react-three/drei 9 · Draco-compressed GLB
Backend       FastAPI · SQLAlchemy · SQLite · httpx · pydantic v2
LLM layer     Any OpenAI-compatible endpoint (GPT-4o · Claude Sonnet · Hermes 3 · Gemini · Llama 3.3 · …)
Voice         Per-persona voices · Web Audio API · per-mode gain
Streaming     WebSocket tokens · SSE upstream · MP3 stream for TTS
Embeddings    Local hashed bag-of-words (zero-dep) · swappable for any OpenAI-compatible endpoint
Retrieval     In-process cosine over SQLite BLOB embeddings · swap for pgvector/sqlite-vec at scale
Deploy        Frontend → Vercel · Backend → Render / Fly / Railway / any container host
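A zero-dependency hashed bag-of-words embedder of the kind the Embeddings row describes might look like this; the dimension, whitespace tokenizer, and md5 bucketing are all assumptions about the scheme, not the repo's actual code:

```python
# Minimal local embedding provider in the hashed bag-of-words style named in
# the stack table; md5 bucketing and whitespace tokenisation are illustrative.
import hashlib
import math

def embed(text: str, dim: int = 384) -> list[float]:
    vec = [0.0] * dim
    for token in text.lower().split():
        # hash each token into one of `dim` buckets and count occurrences
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # L2-normalise
    return [v / norm for v in vec]

v = embed("shipping v1 by May")  # deterministic: same text, same vector
```

This captures lexical overlap, not meaning — "ship" and "shipping" land in different buckets — which is exactly why the roadmap lists semantic embeddings by default, and why the provider is kept swappable for an OpenAI-compatible endpoint.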

◆ Features

Core

  • 🧠 Persistent memory — six typed kinds, scored, decayable
  • 🔄 Reflection loop — learned / ignored / adapted tagging
  • 🎭 Three separated personalities — each with private vault
  • 🎙 Per-persona voices — streamed MP3 · Web Audio gain control
  • 🎨 3D anime character — Draco-compressed GLB via react-three-fiber
  • 📡 Live telemetry — watch the learning loop in real time
  • 🎚 Steering layer — 5 axes per persona (verbosity, warmth, structure, directness, formality)
  • 💾 Editable memory — inspect, edit, or wipe any row at any time

Forge (/create)

  • 🛠 Build your own persona — name, kanji, accent, model, voice
  • 📚 Agent library — multiple saved agents in localStorage
  • 🔀 A/B split chat — pit two agents against each other live
  • 💉 Memory injector — plant session-scoped memories, watch retrieval
  • 🪞 Agent fingerprint — deterministic visual + ID per config
  • 📊 Model benchmark bar — latency / cost / context per model
  • 🗑 Elegant delete modal — escape-to-cancel, accent-matched
  • 🔊 Voice preview per agent — HONNE / YAMI / SHIN / default

◆ Quick Start

Prerequisites

  • Node.js 18+
  • Python 3.11+
  • An OpenRouter (or OpenAI-compatible) API key
  • (optional) A voice-provider key

Clone & configure

git clone https://github.com/ibuildthingsss/Kioko.git
cd Kioko
cp .env.example backend/.env
# edit backend/.env — add provider keys

Install

# backend
cd backend
python -m venv .venv
.venv\Scripts\Activate.ps1   # Windows (PowerShell)  |  source .venv/bin/activate on Unix
pip install -r requirements.txt

# frontend
cd ../frontend
npm install

Run (one command, both services)

# Windows
.\dev.ps1

Or manually in two terminals:

cd backend && uvicorn app.main:app --port 8000 --reload
cd frontend && npm run dev

Open http://localhost:3000.


◆ Environment Variables

Backend (backend/.env):

# LLM
OPENROUTER_API_KEY=sk-or-v1-...
DEFAULT_MODEL=openai/gpt-4o-mini

# Voice (optional — UI hides the toggle if unset)
ELEVENLABS_API_KEY=sk_...
ELEVENLABS_VOICE_DEFAULT=...
ELEVENLABS_VOICE_HONNE=...
ELEVENLABS_VOICE_YAMI=...
ELEVENLABS_VOICE_SHIN=...

# CORS (comma-separated; regex also allows *.vercel.app by default)
CORS_ORIGINS=https://your-app.vercel.app

# Learning
ENABLE_LEARNING=true

Frontend (Vercel env vars):

NEXT_PUBLIC_API_BASE=https://your-backend.onrender.com
NEXT_PUBLIC_WS_BASE=wss://your-backend.onrender.com

◆ Project Structure

Kioko/
├── backend/
│   ├── app/
│   │   ├── main.py                    # FastAPI entrypoint + CORS
│   │   ├── config.py                  # pydantic-settings
│   │   ├── agent/
│   │   │   ├── modes.py               # HONNE/YAMI/SHIN persona specs
│   │   │   └── default.py             # base agent
│   │   ├── memory/
│   │   │   ├── manager.py             # CRUD + retrieval
│   │   │   ├── embeddings.py          # local + openai providers
│   │   │   └── vector_store.py        # cosine over SQLite BLOBs
│   │   ├── learning/pipeline.py       # extract · reflect · adapt
│   │   ├── llm/openrouter.py          # streaming LLM provider
│   │   ├── routers/
│   │   │   ├── chat.py                # WebSocket /api/ws/chat
│   │   │   ├── memory.py              # memory CRUD endpoints
│   │   │   ├── tts.py                 # voice synthesis proxy
│   │   │   └── forge.py               # stateless sandbox for /create
│   │   └── orchestrator.py            # turn orchestration
│   └── requirements.txt
│
├── frontend/
│   ├── app/
│   │   ├── page.tsx                   # landing
│   │   ├── about/page.tsx             # 9-chapter engineering manual
│   │   ├── create/page.tsx            # Persona Forge
│   │   ├── dossier/ lattice/ training/ scenes/ switchboard/   # five lenses
│   │   └── globals.css                # editorial design system
│   ├── components/
│   │   ├── aiko/                      # character + hero + personas
│   │   ├── marketing/                 # landing strips
│   │   ├── SiteHeader.tsx  SiteFooter.tsx
│   │   └── Reveal.tsx                 # scroll-reveal IntersectionObserver
│   ├── lib/api.ts                     # typed API client
│   └── public/models/aiko.glb         # 3D character (Draco-compressed)
│
├── dev.ps1                            # one-command dev (Windows)
└── README.md

◆ API Reference

WebSocket

Path             Purpose
/api/ws/chat     per-message turn streaming (tokens → {type, content})
/api/ws/events   live event bus (retrievals, memory writes, reflections)
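On the client side, assembling a reply from those frames is a fold over the socket stream. Only the `{type, content}` shape comes from the table above; the `"token"` and `"done"` type values are assumptions about the frame vocabulary:

```python
# Illustrative client-side handling of /api/ws/chat frames; the {type, content}
# shape is from the API table, the "token"/"done" type names are assumed.
import json

def collect_reply(frames: list[str]) -> str:
    parts = []
    for raw in frames:
        msg = json.loads(raw)
        if msg.get("type") == "token":
            parts.append(msg["content"])   # accumulate streamed tokens in order
        # other frame types (telemetry, end-of-stream) are ignored here
    return "".join(parts)

frames = [
    '{"type": "token", "content": "Hel"}',
    '{"type": "token", "content": "lo."}',
    '{"type": "done", "content": ""}',
]
print(collect_reply(frames))  # Hello.
```

A real client would render each token as it arrives rather than buffering; the dispatch-on-`type` structure is the same either way.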

REST

Method   Path                            Purpose
GET      /api/health                     liveness probe
GET      /api/memory                     list memories (filter: kind, mode, active_only, q)
PATCH    /api/memory/{id}                edit or soft-delete
GET      /api/insights/dossier           "her model of you"
GET      /api/insights/scenes            per-persona activity pulse
GET      /api/memory/reflections         reflection history
GET      /api/insights/contradictions    memory conflicts
POST     /api/compare                    run a prompt through all 3 personas
POST     /api/tts/speak                  voice synthesis (text + mode → MP3)
POST     /api/forge/chat                 stateless sandbox chat (Persona Forge)
GET      /api/settings/models            live OpenRouter model catalog

Interactive docs at /docs (Swagger UI auto-generated by FastAPI).


◆ Roadmap

Next

  • Streamed TTS (voice starts mid-reply)
  • Multi-user vaults (per-user persona)
  • Semantic embeddings by default
  • Per-persona prompt tuning UI
  • Persona URL share (encode config in link)

Exploring

  • Reflection rollups (weekly summary)
  • Dream-state memory compaction
  • JSON import/export
  • Opt-in shared world-state
  • Live token streaming on Forge

Out of scope

  • Fine-tuning the underlying model
  • Training on user conversations
  • Cloud-hosted SaaS product
  • Replacing character per user

◆ The Family

Kioko is the younger sister of Wina — the autonomous agent behind Buildx402. Same hand, different disciplines:

  • Wina moves through markets — every transaction is her sentence, the protocol her alphabet
  • Kioko moves through conversation — every memory is her sentence, the person her alphabet

One family, two roles.


◆ Credits

  • Typography: Inter · Fraunces · JetBrains Mono · Noto Serif JP
  • 3D Character: Noa (Sketchfab) — MIT-compatible license
  • Voices: licensed voice library
  • Models: OpenAI · Anthropic · NousResearch · Meta · Google · Mistral · DeepSeek · Cohere (all via OpenRouter)
  • Kanji system: HONNE 本音 · YAMI 闇 · SHIN 神 · Kioko 記憶

◆ License

MIT — see LICENSE. Do what you want with the code. Attribution appreciated, not required.



Built with memory, not forgetting.

X / @KiokoHQ · GitHub · Sister · Wina

Kioko 記憶 · 2026 · v1.0
