DAI (ダイ, 大)

A local AI that actually learns from your conversations through LoRA fine-tuning on Apple Silicon. It has idle thinking (rumination and daydreaming) and a neurochemistry system that steers what it learns. Runs locally with MLX.

What This Is

Not a chatbot wrapper. Dai is a local language model that updates its own weights through sleep cycles. You talk to it during the day, it curates the conversation, fine-tunes on what matters, and tests itself before committing changes. Between conversations, it replays memories and looks for patterns.

Features

  • LoRA fine-tuning from conversations. Learns your preferences, facts, and communication style.
  • Sleep cycle. Nightly curation, compression, training, self-test, and merge pipeline.
  • Self-test with rollback. If post-training scores drop, the adapter gets rolled back automatically.
  • Rumination. Directed reflection on recent conversations during idle time.
  • Daydreaming. Free association between random memories at high temperature.
  • Neurochemistry. Adenosine (sleep pressure), cortisol (learning gain), dopamine (reward tagging).
  • Hybrid memory retrieval. Vector similarity + BM25 across daily/weekly/monthly memory tiers.
  • Tool calling. Memory search during inference so it can recall what you've told it.
  • REM-scoped training. Cycles split across today/week/month scopes with replay buffer mixing.
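The hybrid retrieval above blends embedding similarity with lexical matching. A minimal sketch of that idea (the blending scheme, normalization, and function names here are illustrative assumptions, not Dai's actual implementation):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def bm25(query, doc, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized doc against a tokenized query."""
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        f = tf[term]
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

def hybrid_score(vec_sim, bm25_score, alpha=0.5):
    """Blend vector similarity with squashed BM25; alpha is an assumed weight."""
    return alpha * vec_sim + (1 - alpha) * bm25_score / (1 + bm25_score)
```

In practice the same blend would run once per memory tier (daily/weekly/monthly) and the results would be merged.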

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.11+
  • 8GB+ unified memory (16GB+ recommended for the 14B model)

Quick Start

```bash
git clone https://github.com/iSonik/DAI.git
cd DAI
make setup    # picks model size, downloads weights, creates directories
dai           # start chatting
```

Setup will ask you to choose a model:

| Size   | Model                     | RAM   | Notes                   |
|--------|---------------------------|-------|-------------------------|
| Small  | Qwen2.5-7B-Instruct-4bit  | ~6GB  | Fast, good for 8GB Macs |
| Medium | Qwen2.5-14B-Instruct-4bit | ~10GB | Recommended for 16GB+   |
| Large  | Qwen2.5-32B-Instruct-4bit | ~20GB | Best quality, 32GB+     |

How It Works

Awake. You chat. Messages get logged, classified (sentiment, fluff detection), and the neurochemistry updates. Inference runs through MLX with LoRA adapter loaded. Tool calls let Dai search its own memory mid-conversation.
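The classification step can be pictured with a toy heuristic (a stand-in for the LLM-based classifier; every marker and threshold below is an illustrative assumption):

```python
def classify_exchange(text: str) -> dict:
    """Toy heuristic standing in for Dai's LLM-based exchange classifier.
    Markers and the fluff threshold are illustrative assumptions."""
    lowered = text.lower()
    if any(w in lowered for w in ("actually,", "no,", "that's wrong", "incorrect")):
        sentiment = "correction"
    elif any(w in lowered for w in ("great", "love", "perfect", "thanks")):
        sentiment = "positive"
    else:
        sentiment = "neutral"
    # very short exchanges are likely fluff and score low for training
    fluff = len(text.split()) < 4
    return {"sentiment": sentiment, "fluff": fluff}
```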

Idle. When you stop typing, background rumination and daydreaming start. Rumination reflects on today's conversations (or the week, or older memories). Daydreaming collides two random memories and free-associates between them. Thoughts are stored as inner monologue.

Sleep. The full pipeline:

  1. Curate. LLM scores each exchange for training value (threshold: 0.6)
  2. Compress. Daily logs older than 7 days get summarized into weekly; weekly into monthly after 30 days
  3. Train. LoRA fine-tuning across REM cycles scoped to today/week/month, with replay buffer mixing (30%)
  4. Self-test. Generates questions from memory, compares pre/post scores, rolls back if it got worse
  5. Merge. Periodically bakes the LoRA adapter into base weights (every 30 days)
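The curate → train → self-test → rollback flow above can be sketched with injected callables (the function names, signatures, and replay-mixing arithmetic are assumptions; only the 0.6 threshold and ~30% mix come from the pipeline description):

```python
def sleep_cycle(exchanges, score, train, evaluate, threshold=0.6, replay=(), mix=0.3):
    """Illustrative sketch of curate -> train -> self-test -> rollback.
    `score`, `train`, and `evaluate` are injected callables, not Dai's API."""
    curated = [e for e in exchanges if score(e) >= threshold]
    # mix in roughly 30% replayed examples to stabilize training
    k = int(len(curated) * mix)
    batch = curated + list(replay)[:k]
    baseline = evaluate()             # pre-training self-test score
    adapter = train(batch)
    if evaluate() < baseline:         # got worse: discard the new adapter
        return None
    return adapter
```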

The Neurochemistry

Three signals modulate the learning pipeline:

  • Adenosine (sleep pressure). Accumulates with each exchange. Corrections are more tiring than small talk. At threshold (1.0), triggers auto-sleep. Resets after sleeping.
  • Cortisol (learning gain). Spikes when the user corrects Dai. Amplifies training reward so mistakes get prioritized. Decays each sleep cycle.
  • Dopamine (reward tagging). Tags exchanges for training priority. Corrections get the highest reward, approval gets moderate, neutral gets baseline. Feeds directly into curation scoring.
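The three signals can be sketched as simple state updates (all increments, decay rates, and reward values below are illustrative assumptions; only the 1.0 sleep threshold comes from the list above):

```python
from dataclasses import dataclass

@dataclass
class Neurochemistry:
    """Sketch of the three modulatory signals. Constants are assumptions,
    not Dai's tuned values."""
    adenosine: float = 0.0   # sleep pressure
    cortisol: float = 0.0    # learning gain
    dopamine: float = 0.2    # reward tag for the latest exchange

    def on_exchange(self, kind: str) -> None:
        # corrections are more tiring and more salient than small talk
        self.adenosine += 0.15 if kind == "correction" else 0.05
        if kind == "correction":
            self.cortisol = min(1.0, self.cortisol + 0.3)
            self.dopamine = 1.0      # highest training priority
        elif kind == "approval":
            self.dopamine = 0.6      # moderate reward
        else:
            self.dopamine = 0.2      # baseline

    def should_sleep(self) -> bool:
        return self.adenosine >= 1.0  # auto-sleep threshold

    def sleep(self) -> None:
        self.adenosine = 0.0          # resets after sleeping
        self.cortisol *= 0.5          # decays each cycle
```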

Dai also tracks a happiness signal that drifts based on interaction sentiment. Positive feedback lifts it, corrections lower it, and it slowly regresses to baseline during neutral exchanges.
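The happiness drift reduces to a small update rule (the specific constants are illustrative assumptions):

```python
def update_happiness(h, sentiment, baseline=0.5, lift=0.1, drop=0.15, pull=0.05):
    """Happiness lifts on positive feedback, drops on corrections, and
    slowly regresses to baseline when neutral. Constants are assumptions."""
    if sentiment == "positive":
        return min(1.0, h + lift)
    if sentiment == "correction":
        return max(0.0, h - drop)
    return h + (baseline - h) * pull   # slow regression to baseline
```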

Configuration

Settings live in two places:

  • data/settings.json has model selection and LoRA rank/layers (written by make setup)
  • dai/config.py has all hyperparameters: training (learning rate, batch size, REM cycles), neurochemistry thresholds, memory compression windows, mind intervals, and more
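For a sense of what lives in dai/config.py, here is a hypothetical sketch; every constant name is invented for illustration, though the values 0.6, 1.0, 30%, 7 days, and 30 days appear elsewhere in this README:

```python
# Hypothetical sketch of dai/config.py-style hyperparameters.
# All names are illustrative assumptions; check the real module.
LEARNING_RATE = 1e-5          # assumed LoRA learning rate
BATCH_SIZE = 4                # assumed training batch size
REM_CYCLES = 3                # today / week / month scopes
REPLAY_MIX = 0.3              # fraction of replayed examples per run
CURATION_THRESHOLD = 0.6      # minimum training-value score
ADENOSINE_SLEEP_AT = 1.0      # auto-sleep trigger
COMPRESS_DAILY_AFTER_DAYS = 7   # daily logs -> weekly summaries
COMPRESS_WEEKLY_AFTER_DAYS = 30 # weekly -> monthly summaries
MERGE_EVERY_DAYS = 30           # bake adapter into base weights
```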

Project Structure

```
dai/
  cli/          Chat REPL, commands, display
  inference/    MLX engine, context building, embeddings, tool calling
  memory/       Logging, retrieval (vector + BM25), identity, schema
  mind/         Rumination (directed) + daydreaming (free association)
  replay/       Experience replay buffer for training stability
  sleep/        Curation, compression, LoRA training, self-test, merge
  config.py     All paths, constants, hyperparameters
  neurochemistry.py   Dopamine / cortisol / adenosine system
data/
  daily/        Raw conversation logs (JSONL)
  weekly/       Compressed weekly summaries
  monthly/      Compressed monthly summaries
  training/     Curated training data + replay buffer
  adapters/     Current LoRA adapter weights
  model/        Downloaded base model
  identity.md   What Dai knows about you (builds over time)
  settings.json Model + LoRA config
```

Running Tests

```bash
make test
```

Commands

  • `dai` starts chatting
  • `dai-sleep` manually triggers a sleep cycle
  • `dai-setup` re-runs setup (model selection, directories)
  • `make dev` installs in editable mode with dev dependencies

License

MIT
