In case anyone doesn't read to the end, the Baidu model link goes on the first line: https://pan.baidu.com/s/1sLeSyVp76yzWcR3Q4pX0kA?pwd=0721
100% Local · Fully Private · Zero API Dependencies
All conversations, voice, images, and character animations are generated on your own machine. No cloud servers, no third-party APIs, no risk of data leakage. Your AI girlfriend belongs to you, and only you.
An AI girlfriend project powered by OpenClaw + QQ Bot + Telegram Bot + llama.cpp + GPT-SoVITS + ComfyUI + Sakura Desktop Pet + Live2D, running entirely on your own machine.
Characters: Supports hot-swappable AI girlfriends with isolated memories per character.
Shiki Natsume: from Starry Moonlit Café & the Butterfly of Death. Tall and aloof, with a cool exterior and a hidden warmth. A naturally, quietly dominant type: she takes the lead, teases you gently, and guards you fiercely. She speaks little, but every word hits.
ATRI: from ATRI -My Dear Moments-. Petite, innocent, endlessly curious; a bright-eyed girl who wears her heart on her sleeve. She runs toward the future with a smile, dragging you along. The polar opposite of Natsume: bubbly and expressive where Natsume is reserved, emotionally transparent where Natsume is guarded, playful where Natsume is composed. If Natsume is the cool winter night, ATRI is the warm summer sun.
| | Cloud AI Girlfriend | This Project |
|---|---|---|
| Privacy | Chat logs, voice, and images all stored on vendor servers | Everything stays local; no data leaves your machine |
| Cost | Monthly subscriptions and per-token billing add up | Free, one-time setup, runs forever (bring your own hardware) |
| Network | Needs internet; dead if servers go down | Works offline: turn off your Wi-Fi and keep chatting |
| Control | Prompts/templates controlled by vendor, can change anytime | You control all models, parameters, and character settings |
| Content | Heavy censorship, accounts get banned | No censorship: talk about whatever you want |
| Extensibility | Locked into vendor models and features | Mix and match: swap LLMs, image models, and voice models freely |
QQ Bot: text chat + TTS voice + ComfyUI image generation + character memory
Shiki Natsume Live2D: real-time character animation with emotion-driven motions, lip-sync, and speech bubbles, controlled via a local HTTP bridge.
ATRI: personality opposite of Natsume, hot-swappable with isolated memory.
ATRI Live2D: silver hair, ruby-red eyes, barefoot in a white dress; innocent and expressive.
ATRI ComfyUI: AI image generation of a seaside sunset, flowing white dress, and warm golden-hour lighting.
| Component | Model |
|---|---|
| GPU | NVIDIA GeForce RTX 5070 Laptop (8 GB VRAM) |
| CPU | Intel Core i9-14900HX (24 cores, 32 threads) |
| RAM | 32 GB DDR5 |
| OS | Windows 11 |
- QQ + Telegram Dual Channel: QQ Bot + Telegram Bot integration via OpenClaw Gateway
- TTS Voice Synthesis: local GPT-SoVITS inference, Japanese voice (emotion-matched per dialogue)
- AI Image Generation: local ComfyUI inference, SDXL/Illustrious models
- Sakura Desktop Pet: PySide6 desktop companion with proactive care, screen observation, and local LLM awareness
- Live2D Character Model: real-time Live2D rendering with 10 motion groups, emotion-driven expressions, and speech bubbles
- VRAM Scheduler: automatic llama-server ↔ TTS/ComfyUI orchestration on 8 GB VRAM
- Roleplay Memory: conversation summaries persisted to `memory/role_play/`
- Multi-Character Hot-Swap: switch between AI girlfriends (Natsume ↔ ATRI) with one command; SOUL/IDENTITY/TTS weights/Live2D model all switch automatically, with memories isolated per character
- Character Card Import: auto-detects SillyTavern character cards via `skills/character_importer/`; after import, the agent switches role automatically
- Chat Import: imports SillyTavern JSONL chat logs into `memory/role_play/<character>/`; the agent restores conversation context on role switch
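As a rough illustration of the Chat Import feature, here is a minimal Python sketch that converts a SillyTavern JSONL chat log into a Markdown transcript under `memory/role_play/<character>/`. It assumes the common SillyTavern layout (a metadata object on the first line, then one message per line with `name`, `is_user`, and `mes`). The function names and the output file naming are hypothetical and are not taken from `skills/character_importer/`.

```python
import json
from pathlib import Path


def parse_sillytavern_jsonl(text: str) -> list[dict]:
    """Return chat turns as {"speaker", "is_user", "text"} dicts.

    Assumption: one JSON object per line; the first line is chat metadata and
    message lines carry "name", "is_user", and "mes". Lines without "mes"
    are skipped.
    """
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "mes" not in record:  # metadata header or an unknown record
            continue
        turns.append({
            "speaker": record.get("name", "unknown"),
            "is_user": bool(record.get("is_user", False)),
            "text": record["mes"],
        })
    return turns


def import_chat(jsonl_path: Path, memory_root: Path, character: str) -> Path:
    """Write the parsed chat as Markdown under memory_root/<character>/."""
    turns = parse_sillytavern_jsonl(jsonl_path.read_text(encoding="utf-8"))
    out_dir = memory_root / character
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"{jsonl_path.stem}.md"  # hypothetical naming scheme
    lines = [f"**{t['speaker']}**: {t['text']}" for t in turns]
    out_file.write_text("\n\n".join(lines) + "\n", encoding="utf-8")
    return out_file
```

Calling `import_chat(Path("chat.jsonl"), Path("memory/role_play"), "natsume")` would produce `memory/role_play/natsume/chat.md`; the real importer may store the log in a different shape.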
All models hosted on HuggingFace: TAOTAO777/ai-girlfriend-natsume
See models.yaml for full details.
| Model | Purpose | Size |
|---|---|---|
| Qwen3.6-35B-A3B-APEX-I-Compact (Q4_K GGUF) | Chat LLM | 16.11 GB |
| WAI-Nsfw-Illustrious-17 | ComfyUI generation (default) | 6.46 GB |
| miaomiaoHarem_v20 | ComfyUI generation (backup) | 6.46 GB |
| GPT-SoVITS voice weights | TTS voice synthesis | ~303 MB |
```bash
# Install huggingface-cli: pip install huggingface_hub
huggingface-cli login
# Download all models
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --local-dir ./models
# Or download individual components (folders are selected with --include patterns):
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --include "llm/*" --local-dir ./models
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --include "comfyui-checkpoints/*" --local-dir ./checkpoints
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --include "gpt-sovits-weights/*" --local-dir ./gpt-sovits-weights
```

- Run `quick_setup.ps1`: an interactive wizard that generates `config.yaml` with your local paths
- (Alternative) Copy `config.example.yaml` → `config.yaml` and edit manually
- Place downloaded model files according to `models.yaml`, then update the paths in `config.yaml`
All Python/PS scripts read paths from `config.yaml`, so there are no hardcoded paths to edit.
⚠️ Disclaimer: All models are community open-source. This project only provides non-profit mirror distribution. Copyright belongs to the original authors.
Running Qwen3.6-35B-A3B (MoE, Q4_K, 16.10 GiB, 34.66B params) via llama.cpp (b8851-b9222).
```powershell
llama-server.exe `
  -m "Qwen3.6-35B-A3B-uncensored-heretic-APEX-I-Compact.gguf" `
  -c 120000 `
  --flash-attn on -ctk q8_0 -ctv q8_0 `
  -ngl 41 --cpu-moe --cpu-mask 0xFFFFFFFF `
  --batch-size 4096 --ubatch-size 2048 --threads 24 `
  --api-key *** -rea off --jinja `
  --cache-ram 2048 --parallel 1 `
  --kv-unified --no-mmap
```

| Metric | Value | Notes |
|---|---|---|
| VRAM Usage | ~4.6 GiB (model) + ~1.2 GiB (KV cache) | ~2 GB free on 8 GB VRAM |
| Prefill Speed | 960 ~ 1390 t/s | 120K context, batch-size 4096 |
| Token Generation | 31 ~ 39 t/s | MoE architecture, 8/256 experts |
| Context Limit | 120K tokens | Full reprocess of ~59k tokens takes ~55s |
| Model Load Time | ~12s | `--no-mmap`, requires sufficient RAM |
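The figures in the table can be cross-checked with a little arithmetic. The sketch below assumes prefill throughput is roughly constant over the prompt, which is a simplification; the helper names are illustrative only.

```python
def implied_rate(prompt_tokens: int, seconds: float) -> float:
    """Tokens per second implied by a measured full reprocess."""
    return prompt_tokens / seconds


def prefill_seconds(prompt_tokens: int, tokens_per_second: float) -> float:
    """Estimated prefill time, assuming a constant throughput."""
    return prompt_tokens / tokens_per_second


# ~59k tokens in ~55 s works out to about 1073 t/s,
# which sits inside the measured 960 ~ 1390 t/s range.
rate = implied_rate(59_000, 55)
print(f"implied prefill rate: {rate:.0f} t/s")

# Under the same assumption, a completely full 120K context would need
# somewhere between these two bounds to reprocess from scratch.
fastest = prefill_seconds(120_000, 1390)
slowest = prefill_seconds(120_000, 960)
print(f"full 120K reprocess: {fastest:.0f} s to {slowest:.0f} s")
```

That gap between a short and a full context is the reason for keeping the live context small.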
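Qwen3.6 MoE uses SSM (Gated Delta Net) hybrid attention with `--kv-unified`. Because of the SSM layers, llama-server cannot reuse the prompt cache across turns, so a long context is reprocessed in full (about 55s for ~59k tokens).
Mitigations:
- Periodic `/reset` (Natsume writes roleplay summaries to `memory/role_play/` before resetting)
- Restore context from summaries on startup, keeping the actual token count in the 5K to 20K range
- `config-patch.json` sets OpenClaw `contextWindow` to 262144 to match model capacity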
```
8 GB Total VRAM
├── llama-server resident: ~5.8 GB (model 4.6G + KV cache 1.2G)
└── Free: ~2.2 GB
    │
    ├── TTS inference: stop llama → ~8 GB free → resume llama (~70s)
    └── ComfyUI generation: stop llama → ~8 GB free → resume llama (~120s)
```
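The budget above reduces to simple arithmetic. The sketch below uses only the figures stated in this README; treating a checkpoint's file size as a lower bound on the VRAM it needs is an assumption made for illustration.

```python
TOTAL_VRAM_GB = 8.0
LLAMA_MODEL_GB = 4.6     # resident model layers
LLAMA_KV_CACHE_GB = 1.2  # KV cache


def free_vram_with_llama_resident() -> float:
    """VRAM left over while llama-server stays loaded."""
    return TOTAL_VRAM_GB - (LLAMA_MODEL_GB + LLAMA_KV_CACHE_GB)


def must_stop_llama(job_vram_gb: float) -> bool:
    """True when a GPU job cannot fit next to the resident llama-server."""
    return job_vram_gb > free_vram_with_llama_resident()


print(f"free with llama resident: {free_vram_with_llama_resident():.1f} GB")
# A 6.46 GB image checkpoint alone exceeds the free headroom,
# so image generation has to evict llama-server first.
print("6.46 GB checkpoint needs llama stopped:", must_stop_llama(6.46))
```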
```
AI_Girlfriend/                          # OpenClaw workspace root
├── start.ps1                           # One-click launch: llama + Live2D + Gateway
├── quick_setup.ps1                     # Interactive path config wizard
├── config.yaml                         # Generated config
├── download-models.ps1                 # One-click model download (Windows)
├── download-models.sh                  # One-click model download (Linux/macOS)
├── setup-llama.ps1                     # Auto-detect HW + configure llama.cpp (Win)
├── setup-llama.sh                      # Auto-detect HW + configure llama.cpp (Linux/macOS)
├── setup-openclaw.ps1                  # One-click OpenClaw install + deploy (Win)
├── setup-openclaw.sh                   # One-click OpenClaw install + deploy (Linux/macOS)
├── setup-all.ps1                       # All-in-One mega script (Windows)
├── setup-all.sh                        # All-in-One mega script (Linux/macOS)
├── config-qqbot.json                   # QQ Bot config patch
├── config-telegram.json                # Telegram Bot config patch
├── config-patch.json                   # OpenClaw LLM config patch
├── AGENTS.md                           # Agent behavior rules
├── SOUL.md                             # Character personality
├── IDENTITY.md                         # Character identity
├── USER.md                             # User info
├── HEARTBEAT.md                        # Heartbeat config
├── TOOLS.md                            # Tool quick reference
├── models.yaml                         # Model catalog + download links
├── README.md                           # This file
├── .gitignore
├── live2d/                             # Live2D character model (Cubism 4 Core)
│   ├── index.html                      # Browser frontend (standalone window)
│   ├── embed.html                      # Embeddable version
│   ├── live2dcubismcore.min.js         # Cubism Core 4 (207 KB)
│   ├── plid-v5-bundle.js               # pixi-live2d-display v0.5.0 bundle
│   ├── live2d-bridge.mjs               # HTTP (19200) + WebSocket (19201) bridge
│   ├── pixi.min.js, pixi-shim.js       # PIXI.js v7 rendering
│   ├── model/shiki_natsume/            # Shiki Natsume model files
│   ├── media/                          # Generated screenshots
│   └── _archive/                       # Debug artifacts
├── ren_pro_jp/                         # Ren'Py dialog engine (planned)
├── memory/                             # [.gitignore] Runtime memory
│   └── role_play/                      # Roleplay conversation logs
├── media/                              # [.gitignore] Generated media
│   ├── audio/                          # TTS voice output
│   ├── images/                         # ComfyUI image output
│   └── *.gif                           # README demo GIFs
├── docs/
│   ├── telegram-setup.md               # Telegram Bot setup guide
│   └── qqbot-setup.md                  # QQ Bot setup guide
└── skills/
    ├── live2d/                         # Live2D control skill
    │   ├── SKILL.md                    # Live2D API invocation guide
    │   ├── scripts/start-live2d.ps1    # Live2D launcher
    │   └── media/                      # Shared media output
    ├── tts/
    │   ├── SKILL.md                    # TTS invocation guide
    │   ├── run_tts.ps1                 # TTS launcher script
    │   ├── tts_call.py                 # GPT-SoVITS inference
    │   └── ref_wavs/                   # Reference audio clips
    ├── comfyui/
    │   ├── SKILL.md                    # ComfyUI invocation guide
    │   ├── run_comfyui.ps1             # ComfyUI launcher script
    │   ├── comfyui_call.py             # ComfyUI inference
    │   ├── prompt_template.md          # Character prompt template
    │   └── custom_prompt.txt           # Custom extra prompt
    ├── sakura/                         # Sakura Desktop Pet (PySide6 GUI)
    │   ├── SKILL.md                    # Sakura skill documentation
    │   ├── main.py                     # Application entry point
    │   ├── install.bat                 # Windows dependency installer
    │   ├── start.bat                   # Windows launcher
    │   └── app/                        # Source code
    ├── llama-management.md             # VRAM management architecture doc
    ├── llama-watchdog.ps1              # Llama health check
    ├── cleanup_orphans.ps1             # Orphan process cleanup
    └── character_importer/             # SillyTavern character card auto-import
```
| Skill | Type | Llama Kill? | Mechanism |
|---|---|---|---|
| Live2D | HTTP exec | No | Direct HTTP calls to the `localhost:19200` bridge |
| TTS | sessions_spawn | Yes | Kill → GPT-SoVITS → restart llama |
| ComfyUI | sessions_spawn | Yes | Kill → image gen → restart llama |
| Sakura | Shared llama-client | No | Detects llama down → waits → auto-resumes |
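The "detects llama down → waits → auto-resumes" behaviour can be approximated by polling llama-server's `/health` endpoint. The sketch below is not Sakura's actual client: the function names, the timeout defaults, and the `127.0.0.1:8080` address are assumptions, and the probe is injectable so the loop can be exercised without a running server.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def llama_is_healthy(base_url: str = "http://127.0.0.1:8080",
                     timeout: float = 2.0) -> bool:
    """True when GET /health answers 200; any error counts as 'down'."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and non-2xx answers such as 503.
        return False


def wait_until_healthy(probe: Callable[[], bool] = llama_is_healthy,
                       timeout_s: float = 180.0,
                       interval_s: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep,
                       clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll the probe until it succeeds or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A client would call `wait_until_healthy()` after a failed request and retry once it returns `True`.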
| Component | Version / Source | Purpose |
|---|---|---|
| OpenClaw | latest | AI Agent Gateway |
| QQ Bot | OpenClaw qqbot channel | QQ message relay |
| Telegram Bot | OpenClaw telegram channel | Telegram message relay |
| llama.cpp | b9222 | Local LLM inference server |
| GPT-SoVITS v2 | v2pro-20250604 | TTS voice synthesis |
| ComfyUI | aki-v3 | Image generation engine |
| Sakura Desktop Pet | v0.9.6-dev | Desktop companion GUI |
| pixi-live2d-display | v0.5.0 | Live2D WebGL renderer |
| Live2D Cubism Core | 4.x (CDN: cubism.live2d.com/sdk-web/cubismcore/) | Live2D physics/animation |
| Python | 3.12+ | Runtime (Sakura + TTS + ComfyUI) |
One command, from scratch to a fully functional AI girlfriend:
Windows:

```powershell
powershell -File setup-all.ps1
```

Linux / macOS:

```bash
bash setup-all.sh
```

Automated pipeline: environment check → model download → llama.cpp setup → OpenClaw install → Sakura desktop pet → workspace deploy → path check → launch → verify.

Supports resuming from a breakpoint. Flags: `--skip-model-download`, `--skip-llama-setup`, `--skip-openclaw-setup`, `--skip-sakura-setup`, `--dry-run`, `--no-start`
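How the setup scripts implement resume is not shown in this README. As a generic illustration of the idea, the sketch below records each finished step in a state file and skips it on the next run; the state file, step names, and `run_pipeline` itself are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable

Step = tuple[str, Callable[[], None]]


def run_pipeline(steps: list[Step], state_file: Path,
                 skip: tuple[str, ...] = ()) -> list[str]:
    """Run steps in order, skipping finished or explicitly skipped ones.

    A step that raises aborts the run; steps completed before it stay
    recorded, so the next invocation resumes at the failed step.
    """
    done = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    executed = []
    for name, action in steps:
        if name in done or name in skip:
            continue
        action()
        done.add(name)
        state_file.write_text(json.dumps(sorted(done)))  # checkpoint after each step
        executed.append(name)
    return executed
```

The `skip` argument plays the role of the `--skip-*` flags: a skipped step is neither run nor marked as done.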
Install OpenClaw Gateway and deploy the AI Girlfriend workspace:
Windows:

```powershell
powershell -File setup-openclaw.ps1
```

Linux / macOS:

```bash
bash setup-openclaw.sh
```

This script installs Node.js and OpenClaw Gateway, deploys the workspace files, installs the daemon, and applies the config patch.

Flags: `--skip-node`, `--skip-deploy`, `--skip-daemon`, `--no-onboard`
Windows:

```powershell
pip install huggingface_hub
huggingface-cli login
powershell -File download-models.ps1
```

Linux / macOS:

```bash
pip install huggingface_hub
huggingface-cli login
bash download-models.sh
```

Downloads all 5 model files (~31.7 GB) from HuggingFace with progress reporting and resume support.
Auto-detects GPU, VRAM, CPU cores, RAM and generates optimized launch configs.
Windows:

```powershell
powershell -File setup-llama.ps1
```

Linux / macOS:

```bash
bash setup-llama.sh
```

```powershell
powershell -File quick_setup.ps1
```

Interactive wizard: enter your local paths once, and all scripts are updated automatically.
```powershell
# One-click start all services (llama + Live2D + Gateway)
powershell -File start.ps1
```

```powershell
# Start the bridge
Start-Process node -ArgumentList "live2d-bridge.mjs" -WorkingDirectory live2d -WindowStyle Hidden
# Open in standalone window (Chrome app mode)
Start-Process chrome -ArgumentList "--new-window --app=http://localhost:19200/index.html --window-size=450,650"
```

Live2D runs in a frameless Chrome window, so you can place it anywhere on your desktop.
```powershell
# Llama health check (every 10 min)
schtasks /create /tn "llama-watchdog" `
  /tr "powershell -File C:\Users\<you>\.openclaw\workspace\skills\llama-watchdog.ps1" `
  /sc minute /mo 10
# Orphan process cleanup (hourly)
schtasks /create /tn "cleanup-orphans" `
  /tr "powershell -File C:\Users\<you>\.openclaw\workspace\skills\cleanup_orphans.ps1" `
  /sc hourly /mo 1
```

```
User (QQ / Telegram) ────── Sakura Desktop Pet (PySide6)
         │                            │
         ▼                            ▼
  OpenClaw Gateway          Live2D Bridge (:19200)
      │      ▲                        │
      ▼      │                        ▼
┌────── llama-server :8080 ──────┐  Browser (Live2D model)
│         (Qwen3.6-35B)          │
├────────────────────────────────┤
│ Main session (roleplay)        │
│ TTS (kill → GPU → restart)     │
│ ComfyUI (kill → GPU → restart) │
│ Live2D (HTTP → no kill needed) │
└────────────────────────────────┘
```
Agent Hub (immutable capability instructions):

```
┌─────────────┐
│ AGENTS.md   │ ← Capability hub (never changes on role switch)
│ SOUL.md     │ ← Current character persona (hot-swappable)
│ IDENTITY.md │ ← Character metadata
│ TOOLS.md    │ ← Quick reference
│ USER.md     │ ← User profile
└──────┬──────┘
       │
  ┌────┴──────────────────────┐
  ▼                           ▼
┌─────────────────┐  ┌────────────────────────────────┐
│ skills/harem/   │  │ memory/role_play/              │
│ (harem archive) │  │ <character>/ (isolated memory) │
│ ├─ natsume/     │  │ ├─ natsume/*.md                │
│ └─ enola/       │  │ └─ enola/*.md                  │
└─────────────────┘  └────────────────────────────────┘
```
- `AGENTS.md` stays constant across role switches, so the ComfyUI / TTS / Live2D instructions are preserved
- `SOUL.md` + `IDENTITY.md` are overwritten on switch; the harem archive is the source of truth
- Memory per character is isolated in `memory/role_play/<name>/` and never cross-contaminated
- SillyTavern character cards are imported via PNG tEXt chunk parsing, after which the agent persona switches automatically
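PNG tEXt chunk parsing can be done with the standard library alone. The sketch below assumes the widely used card convention of a tEXt chunk with the keyword `chara` holding base64-encoded JSON; the function names are illustrative and are not taken from `skills/character_importer/`. It does not verify chunk CRCs.

```python
import base64
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_text_chunks(png_bytes: bytes):
    """Yield (keyword, raw_text) for every tEXt chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length
        if ctype == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            yield keyword.decode("latin-1"), text
        elif ctype == b"IEND":
            break


def read_character_card(png_bytes: bytes, keyword: str = "chara") -> dict:
    """Decode the embedded character card (assumed: base64 of UTF-8 JSON)."""
    for key, text in iter_text_chunks(png_bytes):
        if key == keyword:
            return json.loads(base64.b64decode(text).decode("utf-8"))
    raise ValueError(f"no tEXt chunk with keyword {keyword!r}")
```

With the card decoded into a dict, fields such as the character's name and description could be mapped onto `SOUL.md` and `IDENTITY.md`; that mapping is project-specific and not shown here.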
Four Skills, One Brain:
| Skill | Location | Llama Interaction |
|---|---|---|
| Live2D | `skills/live2d/` | HTTP API only; never touches llama |
| TTS | `skills/tts/` | Kill llama → GPT-SoVITS → restart + wait for `/health` |
| ComfyUI | `skills/comfyui/` | Kill llama → image gen → restart + wait for `/health` |
| Sakura | `skills/sakura/` | Shared llama-client; detects down → auto-resume |
| Character Importer | `skills/character_importer/` | Agent-level; no GPU needed; writes SOUL/IDENTITY + memory dir |
VRAM Orchestration Flow:
- Main session receives user request → assembles command
- `sessions_spawn(mode="run")` creates a local model sub-session
- Sub-session execs PS script → `stop_llama()` kills llama-server
- Full 8 GB VRAM freed → TTS/ComfyUI inference
- `start_llama()` restarts llama-server (~12s load + ~3s warmup)
- Live2D remains active during the entire cycle, because the bridge doesn't touch the GPU
- Sub-session writes `.task_flags` → announces back to main session
- Main session reads media files → sends via `<qqmedia>` / `MEDIA:`
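The stop → GPU job → restart cycle maps naturally onto a context manager. In this sketch the real process control is passed in as callables, because the project's `stop_llama()` and `start_llama()` live in PowerShell scripts; `llama_paused` and `run_gpu_job` are hypothetical names for illustration.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def llama_paused(stop_llama: Callable[[], None],
                 start_llama: Callable[[], None],
                 wait_healthy: Callable[[], bool]) -> Iterator[None]:
    """Free the GPU for the duration of the block, then bring llama back."""
    stop_llama()
    try:
        yield
    finally:
        # Always restart, even if the GPU job failed, so chat can resume.
        start_llama()
        if not wait_healthy():
            raise RuntimeError("llama-server did not become healthy after restart")


def run_gpu_job(job: Callable[[], T], *,
                stop_llama: Callable[[], None],
                start_llama: Callable[[], None],
                wait_healthy: Callable[[], bool]) -> T:
    """Run a TTS or image job with the full VRAM available."""
    with llama_paused(stop_llama, start_llama, wait_healthy):
        return job()
```

Putting the restart in `finally` is the key design choice: a crashed TTS or ComfyUI run should not leave the chat model offline.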
- RTX 50xx (Blackwell) + CUDA 13.x = `munmap_chunk(): invalid pointer` crash. CUDA 13.x has a known memory management incompatibility with llama.cpp on Blackwell GPUs. Solution: use pre-built llama.cpp binaries compiled with CUDA 12.x (not self-compiled with CUDA 13.x). Download from llama.cpp Releases and choose `cudart-llama-bin-win-cuda-12.4-x64.zip`. RTX 5070 Ti is fully compatible with CUDA 12.x drivers.
- Llama-server is offline for ~60 to 120s during TTS/ComfyUI inference; conversation pauses, but Live2D keeps running
- Sub-sessions use the local model (same as main), with DeepSeek as an optional fallback
- Llama-server does not support cross-turn prompt cache reuse (SSM limitation); use periodic `/reset`
- Live2D requires Cubism Core 4 (not 5 or 6): pixi-live2d-display v0.5.0 is built for the Cubism 4 Framework, and Core 5+ causes clipping/layer failures
- All model files are protected by `.gitignore`
- GPT-SoVITS weights are self-trained and not distributed; train with your own voice data
- @Rvosy: creator of Sakura Desktop Pet, authorized for inclusion (Issue #38)
- @guansss: creator of pixi-live2d-display
- Live2D Inc.: Cubism SDK (non-commercial use)



