momori777/Artemis

In case anyone can't be bothered to read to the end, the Baidu Netdisk link for the models goes on the first line: https://pan.baidu.com/s/1sLeSyVp76yzWcR3Q4pX0kA?pwd=0721

AI Girlfriend

100% Local · Fully Private · Zero API Dependencies

All conversations, voice, images, and character animations are generated on your own machine. No cloud servers, no third-party APIs, no risk of data leakage. Your AI girlfriend belongs to you, and only you.


An AI girlfriend project powered by OpenClaw + QQ Bot + Telegram Bot + llama.cpp + GPT-SoVITS + ComfyUI + Sakura Desktop Pet + Live2D, running entirely on your own machine.

Characters: Supports hot-swappable AI girlfriends with isolated memories per character.

Shiki Natsume (四季夏目)

From Starry Moonlit Café & the Butterfly of Death. Tall and aloof, with a cool exterior and a hidden warmth. A natural, quietly dominant type: she takes the lead, teases you gently, and guards you fiercely. Speaks little, but every word hits.

ATRI (亚托莉)

From ATRI -My Dear Moments-. Petite, innocent, endlessly curious: a bright-eyed girl who wears her heart on her sleeve. Runs toward the future with a smile, dragging you along. The polar opposite of Natsume: bubbly and expressive where Natsume is reserved, emotionally transparent where Natsume is guarded, playful where Natsume is composed. If Natsume is the cool winter night, ATRI is the warm summer sun.

✨ Why This Project?

| | Cloud AI Girlfriend | This Project |
|---|---|---|
| 🛡️ Privacy | Chat logs, voice, and images are all stored on vendor servers | Everything stays local; no data leaves your machine |
| 💰 Cost | Monthly subscriptions and per-token billing add up | Free after a one-time setup (bring your own hardware) |
| 🌐 Network | Needs internet; dead if the servers go down | Works offline: turn off your WiFi and keep chatting |
| 🎛️ Control | Prompts and templates are controlled by the vendor and can change anytime | You control all models, parameters, and character settings |
| 🔞 Content | Heavy censorship; accounts get banned | No censorship: talk about whatever you want |
| 🎨 Extensibility | Locked into vendor models and features | Mix and match: swap LLMs, image models, and voice models freely |

🎬 Demo

Multi-Channel Chat

QQ Bot Demo

👆 QQ Bot: text chat + TTS voice + ComfyUI image generation + character memory

Live2D Desktop Pet

Live2D Demo

👆 Shiki Natsume Live2D: real-time character animation with emotion-driven motions, lip-sync, and speech bubbles, controlled via a local HTTP bridge.

⭐ ATRI: Second AI Girlfriend

Personality opposite of Natsume, hot-swappable with isolated memory.

ATRI Live2D

👆 ATRI Live2D: silver hair, ruby-red eyes, barefoot in a white dress; innocent and expressive.

ATRI ComfyUI

👆 ATRI ComfyUI: AI image generation of a seaside sunset, a flowing white dress, and warm golden-hour lighting.

Hardware

| Component | Model |
|---|---|
| GPU | NVIDIA GeForce RTX 5070 Laptop (8 GB VRAM) |
| CPU | Intel Core i9-14900HX (24 cores, 32 threads) |
| RAM | 32 GB DDR5 |
| OS | Windows 11 |

Features

  • 💬 QQ + Telegram Dual Channel: QQ Bot + Telegram Bot integration via OpenClaw Gateway
  • 🎤 TTS Voice Synthesis: Local GPT-SoVITS inference, Japanese voice (emotion-matched per dialogue)
  • 🎨 AI Image Generation: Local ComfyUI inference, SDXL/Illustrious models
  • 🖥️ Sakura Desktop Pet: PySide6 desktop companion with proactive care, screen observation, and local LLM awareness
  • 🎭 Live2D Character Model: Real-time Live2D rendering with 10 motion groups, emotion-driven expressions, and speech bubbles
  • 🧠 VRAM Scheduler: Automatic llama-server ↔ TTS/ComfyUI orchestration on 8 GB VRAM
  • 💾 Roleplay Memory: Conversation summaries persisted to memory/role_play/
  • 🔄 Multi-Character Hot-Swap: Switch between AI girlfriends (Natsume ⇄ ATRI) with one command; SOUL/IDENTITY/TTS weights/Live2D model all switch automatically, and memories are isolated per character
  • 🃏 Character Card Import: Auto-detects SillyTavern character cards via skills/character_importer/; after import, the agent switches role automatically
  • 💬 Chat Import: Imports SillyTavern JSONL chat logs into memory/role_play/<character>/; the agent restores conversation context on role switch
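As a rough illustration of the Chat Import feature, the sketch below converts a SillyTavern-style JSONL export into a single Markdown file under memory/role_play/<character>/. It is not the project's importer: the function name `import_chat`, the output file name `imported_chat.md`, and the Markdown layout are hypothetical, and it assumes each message line carries `name`, `is_user`, and `mes` fields, as SillyTavern exports commonly do.

```python
import json
from pathlib import Path


def import_chat(jsonl_text: str, character: str, memory_root: Path) -> Path:
    """Convert a SillyTavern-style JSONL chat log into one Markdown memory file."""
    lines = []
    for raw in jsonl_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        record = json.loads(raw)
        # The first line of an export is usually metadata without message text.
        if "mes" not in record:
            continue
        speaker = "User" if record.get("is_user") else record.get("name", character)
        lines.append(f"**{speaker}:** {record['mes']}")

    target_dir = memory_root / character        # one directory per character
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "imported_chat.md"    # hypothetical file name
    target.write_text("\n\n".join(lines) + "\n", encoding="utf-8")
    return target
```

Because every character writes into its own subdirectory, importing a log for one character cannot touch another character's memory.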

Models

All models hosted on HuggingFace: TAOTAO777/ai-girlfriend-natsume

See models.yaml for full details.

| Model | Purpose | Size |
|---|---|---|
| Qwen3.6-35B-A3B-APEX-I-Compact (Q4_K GGUF) | Chat LLM | 16.11 GB |
| WAI-Nsfw-Illustrious-17 | ComfyUI generation (default) | 6.46 GB |
| miaomiaoHarem_v20 | ComfyUI generation (backup) | 6.46 GB |
| GPT-SoVITS voice weights | TTS voice synthesis | ~303 MB |

One-command Download

# Install huggingface-cli: pip install huggingface_hub
huggingface-cli login

# Download all models
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --local-dir ./models

# Or download individual components (--include selects a folder by glob pattern):
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --include "llm/*" --local-dir ./models
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --include "comfyui-checkpoints/*" --local-dir ./checkpoints
huggingface-cli download TAOTAO777/ai-girlfriend-natsume --include "gpt-sovits-weights/*" --local-dir ./gpt-sovits-weights

Local Configuration

  1. Run quick_setup.ps1, an interactive wizard that generates config.yaml with your local paths
  2. (Alternative) Copy config.example.yaml to config.yaml and edit it manually
  3. Place the downloaded model files according to models.yaml, then update the paths in config.yaml

All Python/PS scripts read paths from config.yaml, so there are no hardcoded paths to edit.

⚠️ Disclaimer: All models are community open-source releases. This project only provides non-profit mirror distribution. Copyright belongs to the original authors.

Local LLM Performance

Running Qwen3.6-35B-A3B (MoE, Q4_K, 16.10 GiB, 34.66B params) via llama.cpp (b8851-b9222).

Launch Command

llama-server.exe `
  -m "Qwen3.6-35B-A3B-uncensored-heretic-APEX-I-Compact.gguf" `
  -c 120000 `
  --flash-attn on -ctk q8_0 -ctv q8_0 `
  -ngl 41 --cpu-moe --cpu-mask 0xFFFFFFFF `
  --batch-size 4096 --ubatch-size 2048 --threads 24 `
  --api-key *** -rea off --jinja `
  --cache-ram 2048 --parallel 1 `
  --kv-unified --no-mmap

Key Metrics

| Metric | Value | Notes |
|---|---|---|
| VRAM Usage | ~4.6 GiB (model) + ~1.2 GiB (KV cache) | ~2 GB free on 8 GB VRAM |
| Prefill Speed | 960–1390 t/s | 120K context, batch-size 4096 |
| Token Generation | 31–39 t/s | MoE architecture, 8/256 experts |
| Context Limit | 120K tokens | Full reprocess of ~59k tokens takes ~55s |
| Model Load Time | ~12s | --no-mmap, requires sufficient RAM |

Long Context Stability

Qwen3.6 MoE uses SSM (Gated Delta Net) hybrid attention with --kv-unified.

⚠️ Known Limitation: Cross-turn prompt cache reuse is not supported (SSM architecture limitation). Each request triggers full context re-processing, so longer conversations mean higher first-token latency (~55s for 59k tokens).

Mitigations:

  • Periodic /reset (Natsume writes roleplay summaries to memory/role_play/ before resetting)
  • Restore context from summaries on startup, keeping the actual token count in the 5K–20K range
  • config-patch.json sets OpenClaw contextWindow to 262144 to match model capacity
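To make the latency trade-off concrete, here is a small back-of-the-envelope calculation based only on the figure quoted above (~59k tokens reprocessed in ~55s). It assumes re-processing time scales linearly with context length, which is a simplification rather than a measured property of this setup; the function names are illustrative.

```python
# Measured figure from this README: ~59k tokens reprocessed in ~55 s.
MEASURED_TOKENS = 59_000
MEASURED_SECONDS = 55.0


def effective_prefill_rate() -> float:
    """Tokens per second implied by the full-reprocess measurement."""
    return MEASURED_TOKENS / MEASURED_SECONDS


def first_token_latency(context_tokens: int) -> float:
    """Estimated seconds spent re-processing the context before the first token.

    Assumes linear scaling with context length (a simplification).
    """
    return context_tokens / effective_prefill_rate()


if __name__ == "__main__":
    for tokens in (5_000, 20_000, 59_000, 120_000):
        print(f"{tokens:>7} tokens -> ~{first_token_latency(tokens):.0f} s")
```

The implied rate is about 1,070 t/s, inside the 960–1390 t/s prefill range reported above. Under the linear assumption, keeping the context in the 5K–20K range corresponds to roughly 5 to 19 seconds before the first token, versus close to two minutes at the full 120K.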

VRAM Budget

8 GB Total VRAM
├── llama-server resident: ~5.8 GB (model 4.6G + KV cache 1.2G)
├── Free: ~2.2 GB
│
├── TTS inference: stop llama → ~8 GB free → resume llama (~70s)
└── ComfyUI generation: stop llama → ~8 GB free → resume llama (~120s)

Directory Structure

AI_Girlfriend/                        # OpenClaw workspace root
├── start.ps1                         # 🚀 One-click launch: llama + Live2D + Gateway
├── quick_setup.ps1                   # 🛠 Interactive path config wizard
├── config.yaml                       # Generated config
├── download-models.ps1               # One-click model download (Windows)
├── download-models.sh                # One-click model download (Linux/macOS)
├── setup-llama.ps1                   # Auto-detect HW + configure llama.cpp (Win)
├── setup-llama.sh                    # Auto-detect HW + configure llama.cpp (Linux/macOS)
├── setup-openclaw.ps1                # One-click OpenClaw install + deploy (Win)
├── setup-openclaw.sh                 # One-click OpenClaw install + deploy (Linux/macOS)
├── setup-all.ps1                     # 🚀 All-in-One mega script (Windows)
├── setup-all.sh                      # 🚀 All-in-One mega script (Linux/macOS)
├── config-qqbot.json                 # QQ Bot config patch
├── config-telegram.json              # Telegram Bot config patch
├── config-patch.json                 # OpenClaw LLM config patch
├── AGENTS.md                         # Agent behavior rules
├── SOUL.md                           # Character personality
├── IDENTITY.md                       # Character identity
├── USER.md                           # User info
├── HEARTBEAT.md                      # Heartbeat config
├── TOOLS.md                          # Tool quick reference
├── models.yaml                       # Model catalog + download links
├── README.md                         # This file
├── .gitignore
├── live2d/                           # Live2D character model (Cubism 4 Core)
│   ├── index.html                    # Browser frontend (standalone window)
│   ├── embed.html                    # Embeddable version
│   ├── live2dcubismcore.min.js       # Cubism Core 4 (207 KB)
│   ├── plid-v5-bundle.js             # pixi-live2d-display v0.5.0 bundle
│   ├── live2d-bridge.mjs             # HTTP (19200) + WebSocket (19201) bridge
│   ├── pixi.min.js, pixi-shim.js     # PIXI.js v7 rendering
│   ├── model/shiki_natsume/          # Shiki Natsume model files
│   ├── media/                        # Generated screenshots
│   └── _archive/                     # Debug artifacts
├── ren_pro_jp/                       # Ren'Py dialog engine (planned)
├── memory/                           # [.gitignore] Runtime memory
│   └── role_play/                    # Roleplay conversation logs
├── media/                            # [.gitignore] Generated media
│   ├── audio/                        # TTS voice output
│   ├── images/                       # ComfyUI image output
│   └── *.gif                         # README demo GIFs
├── docs/
│   ├── telegram-setup.md             # Telegram Bot setup guide
│   └── qqbot-setup.md                # QQ Bot setup guide
└── skills/
    ├── live2d/                       # 🆕 Live2D control skill
    │   ├── SKILL.md                  # Live2D API invocation guide
    │   ├── scripts/start-live2d.ps1  # Live2D launcher
    │   └── media/                    # Shared media output
    ├── tts/
    │   ├── SKILL.md                  # TTS invocation guide
    │   ├── run_tts.ps1               # TTS launcher script
    │   ├── tts_call.py               # GPT-SoVITS inference
    │   └── ref_wavs/                 # Reference audio clips
    ├── comfyui/
    │   ├── SKILL.md                  # ComfyUI invocation guide
    │   ├── run_comfyui.ps1           # ComfyUI launcher script
    │   ├── comfyui_call.py           # ComfyUI inference
    │   ├── prompt_template.md        # Character prompt template
    │   └── custom_prompt.txt         # Custom extra prompt
    ├── sakura/                       # Sakura Desktop Pet (PySide6 GUI)
    │   ├── SKILL.md                  # Sakura skill documentation
    │   ├── main.py                   # Application entry point
    │   ├── install.bat               # Windows dependency installer
    │   ├── start.bat                 # Windows launcher
    │   └── app/                      # Source code
    ├── llama-management.md           # VRAM management architecture doc
    ├── llama-watchdog.ps1            # Llama health check
    ├── cleanup_orphans.ps1           # Orphan process cleanup
    └── character_importer/           # SillyTavern character card auto-import

Skills Overview

| Skill | Type | Llama Kill? | Mechanism |
|---|---|---|---|
| Live2D | HTTP exec | ❌ No | Direct HTTP calls to localhost:19200 bridge |
| TTS | sessions_spawn | ✅ Yes | Kill → GPT-SoVITS → restart llama |
| ComfyUI | sessions_spawn | ✅ Yes | Kill → image gen → restart llama |
| Sakura | Shared llama-client | ❌ No | Detects llama down → waits → auto-resumes |

Prerequisites

| Component | Version / Source | Purpose |
|---|---|---|
| OpenClaw | latest | AI Agent Gateway |
| QQ Bot | OpenClaw qqbot channel | QQ message relay |
| Telegram Bot | OpenClaw telegram channel | Telegram message relay |
| llama.cpp | b9222 | Local LLM inference server |
| GPT-SoVITS v2 | v2pro-20250604 | TTS voice synthesis |
| ComfyUI | aki-v3 | Image generation engine |
| Sakura Desktop Pet | v0.9.6-dev | Desktop companion GUI |
| pixi-live2d-display | v0.5.0 | Live2D WebGL renderer |
| Live2D Cubism Core | 4.x (CDN: cubism.live2d.com/sdk-web/cubismcore/) | Live2D physics/animation |
| Python | 3.12+ | Runtime (Sakura + TTS + ComfyUI) |

Quick Start

🚀 All-in-One (Recommended)

One command, from scratch to a fully functional AI girlfriend:

Windows:

powershell -File setup-all.ps1

Linux / macOS:

bash setup-all.sh

Automated pipeline: environment check → model download → llama.cpp setup → OpenClaw install → Sakura desktop pet → workspace deploy → path check → launch → verify.

If a run is interrupted, it resumes from the last completed step. Flags: --skip-model-download, --skip-llama-setup, --skip-openclaw-setup, --skip-sakura-setup, --dry-run, --no-start
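For readers curious how a resumable setup pipeline of this kind can work, the sketch below records each finished stage in a small JSON state file and skips recorded stages on the next run. It only illustrates the idea and is not the logic inside setup-all.ps1 or setup-all.sh: the `run_pipeline` function, the state-file format, and the `skip` and `dry_run` parameters are assumptions modeled loosely on the flags listed above.

```python
import json
from pathlib import Path
from typing import Callable

# Stage names follow the pipeline described in this README.
STAGES = [
    "environment check", "model download", "llama.cpp setup", "OpenClaw install",
    "Sakura desktop pet", "workspace deploy", "path check", "launch", "verify",
]


def run_pipeline(actions: dict[str, Callable[[], None]], state_file: Path,
                 skip: frozenset[str] = frozenset(), dry_run: bool = False) -> list[str]:
    """Run stages in order, skipping finished or excluded ones.

    Returns the stages that ran (or would run, when dry_run is True).
    """
    done = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    executed = []
    for stage in STAGES:
        if stage in done or stage in skip:
            continue
        if not dry_run:
            actions[stage]()          # an exception here leaves the saved state intact
            done.add(stage)
            # Persist after every stage so an interrupted run resumes at the next one.
            state_file.write_text(json.dumps(sorted(done)))
        executed.append(stage)
    return executed
```

Writing the state after every stage, rather than once at the end, is what lets a failed or interrupted run pick up at the stage that failed instead of starting over.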


Step-by-Step

0. Setup OpenClaw

Install OpenClaw Gateway and deploy the AI Girlfriend workspace:

Windows:

powershell -File setup-openclaw.ps1

Linux / macOS:

bash setup-openclaw.sh

This script installs Node.js and OpenClaw Gateway, deploys the workspace files, installs the daemon, and applies the config patch.

Flags: --skip-node, --skip-deploy, --skip-daemon, --no-onboard

1. Download Models

Windows:

pip install huggingface_hub
huggingface-cli login
powershell -File download-models.ps1

Linux / macOS:

pip install huggingface_hub
huggingface-cli login
bash download-models.sh

Downloads all 5 model files (~31.7 GB) from HuggingFace with progress reporting and resume support.

2. Setup llama.cpp

Auto-detects GPU, VRAM, CPU cores, RAM and generates optimized launch configs.

Windows:

powershell -File setup-llama.ps1

Linux / macOS:

bash setup-llama.sh

3. Configure Paths

powershell -File quick_setup.ps1

Interactive wizard: enter your local paths once, and all scripts are updated automatically.

4. Quick Launch

# One-click start all services (llama + Live2D + Gateway)
powershell -File start.ps1

5. Start Live2D Individually

# Start the bridge
Start-Process node -ArgumentList "live2d-bridge.mjs" -WorkingDirectory live2d -WindowStyle Hidden

# Open in standalone window (Chrome app mode)
Start-Process chrome -ArgumentList "--new-window --app=http://localhost:19200/index.html --window-size=450,650"

Live2D runs in a frameless Chrome window, so you can place it anywhere on your desktop.

6. Windows Task Scheduler (optional)

# Llama health check (every 10 min)
schtasks /create /tn "llama-watchdog" `
  /tr "powershell -File C:\Users\<you>\.openclaw\workspace\skills\llama-watchdog.ps1" `
  /sc minute /mo 10

# Orphan process cleanup (hourly)
schtasks /create /tn "cleanup-orphans" `
  /tr "powershell -File C:\Users\<you>\.openclaw\workspace\skills\cleanup_orphans.ps1" `
  /sc hourly /mo 1

Architecture

User (QQ / Telegram) ────── Sakura Desktop Pet (PySide6)
  │                                    │
  ▼                                    ▼
OpenClaw Gateway              Live2D Bridge (:19200)
  │                               ▲       │
  ▼                               │       ▼
  ┌───── llama-server :8080 ──────┘   Browser (Live2D model)
  │         (Qwen3.6-35B)             │
  ├───────────────────────────────────┤
  │  Main session (roleplay)          │
  │  TTS (kill → GPU → restart)       │
  │  ComfyUI (kill → GPU → restart)   │
  │  Live2D (HTTP → no kill needed)   │
  └───────────────────────────────────┘

Agent Hub: Immutable Capability Instructions

         ┌──────────────┐
         │  AGENTS.md   │  ← Capability hub (never changes on role switch)
         │  SOUL.md     │  ← Current character persona (hot-swappable)
         │  IDENTITY.md │  ← Character metadata
         │  TOOLS.md    │  ← Quick reference
         │  USER.md     │  ← User profile
         └──────┬───────┘
                │
    ┌───────────┴────────────┐
    ▼                        ▼
 ┌───────────────┐   ┌──────────────────────┐
 │ skills/harem/ │   │ memory/role_play/    │
 │ (archive)     │   │ <name>/ (isolated)   │
 │ ├─ natsume/   │   │ ├─ natsume/*.md      │
 │ └─ enola/     │   │ └─ enola/*.md        │
 └───────────────┘   └──────────────────────┘
  • AGENTS.md stays constant across role switches, so the ComfyUI / TTS / Live2D instructions are preserved
  • SOUL.md + IDENTITY.md are overwritten on switch; the archive in skills/harem/ is the source of truth
  • Memory is isolated per character in memory/role_play/<name>/ and never cross-contaminated
  • SillyTavern character cards are imported via PNG tEXt chunk parsing, after which the agent persona switches automatically
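The PNG tEXt parsing mentioned in the last bullet can be sketched with the standard library alone. This is a minimal illustration, not the code in skills/character_importer/: it assumes the card JSON is stored base64-encoded in a tEXt chunk whose keyword is `chara`, which is the common SillyTavern convention, and the function names are hypothetical. It does not verify chunk CRCs.

```python
import base64
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(png: bytes) -> dict[str, str]:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks, pos = {}, len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt payload: Latin-1 keyword, NUL separator, Latin-1 text.
            keyword, _, text = data.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length            # skip length + type + data + CRC
    return chunks


def load_character_card(png: bytes) -> dict:
    """Decode the card JSON, assuming base64 text under the 'chara' keyword."""
    return json.loads(base64.b64decode(read_text_chunks(png)["chara"]))
```

A caller would read the card file with `Path(card_path).read_bytes()`, pass the bytes to `load_character_card`, and map fields of the resulting dictionary onto SOUL.md and IDENTITY.md.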

Four Skills, One Brain:

| Skill | Location | Llama Interaction |
|---|---|---|
| Live2D | skills/live2d/ | HTTP API only; never touches llama |
| TTS | skills/tts/ | Kill llama → GPT-SoVITS → restart + wait /health |
| ComfyUI | skills/comfyui/ | Kill llama → image gen → restart + wait /health |
| Sakura | skills/sakura/ | Shared llama-client; detects down → auto-resume |
| Character Importer | skills/character_importer/ | Agent-level, no GPU needed; writes SOUL/IDENTITY + memory dir |
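The "wait /health" and "detects down → auto-resume" behavior in the table comes down to polling llama-server until it answers again. The following is a minimal sketch of such a poller, assuming llama-server listens on 127.0.0.1:8080 (the port shown in the architecture diagram) and that its /health endpoint returns HTTP 200 once the model is loaded. The function names, timeout, and polling interval are illustrative choices, not the project's actual client code.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def llama_is_healthy(base_url: str = "http://127.0.0.1:8080", timeout: float = 2.0) -> bool:
    """True when GET /health answers 200; False on errors or any other status."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(probe: Callable[[], bool] = llama_is_healthy,
                       max_wait: float = 180.0, interval: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `probe` until it succeeds or `max_wait` seconds of sleeping have passed."""
    waited = 0.0
    while True:
        if probe():
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

The default `max_wait` of 180 seconds is chosen to exceed the longest downtime quoted in this README (about 120 seconds for ComfyUI); a client that gives up earlier would report a failure during a normal image generation.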

VRAM Orchestration Flow:

  1. Main session receives user request → assembles command
  2. sessions_spawn(mode="run") creates local model sub-session
  3. Sub-session execs PS script → stop_llama() kills llama-server
  4. Full 8 GB VRAM freed → TTS/ComfyUI inference
  5. start_llama() restarts llama-server (~12s load + ~3s warmup)
  6. Live2D remains active during the entire cycle, since the bridge doesn't touch the GPU
  7. Sub-session writes .task_flags → announces back to main session
  8. Main session reads media files → sends via <qqmedia> / MEDIA:
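Steps 3 to 5 follow a stop, run, restart pattern in which the restart must happen even when the GPU task fails. One way to express that guarantee is a context manager, sketched below. The real scripts are PowerShell and may be structured differently; `exclusive_gpu` is a hypothetical name, and the three callables stand in for the project's stop_llama(), start_llama(), and /health wait, whose implementations are not shown in this README.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def exclusive_gpu(stop_llama: Callable[[], None], start_llama: Callable[[], None],
                  wait_ready: Callable[[], bool]) -> Iterator[None]:
    """Free the GPU for one heavy task, then always bring llama-server back."""
    stop_llama()                      # step 3: release the VRAM held by llama-server
    try:
        yield                         # step 4: the caller runs TTS or ComfyUI here
    finally:
        start_llama()                 # step 5: restart even if the task raised
        if not wait_ready():
            raise RuntimeError("llama-server did not become healthy after restart")
```

A TTS call would then be wrapped as `with exclusive_gpu(stop, start, ready): run_tts()`, so a crash inside the GPU task cannot leave the chat model stopped.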

⚠️ Important Notes

  • RTX 50xx (Blackwell) + CUDA 13.x crashes with "munmap_chunk(): invalid pointer": CUDA 13.x has a known memory management incompatibility with llama.cpp on Blackwell GPUs. Solution: use pre-built llama.cpp binaries compiled with CUDA 12.x (not self-compiled with CUDA 13.x). Download from llama.cpp Releases and choose cudart-llama-bin-win-cuda-12.4-x64.zip. RTX 5070 Ti is fully compatible with CUDA 12.x drivers.
  • Llama-server is offline for ~60–120s during TTS/ComfyUI inference; the conversation pauses, but Live2D keeps running
  • Sub-sessions use the local model (same as the main session), with DeepSeek as an optional fallback
  • Llama-server does not support cross-turn prompt cache reuse (SSM limitation), so use periodic /reset
  • Live2D requires Cubism Core 4 (not 5 or 6): pixi-live2d-display v0.5.0 is built for the Cubism 4 Framework, and Core 5+ causes clipping/layer failures
  • All model files are excluded from the repository by .gitignore
  • GPT-SoVITS weights are self-trained and not distributed; train with your own voice data

🙏 Credits

About

🩵 Uncensored, fully offline AI girlfriend harem: OpenClaw + local LLM + GPT-SoVITS + ComfyUI image generation + Live2D + desktop pet + SillyTavern character card import | Dual channels for QQ & Telegram | Dynamic 8 GB VRAM scheduling, runs offline
