Skip to content

AutismAccelerator/bubble

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Bubble

A memory system lets your chatbot truly know you and maybe themselves.

Core Idea

  1. Intensity gating. Not everything is equally worth remembering. The more trivial the more accumulation needed in order to form a memory entry.
  2. Input decomposition. Input is decomposed into atomic segments that preserve their original meaning, only highly related segments will cluster to form episodes which provide high precision retrieval.
  3. Replayability. Human memory functions like an event-sourcing system, under the same experience, nearly identical personality would emerge again.
  4. Lightweight and minimal LLM calls.
    paper

How it works

[ raw input ]
                               │
                         ┌─────▼─────┐
                         │ decompose │
                         └─────┬─────┘
                               │
              ┌────────────────┴────────────────┐
          ι ≥ θ                             ι < θ
              │                                 │
        vivid signal                       weak signal
              │                                 │
         ┌────▼────┐                     ┌──────▼──────┐
         │ archive │                     │    pool     │
         └────┬────┘                     │  · · · · ·  │
              │                          │ ·  · ·  · · │
              │                          │  · · · ·  · │
              │                          └──────┬──────┘
              │                                 │
              │                          enough gathered?
              │                                 │
              │                    no ──────────┘
              │                                 │ yes
              │                          ┌──────▼──────┐
              │                          │   cluster   │
              │                          │   + score   │
              │                          └──────┬──────┘
              │                                 │
              └──────────────┬──────────────────┘
                             │
                       ┌─────▼─────┐
                       │  episode  │  immutable
                       └─────┬─────┘
                             │
                     same topic chain?(NLI)
                      yes │       │ no
                          │       │
               ┌──────────▼─┐   ┌▼────────────┐
   joins chain │ ... ──► e  │   │     e       │  new chain
               └──────────┬─┘   └─────┬───────┘
                          │           │
                    ┌─────▼───────────▼─────┐
                    │       snapshot        │
                    │  centroid  │  summary │ 
                    │  (eager)   │  (lazy)  │
                    └───────────┬───────────┘
                                │
                           [ retrieve ]
                                │
               ┌────────────────┴─────────────────┐
           default                            verbose
               │                                  │
         snapshot summary             with episode chain + labels

Setup

1.run Falkordb

docker run -e REDIS_ARGS="--appendonly yes --appendfsync everysec" -v <PATH>:/var/lib/falkordb/data -p 3000:3000 -p 6379:6379 -d --name falkordb falkordb/falkordb

2.embedding model(Matryoshka)

note: command below is cpu version

docker run --name tei-embedding -d -p 8997:80 -v <PATH>:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-latest --model-id nomic-ai/nomic-embed-text-v1.5

Or embedding cloud api

3.NLI model(Optional, but recommended, saves some LLM calls)

note: command below is cpu version

docker run --name tei-nli -d -p 8999:80 -v <PATH>:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-latest --model-id cross-encoder/nli-deberta-v3-small

Necessary configuration in your .env file

# Provider to use: anthropic | openai | gemini  (default: anthropic)
BUBBLE_LLM_PROVIDER=anthropic

# Anthropic (install: pip install 'bubble-memory[anthropic]')
ANTHROPIC_API_KEY=

# OpenAI-compatible — works with OpenAI, DeepSeek, Groq, Ollama, etc.
# (install: pip install 'bubble-memory[openai]')
# OPENAI_API_KEY=
# OPENAI_BASE_URL=https://api.openai.com/v1

# Google Gemini (install: pip install 'bubble-memory[gemini]')
# GEMINI_API_KEY=

FALKORDB_HOST=localhost
FALKORDB_PORT=6379

BUBBLE_EMBED_DIM=768
BUBBLE_EMBED_ENDPOINT=http://localhost:8997/v1/embeddings

#If you have NLI setup
BUBBLE_ENABLE_NLI=true
BUBBLE_NLI_ENDPOINT=http://localhost:8999/predict

How to use (extremely easy and clean)

installation

pip install bubble-memory
or
uv add bubble-memory

ingest

import bubble
bubble.process(user_id, content, prior)

prior: the context of the content, for example prior messages

retrieve

import bubble
memory_user = await bubble.retrieve(user_id, query)

Replayability

Memory episodes are archived in <project root>/data/archive as jsonl
You can reconstruct your memory graph by a single command. No LLM call.

python -m bubble.main replay <user_id>

Tuning/Customization

See .env.example for ALL tunable arguments.

Limitations

Doesn't support long-context memory. It focuses on facts/preferences, and personality consolidation.
Current promotion formula, tunable variables might not be the best.
I use [Name] to identify speaker in group chat. Swap out bubble.llm.prompts.DECOMPOSE_SYSTEM if it doesn't fit your use case.

About

AI agent memory distillation pipeline based on event sourcing architecture and graph database.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages