A lightweight DistilBERT classifier that decides what an AI assistant should remember — and what it should forget.
Warning: do not use a reasoning model in LM Studio. Reasoning models may break the system.
Tested and verified on Ubuntu.
Most AI assistants treat all conversation turns equally. MemoryGate filters them by importance, so only meaningful information gets stored in long-term memory — things like medical details, deadlines, passwords, and personal events — while casual small talk and trivia are quietly discarded.
MemoryGate is a three-stage pipeline:
- Generate — Uses a local LLM via LM Studio to produce labelled training examples across high- and low-importance conversation topics
- Train — Fine-tunes a DistilBERT classifier on that data to score each conversation turn
- Run — Runs the trained model in real time to decide what the assistant should save to its memory
High importance (label = 1)
- Deaths, grief, family emergencies, personal trauma
- Passwords, API keys, PINs, access tokens
- Medical diagnoses, prescriptions, allergies, surgery dates
- Legal contracts, compliance deadlines, court dates
- Financial decisions, bank details, tax deadlines
- Project deadlines, stakeholder agreements, production credentials
Low importance (label = 0)
- Casual greetings and small talk
- General trivia and history facts
- Creative requests like jokes or poems
- Simple definitions and basic questions
- Movie or food recommendations
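The labels above map directly onto the training records. As an illustration of what a labelled example might look like (the exact field names in conversation_data.jsonl are produced by generate_training_data.py and may differ; `text` and `label` here are assumptions):

```python
import json

# Hypothetical records in the style of conversation_data.jsonl.
# The real schema is defined by generate_training_data.py; the
# "text" and "label" keys below are illustrative assumptions.
examples = [
    {"text": "My penicillin allergy was confirmed today; surgery is on March 4.", "label": 1},
    {"text": "Tell me a joke about penguins.", "label": 0},
]

jsonl = "\n".join(json.dumps(e) for e in examples)

for line in jsonl.splitlines():
    record = json.loads(line)
    print(record["label"], record["text"])
```

Each line is one self-contained JSON object, so the file can be streamed record by record during training.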
- Python 3.10 (Anaconda recommended)
- A CUDA-capable GPU is recommended for training (CPU fallback is supported)
- LM Studio running locally with a model loaded (always needed for run_memory.py and generate_training_data.py)
Clone the repository:
git clone https://github.com/ErenalpCet/MemoryGate.git
cd MemoryGate
Create and activate a Python 3.10 environment with Anaconda:
conda create -n memorygate python=3.10
conda activate memorygate
Install dependencies:
pip install -r requirements.txt
This will automatically install PyTorch with CUDA 12.6 support. If you are on CPU only, remove or replace the --index-url line in requirements.txt so pip installs the standard CPU build from PyPI.
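For reference, the swap looks roughly like this (the exact lines in requirements.txt may differ; the URL below is PyTorch's usual CUDA 12.6 wheel index):

```text
# CUDA build (as shipped in requirements.txt):
--index-url https://download.pytorch.org/whl/cu126
torch

# CPU-only alternative: drop the --index-url line so pip resolves torch from PyPI:
torch
```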
Set up your environment variables by copying the example file:
cp .env.example .env
Then open .env and adjust the settings if needed.
Make sure LM Studio is running with a model loaded, then run:
python generate_training_data.py
This produces conversation_data.jsonl with balanced high- and low-importance examples.
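A quick way to sanity-check that balance is to count examples per label (again assuming a numeric `label` field per JSONL record; the real field name may differ). Shown here with an in-memory stand-in for the file:

```python
import json
from collections import Counter

# Stand-in for the contents of conversation_data.jsonl; the record
# schema is an assumption for illustration.
sample = (
    '{"text": "My court date is May 12.", "label": 1}\n'
    '{"text": "Hi there!", "label": 0}\n'
)

# Count how many examples carry each importance label.
counts = Counter(json.loads(line)["label"] for line in sample.splitlines() if line.strip())
print(counts)  # a well-balanced file has roughly equal counts per label
```

To check the real file, replace `sample.splitlines()` with iteration over `open("conversation_data.jsonl", encoding="utf-8")`.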
Train the classifier on the generated data:
python train_model.py
The best checkpoint is saved to ./importance_model/ based on validation loss.
Start real-time memory filtering (LM Studio must be running with a model loaded):
python run_memory.py
MemoryGate/
├── generate_training_data.py # Synthetic data generation via LM Studio
├── train_model.py # DistilBERT fine-tuning pipeline
├── run_memory.py # Runtime memory filtering
├── conversation_data.jsonl # Generated training data (git ignored)
├── importance_model/ # Saved model weights (git ignored)
├── .env.example # Environment variable template
└── requirements.txt
Key settings in train_model.py:
| Setting | Default | Description |
|---|---|---|
| model_name | distilbert-base-uncased | Base transformer model |
| batch_size | 32 | Adjust based on available VRAM |
| epochs | 6 | Training epochs |
| importance_threshold | 0.60 | Deployment classification threshold |
| use_amp | True | Mixed precision, recommended for CUDA |
This project is licensed under the GNU Affero General Public License v3.0.
Any project that uses MemoryGate — including over a network or API — must also be released under AGPL-3.0. See the LICENSE file for full details.
ErenalpCet — Erenalp Çetintürk