# ConvTag

ConvTag is a conversation labeling and active-learning platform for AI teams. It captures every exchange between users and an AI assistant, automatically predicts the type of interaction, lets humans correct those predictions, and retrains the classifier on that feedback, continuously closing the loop.

## Repository shape

```
convtag/
├── tagger/          FastAPI backend — storage, LLM, pipeline, training, export
├── chat/            Next.js frontend — chat, review, analytics, settings
├── data/            SQLite database (auto-created) + settings.json + model files
├── tests/           End-to-end backend tests
├── seed_data.csv    Optional bootstrap training data (text, tag)
└── .env.local       Local environment config (copy from .env.example)
```

## System model

1. A user message creates conversation context.
2. The assistant (OpenAI or Anthropic) produces a reply.
3. That reply is automatically tagged by the pipeline.
4. A reviewer confirms or corrects the label in the Queue.
5. Reviewed labels become training examples.
6. One click retrains the embedding classifier; the new model is picked up immediately.
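
The whole loop can be driven over the tagger's HTTP API (routes listed later in this README). A minimal sketch follows; the JSON field names (`session_id`, `text`, `label_id`, `label`) are assumptions for illustration, not the documented schema:

```python
"""Sketch of the ConvTag feedback loop over the tagger's HTTP API.

The routes come from the "API routes" section of this README; the
JSON field names are illustrative assumptions only.
"""
import requests

TAGGER = "http://127.0.0.1:8000"

# Steps 1-3: start a session and send a message; the reply is auto-tagged.
session = requests.post(f"{TAGGER}/start_session").json()
turn = requests.post(
    f"{TAGGER}/message",
    json={"session_id": session["session_id"], "text": "Why is the sky blue?"},
).json()

# Step 4: a reviewer corrects the predicted label.
requests.patch(f"{TAGGER}/api/labels/{turn['label_id']}", json={"label": "explanation"})

# Steps 5-6: reviewed labels accumulate; one call retrains the classifier.
requests.post(f"{TAGGER}/api/model/train")
```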

## Labeling pipeline

Five stages run in order. The first to accept wins:

| Stage | What it does |
| --- | --- |
| Rule check | Keyword + regex patterns from `rules.yaml`; instant, no model |
| Embedding classifier | `LogisticRegression` on text embeddings (TF-IDF or OpenAI) |
| Heuristic | Label-keyword matching from `labels.py` |
| LLM classifier | Zero-shot classification via your configured LLM |
| Fallback | Returns `unknown` with low confidence |
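
The first-accept contract is easy to picture in code. A minimal sketch, not the shipped `tagger.py` (the stage function and its rule are hypothetical):

```python
"""Illustrative first-accept cascade.

Each stage returns (label, confidence) or None; the first stage that
accepts wins, and the fallback guarantees an answer.
"""
from typing import Callable, Optional, Tuple

Stage = Callable[[str], Optional[Tuple[str, float]]]

def classify(text: str, stages: list[Stage]) -> Tuple[str, float]:
    for stage in stages:
        result = stage(text)
        if result is not None:   # stage accepted: stop here
            return result
    return ("unknown", 0.1)      # fallback: low-confidence unknown

# Hypothetical rule-check stage, standing in for rules.yaml patterns.
def rule_check(text: str) -> Optional[Tuple[str, float]]:
    if "traceback" in text.lower():
        return ("debugging", 0.95)
    return None

print(classify("Traceback (most recent call last): ...", [rule_check]))
```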

## Label taxonomy (20 labels)

**Task execution:** code, debugging, math, data_analysis, task_completion, instruction, planning, translation

**Knowledge:** factual_qa, explanation, comparison, summarization, reasoning

**Creative & social:** creative, opinion, roleplay, conversation

**Meta:** clarification, refusal, safety

Custom labels can be added in Settings without touching code.

## Frontend pages

| Page | What it shows |
| --- | --- |
| Queue | Low-confidence outputs sorted for human review; keyboard shortcuts (j/k/1–9) |
| Playground | Live conversation with the assistant; every reply is auto-tagged; load sample data |
| Import | Bulk-ingest turns from JSONL, CSV, or JSON file/paste |
| Sessions | All past sessions with label summaries |
| Session detail | Turn-by-turn review with pipeline trace |
| Training | Retrain the embedding classifier; readiness card; live metrics; version history |
| Analytics | Coverage metrics, label distribution, confidence histogram, activity timeseries |
| Export | Download JSONL or RLHF-format data with label and confidence filters |
| Settings | LLM provider, embeddings, pipeline toggles, thresholds, labels, label activity |

## Quick start

### 1. Environment

```powershell
Copy-Item .env.example .env.local
```

Edit `.env.local`; at minimum set one of:

```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```

### 2. Run the tagger

```powershell
cd tagger
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python -m uvicorn app:app --reload --port 8000
```

### 3. Run the chat app

```powershell
cd chat
npm install
npm run dev
```

Open http://localhost:3002.
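
To confirm both services are up, a quick check from Python (ports as configured in the steps above):

```python
# Quick liveness check; assumes the default ports from the steps above.
import requests

print(requests.get("http://127.0.0.1:8000/health").status_code)  # tagger, expect 200
print(requests.get("http://localhost:3002").status_code)         # chat app, expect 200
```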

## Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| `OPENAI_API_KEY` | (none) | Required for OpenAI LLM or embeddings |
| `ANTHROPIC_API_KEY` | (none) | Required for Anthropic LLM |
| `LLM_PROVIDER` | `openai` | `openai` or `anthropic` |
| `LLM_MODEL` | `gpt-4.1-mini` | Model name for chat replies |
| `EMBEDDING_PROVIDER` | `tfidf` | `openai` or `tfidf` (local fallback) |
| `EMBEDDING_MODEL` | `text-embedding-3-small` | OpenAI embedding model |
| `LLM_CLASSIFIER_ENABLED` | `true` | Enable/disable the LLM classifier stage |
| `TAGGER_API_KEY` | (none) | Optional API key for the tagger service |
| `TAGGER_DATABASE_PATH` | `data/convtag.db` | SQLite database path |
| `TAGGER_MODEL_PATH` | `tagger/model.joblib` | Active classifier model path |
| `TAGGER_MODEL_DIR` | `tagger/models/` | Versioned model storage directory |
| `TAGGER_URL` | `http://127.0.0.1:8000` | Tagger URL used by the Next.js proxy |

All variables can be changed in `.env.local`. Provider and API key changes require a tagger restart. Pipeline toggles, thresholds, and labels can be changed live from the Settings page without restarting.
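
As an illustration, a minimal `.env.local` that opts into OpenAI embeddings instead of the TF-IDF default (key value elided):

```ini
# Example .env.local: OpenAI for both chat replies and embeddings.
OPENAI_API_KEY=sk-...
LLM_PROVIDER=openai
LLM_MODEL=gpt-4.1-mini
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
```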

## Training

Training runs in a background thread on the tagger and completes regardless of whether the browser stays open. The Training page polls for new versions and updates automatically when training finishes.

A held-out test split requires 20+ reviewed examples. With fewer, the model still trains but per-label metrics are not available.

To bootstrap from a CSV before any chat data exists:

```powershell
.\tagger\.venv\Scripts\Activate.ps1
python tagger\trainer.py seed_data.csv
```

The CSV must have `text` and `tag` columns.
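
For illustration, a minimal seed file (rows invented; tags drawn from the taxonomy above):

```csv
text,tag
"How do I reverse a list in Python?",code
"Summarize this paragraph in one sentence.",summarization
"What is the capital of Australia?",factual_qa
```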

## Export formats

- **JSONL**: one JSON object per reviewed label: `{ "text", "label", "source", "session_id", "turn_index" }`
- **RLHF**: paired turns: `{ "prompt", "chosen", "rejected", "label" }` for reward-model training
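
For illustration, one record of each format, with all field values invented and only the keys taken from the shapes above:

```json
{"text": "Explain how DNS resolution works", "label": "explanation", "source": "human", "session_id": "a1b2c3", "turn_index": 4}
```

```json
{"prompt": "Translate 'good morning' into French", "chosen": "Bonjour", "rejected": "Guten Tag", "label": "translation"}
```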

## API routes (tagger)

```
GET   /health
POST  /start_session
POST  /message
GET   /api/sessions
GET   /api/session/{session_id}
GET   /api/summary
GET   /api/labels/uncertain
GET   /api/labels/{label_id}
PATCH /api/labels/batch
PATCH /api/labels/{label_id}
POST  /api/ingest
POST  /api/ingest/sample
GET   /api/training/examples
POST  /api/model/train
GET   /api/model/versions
GET   /api/analytics/coverage
GET   /api/analytics/label_stats
GET   /api/analytics/timeseries
GET   /api/analytics/confidence
GET   /api/export/jsonl
GET   /api/export/rlhf
GET   /api/settings
POST  /api/settings
```

## Verification

```powershell
python -m py_compile tagger\app.py tagger\classifier.py tagger\config.py tagger\llm_agent.py tagger\pii.py tagger\storage.py tagger\tagger.py tagger\trainer.py
python -m unittest tests.test_system
cd chat && npm run build
```
