A modular, production-ready backend framework for building AI chat applications powered by large language models (LLMs), using FastAPI and dependency injection. The framework supports multiple expert types and is designed for building scalable chatbot applications.
- 🧠 Multiple Expert Types: Support for different chat modes:
- QNA: Basic question-answering chatbot
- RAG: Retrieval-Augmented Generation with document processing
- DEEPRESEARCH: Advanced research assistant with web search capabilities
- 🔄 Flexible LLM Support: Multiple LLM providers (OpenAI, Azure OpenAI, LlamaCpp, Vertex AI)
- 🎯 Streaming Responses: Real-time streaming chat using Server-Sent Events (SSE)
- 📚 Document Processing: Built-in PDF processing and vector storage for RAG
- 🗄️ Multiple Memory Backends: In-memory, MongoDB, and SQL storage options
- 🔄 Conversation History: Persistent conversation management with user sessions
- 🧹 Memory Operations: Clear, retrieve, and manage conversation history
- 🚀 FastAPI Integration: Modern, async API with automatic OpenAPI documentation
- 🔌 Dependency Injection: Clean component management using the Injector pattern
- ⚡ Async Support: Full asynchronous operation for high-performance applications
- 🧪 Extensible Tool System: Easy integration of custom tools (web search, etc.)
- 📝 Comprehensive Logging: Detailed logging with configurable levels
- 🔒 Error Handling: Robust error management with custom middleware
- 🧪 Testing Infrastructure: Complete test suite with async support
- 🐳 Docker Support: Containerized deployment ready
- 🖥️ Multiple CLI Modes: Interactive command-line interfaces for different bot types
- 🎨 Colorized Output: Beautiful terminal interface with colored responses
- 📊 Streaming CLI: Real-time token-by-token response streaming
- Python 3.12+
- MongoDB (optional, for persistent memory)
- uv - Fast Python package installer and resolver
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd backend
  ```
- Install dependencies using uv:

  ```bash
  # Create and activate virtual environment
  uv venv
  source .venv/bin/activate   # On Unix/macOS
  # or: .venv\Scripts\activate   # On Windows

  # Install dependencies
  uv pip install -r requirements.txt
  ```
- Set up environment variables:

  ```bash
  # Core Configuration
  MODEL_TYPE=OPENAI  # Options: OPENAI, LLAMA, AZUREOPENAI, VERTEX

  # OpenAI Configuration
  OPENAI_API_KEY=your_openai_key
  BASE_MODEL_NAME=gpt-3.5-turbo

  # Azure OpenAI Configuration (if using Azure)
  AZURE_CHAT_MODEL_KEY=your_azure_key
  AZURE_CHAT_MODEL_VERSION=2024-02-15-preview
  AZURE_CHAT_MODEL_ENDPOINT=your_endpoint
  AZURE_CHAT_MODEL_DEPLOYMENT=your_deployment

  # MongoDB Configuration (optional)
  MONGO_URI=mongodb://localhost:27017/chatbot
  MONGO_DATABASE=langchain_bot
  MONGO_COLLECTION=chatbot

  # Vector Database Configuration
  VECTOR_DATABASE_TYPE=CHROMA
  VECTOR_DATABASE_CHROMA_PATH=./chroma_db

  # Embedding Configuration (for RAG)
  EMBEDDING_TYPE=AZUREOPENAI
  AZURE_EMBEDDING_MODEL_KEY=your_azure_key
  AZURE_EMBEDDING_MODEL_ENDPOINT=your_endpoint
  AZURE_EMBEDDING_MODEL_DEPLOYMENT=your_deployment
  AZURE_EMBEDDING_MODEL_VERSION=2024-02-15-preview

  # Tavily Search (for DEEPRESEARCH)
  TAVILY_API_KEY=your_tavily_key

  # Server Configuration
  PORT=8080
  HOST=0.0.0.0
  LOG_LEVEL=INFO
  ```
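These variables are read at startup by the configuration layer (`src/common/config.py`). As a rough sketch only (the real loading code and names may differ), consuming a few of these keys looks like this:

```python
import os

# Illustrative only: the real configuration logic lives in
# src/common/config.py and may use a different mechanism.
MODEL_TYPE = os.getenv("MODEL_TYPE", "OPENAI")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")   # required when MODEL_TYPE=OPENAI
MONGO_URI = os.getenv("MONGO_URI")             # optional persistent memory
PORT = int(os.getenv("PORT", "8080"))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")

if MODEL_TYPE == "OPENAI" and not OPENAI_API_KEY:
    raise RuntimeError("OPENAI_API_KEY must be set when MODEL_TYPE=OPENAI")
```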
Start the FastAPI server:
```bash
python app.py
```

The API will be available at http://localhost:8080, with automatic documentation at http://localhost:8080/docs.
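To verify the server is up, hit the health endpoint (listed with the other endpoints below). A minimal check with `httpx`, assuming the endpoint returns a JSON body:

```python
import httpx

# Assumes the server started by `python app.py` is listening on port 8080.
resp = httpx.get("http://localhost:8080/api/v1/health")
resp.raise_for_status()
print(resp.json())
```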
The framework provides a unified CLI with three specialized bot modes:
QNA Bot - Basic question-answering:
```bash
# Basic usage
python cli.py --mode qna

# With custom model and streaming
python cli.py --mode qna --model azureopenai --stream
```
RAG Bot - Document-aware chat:
```bash
# Basic RAG chat
python cli.py --mode rag

# Process a document and chat about it
python cli.py --mode rag --document path/to/document.pdf --stream

# With custom model
python cli.py --mode rag --model openai --conversation-id doc_session
```
DeepResearch Bot - Advanced research assistant:
```bash
# Research mode with web search capabilities
python cli.py --mode deepresearch

# With streaming and custom model
python cli.py --mode deepresearch --model azureopenai --stream
```
All CLI modes support:
- `exit` or `quit`: Exit the chat
- `clear`: Clear conversation history
- `Ctrl+C`: Force exit
- `--stream`: Enable real-time token streaming
- `--model`: Choose LLM provider (openai, llama, azureopenai)
- `--conversation-id`: Set custom session ID
RAG-specific features:
- `--document`: Process and index documents (PDF, TXT, etc.)
- Document question-answering based on uploaded content
DeepResearch-specific features:
- Web search integration for current information
- Advanced reasoning and research capabilities
The REST API provides the following endpoints:
- `POST /api/v1/chat/message` - Send a message and get a response
- `POST /api/v1/chat/stream` - Stream chat responses (SSE)
- `DELETE /api/v1/chat/clear` - Clear conversation history

- `GET /api/v1/experts/available` - List available expert types
- `POST /api/v1/experts/switch` - Switch between expert types

- `POST /api/v1/rag/upload` - Upload and process documents
- `GET /api/v1/rag/documents` - List processed documents
- `DELETE /api/v1/rag/documents/{doc_id}` - Delete a document

- `GET /api/v1/health` - Health check endpoint
- `GET /` - API documentation redirect
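As an example of calling these endpoints from Python, the sketch below sends a message and then consumes the SSE stream. The payload field names (`message`, `conversation_id`) are assumptions; check the generated docs at `/docs` for the actual request schemas.

```python
import httpx

BASE = "http://localhost:8080/api/v1"
# NOTE: field names below are assumed; consult /docs for the real schema.
payload = {"message": "What is RAG?", "conversation_id": "demo"}

# Single-shot message
resp = httpx.post(f"{BASE}/chat/message", json=payload)
resp.raise_for_status()
print(resp.json())

# Streaming via Server-Sent Events: read the response line by line
with httpx.stream("POST", f"{BASE}/chat/stream", json=payload, timeout=None) as r:
    for line in r.iter_lines():
        if line.startswith("data:"):
            print(line[len("data:"):].strip(), flush=True)
```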
```
backend/
├── api/                       # FastAPI application layer
│   ├── app.py                 # FastAPI app factory and configuration
│   ├── middleware/            # Custom middleware (error handling)
│   └── v1/                    # API v1 endpoints
│       ├── chat.py            # Chat endpoints
│       ├── experts.py         # Expert management
│       ├── rag.py             # RAG operations
│       ├── health.py          # Health checks
│       └── models.py          # API request/response models
├── src/                       # Core application logic
│   ├── base/                  # Base components and factories
│   │   ├── brains/            # LLM brain implementations
│   │   ├── components/        # Core components (LLMs, embeddings, etc.)
│   │   └── memories/          # Memory backend implementations
│   ├── experts/               # Expert implementations
│   │   ├── qna/               # Basic Q&A expert
│   │   ├── rag_bot/           # RAG-powered expert
│   │   └── deepresearch_bot/  # Research expert with web search
│   ├── common/                # Shared utilities
│   │   ├── config.py          # Configuration management
│   │   ├── logging.py         # Logging setup
│   │   ├── exceptions.py      # Custom exceptions
│   │   └── schemas.py         # Data models
│   ├── chat_engine.py         # Main chat engine
│   └── config_injector.py     # Dependency injection setup
├── clis/                      # Command line interfaces
│   ├── cli.py                 # QNA bot CLI
│   ├── rag_cli.py             # RAG bot CLI
│   └── deepresearch_cli.py    # DeepResearch bot CLI
├── tests/                     # Test suite
├── docs/                      # Documentation
├── app.py                     # Main application entry point
├── requirements.txt           # Python dependencies
├── pyproject.toml             # Project configuration
├── Dockerfile                 # Docker configuration
└── Makefile                   # Build and deployment scripts
```
The framework follows Clean Architecture principles with clear separation of concerns:
- Chat Engine: Orchestrates conversations between users and experts
- Experts: Specialized AI assistants (QNA, RAG, DeepResearch); a sketch of their shared interface follows this list
- Brains: LLM abstraction layer supporting multiple providers
- Memory: Conversation storage with pluggable backends
- Tools: External capabilities (web search, document processing)
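To make the separation concrete, here is a hypothetical sketch of a shared expert interface. The actual base class under `src/experts/` may use different names and signatures:

```python
from abc import ABC, abstractmethod
from typing import AsyncIterator

class Expert(ABC):
    """Hypothetical common interface for the QNA, RAG, and DeepResearch experts."""

    @abstractmethod
    async def answer(self, message: str, conversation_id: str) -> str:
        """Return a complete response for a single user message."""

    @abstractmethod
    def stream(self, message: str, conversation_id: str) -> AsyncIterator[str]:
        """Yield response tokens as they are generated (backs SSE and CLI streaming)."""
```

With an interface like this, the chat engine can route a request to whichever expert is active without knowing its internals.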
The framework uses the Injector pattern for clean dependency management (see the sketch after this list):
- Configuration-driven component initialization
- Easy testing with mock dependencies
- Modular and extensible architecture
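For orientation, this is roughly how the `injector` library wires components together. The module and class names below are illustrative, not the framework's actual bindings (see `src/config_injector.py`):

```python
from injector import Injector, Module, provider, singleton

class EchoBrain:
    """Stand-in for an LLM brain; real implementations live in src/base/brains/."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

class AppModule(Module):
    @singleton
    @provider
    def provide_brain(self) -> EchoBrain:
        # In the framework, configuration (e.g. MODEL_TYPE) would decide
        # which concrete brain implementation to construct here.
        return EchoBrain()

injector = Injector([AppModule()])
brain = injector.get(EchoBrain)   # same instance on every call (singleton)
print(brain.complete("hello"))
```

In tests, a module that binds mock implementations can be passed instead, which is what makes the components easy to swap.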
Run the test suite and static checks with:

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run integration tests
pytest -m integration

# Run linting
flake8 .

# Type checking
mypy src/
```
Build and deployment tasks are wrapped in the Makefile:

```bash
# Clean up build files
make clean

# Build Docker image
make docker-build

# Run Docker container
make docker-run

# Docker Compose (if available)
docker-compose up
```
The application uses environment variables for configuration. Key settings:
- `MODEL_TYPE`: LLM provider (OPENAI, AZUREOPENAI, LLAMA, VERTEX)
- `OPENAI_API_KEY` or equivalent provider keys
- `MONGO_URI`: MongoDB connection for persistent memory
- `VECTOR_DATABASE_TYPE`: Vector database type (CHROMA, INMEMORY)
- `TAVILY_API_KEY`: Web search API key for DeepResearch
- `LOG_LEVEL`: Logging verbosity (DEBUG, INFO, WARNING, ERROR)
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes following the existing code style
- Add tests for new functionality
- Ensure all tests pass (`pytest`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow Clean Architecture principles
- Write comprehensive tests
- Use type hints throughout
- Follow existing code style and patterns
- Update documentation for new features
This project is licensed under the MIT License - see the LICENSE file for details.
📚 Detailed documentation is available in the docs/ folder:
- API Documentation - Complete API reference and examples
- Folder Structure - Project organization and architecture
- Expert Switching - Guide to switching between bot types