A secure, multi-provider LLM playground with user authentication that allows you to interact with various AI models, compare responses, and get ratings from different agents.
- User Registration & Login: Secure account creation and authentication
- Session Management: Proper Django session handling with CSRF protection
- Protected Routes: All playground features require authentication
- Modern UI: Beautiful login/register pages with gradient backgrounds
- Password Validation: Django's built-in password strength requirements
- Multi-Provider Support: Connect with multiple LLM providers (OpenAI, Anthropic, Gemini, DeepSeek, Groq, and more)
- Model Selection: Choose specific models for each provider
- Response Comparison: Compare responses from different agents side-by-side
- Follow-Up Conversations: Continue conversations with individual agents directly from result cards
- Context-Aware Threads: Each follow-up includes full conversation history for coherent multi-turn dialogues
- Rating System: Get automatic ratings from other agents on response quality
- Per-Rater Breakdown: See which agents rated each response and their individual scores
- Async Job Processing: Non-blocking concurrent requests to multiple providers
- Real-time Updates: Stream results as they arrive from different providers
- Modern UI: Clean, responsive interface with Markdown rendering and smooth interactions
- Django Integration: Built with Django for easy deployment and management
- User Credits: Each user starts with 10 free credits
- Pay-per-Query: 1 credit deducted per successful query (including follow-ups)
- Credit Packages: Purchase additional credits (Starter, Professional, Enterprise)
- Credit Tracking: Real-time credit balance display
- Query History: Database storage of all queries and results with API-based retrieval
- Database-Backed History: All queries stored in database, accessible across devices
- API-Based Retrieval: History fetched via REST API instead of localStorage
- Configurable Limit: View last 10, 25, 50, or 100 queries
- Full Query Context: Access to original messages, guides, providers, and results
- Clear History: Delete all or individual queries from database
- Auto-Save: Automatically saves queries with results to database
- Comprehensive Tests: 60+ test cases covering all functionality
- Authentication Tests: Login, registration, session management
- Agent System Tests: Rating calculations, provider mappings
- Follow-Up Conversation Tests: Context history, credit deduction, multi-turn dialogues
- Integration Tests: End-to-end user flows
- Performance Tests: Load time and response time validation
- Security Tests: CSRF protection, authentication requirements
| Provider | Status | Models |
|---|---|---|
| OpenAI | Enabled | GPT-4, GPT-3.5, O1, O3, and more |
| Anthropic | Enabled | Claude 3.7, Claude 3.5, Claude 4, and more |
| Gemini | Enabled | Gemini 2.5 Flash, Gemini 2.0 Flash, Gemini 1.5 Pro/Flash |
| DeepSeek | Enabled | DeepSeek Chat, DeepSeek Reasoner |
| Perplexity | Enabled | Sonar models |
| Mistral | Enabled | Mixtral, Mistral-large |
| Groq | Enabled | Llama, Mixtral, Gemma models |
| Azure OpenAI | Disabled | GPT-4o, GPT-4o-mini |
| Cohere | Disabled | Command models |
| Together AI | Disabled | Together-large, Together-medium |
| AI21 | Disabled | Jurassic models |
| Bedrock | Disabled | Amazon Titan, Bedrock-text |
| HuggingFace | Disabled | GPT-2, GPT-J, BLOOM, MPT |
| Vertex AI | Disabled | Text-bison, Code-bison |

Enabled = enabled by default; Disabled = available but disabled by default
- Python 3.10+ (Django 5.2 does not support earlier Python versions)
- Django 5.2.6
- SQLite (included with Django)
- Required packages: anthropic, openai, google-generativeai, perplexityai, and more (see requirements.txt)
- Clone the repository:
git clone <repository-url>
cd agents
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure API keys:
# Edit config.py with your API keys
# Required keys for enabled providers:
OPENAI_API_KEY = "your-openai-key"
ANTHROPIC_API_KEY = "your-anthropic-key"
GEMINI_API_KEY = "your-gemini-key"
DEEPSEEK_API_KEY = "your-deepseek-key"
PERPLEXITY_API_KEY = "your-perplexity-key"
MISTRAL_API_KEY = "your-mistral-key"
GROQ_API_KEY = "your-groq-key"
- Run database migrations:
python manage.py migrate
- Create a superuser (optional):
python manage.py createsuperuser
- Start the server:
python manage.py runserver
- Access the application:
  - Open your browser to http://localhost:8000/api/login/
  - Register a new account or log in with existing credentials
  - Start using the playground!
Edit config.py to add your API keys:
# Required for enabled providers
OPENAI_API_KEY = "your-openai-key"
ANTHROPIC_API_KEY = "your-anthropic-key"
GEMINI_API_KEY = "your-gemini-key"
DEEPSEEK_API_KEY = "your-deepseek-key"
PERPLEXITY_API_KEY = "your-perplexity-key"
MISTRAL_API_KEY = "your-mistral-key"
GROQ_API_KEY = "your-groq-key"
# Optional providers (disabled by default)
AZURE_OPENAI_KEY = "your-azure-key"
AZURE_OPENAI_ENDPOINT = "your-azure-endpoint"
COHERE_API_KEY = "your-cohere-key"
TOGETHER_API_KEY = "your-together-key"
AI21_API_KEY = "your-ai21-key"
BEDROCK_API_KEY = "your-bedrock-key"
HF_API_TOKEN = "your-huggingface-token"
VERTEX_PROJECT = "your-vertex-project"
VERTEX_LOCATION = "your-vertex-location"
Key authentication settings in agents_django/settings.py:
# Authentication URLs
LOGIN_URL = '/api/login/'
LOGIN_REDIRECT_URL = '/api/playground/'
LOGOUT_REDIRECT_URL = '/api/login/'
# Password validation
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    # ... more validators
]
- Register an Account:
  - Go to http://localhost:8000/api/register/
  - Fill in username and password
  - Password must be at least 8 characters with complexity requirements
- Login:
  - Use your credentials to log in at http://localhost:8000/api/login/
  - You'll be redirected to the playground
- Use the Playground:
  - Check your credits displayed in the top-right corner
  - Enter your message in the main textarea
  - Optional: Add a guide to steer the response
  - Select providers you want to query (or leave empty for all enabled)
  - Enable rating if you want other agents to rate the responses
  - Click "Ask Agents" to get responses (costs 1 credit per query)
  - View results as they stream in from different providers
  - Continue conversations: Use the follow-up input at the bottom of each result card to continue chatting with that specific agent (costs 1 credit per follow-up)
- Purchase Credits:
  - Go to http://localhost:8000/api/buy-tokens/
  - Choose a package: Starter (10 credits), Professional (50 credits), or Enterprise (200 credits)
  - Credits are added to your account immediately
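The credit rules described in this README (10 free credits for new users, 1 credit per successful query or follow-up, and fixed package sizes) can be modeled in a few lines of plain Python. This is only an illustrative sketch: the `CreditAccount` class, the `PACKAGES` dictionary, and the method names are assumptions made for the example, not the project's actual user-profile model.

```python
from dataclasses import dataclass

# Package sizes as listed above; the dictionary itself is hypothetical.
PACKAGES = {"starter": 10, "professional": 50, "enterprise": 200}


class InsufficientCredits(Exception):
    """Raised when a query is attempted with a zero balance."""


@dataclass
class CreditAccount:
    credits: int = 10  # every new user starts with 10 free credits

    def purchase(self, package: str) -> int:
        """Add the credits for a named package and return the new balance."""
        self.credits += PACKAGES[package.lower()]
        return self.credits

    def charge_query(self) -> int:
        """Deduct 1 credit for a successful query or follow-up."""
        if self.credits < 1:
            raise InsufficientCredits("You have no credits left")
        self.credits -= 1
        return self.credits
```

A new account that runs one query and then buys the Professional package would end up with 10 - 1 + 50 = 59 credits.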
- Each result card has a follow-up input section at the bottom
- Type your follow-up message and click "Send" to continue the conversation with that specific agent
- The agent receives full conversation history (original response + all previous exchanges) as context
- Each follow-up costs 1 credit and updates your balance in real-time
- Multiple follow-ups create a threaded conversation within the result card
- Visual separations clearly distinguish between exchanges
- Click "Edit models" to choose specific models for each selected provider
- Compare different models from the same provider
When "Enable rating" is checked:
- Other enabled agents will rate each response
- You'll see an overall rating (1-10 scale)
- Detailed breakdown shows which agents rated and their individual scores
- Each rater's model and weight are displayed
- Enabled agents appear normally in the dropdown
- Disabled agents are greyed out and italicized
- Visual indicators help you understand provider availability
- Authentication System: Django's built-in auth with custom views
- Credit System: User profiles with credit tracking and purchase flow
- Agent Registry: Manages all agent instances and their connections
- Agent Classes: Individual implementations for each provider (14 providers)
- Rating System: Concurrent rating by multiple agents with weighted scores
- Async Job System: Thread-based concurrent provider requests
- Database Models: PlaygroundQuery and ProviderResult for query history
- Django Views: Web interface and API endpoints
- Templates: Responsive frontend with Markdown rendering and modern CSS
- CSRF Protection: All forms protected against CSRF attacks
- Session Security: Proper Django session management
- Password Validation: Strong password requirements
- Route Protection: All playground routes require authentication
- Input Sanitization: Django's built-in protection against XSS
Each provider has its own Agent class that:
- Handles API communication
- Implements the `_ask()` method for queries
- Provides model selection via the `MODELS` Literal
- Can act as both a responder and a rater
- User submits query via `/api/playground/start/`
- Server creates a job with unique `job_id` and spawns threads for each provider
- Client polls `/api/playground/status/<job_id>/` for updates
- Results stream back as each provider completes
- When all providers finish, 1 credit is deducted
- Query and results are saved to database
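The start/poll flow above can be sketched with the standard library alone. The following is a minimal illustration, assuming an in-memory job store and provider callables that take a message and return text. The names `start_job`, `job_status`, and `_JOBS` are hypothetical and are not taken from the project's code; credit deduction and database storage are left out.

```python
import threading
import uuid
from typing import Callable, Dict

# In-memory job store; a hypothetical stand-in for the server's job registry.
_JOBS: Dict[str, dict] = {}
_LOCK = threading.Lock()


def start_job(message: str, providers: Dict[str, Callable[[str], str]]) -> str:
    """Create a job, spawn one thread per provider, and return the job_id."""
    job_id = uuid.uuid4().hex
    with _LOCK:
        _JOBS[job_id] = {"results": [], "pending": len(providers)}

    def worker(name: str, ask: Callable[[str], str]) -> None:
        try:
            entry = {"provider": name, "text": ask(message)}
        except Exception as exc:
            # Record the failure so the job still completes.
            entry = {"provider": name, "error": str(exc)}
        with _LOCK:
            _JOBS[job_id]["results"].append(entry)
            _JOBS[job_id]["pending"] -= 1

    for name, ask in providers.items():
        threading.Thread(target=worker, args=(name, ask), daemon=True).start()
    return job_id


def job_status(job_id: str) -> dict:
    """Return the results collected so far and whether every provider finished."""
    with _LOCK:
        job = _JOBS[job_id]
        return {"results": list(job["results"]), "done": job["pending"] == 0}
```

A client would call `job_status` repeatedly until `done` is true, rendering each new entry in `results` as it appears.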
- User types follow-up message in result card's input area
- JavaScript builds conversation history from original response + all previous exchanges
- History is formatted and sent as the `guide` parameter to `/api/playground/start/`
- Single agent processes the follow-up with full context
- Client polls `/api/playground/status/<job_id>/` for the response
- Response is displayed inline within the result card
- 1 credit is deducted and balance updates in real-time
- Conversation thread grows with each exchange
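The history-building step is done in JavaScript in the browser, and the README does not give the exact text format. Purely as an illustration of the idea, here is the same logic expressed in Python. The function name `build_followup_guide`, the "User:"/"Assistant:" labels, and the inclusion of the original message are all assumptions.

```python
from typing import List, Tuple


def build_followup_guide(original_message: str,
                         original_response: str,
                         exchanges: List[Tuple[str, str]]) -> str:
    """Flatten a thread into one text block to send as the `guide` parameter.

    `exchanges` holds earlier (follow-up, reply) pairs in chronological order.
    """
    lines = [
        "Conversation so far:",
        f"User: {original_message}",
        f"Assistant: {original_response}",
    ]
    for user_text, agent_text in exchanges:
        lines.append(f"User: {user_text}")
        lines.append(f"Assistant: {agent_text}")
    return "\n".join(lines)
```

The new follow-up text itself would travel in `message`, while the string returned here travels in `guide`, so the agent sees the whole thread on every turn.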
- User submits query with `rated=True`
- Each agent generates its response
- Other enabled agents rate each response concurrently
- Ratings are weighted based on provider quality (OpenAI: 1.10, Anthropic: 1.05, etc.)
- Detailed breakdown with rater names, models, and weights is included
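One plausible way to combine per-rater scores into the overall rating is a weighted mean. The README states that ratings are weighted but not the exact formula, so the sketch below is an assumption; the function name `overall_rating` and the Groq weight of 1.0 in the usage note are hypothetical too.

```python
from typing import Dict, List, Optional


def overall_rating(breakdown: List[Dict]) -> Optional[float]:
    """Weighted mean of rater scores, or None when there are no usable ratings.

    Each item is expected to carry "rating" and "weight" keys, mirroring the
    rating_breakdown entries in the status response.
    """
    total_weight = sum(item["weight"] for item in breakdown)
    if total_weight == 0:
        return None
    weighted_sum = sum(item["rating"] * item["weight"] for item in breakdown)
    return round(weighted_sum / total_weight, 2)
```

For an Anthropic rating of 8 (weight 1.05) and a Groq rating of 9 (assumed weight 1.0), this gives (8 × 1.05 + 9 × 1.0) / 2.05 ≈ 8.49.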
The project includes a comprehensive test suite with 60+ test cases:
# Run all tests
python manage.py test
# Run specific test file
python manage.py test tests.test_auth
# Run specific test method
python manage.py test tests.test_auth.AuthenticationTestCase.test_login_page_loads
# Run with coverage
python run_tests.py coverage
# Run specific test categories
python run_tests.py auth # Authentication tests
python run_tests.py playground # Playground tests
python run_tests.py agents # Agent system tests
tests/
├── test_auth.py # Authentication & user management
├── test_playground.py # Playground functionality
├── test_rating_system.py # Rating system & agents
├── test_followup_conversations.py # Follow-up conversation feature
├── test_api_providers.py # API provider tests
├── test_async_utils.py # Async utility tests
├── __init__.py # Test configuration
└── conftest.py # Pytest configuration
- Authentication Tests: Login, registration, logout, session management
- Playground Tests: UI functionality, agent interactions, error handling
- Agent System Tests: Rating calculations, provider mappings, mock responses
- Follow-Up Conversation Tests: Context history building, credit deduction, multi-turn dialogues, UI elements
- Integration Tests: Complete user flows, cross-component interactions
- Performance Tests: Load times, response times
- Security Tests: CSRF protection, authentication requirements
Current test coverage includes:
- ✅ Login/logout functionality
- ✅ User registration with validation
- ✅ Protected route access
- ✅ Agent system initialization
- ✅ Rating calculations
- ✅ Error handling
- ✅ UI component rendering
- Create Agent Class:
# agents/agent_newprovider.py
from __future__ import annotations

from .agent import Agent
from .annotations import (
    ActorRole, AgentResponse, AnnotatedQueryWithGuide,
    CleanAgentResponse, Literal, QueryGuide
)
from config import NEWPROVIDER_API_KEY
# Also import your provider's SDK client here; ProviderClient below is a placeholder name.


class NewProvider(Agent):
    MODELS = Literal['model-1', 'model-2', 'model-3']

    def __init__(self,
                 model: MODELS | None = None,
                 role: ActorRole | None = None,
                 is_on: bool = False,
                 rating_weight: float = 1.0) -> None:
        super().__init__(
            model=model or 'model-1',
            role=role or 'user',
            is_on=is_on,
            rating_weight=rating_weight
        )
        # Initialize the provider's client (must happen inside __init__, where self exists)
        self.client = ProviderClient(api_key=NEWPROVIDER_API_KEY)

    def _ask(self,
             /,
             message: AnnotatedQueryWithGuide,
             guide: QueryGuide | None = None) -> AgentResponse:
        """Send request to the provider and return raw response."""

    def _clean_response(self, resp: AgentResponse) -> CleanAgentResponse:
        """Extract text content from the raw response."""
- Add to Registry:
# api/views.py
from agents.agent_newprovider import NewProvider
# Add to agents list
NewProvider(is_on=True, rating_weight=1.0),
- Add Tests:
# tests/test_playground.py
from unittest.mock import patch

@patch('agents.agent_newprovider.NewProvider._ask')
def test_new_provider_response(self, mock_ask):
    mock_ask.return_value = "Test response"
    # Test implementation
- Follow PEP 8 guidelines
- Use type hints where appropriate
- Write comprehensive tests for new features
- Document all public methods and classes
POST /api/login/ # User login
POST /api/register/ # User registration
POST /api/logout/ # User logout
GET /api/playground/ # Main playground interface
POST /api/playground/start/ # Start async job
GET /api/playground/status/<job_id>/ # Poll for results
GET /api/playground/history/ # Fetch query history
POST /api/playground/history/clear/ # Clear all history
POST /api/playground/history/delete/<id>/ # Delete specific query
POST /api/playground/start/ Parameters:
- `message` (required): Your query
- `guide` (optional): Guide for steering responses
- `agents[]` (optional): Selected providers (format: "provider::model" or "provider")
- `rated` (optional): Enable rating system
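Each `agents[]` value is either "provider::model" or just "provider". A small parser for that format might look like the following sketch. The function name `parse_agent_spec` is hypothetical, and how the server actually handles a missing model (presumably by falling back to the provider's default) is an assumption.

```python
from typing import Optional, Tuple


def parse_agent_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split "provider::model" into (provider, model); model is None if omitted."""
    provider, _, model = spec.strip().partition("::")
    if not provider:
        raise ValueError(f"missing provider in {spec!r}")
    return provider, (model or None)
```

For example, `parse_agent_spec("openai::gpt-4")` yields `("openai", "gpt-4")`, while `parse_agent_spec("groq")` yields `("groq", None)`.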
Response:
{
  "job_id": "abc123..."
}
GET /api/playground/status/<job_id>/ Response:
{
  "results": [
    {
      "provider": "openai",
      "model": "gpt-4",
      "text": "Response text...",
      "rating": 8.5,
      "rating_breakdown": [
        {
          "rater": "Anthropic",
          "rater_model": "claude-3-5-sonnet",
          "rating": 8,
          "weight": 1.05
        }
      ],
      "rated_by": ["Anthropic", "Groq"],
      "time": 2.34
    }
  ],
  "done": false,
  "credits": 9
}
GET /api/playground/history/ Parameters:
- `limit` (optional): Number of queries to retrieve (default: 25, max: 100)
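A view handling `limit` would typically fall back to the default on bad input and cap the value at the maximum. The helper below is a sketch of that behavior under the documented default (25) and maximum (100). The name `parse_limit` and the lower bound of 1 are assumptions, and the real endpoint may reject invalid values instead of clamping them.

```python
def parse_limit(raw, default: int = 25, maximum: int = 100) -> int:
    """Turn a raw query-string value into a safe history limit."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Missing or non-numeric input falls back to the default.
        return default
    # Clamp into the range [1, maximum].
    return max(1, min(value, maximum))
```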
Response:
{
  "history": [
    {
      "id": 123,
      "timestamp": "2026-01-10T18:00:00Z",
      "message": "What is AI?",
      "guide": "",
      "providers": [{"provider": "openai", "model": "gpt-4"}],
      "providerCount": 1,
      "results": [
        {
          "provider": "openai",
          "model": "gpt-4",
          "text": "AI is...",
          "rating": 8.5,
          "time": 2.34
        }
      ],
      "completed": true
    }
  ]
}
POST /api/playground/history/clear/ Response:
{
  "success": true,
  "deleted": 15
}
POST /api/playground/history/delete/<query_id>/ Response:
{
  "success": true
}
GET/POST /api/buy-tokens/ # Purchase credit packages
- "No enabled/configured providers selected"
  - Check your API keys in `config.py`
  - Ensure providers are enabled in `api/views.py` (set `is_on=True`)
  - Verify API keys are valid and not placeholder "xxx" values
- "Page not found at /api/"
  - Ensure Django server is running
  - Check URL configuration in `agents_django/urls.py`
- "NoReverseMatch" errors
  - Restart Django development server after URL changes
  - Check template URL references
- Registration password errors
  - Password must be at least 8 characters
  - Cannot be too similar to username
  - Cannot be a common password
  - Cannot be entirely numeric
- "You have no credits left"
  - Purchase credits at `/api/buy-tokens/`
  - Or manually add credits via Django admin
- Test failures
  - Ensure test database is properly created
  - Check Django settings for test configuration
  - Verify all dependencies are installed
- Async job not completing
  - Check server logs for provider errors
  - Verify API keys are valid
  - Some providers may time out or fail silently
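For the first issue in the list above, a quick check can flag keys that are still placeholders before you start the server. This is a hedged sketch: the helper name `is_configured` is hypothetical, and treating all-"x" strings and values beginning with "your-" as placeholders is an assumption based on the sample values shown in this README.

```python
def is_configured(api_key) -> bool:
    """Return True when a key looks real rather than empty or a placeholder."""
    if not isinstance(api_key, str):
        return False
    key = api_key.strip().lower()
    if not key:
        return False
    # "xxx"-style values and "your-...-key" samples count as placeholders.
    return set(key) != {"x"} and not key.startswith("your-")
```

Running this over the values in `config.py` shows which enabled providers still need a real key.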
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with Django and modern web technologies
- Inspired by the need to compare multiple LLM providers
- Thanks to all the AI providers for their amazing APIs
- Authentication system powered by Django's built-in auth framework
- Live Demo: http://localhost:8000/api/login/
- Documentation: This README
- Issue Tracker: GitHub Issues
- API Reference: See API section above
Made with ❤️ for the AI community