Agents Playground

A secure, multi-provider LLM playground with user authentication that allows you to interact with various AI models, compare responses, and get ratings from different agents.

🚀 Features

🔐 Authentication System

User Registration & Login: Secure account creation and authentication
Session Management: Proper Django session handling with CSRF protection
Protected Routes: All playground features require authentication
Modern UI: Beautiful login/register pages with gradient backgrounds
Password Validation: Django's built-in password strength requirements

🤖 AI Features

Multi-Provider Support: Connect with multiple LLM providers (OpenAI, Anthropic, Gemini, DeepSeek, Groq, and more)
Model Selection: Choose specific models for each provider
Response Comparison: Compare responses from different agents side-by-side
Follow-Up Conversations: Continue conversations with individual agents directly from result cards
Context-Aware Threads: Each follow-up includes full conversation history for coherent multi-turn dialogues
Rating System: Get automatic ratings from other agents on response quality
Per-Rater Breakdown: See which agents rated each response and their individual scores
Async Job Processing: Non-blocking concurrent requests to multiple providers
Real-time Updates: Stream results as they arrive from different providers
Modern UI: Clean, responsive interface with Markdown rendering and smooth interactions
Django Integration: Built with Django for easy deployment and management

💳 Credit System

User Credits: Each user starts with 10 free credits
Pay-per-Query: 1 credit deducted per successful query (including follow-ups)
Credit Packages: Purchase additional credits (Starter, Professional, Enterprise)
Credit Tracking: Real-time credit balance display
Query History: Database storage of all queries and results with API-based retrieval

📜 History Management

Database-Backed History: All queries stored in database, accessible across devices
API-Based Retrieval: History fetched via REST API instead of localStorage
Configurable Limit: View last 10, 25, 50, or 100 queries
Full Query Context: Access to original messages, guides, providers, and results
Clear History: Delete all or individual queries from database
Auto-Save: Automatically saves queries with results to database

🧪 Testing Suite

Comprehensive Tests: 60+ test cases covering all functionality
Authentication Tests: Login, registration, session management
Agent System Tests: Rating calculations, provider mappings
Follow-Up Conversation Tests: Context history, credit deduction, multi-turn dialogues
Integration Tests: End-to-end user flows
Performance Tests: Load time and response time validation
Security Tests: CSRF protection, authentication requirements

📋 Supported Providers

Provider	Status	Models
OpenAI	✅	GPT-4, GPT-3.5, O1, O3, and more
Anthropic	✅	Claude 3.7, Claude 3.5, Claude 4, and more
Gemini	✅	Gemini 2.5 Flash, Gemini 2.0 Flash, Gemini 1.5 Pro/Flash
DeepSeek	✅	DeepSeek Chat, DeepSeek Reasoner
Perplexity	✅	Sonar models
Mistral	✅	Mixtral, Mistral-large
Groq	✅	Llama, Mixtral, Gemma models
Azure OpenAI	⚙️	GPT-4o, GPT-4o-mini
Cohere	⚙️	Command models
Together AI	⚙️	Together-large, Together-medium
AI21	⚙️	Jurassic models
Bedrock	⚙️	Amazon Titan, Bedrock-text
HuggingFace	⚙️	GPT-2, GPT-J, BLOOM, MPT
Vertex AI	⚙️	Text-bison, Code-bison

✅ = Enabled by default, ⚙️ = Available but disabled by default

🛠️ Installation

Prerequisites

Python 3.10+ (Django 5.2 does not support older Python versions)
Django 5.2.6
SQLite (included with Django)
Required packages: anthropic, openai, google-generativeai, perplexityai, and more (see requirements.txt)

Quick Start

Clone the repository:

git clone <repository-url>
cd agents

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Configure API keys:

# Edit config.py with your API keys
# Required keys for enabled providers:
OPENAI_API_KEY = "your-openai-key"
ANTHROPIC_API_KEY = "your-anthropic-key"
GEMINI_API_KEY = "your-gemini-key"
DEEPSEEK_API_KEY = "your-deepseek-key"
PERPLEXITY_API_KEY = "your-perplexity-key"
MISTRAL_API_KEY = "your-mistral-key"
GROQ_API_KEY = "your-groq-key"

Run database migrations:

python manage.py migrate

Create a superuser (optional):

python manage.py createsuperuser

Start the server:

python manage.py runserver

Access the application:
- Open your browser to http://localhost:8000/api/login/
- Register a new account or log in with existing credentials
- Start using the playground!

⚙️ Configuration

API Keys Setup

Edit config.py to add your API keys:

# Required for enabled providers
OPENAI_API_KEY = "your-openai-key"
ANTHROPIC_API_KEY = "your-anthropic-key"
GEMINI_API_KEY = "your-gemini-key"
DEEPSEEK_API_KEY = "your-deepseek-key"
PERPLEXITY_API_KEY = "your-perplexity-key"
MISTRAL_API_KEY = "your-mistral-key"
GROQ_API_KEY = "your-groq-key"

# Optional providers (disabled by default)
AZURE_OPENAI_KEY = "your-azure-key"
AZURE_OPENAI_ENDPOINT = "your-azure-endpoint"
COHERE_API_KEY = "your-cohere-key"
TOGETHER_API_KEY = "your-together-key"
AI21_API_KEY = "your-ai21-key"
BEDROCK_API_KEY = "your-bedrock-key"
HF_API_TOKEN = "your-huggingface-token"
VERTEX_PROJECT = "your-vertex-project"
VERTEX_LOCATION = "your-vertex-location"
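
A provider whose key is empty or still a placeholder cannot answer queries, so it can help to check the keys before enabling providers. The following is a minimal sketch, not part of the project: `find_unconfigured_keys` is a hypothetical helper, and it assumes placeholders are either empty, "xxx", or the "your-..." template values shown above.

```python
def find_unconfigured_keys(settings):
    """Return the names of keys that are empty or still look like placeholders."""
    missing = []
    for name, value in settings.items():
        normalized = (value or "").strip().lower()
        # Assumption: placeholders are empty, "xxx", or "your-..." template values.
        if not normalized or normalized == "xxx" or normalized.startswith("your-"):
            missing.append(name)
    return sorted(missing)


example = {
    "OPENAI_API_KEY": "sk-example",             # looks configured
    "ANTHROPIC_API_KEY": "your-anthropic-key",  # template value left in place
    "GROQ_API_KEY": "xxx",                      # placeholder
}
print(find_unconfigured_keys(example))
```

Running a check like this at startup would surface the "No enabled/configured providers selected" error earlier.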

Django Settings

Key authentication settings in agents_django/settings.py:

# Authentication URLs
LOGIN_URL = '/api/login/'
LOGIN_REDIRECT_URL = '/api/playground/'
LOGOUT_REDIRECT_URL = '/api/login/'

# Password validation
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    # ... more validators
]

📖 Usage

Getting Started

Register an Account:
- Go to http://localhost:8000/api/register/
- Fill in username and password
- Password must be at least 8 characters and pass Django's validators: not too similar to the username, not a common password, and not entirely numeric
Login:
- Use your credentials to log in at http://localhost:8000/api/login/
- You'll be redirected to the playground
Use the Playground:
- Check your credits displayed in the top-right corner
- Enter your message in the main textarea
- Optional: Add a guide to steer the response
- Select providers you want to query (or leave empty for all enabled)
- Enable rating if you want other agents to rate the responses
- Click "Ask Agents" to get responses (costs 1 credit per query)
- View results as they stream in from different providers
- Continue conversations: Use the follow-up input at the bottom of each result card to continue chatting with that specific agent (costs 1 credit per follow-up)
Purchase Credits:
- Go to http://localhost:8000/api/buy-tokens/
- Choose a package: Starter (10 credits), Professional (50 credits), or Enterprise (200 credits)
- Credits are added to your account immediately
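
The credit rules above (10 free credits, 1 credit per successful query or follow-up, fixed package sizes) can be sketched as a small in-memory model. This is an illustration only: `CreditAccount` and `PACKAGES` are hypothetical names, and the project itself stores credits on user profiles in the database.

```python
from dataclasses import dataclass

# Package sizes as listed above; prices are not stated in this README.
PACKAGES = {"starter": 10, "professional": 50, "enterprise": 200}


@dataclass
class CreditAccount:
    credits: int = 10  # each user starts with 10 free credits

    def charge_query(self, succeeded: bool) -> bool:
        """Deduct 1 credit for a successful query; return False if no credits are left."""
        if self.credits < 1:
            return False
        if succeeded:
            self.credits -= 1
        return True

    def purchase(self, package: str) -> int:
        """Add a package's credits to the balance and return the new balance."""
        self.credits += PACKAGES[package]
        return self.credits


account = CreditAccount()
account.charge_query(succeeded=True)  # one query
account.charge_query(succeeded=True)  # one follow-up
account.purchase("starter")
print(account.credits)  # 18
```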

Advanced Features

Follow-Up Conversations

Each result card has a follow-up input section at the bottom
Type your follow-up message and click "Send" to continue the conversation with that specific agent
The agent receives full conversation history (original response + all previous exchanges) as context
Each follow-up costs 1 credit and updates your balance in real-time
Multiple follow-ups create a threaded conversation within the result card
Visual separations clearly distinguish between exchanges

Model Selection

Click "Edit models" to choose specific models for each selected provider
Compare different models from the same provider

Rating System

When "Enable rating" is checked:

Other enabled agents will rate each response
You'll see an overall rating (1-10 scale)
Detailed breakdown shows which agents rated and their individual scores
Each rater's model and weight are displayed

Provider Status

Enabled agents appear normally in the dropdown
Disabled agents are greyed out and italicized
Visual indicators help you understand provider availability

🏗️ Architecture

Core Components

Authentication System: Django's built-in auth with custom views
Credit System: User profiles with credit tracking and purchase flow
Agent Registry: Manages all agent instances and their connections
Agent Classes: Individual implementations for each provider (14 providers)
Rating System: Concurrent rating by multiple agents with weighted scores
Async Job System: Thread-based concurrent provider requests
Database Models: PlaygroundQuery and ProviderResult for query history
Django Views: Web interface and API endpoints
Templates: Responsive frontend with Markdown rendering and modern CSS

Security Features

CSRF Protection: All forms protected against CSRF attacks
Session Security: Proper Django session management
Password Validation: Strong password requirements
Route Protection: All playground routes require authentication
Input Sanitization: Django's built-in protection against XSS

Agent System

Each provider has its own Agent class that:

Handles API communication
Implements the _ask() method for queries
Provides model selection via MODELS Literal
Can act as both a responder and a rater

Async Job Flow

User submits query via /api/playground/start/
Server creates a job with unique job_id and spawns threads for each provider
Client polls /api/playground/status/<job_id>/ for updates
Results stream back as each provider completes
When all providers finish, 1 credit is deducted
Query and results are saved to database
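
The flow above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the project's code: `JOBS`, `start_job`, and `job_status` are hypothetical names, the providers are stand-in callables, and credit deduction and database storage are left out.

```python
import threading
import uuid

JOBS = {}
JOBS_LOCK = threading.Lock()


def start_job(message, providers):
    """Create a job and spawn one thread per provider; returns (job_id, threads)."""
    job_id = uuid.uuid4().hex
    with JOBS_LOCK:
        JOBS[job_id] = {"results": [], "pending": len(providers), "done": not providers}

    def worker(name, ask):
        try:
            text = ask(message)
        except Exception as exc:  # a failing provider should not block the job
            text = f"error: {exc}"
        with JOBS_LOCK:
            job = JOBS[job_id]
            job["results"].append({"provider": name, "text": text})
            job["pending"] -= 1
            job["done"] = job["pending"] == 0

    threads = [threading.Thread(target=worker, args=(name, ask), daemon=True)
               for name, ask in providers.items()]
    for thread in threads:
        thread.start()
    return job_id, threads


def job_status(job_id):
    """Snapshot of what a status endpoint could return while polling."""
    with JOBS_LOCK:
        job = JOBS[job_id]
        return {"results": list(job["results"]), "done": job["done"]}


# Stand-in providers; real agents would call their APIs here.
providers = {"openai": lambda m: f"openai says: {m}", "groq": lambda m: f"groq says: {m}"}
job_id, threads = start_job("What is AI?", providers)
for thread in threads:
    thread.join()
print(job_status(job_id)["done"])  # True
```

Because each worker appends its result as soon as it finishes, a client that polls before all threads are done sees partial results, which is what makes the UI appear to stream.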

Follow-Up Conversation Flow

User types follow-up message in result card's input area
JavaScript builds conversation history from original response + all previous exchanges
History is formatted and sent as guide parameter to /api/playground/start/
Single agent processes the follow-up with full context
Client polls /api/playground/status/<job_id>/ for the response
Response is displayed inline within the result card
1 credit is deducted and balance updates in real-time
Conversation thread grows with each exchange
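
In the project the history is built in JavaScript in the browser. The Python sketch below only mirrors the idea of flattening the thread into one guide string. The function name `build_followup_guide` and the "User:" / "Assistant:" line format are assumptions made for illustration; the actual format is not documented here.

```python
def build_followup_guide(original_message, original_response, exchanges):
    """Format the thread so far as a single guide string.

    exchanges is a list of (user_message, agent_reply) pairs, oldest first.
    """
    lines = [
        "Conversation so far:",
        f"User: {original_message}",
        f"Assistant: {original_response}",
    ]
    for user_message, agent_reply in exchanges:
        lines.append(f"User: {user_message}")
        lines.append(f"Assistant: {agent_reply}")
    return "\n".join(lines)


guide = build_followup_guide(
    "What is AI?",
    "AI is the study of intelligent agents.",
    [("Give an example.", "A chess engine is one example.")],
)
print(guide)
```

The resulting string would be sent as the guide parameter, with the new follow-up text as the message.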

Rating Flow

User submits query with rated=True
Each agent generates its response
Other enabled agents rate each response concurrently
Ratings are weighted based on provider quality (OpenAI: 1.10, Anthropic: 1.05, etc.)
Detailed breakdown with rater names, models, and weights is included
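
One plausible way to combine the individual scores is a weighted mean. The sketch below assumes that formula (the README does not state the exact one) and uses the field names from the rating_breakdown entries in the API Reference; `overall_rating` is a hypothetical helper.

```python
from typing import Optional


def overall_rating(breakdown: list) -> Optional[float]:
    """Weighted mean of rater scores; None when nobody rated the response."""
    total_weight = sum(entry["weight"] for entry in breakdown)
    if total_weight == 0:
        return None
    weighted_sum = sum(entry["rating"] * entry["weight"] for entry in breakdown)
    return round(weighted_sum / total_weight, 2)


breakdown = [
    {"rater": "OpenAI", "rating": 9, "weight": 1.10},
    {"rater": "Anthropic", "rating": 8, "weight": 1.05},
]
print(overall_rating(breakdown))  # 8.51
```

Here (9 × 1.10 + 8 × 1.05) / (1.10 + 1.05) = 18.3 / 2.15 ≈ 8.51, so the higher-weighted rater pulls the result slightly above the plain average of 8.5.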

🧪 Testing

Running Tests

The project includes a comprehensive test suite with 60+ test cases:

# Run all tests
python manage.py test

# Run specific test file
python manage.py test tests.test_auth

# Run specific test method
python manage.py test tests.test_auth.AuthenticationTestCase.test_login_page_loads

# Run with coverage
python run_tests.py coverage

# Run specific test categories
python run_tests.py auth      # Authentication tests
python run_tests.py playground  # Playground tests
python run_tests.py agents      # Agent system tests

Test Structure

tests/
├── test_auth.py                    # Authentication & user management
├── test_playground.py              # Playground functionality
├── test_rating_system.py           # Rating system & agents
├── test_followup_conversations.py  # Follow-up conversation feature
├── test_api_providers.py           # API provider tests
├── test_async_utils.py             # Async utility tests
├── __init__.py                     # Test configuration
└── conftest.py                     # Pytest configuration

Test Categories

Authentication Tests: Login, registration, logout, session management
Playground Tests: UI functionality, agent interactions, error handling
Agent System Tests: Rating calculations, provider mappings, mock responses
Follow-Up Conversation Tests: Context history building, credit deduction, multi-turn dialogues, UI elements
Integration Tests: Complete user flows, cross-component interactions
Performance Tests: Load times, response times
Security Tests: CSRF protection, authentication requirements

Test Results

Current test coverage includes:

✅ Login/logout functionality
✅ User registration with validation
✅ Protected route access
✅ Agent system initialization
✅ Rating calculations
✅ Error handling
✅ UI component rendering

🔧 Development

Adding a New Provider

Create Agent Class:

# agents/agent_newprovider.py
from __future__ import annotations

from .agent import Agent
from .annotations import (
    ActorRole, AgentResponse, AnnotatedQueryWithGuide, 
    CleanAgentResponse, Literal, QueryGuide
)
from config import NEWPROVIDER_API_KEY

class NewProvider(Agent):
    MODELS = Literal['model-1', 'model-2', 'model-3']

    def __init__(self,
                 model: MODELS | None = None,
                 role: ActorRole | None = None,
                 is_on: bool = False,
                 rating_weight: float = 1.0) -> None:

        super().__init__(
            model=model or 'model-1',
            role=role or 'user',
            is_on=is_on,
            rating_weight=rating_weight
        )
        # Initialize the provider's client here, not at class level, where
        # `self` does not exist. ProviderClient is a placeholder for the
        # client class from the provider's SDK; import it at the top of the file.
        self.client = ProviderClient(api_key=NEWPROVIDER_API_KEY)
    
    def _ask(self,
             /,
             message: AnnotatedQueryWithGuide,
             guide: QueryGuide | None = None) -> AgentResponse:
        """Send request to the provider and return raw response."""
    
    def _clean_response(self, resp: AgentResponse) -> CleanAgentResponse:
        """Extract text content from the raw response."""

Add to Registry:

# api/views.py
from agents.agent_newprovider import NewProvider

# Add to agents list
NewProvider(is_on=True, rating_weight=1.0),

Add Tests:

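# tests/test_playground.py
from unittest.mock import patch
from django.test import TestCase

class NewProviderTests(TestCase):  # class name is illustrative
    @patch('agents.agent_newprovider.NewProvider._ask')
    def test_new_provider_response(self, mock_ask):
        mock_ask.return_value = "Test response"
        # Test implementation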

Code Style

Follow PEP 8 guidelines
Use type hints where appropriate
Write comprehensive tests for new features
Document all public methods and classes

📚 API Reference

Authentication Endpoints

POST /api/login/          # User login
POST /api/register/       # User registration
POST /api/logout/         # User logout

Playground Endpoints

GET  /api/playground/                        # Main playground interface
POST /api/playground/start/                  # Start async job
GET  /api/playground/status/<job_id>/        # Poll for results
GET  /api/playground/history/                # Fetch query history
POST /api/playground/history/clear/          # Clear all history
POST /api/playground/history/delete/<id>/    # Delete specific query

POST /api/playground/start/ Parameters:

message (required): Your query
guide (optional): Guide for steering responses
agents[] (optional): Selected providers (format: "provider::model" or "provider")
rated (optional): Enable rating system

Response:

{
  "job_id": "abc123..."
}
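
Each agents[] value is either "provider::model" or just "provider". A minimal parsing sketch follows; `parse_agent_spec` is a hypothetical helper, and treating an omitted model as None (meaning "use the provider's default") is an assumption.

```python
def parse_agent_spec(spec):
    """Split "provider::model" into (provider, model); model is None when omitted."""
    provider, sep, model = spec.partition("::")
    provider = provider.strip()
    if not provider:
        raise ValueError(f"missing provider in {spec!r}")
    model = model.strip() if sep else ""
    return provider, (model or None)


print(parse_agent_spec("openai::gpt-4"))  # ('openai', 'gpt-4')
print(parse_agent_spec("anthropic"))      # ('anthropic', None)
```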

GET /api/playground/status/<job_id>/ Response:

{
  "results": [
    {
      "provider": "openai",
      "model": "gpt-4",
      "text": "Response text...",
      "rating": 8.5,
      "rating_breakdown": [
        {
          "rater": "Anthropic",
          "rater_model": "claude-3-5-sonnet",
          "rating": 8,
          "weight": 1.05
        }
      ],
      "rated_by": ["Anthropic", "Groq"],
      "time": 2.34
    }
  ],
  "done": false,
  "credits": 9
}
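
A client keeps requesting this endpoint until "done" is true. The loop can be sketched independently of any HTTP library by passing in a function that fetches the status. `poll_until_done` and its parameters are hypothetical, and the fake responses below only imitate the payload shape shown above.

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=120.0, on_update=None):
    """Call fetch_status() until it reports done, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    seen = 0
    while True:
        status = fetch_status()
        results = status.get("results", [])
        if on_update and len(results) > seen:
            on_update(results[seen:])  # only the newly arrived results
            seen = len(results)
        if status.get("done"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish in time")
        time.sleep(interval)


# Fake endpoint for illustration: the job finishes on the third poll.
responses = iter([
    {"results": [], "done": False, "credits": 10},
    {"results": [{"provider": "openai"}], "done": False, "credits": 10},
    {"results": [{"provider": "openai"}, {"provider": "groq"}], "done": True, "credits": 9},
])
final = poll_until_done(lambda: next(responses), interval=0.01)
print(final["credits"])  # 9
```

In a real client, fetch_status would issue a GET to /api/playground/status/<job_id>/ with the session cookie and return the decoded JSON.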

GET /api/playground/history/ Parameters:

limit (optional): Number of queries to retrieve (default: 25, max: 100)

Response:

{
  "history": [
    {
      "id": 123,
      "timestamp": "2026-01-10T18:00:00Z",
      "message": "What is AI?",
      "guide": "",
      "providers": [{"provider": "openai", "model": "gpt-4"}],
      "providerCount": 1,
      "results": [
        {
          "provider": "openai",
          "model": "gpt-4",
          "text": "AI is...",
          "rating": 8.5,
          "time": 2.34
        }
      ],
      "completed": true
    }
  ]
}
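
The limit parameter has a default of 25 and a maximum of 100. A sketch of how such a value could be validated is below. `parse_limit` is a hypothetical helper, and falling back to the default on invalid input and enforcing a minimum of 1 are assumptions, not documented behavior.

```python
DEFAULT_LIMIT = 25
MAX_LIMIT = 100


def parse_limit(raw):
    """Turn the raw ?limit= value into an int between 1 and MAX_LIMIT."""
    try:
        limit = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_LIMIT  # missing or non-numeric value
    return max(1, min(limit, MAX_LIMIT))


print(parse_limit(None), parse_limit("50"), parse_limit("500"))  # 25 50 100
```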

POST /api/playground/history/clear/ Response:

{
  "success": true,
  "deleted": 15
}

POST /api/playground/history/delete/<query_id>/ Response:

{
  "success": true
}

Credit Management Endpoint

GET/POST /api/buy-tokens/  # Purchase credit packages

🐛 Troubleshooting

Common Issues

"No enabled/configured providers selected"
- Check your API keys in config.py
- Ensure providers are enabled in api/views.py (set is_on=True)
- Verify API keys are valid and not placeholder "xxx" values
"Page not found at /api/"
- Ensure Django server is running
- Check URL configuration in agents_django/urls.py
"NoReverseMatch" errors
- Restart Django development server after URL changes
- Check template URL references
Registration password errors
- Password must be at least 8 characters
- Cannot be too similar to username
- Cannot be a common password
- Cannot be entirely numeric
"You have no credits left"
- Purchase credits at /api/buy-tokens/
- Or manually add credits via Django admin
Test failures
- Ensure test database is properly created
- Check Django settings for test configuration
- Verify all dependencies are installed
Async job not completing
- Check server logs for provider errors
- Verify API keys are valid
- Some providers may time out or fail silently

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built with Django and modern web technologies
Inspired by the need to compare multiple LLM providers
Thanks to all the AI providers for their amazing APIs
Authentication system powered by Django's built-in auth framework

🔗 Links

Local instance: http://localhost:8000/api/login/ (available once the development server is running)
Documentation: This README
Issue Tracker: GitHub Issues
API Reference: See API section above

Made with ❤️ for the AI community
