AggelosAr/agents


Agents Playground

A secure, multi-provider LLM playground with user authentication that allows you to interact with various AI models, compare responses, and get ratings from different agents.

🚀 Features

🔐 Authentication System

  • User Registration & Login: Secure account creation and authentication
  • Session Management: Proper Django session handling with CSRF protection
  • Protected Routes: All playground features require authentication
  • Modern UI: Beautiful login/register pages with gradient backgrounds
  • Password Validation: Django's built-in password strength requirements

🤖 AI Features

  • Multi-Provider Support: Connect with multiple LLM providers (OpenAI, Anthropic, Gemini, DeepSeek, Groq, and more)
  • Model Selection: Choose specific models for each provider
  • Response Comparison: Compare responses from different agents side-by-side
  • Follow-Up Conversations: Continue conversations with individual agents directly from result cards
  • Context-Aware Threads: Each follow-up includes full conversation history for coherent multi-turn dialogues
  • Rating System: Get automatic ratings from other agents on response quality
  • Per-Rater Breakdown: See which agents rated each response and their individual scores
  • Async Job Processing: Non-blocking concurrent requests to multiple providers
  • Real-time Updates: Stream results as they arrive from different providers
  • Modern UI: Clean, responsive interface with Markdown rendering and smooth interactions
  • Django Integration: Built with Django for easy deployment and management

💳 Credit System

  • User Credits: Each user starts with 10 free credits
  • Pay-per-Query: 1 credit deducted per successful query (including follow-ups)
  • Credit Packages: Purchase additional credits (Starter, Professional, Enterprise)
  • Credit Tracking: Real-time credit balance display
  • Query History: Database storage of all queries and results with API-based retrieval

📜 History Management

  • Database-Backed History: All queries stored in database, accessible across devices
  • API-Based Retrieval: History fetched via REST API instead of localStorage
  • Configurable Limit: View last 10, 25, 50, or 100 queries
  • Full Query Context: Access to original messages, guides, providers, and results
  • Clear History: Delete all or individual queries from database
  • Auto-Save: Automatically saves queries with results to database

🧪 Testing Suite

  • Comprehensive Tests: 60+ test cases covering all functionality
  • Authentication Tests: Login, registration, session management
  • Agent System Tests: Rating calculations, provider mappings
  • Follow-Up Conversation Tests: Context history, credit deduction, multi-turn dialogues
  • Integration Tests: End-to-end user flows
  • Performance Tests: Load time and response time validation
  • Security Tests: CSRF protection, authentication requirements

📋 Supported Providers

| Provider | Status | Models |
|---|---|---|
| OpenAI | ✅ | GPT-4, GPT-3.5, O1, O3, and more |
| Anthropic | ✅ | Claude 3.7, Claude 3.5, Claude 4, and more |
| Gemini | ✅ | Gemini 2.5 Flash, Gemini 2.0 Flash, Gemini 1.5 Pro/Flash |
| DeepSeek | ✅ | DeepSeek Chat, DeepSeek Reasoner |
| Perplexity | ✅ | Sonar models |
| Mistral | ✅ | Mixtral, Mistral-large |
| Groq | ✅ | Llama, Mixtral, Gemma models |
| Azure OpenAI | ⚙️ | GPT-4o, GPT-4o-mini |
| Cohere | ⚙️ | Command models |
| Together AI | ⚙️ | Together-large, Together-medium |
| AI21 | ⚙️ | Jurassic models |
| Bedrock | ⚙️ | Amazon Titan, Bedrock-text |
| HuggingFace | ⚙️ | GPT-2, GPT-J, BLOOM, MPT |
| Vertex AI | ⚙️ | Text-bison, Code-bison |

✅ = Enabled by default, ⚙️ = Available but disabled by default

🛠️ Installation

Prerequisites

  • Python 3.10+ (Django 5.2 does not support older Python versions)
  • Django 5.2.6
  • SQLite (included with Django)
  • Required packages: anthropic, openai, google-generativeai, perplexityai, and more (see requirements.txt)

Quick Start

  1. Clone the repository:
git clone <repository-url>
cd agents
  2. Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Configure API keys:
# Edit config.py with your API keys
# Required keys for enabled providers:
OPENAI_API_KEY = "your-openai-key"
ANTHROPIC_API_KEY = "your-anthropic-key"
GEMINI_API_KEY = "your-gemini-key"
DEEPSEEK_API_KEY = "your-deepseek-key"
PERPLEXITY_API_KEY = "your-perplexity-key"
MISTRAL_API_KEY = "your-mistral-key"
GROQ_API_KEY = "your-groq-key"
  5. Run database migrations:
python manage.py migrate
  6. Create a superuser (optional):
python manage.py createsuperuser
  7. Start the server:
python manage.py runserver
  8. Access the application:
    • Open your browser to http://localhost:8000/api/login/
    • Register a new account or log in with existing credentials
    • Start using the playground!

⚙️ Configuration

API Keys Setup

Edit config.py to add your API keys:

# Required for enabled providers
OPENAI_API_KEY = "your-openai-key"
ANTHROPIC_API_KEY = "your-anthropic-key"
GEMINI_API_KEY = "your-gemini-key"
DEEPSEEK_API_KEY = "your-deepseek-key"
PERPLEXITY_API_KEY = "your-perplexity-key"
MISTRAL_API_KEY = "your-mistral-key"
GROQ_API_KEY = "your-groq-key"

# Optional providers (disabled by default)
AZURE_OPENAI_KEY = "your-azure-key"
AZURE_OPENAI_ENDPOINT = "your-azure-endpoint"
COHERE_API_KEY = "your-cohere-key"
TOGETHER_API_KEY = "your-together-key"
AI21_API_KEY = "your-ai21-key"
BEDROCK_API_KEY = "your-bedrock-key"
HF_API_TOKEN = "your-huggingface-token"
VERTEX_PROJECT = "your-vertex-project"
VERTEX_LOCATION = "your-vertex-location"

Django Settings

Key authentication settings in agents_django/settings.py:

# Authentication URLs
LOGIN_URL = '/api/login/'
LOGIN_REDIRECT_URL = '/api/playground/'
LOGOUT_REDIRECT_URL = '/api/login/'

# Password validation
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    # ... more validators
]

📖 Usage

Getting Started

  1. Register an Account:

    • Go to http://localhost:8000/api/register/
    • Fill in username and password
    • Password must be at least 8 characters and pass Django's other validators (not too similar to the username, not a common password, not entirely numeric)
  2. Log in:

    • Use your credentials to log in at http://localhost:8000/api/login/
    • You'll be redirected to the playground
  3. Use the Playground:

    • Check your credits displayed in the top-right corner
    • Enter your message in the main textarea
    • Optional: Add a guide to steer the response
    • Select providers you want to query (or leave empty for all enabled)
    • Enable rating if you want other agents to rate the responses
    • Click "Ask Agents" to get responses (costs 1 credit per query)
    • View results as they stream in from different providers
    • Continue conversations: Use the follow-up input at the bottom of each result card to continue chatting with that specific agent (costs 1 credit per follow-up)
  4. Purchase Credits:

    • Go to http://localhost:8000/api/buy-tokens/
    • Choose a package: Starter (10 credits), Professional (50 credits), or Enterprise (200 credits)
    • Credits are added to your account immediately
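
The credit rules described above (10 free credits, 1 credit per query or follow-up, three package sizes) can be sketched as a small in-memory model. This is only an illustration: `CreditAccount` and `InsufficientCredits` are hypothetical names, and the project stores balances on a Django user profile rather than in a dataclass.

```python
from dataclasses import dataclass

# Package sizes as listed above; prices are not stated in this README.
CREDIT_PACKAGES = {"starter": 10, "professional": 50, "enterprise": 200}
STARTING_CREDITS = 10
COST_PER_QUERY = 1


class InsufficientCredits(Exception):
    """Raised when a query is attempted with an empty balance."""


@dataclass
class CreditAccount:  # hypothetical stand-in for the user profile model
    balance: int = STARTING_CREDITS

    def charge_query(self) -> int:
        """Deduct one credit for a successful query or follow-up."""
        if self.balance < COST_PER_QUERY:
            raise InsufficientCredits("You have no credits left")
        self.balance -= COST_PER_QUERY
        return self.balance

    def purchase(self, package: str) -> int:
        """Add the credits of a named package to the balance."""
        self.balance += CREDIT_PACKAGES[package.lower()]
        return self.balance
```

A new account that runs one query and then buys the Professional package ends with 59 credits.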

Advanced Features

Follow-Up Conversations

  • Each result card has a follow-up input section at the bottom
  • Type your follow-up message and click "Send" to continue the conversation with that specific agent
  • The agent receives full conversation history (original response + all previous exchanges) as context
  • Each follow-up costs 1 credit and updates your balance in real-time
  • Multiple follow-ups create a threaded conversation within the result card
  • Visual separations clearly distinguish between exchanges

Model Selection

  • Click "Edit models" to choose specific models for each selected provider
  • Compare different models from the same provider

Rating System

When "Enable rating" is checked:

  • Other enabled agents will rate each response
  • You'll see an overall rating (1-10 scale)
  • Detailed breakdown shows which agents rated and their individual scores
  • Each rater's model and weight are displayed

Provider Status

  • Enabled agents appear normally in the dropdown
  • Disabled agents are greyed out and italicized
  • Visual indicators help you understand provider availability

🏗️ Architecture

Core Components

  • Authentication System: Django's built-in auth with custom views
  • Credit System: User profiles with credit tracking and purchase flow
  • Agent Registry: Manages all agent instances and their connections
  • Agent Classes: Individual implementations for each provider (14 providers)
  • Rating System: Concurrent rating by multiple agents with weighted scores
  • Async Job System: Thread-based concurrent provider requests
  • Database Models: PlaygroundQuery and ProviderResult for query history
  • Django Views: Web interface and API endpoints
  • Templates: Responsive frontend with Markdown rendering and modern CSS

Security Features

  • CSRF Protection: All forms protected against CSRF attacks
  • Session Security: Proper Django session management
  • Password Validation: Strong password requirements
  • Route Protection: All playground routes require authentication
  • Input Sanitization: Django's built-in template auto-escaping protects against XSS

Agent System

Each provider has its own Agent class that:

  • Handles API communication
  • Implements the _ask() method for queries
  • Provides model selection via MODELS Literal
  • Can act as both a responder and a rater

Async Job Flow

  1. User submits query via /api/playground/start/
  2. Server creates a job with unique job_id and spawns threads for each provider
  3. Client polls /api/playground/status/<job_id>/ for updates
  4. Results stream back as each provider completes
  5. When all providers finish, 1 credit is deducted
  6. Query and results are saved to database
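
The job flow above can be sketched with the standard library alone. This is a minimal sketch, assuming an in-memory job store: `JOBS`, `start_job`, and `job_status` are hypothetical names, providers are plain callables, and credit deduction and database persistence are left out.

```python
import threading
import uuid
from typing import Callable, Dict

# In-memory job store; a hypothetical stand-in for whatever the server uses.
JOBS: Dict[str, dict] = {}
JOBS_LOCK = threading.Lock()


def start_job(message: str, providers: Dict[str, Callable[[str], str]]) -> str:
    """Create a job, spawn one thread per provider, and return the job_id."""
    job_id = uuid.uuid4().hex

    def worker(name: str, ask: Callable[[str], str]) -> None:
        try:
            entry = {"provider": name, "text": ask(message)}
        except Exception as exc:  # a failing provider must not block the job
            entry = {"provider": name, "error": str(exc)}
        with JOBS_LOCK:
            JOBS[job_id]["results"].append(entry)
            JOBS[job_id]["pending"] -= 1

    threads = [
        threading.Thread(target=worker, args=(name, ask), daemon=True)
        for name, ask in providers.items()
    ]
    # Register the job before any thread starts so workers can find it.
    with JOBS_LOCK:
        JOBS[job_id] = {"results": [], "pending": len(threads), "threads": threads}
    for thread in threads:
        thread.start()
    return job_id


def job_status(job_id: str) -> dict:
    """Snapshot of the kind a polling client would receive."""
    with JOBS_LOCK:
        job = JOBS[job_id]
        return {"results": list(job["results"]), "done": job["pending"] == 0}
```

Because each worker records either a result or an error, one slow or failing provider does not prevent the job from reaching `done`.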

Follow-Up Conversation Flow

  1. User types follow-up message in result card's input area
  2. JavaScript builds conversation history from original response + all previous exchanges
  3. History is formatted and sent as guide parameter to /api/playground/start/
  4. Single agent processes the follow-up with full context
  5. Client polls /api/playground/status/<job_id>/ for the response
  6. Response is displayed inline within the result card
  7. 1 credit is deducted and balance updates in real-time
  8. Conversation thread grows with each exchange
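
Step 2 is done in JavaScript in the project; the Python sketch below only illustrates the idea of flattening a thread into text for the `guide` parameter. The function name and the "User:"/"Assistant:" labels are assumptions, not the project's actual format.

```python
from typing import List, Tuple


def build_followup_guide(original_response: str,
                         exchanges: List[Tuple[str, str]]) -> str:
    """Format the thread so far as text for the `guide` parameter.

    `exchanges` holds (user_message, agent_reply) pairs in order.
    The labels used here are illustrative, not the project's actual format.
    """
    lines = ["Previous conversation:", f"Assistant: {original_response}"]
    for user_message, agent_reply in exchanges:
        lines.append(f"User: {user_message}")
        lines.append(f"Assistant: {agent_reply}")
    return "\n".join(lines)
```

The new follow-up text itself would still travel in `message`; only the earlier turns go into `guide`.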

Rating Flow

  1. User submits query with rated=True
  2. Each agent generates its response
  3. Other enabled agents rate each response concurrently
  4. Ratings are weighted based on provider quality (OpenAI: 1.10, Anthropic: 1.05, etc.)
  5. Detailed breakdown with rater names, models, and weights is included
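
One plausible way to combine weighted ratings is a weighted mean, sketched below. The README does not state the exact formula, so treat the arithmetic as an assumption; only the two weights shown are quoted from the list above, and `overall_rating` is a hypothetical helper.

```python
from typing import Dict, List, Optional

# Weights quoted above; weights for other providers are not listed here.
RATING_WEIGHTS = {"OpenAI": 1.10, "Anthropic": 1.05}


def overall_rating(breakdown: List[Dict]) -> Optional[float]:
    """Weighted mean of individual scores, or None when nobody rated."""
    total_weight = sum(item["weight"] for item in breakdown)
    if total_weight == 0:
        return None
    weighted = sum(item["rating"] * item["weight"] for item in breakdown)
    return round(weighted / total_weight, 2)


breakdown = [
    {"rater": "OpenAI", "rating": 9, "weight": RATING_WEIGHTS["OpenAI"]},
    {"rater": "Anthropic", "rating": 8, "weight": RATING_WEIGHTS["Anthropic"]},
]
print(overall_rating(breakdown))  # (9*1.10 + 8*1.05) / 2.15, rounded: 8.51
```

With these weights the OpenAI score pulls the result slightly above the plain average of 8.5.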

🧪 Testing

Running Tests

The project includes a comprehensive test suite with 60+ test cases:

# Run all tests
python manage.py test

# Run specific test file
python manage.py test tests.test_auth

# Run specific test method
python manage.py test tests.test_auth.AuthenticationTestCase.test_login_page_loads

# Run with coverage
python run_tests.py coverage

# Run specific test categories
python run_tests.py auth      # Authentication tests
python run_tests.py playground  # Playground tests
python run_tests.py agents      # Agent system tests

Test Structure

tests/
├── test_auth.py                    # Authentication & user management
├── test_playground.py              # Playground functionality
├── test_rating_system.py           # Rating system & agents
├── test_followup_conversations.py  # Follow-up conversation feature
├── test_api_providers.py           # API provider tests
├── test_async_utils.py             # Async utility tests
├── __init__.py                     # Test configuration
└── conftest.py                     # Pytest configuration

Test Categories

  • Authentication Tests: Login, registration, logout, session management
  • Playground Tests: UI functionality, agent interactions, error handling
  • Agent System Tests: Rating calculations, provider mappings, mock responses
  • Follow-Up Conversation Tests: Context history building, credit deduction, multi-turn dialogues, UI elements
  • Integration Tests: Complete user flows, cross-component interactions
  • Performance Tests: Load times, response times
  • Security Tests: CSRF protection, authentication requirements

Test Results

Current test coverage includes:

  • ✅ Login/logout functionality
  • ✅ User registration with validation
  • ✅ Protected route access
  • ✅ Agent system initialization
  • ✅ Rating calculations
  • ✅ Error handling
  • ✅ UI component rendering

🔧 Development

Adding a New Provider

  1. Create Agent Class:
# agents/agent_newprovider.py
from __future__ import annotations

from .agent import Agent
from .annotations import (
    ActorRole, AgentResponse, AnnotatedQueryWithGuide,
    CleanAgentResponse, Literal, QueryGuide
)
from config import NEWPROVIDER_API_KEY
# ProviderClient is a placeholder: import the client class from the provider's SDK.

class NewProvider(Agent):
    MODELS = Literal['model-1', 'model-2', 'model-3']

    def __init__(self,
                 model: MODELS | None = None,
                 role: ActorRole | None = None,
                 is_on: bool = False,
                 rating_weight: float = 1.0) -> None:

        super().__init__(
            model=model or 'model-1',
            role=role or 'user',
            is_on=is_on,
            rating_weight=rating_weight
        )
        # Initialize the provider's client (needs `self`, so it belongs in __init__)
        self.client = ProviderClient(api_key=NEWPROVIDER_API_KEY)

    def _ask(self,
             /,
             message: AnnotatedQueryWithGuide,
             guide: QueryGuide | None = None) -> AgentResponse:
        """Send request to the provider and return raw response."""
        raise NotImplementedError  # replace with the provider call

    def _clean_response(self, resp: AgentResponse) -> CleanAgentResponse:
        """Extract text content from the raw response."""
        raise NotImplementedError  # replace with response parsing
  2. Add to Registry:
# api/views.py
from agents.agent_newprovider import NewProvider

# Add to agents list
NewProvider(is_on=True, rating_weight=1.0),
  3. Add Tests:
# tests/test_playground.py
from unittest.mock import patch

@patch('agents.agent_newprovider.NewProvider._ask')
def test_new_provider_response(self, mock_ask):
    mock_ask.return_value = "Test response"
    # Test implementation

Code Style

  • Follow PEP 8 guidelines
  • Use type hints where appropriate
  • Write comprehensive tests for new features
  • Document all public methods and classes

📚 API Reference

Authentication Endpoints

POST /api/login/          # User login
POST /api/register/       # User registration
POST /api/logout/         # User logout

Playground Endpoints

GET  /api/playground/                        # Main playground interface
POST /api/playground/start/                  # Start async job
GET  /api/playground/status/<job_id>/        # Poll for results
GET  /api/playground/history/                # Fetch query history
POST /api/playground/history/clear/          # Clear all history
POST /api/playground/history/delete/<id>/    # Delete specific query

POST /api/playground/start/ Parameters:

  • message (required): Your query
  • guide (optional): Guide for steering responses
  • agents[] (optional): Selected providers (format: "provider::model" or "provider")
  • rated (optional): Enable rating system
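
The `agents[]` format can be illustrated with two small helpers. This is a sketch under assumptions: the `agents[]` name suggests a form-encoded body with one field per selection, but the README does not confirm the encoding, and the value sent for `rated` is a guess. `parse_agent_spec` and `encode_start_form` are hypothetical names.

```python
from typing import Iterable, List, Optional, Tuple
from urllib.parse import urlencode


def parse_agent_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split "provider::model" into (provider, model); model is None if omitted."""
    provider, sep, model = spec.partition("::")
    model = model.strip() if sep else ""
    return provider.strip(), model or None


def encode_start_form(message: str, guide: str = "",
                      agents: Iterable[str] = (), rated: bool = False) -> str:
    """Build a form-encoded body with one agents[] field per selection."""
    fields: List[Tuple[str, str]] = [("message", message)]
    if guide:
        fields.append(("guide", guide))
    fields.extend(("agents[]", spec) for spec in agents)
    if rated:
        fields.append(("rated", "true"))  # assumed encoding of the flag
    return urlencode(fields)
```

A real request would also need the session cookie and CSRF token, since the playground routes require authentication.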

Response:

{
  "job_id": "abc123..."
}

GET /api/playground/status/<job_id>/ Response:

{
  "results": [
    {
      "provider": "openai",
      "model": "gpt-4",
      "text": "Response text...",
      "rating": 8.5,
      "rating_breakdown": [
        {
          "rater": "Anthropic",
          "rater_model": "claude-3-5-sonnet",
          "rating": 8,
          "weight": 1.05
        }
      ],
      "rated_by": ["Anthropic", "Groq"],
      "time": 2.34
    }
  ],
  "done": false,
  "credits": 9
}
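
A client-side polling loop over this payload might look like the sketch below. It assumes the `results` list only grows between polls, which the README implies but does not guarantee; `poll_until_done` and its parameters are hypothetical, and the HTTP call is left to a caller-supplied function.

```python
import time
from typing import Callable, Dict


def poll_until_done(fetch_status: Callable[[], Dict],
                    on_result: Callable[[Dict], None],
                    interval: float = 1.0,
                    max_polls: int = 120) -> bool:
    """Poll a status endpoint, reporting each new result exactly once.

    `fetch_status` is a caller-supplied function returning the decoded JSON
    shown above. Returns True once "done" is set, False if polling gave up.
    """
    seen = 0
    for _ in range(max_polls):
        status = fetch_status()
        results = status.get("results", [])
        for result in results[seen:]:  # only entries not yet reported
            on_result(result)
        seen = len(results)
        if status.get("done"):
            return True
        time.sleep(interval)
    return False
```

Capping the number of polls keeps the client from waiting forever when a provider never finishes.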

GET /api/playground/history/ Parameters:

  • limit (optional): Number of queries to retrieve (default: 25, max: 100)
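
The default and maximum can be expressed as a small clamping helper. This is a sketch: `parse_history_limit` is a hypothetical name, and falling back to the default on invalid or non-positive input is an assumption about server behaviour.

```python
from typing import Optional

DEFAULT_LIMIT = 25
MAX_LIMIT = 100


def parse_history_limit(raw: Optional[str]) -> int:
    """Turn the raw `limit` query value into a number between 1 and MAX_LIMIT."""
    try:
        limit = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_LIMIT  # missing or non-numeric value
    if limit < 1:
        return DEFAULT_LIMIT  # assumed handling of zero or negative values
    return min(limit, MAX_LIMIT)
```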

Response:

{
  "history": [
    {
      "id": 123,
      "timestamp": "2026-01-10T18:00:00Z",
      "message": "What is AI?",
      "guide": "",
      "providers": [{"provider": "openai", "model": "gpt-4"}],
      "providerCount": 1,
      "results": [
        {
          "provider": "openai",
          "model": "gpt-4",
          "text": "AI is...",
          "rating": 8.5,
          "time": 2.34
        }
      ],
      "completed": true
    }
  ]
}

POST /api/playground/history/clear/ Response:

{
  "success": true,
  "deleted": 15
}

POST /api/playground/history/delete/<query_id>/ Response:

{
  "success": true
}

Credit Management Endpoint

GET/POST /api/buy-tokens/  # Purchase credit packages

🐛 Troubleshooting

Common Issues

  1. "No enabled/configured providers selected"

    • Check your API keys in config.py
    • Ensure providers are enabled in api/views.py (set is_on=True)
    • Verify API keys are valid and not placeholder "xxx" values
  2. "Page not found at /api/"

    • Ensure Django server is running
    • Check URL configuration in agents_django/urls.py
  3. "NoReverseMatch" errors

    • Restart Django development server after URL changes
    • Check template URL references
  4. Registration password errors

    • Password must be at least 8 characters
    • Cannot be too similar to username
    • Cannot be a common password
    • Cannot be entirely numeric
  5. "You have no credits left"

    • Purchase credits at /api/buy-tokens/
    • Or manually add credits via Django admin
  6. Test failures

    • Ensure test database is properly created
    • Check Django settings for test configuration
    • Verify all dependencies are installed
  7. Async job not completing

    • Check server logs for provider errors
    • Verify API keys are valid
    • Some providers may time out or fail silently
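
For the first issue in this list, a quick check for empty or placeholder keys can save a debugging round. The helper below is a hypothetical sketch, not part of the project; it treats values starting with "xxx" or "your-" (the sample values used in this README) as placeholders.

```python
from typing import Dict, List

# Prefixes of the sample values shown in this README.
PLACEHOLDER_MARKERS = ("xxx", "your-")


def missing_keys(settings: Dict[str, str]) -> List[str]:
    """Return the names of API keys that are empty or still placeholders."""
    bad = []
    for name, value in settings.items():
        text = (value or "").strip().lower()
        if not text or any(text.startswith(m) for m in PLACEHOLDER_MARKERS):
            bad.append(name)
    return sorted(bad)
```

It could be fed from `config.py`, for example with `{k: v for k, v in vars(config).items() if k.endswith("_API_KEY")}`.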

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built with Django and modern web technologies
  • Inspired by the need to compare multiple LLM providers
  • Thanks to all the AI providers for their amazing APIs
  • Authentication system powered by Django's built-in auth framework

🔗 Links

  • Local instance: http://localhost:8000/api/login/ (available once the development server is running)
  • Documentation: This README
  • Issue Tracker: GitHub Issues
  • API Reference: See API section above

Made with ❤️ for the AI community

About

An attempt at an agentic Python aggregator.
