Cloud9 x JetBrains Hackathon - Track 2: Automated Scouting Report Generator
Professional League of Legends esports scouting platform that transforms GRID API match data into actionable strategic intelligence.
Key Philosophy: Data-driven intelligence without manual VOD review.
Impact: 6 hours of manual analysis reduced to 2 minutes of AI-powered scouting reports.
Dataset: 15,628+ player performances | 55 teams | 6,740 team compositions analyzed
Status: Production-ready with comprehensive test coverage and documentation
- Automated Scouting Reports: Generate comprehensive opponent analysis in seconds
- Multi-Dimensional Playstyle Analysis: 6-dimensional metrics (aggression, farm priority, teamfight, roaming, early impact, late scaling)
- Threat Grading System: S/A/B/C/D player rankings based on percentile benchmarks across 5 roles
- Win Condition Detection: Automatic pattern recognition (early game dominance, late scaling, carry dependency, vision control)
- Counter-Strategy Generation: Data-driven recommendations to exploit opponent weaknesses
- Champion Pool Analysis: Player tendencies, pick rates, and signature champions
- SoloQ Integration: Real-time Riot API integration for pocket picks and matchup trends
- LLM-Powered Reports: Ollama integration (Qwen 2.5 7B) for natural language strategic insights
- RAG Knowledge Base: Retrieval-augmented generation with ChromaDB and semantic search (4 collections, 25+ contexts per report)
- LoRA Fine-Tuning: Custom-trained model on professional scouting reports (available on HuggingFace: ravlad/qwen2.5_7b_Finetuning)
- Graceful Degradation: Automatic fallback to template-based reports if LLM unavailable
- Semantic Embeddings: sentence-transformers/all-MiniLM-L6-v2 (384-dim, local inference)
- Offline-First Design: All analysis runs locally on pre-fetched data (reproducible, deterministic)
- Clean 3-Layer Architecture: Strict separation between data ingestion, analytics engine, and presentation
- Professional Web UI: Flask-based interface with vanilla JavaScript, Chart.js radar charts, and real-time visualizations
- Comprehensive Testing: Test suite covering unit, integration, and edge cases
- Automated Data Pipeline: Incremental GRID API updates with rate limiting and profile synchronization
# Clone repository
git clone <repository-url>
cd draftc9
# Install dependencies
pip install -r requirements.txt
# Set Python path (Windows)
set PYTHONPATH=src
# Set Python path (Unix/Mac)
export PYTHONPATH=src
# Install Ollama: https://ollama.com/
# Run one-command setup (downloads from HuggingFace)
python src/scouting/ml/setup_scouting_model.py
# This will:
# 1. Download qwen-lol-scouting-q6_k.gguf from HuggingFace
# 2. Create Ollama model from GGUF
# 3. Verify installation
# Start web application
python src/main.py
# Open http://localhost:5000
# Fetch latest matches from GRID API (requires GRID_API_KEY)
python src/data/automation/update_data.py
# Sync player profiles from rosters
python src/data/automation/update_data.py --sync-profiles
DraftC9 is built on five independent, reproducible systems:
1. Scouting Engine
- File: src/scouting/engine.py
- Purpose: Pure statistical analysis of team/player patterns
- Input: Team name, number of matches
- Output: Playstyle metrics, win conditions, counter-strategies
- No ML models: All metrics computed from real match data
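Since the threat grades are described as percentile-based, one way to picture the engine's grading step is a percentile lookup against same-role peers. The sketch below is illustrative only: the cutoffs (90/70/40/20) and the function names are assumptions, not values taken from src/scouting/engine.py.

```python
from bisect import bisect_left, bisect_right

# Hypothetical cutoffs: minimum percentile required for each grade.
GRADE_CUTOFFS = [(90.0, "S"), (70.0, "A"), (40.0, "B"), (20.0, "C")]


def percentile_rank(value: float, population: list[float]) -> float:
    """Return the percentile (0-100) of `value` within `population`.

    Ties are split evenly: half count as below, half as above.
    """
    if not population:
        raise ValueError("population must not be empty")
    ordered = sorted(population)
    below = bisect_left(ordered, value)
    at_or_below = bisect_right(ordered, value)
    return 100.0 * (below + at_or_below) / (2 * len(ordered))


def threat_grade(value: float, role_population: list[float]) -> str:
    """Map a player's metric to S/A/B/C/D against same-role peers."""
    pct = percentile_rank(value, role_population)
    for cutoff, grade in GRADE_CUTOFFS:
        if pct >= cutoff:
            return grade
    return "D"
```

Grading within a role rather than across all players keeps, for example, a support's damage numbers from being compared against a mid laner's.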
2. RAG System
- Files: src/scouting/rag/
- Components: ChromaDB vector store, sentence-transformers embeddings
- Collections: team_knowledge, player_knowledge, meta_knowledge, strategy_knowledge
- Retrieval: Semantic search with temporal relevance boosting
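Temporal relevance boosting can be read as re-ranking vector-search hits so that recent context outranks stale context of similar semantic quality. The following is a minimal sketch of that idea, assuming an exponential half-life decay blended with cosine similarity; the `Hit` class, the 30-day half-life, and the 0.3 weight are hypothetical and not taken from src/scouting/rag/.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    similarity: float  # cosine similarity from the vector search, 0..1
    age_days: float    # days since the underlying match or note


def boosted_score(hit: Hit, half_life_days: float = 30.0, weight: float = 0.3) -> float:
    """Blend semantic similarity with an exponential recency term.

    The recency term is 1.0 for brand-new context and halves every
    `half_life_days`.
    """
    recency = 0.5 ** (hit.age_days / half_life_days)
    return (1.0 - weight) * hit.similarity + weight * recency


def rerank(hits: list[Hit], top_k: int = 5) -> list[Hit]:
    """Return the `top_k` hits ordered by boosted score, best first."""
    return sorted(hits, key=boosted_score, reverse=True)[:top_k]
```

With these defaults, a three-day-old note with similarity 0.72 outranks a year-old note with similarity 0.80, which is the behavior one would want when rosters and metas shift between splits.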
3. LoRA Model
- Model: Qwen 2.5 7B fine-tuned on professional scouting reports
- HuggingFace: ravlad/qwen2.5_7b_Finetuning
- Quantization: Q6_K (6-bit, ~7GB)
- Purpose: Natural language report generation
- Setup: One-command via setup_scouting_model.py
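The model is served through Ollama, and the feature list mentions a template-based fallback when the LLM is unavailable. A minimal sketch of that pattern follows. It calls Ollama's local REST endpoint (`/api/generate` on port 11434) with streaming disabled; the function names and the wording of the fallback template are hypothetical and do not reflect the project's reporting module.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def ask_ollama(prompt: str, model: str = "qwen-lol-scouting", timeout: float = 60.0) -> str:
    """Send one non-streaming generation request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


def template_report(team: str, win_conditions: list[str]) -> str:
    """Hypothetical plain-text fallback built only from engine output."""
    conditions = ", ".join(win_conditions) or "none detected"
    return f"Scouting report for {team}. Win conditions: {conditions}."


def generate_report(team: str, win_conditions: list[str], llm=ask_ollama) -> str:
    """Prefer the LLM; fall back to the template if it cannot be reached."""
    prompt = f"Write a scouting report for {team}. Win conditions: {win_conditions}"
    try:
        return llm(prompt)
    except (OSError, KeyError, ValueError):
        # OSError covers connection refused and timeouts (URLError subclasses it);
        # KeyError and ValueError cover a malformed or non-JSON response.
        return template_report(team, win_conditions)
```

Passing the LLM call in as a parameter keeps the fallback path easy to exercise in tests without a running Ollama server.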
4. SoloQ Integration
- Files: src/data/ingestion/riot_api.py, src/scouting/soloq/pattern_analysis.py
- API: Riot Games API (Account-v1, Match-v5)
- Features: Pocket pick detection, matchup win rates, recent champion practice
- Rate Limiting: Token bucket with adaptive backoff
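For readers unfamiliar with the rate-limiting approach named above, here is a minimal token bucket plus a backoff helper. It is a sketch under stated assumptions, not the code in riot_api.py: the class and function names, the injectable clock, and the doubling-with-cap policy are all illustrative, and "adaptive" is interpreted here as honoring a server-supplied Retry-After value when one is given.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last_refill = clock()

    def _refill(self) -> None:
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of blocking."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token is available (0 if one is ready)."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.rate)


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retrying a rate-limited request.

    Uses the server's Retry-After value when present; otherwise doubles
    the delay on each attempt, up to `cap` seconds.
    """
    if retry_after is not None:
        return min(cap, retry_after)
    return min(cap, base * (2 ** attempt))
```

A caller would typically sleep for `wait_time()` before each request and, on an HTTP 429, sleep for `backoff_delay(attempt, retry_after)` before retrying.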
5. Data Automation
- File: src/data/automation/update_data.py
- Source: GRID Esports API
- Mode: Incremental updates (no duplicates)
- Output: CSV datasets, player profiles JSON
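Incremental updates without duplicates usually come down to tracking which match identifiers are already stored and appending only the rest. The sketch below shows that idea with the standard library; the `match_id` column name and the function name are assumptions for illustration, and the real update_data.py may key and store rows differently.

```python
import csv
from pathlib import Path


def append_new_matches(csv_path: Path, fetched: list[dict], key: str = "match_id") -> int:
    """Append only rows whose `key` is not yet in the CSV; return rows added."""
    existing_ids: set[str] = set()
    fieldnames = None
    file_exists = csv_path.exists() and csv_path.stat().st_size > 0
    if file_exists:
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            fieldnames = reader.fieldnames  # reuse the stored column order
            existing_ids = {row[key] for row in reader}

    new_rows = []
    for row in fetched:
        row_id = str(row[key])
        if row_id not in existing_ids:
            existing_ids.add(row_id)  # also guards against repeats within the batch
            new_rows.append(row)

    if not new_rows:
        return 0

    fieldnames = fieldnames or list(new_rows[0].keys())
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if not file_exists:
            writer.writeheader()
        writer.writerows(new_rows)
    return len(new_rows)
```

Because the function is idempotent, re-running an update after a partial failure does not inflate the dataset.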
Complete documentation is available in docs/INDEX.md.
Key Documents:
- Architecture Overview - System design and data flow
- LoRA Guide - Fine-tuned model setup
- RAG System - Knowledge base setup and usage
- SoloQ Integration - Riot API integration details
- Data Automation - Dataset updates and roster management
- Metrics Documentation - All calculated metrics
To verify the system works correctly:
1. Scouting Engine
# Test team analysis
python -c "
from scouting.engine import ScoutingEngine
engine = ScoutingEngine()
report = engine.analyze_opponent('Gen.G', matches=20)
print(f'Analyzed {len(report[\"players\"])} players')
print(f'Detected {len(report[\"win_conditions\"])} win conditions')
"
2. RAG System
# Test semantic retrieval
python -c "
from scouting.rag.retriever import RAGRetriever
retriever = RAGRetriever()
results = retriever.retrieve_team_context('Gen.G', 'draft patterns')
print(f'Retrieved {len(results)} contexts')
"
3. LoRA Model
# Test Ollama integration
ollama run qwen-lol-scouting "What is Gen.G's playstyle?"
4. SoloQ Integration
# Requires RIOT_API_KEY in environment
python -c "
from data.ingestion.riot_api import RiotAPIClient
from scouting.soloq.pattern_analysis import analyze_player_patterns
client = RiotAPIClient()
matches = client.get_recent_matches_by_riot_id('HiDe', 'OnBush', count=20)
patterns = analyze_player_patterns(matches)
print(f'Analyzed {len(matches)} SoloQ matches')
"
5. Web Interface
# Start server and visit http://localhost:5000
python src/main.py
# Test API endpoint
curl http://localhost:5000/api/teams
All metrics are computed from real data. No placeholders or fake values:
- Team Metrics: Aggregated from draft_dataset_final.csv
- Player Stats: Aggregated from player_scouting_dataset.csv
- Win Conditions: Pattern detection on historical match outcomes
- SoloQ Data: Real-time from Riot API
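To make "pattern detection on historical match outcomes" concrete, one simple approach is to compare a team's win rate in a given situation with its overall win rate and flag the situation when the gap is large. The sketch below assumes that approach; the `MatchSummary` fields, the 35-minute threshold, the minimum sample size, and the 0.15 edge are all hypothetical and not drawn from the project's analyzers.

```python
from dataclasses import dataclass


@dataclass
class MatchSummary:
    won: bool
    gold_diff_15: int    # team gold lead at 15 minutes (negative if behind)
    duration_min: float  # game length in minutes


def win_rate(matches: list[MatchSummary]) -> float:
    """Fraction of matches won; 0.0 for an empty list."""
    return sum(m.won for m in matches) / len(matches) if matches else 0.0


def detect_win_conditions(matches: list[MatchSummary],
                          min_games: int = 5, edge: float = 0.15) -> list[str]:
    """Flag situations where the win rate beats the overall rate by `edge`."""
    overall = win_rate(matches)
    buckets = {
        "early game dominance": [m for m in matches if m.gold_diff_15 > 0],
        "late scaling": [m for m in matches if m.duration_min >= 35],
    }
    return [
        name
        for name, subset in buckets.items()
        if len(subset) >= min_games and win_rate(subset) - overall >= edge
    ]
```

The `min_games` guard avoids declaring a win condition from a handful of games, which matters with short professional seasons.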
Main settings:
- src/scouting/config.py - Scouting engine configuration
- src/data/config/config.py - Data pipeline configuration
- .env - API keys (create from .env.example, not committed)
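Scripts that need GRID_API_KEY or RIOT_API_KEY can fail early with a clear message rather than partway through a fetch. The helper below is a hypothetical sketch of that check; it assumes the keys are already present in the process environment (how .env is loaded in this project is not shown here).

```python
import os


def require_key(name: str) -> str:
    """Return a required API key from the environment or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or export it before running")
    return value
```

Calling `require_key("GRID_API_KEY")` at the top of a data update script would surface a missing key before any network request is made.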
Backend
- Python 3.11+
- Flask (web framework)
- pandas, numpy (data processing)
AI/ML
- Ollama (local LLM inference)
- Qwen 2.5 7B (base model)
- LoRA fine-tuning (PEFT, Unsloth)
- sentence-transformers (embeddings)
- ChromaDB (vector database)
Data
- GRID Esports API (professional match data)
- Riot Games API (SoloQ data)
- CSV datasets (15,628+ rows)
Frontend
- Vanilla JavaScript (no frameworks)
- Chart.js (radar charts)
- HTML/CSS
draftc9/
├── src/
│   ├── scouting/            # Analytics engine
│   │   ├── engine.py        # Main orchestrator
│   │   ├── analyzers/       # Team, player, win condition analysis
│   │   ├── ml/              # Model loading, LoRA integration
│   │   ├── rag/             # RAG retrieval system
│   │   ├── soloq/           # SoloQ pattern analysis
│   │   └── reporting/       # LLM report generation
│   ├── data/                # Data pipeline
│   │   ├── ingestion/       # GRID & Riot API clients
│   │   ├── processing/      # Dataset builders
│   │   └── automation/      # Update scripts
│   ├── web/                 # Flask application
│   │   ├── routes/          # API endpoints
│   │   ├── services/        # Business logic
│   │   ├── static/          # JS, CSS
│   │   └── templates/       # HTML
│   └── data_files/          # Datasets (not in git)
│       ├── datasets/        # CSV & JSON
│       ├── rag/             # ChromaDB store
│       └── models/          # GGUF files
├── docs/                    # Documentation
└── tests/                   # Test suite
pytest # Run all tests
pytest --cov=src # With coverage
pytest tests/test_engine.py # Specific test
black src/ tests/ # Format code
ruff check src/ tests/ # Lint
MIT License - see LICENSE file for details.
Version: 1.0.0 | Last Updated: 2026-01-26 | Hackathon: Cloud9 x JetBrains - Track 2