A comprehensive performance analysis dashboard for RHAIIS (Red Hat AI Inference Server) benchmarks. This dashboard provides interactive visualizations and analysis of AI model performance across different accelerators, versions, and configurations.
- Interactive Performance Plots: Compare throughput, latency, and efficiency metrics
- Cost Analysis: Calculate cost per million tokens with cloud provider pricing
- Performance Rankings: Identify top performers by throughput and latency
- Regression Analysis: Track performance changes between versions
- Pareto Tradeoff Analysis: Visualize performance trade-offs between competing objectives
- Runtime Configuration Tracking: View inference server arguments used
- Multi-Accelerator Support: Compare H200, MI300X, and TPU performance
- Multi-Version Support: Compare MLPerf v5.0 and v5.1 submissions
- Benchmark Comparisons: Analyze performance across different models and scenarios
- Normalized Result Analysis: Compare systems with different accelerator counts
- Dataset Representation: View token length distributions for evaluation datasets
- Offline vs Server Comparison: Analyze performance degradation between scenarios
- Cross-Version Analysis: Track how systems perform across MLPerf versions
- Throughput: Output tokens per second
- Latency: Time to First Token (TTFT) and Inter-Token Latency (ITL)
- Efficiency: Throughput per tensor parallelism unit
- Cost Efficiency: Cost per million tokens across cloud providers
- Error Rates: Request success/failure analysis
- Concurrency Performance: Performance at different load levels
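To make the efficiency and cost metrics concrete, here is a minimal sketch of the underlying arithmetic (illustrative only; the function names, pricing inputs, and example figures below are assumptions, not the dashboard's actual schema):

```python
# Sketch: deriving the cost and efficiency metrics from raw benchmark numbers.
# Variable names and the example prices are illustrative assumptions.

def cost_per_million_tokens(hourly_price_usd: float, output_tokens_per_s: float) -> float:
    """Cost to generate one million output tokens on an instance billed hourly."""
    tokens_per_hour = output_tokens_per_s * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000

def throughput_per_tp_unit(output_tokens_per_s: float, tensor_parallelism: int) -> float:
    """Efficiency: throughput normalized by the tensor-parallelism degree."""
    return output_tokens_per_s / tensor_parallelism

# Hypothetical example: an 8-accelerator instance at $60/hour producing 12,000 output tokens/s
print(cost_per_million_tokens(60.0, 12_000))   # ~1.39 USD per million tokens
print(throughput_per_tp_unit(12_000, 8))       # 1500 tokens/s per TP unit
```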
performance-dashboard/
├── dashboard.py # Main dashboard application
├── dashboard_styles.py # CSS styling file
├── mlperf_datacenter.py # MLPerf dashboard module
├── pyproject.toml # Project metadata and dependencies
├── requirements.txt # Python dependencies
├── Dockerfile.openshift # Container build configuration
├── .pre-commit-config.yaml # Pre-commit hooks configuration
├── Makefile # Development commands
├── consolidated_dashboard.csv # RHAIIS benchmark data. Get the latest csv data from the AWS S3 bucket.
├── mlperf-data/ # MLPerf data files
│ ├── mlperf-5.1.csv # MLPerf v5.1 submission data
│ ├── mlperf-5.0.csv # MLPerf v5.0 submission data
│ ├── summaries/ # Dataset summaries (version controlled)
│ │ ├── README.md # Dataset summary documentation
│ │ ├── deepseek-r1.csv # DeepSeek-R1 token length summary
│ │ ├── llama3-1-8b-datacenter.csv # Llama 3.1 8B token length summary
│ │ └── llama2-70b-99.csv # Llama 2 70B token length summary
│ └── original/ # Original datasets (NOT version controlled)
│ ├── README.md # Download and usage instructions
│ └── generate_dataset_summaries.py # Script to generate CSV summaries
├── manual_runs/scripts/ # Data processing scripts
│ └── import_manual_run_jsons.py # Import manual benchmark results
├── deploy/ # OpenShift deployment files
│ ├── openshift-deployment.yaml # Application deployment
│ ├── openshift-service.yaml # Service configuration
│ └── openshift-route.yaml # Route/ingress configuration
├── tests/ # Test suite
│ ├── test_data_processing.py # Data processing unit tests
│ ├── test_import_script.py # Import script tests
│ ├── test_integration.py # Integration tests
│ ├── test_mlperf_datacenter.py # MLPerf module tests
│ ├── conftest.py # Shared fixtures
│ └── README.md # Test documentation
└── docs/ # Documentation
└── CODE_QUALITY.md # Code quality guidelines
- Clone the repository:
  git clone https://github.com/openshift-psap/performance-dashboard.git
  cd performance-dashboard
- Set up Python environment:
  python -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt
- Add your data:
  - RHAIIS data: Place your consolidated_dashboard.csv in the root directory
  - MLPerf data: MLPerf CSV files are included in the mlperf-data/ directory
  - Use the utilities in manual_runs/scripts/ to process new benchmark data
- Run the dashboard:
  streamlit run dashboard.py
- Access: Open http://localhost:8501 in your browser
- Use the sidebar to switch between "RHAIIS Dashboard" and "MLPerf Dashboard" views
For a complete development environment with linting, formatting, and code quality tools:
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install development dependencies from pyproject.toml
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install

Available development commands:
- make format - Auto-format code (Black, Ruff)
- make lint - Run linting checks
- make type-check - Run static type checking
- make test - Run tests with coverage
- make ci-local - Run all CI checks locally
- make clean - Clean temporary files
Code Quality:
- All code is checked with ruff, black, mypy
- Pre-commit hooks enforce code standards
- Tests must pass before merging
- Documentation required for public functions
See docs/CODE_QUALITY.md for detailed information.
- Build the container:
  podman build -f Dockerfile.openshift -t performance-dashboard .
- Run locally:
  podman run -p 8501:8501 performance-dashboard
- OpenShift CLI (oc) installed and configured
- Access to an OpenShift cluster with permissions to create projects
- Container registry access (quay.io or internal registry)
- Latest CSV data file in the project directory
- Create the namespace/project:
  oc new-project rhaiis-dashboard --display-name="RHAIIS Performance Dashboard"
- Prepare your data:
  # Ensure you have the latest consolidated_dashboard.csv in the root directory
  # You can download it from the AWS S3 bucket or generate it using the scripts
  # MLPerf data files are included in the mlperf-data/ directory
  # Dataset summaries are in mlperf-data/summaries/
- Build and push the container image:
  # Build the container image with your data
  podman build -f Dockerfile.openshift -t quay.io/your-username/rhaiis-dashboard:latest .
  # Push to your container registry
  podman push quay.io/your-username/rhaiis-dashboard:latest
- Update the image reference in the deployment: edit deploy/openshift-deployment.yaml so the container image points at the image you pushed
- Deploy all components:
  # Deploy the application, service, and route
  oc apply -f deploy/openshift-deployment.yaml
  oc apply -f deploy/openshift-service.yaml
  oc apply -f deploy/openshift-route.yaml
- Access the dashboard:
  # Get the dashboard URL
  echo "Dashboard URL: http://$(oc get route rhaiis-dashboard -n rhaiis-dashboard -o jsonpath='{.spec.host}')"
When you have new data or code changes:
- Rebuild the image with updated data:
  podman build -f Dockerfile.openshift -t quay.io/your-username/rhaiis-dashboard:latest .
  podman push quay.io/your-username/rhaiis-dashboard:latest
- Restart the deployment to use the new image:
  oc rollout restart deployment/rhaiis-dashboard -n rhaiis-dashboard
- From manual JSON results:
  python manual_runs/scripts/import_manual_run_jsons.py benchmark.json \
      --model "RedHatAI/Llama-3.3-70B-Instruct-FP8-dynamic" \
      --version "vLLM-0.10.1" \
      --tp 8 \
      --accelerator "H200" \
      --runtime-args "tensor-parallel-size: 8; max-model-len: 8192"
- Consolidate data: merge the new results with the existing consolidated_dashboard.csv
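A minimal consolidation sketch, assuming the new results were exported to a CSV with the same schema as consolidated_dashboard.csv (the file name new_results.csv and the de-duplication key columns are illustrative):

```python
# Sketch: merge freshly imported results into consolidated_dashboard.csv.
# The de-duplication key columns below are assumptions, not the real schema.
import pandas as pd

existing = pd.read_csv("consolidated_dashboard.csv")
new_results = pd.read_csv("new_results.csv")  # hypothetical export of the new runs

combined = pd.concat([existing, new_results], ignore_index=True)
# Keep the most recent row when the same run appears twice (hypothetical key).
combined = combined.drop_duplicates(
    subset=["model", "version", "accelerator", "TP"], keep="last"
)
combined.to_csv("consolidated_dashboard.csv", index=False)
```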
The dashboard supports multiple MLPerf Inference versions:
- v5.1: Latest submission results (mlperf-data/mlperf-5.1.csv)
- v5.0: Previous version results (mlperf-data/mlperf-5.0.csv)
These files are version controlled and included in the repository.
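For cross-version analysis, the two submission files can be loaded and tagged with their version before comparison. A sketch under those assumptions (the mlperf_version column name is hypothetical; the dashboard's actual loading logic lives in mlperf_datacenter.py and may differ):

```python
# Sketch: load both MLPerf submission CSVs and tag each row with its version
# so results can be compared across v5.0 and v5.1.
import pandas as pd

frames = []
for version, path in [("v5.0", "mlperf-data/mlperf-5.0.csv"),
                      ("v5.1", "mlperf-data/mlperf-5.1.csv")]:
    df = pd.read_csv(path)
    df["mlperf_version"] = version  # hypothetical column name
    frames.append(df)

mlperf = pd.concat(frames, ignore_index=True)
```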
The "Dataset Representation" section uses lightweight CSV summaries of token length distributions:
Available summaries (in mlperf-data/summaries/):
- deepseek-r1.csv - DeepSeek-R1 evaluation dataset
- llama3-1-8b-datacenter.csv - Llama 3.1 8B CNN dataset
- llama2-70b-99.csv - Llama 2 70B Open Orca dataset
Original dataset files are stored in mlperf-data/original/ and are NOT version controlled due to their size.
To download and add a new dataset:
- Download the dataset to mlperf-data/original/
- Update generate_dataset_summaries.py with a new processor function
- Run the script to generate the summary
- Update mlperf_datacenter.py to map the model name to the summary file
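As a rough illustration of the second step, a processor function might read the original dataset, compute token lengths, and write a summary CSV. Everything below (file names, column names, and the whitespace-based token count standing in for real tokenization) is a hypothetical sketch, not the actual structure of generate_dataset_summaries.py:

```python
# Hypothetical processor: summarize token lengths for a new evaluation dataset.
# File names, columns, and tokenization here are illustrative assumptions.
import pandas as pd

def process_my_dataset() -> None:
    raw = pd.read_json("mlperf-data/original/my-dataset.json")  # hypothetical file
    # Stand-in for real tokenization: count whitespace-separated tokens per prompt.
    lengths = raw["prompt"].str.split().str.len()
    summary = lengths.value_counts().sort_index().reset_index()
    summary.columns = ["token_length", "count"]
    summary.to_csv("mlperf-data/summaries/my-dataset.csv", index=False)
```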
See mlperf-data/original/README.md and mlperf-data/summaries/README.md for detailed instructions.
The project includes a comprehensive test suite with unit and integration tests.
Run all tests:
pytest tests/

Run with coverage:
pytest tests/ --cov=. --cov-report=html

Quick test command:
make test

Test Categories:
- Data Processing Tests - Core data manipulation functions
- Import Script Tests - JSON import and parsing
- Integration Tests - End-to-end workflows
See tests/README.md for detailed test documentation.
- STREAMLIT_SERVER_HEADLESS=true: Headless mode for production
- STREAMLIT_SERVER_PORT=8501: Server port
- STREAMLIT_SERVER_ADDRESS=0.0.0.0: Listen address
- CSV Format: Must include columns for model, version, accelerator, TP, metrics
- Runtime Args: Semicolon-separated key-value pairs (see the parsing sketch below)
- Benchmark Profiles: Support for different prompt/output token configurations
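For example, a runtime-args string like the one passed to the import script above can be split into key-value pairs. A minimal parsing sketch (the dashboard's own parser may differ):

```python
# Sketch: parse a semicolon-separated runtime-args string into a dict.
def parse_runtime_args(raw: str) -> dict[str, str]:
    args = {}
    for pair in raw.split(";"):
        if not pair.strip():
            continue
        key, _, value = pair.partition(":")
        args[key.strip()] = value.strip()
    return args

print(parse_runtime_args("tensor-parallel-size: 8; max-model-len: 8192"))
# {'tensor-parallel-size': '8', 'max-model-len': '8192'}
```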
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Set up development environment: pip install -e ".[dev]"
- Install pre-commit hooks: pre-commit install
- Make changes and test locally: pytest tests/
- Run code quality checks: make ci-local
- Update documentation as needed
- Submit a pull request
Development Workflow:
# 1. Create feature branch
git checkout -b feature/my-feature
# 2. Make changes
# ... edit code ...
# 3. Run tests
pytest tests/
# 4. Format and lint
make format
make lint
# 5. Commit (pre-commit hooks will run)
git add .
git commit -m "Add feature"
# 6. Push and create a Pull request against main
git push origin feature/my-feature