This project provides tools to benchmark inference performance for LM Studio models. It measures various metrics including:
- Inference latency
- Tokens per second (throughput; see the sketch after this list)
- Memory usage
- Response generation time
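
The throughput metric is a simple ratio of generated tokens to wall-clock generation time. A minimal sketch (the function name is illustrative, not part of this project's API):

```python
def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Throughput: tokens generated divided by wall-clock generation time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return completion_tokens / elapsed_seconds

# Example: 128 tokens generated in 2.5 seconds -> 51.2 tok/s
print(tokens_per_second(128, 2.5))
```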
## Requirements

- Python 3.8+
- LM Studio running locally
- Required Python packages (see requirements.txt)
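
The authoritative dependency list is `requirements.txt` itself; purely as an illustration, a benchmark like this would plausibly depend on an HTTP client and a process-metrics library:

```text
# Hypothetical contents -- check requirements.txt for the real pins
requests>=2.31
psutil>=5.9
```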
## Installation

```bash
pip install -r requirements.txt
```

## Usage

```bash
python main.py --model "your_model_name" --prompt "your_test_prompt"
```
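
Under the hood, a benchmark like this typically talks to LM Studio's local OpenAI-compatible server (`http://localhost:1234/v1` by default). A minimal sketch of a single timed request, not necessarily what `main.py` does (`benchmark_once` and the returned field names are illustrative):

```python
import time

import requests

# LM Studio's local server exposes an OpenAI-compatible API,
# by default at http://localhost:1234/v1.
BASE_URL = "http://localhost:1234/v1"

def benchmark_once(model: str, prompt: str) -> dict:
    """Send one chat completion and return latency plus throughput."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    start = time.perf_counter()
    resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=300)
    elapsed = time.perf_counter() - start
    resp.raise_for_status()
    usage = resp.json().get("usage", {})
    completion_tokens = usage.get("completion_tokens", 0)
    return {
        "latency_s": elapsed,
        "tokens_per_second": completion_tokens / elapsed if elapsed > 0 else 0.0,
    }

print(benchmark_once("your_model_name", "your_test_prompt"))
```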
## Features

- Measures inference latency across different prompt lengths
- Supports comparisons across multiple models
- Generates detailed performance reports
- Tracks memory usage (see the sketch after this list)
- Supports configurable test parameters
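
For the memory-tracking feature, one common approach (an assumption here, not necessarily this project's method) is to sample the inference process's resident set size with `psutil`:

```python
import psutil

def process_memory_mb(pid: int) -> float:
    """Resident set size of a process, in megabytes."""
    return psutil.Process(pid).memory_info().rss / (1024 ** 2)

# Hypothetical usage: sample the LM Studio server process before and
# after a run to estimate memory growth (lm_studio_pid is illustrative).
# before = process_memory_mb(lm_studio_pid)
# ... run the benchmark ...
# print(process_memory_mb(lm_studio_pid) - before)
```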
## License

MIT License