Skip to content


load test LLMs with Locust

Locust, an open source load testing tool, is popular framework for load testing HTTP and other protocols. Its developer friendly approach lets you to define your tests in regular Python code.

Locust tests can be run from command line or using its web-based UI. Throughput, response times and errors can be viewed in real time and/or exported for later analysis.

In this code repository, we provide an example of how to perform load testing on the LLM API to evaluate your production requirements. The code is developed within a SageMaker Notebook and utilizes the command line interface to conduct load testing on both the SageMaker and Bedrock LLM API.

Once the is properly configured, you can initiate the load test by executing a command in the command line. This allows you to test the system with varying levels of throughput, depending on your specific requirements.

locust --headless --users 30 --spawn-rate 30 --run-time 120 --csv ./benchmark_metric/benchmark_u30

Type Name # reqs # fails Avg Min Max Med req/s failures/s
[Send] Prompt 390 0 (0.00%) 9232 2037 10282 9800 3.25 0.00
Aggregated 390 0 (0.00%) 9232 2037 10282 9800 3.25 0.00

See our examples of load testing the SageMaker model with sagemaker_jumpstart_loadtest.ipynb and the Bedrock model with bedrock_loadtest.ipynb.


See CONTRIBUTING for more information.


This library is licensed under the MIT-0 License. See the LICENSE file.


No description, website, or topics provided.



Code of conduct

Security policy





No releases published


No packages published