Simulate multiple concurrent user requests against an LLM interface. Both locally deployed models (e.g., open-source models served via Ollama) and online API models (e.g., OpenAI, Claude) are supported.
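The core idea is to fire many chat requests in parallel and measure per-request latency. Below is a minimal sketch of that pattern, assuming an OpenAI-compatible `/v1/chat/completions` endpoint (Ollama exposes one by default); `BASE_URL`, `MODEL`, `CONCURRENCY`, and `TOTAL_REQUESTS` are illustrative placeholders, not the script's actual configuration.

```python
# Sketch only: send N concurrent chat requests and record per-request latency.
# All constants below are placeholders, not the real settings in model_stress_test.py.
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

BASE_URL = "http://localhost:11434/v1/chat/completions"  # e.g. a local Ollama server
MODEL = "llama3"        # placeholder model name
CONCURRENCY = 10        # number of simulated simultaneous users
TOTAL_REQUESTS = 50     # total requests to send

def send_request(i: int) -> float:
    """Send one chat completion request and return its latency in seconds."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": f"Hello from simulated user {i}"}],
    }
    start = time.perf_counter()
    resp = requests.post(BASE_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return time.perf_counter() - start

if __name__ == "__main__":
    latencies = []
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        futures = [pool.submit(send_request, i) for i in range(TOTAL_REQUESTS)]
        for fut in as_completed(futures):
            latencies.append(fut.result())
    print(f"requests: {len(latencies)}, avg latency: {sum(latencies)/len(latencies):.2f}s")
```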
1. Modify the config in model_stress_test.py (a hypothetical sketch of the config is shown after these steps).
2. Execute the script:
python3 model_stress_test.py
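The configuration edited in step 1 might look roughly like the dictionary below. The key names and default values here are assumptions for illustration; the actual variable names inside model_stress_test.py may differ.

```python
# Hypothetical shape of the configuration block in model_stress_test.py.
CONFIG = {
    "api_base": "http://localhost:11434/v1",  # Ollama default, or an online API base URL
    "api_key": "sk-...",                      # needed for online providers such as OpenAI/Claude
    "model": "llama3",                        # model name to stress test
    "concurrency": 10,                        # number of simulated concurrent users
    "total_requests": 100,                    # total requests to send
    "prompt": "Hello, how are you?",          # prompt sent by each simulated user
}
```

After adjusting these values for the target model and provider, re-run the script with the command in step 2.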