Conversation

@Bslabe123
Collaborator

@Bslabe123 Bslabe123 commented Mar 12, 2025

fixes: #11

@achandrasekar
Collaborator

cc @liu-cong

@Bslabe123
Collaborator Author

Bslabe123 commented Mar 12, 2025

@liu-cong

Tested against a non-existent model host, envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local, to trigger ClientConnectorError. See logs:

python3 b.py --save-json-results --host=envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local --port=8081 --dataset=ShareGPT_V3_unfiltered_cleaned_split.json --tokenizer=meta-llama/Llama-2-7b-hf --request-rate=80.0 --backend=vllm --num-prompts=5 --max-input-length=1024 --max-output-length=2048 --file-prefix=benchmark-catalog --models=meta-llama/Llama-2-7b-hf --pm-namespace=epp-before-metrics-run2 --pm-job=model-serv
Namespace(backend='vllm', sax_model='', file_prefix='benchmark-catalog', endpoint='generate', host='envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local', port=8081, dataset='ShareGPT_V3_unfiltered_cleaned_split.json', models='meta-llama/Llama-2-7b-hf', traffic_split=None, stream_request=False, request_timeout=10800.0, tokenizer='meta-llama/Llama-2-7b-hf', best_of=1, use_beam_search=False, num_prompts=5, max_input_length=1024, max_output_length=2048, top_k=32000, request_rate=80.0, seed=1741819503, trust_remote_code=False, machine_cost=None, use_dummy_text=False, save_json_results=True, output_bucket=None, output_bucket_filepath=None, save_aggregated_result=False, additional_metadata_metrics_to_save=None, scrape_server_metrics=False, pm_namespace='epp-before-metrics-run2', pm_job='model-serv')
Models to benchmark: ['meta-llama/Llama-2-7b-hf']
No traffic split specified. Defaulting to uniform traffic split.
Starting Prometheus Server on port 9090
ClientConnectorError: Cannot connect to host envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local:8081 ssl:False [Name or service not known]
ClientConnectorError: Cannot connect to host envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local:8081 ssl:False [Name or service not known]
ClientConnectorError: Cannot connect to host envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local:8081 ssl:False [Name or service not known]
ClientConnectorError: Cannot connect to host envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local:8081 ssl:False [Name or service not known]
ClientConnectorError: Cannot connect to host envoy-epp-before-metrics-run2-inference-gateway-0cb11840.envoy-gateway-system.svc.cluster.local:8081 ssl:False [Name or service not known]
====Result for Model: weighted====
Errors: {'ClientConnectorError': 5, 'TimeoutError': 0, 'ContentTypeError': 0, 'ClientOSError': 0, 'ServerDisconnectedError': 0, 'unknown_error': 0}
Total time: 0.07 s
Successful/total requests: 0/5
Requests/min: 4242.76
Output_tokens/min: 0.00
Input_tokens/min: 0.00
Tokens/min: 0.00
Average seconds/token (includes waiting time on server): 0.00
Average milliseconds/request (includes waiting time on server): 0.00
Average milliseconds/output_token (includes waiting time on server): 0.00
Average input length: 0.00
Average output length: 0.00
====Result for Model: meta-llama/Llama-2-7b-hf====
Errors: {'ClientConnectorError': 5, 'TimeoutError': 0, 'ContentTypeError': 0, 'ClientOSError': 0, 'ServerDisconnectedError': 0, 'unknown_error': 0}
Total time: 0.07 s
Successful/total requests: 0/0
Requests/min: 0.00
Output_tokens/min: 0.00
Input_tokens/min: 0.00
Tokens/min: 0.00
Average seconds/token (includes waiting time on server): 0.00
Average milliseconds/request (includes waiting time on server): 0.00
Average milliseconds/output_token (includes waiting time on server): 0.00
Average input length: 0.00
Average output length: 0.00
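
For readers following along, the `Errors:` line in the log above can be thought of as a per-category tally of failed requests. The following is a minimal sketch of one way such a tally could be kept, not the code from this PR: the helper names `new_error_tally` and `record_error` are hypothetical, and the category names are simply copied from the log. It matches exceptions by class name so that it runs without aiohttp installed.

```python
# Hypothetical sketch: tally request failures by exception class name.
# Category names are taken from the "Errors:" line in the log; the
# helper names below are assumptions, not part of the benchmark tool.
KNOWN_ERRORS = (
    "ClientConnectorError",
    "TimeoutError",
    "ContentTypeError",
    "ClientOSError",
    "ServerDisconnectedError",
)


def new_error_tally() -> dict:
    """Return a tally with every known category initialised to zero."""
    tally = {name: 0 for name in KNOWN_ERRORS}
    tally["unknown_error"] = 0
    return tally


def record_error(tally: dict, exc: BaseException) -> None:
    """Increment the counter for the most specific known class of exc."""
    # Walk the method resolution order so that a subclass of a known
    # error is still counted under the nearest known ancestor.
    for cls in type(exc).__mro__:
        if cls.__name__ in KNOWN_ERRORS:
            tally[cls.__name__] += 1
            return
    # Anything not recognised falls into the catch-all bucket.
    tally["unknown_error"] += 1
```

In a request loop, `record_error(tally, e)` would be called from an `except Exception as e:` handler around each request, so a failed connection is counted instead of aborting the run.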

@achandrasekar achandrasekar merged commit d699be9 into main Mar 21, 2025
1 check passed

Development

Successfully merging this pull request may close these issues.

Bug report: The tool doesn't handle 503 backend error properly
