Kaleido subprocess Segmentation fault #20

@rajeshitshoulders

Hi, I'm getting the error below when trying to start the vidur simulator on Ubuntu 20.04 in a Python 3.10 venv. I also tested with mamba.

INFO 07-09 16:17:21 config.py:21] trace_request_length_generator_decode_scale_factor: 1
INFO 07-09 16:17:21 config.py:21] trace_request_length_generator_prefill_scale_factor: 1
INFO 07-09 16:17:21 config.py:21] trace_request_length_generator_trace_file: ./data/processed_traces/arxiv_summarization_stats_llama2_tokenizer_filtered_v2.csv
INFO 07-09 16:17:21 config.py:21] vllm_scheduler_max_tokens_in_batch: 4096
INFO 07-09 16:17:21 config.py:21] vllm_scheduler_watermark_blocks_fraction: 0.01
INFO 07-09 16:17:21 config.py:21] write_chrome_trace: true
INFO 07-09 16:17:21 config.py:21] write_json_trace: false
INFO 07-09 16:17:21 config.py:21] write_metrics: true
INFO 07-09 16:17:21 config.py:21] zipf_request_length_generator_scramble: false
INFO 07-09 16:17:21 config.py:21] zipf_request_length_generator_theta: 0.4
INFO 07-09 16:17:21 config.py:21]
INFO 07-09 16:17:21 trace_request_length_generator.py:81] Loaded request length trace file ./data/processed_traces/arxiv_summarization_stats_llama2_tokenizer_filtered_v2.csv with 28257 requests
INFO 07-09 16:17:22 simulator.py:56] Starting simulation with cluster: Cluster({'id': 0, 'num_replicas': 1}) and 127 requests
INFO 07-09 16:17:24 simulator.py:76] Simulation ended at: 51.67980373407166s
INFO 07-09 16:17:24 simulator.py:79] Writing output
Exception ignored in atexit callback: <bound method Simulator._write_output of <vidur.simulator.Simulator object at 0xfffd0f347ac0>>
Traceback (most recent call last):
  File "/home/nvidia/vidur/vidur/simulator.py", line 81, in _write_output
    self._metric_store.plot()
  File "/home/nvidia/vidur/vidur/metrics/metrics_store.py", line 34, in wrapper
    return func(self, *args, **kwargs)
  File "/home/nvidia/vidur/vidur/metrics/metrics_store.py", line 499, in plot
    self._store_request_metrics(dir_plot_path)
  File "/home/nvidia/vidur/vidur/metrics/metrics_store.py", line 403, in _store_request_metrics
    dataseries.plot_histogram(base_plot_path, dataseries._y_name)
  File "/home/nvidia/vidur/vidur/metrics/data_series.py", line 295, in plot_histogram
    fig.write_image(f"{path}/{plot_name}.png")
  File "/home/nvidia/.vidru/lib/python3.10/site-packages/plotly/basedatatypes.py", line 3841, in write_image
    return pio.write_image(self, *args, **kwargs)
  File "/home/nvidia/.vidru/lib/python3.10/site-packages/plotly/io/_kaleido.py", line 266, in write_image
    img_data = to_image(
  File "/home/nvidia/.vidru/lib/python3.10/site-packages/plotly/io/_kaleido.py", line 143, in to_image
    img_bytes = scope.transform(
  File "/home/nvidia/.vidru/lib/python3.10/site-packages/kaleido/scopes/plotly.py", line 153, in transform
    response = self._perform_transform(
  File "/home/nvidia/.vidru/lib/python3.10/site-packages/kaleido/scopes/base.py", line 293, in _perform_transform
    self._ensure_kaleido()
  File "/home/nvidia/.vidru/lib/python3.10/site-packages/kaleido/scopes/base.py", line 198, in _ensure_kaleido
    raise ValueError(message)
ValueError: Failed to start Kaleido subprocess. Error stream:

/home/nvidia/.vidru/lib/python3.10/site-packages/kaleido/executable/kaleido: line 11: 257246 Segmentation fault /home/nvidia/.vidru/lib/python3.10/site-packages/kaleido/executable/bin/kaleido $@

I tried different versions of Kaleido and Plotly, but still no luck.
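
As a possible stopgap, the PNG export could be guarded so that the simulation output is still written when Kaleido cannot start. The following is only a minimal sketch, not vidur's actual code: `write_plot` is a hypothetical helper name, and it assumes `fig` is a Plotly figure exposing `write_image` and `write_html`, and that the failure surfaces as the `ValueError` shown in the traceback above.

```python
from pathlib import Path


def write_plot(fig, path, plot_name):
    """Try a PNG export; fall back to an HTML file if the export raises ValueError.

    Hypothetical helper for illustration. It returns the path that was written.
    """
    out_dir = Path(path)
    out_dir.mkdir(parents=True, exist_ok=True)

    png_path = out_dir / f"{plot_name}.png"
    try:
        # Same call as in data_series.py; this is the step that needs Kaleido.
        fig.write_image(str(png_path))
        return png_path
    except ValueError:
        # The traceback shows Kaleido raising ValueError when its subprocess
        # fails to start. Note this also catches other ValueErrors from the
        # export call, so it is a coarse guard rather than a precise one.
        html_path = out_dir / f"{plot_name}.html"
        fig.write_html(str(html_path))  # HTML export does not go through Kaleido
        return html_path
```

This does not fix the segmentation fault itself; it would only keep the run from failing at the plotting step while the Kaleido problem is investigated.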

Any help would be greatly appreciated.
