Description
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
2025-03-12 11:39:28,826 INFO worker.py:1841 -- Started a local Ray instance.
0%| | 0/2 [00:00<?, ?it/s]
Exception in thread Thread-2 (launch_request):
Traceback (most recent call last):
File "/home/ljw/miniconda3/envs/llm_demo/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/home/ljw/miniconda3/envs/llm_demo/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/workstation/usrs/zjp/llmperf/llmperf/token_benchmark_ray.py", line 126, in launch_request
request_metrics[common_metrics.REQ_OUTPUT_THROUGHPUT] = num_output_tokens / request_metrics[common_metrics.E2E_LAT]
ZeroDivisionError: division by zero
0%| | 0/2 [00:16<?, ?it/s]
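The ZeroDivisionError above means request_metrics[common_metrics.E2E_LAT] was 0 for that request, i.e. the end-to-end latency was never recorded (typically because the request failed before timing completed). A minimal guard would be something like the sketch below; the metric key names and dict shape are assumptions modeled on the traceback, not llmperf's actual constants:

```python
# Hypothetical guard against zero end-to-end latency, modeled on the
# traceback above. The key names below are assumptions for illustration.
E2E_LAT = "end_to_end_latency_s"
REQ_OUTPUT_THROUGHPUT = "request_output_throughput_token_per_s"


def compute_throughput(request_metrics: dict, num_output_tokens: int) -> dict:
    """Set output throughput, tolerating a zero (unrecorded) latency."""
    e2e_lat = request_metrics.get(E2E_LAT, 0.0)
    if e2e_lat > 0:
        request_metrics[REQ_OUTPUT_THROUGHPUT] = num_output_tokens / e2e_lat
    else:
        # Zero latency usually means the request errored out before the
        # timer stopped, so report zero throughput instead of crashing.
        request_metrics[REQ_OUTPUT_THROUGHPUT] = 0.0
    return request_metrics
```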
Results for token benchmark for DeepSeek-R1-Distill-Llama-70B queried with the openai api.
Traceback (most recent call last):
File "/workstation/usrs/zjp/llmperf/llmperf/token_benchmark_ray.py", line 478, in <module>
run_token_benchmark(
File "/workstation/usrs/zjp/llmperf/llmperf/token_benchmark_ray.py", line 319, in run_token_benchmark
summary, individual_responses = get_token_throughput_latencies(
File "/workstation/usrs/zjp/llmperf/llmperf/token_benchmark_ray.py", line 167, in get_token_throughput_latencies
ret = metrics_summary(completed_requests, start_time, end_time)
File "/workstation/usrs/zjp/llmperf/llmperf/token_benchmark_ray.py", line 218, in metrics_summary
df_without_errored_req = df[df[common_metrics.ERROR_CODE].isna()]
File "/home/ljw/miniconda3/envs/llm_demo/lib/python3.10/site-packages/pandas/core/frame.py", line 4102, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/ljw/miniconda3/envs/llm_demo/lib/python3.10/site-packages/pandas/core/indexes/range.py", line 417, in get_loc
raise KeyError(key)
KeyError: 'error_code'
(OpenAIChatCompletionsClient pid=2572041) Warning Or Error: list index out of range
(OpenAIChatCompletionsClient pid=2572041) -1
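The follow-on KeyError: 'error_code' is a consequence of the same failure: when no request produces an error_code field, the results DataFrame is built without that column, so df[df[common_metrics.ERROR_CODE].isna()] raises. A defensive filter could check for the column first; the column name is taken from the traceback, the surrounding function is an assumed sketch rather than llmperf's actual code:

```python
import pandas as pd

ERROR_CODE = "error_code"  # column name taken from the KeyError above


def drop_errored_requests(metrics: list[dict]) -> pd.DataFrame:
    """Return only rows without an error code, tolerating a missing column."""
    df = pd.DataFrame(metrics)
    if ERROR_CODE not in df.columns:
        # No request ever recorded an error code (e.g. every request
        # failed before metrics were collected), so keep the frame as-is.
        return df
    return df[df[ERROR_CODE].isna()]
```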