Status: Closed
Labels: bug (Something isn't working)
Description
I tried the demo code, but got an error like this:
File "/opt/miniconda3/lib/python3.8/site-packages/vllm/entrypoints/llm.py", line 55, in __init__
self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/opt/miniconda3/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 145, in from_engine_args
engine = cls(*engine_configs, distributed_init_method, devices,
File "/opt/miniconda3/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 87, in __init__
worker_cls = ray.remote(
File "/opt/miniconda3/lib/python3.8/site-packages/ray/_private/worker.py", line 2879, in _make_remote
ray_option_utils.validate_actor_options(options, in_options=False)
File "/opt/miniconda3/lib/python3.8/site-packages/ray/_private/ray_option_utils.py", line 308, in validate_actor_options
actor_options[k].validate(k, v)
File "/opt/miniconda3/lib/python3.8/site-packages/ray/_private/ray_option_utils.py", line 38, in validate
raise ValueError(possible_error_message)
ValueError: The precision of the fractional quantity of resource node:172.17.0.8 cannot go beyond 0.0001
And my code is below:
```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="my local path", tensor_parallel_size=4)
outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```