
[Bug]: The OpenAI API server takes twice as long as offline inference for the same input and output. #5154

Closed
LIUKAI0815 opened this issue May 31, 2024 · 2 comments
Labels
bug (Something isn't working)

Comments

LIUKAI0815 commented May 31, 2024

Your current environment

vllm==0.4.2

🐛 Describe the bug

With the same input and output, the OpenAI-compatible server is noticeably slower.

openai: python -m vllm.entrypoints.openai.api_server
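
For reference, a request against that server might look like the following sketch, assuming it was launched with --model facebook/opt-125m on the default port 8000; the base_url, the placeholder api_key, and the sampling parameters are assumptions for illustration, not the reporter's exact setup:

# Query the OpenAI-compatible server via the openai Python client (v1+).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's OpenAI-compatible endpoint
    api_key="EMPTY",                      # placeholder; vLLM ignores it by default
)

completion = client.completions.create(
    model="facebook/opt-125m",
    prompt="Hello, my name is",
    temperature=0.8,
    top_p=0.95,
)
print(completion.choices[0].text)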

fastapi (offline inference):
from vllm import LLM, SamplingParams

# Sample prompts.
prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Create an LLM.
llm = LLM(model="facebook/opt-125m")
# Generate texts from the prompts. The output is a list of RequestOutput objects
# that contain the prompt, generated text, and other information.
outputs = llm.generate(prompts, sampling_params)
# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
LIUKAI0815 added the bug label May 31, 2024
DarkLight1337 (Collaborator) commented May 31, 2024

To better understand the issue, please show how you timed the code. You might have included the startup time of the offline inference LLM, which would make the comparison unfair.
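
To illustrate the point, a minimal sketch of separating startup time from generation time in the offline path might look like this; the single prompt and the use of time.perf_counter are assumptions for illustration, not the reporter's actual benchmark:

import time

from vllm import LLM, SamplingParams

# Startup: loading weights and initializing the engine. This cost is paid
# once, so it should be excluded from a per-request latency comparison.
t0 = time.perf_counter()
llm = LLM(model="facebook/opt-125m")
startup_s = time.perf_counter() - t0

# Generation: the part that is actually comparable to a server request.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
t0 = time.perf_counter()
outputs = llm.generate(["Hello, my name is"], sampling_params)
generation_s = time.perf_counter() - t0

print(f"startup: {startup_s:.2f}s, generation: {generation_s:.2f}s")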

LIUKAI0815 (Author) commented
I made the wrong comparison

DarkLight1337 closed this as not planned on Jun 27, 2024