
[Feature]: Decouple the benchmark script from the components of vLLM #5586

Closed
zhyncs opened this issue Jun 17, 2024 · 1 comment · Fixed by #5588

zhyncs commented Jun 17, 2024

🚀 The feature, motivation and pitch

Currently, vLLM's benchmark script supports multiple backends and offers fairly rich functionality.

It relies on backend_request_func and get_tokenizer. backend_request_func is self-contained in a separate file, but get_tokenizer lives inside vLLM, so using it requires cloning the repository or installing the Python package.

from backend_request_func import (ASYNC_REQUEST_FUNCS, RequestFuncInput,
                                  RequestFuncOutput)
from tqdm.asyncio import tqdm
from transformers import PreTrainedTokenizerBase
from vllm.transformers_utils.tokenizer import get_tokenizer

def get_tokenizer(

When we use the vLLM benchmark script to benchmark other backends, we would prefer not to depend on vLLM components at all, i.e., not have to clone the repository or install the Python package.

May I submit a PR that extracts the get_tokenizer function into backend_request_func? Does this sound reasonable, or do you have other suggestions? Thanks.
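
For illustration, a minimal sketch of what a vendored get_tokenizer inside backend_request_func.py could look like, assuming only Hugging Face transformers as a dependency (parameter names here are illustrative, not the actual vLLM implementation):

# Hypothetical sketch of a vendored get_tokenizer for backend_request_func.py,
# depending only on Hugging Face transformers (not the actual vLLM code).
from typing import Union

from transformers import (AutoTokenizer, PreTrainedTokenizer,
                          PreTrainedTokenizerFast)


def get_tokenizer(
    pretrained_model_name_or_path: str,
    trust_remote_code: bool = False,
) -> Union[PreTrainedTokenizer, PreTrainedTokenizerFast]:
    """Load the model's tokenizer via transformers, without importing vLLM."""
    return AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path,
        trust_remote_code=trust_remote_code)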

@ywang96 @simon-mo

Alternatives

No response

Additional context

No response

ywang96 commented Jun 17, 2024

Hey @zhyncs! First of all, thanks for the feedback on benchmark_serving.py; I'm glad you like its functionality.

Yes, I think it makes total sense to decouple the serving benchmark from vLLM, since it doesn't really depend on the library itself other than reusing get_tokenizer. Like you said, the only thing needed should be copying the code for get_tokenizer into backend_request_func so we won't need to install vLLM at all.
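
With that change, the import block at the top of benchmark_serving.py would presumably shrink to something like the following (hypothetical, mirroring the imports quoted above but with no vllm import):

from backend_request_func import (ASYNC_REQUEST_FUNCS, RequestFuncInput,
                                  RequestFuncOutput, get_tokenizer)
from tqdm.asyncio import tqdm
from transformers import PreTrainedTokenizerBase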

Happy to review your PR!
