WoosukKwon
Collaborator

This PR fixes test_worker.py and renames it to test_model_runner.py.

WoosukKwon changed the title from "Fix worker test" to "Fix broken worker test" on Dec 3, 2023
WoosukKwon merged commit cd3aa15 into main on Dec 3, 2023
WoosukKwon deleted the fix-worker-test branch on December 3, 2023 06:17
@WoosukKwon
Collaborator Author

@simon-mo thanks for the quick review!

xjpang pushed a commit to xjpang/vllm that referenced this pull request Dec 4, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
jinyouzhi pushed a commit to jinyouzhi/vllm that referenced this pull request Sep 12, 2025
… branch aice/v1.22.0 (vllm-project#1900)

Adds support for Qwen3-, BERT-, and RoBERTa-based Rerank and Score on HPU,
covering both online and offline inference.
Example models:
BAAI/bge-reranker-base, BAAI/bge-reranker-large,
BAAI/bge-reranker-v2-m3, cross-encoder/quora-roberta-base,
cross-encoder/ms-marco-MiniLM-L-6-v2,
tomaarsen/Qwen3-Reranker-0.6B-seq-cls, Qwen/Qwen3-Reranker-0.6B

Qwen3 online serving:
vllm serve Qwen/Qwen3-Reranker-0.6B \
  --hf_overrides '{"architectures": ["Qwen3ForSequenceClassification"], "classifier_from_token": ["no", "yes"], "is_original_qwen3_reranker": true}'
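
Hand-quoting JSON on a shell command line is easy to get wrong, so one option is to generate the serve command from a dictionary. The sketch below only builds the command string using the Python standard library; the helper name `build_serve_command` and the constant `QWEN3_OVERRIDES` are illustrative assumptions, not part of vLLM, and the override keys are copied from the command above.

```python
import json
import shlex

# Override values copied from the serve command in this commit message.
QWEN3_OVERRIDES = {
    "architectures": ["Qwen3ForSequenceClassification"],
    "classifier_from_token": ["no", "yes"],
    "is_original_qwen3_reranker": True,
}


def build_serve_command(model: str, overrides: dict) -> str:
    """Return a shell-safe `vllm serve` command line (hypothetical helper)."""
    # Serialize the overrides to JSON, then let shlex handle the shell quoting.
    overrides_json = json.dumps(overrides)
    parts = ["vllm", "serve", model, "--hf_overrides", overrides_json]
    return " ".join(shlex.quote(part) for part in parts)


command = build_serve_command("Qwen/Qwen3-Reranker-0.6B", QWEN3_OVERRIDES)
print(command)
```

Splitting the result with `shlex.split` recovers the original arguments, which is a quick way to check the quoting before running the command.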

Client requests:
curl http://127.0.0.1:8000/score \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "text_1": "ping",
    "text_2": "pong",
    "model": "Qwen/Qwen3-Reranker-0.6B"
  }'

curl http://127.0.0.1:8000/rerank \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "ping",
    "documents": ["pong"],
    "model": "Qwen/Qwen3-Reranker-0.6B"
  }'
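
The two curl requests above can also be issued from Python. This is a minimal sketch using only the standard library, assuming the server from the serve command is listening on 127.0.0.1:8000; the helper names (`score_payload`, `rerank_payload`, `build_request`, `post_json`) are hypothetical, and the payload fields and headers are copied from the curl examples.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # address used in the curl examples
MODEL = "Qwen/Qwen3-Reranker-0.6B"


def score_payload(text_1: str, text_2: str, model: str = MODEL) -> dict:
    """Body for POST /score, mirroring the first curl example."""
    return {"text_1": text_1, "text_2": text_2, "model": model}


def rerank_payload(query: str, documents: list, model: str = MODEL) -> dict:
    """Body for POST /rerank, mirroring the second curl example."""
    return {"query": query, "documents": documents, "model": model}


def build_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as response:
        return json.load(response)
```

With the server running, `post_json("/score", score_payload("ping", "pong"))` and `post_json("/rerank", rerank_payload("ping", ["pong"]))` send the same requests as the curl commands. The response format is not shown in this commit message, so it is not reproduced here.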

---------

Signed-off-by: gyou2021 <ganmei.you@intel.com>
Signed-off-by: inkcherry <bo.o.li@intel.com>
Co-authored-by: wang.yuqi <noooop@126.com>
Co-authored-by: inkcherry <bo.o.li@intel.com>