[Bug] EdgeCraftRAG failed on ARC with vLLM #1880

### Priority

P2-High

### OS type

Ubuntu

### Hardware type

GPU-Arc

### Installation method

- [ ] Pull docker images from hub.docker.com
- [x] Build docker images from source
- [ ] Other
- [ ] N/A

### Deploy method

- [ ] Docker
- [x] Docker Compose
- [ ] Kubernetes Helm Charts
- [ ] Kubernetes GMC
- [ ] Other
- [ ] N/A

### Running nodes

Single Node

### What's the version?

main branch

### Description

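The EdgeCraftRAG test on Arc GPU with vLLM fails in CI. The `query` check returns HTTP 500, and the vLLM API server exits while building its engine config with `requests.exceptions.MissingSchema`: the URL it requests for `sentence_bert_config.json` of `Qwen/Qwen2-7B-Instruct` has no scheme or host.

CI log: https://github.com/opea-project/GenAIExamples/actions/runs/14635240084/job/41064885071?pr=1877#step:6:11383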

### Reproduce steps

Run the test script: `bash test_compose_vllm_on_arc.sh`

### Raw log

```shell
[ query ] HTTP status is not 200. Received status was 500
  /usr/lib/python3.10/importlib/util.py:247: DeprecationWarning: The `openvino.runtime` module is deprecated and will be removed in the 2026.0 release. Please replace `openvino.runtime` with `openvino`.
    self.__spec__.loader.exec_module(self)
  Traceback (most recent call last):
    File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
      exec(code, run_globals)
    File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 774, in <module>
      uvloop.run(run_server(args))
    File "/usr/local/lib/python3.10/dist-packages/uvloop/__init__.py", line 82, in run
      return loop.run_until_complete(wrapper())
    File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
    File "/usr/local/lib/python3.10/dist-packages/uvloop/__init__.py", line 61, in wrapper
      return await main
    File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 740, in run_server
      async with build_async_engine_client(args) as engine_client:
    File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__
      return await anext(self.gen)
    File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 118, in build_async_engine_client
      async with build_async_engine_client_from_engine_args(
    File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__
      return await anext(self.gen)
    File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 210, in build_async_engine_client_from_engine_args
      engine_config = engine_args.create_engine_config()
    File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 1044, in create_engine_config
      model_config = self.create_model_config()
    File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 970, in create_model_config
      return ModelConfig(
    File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 289, in __init__
      self.encoder_config = self._get_encoder_config()
    File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 402, in _get_encoder_config
      return get_sentence_transformer_tokenizer_config(
    File "/usr/local/lib/python3.10/dist-packages/vllm/transformers_utils/config.py", line 394, in get_sentence_transformer_tokenizer_config
      encoder_dict = get_hf_file_to_dict(config_name, model, revision)
    File "/usr/local/lib/python3.10/dist-packages/vllm/transformers_utils/config.py", line 280, in get_hf_file_to_dict
      if file_or_path_exists(model=model,
    File "/usr/local/lib/python3.10/dist-packages/vllm/transformers_utils/config.py", line 98, in file_or_path_exists
      return file_exists(model,
    File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
      return fn(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/hf_api.py", line 2958, in file_exists
      get_hf_file_metadata(url, token=token)
    File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
      return fn(*args, **kwargs)
    File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1401, in get_hf_file_metadata
      r = _request_wrapper(
    File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 285, in _request_wrapper
      response = _request_wrapper(
    File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 308, in _request_wrapper
      response = get_session().request(method=method, url=url, **params)
    File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 575, in request
      prep = self.prepare_request(req)
    File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 484, in prepare_request
      p.prepare(
    File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 367, in prepare
      self.prepare_url(url, params)
    File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 438, in prepare_url
      raise MissingSchema(
  requests.exceptions.MissingSchema: Invalid URL '/Qwen/Qwen2-7B-Instruct/resolve/main/sentence_bert_config.json': No scheme supplied. Perhaps you meant https:///Qwen/Qwen2-7B-Instruct/resolve/main/sentence_bert_config.json?
```
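The URL in the final `MissingSchema` line starts at `/Qwen/...`, so the Hub endpoint prefix appears to be empty. One possible cause, not confirmed by this log, is that `HF_ENDPOINT` is set to an empty string inside the vLLM container. The sketch below only illustrates that hypothesis using the standard library. `build_hub_url`, `has_http_scheme`, and `endpoint_problem` are hypothetical helpers, not `huggingface_hub` or vLLM functions.

```python
import os
from urllib.parse import urlsplit


def build_hub_url(endpoint: str, repo_id: str, filename: str, revision: str = "main") -> str:
    """Hypothetical helper: join an endpoint prefix and a repo file path."""
    return f"{endpoint}/{repo_id}/resolve/{revision}/{filename}"


def has_http_scheme(url: str) -> bool:
    """Return True when the URL starts with http:// or https://."""
    return urlsplit(url).scheme in ("http", "https")


def endpoint_problem(env=None):
    """Return a message if HF_ENDPOINT is set but unusable, else None."""
    env = os.environ if env is None else env
    value = env.get("HF_ENDPOINT")
    if value is None:
        return None  # unset: the library default applies
    if not has_http_scheme(value):
        return f"HF_ENDPOINT={value!r} has no http(s) scheme"
    return None


if __name__ == "__main__":
    # An empty prefix yields a path-only URL like the one in the traceback.
    print(build_hub_url("", "Qwen/Qwen2-7B-Instruct", "sentence_bert_config.json"))
    print(endpoint_problem())
```

Running a check like `endpoint_problem()` inside the container before the server starts would show whether the variable is set but empty. If it is, the fix would likely belong in the compose file or test script rather than in vLLM.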

### Attachments

_No response_
