Huggingface server fails to start on OpenShift cluster #3562

Closed
eyalcha opened this issue Mar 31, 2024 · 3 comments · Fixed by #3576

@eyalcha
Contributor

eyalcha commented Mar 31, 2024

/kind bug

The image probably needs something like this in its Dockerfile for OpenShift:

# For OpenShift deployment to allow mkdir
ENV HUGGINGFACE_HUB_CACHE="/tmp/huggingface/hub"
Without it, the server fails at startup with:

There was a problem when trying to write in your cache folder (/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/huggingfaceserver/huggingfaceserver/__main__.py", line 69, in <module>
    model.load()
  File "/huggingfaceserver/huggingfaceserver/model.py", line 114, in load
    model_config = AutoConfig.from_pretrained(model_id_or_path)
  File "/prod_venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1100, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/prod_venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 634, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/prod_venv/lib/python3.10/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
    resolved_config_file = cached_file(
  File "/prod_venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 385, in cached_file
    resolved_file = hf_hub_download(
  File "/prod_venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
  File "/prod_venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1418, in hf_hub_download
    os.makedirs(os.path.dirname(blob_path), exist_ok=True)
  File "/usr/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 1 more time]
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/.cache'
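
The `/.cache` path in the error suggests the home directory resolves to `/` inside the container, which is not writable for the user the pod runs as. The sketch below is only an approximation of how the hub cache location appears to be resolved, not the library's actual code: `resolve_hub_cache` is a hypothetical helper, and the lookup order (an explicit `HF_HUB_CACHE` or `HUGGINGFACE_HUB_CACHE` first, then `HF_HOME`, then `XDG_CACHE_HOME`, then `~/.cache`) is an assumption based on the documented environment variables.

```python
import posixpath


def resolve_hub_cache(env):
    """Approximate the hub cache lookup (assumed order, not the library's code)."""
    home = env.get("HOME", "/")

    def expand(path):
        # Minimal "~" expansion against the HOME value passed in.
        if path == "~" or path.startswith("~/"):
            return posixpath.normpath(posixpath.join(home, path[2:]))
        return path

    # An explicit override wins over any default derived from HOME.
    override = env.get("HF_HUB_CACHE") or env.get("HUGGINGFACE_HUB_CACHE")
    if override:
        return expand(override)

    cache_root = env.get("XDG_CACHE_HOME", "~/.cache")
    hf_home = expand(env.get("HF_HOME", posixpath.join(cache_root, "huggingface")))
    return posixpath.join(hf_home, "hub")


# HOME is "/" and nothing else is set: the path from the error message.
print(resolve_hub_cache({"HOME": "/"}))
# With the proposed ENV line, the cache moves under /tmp.
print(resolve_hub_cache({"HOME": "/", "HUGGINGFACE_HUB_CACHE": "/tmp/huggingface/hub"}))
```

Under these assumptions the first call yields `/.cache/huggingface/hub` and the second `/tmp/huggingface/hub`, which is why pointing the variable at a directory under `/tmp` avoids the `mkdir` on `/.cache`.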
@spolti
Contributor

spolti commented Apr 1, 2024

@terrytangyuan @israel-hdez FYI

@eyalcha
Contributor Author

eyalcha commented Apr 2, 2024

Setting the variable in the container spec works for now:

  containers:
    - name: kserve-container
      image: huggingfaceserver:replace
      args:
        - --model_name={{.Name}}
      env:
        - name: HUGGINGFACE_HUB_CACHE
          value: /tmp/huggingface/hub
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "1"
          memory: 2Gi
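
For anyone who cannot change the container spec, a similar effect might be achievable from Python at startup. This is a minimal sketch, not what the server does: `first_writable_dir` is a hypothetical helper, the fallback under the system temp directory is an assumption, and the variable would likely need to be set before `transformers` or `huggingface_hub` is imported, since the cache path appears to be read at import time.

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first candidate that can be created and written to, else None."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
            # Creating a temporary file proves the directory is really writable.
            with tempfile.TemporaryFile(dir=path):
                pass
            return path
        except OSError:
            continue
    return None


# Prefer a value that is already set; otherwise fall back to the temp directory.
candidates = [
    os.environ.get("HUGGINGFACE_HUB_CACHE", ""),
    os.path.join(tempfile.gettempdir(), "huggingface", "hub"),
]
cache_dir = first_writable_dir([c for c in candidates if c])
if cache_dir is not None:
    os.environ["HUGGINGFACE_HUB_CACHE"] = cache_dir
```

The write probe is there because a directory can exist and still reject writes for the user the pod runs as, so checking for existence alone would not be enough.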

terrytangyuan added a commit to terrytangyuan/kserve that referenced this issue Apr 4, 2024
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
@terrytangyuan
Member

Thanks. I sent a PR to add this: #3576

yuzisun pushed a commit that referenced this issue Apr 9, 2024
…#3576)

* Set writable cache folder to avoid permission issue. Fixes #3562

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Update huggingface_server.Dockerfile

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Empty-Commit

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
tjandy98 pushed a commit to tjandy98/kserve that referenced this issue Apr 10, 2024
…e#3562 (kserve#3576)

* Set writable cache folder to avoid permission issue. Fixes kserve#3562

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Update huggingface_server.Dockerfile

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

* Empty-Commit

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>