
Is there a way to supply a token to the hugging face inference server run time? #3693

Open
empath-nirvana opened this issue May 15, 2024 · 2 comments

@empath-nirvana
/kind feature

Describe the solution you'd like
I'd like to be able to provide a token so I can download private Hugging Face models.

I figured out how to add a secret, but nothing in the code actually checks environment variables for tokens or uses them.

The `AutoConfig.from_pretrained` function seems to support a `token` kwarg, so it should be pretty simple to add.
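For illustration, a minimal sketch of what the request amounts to: read a token from the environment and forward it to `AutoConfig.from_pretrained`. The `resolve_hf_token` helper below is hypothetical, not part of KServe:

```python
import os

def resolve_hf_token(env=os.environ):
    # Hypothetical helper: return the Hugging Face token from the
    # environment, or None so public models still load without one.
    return env.get("HF_TOKEN") or None

# The token could then be forwarded to the hub download call, e.g.:
# from transformers import AutoConfig
# config = AutoConfig.from_pretrained(model_id, token=resolve_hf_token())
```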

@yuzisun
Member

yuzisun commented May 27, 2024

Cc @andyi2it

@andyi2it
Contributor

andyi2it commented Jun 13, 2024

@empath-nirvana You should be able to do that using environment variables:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-llama2
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
      - --model_name=<model_name>
      - --model_id=<private_model>
      - --backend=huggingface
      - --task=text_generation
      env:
      - name: HF_TOKEN
        value: <token>   # or populate the variable from a Secret
```
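For the Secret variant mentioned above, the standard Kubernetes `valueFrom`/`secretKeyRef` form can be used instead of an inline value; a minimal sketch, assuming a Secret named `hf-secret` with a key `HF_TOKEN` (both names are hypothetical):

```yaml
      env:
      - name: HF_TOKEN
        valueFrom:
          secretKeyRef:
            name: hf-secret   # hypothetical Secret holding the token
            key: HF_TOKEN     # key inside that Secret
```

This keeps the token out of the InferenceService manifest itself.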
