
Is there a way to supply a token to the hugging face inference server run time? #3693

Open
empath-nirvana opened this issue May 15, 2024 · 2 comments

@empath-nirvana
/kind feature

Describe the solution you'd like
I'd like to be able to provide a token so I can download private Hugging Face models.

I figured out how to add a secret, but nothing in the code actually checks environment variables for tokens or uses them.

The `AutoConfig.from_pretrained` function seems to support a `token` kwarg, so it should be pretty simple to add.
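For illustration, a minimal sketch of what the request amounts to: read a token from the environment and forward it to `AutoConfig.from_pretrained`. The `resolve_hf_token` helper below is hypothetical, not part of KServe:

```python
import os

def resolve_hf_token(env=os.environ):
    # Hypothetical helper: return the Hugging Face token from the
    # environment, or None so public models still load without one.
    return env.get("HF_TOKEN") or None

# The token could then be forwarded to the hub download call, e.g.:
# from transformers import AutoConfig
# config = AutoConfig.from_pretrained(model_id, token=resolve_hf_token())
```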

@yuzisun
Member

yuzisun commented May 27, 2024

Cc @andyi2it

@andyi2it
Contributor

andyi2it commented Jun 13, 2024

@empath-nirvana You should be able to do that using environment variables:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-llama2
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
      - --model_name=<model_name>
      - --model_id=<private_model>
      - --backend=huggingface
      - --task=text_generation
      env:
      - name: HF_TOKEN
        value: <token>   # or populate the variable from a Secret
```
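For the Secret variant mentioned above, the standard Kubernetes `valueFrom`/`secretKeyRef` form can be used instead of an inline value; a minimal sketch, assuming a Secret named `hf-secret` with a key `HF_TOKEN` (both names are hypothetical):

```yaml
      env:
      - name: HF_TOKEN
        valueFrom:
          secretKeyRef:
            name: hf-secret   # hypothetical Secret holding the token
            key: HF_TOKEN     # key inside that Secret
```

This keeps the token out of the InferenceService manifest itself.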
