
Adding support for 0-shot classification pipeline with TEI on HF inference endpoints #221

Open
MoritzLaurer opened this issue Mar 28, 2024 · 1 comment

Feature request

It would be amazing if 0-shot text classifiers that are designed to work with the HF zero-shot pipeline were supported by TEI and HF inference endpoints.

I tried a deployment like this:

from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    name="roberta-base-zeroshot-v2-0-test4",  #"roberta-emotions-test2",  #"roberta-base-zeroshot-v2-0-test",
    repository="MoritzLaurer/roberta-base-zeroshot-v2.0",  #"SamLowe/roberta-base-go_emotions",  #"MoritzLaurer/roberta-base-zeroshot-v2.0",
    namespace="MoritzLaurer",
    framework="pytorch",
    task="zero-shot-classification",  #"zero-shot-classification",  #"text-classification",
    accelerator="gpu",
    vendor="aws",
    region="us-east-1",
    type="protected",
    min_replica=0,
    max_replica=1,
    instance_type="g5.2xlarge",  # options: https://github.com/huggingface/hf-endpoints/issues/1090#issuecomment-1909482979
    instance_size="medium",
    custom_image={
        "health_route": "/health",
        # params: https://github.com/huggingface/text-embeddings-inference?tab=readme-ov-file#docker
        "env": {
            #"MAX_BATCH_TOKENS": "16384",
            #"MAX_CONCURRENT_REQUESTS": "512",
            #"DTYPE": "float16",
            "MODEL_ID": "/repository",
        },
        "url": "ghcr.io/huggingface/text-embeddings-inference:86-1.2",  # options: https://github.com/huggingface/text-embeddings-inference?tab=readme-ov-file#docker-images
    },
)
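
For completeness, a minimal sketch of waiting until the endpoint is ready before querying it. This assumes huggingface_hub's InferenceEndpoint object, which exposes .wait(), .status, and .url:

# Assumption: `endpoint` is the InferenceEndpoint returned by create_inference_endpoint above.
endpoint.wait()  # block until the endpoint reaches the "running" state
print(endpoint.status, endpoint.url)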

But querying the TEI endpoint seems to ignore the 0-shot pipeline parameters:

import os
import requests

API_URL = endpoint.url  # also tried endpoint.url + "/predict"
headers = {
    "Accept": "application/json",
    "Authorization": f"Bearer {os.getenv('HF_TOKEN')}",
    "Content-Type": "application/json",
}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "I like you. I love you",
    # https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.ZeroShotClassificationPipeline.__call__
    "parameters": {
        "hypothesis_template": "This example is {}",
        "candidate_labels": ["joy", "anger", "sadness", "surprise", "fear", "love", "hate"],
        "multi_label": False,
    },
})

print(output)
# [{'score': 0.57349926, 'label': 'entailment'},
#  {'score': 0.4265007, 'label': 'not_entailment'}]

The output is the raw model output, as if the model were used as a plain text classifier. The expectation is that the task="zero-shot-classification" flag changes how the input is processed internally, in line with the 0-shot pipeline, and that the endpoint returns a probability for each class in "candidate_labels".

In the endpoint playground, the model deployed with the TEI container produces the following error:
[Screenshot: error message shown in the endpoint playground, 2024-03-28]

The same model with the same deployment code works if I do not use a custom_image with a TEI container. I imagine this is because the 0-shot pipeline is not supported by TEI? (Not sure whether changes to inference endpoints would be required for this as well.)

Note: one API call with one text and 8 candidate labels requires 8 forward passes through the model (one per label), given how the 0-shot pipeline and NLI-based 0-shot models work. I am not sure to what extent this complicates things for TEI and features like continuous batching.
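
To illustrate, a simplified sketch of how the zero-shot pipeline expands one text into one NLI (premise, hypothesis) pair per candidate label; the real pipeline batches these pairs through the model and normalizes the entailment probabilities across labels:

# Simplified illustration of the NLI-based 0-shot scheme; not TEI or pipeline code.
text = "I like you. I love you"
hypothesis_template = "This example is {}"
candidate_labels = ["joy", "anger", "sadness", "surprise", "fear", "love", "hate"]

# One (premise, hypothesis) pair per label -> one forward pass per label.
pairs = [(text, hypothesis_template.format(label)) for label in candidate_labels]
for premise, hypothesis in pairs:
    print(f"premise={premise!r}  hypothesis={hypothesis!r}")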

Motivation

Zero-shot classifiers are downloaded millions of times via the HF Hub and are among the default models in the HF inference endpoint catalogue.
See also this internal thread on upcoming new 0-shot classifiers.

@OlivierDehaene

Your contribution

Happy to contribute to this feature

delibae commented Apr 13, 2024

I too am interested in seeing this feature realized.
