
Adding support for 0-shot classification pipeline with TEI on HF inference endpoints #221

Open
MoritzLaurer opened this issue Mar 28, 2024 · 1 comment

Feature request

It would be amazing if 0-shot text classifiers that are designed to work with the HF zero-shot pipeline were supported by TEI and HF inference endpoints.

I tried a deployment like this:

from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    name="roberta-base-zeroshot-v2-0-test4",  #"roberta-emotions-test2",  #"roberta-base-zeroshot-v2-0-test",
    repository="MoritzLaurer/roberta-base-zeroshot-v2.0",  #"SamLowe/roberta-base-go_emotions",  #"MoritzLaurer/roberta-base-zeroshot-v2.0",
    namespace="MoritzLaurer",
    framework="pytorch",
    task="zero-shot-classification",  #"zero-shot-classification",  #"text-classification",
    accelerator="gpu",
    vendor="aws",
    region="us-east-1",
    type="protected",
    min_replica=0,
    max_replica=1,
    instance_type="g5.2xlarge",  # options: https://github.com/huggingface/hf-endpoints/issues/1090#issuecomment-1909482979
    instance_size="medium",
    custom_image={
        "health_route": "/health",
        # params: https://github.com/huggingface/text-embeddings-inference?tab=readme-ov-file#docker
        "env": {
            #"MAX_BATCH_TOKENS": "16384",
            #"MAX_CONCURRENT_REQUESTS": "512",
            #"DTYPE": "float16",
            "MODEL_ID": "/repository",
        },
        "url": "ghcr.io/huggingface/text-embeddings-inference:86-1.2",  # options: https://github.com/huggingface/text-embeddings-inference?tab=readme-ov-file#docker-images
    },
)
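
For completeness, a minimal sketch of waiting until the endpoint is ready before querying it. This assumes huggingface_hub's InferenceEndpoint object, which exposes .wait(), .status, and .url:

# Assumption: `endpoint` is the InferenceEndpoint returned by create_inference_endpoint above.
endpoint.wait()  # block until the endpoint reaches the "running" state
print(endpoint.status, endpoint.url)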

But querying the TEI endpoint seems to ignore the 0-shot pipeline parameters:

import os
import requests

API_URL = endpoint.url  # also tried endpoint.url + "/predict"
headers = {
    "Accept": "application/json",
    "Authorization": f"Bearer {os.getenv('HF_TOKEN')}",
    "Content-Type": "application/json",
}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "I like you. I love you",
    # https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.ZeroShotClassificationPipeline.__call__
    "parameters": {
        "hypothesis_template": "This example is {}",
        "candidate_labels": ["joy", "anger", "sadness", "surprise", "fear", "love", "hate"],
        "multi_label": False,
    },
})

print(output)
# [{'score': 0.57349926, 'label': 'entailment'},
#  {'score': 0.4265007, 'label': 'not_entailment'}]

The output is the raw model output, as if the model were used as a plain text classifier. The expectation is that the task="zero-shot-classification" flag changes how the input is processed internally, in line with the 0-shot pipeline, and that the endpoint returns a probability for each class in "candidate_labels".

In the endpoint playground, the model deployed with the TEI container produces the following error:
[Screenshot: error message shown in the endpoint playground, 2024-03-28]

The same model with the same deployment code works if I do not use a custom_image with a TEI container. I imagine this is because the 0-shot pipeline is not supported by TEI? (Not sure whether changes to inference endpoints would be required for this as well.)

Note: one API call with one text and 8 candidate labels requires 8 forward passes through the model (one per label), given how the 0-shot pipeline and NLI-based 0-shot models work. I am not sure to what extent this complicates things for TEI and features like continuous batching.
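
To illustrate, a simplified sketch of how the zero-shot pipeline expands one text into one NLI (premise, hypothesis) pair per candidate label; the real pipeline batches these pairs through the model and normalizes the entailment probabilities across labels:

# Simplified illustration of the NLI-based 0-shot scheme; not TEI or pipeline code.
text = "I like you. I love you"
hypothesis_template = "This example is {}"
candidate_labels = ["joy", "anger", "sadness", "surprise", "fear", "love", "hate"]

# One (premise, hypothesis) pair per label -> one forward pass per label.
pairs = [(text, hypothesis_template.format(label)) for label in candidate_labels]
for premise, hypothesis in pairs:
    print(f"premise={premise!r}  hypothesis={hypothesis!r}")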

Motivation

Zero-shot classifiers are downloaded millions of times via the HF Hub and are among the default models in the HF inference endpoint catalogue.
See also this internal thread on upcoming new 0-shot classifiers.

@OlivierDehaene

Your contribution

Happy to contribute to this feature

delibae commented Apr 13, 2024

I too am interested in seeing this feature realized.
