When querying via transformers, an <image> placeholder is used in the prompt and the images are passed as a separate input argument. This doesn't appear to be the case with TGI, which expects a single prompt input.
Something like this:
curl https://yd64jhjr8ylu54-8080.proxy.runpod.net/generate \
-X POST \
-d '{"inputs": "User: ![](http://images.cocodataset.org/val2017/000000219578.jpg)Tell me about this image<end_of_utterance>\\nAssistant:","parameters":{"max_new_tokens":20}}' \
-H 'Content-Type: application/json'
works, although it fails when passing two images (the model ignores the second image):
curl https://yd64jhjr8ylu54-8080.proxy.runpod.net/generate \
-X POST \
-d '{"inputs": "User: ![](http://images.cocodataset.org/val2017/000000219578.jpg)Tell me about this image, and also about this second image: ![](http://images.cocodataset.org/val2017/000000039769.jpg)<end_of_utterance>\\nAssistant:","parameters":{"max_new_tokens":50}}' \
-H 'Content-Type: application/json'
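For scripting, the same request body can be built in Python. This is a minimal sketch, assuming the markdown-image prompt format and the /generate payload shape from the curl examples above; the helper name build_payload is hypothetical:

```python
import json

END_TOKEN = "<end_of_utterance>"

def build_payload(image_urls, question, max_new_tokens=50):
    """Build a TGI /generate payload, embedding each image as a
    markdown image link in the prompt (as in the curl examples above)."""
    images = "".join(f"![]({url})" for url in image_urls)
    prompt = f"User: {images}{question}{END_TOKEN}\nAssistant:"
    return {"inputs": prompt,
            "parameters": {"max_new_tokens": max_new_tokens}}

payload = build_payload(
    ["http://images.cocodataset.org/val2017/000000219578.jpg",
     "http://images.cocodataset.org/val2017/000000039769.jpg"],
    "Tell me about these images.",
)
# POST json.dumps(payload) to <endpoint>/generate with
# Content-Type: application/json to reproduce the two-image case.
body = json.dumps(payload)
```

Posting this payload (e.g. with requests.post(url, json=payload)) reproduces the failing two-image case above: the request succeeds, but the model only describes the first image.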
System Info
NA
Reproduction
It is unclear how to query TGI for multi-modal models.
The documentation links for LLaVA Next and IDEFICS2 return 404:
https://huggingface.co/docs/text-generation-inference/HuggingFaceM4/idefics-9b-instruct
https://huggingface.co/docs/text-generation-inference/llava-hf/llava-v1.6-mistral-7b-hf
@Narsil @VictorSanh
Expected behavior
TGI should document how to query multi-modal models, ideally supporting the transformers-style pattern of an <image> placeholder in the prompt with the images passed as a separate input. A prompt containing two images should result in the model attending to both, rather than ignoring the second.