Feature request
Hello, I would like to request support for audio inputs in the apply_chat_template function.
Motivation
With the rapid advancement of multimodal models, audio has become an increasingly important input alongside images and text. Models such as Qwen2-Audio and Phi-4-multimodal now support audio understanding, making this feature essential for modern multimodal applications.
Supporting audio inputs would enable conversations like the following:
messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant."}],
    },
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"},
            {"type": "audio", "audio": "https://huggingface.co/microsoft/Phi-4-multimodal-instruct/resolve/main/examples/what_is_shown_in_this_image.wav"},
            {"type": "text", "text": "Follow the instruction in the audio with this image."},
        ],
    },
]
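For illustration, here is a minimal sketch of how such a conversation might be consumed once audio is supported. The Qwen2-Audio checkpoint name and the exact keyword arguments accepted by a processor-level apply_chat_template are assumptions for the sake of the example, not a confirmed API:

from transformers import AutoProcessor

# Sketch only: assumes apply_chat_template learns to load/download "audio"
# entries the same way "image" entries are handled today. The checkpoint
# and keyword arguments below are illustrative assumptions.
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-Audio-7B-Instruct")

inputs = processor.apply_chat_template(
    messages,  # the multimodal conversation defined above
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
)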
This enhancement would significantly expand the library's ability to handle the full spectrum of multimodal inputs that state-of-the-art models now support, keeping transformers at the forefront of multimodal AI development.
Your contribution
I've tested an implementation of this with several multimodal models, and it handles audio inputs alongside images and text well. I'd be happy to contribute the code upstream if there's interest.