Add support for Chatterbox #1434

@brianferri

Description

Model description

Chatterbox is a multilingual, zero-shot Text-to-Speech (TTS) model designed for flexible voice synthesis across a wide range of languages without requiring task-specific fine-tuning. It is built on a 0.5B parameter Llama-based architecture.

More information is available on DeepWiki.

Prerequisites

  • The model is supported in Transformers (i.e., listed here)
  • The model can be exported to ONNX with Optimum (i.e., listed here)

Additional information

The main repo, along with some examples, can be found here.
The onnx-community has already given the model some attention and provides ONNX exports of the pre-trained model here.

Your contribution

With appropriate guidance, I can contribute directly to the implementation of this feature.

Labels: new model (Request a new model)
