(draft) tts: Sesame support #12549

Draft. pminev wants to merge 5 commits into master

pminev (Contributor) commented Mar 24, 2025

I've started a draft for Sesame support. So far I have only converted the safetensors models to GGUF, so the next step is to start implementing the models themselves. I'm not entirely sure the conversion is 100% correct, but I will check it along the way and fix whatever turns out to be wrong.
I also had to add some properties to the safetensors config file:

{
    "architectures": [
        "LlamaForCausalLM"
    ],
    "n_layers": 128,
    "num_hidden_layers": 128,
    "n_layer": 128,
    "num_layers": 128,
    "num_attention_heads": 128,
^added
------------------------------------
    "backbone_flavor": "llama-1B",
    "decoder_flavor": "llama-100M",
    "text_vocab_size": 128256,
    "audio_vocab_size": 2051,
    "audio_num_codebooks": 32
}
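
For anyone who wants to reproduce the config change without editing the file by hand, here is a minimal sketch. The keys and values are copied from the config above; the helper names `patch_config` and `patch_config_file` are hypothetical and not part of this PR, and the choice to leave keys that already exist untouched is an assumption.

```python
import json
from pathlib import Path

# Properties marked as "added" in the config above (values copied verbatim).
ADDED_PROPERTIES = {
    "architectures": ["LlamaForCausalLM"],
    "n_layers": 128,
    "num_hidden_layers": 128,
    "n_layer": 128,
    "num_layers": 128,
    "num_attention_heads": 128,
}


def patch_config(config: dict) -> dict:
    """Return a new dict with the added properties first.

    Keys already present in `config` keep their original values
    (an assumption; the PR does not say how conflicts are handled).
    """
    patched = dict(ADDED_PROPERTIES)
    patched.update(config)
    return patched


def patch_config_file(path) -> None:
    """Hypothetical helper: rewrite a config.json in place."""
    path = Path(path)
    config = json.loads(path.read_text())
    path.write_text(json.dumps(patch_config(config), indent=4) + "\n")
```

Calling `patch_config_file` on the config file of the downloaded checkpoint would produce a file shaped like the one shown above, with the original CSM keys preserved after the added ones.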

Since model.safetensors contains two models, I split them and then run convert_hf_to_gguf.py on each:

python3 split_hf.py

python3 convert_hf_to_gguf.py my-models/csm/backbone --outfile my-models/csm/gguf/csm-backbone-1B-f16.gguf --outtype f16
 
python3 convert_hf_to_gguf.py my-models/csm/decoder --outfile my-models/csm/gguf/csm-decoder-1B-f16.gguf --outtype f16
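
The contents of `split_hf.py` are not shown here, so the following is only a sketch of how such a split could look, not the script from this PR. It assumes that tensor names in the combined checkpoint start with `backbone.` or `decoder.`, and that stripping that prefix gives usable names; both are assumptions that should be checked against the real checkpoint. The function names and the input path are hypothetical. The output directories match the ones used in the commands above.

```python
import os

# ASSUMPTION: the combined checkpoint prefixes tensor names with one of these.
PREFIXES = ("backbone", "decoder")


def split_by_prefix(tensors, prefixes=PREFIXES):
    """Partition a {name: tensor} mapping by the first dotted name component.

    Returns (parts, unmatched): parts[prefix] holds the tensors with the
    prefix stripped; unmatched holds everything else, so nothing is
    dropped silently.
    """
    parts = {prefix: {} for prefix in prefixes}
    unmatched = {}
    for name, tensor in tensors.items():
        head, _, rest = name.partition(".")
        if head in parts and rest:
            parts[head][rest] = tensor
        else:
            unmatched[name] = tensor
    return parts, unmatched


def split_checkpoint(src, out_root):
    """Hypothetical driver: write one model.safetensors per prefix."""
    # Imported lazily so split_by_prefix works without safetensors/torch.
    from safetensors.torch import load_file, save_file

    parts, unmatched = split_by_prefix(load_file(src))
    for prefix, tensors in parts.items():
        out_dir = os.path.join(out_root, prefix)
        os.makedirs(out_dir, exist_ok=True)
        save_file(tensors, os.path.join(out_dir, "model.safetensors"))
    return sorted(unmatched)  # names that matched neither prefix
```

Something like `split_checkpoint("my-models/csm/model.safetensors", "my-models/csm")` would then fill `my-models/csm/backbone` and `my-models/csm/decoder`. Each directory also needs its own `config.json` before `convert_hf_to_gguf.py` can read it, and any names returned as unmatched would have to be assigned to one of the two models by hand.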

@pminev pminev marked this pull request as draft March 24, 2025 17:14
@github-actions github-actions bot added examples python python script changes labels Mar 24, 2025