Add support for parler-tts #25

adrianlyjak · 2024-04-14T17:25:00Z

It would be pretty cool to support Parler-TTS, an open-source TTS model with fairly good quality and voice customization.

If running the model locally, the inference code is currently Python only. Perhaps there could be a bridge, in which case this would be a Node.js-only feature.

I don't know how much work it would take to make the model run with JavaScript + ONNX, but that would also be pretty cool, and likely useful to others who want better on-device TTS.

xujialiu · 2024-05-20T06:11:52Z

I'd like to help with this project.
However, I can only work in Python. Suppose I develop a Python API like the MyAI class below:

from pathlib import Path
from myai import MyAI

model = "tts-1"

client = MyAI(model=model)

speech_file_path = Path(".") / "speech.mp3"

response = client.create(
    model=model,
    voice="alloy",
    input="Today is a wonderful day to build something people love!",
)

with open(speech_file_path, "wb") as f:
    f.write(response.content)

This API can easily be converted to a Node.js API using the following code:

// myai.js
import fetch from 'node-fetch';

class MyAI {
  constructor({ model }) {
    this.model = model;
  }

  async create({ model, voice, input }) {
    const response = await fetch('https://api.yourservice.com/v1/audio/speech', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer YOUR_API_KEY`, // if authentication is required
      },
      body: JSON.stringify({
        model,
        voice,
        input,
      }),
    });

    if (!response.ok) {
      throw new Error(`Error: ${response.statusText}`);
    }

    const arrayBuffer = await response.arrayBuffer();
    return Buffer.from(arrayBuffer);
  }
}

export default MyAI;

The MyAI class can then be used through an API similar to OpenAI's:

// main.js
import fs from 'fs';
import path from 'path';
import MyAI from './myai.js';

const client = new MyAI({ model: 'tts-1' });

const speechFilePath = path.resolve('./speech.mp3');

async function main() {
  try {
    const buffer = await client.create({
      model: 'tts-1',
      voice: 'alloy',
      input: 'Today is a wonderful day to build something people love!',
    });

    await fs.promises.writeFile(speechFilePath, buffer);
    console.log(`Speech file saved at ${speechFilePath}`);
  } catch (error) {
    console.error('Error generating speech:', error);
  }
}

main();

I should mention that the JavaScript code was actually written by GPT-4o; I have not learned JavaScript. If this approach can work, I'd like to help develop the Python API.
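
For reference, here is a rough sketch of what the Python side of such a service could look like, using only the standard library. Everything in it is an assumption for illustration: the names `synthesize`, `handle_speech_request`, and `serve`, the port, and the choice to return WAV rather than MP3. The `synthesize` function is a placeholder that returns a short tone; a real shim would run Parler-TTS inference there instead.

```python
import io
import json
import math
import struct
import wave
from http.server import BaseHTTPRequestHandler, HTTPServer

SAMPLE_RATE = 16_000  # assumed sample rate for the placeholder audio


def synthesize(text: str, voice: str) -> bytes:
    """Placeholder for real inference: returns a short 440 Hz tone as WAV bytes.

    `voice` is accepted but ignored here; a real backend would use it.
    """
    seconds = min(5.0, max(0.2, 0.05 * len(text)))  # longer text, longer tone
    n_samples = int(SAMPLE_RATE * seconds)
    frames = b"".join(
        struct.pack("<h", int(8000 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)))
        for i in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)
    return buf.getvalue()


def handle_speech_request(raw_body: bytes) -> tuple[int, str, bytes]:
    """Map a JSON request body to (status, content type, response body)."""
    try:
        payload = json.loads(raw_body)
        text = payload["input"]
        voice = payload.get("voice", "default")
        if not isinstance(text, str):
            raise TypeError("input must be a string")
    except (ValueError, KeyError, TypeError, AttributeError):
        error = json.dumps({"error": "expected a JSON object with a string 'input'"})
        return 400, "application/json", error.encode()
    return 200, "audio/wav", synthesize(text, voice)


class SpeechHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Same path the Node.js client posts to.
        if self.path != "/v1/audio/speech":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, content_type, body = handle_speech_request(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the shim until interrupted."""
    HTTPServer((host, port), SpeechHandler).serve_forever()
```

Calling `serve()` would start the server locally; the Node.js client would then need to point at `http://127.0.0.1:8000/v1/audio/speech` and save the result as a `.wav` file.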

Thanks.

adrianlyjak · 2024-05-22T14:11:26Z

Thanks for looking, @xujialiu! I wouldn't expect you to be able to port parler-tts to JS. Converting parler-tts to JavaScript is non-trivial. The model code, dependencies, and weights need to be converted to something that can run from JavaScript, or this needs to become a desktop-only feature with some sort of Python shim layer. That alone is a rather huge undertaking. On top of that, the models are rather large and need to be downloaded, which means giving the user some sort of feedback about download progress. Finally, on many devices they will run slower than real time.
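
To make the last two concerns concrete, here is a small Python sketch under stated assumptions. `copy_with_progress` and `real_time_factor` are hypothetical helper names, not part of parler-tts or this project. The first reports progress while streaming a large file in chunks; the second measures how synthesis time compares with the duration of the audio produced, where a value above 1.0 means slower than real time.

```python
import time
from typing import BinaryIO, Callable, Sequence


def copy_with_progress(
    src: BinaryIO,
    dst: BinaryIO,
    total_bytes: int,
    on_progress: Callable[[int, int], None],
    chunk_size: int = 1 << 20,
) -> int:
    """Copy src to dst in chunks, calling on_progress(done, total) after each."""
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        done += len(chunk)
        on_progress(done, total_bytes)
    return done


def real_time_factor(
    synthesize: Callable[[str], Sequence],
    text: str,
    sample_rate: int,
) -> float:
    """Return synthesis time divided by audio duration.

    `synthesize` is assumed to return the generated samples for `text`.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    audio_seconds = len(samples) / sample_rate
    return elapsed / audio_seconds
```

For a weights download, `src` could be the response object returned by `urllib.request.urlopen`, with `total_bytes` taken from its `Content-Length` header when the server provides one.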

adrianlyjak added the enhancement (New feature or request) label Apr 14, 2024

adrianlyjak mentioned this issue Apr 21, 2024

audio is regenerated frequently #28

Closed

adrianlyjak added the model-support label Apr 26, 2024
