Add support for parler-tts #25

Open
adrianlyjak opened this issue Apr 14, 2024 · 2 comments
Labels
enhancement New feature or request model-support

Comments

@adrianlyjak
Owner

It would be pretty cool to support Parler-TTS, an open source TTS with fairly good quality and voice customization.

If the model runs locally, the inference code is currently Python only. Perhaps there could be a bridge to it, which would make this a Node.js-only feature.
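To make the bridge idea concrete, here is a minimal sketch of what the Python side could look like: a worker that a Node.js parent process spawns and talks to over stdin/stdout, one JSON object per line. Everything here is an assumption for illustration, not part of Parler-TTS or this project: the line protocol, the field names `id`, `text`, `description`, and `audio`, and the `synthesize` placeholder, which returns silence where the real inference call would go.

```python
import base64
import io
import json
import sys
import wave


def synthesize(text: str, description: str) -> bytes:
    """Hypothetical placeholder: returns one second of silent 16-bit mono WAV.

    A real worker would run the Parler-TTS inference code here instead.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 16000)
    return buffer.getvalue()


def handle_line(line: str) -> str:
    """Turn one JSON request line into one JSON response line."""
    try:
        request = json.loads(line)
        audio = synthesize(request["text"], request.get("description", ""))
        response = {
            "id": request.get("id"),
            # Binary audio is base64-encoded so it fits in a JSON string.
            "audio": base64.b64encode(audio).decode("ascii"),
        }
    except (ValueError, KeyError, TypeError, AttributeError) as error:
        response = {"error": f"bad request: {error}"}
    return json.dumps(response)


def serve(requests=sys.stdin, responses=sys.stdout) -> None:
    """Read requests line by line and answer each with one response line."""
    for line in requests:
        if line.strip():
            # Flush after every response so the parent is never left waiting.
            print(handle_line(line), file=responses, flush=True)
```

Calling `serve()` at the bottom of the script would turn it into a long-running worker. The model would then load once per process rather than once per request, which matters given how large the weights are.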

I don't know how much work it would take to make the model run with JavaScript + ONNX, but that would also be pretty cool, and likely useful to others who want better on-device TTS.

@xujialiu

I'd like to help with this project.
However, I can only work in Python. Suppose I develop a Python API like the MyAI class below:

from pathlib import Path
from myai import MyAI

model = "tts-1"

client = MyAI(model=model)

speech_file_path = Path(".") / "speech.mp3"

response = client.create(
    model=model,
    voice="alloy",
    input="Today is a wonderful day to build something people love!",
)

with open(speech_file_path, "wb") as f:
    f.write(response.content)

This API can easily be converted to a Node.js API with the following code:

// myai.js
import fetch from 'node-fetch';

class MyAI {
  constructor({ model }) {
    this.model = model;
  }

  async create({ model, voice, input }) {
    const response = await fetch('https://api.yourservice.com/v1/audio/speech', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer YOUR_API_KEY`, // if authentication is required
      },
      body: JSON.stringify({
        model,
        voice,
        input,
      }),
    });

    if (!response.ok) {
      throw new Error(`Error: ${response.statusText}`);
    }

    const arrayBuffer = await response.arrayBuffer();
    return Buffer.from(arrayBuffer);
  }
}

export default MyAI;

The MyAI class can then be used through an API similar to OpenAI's:

// main.js
import fs from 'fs';
import path from 'path';
import MyAI from './myai.js';

const client = new MyAI({ model: 'tts-1' });

const speechFilePath = path.resolve('./speech.mp3');

async function main() {
  try {
    const buffer = await client.create({
      model: 'tts-1',
      voice: 'alloy',
      input: 'Today is a wonderful day to build something people love!',
    });

    await fs.promises.writeFile(speechFilePath, buffer);
    console.log(`Speech file saved at ${speechFilePath}`);
  } catch (error) {
    console.error('Error generating speech:', error);
  }
}

main();

I should mention that the JavaScript code was actually written by GPT-4o, and I have not learned JavaScript. If this approach can work, I'd like to help develop the Python API.
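For the approach above to work, something has to answer the POST request that the JavaScript client sends. A minimal sketch of that Python side, using only the standard library, might look like the following. The `/v1/audio/speech` path and the `model`, `voice`, and `input` fields come from the client code above. The names `synthesize`, `SpeechHandler`, and `make_server`, the port, and the silent-WAV placeholder are assumptions for illustration only.

```python
import io
import json
import wave
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def synthesize(text: str, voice: str) -> bytes:
    """Hypothetical placeholder: one second of silent 16-bit mono WAV.

    A real server would call the TTS model here.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 16000)
    return buffer.getvalue()


class SpeechHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/v1/audio/speech":
            self.send_error(404, "Unknown endpoint")
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            audio = synthesize(payload["input"], payload.get("voice", ""))
        except (ValueError, KeyError, TypeError, AttributeError):
            self.send_error(400, "Invalid JSON body or missing 'input' field")
            return
        # Send the raw audio bytes back, as the client's arrayBuffer() expects.
        self.send_response(200)
        self.send_header("Content-Type", "audio/wav")
        self.send_header("Content-Length", str(len(audio)))
        self.end_headers()
        self.wfile.write(audio)

    def log_message(self, format, *args) -> None:
        pass  # keep the sketch quiet; a real server would log requests


def make_server(host: str = "127.0.0.1", port: int = 8000) -> ThreadingHTTPServer:
    """Build the server; call serve_forever() on the result to run it."""
    return ThreadingHTTPServer((host, port), SpeechHandler)
```

Running `make_server().serve_forever()` and pointing the client's URL at `http://127.0.0.1:8000/v1/audio/speech` would complete the loop. Note that this placeholder returns WAV data, so the client would need to save the file as `.wav`, or the server would need to encode MP3.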

Thanks.

@adrianlyjak
Owner Author

Thanks for looking, @xujialiu! I wouldn't expect you to be able to port parler-tts to JS. Converting parler-tts to JavaScript is non-trivial. The model code, dependencies, and weights need to be converted to something that can run from JavaScript, or this needs to become a desktop-only feature with some sort of Python shim layer. That alone is a rather huge undertaking. On top of that, the models are rather large and need to be downloaded, which means giving some sort of feedback about download progress. And on many devices they will run slower than real time.
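As a rough illustration of the download-progress part, a Python shim could copy the weights in chunks and report each step through a callback. This is only a sketch under assumptions: `copy_with_progress` and `print_progress` are hypothetical names, and the source can be any readable binary stream, for example the response object returned by `urllib.request.urlopen`, with the total taken from its `Content-Length` header when the server provides one.

```python
from typing import BinaryIO, Callable, Optional


def copy_with_progress(
    source: BinaryIO,
    destination: BinaryIO,
    total_bytes: Optional[int],
    on_progress: Callable[[int, Optional[int]], None],
    chunk_size: int = 1024 * 1024,
) -> int:
    """Copy source to destination in chunks, reporting after each chunk.

    Returns the number of bytes copied. total_bytes may be None when the
    size is not known in advance.
    """
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # an empty read means the stream is exhausted
        destination.write(chunk)
        copied += len(chunk)
        on_progress(copied, total_bytes)
    return copied


def print_progress(copied: int, total: Optional[int]) -> None:
    """Example callback: overwrite a single status line in the terminal."""
    if total:
        print(f"\rDownloading model: {copied / total:.0%}", end="", flush=True)
    else:
        print(f"\rDownloading model: {copied} bytes", end="", flush=True)
```

Keeping the reporting in a callback means the same copy loop could drive a terminal line, as here, or forward progress events to a UI through whatever bridge connects the shim to the app.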
