Integration into mobile apps #66

Open
satvikpendem opened this issue Jan 30, 2024 · 1 comment

Comments

@satvikpendem

Is there a way to integrate this as a real-time text-to-speech engine on devices like mobile phones? Also, is WhisperSpeech already fast enough for real-time inference, or is there more work to be done?

@jpc
Contributor

jpc commented Jan 30, 2024

The inference for the currently released models is quite well optimized in PyTorch. If this is too slow, there are also smaller models available (tiny and base), which are quite a bit faster.

Unfortunately I am not an expert in how difficult it is to export these autoregressive models to CoreML or TensorFlow-Lite. They are quite a bit different from traditional image recognition models since we have to run the model hundreds of times to generate the whole sequence.
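To illustrate the point about running the model many times, here is a minimal sketch of an autoregressive generation loop in plain Python. It is not WhisperSpeech code: `toy_model` and `generate` are hypothetical names, and the toy rule inside `toy_model` merely stands in for a real forward pass. The only thing it is meant to show is that every generated token costs one model call, so the exported model gets invoked in a loop rather than once per input.

```python
from typing import Callable, List, Tuple


def toy_model(prefix: List[int]) -> int:
    """Hypothetical stand-in for one forward pass: predict the next token."""
    # Arbitrary deterministic rule; a real model would run a neural network here.
    return (sum(prefix) * 31 + len(prefix)) % 100


def generate(
    step: Callable[[List[int]], int],
    prompt: List[int],
    max_new_tokens: int,
    eos: int = -1,
) -> Tuple[List[int], int]:
    """Generate tokens one at a time, counting how often the model is called."""
    tokens = list(prompt)
    calls = 0
    for _ in range(max_new_tokens):
        next_token = step(tokens)  # one full model invocation per new token
        calls += 1
        if next_token == eos:  # stop early when the end-of-sequence token appears
            break
        tokens.append(next_token)
    return tokens, calls


tokens, calls = generate(toy_model, prompt=[1, 2, 3], max_new_tokens=300)
print(f"generated {len(tokens) - 3} tokens using {calls} model calls")
```

An image classifier would call its model once per image. Here the loop, the growing token list, and the stop condition all have to live somewhere on the device, either inside the exported graph or in app code around it.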

That said, any tutorial on running OpenAI Whisper or Large Language Models on mobile devices would be a great resource, since WhisperSpeech works very much like Whisper or a small LLM.


2 participants