ai-anchorite/Voice-Pro

Voice-Pro for Pinokio

Voice-Pro repacked for Pinokio — 1-click install

This is a repackaged version of ABUS's Voice-Pro AI voice app, modified to run cleanly through Pinokio's launcher system. The original project's conda-based installer has been replaced with Pinokio scripts that handle virtual environments, dependency installation, and GPU setup automatically.

Credits

All credit for Voice-Pro goes to ABUS / abus-aikorea. This is a repackaging for Pinokio compatibility, licensed under GPL-3.0 per the original project.


📄 Original Voice-Pro README

Voice-Pro

The best AI speech recognition, translation, and multilingual dubbing solution 🚀

🎙️ An AI-powered web application for speech recognition, translation, and dubbing

한국어 · English · 中文简体 · 中文繁體 · 日本語 · Deutsch · Español · Português

Voice-Pro is a state-of-the-art web app that transforms multimedia content creation. It integrates YouTube video downloading, voice separation, speech recognition, translation, and text-to-speech into a single, powerful tool for creators, researchers, and multilingual professionals.

  • 🔊 Top-tier speech recognition: Whisper, Faster-Whisper, Whisper-Timestamped, WhisperX
  • 🎤 Zero-shot voice cloning: F5-TTS, E2-TTS, CosyVoice
  • 📢 Multilingual text-to-speech: Edge-TTS, kokoro (Paid version includes Azure TTS)
  • 🎥 YouTube processing & audio extraction: yt-dlp
  • 🌍 Instant translation for 100+ languages: Deep-Translator (Paid version includes Azure Translator)

A robust alternative to ElevenLabs, Voice-Pro empowers podcasters, developers, and creators with advanced voice solutions.

⚠️ Please Note

  • Because the team is focused on WeConnect development, Voice-Pro development and updates are paused for the time being.
  • We have made all Voice-Pro code open source and completely free. Voice-Pro can now be freely distributed and modified by anyone.
  • It works well on Windows with an NVIDIA GPU. Operation on Mac and Linux has not been verified.
  • Please leave your requests on the GitHub Issues or GitHub Discussions pages.

📰 News & History

version 3.2
  • We have been focusing on WeConnect development for the past few months and have not been able to maintain Voice-Pro at all.
  • We have decided to open source all Voice-Pro code.
  • Voice-Pro is completely free. It targets Windows, Mac, and Linux, though only Windows with an NVIDIA GPU has been verified.
  • WeConnect is an application for global cultural exchange.
  • Connect with people from all over the world for meaningful cultural exchanges, language learning, and international friendships.

version 3.1
version 3.0
  • 🔥 Removed the AI Cover feature.
  • 🚀 Added support for m-bain/whisperX.
version 2.0
  • 🐍 Built with Python 3.10.15, Torch 2.5.1+cu124, and Gradio 5.14.0.
  • 🆓 Free trial supports media up to 60 seconds in length.
  • 🔥 Added the AI Cover feature.
  • 🎤 Introduced support for CosyVoice and kokoro.
  • ⏳ Initial run downloads CosyVoice2-0.5B (9GB), which may take over an hour depending on network speed.
  • 🎧 Voice samples for cloning will be continuously updated.
  • 📝 Added spaCy for natural sentence-by-sentence translation and TTS.
  • ☁️ Subscription version includes Microsoft Azure Translator and TTS.
  • 🏪 Subscription offers unlimited usage (no 60-second limit) during the subscription period, available via Shopify.

🎥 YouTube Showcase

  • Demo for Voice-Pro (v2.0)
  • F5-TTS: Voice Cloning
  • Live Transcription & Translation
  • Multi-Lingual Voice Cloning: Korean - German
  • Multi-Lingual Voice Cloning: English - Korean
  • Multi-Lingual Voice Cloning: Korean - Japanese
  • NVIDIA RTX Video Super-Resolution
  • AI Karaoke

⭐ Key Features

1. Dubbing Studio

  • YouTube video downloads & audio extraction
  • Voice separation with Demucs
  • Supports 100+ languages for speech recognition & translation

2. Speech Technologies

  • Speech-to-Text: Whisper, Faster-Whisper, Whisper-Timestamped, WhisperX
  • Text-to-Speech:
    • Edge-TTS: 100+ languages, 400+ voices
    • E2-TTS, F5-TTS, CosyVoice: Zero-shot cloning
    • kokoro: Ranked #2 in HuggingFace TTS Arena

3. Real-Time Translation

  • Instant speech recognition
  • Multilingual translation on the fly
  • Customizable audio inputs

🤖 WebUI

Dubbing Studio Tab

  • All-in-one hub: YouTube downloads, noise removal, subtitles, translation, & TTS
  • Supports all ffmpeg-compatible formats
  • Output options: WAV, FLAC, MP3
  • Subtitles & recognition for 100+ languages
  • TTS with speed, volume, & pitch controls
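
As a rough illustration of the output options listed above, the sketch below builds an ffmpeg command line that extracts the audio track of a media file as WAV, FLAC, or MP3. The function name `build_ffmpeg_command` and the codec choices are assumptions made for this example; they are not taken from Voice-Pro's source, which may call ffmpeg differently.

```python
from pathlib import Path

# Codec per output format. These choices are assumptions for illustration;
# Voice-Pro's actual encoder settings may differ.
CODECS = {
    "wav": "pcm_s16le",   # uncompressed 16-bit PCM
    "flac": "flac",       # lossless compression
    "mp3": "libmp3lame",  # lossy; needs an ffmpeg build that includes LAME
}


def build_ffmpeg_command(src: str, fmt: str) -> list[str]:
    """Return an ffmpeg argument list that writes src's audio track as fmt."""
    fmt = fmt.lower()
    if fmt not in CODECS:
        raise ValueError(f"unsupported output format: {fmt}")
    # Output file sits next to the input, with the new extension.
    dst = Path(src).with_suffix(f".{fmt}")
    # -y overwrites an existing output, -vn drops any video stream,
    # -c:a selects the audio encoder.
    return ["ffmpeg", "-y", "-i", src, "-vn", "-c:a", CODECS[fmt], str(dst)]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is on the PATH. Note that the sketch does not guard against the input and output paths being identical (for example, converting a `.wav` file to WAV).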

Multilingual Voice Conversion and Subtitle Generation Web UI Interface

Whisper Caption Tab

  • Subtitle-focused: 90+ languages
  • Video-integrated subtitle display
  • Word-level highlighting & denoise options

Translate Tab

  • Translation for 100+ languages
  • Supports subtitle files (ASS, SSA, SRT, etc.)
  • Real-time voice recognition & translation
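
To make the subtitle-file support above more concrete, here is a minimal sketch of how an SRT file could be parsed, translated cue by cue, and written back with its timing intact. It uses only the Python standard library. The names `Cue`, `parse_srt`, `translate_cues`, and `write_srt` are hypothetical, and the `translate` argument stands in for whatever translation backend is used; none of this is Voice-Pro's actual code.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


# SRT timestamps look like 00:01:02,345 (some files use '.' instead of ',').
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp to milliseconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def to_stamp(ms: int) -> str:
    """Convert milliseconds back to an SRT timestamp."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def parse_srt(data: str) -> list[Cue]:
    """Split SRT text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", data.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        if len(lines) < 2:
            continue  # skip malformed blocks
        start, end = lines[1].split("-->")
        cues.append(Cue(int(lines[0].strip()), to_ms(start), to_ms(end),
                        "\n".join(lines[2:])))
    return cues


def translate_cues(cues: list[Cue], translate: Callable[[str], str]) -> list[Cue]:
    """Apply translate to each cue's text while keeping index and timing."""
    return [Cue(c.index, c.start_ms, c.end_ms, translate(c.text)) for c in cues]


def write_srt(cues: list[Cue]) -> str:
    """Serialize cues back to SRT text."""
    blocks = (f"{c.index}\n{to_stamp(c.start_ms)} --> {to_stamp(c.end_ms)}\n{c.text}"
              for c in cues)
    return "\n\n".join(blocks) + "\n"
```

Keeping the timing fields separate from the text means a translation step only ever touches `text`, so the translated subtitles stay aligned with the original audio.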

WebUI for Real-Time Speech Recognition and Translation

Speech Generation Tab

  • Options: Edge-TTS, F5-TTS, CosyVoice, kokoro
  • Celeb voice podcasts & multilingual support

Podcast Production WebUI Using Voice-Cloning Technology

💻 System Requirements

  • OS: Windows 10/11 (64-bit)
  • GPU: NVIDIA with CUDA 12.4 (recommended)
  • VRAM: 4GB+ (8GB+ preferred)
  • RAM: 4GB+
  • Storage: 20GB+ free space
  • Internet: Required
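
Some of the requirements above can be checked before installing. The following is a minimal sketch using only the Python standard library; the function name `check_system` and the idea of treating a `nvidia-smi` executable on the PATH as a sign of an installed NVIDIA driver are assumptions for this example. It does not check RAM, VRAM, or the CUDA version, which would need extra tools or libraries.

```python
import platform
import shutil
import sys

MIN_FREE_GB = 20  # storage requirement listed above


def check_system(path: str = ".") -> dict[str, bool]:
    """Return a pass/fail flag for each requirement that can be checked cheaply."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        # The README lists Windows 10/11 as the supported OS.
        "windows": platform.system() == "Windows",
        # True when the running Python interpreter is a 64-bit build.
        "python_64bit": sys.maxsize > 2**32,
        # nvidia-smi ships with the NVIDIA driver; finding it on the PATH
        # suggests a driver is installed (assumption; not a CUDA version check).
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        "enough_disk": free_gb >= MIN_FREE_GB,
    }


if __name__ == "__main__":
    for name, ok in check_system().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A failed flag is only a hint: for instance, the app may still start without `nvidia-smi` on the PATH, but GPU acceleration would likely be unavailable.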

🙏 Credits

©️ Copyright

by ABUS

About

[Windows + NVIDIA] Voice-Pro ported and packaged for Pinokio: a WebUI for creators and developers, featuring key TTS engines (Edge-TTS, kokoro) and zero-shot voice cloning (E2-TTS, F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
