📢 New release for Windows & macOS desktop. Testing and feedback are welcome! [Documentation is slightly outdated; updates are ongoing]
Krillin AI is a versatile audio and video localization and enhancement solution. This minimalist yet powerful tool integrates video translation, dubbing, and voice cloning, supporting both landscape and portrait formats to ensure perfect presentation on all major platforms (Bilibili, Xiaohongshu, Douyin, WeChat Video, Kuaishou, YouTube, TikTok, etc.). With an end-to-end workflow, Krillin AI can transform raw materials into polished, ready-to-use cross-platform content with just a few clicks.
🎯 One-Click Start: No complex environment setup required, automatic dependency installation, ready to use immediately, with a new desktop version for easier access!
📥 Video Acquisition: Supports downloading via yt-dlp or local file uploads.
📜 Accurate Recognition: High-accuracy speech recognition based on Whisper.
🧠 Intelligent Segmentation: Subtitle segmentation and alignment using LLM.
🔄 Terminology Replacement: One-click replacement of specialized vocabulary.
🌍 Professional Translation: LLM-based paragraph-level translation maintaining semantic coherence.
🎙️ Voice Cloning: Offers selected voice tones from CosyVoice or custom voice cloning.
🎬 Video Composition: Automatically handles landscape and portrait video and subtitle layout.
The image below shows the effect of the subtitle file generated after importing a 46-minute local video and executing it with one click, without any manual adjustments. There are no omissions or overlaps, the sentence breaks are natural, and the translation quality is very high.
Demo videos:
- subtitle_translation.mp4
- tts.mp4
- agi.mp4
All local models in the table below support automatic installation of executable files + model files; you just need to select, and KrillinAI will prepare everything for you.
Service Source | Supported Platforms | Model Options | Local/Cloud | Remarks
---|---|---|---|---
OpenAI Whisper | All platforms | - | Cloud | Fast and effective
FasterWhisper | Windows/Linux | `tiny` / `medium` / `large-v2` (`medium` or larger recommended) | Local | Faster, no cloud service costs
WhisperKit | macOS (M-series chips only) | `large-v2` | Local | Native optimization for Apple chips
Alibaba Cloud ASR | All platforms | - | Cloud | Avoids network issues in mainland China
✅ Compatible with all cloud/local large language model services that comply with OpenAI API specifications, including but not limited to:
- OpenAI
- DeepSeek
- Tongyi Qianwen
- Locally deployed open-source models
- Other API services compatible with OpenAI format
Input languages supported: Chinese, English, Japanese, German, Turkish, Korean, Russian, Malay (continuously increasing)
Translation languages supported: English, Chinese, Russian, Spanish, French, and 101 other languages
First, download the executable file that matches your operating system from the Release page, then follow the tutorial below to choose between the desktop and non-desktop versions. Place the download in an empty folder: the program generates several directories when it runs, and keeping it in an empty folder makes them easier to manage.
【For the desktop version, i.e., release files with "desktop" in the name, see here】
The desktop version is newly released to address the difficulty novice users had in editing configuration files correctly. It still has some bugs and is being updated continuously.
- Double-click the file to start using it (the desktop version also requires configuration within the software).
【For the non-desktop version, i.e., release files without "desktop" in the name, see here】
The non-desktop version is the initial version, with a more complex configuration but stable functionality, suitable for server deployment as it provides a UI via web.
- Create a `config` folder within the directory, then create a `config.toml` file inside it. Copy the contents of the `config-example.toml` file from the source code's `config` directory into `config.toml`, and fill in your configuration information accordingly.
- Double-click the executable file, or run it from the terminal, to start the service.
- Open your browser and enter `http://127.0.0.1:8888` to start using it (replace 8888 with the port you set in the configuration file).
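As a rough illustration of the file created in the steps above, a minimal `config.toml` skeleton might look like the sketch below. The section layout and the `port` key are assumptions for illustration; the authoritative template is the `config-example.toml` shipped in the source repository.

```toml
# Hypothetical skeleton; section and key names are illustrative.
[app]
port = 8888   # assumed key: the web UI port referenced by http://127.0.0.1:8888
proxy = ""    # optional, fill in as needed

transcription_provider = "openai"  # or "fasterwhisper" / "aliyun"
llm_provider = "openai"            # or "aliyun"
```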
【For the desktop version, i.e., release files with "desktop" in the name, see here】
Due to signing issues, the desktop version currently cannot be run directly by double-clicking or installed via DMG; you need to manually trust the application. The method is as follows:
- Open the terminal in the directory where the executable file (assuming the file name is KrillinAI_1.0.0_desktop_macOS_arm64) is located.
- Execute the following commands in order:
sudo xattr -cr ./KrillinAI_1.0.0_desktop_macOS_arm64
sudo chmod +x ./KrillinAI_1.0.0_desktop_macOS_arm64
./KrillinAI_1.0.0_desktop_macOS_arm64
【For the non-desktop version, i.e., release files without "desktop" in the name, see here】
This software is not signed, so when running on macOS, after completing the file configuration in the "Basic Steps," you also need to manually trust the application. The method is as follows:
- Open the terminal in the directory where the executable file (assuming the file name is KrillinAI_1.0.0_macOS_arm64) is located.
- Execute the following commands in order (the last command starts the service):

sudo xattr -rd com.apple.quarantine ./KrillinAI_1.0.0_macOS_arm64
sudo chmod +x ./KrillinAI_1.0.0_macOS_arm64
./KrillinAI_1.0.0_macOS_arm64
This project supports Docker deployment; please refer to the Docker Deployment Instructions.
If video downloads fail, please refer to the Cookie Configuration Instructions to configure your Cookie information.
The quickest and easiest configuration method:

- Set both `transcription_provider` and `llm_provider` to `openai`. Of the three configuration categories below (`openai`, `local_model`, and `aliyun`), you then only need to fill in `openai.apikey`. (`app.proxy`, `model`, and `openai.base_url` can be filled in as needed.)
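Assuming the dotted names above map onto TOML sections and keys in the obvious way (an assumption; copy the real `config-example.toml` rather than this sketch), the quick setup might look like:

```toml
# Quick-start sketch: everything runs through OpenAI's cloud services.
transcription_provider = "openai"
llm_provider = "openai"

[openai]
apikey = "sk-your-key-here"   # the only required credential in this setup
# base_url = ""               # optional: an OpenAI-compatible endpoint
# model = ""                  # optional: model name, fill in as needed
```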
Configuration method using a local speech recognition model (not yet supported on macOS), balancing cost, speed, and quality:

- Set `transcription_provider` to `fasterwhisper` and `llm_provider` to `openai`. You then only need to fill in `openai.apikey` and `local_model.faster_whisper` in the `openai` and `local_model` configuration categories; the local model will be downloaded automatically. (`app.proxy` and `openai.base_url` as above.)
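Under the same layout assumption as before (the shipped `config-example.toml` is authoritative), the local-transcription setup might be sketched as:

```toml
# Sketch: local FasterWhisper for transcription, OpenAI for translation.
transcription_provider = "fasterwhisper"
llm_provider = "openai"

[openai]
apikey = "sk-your-key-here"

[local_model]
faster_whisper = "medium"   # tiny / medium / large-v2; medium or larger recommended
```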
For the following usage scenarios, Alibaba Cloud configuration is required:

- If `llm_provider` is set to `aliyun`, Alibaba Cloud's large model service is used, so the `aliyun.bailian` item must be configured.
- If `transcription_provider` is set to `aliyun`, or if the "Dubbing" feature is enabled when starting a task, Alibaba Cloud's speech service is used, so the `aliyun.speech` item must be filled in.
- If the "Dubbing" feature is enabled and local audio is uploaded for voice cloning, Alibaba Cloud's OSS storage service is also used, so the `aliyun.oss` item must be filled in.
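The three Alibaba Cloud items referenced above could be sketched as follows. The field names inside each section are placeholders, not the real schema; consult the Alibaba Cloud Configuration Instructions and `config-example.toml` for the actual fields.

```toml
# Hypothetical sketch; field names are illustrative placeholders only.
[aliyun.bailian]
api_key = ""   # large model service (needed when llm_provider = "aliyun")

[aliyun.speech]
app_key = ""   # speech service (ASR set to "aliyun", or Dubbing enabled)

[aliyun.oss]
bucket = ""    # OSS storage (Dubbing with uploaded audio for voice cloning)
```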
Alibaba Cloud configuration help: Alibaba Cloud Configuration Instructions.
Please visit Frequently Asked Questions.
- Do not submit useless files such as .vscode, .idea, etc.; please use .gitignore to filter them out.
- Do not submit config.toml; instead, submit config-example.toml.
- Join our QQ group for questions: 754069680.
- Follow our social media accounts, Bilibili, where we share quality content in the AI technology field daily.