Windows x64 one-click wrapper for Transkun piano audio-to-MIDI conversion.
This project is a Windows one-click wrapper around Transkun and does not change Transkun's core algorithms. All credit goes to the original author. For details, see the USER_GUIDE and AGENTS files in the project.
Transkun upstream: https://github.com/yujia-yan/transkun
Important
- For most users (especially beginners), go to Releases and download the one-click package.
- This repository is mainly aimed at source code and developers; advanced users can continue with the Quick Start section below.
- After downloading and extracting, double-click `Transkun GUI.exe` to run.
- On first launch, the app automatically creates the runtime environment; depending on your PC, this can take tens of minutes.
- A sample audio file, `example.mp3`, is included in the project directory for quick testing.
- Output usually consists of two MIDI files: `*_raw.mid` (original MIDI) and `*_baked.mid` (sustain applied to the notes). Use either as needed.
- This project requires high-quality, clean solo piano audio input; input cleanliness directly determines transcription quality.
- Any other instruments or vocals will significantly degrade transcription results.
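To illustrate what "sustain applied to the notes" can mean for `*_baked.mid`, here is a minimal sketch. It is not the wrapper's or Transkun's actual code. It assumes notes are `(pitch, onset, offset)` tuples and pedal-down spans are `(down, up)` tuples, all in seconds; `bake_sustain` is a hypothetical helper name. A note released while the pedal is held is extended to the pedal release, or to the next onset of the same pitch if that comes first.

```python
from typing import List, Tuple

Note = Tuple[int, float, float]   # (pitch, onset, offset), times in seconds
Pedal = Tuple[float, float]       # (pedal_down, pedal_up), times in seconds


def bake_sustain(notes: List[Note], pedals: List[Pedal]) -> List[Note]:
    """Return notes with offsets extended while the pedal is held (illustrative)."""
    baked = []
    for pitch, onset, offset in notes:
        new_offset = offset
        for down, up in pedals:
            # The key was released while the pedal was down: hold until pedal-up.
            if down <= offset < up:
                new_offset = up
                break
        # Cap the extension at the next onset of the same pitch, never shortening
        # the note below its original offset.
        later_onsets = [o for p, o, _ in notes if p == pitch and o > onset]
        if later_onsets:
            new_offset = max(offset, min(new_offset, min(later_onsets)))
        baked.append((pitch, onset, new_offset))
    return baked
```

With a pedal held from 0.2 s to 2.0 s, a note released at 0.5 s would ring until 2.0 s unless the same key is struck again earlier. The raw file keeps the short note and leaves the pedal as separate data, which is why both variants can be useful.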
This project only wraps Transkun for one-click use on Windows.
The Transkun core model and algorithm belong to the upstream project: https://github.com/yujia-yan/transkun.
Thanks to the original Transkun authors and contributors.
This GitHub repository is source-only. Runtime binaries and model weights are published as Release assets. This section is the advanced/developer source setup path (not the first choice for most end users).
- Clone this repository.
- Download release assets into the project root:
  - `runtime-windows-x64.7z` (contains `runtime/python`, `runtime/ffmpeg`, `runtime/wheels`)
  - `2.0.pt` (place into `transkun/pretrained/2.0.pt`)
  - Optional: `Transkun GUI.exe`
- Extract `runtime-windows-x64.7z` so `runtime/` exists beside this README.
- Run:
  - GUI: `Transkun GUI.exe` (if downloaded)
  - CLI: `./run_transkun.ps1 ".\\example.mp3"`
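The layout that the setup steps produce can be checked before the first run. The following is a minimal sketch and not part of the repository: `missing_assets` and `REQUIRED_PATHS` are hypothetical names, and the path list is taken only from the steps above.

```python
from pathlib import Path
from typing import List

# Paths named in the setup steps, relative to the project root.
REQUIRED_PATHS = [
    "runtime/python",
    "runtime/ffmpeg",
    "runtime/wheels",
    "transkun/pretrained/2.0.pt",
]


def missing_assets(project_root: str) -> List[str]:
    """Return the required paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in REQUIRED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_assets(".")
    if missing:
        print("Missing release assets:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected release assets are in place.")
```

Run it from the project root with any Python 3 interpreter. An empty result only means the paths exist; it does not verify their contents.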
Outputs for `song.wav`:
- `output/song_raw.mid`
- `output/song_baked.mid`
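The naming pattern in this example can be written as a small helper, for instance when scripting around the CLI. This is a sketch under assumptions: `expected_outputs` is a hypothetical name, and the rule (input file stem plus `_raw.mid` / `_baked.mid` inside `output/`) is inferred from the single `song.wav` example rather than from the wrapper's code.

```python
from pathlib import Path
from typing import Tuple


def expected_outputs(audio_path: str, output_dir: str = "output") -> Tuple[Path, Path]:
    """Return the (raw, baked) MIDI paths expected for an input audio file."""
    stem = Path(audio_path).stem          # "song.wav" -> "song"
    out = Path(output_dir)
    return out / f"{stem}_raw.mid", out / f"{stem}_baked.mid"


raw_path, baked_path = expected_outputs("song.wav")
print(raw_path.as_posix())    # output/song_raw.mid
print(baked_path.as_posix())  # output/song_baked.mid
```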
- Included in source repo: code, scripts, docs, compliance files.
- Excluded from source repo: `runtime/` binaries, pretrained `.pt` weights, generated outputs/logs/temp, and optional packaged GUI executable.
- Release assets carry runtime and model files needed for runnable offline usage.
- Python runtime: 3.11.9 x64
- FFmpeg runtime: `7.1-full_build-www.gyan.dev`
- Core ML runtime: `torch 2.12.0+cpu`, `torchaudio 2.11.0+cpu`
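When the runtime is rebuilt, the pinned versions can be compared against what is actually installed. The sketch below is illustrative and not part of the repository: `PINNED_PYTHON`, `PINNED_PACKAGES`, `installed_version`, and `version_mismatches` are hypothetical names, and the pins are copied from the list above. It should be run with the bundled interpreter so that it inspects the bundled packages.

```python
import sys
from importlib import metadata
from typing import Callable, Dict, Optional

# Pins copied from the version list above.
PINNED_PYTHON = (3, 11, 9)
PINNED_PACKAGES = {"torch": "2.12.0+cpu", "torchaudio": "2.11.0+cpu"}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Return {package: found_version} for each package that differs from its pin."""
    found = {name: lookup(name) for name in pinned}
    return {name: ver for name, ver in found.items() if ver != pinned[name]}


if __name__ == "__main__":
    if sys.version_info[:3] != PINNED_PYTHON:
        print("Python differs from pin:", ".".join(map(str, sys.version_info[:3])))
    for name, ver in version_mismatches(PINNED_PACKAGES).items():
        print(f"{name}: expected {PINNED_PACKAGES[name]}, found {ver}")
```

The `lookup` parameter exists only so the comparison can be exercised without installing the packages.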
This repository and its release assets redistribute third-party software and model/runtime assets.
- Notices: THIRD_PARTY_NOTICES.md
- Key license texts: LICENSES/
- FFmpeg-specific compliance notes: FFMPEG_COMPLIANCE.md
- This project invokes FFmpeg via subprocess / command-line tools; it does not link FFmpeg libraries directly in project code.
- Release assets redistribute FFmpeg binaries, so FFmpeg license obligations still apply to distribution.
- The current FFmpeg build is a full build with GPL features (`7.1-full_build-www.gyan.dev`), so release documentation must include the corresponding license and source-access information.
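Calling FFmpeg as a separate process, rather than linking its libraries, typically looks like the sketch below. This is an illustration only: `build_ffmpeg_command` and `convert_audio` are hypothetical names, and the mono / 44100 Hz conversion is an assumed example, not the arguments this wrapper actually passes. The flags themselves (`-y`, `-i`, `-ac`, `-ar`) are standard FFmpeg options.

```python
import shutil
import subprocess
from typing import List


def build_ffmpeg_command(ffmpeg_exe: str, src: str, dst: str,
                         sample_rate: int = 44100) -> List[str]:
    """Build an argument list that converts src to mono audio at sample_rate."""
    return [
        ffmpeg_exe,
        "-y",                    # overwrite dst without prompting
        "-i", src,               # input file
        "-ac", "1",              # one audio channel (mono)
        "-ar", str(sample_rate), # output sample rate in Hz
        dst,
    ]


def convert_audio(src: str, dst: str, ffmpeg_exe: str = "ffmpeg") -> None:
    """Run FFmpeg as a child process; raises if it is missing or fails."""
    if shutil.which(ffmpeg_exe) is None:
        raise FileNotFoundError(f"FFmpeg executable not found: {ffmpeg_exe}")
    subprocess.run(build_ffmpeg_command(ffmpeg_exe, src, dst), check=True)
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.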
- `transkun/pretrained/2.0.pt` is distributed as a release asset, not embedded in source repository history.
- Code license and model weight license terms should be reviewed separately; do not assume model weights automatically follow this wrapper repository's root `LICENSE`.
When replacing bundled binaries or upgrading wheel/runtime versions, update these files in the same change:
- `THIRD_PARTY_NOTICES.md`
- `FFMPEG_COMPLIANCE.md`
- `LICENSES/` relevant texts
- Release notes / release asset names