A comprehensive suite of Python tools for audio processing, transcription, and management.
- Audio Track Separation: Separate audio tracks into different stems (instruments) using Demucs
- YouTube Audio Download: Download audio from YouTube videos
- Audio Transcription: Transcribe audio files to text
- WAV to MP3 Conversion: Convert WAV files to MP3 format
- Audio Extraction: Extract audio from video files
- Clone the repository:
git clone https://github.com/yourusername/audio-manager.git
cd audio-manager
- Create and activate a virtual environment (recommended):
python -m venv venv
# On Windows
venv\Scripts\activate
# On Unix or macOS
source venv/bin/activate
- Install the required dependencies:
pip install -r requirements.txt
The audio separator uses Demucs, a state-of-the-art audio source separation model from Facebook Research, to separate audio tracks into different stems (instruments).
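As a rough illustration of what a Demucs-based separation step can look like, the sketch below builds a Demucs command line and runs it as a subprocess. It assumes the `demucs` package is installed and relies on its `-n` (model name) and `-o` (output directory) options. The helper names `build_demucs_command` and `separate` are hypothetical and are not taken from audio_separator.py, whose actual implementation may differ.

```python
import subprocess
import sys
from pathlib import Path


def build_demucs_command(input_file, output_dir="separated_tracks", model="htdemucs"):
    """Return the argument list for a Demucs run (hypothetical helper)."""
    return [
        sys.executable, "-m", "demucs.separate",
        "-n", model,            # which pretrained model to use
        "-o", str(output_dir),  # root folder for the separated stems
        str(input_file),
    ]


def separate(input_file, output_dir="separated_tracks", model="htdemucs"):
    """Run Demucs on one file; raises CalledProcessError if Demucs fails."""
    if not Path(input_file).is_file():
        raise FileNotFoundError(input_file)
    subprocess.run(build_demucs_command(input_file, output_dir, model), check=True)
```

Note that Demucs on its own writes stems under `<output_dir>/<model>/<track>/`, so a wrapper along these lines would still need to move or rename the files to match the layout described in this README.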
Basic usage:
python audio_separator.py path/to/your/audio/file.mp3
Advanced usage with options:
python audio_separator.py path/to/your/audio/file.mp3 --output_dir custom_output_folder --model htdemucs_ft
- input_file: Path to the input audio file (required)
- --output_dir: Directory to save separated tracks (default: 'separated_tracks')
- --model: Model to use for separation (choices: htdemucs, htdemucs_ft, mdx, mdx_extra; default: htdemucs)
htdemucs: Default model, separates into:
- Drums
- Bass
- Other instruments
- Vocals
htdemucs_ft: Fine-tuned version of htdemucs, better for specific genres
mdx: MDX model, good for general-purpose separation
mdx_extra: Enhanced version of the MDX model with improved quality
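The documented command-line interface could be declared with `argparse` roughly as follows. This is a minimal sketch that mirrors the arguments, defaults, and model choices listed in this README; the function name `build_parser` is an assumption, and the real script may define its parser differently.

```python
import argparse

# Model names as listed in this README.
MODELS = ["htdemucs", "htdemucs_ft", "mdx", "mdx_extra"]


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(description="Separate an audio track into stems.")
    parser.add_argument("input_file", help="Path to the input audio file")
    parser.add_argument("--output_dir", default="separated_tracks",
                        help="Directory to save separated tracks")
    parser.add_argument("--model", choices=MODELS, default="htdemucs",
                        help="Model to use for separation")
    return parser


args = build_parser().parse_args(["song.mp3", "--model", "htdemucs_ft"])
print(args.input_file, args.output_dir, args.model)
# prints: song.mp3 separated_tracks htdemucs_ft
```

Restricting `--model` with `choices` makes `argparse` reject unknown model names before any audio is loaded.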
The script creates a directory (default: 'separated_tracks') containing the separated audio files. Each stem is saved as a separate WAV file with the following naming convention:
- {track_name}_drums.wav
- {track_name}_bass.wav
- {track_name}_other.wav
- {track_name}_vocals.wav
Example output structure:
separated_tracks/
└── my_song/
    ├── my_song_drums.wav
    ├── my_song_bass.wav
    ├── my_song_other.wav
    └── my_song_vocals.wav
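To make the naming convention concrete, the following sketch computes the output paths you would expect for a given input file. The function `expected_stem_paths` is a hypothetical helper written for this illustration, not part of the project, and it assumes the four-stem layout described above.

```python
from pathlib import PurePosixPath

# The four stems described in this README.
STEMS = ("drums", "bass", "other", "vocals")


def expected_stem_paths(input_file, output_dir="separated_tracks"):
    """Return the stem paths implied by the naming convention (illustrative)."""
    track = PurePosixPath(input_file).stem  # "audio/my_song.mp3" -> "my_song"
    track_dir = PurePosixPath(output_dir) / track
    return [track_dir / f"{track}_{stem}.wav" for stem in STEMS]


for path in expected_stem_paths("audio/my_song.mp3"):
    print(path)
# separated_tracks/my_song/my_song_drums.wav
# separated_tracks/my_song/my_song_bass.wav
# separated_tracks/my_song/my_song_other.wav
# separated_tracks/my_song/my_song_vocals.wav
```

A helper like this can be handy for checking that a separation run produced every expected file.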
Download audio from YouTube videos:
python youtube_audio_downloader.py "https://www.youtube.com/watch?v=VIDEO_ID"
Transcribe audio files to text:
python transcribe_audio.py path/to/your/audio/file.mp3
Convert WAV files to MP3 format:
python wav_to_mp3_converter.py path/to/your/audio/file.wav
Extract audio from video files:
python extract_audio.py path/to/your/video/file.mp4
audio-manager/
├── audio/ # Directory for audio files
├── separated_tracks/ # Output directory for separated tracks
├── transcription/ # Directory for transcription files
├── video/ # Directory for video files
├── tex/ # Directory for text files
├── audio_separator.py
├── youtube_audio_downloader.py
├── transcribe_audio.py
├── wav_to_mp3_converter.py
├── extract_audio.py
├── requirements.txt
└── README.md
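If the working directories from the layout above are missing in a fresh clone, they can be created up front. The snippet below is a small sketch under that assumption; `ensure_working_dirs` is a hypothetical helper, and the scripts may already create these folders themselves.

```python
from pathlib import Path

# Directory names taken from the project layout in this README.
WORKING_DIRS = ("audio", "separated_tracks", "transcription", "video", "tex")


def ensure_working_dirs(root="."):
    """Create the working directories under root if missing; return their paths."""
    created = []
    for name in WORKING_DIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```

Calling `ensure_working_dirs()` from the repository root is safe to repeat, since existing directories are left untouched.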
See requirements.txt for a complete list of dependencies.
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.