Easily train a good VC model with voice data <= 10 mins!
Updated Jun 12, 2024 - Python
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
Python Library for Audio Parameterization
Machine Learning models for Video and Audio Analysis using Deep Learning for Deepfake Detection.
Functions and scripts for analyzing waveforms, primarily audio. This is currently somewhat disorganized and unfinished.
An overlay to show BPM of one or multiple live sources
Real-time audio visualizations (spectrum, spectrogram, etc.)
🔉 spafe: Simplified Python Audio Features Extraction
Metadata, scripts and baselines for the MTG-Jamendo dataset
Simple audio transcription module
A music recognition program that uses FFT and Python's scientific modules to create and distinguish audio fingerprints of songs. The module is encapsulated, abstract, and plug-and-play; in this repo the results are displayed on a website I created for this purpose.
The normalised aggregated power envelope (nape) is a representation of an audio signal calculated by summing the columns of the short-time Fourier transform (STFT).
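As a rough illustration of the description above, the following sketch sums the power in each STFT column (one column per time frame) and rescales the result. It is not taken from the repository: the function names, the Hann window, the frame and hop sizes, and the choice of dividing by the peak as the normalisation are all assumptions made for this example.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude STFT with shape (n_freqs, n_frames); hypothetical helper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Each windowed frame becomes one column of the matrix.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def power_envelope(signal, frame_len=256, hop=128):
    """Sum the power of every STFT column, then scale so the peak is 1."""
    power = stft_magnitude(signal, frame_len, hop) ** 2
    envelope = power.sum(axis=0)  # aggregate over frequency, per time frame
    peak = envelope.max()
    if peak > 0:
        envelope = envelope / peak  # assumed normalisation: divide by the peak
    return envelope

# Example input: a 440 Hz tone whose amplitude ramps up over one second.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t) * np.linspace(0.0, 1.0, fs)
envelope = power_envelope(tone)
```

Because the tone gets louder over time, the envelope rises from the first frame to the last, which is the kind of coarse loudness contour such a representation is meant to capture.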
MoSQITo is a unified and modular development framework of key sound quality metrics favoring reproducible science and efficient shared scripting among the community of engineers, teachers and researchers.
Audio segment analyzer: semester project at the University of West Bohemia, Faculty of Applied Sciences, Department of Computer Science and Engineering
Chat using "enhanced" end-to-end encryption and modulation of the audio signal on an isolated device, ensuring privacy, anonymity and cybersecurity.
CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Transcribes audio recordings and generates answers to questions using OpenAI's GPT-3.5 model. Streamlit-based interface for easy use.
An ASCII keyboard synthesizer developed in Python that creatively explores two DSP techniques: cross-correlation and convolution. Output is produced by performing cross-correlation or convolution between musical clips and human speech recordings.
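The two techniques named above are closely related, which a minimal NumPy sketch can show: for real-valued signals, cross-correlating with a signal gives the same result as convolving with that signal reversed in time. The short arrays below are hypothetical stand-ins for a musical clip and a speech recording, and nothing here is taken from the project's code.

```python
import numpy as np

# Hypothetical stand-ins for a musical clip and a speech recording.
clip = np.array([1.0, 2.0, 3.0, 0.5])
speech = np.array([0.0, 1.0, 0.5])

# Convolution flips the second signal before sliding it over the first.
convolved = np.convolve(clip, speech, mode="full")

# Cross-correlation slides the second signal without flipping it.
correlated = np.correlate(clip, speech, mode="full")

# For real signals, correlating with `speech` equals convolving with
# the time-reversed `speech`.
via_convolution = np.convolve(clip, speech[::-1], mode="full")
same = np.allclose(correlated, via_convolution)
```

Both outputs have length `len(clip) + len(speech) - 1` in `"full"` mode; they differ only in whether the second signal is reversed.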