Pipeline for generating images conditioned on input audio
Updated Jul 25, 2024 - Python
Acoustic Transformer Models for Audio Classification
Code for our paper "DistilALHuBERT: A Distilled Parameter Sharing Audio Representation Model"
Unsupervised scoring of spoken utterances
Speech keyword detection using a Wav2Vec model
[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and t…
Phoneme segmentation using pre-trained speech models
Self-Supervised Speech Pre-training and Representation Learning Toolkit
so-vits-svc fork with realtime support, improved interface and more features.