    Repositories list

    • Transkun

      Public
      A simple yet effective Audio-to-Midi Automatic Piano Transcription system
      Python
      27000 Updated Sep 28, 2024
    • HARP

      Public
      A sample editing application allowing for hosted, asynchronous, remote processing of audio with machine learning by routing through Gradio endpoints.
      HTML
      7000 Updated Sep 17, 2024
    • Y-vector

      Public
      Y-vector: Multiscale Waveform Encoder for Speaker Embedding
      Python
      9000 Updated Sep 15, 2024
    • Codebase for "Transcription free filler word detection with Neural semi-CRFs" [ICASSP2023]
      Python
      3000 Updated Sep 15, 2024
    • Implementation of the paper "One-class Learning towards Generalized Voice Spoofing Detection"
      Jupyter Notebook
      32000 Updated Sep 14, 2024
    • pyharp

      Public
      Companion repository which facilitates the creation of Gradio endpoints which are accessible from within Digital Audio Workstations (DAWs) through HARP.
      Python
      4000 Updated Sep 6, 2024
    • SynthTab

      Public
      Official Repository for ICASSP 2024 Paper "SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription"
      Python
      4000 Updated Aug 22, 2024
    • MSOC

      Public
      Python
      1000 Updated Jul 29, 2024
    • BeatNet

      Public
      BeatNet is a state-of-the-art real-time and offline joint music beat, downbeat, tempo, and meter tracking system using CRNN and particle filtering (implementation of the ISMIR 2021 paper).
      Python
      77000 Updated May 29, 2024
    • Cacophony

      Public
      Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
      Python
      4000 Updated Apr 26, 2024
    • lhvqt

      Public
      Frontend filterbank learning module with HVQT initialization capabilities.
      Python
      3000 Updated Feb 27, 2024
    • Invited talk at group meeting of AIR lab
      0000 Updated Dec 5, 2023
    • Official Implementation of our WASPAA 2023 paper "Mitigating Cross-Database Differences for Learning Unified HRTF Representation"
      Python
      3000 Updated Dec 3, 2023
    • Official implementation of the ICASSP 2023 paper "HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields"
      Python
      2000 Updated Dec 3, 2023
    • This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"
      Python
      18000 Updated Nov 29, 2023
    • This repository contains the implementation of an efficient joint beat, downbeat, tempo, and meter tracking system using a compact 1D probabilistic state space and a jump-back reward technique. ICASSP 2022.
      Python
      14000 Updated Nov 28, 2023
    • Official Implementation of our ICASSP 2024 paper "Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech"
      HTML
      2000 Updated Nov 25, 2023
    • Official implementation of the handbook chapter "Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation"
      Python
      1000 Updated Oct 2, 2023
    • amt-tools

      Public
      Machine learning tools and framework for automatic music transcription.
      Python
      4000 Updated Jul 30, 2023
    • harana

      Public
      A neural semi-CRF model for harmonic analysis
      Python
      1000 Updated Jul 14, 2023
    • The code for the TMM paper "Speech Driven Talking Face Generation from a Single Image and an Emotion Condition"
      Python
      32000 Updated Apr 9, 2023
    • samo

      Public
      Official Implementation of our ICASSP 2023 paper "SAMO: SPEAKER ATTRACTOR MULTI-CENTER ONE-CLASS LEARNING FOR VOICE ANTI-SPOOFING"
      Python
      10000 Updated Apr 5, 2023
    • Code for the paper "A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems".
      Python
      2000 Updated Dec 14, 2022
    • Open source code for the paper 'Music Source Separation with Generative Flow'
      Jupyter Notebook
      1100 Updated Nov 18, 2022
    • Code for the paper "Draw and Listen! A Sketch-based System for Music Inpainting", TISMIR 2022
      Python
      2000 Updated Nov 4, 2022
    • This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations.
      Python
      4000 Updated Sep 4, 2022
    • BachDuet

      Public
      BachDuet enables a human performer to improvise a duet counterpoint with a computer agent in real time.
      Python
      2000 Updated Aug 8, 2022
    • DyViSE

      Public
      Official implementation of our MMSP 2022 paper, "Dynamic vision-guided speaker embedding for audio-visual speaker diarization"
      Python
      2000 Updated Jul 5, 2022
    • SASV_PR

      Public
      Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"
      Python
      5000 Updated Jun 24, 2022
    • Code for the paper "Learning Sparse Analytic Filters for Piano Transcription".
      Python
      3000 Updated Jun 22, 2022