Convolutional Neural Net trained on over two hours of audio data, capable of differentiating between guitarists playing solos.
Updated May 27, 2022 - Python
An open-source framework for detecting audio produced by generative systems.
voXify is a Streamlit-powered speech-to-text web application that generates transcripts from various audio sources and lets users download them in PDF or Word format.
Generating unique one-shot audio samples with Stable Diffusion.
A Speech Recognition Framework for Banking Interactions using Convolutional Recurrent Dense Neural Networks and Language Models
TTS (FastPitch) for German
Building a speaker identification & verification pipeline for Vietnamese voices 😪
Automatic Speech Recognition using torchaudio
Code from the ASR tutorial https://towardsdatascience.com/audio-deep-learning-made-simple-sound-classification-step-by-step-cebc936bbe5
This project leverages Python, computer vision, and deep learning techniques, utilizing pre-trained models such as RetinaNet_ResNet-50 for image-based object detection. It is designed with a primary focus on enhancing security across various sectors. The RetinaNet_ResNet-50 model enables both image and video-based detection functionalities.
🤖 Telegram bot powered by Deep Learning. Automatically assesses the safety of audio files and voice messages for people suffering from misophonia.
TorchAudio: Building Blocks for Audio and Speech Processing
In this notebook, we aim to recognize speech commands using classification. For this purpose, we used the SPEECHCOMMANDS dataset and the deep convolutional model M5. The code is written in Python and designed for the PyTorch platform.
Experiments in neural networks for audio generation.
🎶🎼 This repository contains notebooks used to train audio classification models in PyTorch using torchaudio.
A road sign recognition system for the Russian Federation that uses a pre-trained model for real-time object detection and image segmentation to improve road safety.
This repository contains the code and methodology used for the BirdCLEF 2024 Kaggle competition, where I achieved a rank of 55th out of 974 participants, earning a bronze medal. The goal of this competition was to build a model that can accurately classify bird sounds.
Signal Separation API
Music genre classification with the Urban Sound dataset: preprocessing with Librosa and torchaudio, models built in TensorFlow and PyTorch.
The core of my graduation project, which uses convolutional neural networks to extract the vocal part from a song by removing the sound of musical instruments. The project is largely academic; it did not achieve particularly good results in practice, but this was expected. I do not plan to develop it further.