AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models
Updated May 1, 2024 - Shell
This repository contains my attempt to use two well-known speech recognition frameworks (Kaldi and CMU Sphinx4) for the Arabic language, using the publicly available dataset "Arabic Corpus of Isolated Words"
Long audio alignment using Kaldi
Code: Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models
A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.
This is the repository for my version of the Kaldi for Dummies example.
Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi
The voice assistant Sherlock is a project to create a proof of concept for an offline, open source voice assistant.
Transcribe voice data to text using Google Cloud Speech-to-Text