SpeechDoctor

This project performs voice activity detection (VAD) and automatic speech recognition (ASR) on an audio file. It uses OpenAI's Whisper and VOSK models to count the number of words and sentences and to produce timestamps.
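
As a rough illustration of the counting step, the sketch below tallies words and sentences and collects timestamps from transcript segments. It is not taken from the project's code: the function names `summarize_segments` and `transcribe_with_whisper` are hypothetical, the sentence splitting is a naive punctuation-based rule chosen for illustration, and the segment shape (dicts with `start`, `end`, and `text`) is assumed to follow the `segments` list returned by Whisper's `transcribe()`.

```python
import re


def summarize_segments(segments):
    """Count words and sentences and collect (start, end) timestamps.

    Assumption: each segment is a dict with "start" and "end" in seconds
    and a "text" string, as in Whisper's transcribe() output.
    """
    # Join segment texts into one transcript, trimming stray whitespace.
    text = " ".join(seg["text"].strip() for seg in segments)
    words = text.split()
    # Naive rule (an assumption): a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    timestamps = [(seg["start"], seg["end"]) for seg in segments]
    return {
        "words": len(words),
        "sentences": len(sentences),
        "timestamps": timestamps,
    }


def transcribe_with_whisper(audio_path, model_name="base"):
    """Hypothetical helper: return Whisper segments for an audio file."""
    import whisper  # lazy import; needs the openai-whisper package and ffmpeg

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["segments"]
```

With Whisper installed, `summarize_segments(transcribe_with_whisper("audio.wav"))` would give the counts for a file; `audio.wav` is a placeholder path. Results from VOSK could be summarized the same way once converted into the same segment shape.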