György Orosz oroszgy

Freelance NLP engineer

169 followers · 234 following

@ec-doris
Budapest, Hungary
https://gyorgy.orosz.link
in/oroszgy

Highlights

Developer Program Member

Stars

Speech

19 repositories

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python · 95,471 stars · 11,821 forks · Updated Dec 15, 2025

ggml-org / whisper.cpp

Port of OpenAI's Whisper model in C/C++

C++ · 47,238 stars · 5,258 forks · Updated Mar 5, 2026

facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook · 11,760 stars · 1,166 forks · Updated Mar 3, 2026

huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Python · 4,052 stars · 349 forks · Updated Jan 8, 2025

ufal / whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

Python · 3,546 stars · 414 forks · Updated Nov 12, 2025

Vaibhavs10 / insanely-fast-whisper

Jupyter Notebook · 8,823 stars · 633 forks · Updated Oct 25, 2025

rhasspy / piper

A fast, local neural text to speech system

C++ · 10,633 stars · 913 forks · Updated Aug 26, 2025

readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Python · 2,809 stars · 268 forks · Updated Jun 22, 2024

meinardmueller / synctoolbox

Sync Toolbox - Python package with reference implementations for efficient, robust, and accurate music synchronization based on dynamic time warping (DTW)

Python · 132 stars · 16 forks · Updated Feb 6, 2026

feldberlin / timething

Timething is a library for aligning text transcripts with their audio recordings.

Jupyter Notebook · 130 stars · 14 forks · Updated Dec 3, 2024

cmusphinx / pocketsphinx

A small speech recognizer

C · 4,277 stars · 729 forks · Updated Mar 2, 2026

r4victor / afaligner

📈 A forced aligner intended for synchronization of narrated text

Python · 102 stars · 14 forks · Updated Aug 9, 2025

MahmoudAshraf97 / ctc-forced-aligner

Text to speech alignment using CTC forced alignment

Python · 451 stars · 78 forks · Updated Feb 23, 2026

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python · 16,856 stars · 3,355 forks · Updated Mar 5, 2026

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python · 20,493 stars · 2,168 forks · Updated Feb 22, 2026

jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python · 2,171 stars · 227 forks · Updated Oct 29, 2025

echogarden-project / echogarden

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voic…

TypeScript · 438 stars · 42 forks · Updated Sep 1, 2025

KittenML / KittenTTS

State-of-the-art TTS model under 25MB 😻

Python · 11,195 stars · 627 forks · Updated Feb 24, 2026

microsoft / VibeVoice

Open-Source Frontier Voice AI

Python · 23,615 stars · 2,611 forks · Updated Feb 28, 2026
