## Real-Time Speech Translation (Streaming Demo)

This notebook demonstrates a minimal real-time speech translation pipeline built using open-source speech recognition and machine translation models.

The emphasis here is on **streaming behavior** rather than offline accuracy. Audio is processed incrementally, partial transcriptions are generated as speech arrives, and translations are updated continuously. This setup reflects the constraints of real-world speech translation systems, where models operate on incomplete and evolving context.

The implementation is intentionally simple and self-contained so that the end-to-end behavior—buffering, latency, and hypothesis updates—can be inspected and modified easily.
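
The buffering and hypothesis-update loop described above can be sketched in a few lines. This is only an illustrative toy, not the code from the repository: `StreamingTranslator`, `fake_transcribe`, `fake_translate`, and `min_new_samples` are hypothetical names, and the two stub functions stand in for real ASR and translation models.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class StreamingTranslator:
    """Toy streaming loop: buffer audio, re-transcribe, re-translate."""

    transcribe: Callable[[List[float]], str]  # hypothetical ASR callable
    translate: Callable[[str], str]           # hypothetical MT callable
    min_new_samples: int = 4                  # new audio required before an update
    _buffer: List[float] = field(default_factory=list, init=False)
    _pending: int = field(default=0, init=False)

    def feed(self, chunk: List[float]) -> Optional[Tuple[str, str]]:
        """Append a chunk; return (partial transcript, translation) or None."""
        self._buffer.extend(chunk)
        self._pending += len(chunk)
        if self._pending < self.min_new_samples:
            return None  # not enough new audio yet, keep buffering
        self._pending = 0
        # Re-run both models on the whole buffer, so earlier hypotheses
        # may be revised as more context arrives.
        partial = self.transcribe(self._buffer)
        return partial, self.translate(partial)


# Stub "models" (assumptions for the demo): one word per 4 samples, uppercase as "translation".
WORDS = ["hello", "how", "are", "you"]


def fake_transcribe(buffer: List[float]) -> str:
    return " ".join(WORDS[: len(buffer) // 4])


def fake_translate(text: str) -> str:
    return text.upper()


translator = StreamingTranslator(fake_transcribe, fake_translate)
updates = [translator.feed([0.0, 0.0]) for _ in range(4)]
for update in updates:
    print(update)
```

A larger `min_new_samples` reduces how often the models run, at the cost of slower updates; a smaller value does the opposite. That trade-off between compute and latency is the main knob a streaming pipeline of this kind exposes.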

### Scope and assumptions

- Speech input is assumed to be in **English**
- Translation is performed into a selected Indic language
- Transcription and translation operate on partial audio segments
- Latency depends on network conditions and model inference time
- The Colab environment limits how far streaming can be optimized

### Prerequisites

1. Add your Hugging Face access token to Colab using the **Secrets** tab (key icon in the left sidebar).
2. The translation model (IndicTrans2) is gated. You must request access on Hugging Face:  
   https://huggingface.co/ai4bharat/indictrans2-en-indic-dist-200M
3. Set the Colab runtime to **GPU (T4)**.
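
Before launching, it may help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of the application. It assumes the secret is stored under the name `HF_TOKEN` (the notebook does not specify a name) and uses the presence of `nvidia-smi` on the `PATH` as a rough stand-in for "a GPU runtime is attached". `get_hf_token` and `gpu_visible` are hypothetical helper names.

```python
import os
import shutil
from typing import Optional


def get_hf_token(secret_name: str = "HF_TOKEN") -> Optional[str]:
    """Read the token from Colab Secrets, falling back to the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
        return userdata.get(secret_name)
    except Exception:
        # Not in Colab, or the secret is missing or not shared with this notebook.
        return os.environ.get(secret_name)


def gpu_visible() -> bool:
    """Heuristic only: True if the NVIDIA driver tool is on the PATH."""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    print("Hugging Face token found:", get_hf_token() is not None)
    print("GPU tooling visible:", gpu_visible())
```

If the token check prints `False`, the gated IndicTrans2 download is likely to fail. If the GPU check prints `False`, the runtime type is probably still set to CPU.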

Clone the application repository.

In [None]:
!git clone https://github.com/CrazyCyberbug/Realtime-translation.git

Install the requirements (you will be prompted to restart the session).

In [None]:
!pip install -r Realtime-translation/requirements.txt


Set up the application code as an importable Python package.

In [None]:
!mkdir RealtimeTranslator
!cp -R Realtime-translation/* RealtimeTranslator/
!touch RealtimeTranslator/__init__.py
!rm -r Realtime-translation

Launch the application. During the first run, model weights are downloaded automatically; this step typically takes around 30–45 seconds.

In [None]:
from RealtimeTranslator.GradioApp import launch_app

launch_app()