
# Single Audio File Transcription Tutorial

This notebook runs the same code that powers the command-line scripts in
`asr_toolkit/`, so you can experiment interactively. We will load the bundled
`samples/6_m.wav` recording and walk through the transcription pipeline step by
step.



> **Tip:** Make sure `torch`, `transformers`, `librosa`, and `tqdm` are installed
> in your environment. If any are missing, install them once with
> `%pip install torch transformers librosa tqdm` before continuing.
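
To see which of these packages are missing before installing anything, a quick check along these lines can help. It is a minimal sketch that relies only on the standard library; `missing_packages` is a hypothetical helper name, not part of `asr_toolkit`.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


# Top-level import names taken from the tip above.
required = ['torch', 'transformers', 'librosa', 'tqdm']
missing = missing_packages(required)
print('Missing packages:', ', '.join(missing) if missing else 'none')
```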



## 1. Configure input and output paths

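We will point to the included `samples/6_m.wav` sample. Feel free to change the path
or provide your own audio file. The paths below assume the notebook is run from
the `notebooks/` directory, one level below the project root.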


In [None]:
from pathlib import Path
import sys

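# Assumes the notebook runs from a folder one level below the project root.
PROJECT_ROOT = Path('..').resolve()
# Make the asr_toolkit package importable from this notebook.
if str(PROJECT_ROOT) not in sys.path:
    sys.path.append(str(PROJECT_ROOT))

AUDIO_FILE = PROJECT_ROOT / 'samples' / '6_m.wav'
OUTPUT_DIR = PROJECT_ROOT / 'notebooks' / 'output'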

print(f'Using audio file: {AUDIO_FILE}')
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
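
Before transcribing a file of your own, it can be useful to confirm that it opens cleanly and to check its sample rate and duration. The following is a minimal sketch using only the standard-library `wave` module, so it handles uncompressed PCM WAV files only; `describe_wav` is a hypothetical helper, and the toolkit itself may load audio differently (for example through `librosa`).

```python
import wave
from pathlib import Path


def describe_wav(path):
    """Return channel count, sample rate, and duration of a PCM WAV file."""
    with wave.open(str(path), 'rb') as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            'channels': wav.getnchannels(),
            'sample_rate': rate,
            'duration_seconds': round(frames / rate, 2) if rate else 0.0,
        }


# Same sample path as the cell above; skipped quietly if the file is absent.
audio_file = Path('..').resolve() / 'samples' / '6_m.wav'
if audio_file.exists():
    print(describe_wav(audio_file))
```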



## 2. Load the reusable toolkit

`AudioTranscriber` centralizes model loading and configuration. The same object
is used by the CLI scripts, so anything you prototype here can graduate to your
batch processing workflow without code changes.


In [None]:
from asr_toolkit import AudioTranscriber, TranscriptionConfig

config = TranscriptionConfig(
    model_name='openai/whisper-small',
)
transcriber = AudioTranscriber(config)
config
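
The internals of `AudioTranscriber` are not shown in this notebook. As a rough mental model only, and assuming it wraps the Hugging Face `transformers` speech-recognition pipeline (which the dependency list suggests but does not confirm), loading the same checkpoint directly might look like the sketch below. `transcribe_with_pipeline` is a hypothetical function, not part of `asr_toolkit`.

```python
def transcribe_with_pipeline(audio_path, model_name='openai/whisper-small'):
    """Transcribe one file with the Hugging Face ASR pipeline (illustrative)."""
    # Heavy imports stay inside the function so defining it costs nothing.
    import librosa
    from transformers import pipeline

    # Whisper checkpoints expect 16 kHz mono audio; librosa resamples on load.
    audio, sample_rate = librosa.load(str(audio_path), sr=16000, mono=True)

    # chunk_length_s splits long recordings into 30-second windows.
    asr = pipeline(
        'automatic-speech-recognition',
        model=model_name,
        chunk_length_s=30,
    )
    output = asr({'raw': audio, 'sampling_rate': sample_rate})
    return output['text']
```

Calling `transcribe_with_pipeline(AUDIO_FILE)` downloads the checkpoint on first use, so expect the first run to be slow.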


## 3. Transcribe a single file

This call returns a `TranscriptionResult` dataclass so we can examine metadata
such as the detected language and how long the inference run took. We also save
the transcript to disk to reuse it later.


In [None]:
result = transcriber.transcribe_file(AUDIO_FILE)
transcript_path = OUTPUT_DIR / f"{AUDIO_FILE.stem}_notebook.txt"
result.save_text(transcript_path)

summary = {
    'audio_file': str(AUDIO_FILE),
    'transcript_path': str(transcript_path),
    'language': result.language,
    'elapsed_seconds': round(result.elapsed_seconds, 2),
    'preview': result.text[:120] + ('...' if len(result.text) > 120 else ''),
}
summary
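
The attributes used above (`text`, `language`, `elapsed_seconds`, and `save_text`) suggest roughly the following shape. This is an illustrative stand-in rather than the toolkit's actual definition: the name `SketchResult`, the field types, and the UTF-8 encoding are assumptions.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SketchResult:
    """Hypothetical stand-in for the result object used in this notebook."""
    text: str
    language: str
    elapsed_seconds: float

    def save_text(self, path):
        """Write the transcript to `path`, creating parent folders if needed."""
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(self.text, encoding='utf-8')
        return path


demo = SketchResult(text='hello world', language='en', elapsed_seconds=1.234)
print(demo.language, round(demo.elapsed_seconds, 2))
```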



## 4. Inspect the full transcript

Use Python, markdown, or any other tooling to explore the text. The snippet
below prints the entire transcription so you can validate it or copy it into
another workflow.


In [None]:
print(result.text)
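
For long recordings the raw printout can be hard to read. One option, sketched here with the standard-library `textwrap` module, is to wrap the text to a fixed width and report a word count. The `sample_text` string stands in for `result.text`, and `summarize_transcript` is a hypothetical helper.

```python
import textwrap


def summarize_transcript(text, width=80):
    """Return (word_count, wrapped_text) for a transcript string."""
    word_count = len(text.split())
    wrapped = textwrap.fill(text.strip(), width=width)
    return word_count, wrapped


# Placeholder text; in the notebook you would pass result.text instead.
sample_text = 'this is a short placeholder transcript used only for illustration'
count, wrapped = summarize_transcript(sample_text, width=40)
print(f'{count} words')
print(wrapped)
```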


## 5. Reuse the saved transcript

The output directory uses the same layout as the folder-based script, so you can
diff or share transcripts whether they were produced interactively or in bulk.


In [None]:
for path in sorted(OUTPUT_DIR.glob('*.txt')):
    print(path.name)
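
To compare a transcript produced here with one from a batch run, the standard-library `difflib` module is usually enough. A minimal sketch follows; the two strings are placeholders standing in for the contents of two transcript files, and `diff_transcripts` is a hypothetical helper.

```python
import difflib


def diff_transcripts(old_text, new_text, old_name='batch', new_name='notebook'):
    """Return unified-diff lines between two transcripts, one sentence per line."""
    # Splitting on sentence ends keeps the diff readable for one-line transcripts.
    old_lines = old_text.replace('. ', '.\n').splitlines()
    new_lines = new_text.replace('. ', '.\n').splitlines()
    return list(difflib.unified_diff(
        old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=''
    ))


# Placeholder strings; in practice read them with Path(...).read_text().
batch_text = 'Hello there. This is a test.'
notebook_text = 'Hello there. This is the test.'
for line in diff_transcripts(batch_text, notebook_text):
    print(line)
```

An empty result means the two transcripts are identical after sentence splitting.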