# Video Transcript Loaders

## Introduction

This document shows how to use the video transcript loaders to extract transcripts from video files with three Whisper configurations: Azure OpenAI, the OpenAI API, and a locally run model. Whisper is a speech recognition model developed by OpenAI that supports speech-to-text transcription, speech translation, and language identification.

The supported file formats are `MP4`, `MKV`, `AVI`, `M4A`, `MP3`, `WEBM`, `MPGA`, `WAV`, `MPEG`, and `OGG`.
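
Checking a file's extension against this list before handing it to a loader can give an early, readable error. The sketch below is a minimal illustration; `SUPPORTED_EXTENSIONS` and `is_supported_format` are hypothetical helper names, not part of the loaders.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
# This set and the helper below are illustrative, not part of the loaders' API.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".mkv", ".avi", ".m4a", ".mp3",
    ".webm", ".mpga", ".wav", ".mpeg", ".ogg",
}


def is_supported_format(path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_format("talk.MP4")` is true, while `is_supported_format("notes.txt")` is false. The check looks only at the file name and does not inspect the file's contents.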

## Installation

Before using the loaders, ensure you have the necessary packages installed.

```bash
pip install --upgrade --quiet langchain langchain-community openai-whisper
```

## Additional requirements

FFmpeg is required for video format conversions. Install it with the command for your platform:

```bash
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using direct download:
# download from https://ffmpeg.org/download.html and add the executable to your PATH

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
```
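
After installing, it can help to confirm that `ffmpeg` is reachable on your `PATH`. The sketch below also builds a command line that extracts the audio track to `.ogg`. How the loaders invoke FFmpeg internally is not shown in this document, so treat the flags as one plausible conversion rather than the loaders' actual command. The helper names are hypothetical, and the `libvorbis` encoder is assumed to be present in your FFmpeg build.

```python
import shutil
from pathlib import Path
from typing import List


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_ogg_command(source: str) -> List[str]:
    """Build an ffmpeg command that writes the audio of `source` to a sibling .ogg file.

    Illustrative only: the loaders may use different options.
    """
    target = Path(source).with_suffix(".ogg")
    return [
        "ffmpeg",
        "-y",             # overwrite the output file if it already exists
        "-i", source,     # input file
        "-vn",            # drop the video stream
        "-c:a", "libvorbis",  # encode audio as Vorbis (assumes the encoder is available)
        str(target),
    ]
```

To run the conversion yourself, pass the returned list to `subprocess.run(cmd, check=True)` once `ffmpeg_available()` returns true.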

## Examples

### AzureWhisperVideoSegmentLoader

A document loader that processes video files, converts them to .ogg,
and transcribes them using Azure OpenAI's API.

```python
from langchain_community.document_loaders import AzureWhisperVideoSegmentLoader

# Replace the placeholders with your own values.
video_path = "<video_path>"
api_key = "<api_key>"
api_version = "<api_version>"
azure_endpoint = "<azure_endpoint>"
deployment_id = "<deployment_id>"

loader = AzureWhisperVideoSegmentLoader(
    video_path=video_path,
    deployment_id=deployment_id,
    api_key=api_key,
    api_version=api_version,
    azure_endpoint=azure_endpoint,
)

documents = loader.lazy_load()
```

```python
for doc in documents:
    print(doc)
    print()
```
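
The placeholders above are plain strings, but in practice you may prefer to read credentials from the environment rather than hard-coding them. The following is a minimal sketch; `require_env` is a hypothetical helper, and any variable names you choose (for example `AZURE_OPENAI_API_KEY`) are your own convention, not something the loaders are documented here to read automatically.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        # Treat an unset or empty variable as a configuration error.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

You could then write `api_key = require_env("AZURE_OPENAI_API_KEY")` before constructing the loader, so a missing key fails fast instead of surfacing later as an authentication error.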

### AzureWhisperVideoParagraphLoader

A document loader that processes video files, converts them to .ogg, and transcribes them
into paragraphs of a set number of sentences (`paragraph_sentence_size`) using Azure OpenAI's API.

```python
from langchain_community.document_loaders import AzureWhisperVideoParagraphLoader

# Replace the placeholders with your own values.
video_path = "<video_path>"
api_key = "<api_key>"
api_version = "<api_version>"
azure_endpoint = "<azure_endpoint>"
deployment_id = "<deployment_id>"
paragraph_sentence_size = 3

loader = AzureWhisperVideoParagraphLoader(
    video_path=video_path,
    deployment_id=deployment_id,
    api_key=api_key,
    api_version=api_version,
    azure_endpoint=azure_endpoint,
    paragraph_sentence_size=paragraph_sentence_size,
)

documents = loader.lazy_load()
```

```python
for doc in documents:
    print(doc)
    print()
```
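
To make the effect of `paragraph_sentence_size` concrete, the sketch below groups sentences into paragraphs of a given size. It is an illustration of the idea only: the loader's actual sentence splitting is not described in this document, and both helper functions and the naive punctuation-based splitter are assumptions.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def group_into_paragraphs(sentences: List[str], size: int) -> List[str]:
    """Join consecutive sentences into paragraphs of at most `size` sentences."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [
        " ".join(sentences[i:i + size])
        for i in range(0, len(sentences), size)
    ]
```

With a size of 3, the text `"One. Two! Three? Four. Five."` becomes two paragraphs: `"One. Two! Three?"` and `"Four. Five."`. The last paragraph is shorter when the sentence count is not a multiple of the size.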

### OpenAIWhisperVideoSegmentLoader

A document loader that processes video files, converts them to .ogg,
and transcribes them using OpenAI's API.

```python
from langchain_community.document_loaders import OpenAIWhisperVideoSegmentLoader

# Replace the placeholders with your own values.
video_path = "<video_path>"
api_key = "<api_key>"

loader = OpenAIWhisperVideoSegmentLoader(video_path=video_path, api_key=api_key)

documents = loader.lazy_load()
```

```python
for doc in documents:
    print(doc)
    print()
```

### OpenAIWhisperVideoParagraphLoader

A document loader that processes video files, converts them to .ogg, and transcribes them
into paragraphs of a set number of sentences (`paragraph_sentence_size`) using OpenAI's API.

```python
from langchain_community.document_loaders import OpenAIWhisperVideoParagraphLoader

# Replace the placeholders with your own values.
video_path = "<video_path>"
api_key = "<api_key>"
paragraph_sentence_size = 3

loader = OpenAIWhisperVideoParagraphLoader(
    video_path=video_path,
    api_key=api_key,
    paragraph_sentence_size=paragraph_sentence_size,
)

documents = loader.lazy_load()
```

```python
for doc in documents:
    print(doc)
    print()
```

### LocalWhisperVideoSegmentLoader

A document loader that processes video files and transcribes them using Whisper locally.

```python
from langchain_community.document_loaders import LocalWhisperVideoSegmentLoader

# Replace the placeholder with the path to your video file.
video_path = "<video_path>"

loader = LocalWhisperVideoSegmentLoader(video_path=video_path)
documents = loader.lazy_load()
```

```python
for doc in documents:
    print(doc)
    print()
```
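
For context on what a "segment" is: when Whisper runs locally through the `openai-whisper` package, `transcribe()` returns a dictionary whose `segments` entries each carry `start`, `end` (in seconds), and `text`. The sketch below formats such segments as timestamped lines. Whether and how the loader exposes these fields on its documents is not stated here, so the helpers are hypothetical and operate on plain dictionaries.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a duration in seconds as HH:MM:SS, truncating fractional seconds."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments: List[Dict]) -> List[str]:
    """Turn segments with 'start', 'end' and 'text' keys into timestamped lines."""
    return [
        f"[{format_timestamp(seg['start'])} - {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in segments
    ]
```

A segment such as `{"start": 0.0, "end": 4.5, "text": " Hello."}` is rendered as `[00:00:00 - 00:00:04] Hello.`.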

### LocalWhisperVideoParagraphLoader

A document loader that processes video files and transcribes them into paragraphs of a set number of sentences (`paragraph_sentence_size`) using Whisper locally.

```python
from langchain_community.document_loaders import LocalWhisperVideoParagraphLoader

# Replace the placeholder with the path to your video file.
video_path = "<video_path>"
paragraph_sentence_size = 3

loader = LocalWhisperVideoParagraphLoader(
    video_path=video_path,
    paragraph_sentence_size=paragraph_sentence_size,
)
documents = loader.lazy_load()
```

```python
for doc in documents:
    print(doc)
    print()
```