# Extract audio from video

Extract the audio track from video files for transcription, analysis, or processing.

## Problem

You have video files but need only the audio track for transcription, speaker analysis, or other audio processing. Extracting audio by hand with ffmpeg is tedious, and the resulting files live outside your data pipeline.

| Source | Goal |
|--------|------|
| Lecture recordings | Transcribe for notes |
| Meeting videos | Extract for speaker ID |
| Video podcasts | Create audio-only version |
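
To make the manual approach concrete, here is a minimal sketch of the kind of helper you would otherwise write yourself. It only builds the ffmpeg command line and does not run it. The helper name `ffmpeg_audio_command`, the `CODECS` mapping, and the file names are hypothetical and chosen for illustration; `-vn` (drop video) and `-acodec` (select audio codec) are standard ffmpeg options.

```python
from pathlib import Path

# Hypothetical mapping from output format to an ffmpeg audio encoder name.
CODECS = {'mp3': 'libmp3lame', 'wav': 'pcm_s16le', 'flac': 'flac'}


def ffmpeg_audio_command(video_path: str, fmt: str = 'mp3') -> list[str]:
    """Build (but do not run) an ffmpeg command that keeps only the audio."""
    if fmt not in CODECS:
        raise ValueError(f'unsupported format: {fmt}')
    # Write the audio next to the video, with the new extension.
    out_path = Path(video_path).with_suffix(f'.{fmt}')
    return ['ffmpeg', '-i', video_path, '-vn', '-acodec', CODECS[fmt], str(out_path)]


# One command per file: you still have to run each one, track the outputs,
# and rerun the step whenever a new video arrives.
commands = [ffmpeg_audio_command(p) for p in ['lecture.mp4', 'meeting.mov']]
for cmd in commands:
    print(' '.join(cmd))
```

Running, tracking, and re-running these commands for every new file is the bookkeeping that the computed-column approach in this recipe takes over.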

## Solution

**What's in this recipe:**
- Extract audio from video as a computed column
- Choose audio format (mp3, wav, flac)
- Chain with transcription for automatic video-to-text

Use the `extract_audio` function in a computed column to derive an audio column from a video column. Downstream computed columns, such as transcription, can then take that audio column as their input.

### Setup

In [None]:
%pip install -qU pixeltable

In [2]:
import pixeltable as pxt
from pixeltable.functions.video import extract_audio

In [3]:
# Start from a clean directory: drop any existing one, then recreate it
pxt.drop_dir('audio_extract_demo', force=True)
pxt.create_dir('audio_extract_demo')

Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory 'audio_extract_demo'.


<pixeltable.catalog.dir.Dir at 0x16b60aa50>

### Extract audio from video

In [4]:
# Create table for videos
videos = pxt.create_table(
    'audio_extract_demo.videos',
    {'title': pxt.String, 'video': pxt.Video}
)

Created table 'videos'.


In [5]:
# Add computed column to extract audio as MP3
videos.add_computed_column(
    audio=extract_audio(videos.video, format='mp3')
)

Added 0 column values with 0 errors.


No rows affected.

In [6]:
# Insert a sample video
video_url = 'https://github.com/pixeltable/pixeltable/raw/main/docs/resources/bangkok.mp4'

videos.insert([{
    'title': 'Bangkok',
    'video': video_url
}])

Inserting rows into `videos`: 1 rows [00:00, 152.93 rows/s]
Inserted 1 row with 0 errors.


1 row inserted, 4 values computed.

In [7]:
# View results
videos.select(videos.title, videos.audio).collect()

title,audio
Bangkok,


### Chain with transcription

Add transcription as a follow-up computed column:

In [None]:
# Install whisper for transcription
%pip install -qU openai-whisper

In [9]:
from pixeltable.functions import whisper

# Add transcription of the extracted audio
videos.add_computed_column(
    transcription=whisper.transcribe(videos.audio, model='base.en')
)

Added 1 column value with 0 errors.


1 row updated, 1 value computed.

In [10]:
# Pull the `text` field out of the transcription result
videos.add_computed_column(
    transcript=videos.transcription.text
)

Added 1 column value with 0 errors.


1 row updated, 1 value computed.

In [11]:
# View the full pipeline results
videos.select(videos.title, videos.transcript).collect()

title,transcript
Bangkok,


## Explanation

**Audio format options:**

| Format | Use case |
|--------|----------|
| `mp3` | Compressed, widely compatible |
| `wav` | Uncompressed, for processing |
| `flac` | Lossless compression |
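
As a rough guide to the trade-off in the table, the sketch below estimates file sizes for one minute of audio. The sample rate, channel count, bit depth, and bitrate are illustrative assumptions, not the settings `extract_audio` uses, and the function names are hypothetical. FLAC is left out because its size depends on the content.

```python
def pcm_bytes(seconds: int, sample_rate: int = 44_100, channels: int = 2, bits: int = 16) -> int:
    """Size of raw PCM audio data (the payload of a WAV file, header excluded)."""
    return seconds * sample_rate * channels * bits // 8


def constant_bitrate_bytes(seconds: int, kbps: int = 128) -> int:
    """Approximate size of a constant-bitrate stream such as a 128 kbps MP3."""
    return seconds * kbps * 1000 // 8


wav_size = pcm_bytes(60)               # 60 s of 44.1 kHz, 16-bit stereo
mp3_size = constant_bitrate_bytes(60)  # 60 s at 128 kbps
print(f'WAV ~ {wav_size / 1e6:.1f} MB, MP3 ~ {mp3_size / 1e6:.2f} MB')
```

Under these assumptions the uncompressed audio is roughly eleven times larger than the MP3, which is why `wav` suits intermediate processing and `mp3` suits storage and sharing.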

**Pipeline flow:**

```
Video → extract_audio → Audio → whisper.transcribe → Transcript
```

Each step is a computed column. When you insert a new video:
1. Audio is extracted automatically
2. Whisper transcribes the audio
3. The results are stored in the table, so later queries read them without recomputing anything
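
The steps above can be pictured with a toy model in plain Python. This is not how Pixeltable is implemented; `ToyTable`, `fake_extract_audio`, and `fake_transcribe` are hypothetical stand-ins that only illustrate the idea of computing derived columns once at insert time and storing the results.

```python
from typing import Any, Callable

Row = dict[str, Any]


class ToyTable:
    """Toy stand-in for a table with computed columns (illustration only)."""

    def __init__(self) -> None:
        self.rows: list[Row] = []
        self.computed: dict[str, Callable[[Row], Any]] = {}

    def add_computed_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        self.computed[name] = fn
        for row in self.rows:  # backfill rows that already exist
            row[name] = fn(row)

    def insert(self, row: Row) -> None:
        stored = dict(row)
        # Columns are evaluated in the order they were added, so a later
        # column can depend on an earlier one.
        for name, fn in self.computed.items():
            stored[name] = fn(stored)
        self.rows.append(stored)


calls: list[str] = []  # records which steps actually ran


def fake_extract_audio(row: Row) -> str:
    calls.append('extract')
    return row['video'].rsplit('.', 1)[0] + '.mp3'


def fake_transcribe(row: Row) -> str:
    calls.append('transcribe')
    return f"transcript of {row['audio']}"


table = ToyTable()
table.add_computed_column('audio', fake_extract_audio)
table.add_computed_column('transcript', fake_transcribe)
table.insert({'title': 'Bangkok', 'video': 'bangkok.mp4'})

# Reading a value twice returns the stored result; no step runs again.
first = table.rows[0]['transcript']
second = table.rows[0]['transcript']
print(first, calls)
```

Each step runs once per inserted row, and queries afterwards are plain reads of the stored values.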

## See also

- [Transcribe audio](https://docs.pixeltable.com/howto/cookbooks/audio/audio-transcribe) - Audio-only transcription
- [Summarize podcasts](https://docs.pixeltable.com/howto/cookbooks/audio/audio-summarize-podcast) - Transcribe and summarize
- [Extract video frames](https://docs.pixeltable.com/howto/cookbooks/video/video-extract-frames) - Work with video frames