### FFmpeg
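* A CLI tool that you download to your computer and run from the console
* Can compress video, extract thumbnails, and extract audio
* On macOS, install it with `brew install ffmpeg`
* `ffmpeg -i files/podcast.mp4 -vn files/audio.mp3`
    * `-vn`: ignore the video stream and extract only the audio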
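
As a rough sketch of how these FFmpeg tasks look as argument lists that Python can later hand to a process runner: the helper names and the `files/thumb.jpg` path below are hypothetical, and the thumbnail command assumes the common `-ss` (seek) and `-frames:v 1` (stop after one video frame) options. Nothing is executed here; the commands are only built and printed.

```python
def audio_extraction_command(video_path, audio_path):
    """Build ffmpeg arguments that drop the video stream (-vn) and keep the audio."""
    return ["ffmpeg", "-i", video_path, "-vn", audio_path]


def thumbnail_command(video_path, image_path, timestamp="00:00:01"):
    """Build ffmpeg arguments that save a single frame as an image.

    -ss seeks to the timestamp and -frames:v 1 stops after one video frame.
    """
    return ["ffmpeg", "-ss", timestamp, "-i", video_path, "-frames:v", "1", image_path]


# Print the commands as they would be typed in a console.
print(" ".join(audio_extraction_command("files/podcast.mp4", "files/audio.mp3")))
print(" ".join(thumbnail_command("files/podcast.mp4", "files/thumb.jpg")))
```

Keeping each argument as a separate list item avoids shell quoting problems when a path contains spaces.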

### subprocess
* A Python standard-library module that lets Python code run external commands
* `subprocess.run(command)`
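
A minimal sketch of `subprocess.run` with a few commonly used keyword arguments. To keep it runnable without FFmpeg installed, the current Python interpreter (`sys.executable`) stands in for the external tool; that substitution is an assumption made for illustration only.

```python
import subprocess
import sys

# The command is a list: program first, then one item per argument.
command = [sys.executable, "-c", "print('hello from a subprocess')"]

result = subprocess.run(
    command,
    capture_output=True,  # collect stdout/stderr instead of printing them
    text=True,            # decode the captured bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)

print(result.returncode)
print(result.stdout.strip())
```

With `check=True`, a failing command raises an exception instead of failing silently, which is usually what you want before passing its output file to the next step.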

### pydub
* Can split an audio file into chunks of a given number of minutes
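
pydub measures an `AudioSegment` in milliseconds (both `len(track)` and slice indices), so splitting by minutes comes down to some arithmetic on millisecond offsets. The sketch below shows only that arithmetic in plain Python, without pydub; `chunk_boundaries` is a hypothetical helper, and the 12-minute track length is an assumed round number for illustration.

```python
import math


def chunk_boundaries(track_len_ms, chunk_minutes):
    """Return (start_ms, end_ms) pairs covering a track of track_len_ms milliseconds."""
    chunk_len_ms = chunk_minutes * 60 * 1000
    n_chunks = math.ceil(track_len_ms / chunk_len_ms)
    return [
        # Clamp the end so the last chunk stops at the end of the track.
        (i * chunk_len_ms, min((i + 1) * chunk_len_ms, track_len_ms))
        for i in range(n_chunks)
    ]


# A 12-minute track cut into 10-minute chunks: one full chunk plus a 2-minute remainder.
boundaries = chunk_boundaries(12 * 60 * 1000, 10)
print(boundaries)  # [(0, 600000), (600000, 720000)]
```

Each pair can then be used as a slice such as `track[start:end]` on an `AudioSegment`.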

In [5]:
import math
# subprocess lets Python code run external commands
import subprocess
from pydub import AudioSegment

In [18]:
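import os

def extract_audio_from_video(video_path, audio_path):
    # -vn drops the video stream; -y overwrites audio_path instead of prompting
    command = ["ffmpeg", "-y", "-i", video_path, "-vn", audio_path]
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(command, check=True)

def cut_audio_in_chunks(audio_path, chunk_size, chunks_folder):
    """Split the audio into chunks of chunk_size minutes and save them as mp3 files."""
    os.makedirs(chunks_folder, exist_ok=True)  # export fails if the folder is missing
    track = AudioSegment.from_mp3(audio_path)
    chunk_len = chunk_size * 60 * 1000  # pydub measures time in milliseconds
    chunks = math.ceil(len(track) / chunk_len)
    for i in range(chunks):
        start_time = i * chunk_len
        end_time = (i + 1) * chunk_len
        # Slicing past the end is safe; the last chunk is simply shorter
        chunk = track[start_time:end_time]
        chunk.export(f"{chunks_folder}/chunk_{i}.mp3", format="mp3")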

In [4]:
extract_audio_from_video("./files/podcast.mp4", "./files/podcast.mp3")

ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
  built with Apple clang version 15.0.0 (clang-1500.0.40.1)
  configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/6.0_2 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags='-Wl,-ld_classic' --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enabl

In [8]:
track = AudioSegment.from_mp3("./files/podcast.mp3")
print(f"mp3 length: about {int(track.duration_seconds / 60)} min")

mp3 length: about 12 min


In [20]:
cut_audio_in_chunks("./files/podcast.mp3", 10, "./files/chunks")

In [25]:
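import glob
import re

import openai

def chunk_index(path):
    # "chunk_12.mp3" -> 12, so that chunk_10 sorts after chunk_2
    return int(re.search(r"chunk_(\d+)\.mp3$", path).group(1))

def transcribe_chunks(chunk_folder, destination):
    # glob does not guarantee any order, so sort by chunk number
    files = sorted(glob.glob(f"{chunk_folder}/chunk_*.mp3"), key=chunk_index)
    # "a" appends, so remove an old transcript before re-running
    with open(destination, "a") as text_file:
        for file in files:
            with open(file, "rb") as audio_file:
                # openai.Audio.transcribe is the interface of openai versions before 1.0
                transcript = openai.Audio.transcribe("whisper-1", audio_file)
            text_file.write(transcript["text"])

transcribe_chunks("./files/chunks", "./files/transcript.txt")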