YouTube Video Transcriber and Summarizer (Single Cell)
This Google Colab notebook allows you to download the audio from a YouTube video, transcribe it using faster-whisper, and then summarize the transcript using OpenAI's GPT-4 model. All necessary installations and code are combined into one executable block.

Instructions:
1. Set Your OpenAI API Key: Before running the code, store your OPENAI_API_KEY using Colab's "Secrets" feature.
   a. On the left sidebar in Colab, click the "key" icon (Secrets).
   b. Click "Add new secret".
   c. For "Name", enter OPENAI_API_KEY.
   d. For "Value", paste your actual OpenAI API key.
   e. Check "Notebook access" for this notebook.
2. Update YouTube URL: In the code cell below, set the youtube_url variable to the URL of the YouTube video you want to summarize.
3. Configure Audio Removal: Set remove_audio_after_processing to True to delete the downloaded audio file after processing, or False to keep it.
4. Run the Cell: Click the "Play" button next to the code cell, or press Shift + Enter.
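Before running the cell, it can help to confirm that the value assigned to youtube_url looks like a video link. The sketch below is an optional pre-flight check, not part of the notebook: extract_video_id is a hypothetical helper, and it only recognizes the two most common URL shapes (youtube.com/watch?v=... and youtu.be/...). yt-dlp itself accepts many more forms, so a None result here is a hint to double-check the URL, not proof that the download will fail.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL shape, or None.

    Hypothetical helper for illustration; it covers only
    youtube.com/watch?v=<id> and youtu.be/<id>.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Standard watch page: the ID is the "v" query parameter.
    if host in {"www.youtube.com", "youtube.com", "m.youtube.com"} and parsed.path == "/watch":
        values = parse_qs(parsed.query).get("v", [])
        return values[0] if values else None

    # Short link: the ID is the path itself.
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    return None


# The example URL used in the code cell below.
print(extract_video_id("https://www.youtube.com/watch?v=AOi-wYOqs3E"))
```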

Hugging Face plays a role in this script through the faster_whisper library, which uses the Hugging Face Hub as the repository for its pre-trained Whisper models.

When you initialize model = WhisperModel("base", ...) for the first time, faster_whisper connects to the Hugging Face Hub and downloads the "base" model files (tokenizer.json, vocabulary.txt, model.bin, config.json). model.bin holds the model weights and is by far the largest of these (about 145 MB in the output below). Later runs in the same session reuse the downloaded copy.

So, although your code never calls Hugging Face directly, transcription depends on it. The HF_TOKEN warning in the cell output is a general message from the huggingface_hub library: a token can be used for authenticated operations, but it is not required to download public models such as the "base" Whisper model.
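To see whether the model has already been downloaded, you can look inside the Hugging Face Hub cache. The sketch below is illustrative only and rests on two assumptions: that faster_whisper uses the default huggingface_hub cache location (HF_HUB_CACHE, else HF_HOME/hub, else ~/.cache/huggingface/hub), and that the "base" model lives in a repository named Systran/faster-whisper-base. Both may differ between library versions, and the helper names are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def hub_cache_dir() -> Path:
    """Resolve the Hub cache directory from the usual environment variables."""
    if os.getenv("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"]).expanduser()
    hf_home = os.getenv("HF_HOME", "~/.cache/huggingface")
    return Path(hf_home).expanduser() / "hub"


def is_model_cached(repo_id: str, cache_dir: Optional[Path] = None) -> bool:
    """Check for the cache folder the Hub creates for a model repository.

    Cached model repos are stored as "models--<org>--<name>" folders.
    """
    root = cache_dir if cache_dir is not None else hub_cache_dir()
    folder_name = "models--" + repo_id.replace("/", "--")
    return (root / folder_name).is_dir()


# Assumed repo id for the "base" model; verify it for your faster-whisper version.
ASSUMED_BASE_REPO = "Systran/faster-whisper-base"
print(f"'base' model cached: {is_model_cached(ASSUMED_BASE_REPO)}")
```

A False result on a fresh Colab runtime is expected, since the cache is empty until the first WhisperModel("base", ...) call.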

In [5]:
# --- SECTION 1: Install Dependencies ---
# This installs yt-dlp (for downloading), faster-whisper (for transcription), and openai (for summarization).
!pip install yt-dlp faster-whisper openai

# Install FFmpeg: yt-dlp uses ffmpeg to extract audio from video files.
!apt-get install ffmpeg -y

# --- SECTION 2: Configuration - Set Your OpenAI API Key ---
import os

# Try Colab Secrets first. google.colab is only importable inside Colab.
try:
    from google.colab import userdata
except ImportError:
    userdata = None

openai_api_key = None
if userdata is not None:
    try:
        openai_api_key = userdata.get('OPENAI_API_KEY')
    except (userdata.SecretNotFoundError, userdata.NotebookAccessError):
        openai_api_key = None  # Fall through to the environment variable below

# Fall back to an environment variable (e.g., if running outside Colab)
if not openai_api_key:
    openai_api_key = os.getenv('OPENAI_API_KEY')

if not openai_api_key:
    raise ValueError("OPENAI_API_KEY not found. Please set it in Colab Secrets or as an environment variable.")

os.environ["OPENAI_API_KEY"] = openai_api_key # Set it as an environment variable for the script
print("OpenAI API Key loaded.")


# --- SECTION 3: Application Logic ---
import openai
import subprocess
from faster_whisper import WhisperModel
import logging
import textwrap # Import textwrap for formatting output

# Configure logging to see more details
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def download_audio(url: str, output_filename="audio.mp3"):
    """
    Downloads audio from a given YouTube URL using yt-dlp and saves it as an MP3.
    Requires yt-dlp and ffmpeg to be installed.
    """
    logging.info(f"Attempting to download audio from: {url}")
    try:
        result = subprocess.run(
            ["yt-dlp", "--extract-audio", "--audio-format", "mp3", url, "-o", output_filename],
            capture_output=True,
            text=True,
            check=True # Raise an exception for non-zero exit codes
        )
        logging.info(f"Download stdout: {result.stdout}")
        logging.info(f"Download stderr: {result.stderr}")
        logging.info(f"Audio downloaded to: {output_filename}")
        return output_filename
    except subprocess.CalledProcessError as e:
        logging.error(f"Error during audio download: {e}")
        logging.error(f"Stdout: {e.stdout}")
        logging.error(f"Stderr: {e.stderr}")
        raise
    except FileNotFoundError:
        logging.error("yt-dlp or ffmpeg not found. Ensure they are installed in the Colab environment.")
        raise

def transcribe_audio(file_path: str) -> str:
    """
    Transcribes an audio file using the Faster Whisper model.
    """
    logging.info(f"Starting transcription of: {file_path}")
    try:
        # Load the Whisper model. "base" is a good starting point for speed.
        # You can try "small", "medium", or "large" for better accuracy if needed.
        # Using device="cpu" and compute_type="int8" for broader compatibility in Colab's free tier.
        model = WhisperModel("base", device="cpu", compute_type="int8")

        # Transcribe the audio file
        segments, info = model.transcribe(file_path, beam_size=5)

        transcript_parts = []
        for segment in segments:
            transcript_parts.append(segment.text)
            # You can uncomment the line below to see segment-by-segment transcription
            # logging.info(f"[ {segment.start:.2f}s -> {segment.end:.2f}s ] {segment.text}")

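        # Whisper segments usually start with a space, so strip each one before joining
        full_transcript = " ".join(part.strip() for part in transcript_parts)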
        logging.info("Transcription complete.")
        return full_transcript
    except Exception as e:
        logging.error(f"Error during audio transcription: {e}")
        raise

def summarize_text(text: str) -> str:
    """
    Summarizes the given text using OpenAI's GPT-4 model.
    Requires OPENAI_API_KEY to be set as an environment variable.
    """
    logging.info("Attempting to summarize text using OpenAI.")
    openai_api_key = os.getenv('OPENAI_API_KEY') # Retrieve from environment variable
    if not openai_api_key:
        raise ValueError("OPENAI_API_KEY environment variable not set. Please set it.")

    try:
        client = openai.OpenAI(api_key=openai_api_key)
        prompt = f"Summarize the following podcast transcript:\n\n{text}"
        response = client.chat.completions.create(
            model="gpt-4", # Long transcripts may exceed this model's context window
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3
        )
        summary = response.choices[0].message.content
        logging.info("Text summarization complete.")
        return summary
    except openai.APIError as e:
        logging.error(f"OpenAI API error: {e}")
        raise
    except Exception as e:
        logging.error(f"An unexpected error occurred during summarization: {e}")
        raise

# --- SECTION 4: Run the Summarizer ---

# IMPORTANT: Replace this with the actual YouTube video URL you want to process
youtube_url = "https://www.youtube.com/watch?v=AOi-wYOqs3E" # Example
# Define the output audio file path
audio_file_path = "downloaded_audio.mp3"
transcript_file_path = "transcript.txt" # Path for the transcript file
summary_file_path = "summary.txt" # Path for the summary file

# OPTION: Set to True to remove the audio file after processing, False to keep it.
remove_audio_after_processing = False

try:
    # 1. Download audio
    downloaded_file = download_audio(youtube_url, audio_file_path)

    # 2. Transcribe audio
    transcript = transcribe_audio(downloaded_file)

    # Save transcript to a file
    with open(transcript_file_path, "w", encoding="utf-8") as f:
        f.write(transcript)
    logging.info(f"Transcript saved to: {transcript_file_path}")

    # 3. Summarize transcript
    summary = summarize_text(transcript)

    # Save summary to a file
    with open(summary_file_path, "w", encoding="utf-8") as f:
        f.write(summary)
    logging.info(f"Summary saved to: {summary_file_path}")

    print("\n--- Summary ---")
    # Wrap the summary text for better readability
    wrapped_summary = textwrap.fill(summary, width=80) # Adjust width as needed
    print(wrapped_summary)

except Exception as e:
    print(f"\nAn error occurred during the process: {e}")
finally:
    # Optionally remove the audio file once processing has finished
    if remove_audio_after_processing and os.path.exists(audio_file_path):
        os.remove(audio_file_path)
        logging.info(f"Cleaned up {audio_file_path}")

    # Report only the output files that actually exist (a failed run may not create all of them)
    for label, path in [
        ("Audio", audio_file_path),
        ("Transcript", transcript_file_path),
        ("Summary", summary_file_path),
    ]:
        if os.path.exists(path):
            logging.info(f"{label} file ({path}) will remain in your Colab environment.")

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.
OpenAI API Key loaded.


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer.json: 0.00B [00:00, ?B/s]

vocabulary.txt: 0.00B [00:00, ?B/s]

config.json: 0.00B [00:00, ?B/s]

model.bin:   0%|          | 0.00/145M [00:00<?, ?B/s]


--- Summary ---
The transcript provided does not contain any content to summarize as it only
mentions the playing of outro music.
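
Note that summarize_text sends the whole transcript in a single request. The base gpt-4 model has an 8,192-token context window, so the transcript of a long video may not fit. One common workaround is to split the transcript into chunks, summarize each chunk, and then summarize the partial summaries. The sketch below shows only the splitting step. split_into_chunks is a hypothetical helper that is not part of the notebook, and the default max_chars of 8000 is an assumption based on the rough rule of thumb of about four characters per English token, not a measured limit.

```python
from typing import List


def split_into_chunks(text: str, max_chars: int = 8000) -> List[str]:
    """Split text on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars is kept whole in its own chunk,
    so that chunk can exceed the limit.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0

    for word in text.split():
        # Account for the joining space unless this is the first word of a chunk.
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra

    if current:
        chunks.append(" ".join(current))
    return chunks


# Small demonstration with an artificially low limit.
for chunk in split_into_chunks("one two three four five", max_chars=9):
    print(repr(chunk))
```

Each chunk could then be passed to summarize_text, and the resulting partial summaries joined and summarized once more to produce the final summary.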
