![YouTube Video AI Summarization using Gemini Cover](https://github.com/AhmedMohamady1/YouTube-Video-AI-Summarization-using-Gemini-AI/raw/main/Cover%20Banner.png)

In [1]:
!pip uninstall -qqy jupyterlab  # Remove unused conflicting packages
!pip install google-generativeai youtube-transcript-api

Collecting youtube-transcript-api
  Downloading youtube_transcript_api-1.0.3-py3-none-any.whl.metadata (23 kB)
Downloading youtube_transcript_api-1.0.3-py3-none-any.whl (2.2 MB)
Installing collected packages: youtube-transcript-api
Successfully installed youtube-transcript-api-1.0.3


## Installing Necessary Libraries

This code cell installs the Python libraries that the project depends on. It performs two actions:

1.  **Uninstalling a Potentially Conflicting Package:**
    ```bash
    !pip uninstall -qqy jupyterlab
    ```
    * `!pip`: The leading `!` tells the Jupyter environment to run the rest of the line as a shell command; `pip` is the package installer for Python.
    * `uninstall`: This pip subcommand is used to remove an existing Python package.
    * `-qqy`: These are flags that modify the behavior of the `uninstall` command:
        * `-qq`: Makes the output "quiet" or "quieter," reducing the amount of information displayed during the uninstallation process.
        * `-y`: Automatically answers "yes" to any confirmation prompts, ensuring the uninstallation proceeds without user interaction.
    * `jupyterlab`: The web-based interactive development environment for Jupyter notebooks, code, and data. The comment `# Remove unused conflicting packages` suggests that a preinstalled `jupyterlab` could conflict with the libraries installed next, although such a conflict is not common for these two packages.

2.  **Installing `google-generativeai` and `youtube-transcript-api`:**
    ```bash
    !pip install google-generativeai youtube-transcript-api
    ```
    * `!pip install`: This command installs Python packages.
    * `google-generativeai`: Google's Generative AI Python SDK, used here to call the Gemini models.
    * `youtube-transcript-api`: This library retrieves the transcripts of YouTube videos programmatically, given their video IDs. The project relies on it to obtain the video content that is summarized.

**In summary, this cell ensures that your Python environment has the necessary libraries to:**

* Interact with Google's Gemini Generative AI models (`google-generativeai`).
* Fetch transcripts from YouTube videos (`youtube-transcript-api`).

By running this cell, you are setting up the foundation for the subsequent steps in your notebook where you will utilize these libraries to achieve your project's objectives.
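As a quick sanity check after installation, the sketch below reports which import names cannot be found in the current environment. The helper name `missing_modules` is hypothetical and not part of the notebook; it uses only the standard library.

```python
import importlib.util


def missing_modules(module_names):
    """Return the import names that cannot be found in the current environment."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec raises when a parent package is absent or malformed
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Import names used by this notebook (they differ from the pip package names)
print(missing_modules(["google.generativeai", "youtube_transcript_api"]))
```

An empty list means both libraries are importable; any name that is printed still needs to be installed.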

In [2]:
from google.genai import types
import google.generativeai as genai
from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled, NoTranscriptFound
import re
import json
import os



## Importing Necessary Modules and Libraries

This code cell imports specific modules and functions from the libraries you installed in the previous step, as well as some standard Python libraries. These imports make the functionalities of these libraries accessible in your current notebook.

1.  **`from google.genai import types`**:
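    * This line imports the `types` module from `google.genai`, which belongs to the separate `google-genai` SDK rather than the `google-generativeai` package installed above. The module holds type definitions for that SDK's requests and responses. No later cell in this notebook references `types`, and the import only succeeds if `google-genai` is already present in the environment.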

2.  **`import google.generativeai as genai`**:
    * This line imports the entire `google.generativeai` library and assigns it the alias `genai`. This is a common convention that allows you to access the functions and classes within this library using the shorter name `genai` (e.g., `genai.configure()`, `genai.GenerativeModel()`). This library is the core for interacting with Google's generative AI models.

3.  **`from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled, NoTranscriptFound`**:
    * This line imports specific classes and exceptions from the `youtube_transcript_api` library:
        * `YouTubeTranscriptApi`: This class provides the primary interface for fetching YouTube transcripts. This notebook calls its `list_transcripts` method directly on the class.
        * `TranscriptsDisabled`: This is an exception that might be raised if the YouTube video has disabled transcripts.
        * `NoTranscriptFound`: This is an exception that might be raised if a transcript cannot be found for the given YouTube video (e.g., if the video doesn't have automatically generated or manually uploaded subtitles).

4.  **`import re`**:
    * This line imports the built-in Python `re` module, which provides support for regular expressions. Regular expressions are tools for pattern matching and manipulation of text. This notebook uses them to extract the video ID from a YouTube URL.

5.  **`import json`**:
    * This line imports the built-in Python `json` module, which allows you to work with JSON (JavaScript Object Notation) data. The notebook uses it to parse the model's JSON response (`json.loads`) and to pretty-print the resulting summary (`json.dumps`).

6.  **`import os`**:
    * This line imports the built-in Python `os` module, which provides operating system-dependent functionality such as access to the file system and environment variables. None of the cells shown here call it; it would be useful for reading an API key from an environment variable or saving output files.

**In summary, this cell makes the necessary tools available for:**

* Interacting with Google's Gemini models (`google.generativeai`).
* Retrieving and handling YouTube transcripts (`youtube_transcript_api`).
* Performing text manipulation using regular expressions (`re`).
* Working with structured JSON data (`json`).
* Interacting with the operating system (if needed) (`os`).

These imports set the stage for the core logic of your notebook, where you will fetch a YouTube transcript, process it using the Gemini model, and format the output as JSON.

In [3]:
from kaggle_secrets import UserSecretsClient

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")

genai.configure(api_key=GOOGLE_API_KEY)

## Configuring the Google Generative AI API

This code cell securely configures access to the Google Generative AI models by retrieving your API key and then using it to authenticate the `google-generativeai` library.

1.  **`from kaggle_secrets import UserSecretsClient`**:
    * This line imports the `UserSecretsClient` class from the `kaggle_secrets` library. This library is specifically designed for securely managing secrets (like API keys) within the Kaggle environment. It allows you to store sensitive information as "secrets" in your Kaggle notebook settings and then access them in your code without hardcoding them directly.

2.  **`GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")`**:
    * This line does the following:
        * `UserSecretsClient()`: Creates an instance of the `UserSecretsClient` class.
        * `.get_secret("GOOGLE_API_KEY")`: This method of the `UserSecretsClient` instance retrieves the secret that you have stored in your Kaggle notebook settings under the name `"GOOGLE_API_KEY"`.
        * `GOOGLE_API_KEY = ...`: The retrieved secret (your Google API key) is then assigned to the variable `GOOGLE_API_KEY`.

3.  **`genai.configure(api_key=GOOGLE_API_KEY)`**:
    * This line configures the `google.generativeai` library (which you imported as `genai`) by providing your API key.
    * `genai.configure()`: This function is used to set global configuration options for the `genai` library.
    * `api_key=GOOGLE_API_KEY`: This argument specifies that the value of the `GOOGLE_API_KEY` variable (which you securely retrieved from Kaggle Secrets) should be used as the API key for authenticating your requests to the Google Generative AI models.

**In summary, this cell ensures that:**

* Your Google API key is securely retrieved from Kaggle's secret management system, preventing it from being directly exposed in your code.
* The `google-generativeai` library is properly authenticated using this API key, allowing you to access and use Google's Gemini models in the subsequent steps of your notebook.

This is a crucial step for interacting with any paid or secured AI service, as it verifies your identity and grants you access to the models. By using `kaggle_secrets`, you are following best practices for handling sensitive information within the Kaggle platform.
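`kaggle_secrets` is only available on Kaggle. The sketch below is one possible way to keep the same cell usable elsewhere: it falls back to an environment variable when the Kaggle module cannot be imported. The helper name `load_api_key` and the environment-variable fallback are assumptions for illustration, not part of the notebook.

```python
import os


def load_api_key(name="GOOGLE_API_KEY"):
    """Return the secret from Kaggle if available, else from the environment.

    Hypothetical helper: returns None when the key is not set anywhere.
    """
    try:
        # Only importable inside a Kaggle notebook session
        from kaggle_secrets import UserSecretsClient
    except ImportError:
        return os.environ.get(name)
    return UserSecretsClient().get_secret(name)


# Usage (assumes the notebook's `genai` import):
# genai.configure(api_key=load_api_key())
```

Either way, the key stays out of the notebook source.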

In [4]:
# --- Helper Function: Extract Video ID ---
def extract_video_id(youtube_url):
    """Extracts the YouTube video ID from various URL formats."""
    patterns = [
        r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/watch\?v=([a-zA-Z0-9_-]{11})',  # Standard watch URL
        r'(?:https?:\/\/)?(?:www\.)?youtu\.be\/([a-zA-Z0-9_-]{11})',          # Shortened URL
        r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/embed\/([a-zA-Z0-9_-]{11})', # Embed URL
        r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/v\/([a-zA-Z0-9_-]{11})',    # V URL
        r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/user\/.+\?v=([a-zA-Z0-9_-]{11})' # User channel URL
    ]
    for pattern in patterns:
        match = re.search(pattern, youtube_url)
        if match:
            return match.group(1)
    print(f"❌ Error: Could not extract video ID from URL: {youtube_url}")
    return None


## Helper Function to Extract YouTube Video ID

This code cell defines a Python function called `extract_video_id` that takes a YouTube video URL as input and attempts to extract the unique 11-character video ID from it. This function is designed to handle various common YouTube URL formats.

1.  **`def extract_video_id(youtube_url):`**:
    * This line defines a new function named `extract_video_id` that accepts one argument: `youtube_url`, which is expected to be a string containing the URL of a YouTube video.

2.  **`"""Extracts the YouTube video ID from various URL formats."""`**:
    * This is a docstring, providing a brief description of what the function does. It's good practice to include docstrings to explain the purpose and usage of your functions.

3.  **`patterns = [...]`**:
    * This line creates a list named `patterns`. This list contains several regular expression patterns, each designed to match a different format of a YouTube video URL.
        * `r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/watch\?v=([a-zA-Z0-9_-]{11})'`: Matches standard YouTube "watch" URLs (e.g., `https://www.youtube.com/watch?v=dQw4w9WgXcQ`). It captures the 11 alphanumeric characters and hyphens/underscores following `v=`.
        * `r'(?:https?:\/\/)?(?:www\.)?youtu\.be\/([a-zA-Z0-9_-]{11})'`: Matches shortened YouTube URLs (e.g., `https://youtu.be/dQw4w9WgXcQ`). It captures the 11 alphanumeric characters and hyphens/underscores after `youtu.be/`.
        * `r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/embed\/([a-zA-Z0-9_-]{11})'`: Matches YouTube embed URLs (e.g., `https://www.youtube.com/embed/dQw4w9WgXcQ`). It captures the 11 alphanumeric characters and hyphens/underscores after `embed/`.
        * `r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/v\/([a-zA-Z0-9_-]{11})'`: Matches older "v" URLs (e.g., `https://www.youtube.com/v/dQw4w9WgXcQ`). It captures the 11 alphanumeric characters and hyphens/underscores after `/v/`.
        * `r'(?:https?:\/\/)?(?:www\.)?youtube\.com\/user\/.+\?v=([a-zA-Z0-9_-]{11})'`: Matches URLs from user channels that include the video ID in the query parameters. It captures the 11 alphanumeric characters and hyphens/underscores after `?v=`.
        * The `(?:...)` syntax in the regular expressions creates non-capturing groups, meaning these parts of the URL are matched but not extracted as separate groups.
        * `[a-zA-Z0-9_-]{11}` matches exactly 11 characters drawn from letters (uppercase and lowercase), digits, hyphens, and underscores, which is the standard format for a YouTube video ID.
        * `https?:\/\/` matches both `http://` and `https://`.
        * `(?:www\.)?` optionally matches `www.`.

4.  **`for pattern in patterns:`**:
    * This loop iterates through each regular expression pattern in the `patterns` list.

5.  **`match = re.search(pattern, youtube_url)`**:
    * Inside the loop, for each `pattern`, the `re.search()` function from the `re` module is used to search for a match of the pattern within the input `youtube_url`. If a match is found, it returns a match object; otherwise, it returns `None`.

6.  **`if match:`**:
    * This conditional statement checks if a match was found by `re.search()`.

7.  **`return match.group(1)`**:
    * If a match is found, `match.group(1)` returns the content of the first capturing group in the regular expression. In each of your patterns, the video ID is enclosed in parentheses `(...)`, making it the first (and only) capturing group. This line extracts and returns the video ID.

8.  **`print(f"❌ Error: Could not extract video ID from URL: {youtube_url}")`**:
    * If the loop completes without finding a match for any of the patterns, this line is executed. It prints an error message indicating that the video ID could not be extracted from the provided `youtube_url`.

9.  **`return None`**:
    * If no video ID is found, the function returns `None` to indicate failure.

**In summary, this helper function provides a robust way to extract the unique identifier of a YouTube video from various URL formats. This is a crucial step for using the `youtube-transcript-api`, as it typically requires the video ID to fetch the transcript.** By encapsulating this logic in a function, you make your code more organized and reusable.
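The watch-URL pattern above expects `v=` to be the first query parameter. As a complementary sketch (not part of the notebook), the standard library's URL parser can read the `v` parameter wherever it appears in the query string. The helper name `video_id_from_query` is hypothetical, and it deliberately handles only URLs that carry the ID in a `v` parameter.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Same 11-character ID format that the notebook's patterns use
VIDEO_ID_RE = re.compile(r"[a-zA-Z0-9_-]{11}")


def video_id_from_query(youtube_url: str) -> Optional[str]:
    """Return the `v` query parameter if it looks like a video ID, else None."""
    # urlparse only separates the query reliably when a scheme is present
    if "://" not in youtube_url:
        youtube_url = "https://" + youtube_url
    query = urlparse(youtube_url).query
    for candidate in parse_qs(query).get("v", []):
        if VIDEO_ID_RE.fullmatch(candidate):
            return candidate
    return None


print(video_id_from_query("https://www.youtube.com/watch?feature=share&v=dQw4w9WgXcQ"))
```

A helper like this could be tried as a fallback when none of the regular expressions match; short `youtu.be` links carry no `v` parameter and still need the regex path.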

In [5]:
# --- Core Summarization Function ---
def summarize_youtube_video(youtube_url: str) -> dict | None:
    """
    Summarizes a YouTube video using its transcript and Gemini AI.

    Args:
        youtube_url: The URL of the YouTube video.

    Returns:
        A dictionary containing the summary in a structured format (title,
        short_summary, key_points) or None if summarization fails.
    """
    print(f"Processing video: {youtube_url}")
    # Extract video ID using the updated helper function
    video_id = extract_video_id(youtube_url)

    # --- 1. Get Video Transcript ---
    transcript_text = ""
    try:
        transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
        try:
            transcript = transcript_list.find_generated_transcript(['en'])
        except NoTranscriptFound:
            print("⚠️ Warning: No English transcript found. Trying first available...")
            transcript = next(iter(transcript_list))

        print(f"Transcript found ({transcript.language}, generated={transcript.is_generated})")
        transcript_data = transcript.fetch()

        # Try accessing the .text attribute directly. Catch attribute errors.
        processed_texts = []
        for item in transcript_data:
            try:
                # Assume item is an object with a 'text' attribute
                processed_texts.append(item.text)
            except AttributeError:
                # Fallback: If it doesn't have '.text', maybe it's a dict? (Less likely now)
                if isinstance(item, dict):
                     processed_texts.append(item.get('text', '')) # Use .get() for safety
                else:
                     print(f"⚠️ Warning: Skipping transcript item without 'text' attribute/key: {type(item)}")

        transcript_text = " ".join(processed_texts)

        print(f"Transcript length: {len(transcript_text)} characters.")

    except (TranscriptsDisabled, NoTranscriptFound) as e:
        print(f"❌ Error: Could not retrieve transcript for video ID {video_id}. Reason: {e}")
        # Specific advice for unusual URLs
        if "googleusercontent.com" in youtube_url:
             print("ℹ️ Info: Transcripts might be unavailable or formatted differently for googleusercontent URLs. Try a standard youtube.com or youtu.be URL if possible.")
        return None
    except Exception as e:
        print(f"❌ An unexpected error occurred fetching or processing transcript: {e}")
        return None

    if not transcript_text.strip():
        print("❌ Error: Transcript text is empty or could not be processed correctly.")
        return None

    # --- 2. Prepare Prompt and Configure Model (No changes needed) ---
    model = genai.GenerativeModel(
        'gemini-1.5-pro-latest',
        generation_config=genai.GenerationConfig(
            response_mime_type="application/json"
        )
    )
    prompt = f"""
    Please analyze the following YouTube video transcript and provide a concise summary.
    I need the output in JSON format with the following structure:
    {{
      "video_title_suggestion": "A short, catchy suggested title based on the content",
      "short_summary": "A brief paragraph (2-4 sentences) summarizing the main topic and conclusion.",
      "key_points": [
        "Bullet point 1 highlighting a major topic or finding.",
        "Bullet point 2 highlighting another significant aspect.",
        "Bullet point 3 summarizing a key takeaway or action item.",
        "..."
      ]
    }}

    Transcript:
    ---
    {transcript_text}
    ---

    Generate the JSON output based *only* on the provided transcript content. Avoid introductory phrases like "Here is the JSON output".
    """

    # --- 3. Call Gemini API (No changes needed) ---
    print("Generating summary with Gemini...")
    try:
        # Allow up to 120 seconds for the request; long transcripts may take a while to process
        response = model.generate_content(prompt, request_options={'timeout': 120}) # 120 seconds timeout
        
        # --- 4. Parse Response (No changes needed) ---
        # Clean potential markdown formatting if Gemini adds it despite JSON mime type
        cleaned_text = response.text.strip().lstrip('```json').rstrip('```').strip()
        summary_json = json.loads(cleaned_text)
        print("✅ Summary generated successfully!")
        return summary_json

    except json.JSONDecodeError as e:
        print(f"❌ Error: Failed to parse JSON response from Gemini: {e}")
        print(f"Raw Gemini Response Text:\n{response.text}") # Log raw response
        return None
    except Exception as e:
        print(f"❌ An error occurred during Gemini API call: {e}")
        # print(f"Gemini response object: {response}") # Uncomment for debugging
        return None


## Core Function for Summarizing YouTube Videos

This code cell defines the central function of your notebook, `summarize_youtube_video`, which orchestrates the process of taking a YouTube video URL, extracting its transcript, and using the Gemini AI model to generate a structured summary.

1.  **`def summarize_youtube_video(youtube_url: str) -> dict | None:`**:
    * This line defines a function named `summarize_youtube_video` that accepts one argument:
        * `youtube_url`: A string representing the URL of the YouTube video to be summarized.
    * The `-> dict | None` annotation indicates that the function is expected to return either a dictionary (containing the summary) or `None` if the summarization process fails at any point.

2.  **`""" ... """`**:
    * This is a detailed docstring explaining what the function does, its arguments, and what it returns. This is crucial for understanding how to use the function.

3.  **`print(f"Processing video: {youtube_url}")`**:
    * This line prints a message to the console indicating which video URL is currently being processed. This is helpful for monitoring the progress of your script, especially if you are processing multiple videos.

4.  **`video_id = extract_video_id(youtube_url)`**:
    * This line calls the `extract_video_id` function (defined in the previous cell) to get the unique ID of the YouTube video from the provided URL.

5.  **No `if not video_id:` check**:
    * The cell as shown passes `video_id` straight to the transcript lookup without checking whether `extract_video_id` returned `None`. When the ID cannot be extracted, `None` reaches `YouTubeTranscriptApi.list_transcripts`, and any resulting error is handled by the `except` blocks described in step 9.

6.  **`googleusercontent.com` URLs**:
    * The function does not parse `googleusercontent.com` URLs manually. The only special handling is an informational hint, printed when transcript retrieval fails for such a URL, that suggests trying a standard `youtube.com` or `youtu.be` URL.

7.  **`# --- 1. Get Video Transcript ---`**:
    * This is a comment indicating the start of the section responsible for fetching the video transcript.

8.  **`transcript_text = ""`**:
    * Initializes an empty string variable `transcript_text` which will store the concatenated transcript.

9.  **`try...except (TranscriptsDisabled, NoTranscriptFound) as e: ... except Exception as e: ...`**:
    * This `try...except` block handles potential errors that might occur during the transcript retrieval process.
        * `YouTubeTranscriptApi.list_transcripts(video_id)`: This uses the `YouTubeTranscriptApi` to get a list of available transcripts for the given `video_id`.
        * `transcript_list.find_generated_transcript(['en'])`: This attempts to find an automatically generated English transcript. Manually created English transcripts are not considered by this call.
        * `except NoTranscriptFound:`: If no auto-generated English transcript is found, the code falls back to the first available transcript in any language.
        * `transcript = next(iter(transcript_list))`: This gets the first transcript object from the list.
        * `transcript_data = transcript.fetch()`: This fetches the transcript content as a sequence of segments, each carrying its text together with a start time and duration. The code assumes each segment is an object with a `.text` attribute and treats a plain dictionary as the less likely case.
        * **Text extraction loop:** The loop iterates through `transcript_data` and collects the text of each segment. It first tries the `.text` attribute and catches `AttributeError`; if the item is a dictionary, it reads the `'text'` key with `.get()`; any other item is skipped with a warning.
        * `transcript_text = " ".join(processed_texts)`: After processing all items, the extracted text segments are joined together with spaces to form a single string.
        * `except (TranscriptsDisabled, NoTranscriptFound) as e:`: This catches specific exceptions raised by the `youtube-transcript-api` if transcripts are disabled for the video or if no transcript is found. It prints an informative error message and returns `None`. It also includes specific advice for `googleusercontent.com` URLs, suggesting potential alternative URL formats.
        * `except Exception as e:`: This catches any other unexpected errors that might occur during the transcript fetching or processing, prints an error message, and returns `None`.

10. **`if not transcript_text.strip(): ...`**:
    * This checks if the `transcript_text` is empty after the retrieval process. If it is, it prints an error message and returns `None`.

11. **`# --- 2. Prepare Prompt and Configure Model (No changes needed) ---`**:
    * This comment marks the section where the prompt for the Gemini model is constructed.
    * `model = genai.GenerativeModel(...)`: This initializes the `gemini-1.5-pro-latest` model and sets `response_mime_type` to `"application/json"` so that the model is asked to return JSON.
    * `prompt = f"""..."""`: This defines the prompt that will be sent to the Gemini model. It instructs the model to analyze the transcript and provide a structured JSON output containing a suggested title, a short summary, and key points. The transcript itself is included in the prompt.

12. **`# --- 3. Call Gemini API (No changes needed) ---`**:
    * This comment indicates the section where the call to the Gemini API is made.
    * `response = model.generate_content(prompt, request_options={'timeout': 120})`: This sends the constructed `prompt` to the Gemini model using the `generate_content` method. A `timeout` of 120 seconds is set for the API request.

13. **`# --- 4. Parse Response (No changes needed) ---`**:
    * This comment marks the section where the response from the Gemini model is processed.
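    * `cleaned_text = response.text.strip().lstrip('```json').rstrip('```').strip()`: This line removes leading/trailing whitespace and any markdown formatting (like ```json blocks) that the model might wrap around its answer. Note that `lstrip` and `rstrip` remove any of the listed characters rather than an exact prefix or suffix; this is harmless here because a JSON object begins with `{` and ends with `}`.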
    * `summary_json = json.loads(cleaned_text)`: This attempts to parse the cleaned response text as a JSON object and stores it in the `summary_json` variable.
    * `print("✅ Summary generated successfully!")`: If the JSON parsing is successful, a success message is printed.
    * `return summary_json`: The parsed JSON summary is returned.

14. **`except json.JSONDecodeError as e:`**:
    * This `except` block catches errors that might occur if the Gemini model's response is not valid JSON. It prints an error message and the raw response text for debugging and returns `None`.

15. **`except Exception as e:`**:
    * This `except` block catches any other unexpected errors that might occur during the Gemini API call, prints an error message, and returns `None`.
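The response-cleaning line in step 13 works for JSON objects, but `lstrip`/`rstrip` operate on character sets. The sketch below shows a prefix-aware alternative; `strip_code_fence` is a hypothetical helper, and it assumes that a fence opener with a language tag sits on its own line.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def strip_code_fence(text):
    """Remove one surrounding Markdown code fence, if present."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        first_newline = cleaned.find("\n")
        if first_newline != -1:
            # Drop the whole opener line, including a tag such as "json"
            cleaned = cleaned[first_newline + 1:]
        else:
            cleaned = cleaned[len(FENCE):]
    if cleaned.endswith(FENCE):
        cleaned = cleaned[:-len(FENCE)]
    return cleaned.strip()


fenced = FENCE + 'json\n{"short_summary": "demo"}\n' + FENCE
print(json.loads(strip_code_fence(fenced)))

# Character-set stripping can eat real content: the leading "n" is removed
print("null".lstrip(FENCE + "json"))  # prints "ull"
```

Unfenced text passes through unchanged, so the helper is safe to apply to every response.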

**In summary, this `summarize_youtube_video` function encapsulates the core logic of your project:**

* It takes a YouTube URL as input.
* It extracts the video ID.
* It retrieves the transcript of the video, handling potential errors like disabled or missing transcripts.
* It constructs a prompt that includes the transcript and asks the Gemini model for a structured JSON summary.
* It calls the Gemini API to get the summary.
* It parses the JSON response from Gemini.
* It returns the structured summary as a Python dictionary or `None` if any step fails.
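The transcript step can be exercised without network access. The sketch below mirrors the attribute-or-dictionary logic on stand-in data; `join_snippet_text` is a hypothetical name, and unlike the notebook's loop it silently skips items that carry no text instead of printing a warning.

```python
from types import SimpleNamespace


def join_snippet_text(snippets):
    """Join transcript segments that expose text as an attribute or a dict key."""
    parts = []
    for item in snippets:
        if isinstance(item, dict):
            text = item.get("text", "")
        else:
            text = getattr(item, "text", "")
        if text:
            parts.append(text)
    return " ".join(parts)


# Stand-ins for transcript segments: an object, a dict, and a dict without text
demo = [SimpleNamespace(text="Hello"), {"text": "world"}, {"start": 1.0}]
print(join_snippet_text(demo))  # prints "Hello world"
```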

This function is structured in clear steps with error handling around both the transcript retrieval and the Gemini call, so a failure at either stage produces a printed message and a `None` return rather than an unhandled exception. The hint for `googleusercontent.com` URLs and the attribute-or-dictionary transcript handling anticipate two likely sources of trouble.
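`json.loads` accepts any valid JSON value, so a parsed response is not guaranteed to be a dictionary with the three requested keys. The following sketch checks the shape before the result is used; `summary_problems` and `EXPECTED_KEYS` are hypothetical names introduced for illustration.

```python
EXPECTED_KEYS = ("video_title_suggestion", "short_summary", "key_points")


def summary_problems(summary):
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(summary, dict):
        return ["response is not a JSON object"]
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in summary]
    key_points = summary.get("key_points")
    if key_points is not None and not isinstance(key_points, list):
        problems.append("key_points is not a list")
    return problems


print(summary_problems({"short_summary": "demo", "key_points": "not a list"}))
```

A caller could log these problems and treat a non-empty list the same way as a parsing failure.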

In [6]:
# --- Example Usage ---
summary_result = summarize_youtube_video("https://www.youtube.com/watch?v=Qe5WT22-AO8&ab_channel=BobbyBroccoli")

if summary_result:
    print("\n--- Video Summary ---")
    print(f"Suggested Title: {summary_result.get('video_title_suggestion', 'N/A')}")
    print(f"\nShort Summary:\n{summary_result.get('short_summary', 'N/A')}")
    print("\nKey Points:")
    key_points = summary_result.get('key_points', [])
    if key_points:
        for point in key_points:
            print(f"- {point}")
    else:
        print("N/A")
    print("\n--- Raw JSON Output ---")
    print(json.dumps(summary_result, indent=2)) # Pretty print the JSON
else:
    print("\nFailed to generate summary for the video.")

Processing video: https://www.youtube.com/watch?v=Qe5WT22-AO8&ab_channel=BobbyBroccoli
Transcript found (English (auto-generated), generated=True)
Transcript length: 75302 characters.
Generating summary with Gemini...
✅ Summary generated successfully!

--- Video Summary ---
Suggested Title: The Element Hunters: From Fermi's Fission to Ninov's Fraud

Short Summary:
This video explores the history of element discovery, from Enrico Fermi's accidental discovery of nuclear fission to Victor Ninov's fraudulent claim of element 118.  It highlights the competitive race between labs, the challenges of creating superheavy elements, and the scientific and ethical implications of Ninov's deceit.

Key Points:
- Enrico Fermi's neutron bombardment experiments aimed to create new elements beyond uranium, but accidentally led to the discovery of nuclear fission.
- Glenn Seaborg and his team at Berkeley dominated the element discovery field for decades, adding elements 93 through 101 to the periodic tab

## Example Usage of the Summarization Function

This final code cell demonstrates how to use the `summarize_youtube_video` function that you defined in the previous cell. It provides an example of calling the function with a YouTube video URL and then displaying the resulting summary in a user-friendly format, as well as the raw JSON output.

1.  **`# --- Example Usage ---`**:
    * This is a comment indicating that the following code shows how to use the `summarize_youtube_video` function.

2.  **`summary_result = summarize_youtube_video("https://www.youtube.com/watch?v=Qe5WT22-AO8&ab_channel=BobbyBroccoli")`**:
    * This line calls the `summarize_youtube_video` function with a specific YouTube video URL. The returned value (either a dictionary containing the summary or `None` if an error occurred) is stored in the `summary_result` variable. The extra `&ab_channel=...` query parameter does not interfere with ID extraction, because the standard watch pattern only needs the `v=` value, as the cell output confirms.

3.  **`if summary_result:`**:
    * This conditional statement checks if `summary_result` is not `None`. If it's not `None`, it means the summarization was successful, and the variable contains the summary dictionary.

4.  **`print("\n--- Video Summary ---")`**:
    * If the summarization was successful, this line prints a header to indicate the start of the formatted summary.

5.  **`print(f"Suggested Title: {summary_result.get('video_title_suggestion', 'N/A')}")`**:
    * This line prints the suggested video title. It uses the `.get()` method of the dictionary with a default value of `'N/A'`. This is a safe way to access dictionary keys, as it prevents a `KeyError` if the key is not present in the `summary_result`.

6.  **`print(f"\nShort Summary:\n{summary_result.get('short_summary', 'N/A')}")`**:
    * This line prints the short summary in a similar way, using `.get()` with a default value. The `\n` adds a newline for better formatting.

7.  **`print("\nKey Points:")`**:
    * This line prints a header for the key points.

8.  **`key_points = summary_result.get('key_points', [])`**:
    * This line retrieves the list of key points from the `summary_result` dictionary. It uses `.get()` with a default empty list (`[]`) to avoid errors if the `'key_points'` key is missing.

9.  **`if key_points:`**:
    * This checks if the `key_points` list is not empty.

10. **`for point in key_points:`**:
    * If there are key points, this loop iterates through the list.

11. **`print(f"- {point}")`**:
    * Inside the loop, each key point is printed with a leading hyphen for better readability.

12. **`else: print("N/A")`**:
    * If the `key_points` list is empty, this line prints "N/A" to indicate that no key points were found in the summary.

13. **`print("\n--- Raw JSON Output ---")`**:
    * This line prints a header for the raw JSON output.

14. **`print(json.dumps(summary_result, indent=2))`**:
    * This line uses the `json.dumps()` function to convert the `summary_result` dictionary back into a JSON formatted string. The `indent=2` argument makes the JSON output more readable by adding indentation. This is useful for seeing the complete structure of the generated summary.

15. **`else: print("\nFailed to generate summary for the video.")`**:
    * If the initial `if summary_result:` condition was false (meaning `summary_result` was `None`), this `else` block is executed, printing a message indicating that the summary generation failed.

**In summary, this cell demonstrates how to:**

* Call the `summarize_youtube_video` function with a YouTube URL.
* Check if the function returned a valid summary.
* Access and print the different parts of the summary (suggested title, short summary, and key points) in a user-friendly way, handling cases where some parts might be missing.
* Print the raw JSON output of the summary for inspection.
* Handle the case where the summary generation fails.

This cell serves as a practical example of how to use the core functionality built in the previous cells and provides a way to see the output of the pipeline. To summarize a different video, replace the example URL with another YouTube video URL that has a transcript available.
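The printing logic can also be collected into a function that returns a string, which makes it easy to reuse or test. This is a minimal sketch of that idea; `format_summary` is a hypothetical helper that follows the same keys and `'N/A'` defaults as the cell above.

```python
def format_summary(summary):
    """Render the summary dictionary as the text block printed in the example."""
    lines = [
        "--- Video Summary ---",
        f"Suggested Title: {summary.get('video_title_suggestion', 'N/A')}",
        "",
        "Short Summary:",
        summary.get("short_summary", "N/A"),
        "",
        "Key Points:",
    ]
    key_points = summary.get("key_points", [])
    if key_points:
        lines.extend(f"- {point}" for point in key_points)
    else:
        lines.append("N/A")
    return "\n".join(lines)


print(format_summary({"video_title_suggestion": "Demo", "key_points": ["first", "second"]}))
```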