<a href="https://colab.research.google.com/github/kirbah/genai-chaptercraft/blob/main/GenAI_ChapterCraft_YouTube.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GenAI ChapterCraft: Automated Video Chapter Generation

**Overview:**

This notebook demonstrates an automated workflow for generating video chapters using AI. It retrieves a YouTube video transcript, constructs a prompt, and leverages Gemini AI to generate SEO-friendly, timestamped chapters.

**Key Features:**

- **Transcript Extraction:** Automatically fetches and compiles YouTube video transcripts.
- **AI-driven Chapter Creation:** Utilizes Gemini AI to identify topic shifts and generate chapters with timestamps.
- **Optimized for Colab:** Designed for efficient execution in Google Colab.

**Process:**

1. Provide a YouTube video URL.
2. Extract and compile the video transcript.
3. Format the transcript into a structured prompt.
4. Use Gemini AI to generate chapter titles with timestamps.
5. Display the generated chapters.

**Benefits:**

- Saves time by automating chapter creation.
- Enhances video searchability and user experience.
- Provides an open-source solution leveraging state-of-the-art AI.

**Requirements:**

- A valid GEMINI_API_KEY from AI Studio (obtain one at: [AI Studio API Key](https://aistudio.google.com/apikey)).

**How to Use:**

1. Store your GEMINI_API_KEY in Colab Secrets (the key icon in the left sidebar) and grant this notebook access to it.
2. Run the notebook cells sequentially.
3. Enjoy the generated video chapters!

Install required libraries:
- google-generativeai (for Gemini AI)
- youtube-transcript-api (to fetch YouTube transcripts)

In [48]:
!pip install -q google-generativeai youtube-transcript-api

In [49]:
import re
from google.colab import userdata
from youtube_transcript_api import YouTubeTranscriptApi
import google.generativeai as genai

Define the URL of the YouTube video whose transcript will be downloaded.

In [50]:
video_url = "https://www.youtube.com/watch?v=A9WY_HZUK8Q"

Define a function to extract the YouTube video ID from the URL.

In [51]:
def extract_video_id(url):
    """Extracts the YouTube video ID from the URL."""
    # Stop at '&', '?' or '#' so trailing parameters and fragments are not captured
    match = re.search(r"(?:v=|youtu\.be/)([^&?#]+)", url)
    if match:
        return match.group(1)
    raise ValueError("Invalid YouTube URL provided.")

# Extract the ID from the video URL defined above
video_id = extract_video_id(video_url)
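
The regular expression above covers the `watch?v=` and `youtu.be/` URL forms. As a hedged alternative, the sketch below parses the URL with the standard library instead and also accepts `/embed/`, `/shorts/`, and `/live/` paths. The function name `extract_video_id_strict` and the list of accepted hosts are assumptions made for illustration; they are not part of the notebook.

```python
from urllib.parse import urlparse, parse_qs

# Hosts treated as YouTube in this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = ("youtube.com", "m.youtube.com", "music.youtube.com")


def extract_video_id_strict(url):
    """Hypothetical stricter variant of extract_video_id based on URL parsing."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path component.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Regular watch pages carry the ID in the 'v' query parameter.
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
                candidate = parts[1]

    if not candidate:
        raise ValueError("Invalid YouTube URL provided.")
    return candidate


print(extract_video_id_strict("https://youtu.be/A9WY_HZUK8Q?t=30"))
```

URLs without a scheme (for example `youtube.com/watch?v=...`) are rejected by this sketch, because `urlparse` does not report a hostname for them.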

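Retrieve the transcript segments for the video ID and join their text into a single string. Only the text of each segment is kept here; the per-segment start times are not included in `transcript_text`.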

In [52]:
transcript_segments = YouTubeTranscriptApi.get_transcript(video_id)
transcript_text = "\n".join([segment["text"] for segment in transcript_segments])
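
The prompt used later asks the model for MM:SS timestamps, so it may help to keep timing information in the transcript text. The following is a minimal sketch of one way to do that, assuming each segment is a dict with a numeric `start` field (in seconds) next to `text`. The helper names `format_timestamp` and `build_timestamped_transcript` are hypothetical, and the sample segments are made-up placeholders rather than real transcript data.

```python
def format_timestamp(seconds):
    """Format a start time in seconds as MM:SS, or H:MM:SS past one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def build_timestamped_transcript(segments):
    """Prefix every segment's text with its start time, one segment per line."""
    return "\n".join(
        f"[{format_timestamp(segment['start'])}] {segment['text']}"
        for segment in segments
    )


# Made-up placeholder segments, used only to show the output shape.
sample_segments = [
    {"text": "first line of speech", "start": 0.08, "duration": 3.2},
    {"text": "second line of speech", "start": 83.9, "duration": 2.5},
    {"text": "a line after one hour", "start": 3725.0, "duration": 4.0},
]

print(build_timestamped_transcript(sample_segments))
```

With real data, `build_timestamped_transcript(transcript_segments)` could replace the plain join, at the cost of a somewhat longer prompt.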

In [53]:
transcript_text[:500]

"this 400-year old book should have\nchanged mathematics Forever This Is The\nSwiss clockmaker Jos bergy's arithmetic\nand geometric progression tables the\nbook contains an ingenious mathematical\nHack That Bergie called red numbers and\nthe design of a powerful Computing\ndevice that uses these red numbers\nhiding on its title page bergy's hack\nworks by constructing an enormous table\nof numbers where each number is simply\nthe previous number time\n1.001 starting at one and repeating this\noperation again"

Create a structured prompt embedding the transcript and instructions for generating chapters.

In [54]:
prompt = (
    "Based on the following transcript, generate a chapter list following these instructions:\n"
    "1. Identify key topic shifts and assign each a starting timestamp in MM:SS format.\n"
    "2. Format each chapter as '<timestamp> <chapter title>' (e.g., '00:00 Introduction').\n"
    "3. Then, review the chapter list and adjust or remove any misaligned chapter boundaries.\n"
    "Only output the final chapter list without extra commentary.\n\n"
    "### Transcript:\n"
    f"{transcript_text}\n\n"
    "Chapters:"
)

Configure Gemini AI:

In [55]:
gemini_api_key = userdata.get('GEMINI_API_KEY')
if not gemini_api_key:
    raise ValueError("GEMINI_API_KEY was not found in Colab Secrets. Add it, grant this notebook access, and try again.")

genai.configure(api_key=gemini_api_key)
generation_config = {
    "temperature": 0.5,
    "top_p": 0.95,
    "top_k": 64,
    "max_output_tokens": 500,
    "response_mime_type": "text/plain",
}
model = genai.GenerativeModel(
    model_name="gemini-2.0-pro-exp-02-05",
    generation_config=generation_config,
)
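
The cell above reads the key from Colab Secrets, which only works inside Colab. As a hedged sketch for running the same code elsewhere, the helper below tries Colab Secrets first and then falls back to an environment variable. The function name `load_gemini_api_key` and its `env` parameter are hypothetical and not part of the notebook.

```python
import os


def load_gemini_api_key(env=None, secret_name="GEMINI_API_KEY"):
    """Hypothetical helper: read the key from Colab Secrets, else from env."""
    env = os.environ if env is None else env

    try:
        # Only importable when running inside Google Colab.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(secret_name)
            if key:
                return key
        except Exception:
            # Caught broadly on purpose: the secret may be missing or access
            # may not be granted; in either case fall back to the environment.
            pass

    key = env.get(secret_name)
    if not key:
        raise ValueError(
            f"{secret_name} was not found in Colab Secrets or the environment."
        )
    return key
```

The returned value could then be passed to `genai.configure(api_key=...)` exactly as in the cell above.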

Send the prompt to Gemini AI and print the generated chapters.

In [56]:
chat_session = model.start_chat(history=[])
gemini_response = chat_session.send_message(prompt)
print("Generated Chapters:\n")
print(gemini_response.text)
print("Generated using free 'GenAI ChapterCraft' tool.")

Generated Chapters:

00:00 Introduction
00:22 Bergie's Black and Red Numbers
01:10 Multiplication with Bergie's Tables
02:05 Other Mathematical Operations
02:45 Bergie's Title Page and the Slide Rule
03:23 How the Slide Rule Works
04:38 Division with the Slide Rule
05:12 Limitations and Real-World Use
05:30 Bergie's Secrecy and Kepler's Frustration
06:04 John Napier and Logarithms
06:46 Logarithms Today and Brilliant.org Promotion

Generated using free 'GenAI ChapterCraft' tool.
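
Model output is not guaranteed to follow the requested format, so it can be worth checking the chapter list before pasting it into a video description. The sketch below parses `<timestamp> <chapter title>` lines and flags a few common problems. The helper names `parse_chapters` and `check_chapters` are hypothetical, and the checks (at least three chapters, first chapter at 00:00, ascending order, at least 10 seconds apart) reflect YouTube's published chapter guidelines as understood at the time of writing; treat them as assumptions and verify against the current rules.

```python
import re

# Matches 'MM:SS Title' or 'H:MM:SS Title'.
CHAPTER_LINE = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{2})\s+(.+)$")


def parse_chapters(text):
    """Return (start_seconds, title) pairs for every line that looks like a chapter."""
    chapters = []
    for line in text.splitlines():
        match = CHAPTER_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and any commentary
        hours, minutes, seconds, title = match.groups()
        start = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
        chapters.append((start, title.strip()))
    return chapters


def check_chapters(chapters, min_gap=10):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if len(chapters) < 3:
        problems.append("fewer than three chapters")
    if chapters and chapters[0][0] != 0:
        problems.append("first chapter does not start at 00:00")
    for (previous, _), (current, title) in zip(chapters, chapters[1:]):
        if current - previous < min_gap:
            problems.append(f"'{title}' starts less than {min_gap}s after the previous chapter")
    return problems


sample_output = """00:00 Introduction
00:22 Bergie's Black and Red Numbers
01:10 Multiplication with Bergie's Tables

Generated using free 'GenAI ChapterCraft' tool."""

chapters = parse_chapters(sample_output)
print(chapters)
print(check_chapters(chapters))
```

In the notebook, `parse_chapters(gemini_response.text)` would be the natural input. These checks only cover formatting and spacing; they cannot tell whether a timestamp actually matches the topic shift in the video.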
