[![Labellerr](https://storage.googleapis.com/labellerr-cdn/%200%20Labellerr%20template/notebook.webp)](https://www.labellerr.com)

# **YT Video Blog Generator**

---

[![labellerr](https://img.shields.io/badge/Labellerr-BLOG-black.svg)](https://www.labellerr.com/blog/<BLOG_NAME>)
[![Youtube](https://img.shields.io/badge/Labellerr-YouTube-b31b1b.svg)](https://www.youtube.com/@Labellerr)
[![Github](https://img.shields.io/badge/Labellerr-GitHub-green.svg)](https://github.com/Labellerr/Hands-On-Learning-in-Computer-Vision)

This notebook implements an automated multi-agent pipeline that transforms YouTube videos into professional blog articles using CrewAI. The system employs three specialized AI agents working in sequence: a transcript extractor that retrieves and structures video content, an outline drafter that organizes the material into a logical blog structure, and a content writer that generates polished, publication-ready articles.

The pipeline begins by extracting transcripts from any YouTube URL format using a custom tool that handles various video types (shorts, live streams, embeds). The extracted content then flows through a structured workflow where agents analyze, organize, and refine the material while maintaining the original video's educational value and insights.

Key features include per-stage LLM temperature settings, writing guidelines aimed at natural, human-sounding prose, and automatic saving of the generated article to a timestamped Markdown file. Repurposing a video this way replaces manual transcription and drafting while keeping the article faithful to the source material. The modular design makes it straightforward to swap agents, prompts, or models to handle other content types and writing styles.

### Install required libraries

In [2]:
# !pip install crewai python-dotenv youtube-transcript-api

### 📦 Imports and Environment Setup

This cell imports all necessary libraries and dependencies for the YouTube to Blog conversion system:

- **crewai**: Core framework for creating AI agents, tasks, and crews
- **dotenv**: For loading environment variables from .env file
- **youtube_transcript_api**: For extracting transcripts from YouTube videos
- **re**: Regular expressions for URL parsing
- **os**: Operating system interface for environment variables

This setup provides the foundation for the multi-agent system that will process YouTube videos and convert them into blog posts.

In [3]:
# ===== Imports and Environment setup =====
from crewai import Agent, Task, Crew, Process, LLM
from crewai.tools import tool
import os
from dotenv import load_dotenv
import re
from youtube_transcript_api import YouTubeTranscriptApi


### 🔑 Load API Key from .env

This cell loads the Gemini API key from the environment variables:

- Uses `load_dotenv()` to read the `.env` file
- Retrieves the `GEMINI_API_KEY` environment variable
- Stores it for later use by the LLM configurations

**Important**: Make sure your `.env` file contains `GEMINI_API_KEY=your_actual_key_here` for the system to work properly.

In [4]:
# ===== Load API Key from .env =====
# Make sure you have a .env file with a line: GEMINI_API_KEY=your_key_here
load_dotenv()
GEMINI_API_KEY = os.getenv('GEMINI_API_KEY')
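
`os.getenv` returns `None` when the variable is missing, so a forgotten `.env` entry only surfaces later as an authentication error. A minimal sketch of a fail-fast check follows; the helper name `require_env` is hypothetical and not part of the notebook or of `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        # Unset and empty values are treated alike: neither is a usable key.
        raise RuntimeError(
            f"{name} is not set. Add a line like {name}=your_key_here to your .env file."
        )
    return value
```

Called after `load_dotenv()`, this could replace the plain lookup: `GEMINI_API_KEY = require_env("GEMINI_API_KEY")`.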


### 🎬 YouTube Transcript Extractor Tool

This cell defines a custom tool for extracting transcripts from YouTube videos:

**Key Features:**
- Supports multiple YouTube URL formats (watch, youtu.be, embed, shorts, live, mobile)
- Extracts 11-character video ID using regex patterns
- Fetches English transcripts, falling back to auto-generated if needed
- Returns the transcript snippets joined into a single text string

**Error Handling:**
- Returns descriptive error messages if video ID extraction fails
- Handles exceptions during transcript fetching gracefully

This tool is essential for the first agent to obtain raw content from YouTube videos.

In [7]:
@tool("YouTube Transcript Extractor")
def extract_youtube_transcript(url: str) -> str:
    """
    Extract transcript from any YouTube URL format.
    
    Supports: watch, youtu.be, embed, shorts, live, mobile URLs.
    Returns English transcript text of video.
    """
    try:
        # Extract 11-char video ID from any YouTube URL format
        video_id = None
        
        # Try query parameter first (watch URLs)
        if 'v=' in url:
            video_id = re.search(r'v=([a-zA-Z0-9_-]{11})', url)
        
        # Try path-based formats (youtu.be, embed, shorts, live, v)
        if not video_id:
            video_id = re.search(r'(?:youtu\.be/|embed/|shorts/|live/|/v/)([a-zA-Z0-9_-]{11})', url)
        
        if not video_id:
            return "ERROR: Could not extract video ID from URL."
        
        # print(video_id.group(1))
        
        video_id = video_id.group(1)
        
        api = YouTubeTranscriptApi()
        transcript_list = api.list(video_id)

        # Try fetching English transcript, fallback to auto-generated
        try:
            transcript = transcript_list.find_transcript(['en']).fetch()
        except Exception:
            transcript = transcript_list.find_generated_transcript(['en']).fetch()

        # Each entry is now a FetchedTranscriptSnippet object
        # Access its .text attribute instead of ['text']
        text = ' '.join([snippet.text for snippet in transcript])

        return text
        
    except Exception as e:
        return f"ERROR: {e}"
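
The ID-extraction step can be checked on its own, without any network access. The sketch below reuses the two regular expressions from the tool in a standalone function; the name `extract_video_id` and the `abcdefghijk` ID are made up for illustration.

```python
import re
from typing import Optional


def extract_video_id(url: str) -> Optional[str]:
    """Return the 11-character video ID from a YouTube URL, or None if absent."""
    # Query-parameter form first (watch URLs), as in the tool above.
    match = re.search(r"v=([a-zA-Z0-9_-]{11})", url)
    if not match:
        # Path-based forms: youtu.be, embed, shorts, live, /v/.
        match = re.search(
            r"(?:youtu\.be/|embed/|shorts/|live/|/v/)([a-zA-Z0-9_-]{11})", url
        )
    return match.group(1) if match else None


print(extract_video_id("https://www.youtube.com/watch?v=lEXkHi3NYxo"))       # lEXkHi3NYxo
print(extract_video_id("https://youtu.be/qU3Rc6_B9es?si=PbQHMxaZrM9o3ocA"))  # qU3Rc6_B9es
print(extract_video_id("https://www.youtube.com/shorts/abcdefghijk"))        # abcdefghijk
print(extract_video_id("https://example.com/not-a-video"))                   # None
```

Keeping the parsing in a pure function like this makes it easy to add a new URL format and test it before any transcript request is made.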

### 🧪 Test Function (Commented Out)

This cell contains a commented-out test call for the transcript extractor:

- Can be uncommented to test the YouTube transcript extraction
- Useful for debugging URL parsing and transcript fetching
- Replace the URL with any YouTube video to verify the tool works correctly

**Usage Example:**

In [None]:
# test the function
# url = "https://youtu.be/qU3Rc6_B9es?si=PbQHMxaZrM9o3ocA"
# print(extract_youtube_transcript(url))

Today I'm going over how to write production code in Python. This is a topic that's extremely important that many developers honestly never learn and it's the number one thing that separates a junior developer from a senior developer. If you're serious about writing Python code and becoming a Python developer, then definitely watch this video and I'm going to go over a lot of concepts with in-depth examples so you can understand exactly how you can apply them to your own codebase. Okay, let's dive into it. And by the way, if you want access to all of the code that I show in this video, as well as this cheat sheet that I have on screen right now, you can do that by clicking the link in the description. Simply sign up for my newsletter and I will send it over to you for free. So, this video will cover eight main design principles with in-depth examples. We'll quickly go over what those principles are and then we'll dive into the code and explain them more in depth. Number one, cohesion a

### 🤖 LLM Configuration

This cell configures three separate LLM instances with different temperature settings:

- **extracting_llm** (temperature=0.1): For precise transcript extraction tasks
- **outline_llm** (temperature=0.2): For structured outline generation
- **writing_llm** (temperature=0.7): For creative blog writing

**Temperature Settings:**
- Lower temperature = More deterministic, consistent outputs
- Higher temperature = More creative, varied outputs

All three instances use the same Gemini 2.0 Flash Lite model; only the temperature differs, matched to each stage of the content generation pipeline.
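
To make the effect of temperature concrete, here is a small sketch of the textbook softmax-with-temperature formula applied to made-up scores for three candidate tokens. It illustrates the general idea only; how the Gemini API applies temperature internally is not covered by this notebook, and `toy_logits` is invented for the example.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # hypothetical scores, not real model output
for t in (0.1, 0.2, 0.7):  # the three temperatures used in this notebook
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.1 nearly all probability mass lands on the top-scoring token, while at 0.7 the alternatives keep a meaningful share, which is why the writing stage produces more varied phrasing.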

In [8]:
# ===== LLM Setup =====
# CrewAI's LLM class takes the key as `api_key`
extracting_llm = LLM(model='gemini/gemini-2.0-flash-lite',
                     api_key=GEMINI_API_KEY,
                     temperature=0.1)

outline_llm = LLM(model='gemini/gemini-2.0-flash-lite',
                  api_key=GEMINI_API_KEY,
                  temperature=0.2)

writing_llm = LLM(model='gemini/gemini-2.0-flash-lite',
                  api_key=GEMINI_API_KEY,
                  temperature=0.7)


### 👥 Agent Definitions

This cell defines three specialized agents that work together:

#### 1. Extractor Agent
- **Role**: YouTube transcript extractor
- **Goal**: Extract and structure complete transcripts from YouTube URLs
- **Tools**: Uses the YouTube transcript extractor tool
- **Output**: Structured JSON with video title, URL, segments, and full text

#### 2. Blog Outline Drafting Agent
- **Role**: Content structure specialist
- **Goal**: Create detailed blog outlines from transcripts
- **Output**: Structured outline with title, introduction, sections, and conclusion
- **Personality**: Analytical, structured, reader-oriented

#### 3. Blog Writer Agent
- **Role**: Professional content writer
- **Goal**: Transform outlines into polished, human-like blog articles
- **Writing Guidelines**: Professional tone, active voice, natural flow
- **Personality**: Polished, thoughtful, articulate

Each agent has specific backstory, tasks, and expected output formats tailored to their role.

In [9]:
# ===== Agent Definitions =====
extractor_agent = Agent(
    role = "youtube_transcript_extractor_agent",
    goal = (
        "Extract the complete transcript from a given YouTube video URL and provide it in a clean, structured format. "
        "The output should include speaker timestamps (if available) and the full textual content of the video without summarization. "
        "This transcript will serve as the raw source material for downstream agents such as the blog_outline_drafting_agent."
    ),
    backstory = (
        "This agent was developed to specialize in transforming video content into structured text data. "
        "Unlike typical summarization agents, it preserves the full integrity of the spoken content, ensuring no loss of context. "
        "It is capable of handling both short tutorials and long-form educational content. "
        "The agent ensures the transcript is cleaned of any redundant formatting, ads, or irrelevant metadata, "
        "making it ready for analytical or content generation tasks downstream."
    ),
    tasks = [
        "Extract the full transcript from the provided YouTube URL.",
        "Automatically detect and parse all transcript segments, maintaining order and timestamp accuracy.",
        "Combine all segments into a clean, human-readable text format.",
        "Exclude ads, sponsor mentions, or unrelated metadata if detected.",
        "Provide the complete transcript as structured text output."
    ],
    expected_output_format = {
        "video_title": "Title of the YouTube video (if retrievable).",
        "video_url": "Original URL of the video.",
        "transcript": [
            {
                "start": "Timestamp in seconds or readable format (e.g., 00:01)",
                "text": "The spoken content of that segment."
            }
        ],
        "full_text": "Concatenated transcript text for downstream use."
    },
    tools = [extract_youtube_transcript],
    llm = extracting_llm,
    verbose = True
)

blog_outline_drafting_agent = Agent(
    role = "blog_outline_drafting_agent",
    goal = (
        "From the transcript provided by the extractor agent (derived from a YouTube video), "
        "create a structured and reader-friendly blog outline that serves as a foundation for the next agent to write the full article. "
        "The outline should cover key points, insights, and subtopics that make the content engaging, educational, "
        "and easy to understand for both students and professionals."
    ),
    backstory = (
        "This agent was designed by a content intelligence team specializing in transforming spoken knowledge into "
        "readable, structured formats. Its expertise lies in identifying meaningful segments, extracting insights, and "
        "organizing them logically into an outline. Drawing inspiration from industry best practices in content marketing, "
        "it ensures that each blog outline not only mirrors the intent of the speaker but also optimizes readability for diverse audiences. "
        "Whether the transcript is from a tech tutorial, a product walkthrough, or an expert discussion, the agent intelligently "
        "structures the flow, from introduction and context to takeaways and actionable insights."
    ),
    tasks = [
        "Analyze the provided YouTube transcript for key themes, sections, and insights.",
        "Group related ideas into logical sections (Introduction, Main Concepts, Applications, Summary, Takeaways, etc.).",
        "Highlight learning objectives, definitions, or frameworks discussed in the video.",
        "Ensure the outline is detailed enough for another writing agent to develop a complete blog post from it.",
        "Maintain a balance between technical accuracy and readability."
    ],
    expected_output_format = {
        "Title": "A short and clear title representing the video/blog topic.",
        "Introduction": [
            "Brief overview of the topic and its importance.",
            "Target audience and why it matters to them."
        ],
        "Section 1": "Main concept or discussion point #1 with sub-bullets explaining what to cover.",
        "Section 2": "Main concept or discussion point #2 with sub-bullets explaining what to cover.",
        "Additional Sections": [
            "Examples, case studies, or visual explanation ideas.",
            "Common mistakes, FAQs, or myths if mentioned."
        ],
        "Conclusion": [
            "Key takeaways or final summary points.",
            "Call-to-action or suggested next steps (e.g., related resources, tutorials)."
        ]
    },
    personality = (
        "Analytical, structured, and reader-oriented. Communicates in a clear and concise way, "
        "with an understanding of content flow and audience engagement."
    ),
    output_example = (
        "Title: How AI is Transforming Video Analytics\n\n"
        "Introduction:\n- Explain the growing role of AI in video data processing.\n"
        "- Mention industries using AI-powered analytics.\n\n"
        "Section 1: Understanding Video Analytics\n- Definition and key components.\n"
        "- Traditional vs AI-based approaches.\n\n"
        "Section 2: Core Technologies Behind AI Video Analysis\n- Computer Vision and Deep Learning models.\n"
        "- Real-world examples like YOLO and MediaPipe.\n\n"
        "Section 3: Applications Across Industries\n- Security and surveillance.\n- Retail analytics.\n- Sports performance analysis.\n\n"
        "Conclusion:\n- Summarize the impact of AI.\n- Encourage readers to explore more about real-time video intelligence."
    ),
    llm=outline_llm,
)

blog_writer_agent = Agent(
    role = "blog_writer_agent",
    goal = (
        "Using the outline created by the blog_outline_drafting_agent, craft a complete blog article that reads naturally, "
        "reflects a human writing tone, and engages readers with clarity and depth. "
        "The writing should be professional, coherent, and free from AI-sounding expressions, filler language, or overused transitions. "
        "Each paragraph should flow logically and deliver value, ensuring that both students and professionals can easily follow and learn."
    ),
    backstory = (
        "This agent is a specialized long-form content creator trained on top-tier editorial standards. "
        "It was developed to replicate the style of expert technical writers, industry bloggers, and thought leaders. "
        "Instead of producing formulaic AI responses, it emphasizes natural phrasing, sentence variety, and contextual understanding. "
        "It pays close attention to flow, pacing, and readability, ensuring that the content feels written by a skilled human writer "
        "with subject expertise and narrative control. "
        "Its writing is direct, engaging, and always aligned with the audience's knowledge level."
    ),
    tasks = [
        "Read and interpret the provided blog outline.",
        "Expand each section into well-structured paragraphs with smooth transitions.",
        "Use a professional tone suitable for both technical and general readers.",
        "Avoid robotic phrasing, repetitive structures, or generic AI-generated wording.",
        "Incorporate examples, insights, or simplified analogies where needed.",
        "Maintain consistency in formatting, heading hierarchy, and grammar quality.",
        "Produce clean, publication-ready text suitable for blogs or newsletters."
    ],
    writing_guidelines = {
        "Tone": "Professional, clear, and informative without being overly formal.",
        "Style": "Active voice, well-paced paragraphs, natural flow of ideas.",
        "Prohibited Elements": [
            "Overuse of conjunctions like 'Moreover', 'Additionally', 'In conclusion'.",
            "AI disclaimers or self-referential statements.",
            "Long or unnatural sentences.",
            "Unnecessary dashes or em-dashes."
        ],
        "Formatting": {
            "Headings": "Follow logical hierarchy (H1, H2, H3).",
            "Paragraph Length": "3 to 5 lines maximum.",
            "Readability Target": "Flesch reading ease between 55 to 70 (balanced for professionals and students)."
        }
    },
    expected_output_format = {
        "Title": "Final blog title reflecting clarity and SEO-friendliness.",
        "Introduction": "Engaging context setting up the main idea and value proposition.",
        "Body": [
            "Multiple sections expanding on each outline point.",
            "Include insights, examples, or explanations."
        ],
        "Conclusion": "Clear wrap-up summarizing the learning or insights, possibly ending with a reflective note or a call to action."
    },
    personality = (
        "Polished, thoughtful, and articulate. "
        "Writes with purpose and rhythm, ensuring that every line contributes to the reader's understanding. "
        "Balances technical precision with readability."
    ),
    output_example = (
        "Title: The Future of AI in Visual Recognition\n\n"
        "Artificial Intelligence has reshaped how machines interpret the world around us. "
        "From identifying faces in security systems to detecting diseases in medical scans, visual recognition has evolved rapidly. "
        "In this article, we explore the key breakthroughs driving this change and what they mean for the next generation of applications.\n\n"
        "Modern visual recognition relies on deep learning frameworks trained on millions of images. "
        "These systems learn patterns, colors, and object relationships, allowing them to perform tasks that once required human perception. "
        "Frameworks such as YOLO and OpenCV have made real-time image processing accessible to developers and researchers alike.\n\n"
        "As the technology matures, industries from retail to healthcare continue to integrate AI-driven vision systems to improve accuracy and efficiency. "
        "The next phase of innovation will focus on reducing bias, improving interpretability, and integrating visual data with contextual understanding.\n\n"
        "The journey of AI in visual recognition has just begun. "
        "With growing computational power and smarter models, the boundary between human and machine perception continues to blur."
    ),
    llm=writing_llm,
)



### 📋 Task Definitions

This cell defines the workflow tasks that connect the agents:

#### 1. Extract Transcript Task
- **Agent**: Extractor Agent
- **Input**: YouTube video URL
- **Output**: Structured transcript data
- **Output Key**: `transcript_data`

#### 2. Draft Blog Outline Task
- **Agent**: Blog Outline Drafting Agent
- **Context**: Depends on transcript extraction task
- **Input**: Transcript data from previous task
- **Output**: Structured blog outline
- **Output Key**: `blog_outline`

#### 3. Write Blog Task
- **Agent**: Blog Writer Agent
- **Context**: Depends on outline drafting task
- **Input**: Blog outline from previous task
- **Output**: Complete blog article
- **Output Key**: `final_blog`

The tasks form a sequential pipeline where each task depends on the output of the previous one.
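
The sequential hand-off can be pictured with plain functions. In this sketch the three stage functions are hypothetical stand-ins for the agents (they only build strings), and `run_sequential` is an illustrative helper, not CrewAI's scheduler.

```python
from typing import Callable


# Hypothetical stand-ins for the three agents; each maps text to text.
def extract(url: str) -> str:
    return f"transcript of {url}"


def outline(transcript: str) -> str:
    return f"outline based on [{transcript}]"


def write(outline_text: str) -> str:
    return f"article expanding [{outline_text}]"


def run_sequential(steps: list[Callable[[str], str]], initial_input: str) -> list[str]:
    """Run steps in order, feeding each output into the next; return every output."""
    outputs = []
    current = initial_input
    for step in steps:
        current = step(current)
        outputs.append(current)
    return outputs


outputs = run_sequential(
    [extract, outline, write], "https://www.youtube.com/watch?v=lEXkHi3NYxo"
)
print(outputs[-1])
```

Returning every intermediate output, not just the last, mirrors the way the pipeline keeps the transcript and outline available alongside the final article.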

In [10]:
extract_transcript_task = Task(
    name="Extract YouTube Transcript",
    description=(
    "Given a YouTube video url {url}, extract the full transcript. "
    "This transcript will be cleaned and structured for downstream use."
    ),
    expected_output=(
    "A structured JSON object containing video title, URL, transcript segments, "
    "and the concatenated full text."
    ),
    agent=extractor_agent,
    inputs={"video_url": "<YOUTUBE_VIDEO_URL>"},
    output_key="transcript_data"
)

draft_blog_outline_task = Task(
    name="Draft Blog Outline from Transcript",
    description=(
    "Analyze the transcript extracted from the YouTube video and generate "
    "a detailed blog outline covering the main topics, subtopics, and takeaways."
    ),
    expected_output=(
    "A structured blog outline JSON with title, introduction, sections, "
    "and conclusion ready for the writing agent."
    ),
    context=[extract_transcript_task],
    agent=blog_outline_drafting_agent,
    inputs={"transcript_data": "{{ Extract YouTube Transcript.output }}"},
    output_key="blog_outline"
)

write_blog_task = Task(
    name="Write Complete Blog Article",
    description=(
    "Using the blog outline generated from the previous step, write a polished, "
    "engaging, and human-like blog article with natural flow, examples, and insights."
    ),
    expected_output=(
    "A fully written blog article with title, introduction, main body sections, "
    "and conclusion. The text should be publication-ready."
    ),
    context=[draft_blog_outline_task],
    agent=blog_writer_agent,
    inputs={"blog_outline": "{{ Draft Blog Outline from Transcript.output }}"},
    output_key="final_blog"
)

### 🚀 Crew Assembly

This cell creates the main crew that orchestrates the entire process:

- **Name**: YouTube-to-Blog Conversion Crew
- **Description**: Automates conversion of YouTube videos to blog posts
- **Agents**: All three defined agents (extractor, outline drafter, writer)
- **Tasks**: All three defined tasks in sequence
- **Process**: Sequential execution
- **Verbose**: True (shows detailed execution logs)

The crew manages the workflow, ensuring tasks are executed in the correct order and agents receive the proper inputs.

In [11]:
# Create the crew to manage the three agents and their tasks
blog_generation_crew = Crew(
    name="YouTube-to-Blog Conversion Crew",
    description=(
        "This crew automates the process of converting a YouTube video into a complete blog post. "
        "It works in three stages: extracting the transcript, drafting a structured blog outline, "
        "and writing a polished, human-like blog article."
    ),
    agents=[
        extractor_agent,
        blog_outline_drafting_agent,
        blog_writer_agent
    ],
    tasks=[
        extract_transcript_task,
        draft_blog_outline_task,
        write_blog_task
    ],
    process=Process.sequential,
    verbose=True
)


### ▶️ Crew Execution

This cell runs the complete YouTube to blog conversion process:

- **Input**: YouTube video URL (replace with your target URL)
- **Process**: Kicks off the sequential crew workflow
- **Result**: Stores the complete output including all intermediate steps
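
The `{url}` placeholder in the first task's description is filled from the `inputs` dictionary passed to `kickoff`, so the key name has to match the placeholder name. The sketch below shows that idea with Python's own string formatting; `placeholders` and `fill` are hypothetical helpers written for this illustration and are not CrewAI functions.

```python
from string import Formatter


def placeholders(template: str) -> set[str]:
    """Return the names of all {name} placeholders in a template string."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def fill(template: str, values: dict[str, str]) -> str:
    """Fill a template, failing early if any placeholder has no matching key."""
    missing = placeholders(template) - set(values)
    if missing:
        raise KeyError(f"missing inputs for placeholders: {sorted(missing)}")
    return template.format(**values)


description = "Given a YouTube video url {url}, extract the full transcript."
print(placeholders(description))
print(fill(description, {"url": "https://www.youtube.com/watch?v=lEXkHi3NYxo"}))
```

Passing `{"video_url": ...}` to `fill` instead would raise a `KeyError`, which is the kind of name mismatch worth checking for before starting a run.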

In [12]:
# Replace with your actual YouTube URL when running

url = "https://www.youtube.com/watch?v=lEXkHi3NYxo"

result = blog_generation_crew.kickoff(
    inputs={"url": url}
)
result

CrewOutput(raw='```markdown\n# Create Your Own LinkedIn Post Generator Agent with CrewAI\n\n## Introduction\n\nAre you tired of spending valuable time manually crafting LinkedIn posts? The process of creating engaging content, finding relevant hashtags, and scheduling posts can be incredibly time-consuming. In today\'s fast-paced digital world, maintaining a consistent social media presence is crucial for professionals and businesses alike. But what if you could automate this process, saving time and significantly enhancing your LinkedIn presence?\n\nThis blog post provides a step-by-step guide on building a LinkedIn post generator agent using CrewAI. We\'ll dive into the problem of manual LinkedIn posting, explore the benefits of an automated solution, and walk through the creation of an agent that can extract content from a blog post, summarize it, and generate a professional LinkedIn post ready to share. By the end of this tutorial, you\'ll have the knowledge and tools to automate y

In [13]:
# Access outputs
print(result.raw)

```markdown
# Create Your Own LinkedIn Post Generator Agent with CrewAI

## Introduction

Are you tired of spending valuable time manually crafting LinkedIn posts? The process of creating engaging content, finding relevant hashtags, and scheduling posts can be incredibly time-consuming. In today's fast-paced digital world, maintaining a consistent social media presence is crucial for professionals and businesses alike. But what if you could automate this process, saving time and significantly enhancing your LinkedIn presence?

This blog post provides a step-by-step guide on building a LinkedIn post generator agent using CrewAI. We'll dive into the problem of manual LinkedIn posting, explore the benefits of an automated solution, and walk through the creation of an agent that can extract content from a blog post, summarize it, and generate a professional LinkedIn post ready to share. By the end of this tutorial, you'll have the knowledge and tools to automate your social media efforts a

In [14]:
from datetime import datetime

# Create a timestamped filename
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
output_path = f"generated_blog_{timestamp}.md"

# Write the blog content to a Markdown file
with open(output_path, "w", encoding="utf-8") as f:
    f.write(result.raw)

print(f"✅ Blog saved successfully at: {output_path}")

✅ Blog saved successfully at: generated_blog_2025-10-27_14-31-43.md
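
The printed output above shows that the model wrapped the article in a Markdown code fence, and that wrapper would be written to the `.md` file as-is. A minimal sketch of a cleanup step follows, assuming the text both opens and closes with a fence line; `strip_markdown_fence` is a hypothetical helper, and text without a matching pair of fence lines is left untouched.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def strip_markdown_fence(text: str) -> str:
    """Remove one wrapping code fence (opening and closing line) if both are present."""
    lines = text.strip().splitlines()
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        lines = lines[1:-1]
    # Normalize to a single trailing newline for a clean file ending.
    return "\n".join(lines).strip() + "\n"


sample = f"{FENCE}markdown\n# Title\n\nBody text.\n{FENCE}"
print(strip_markdown_fence(sample))
```

In the save cell this would mean writing `strip_markdown_fence(result.raw)` instead of `result.raw`.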
