[![Labellerr](https://storage.googleapis.com/labellerr-cdn/%200%20Labellerr%20template/notebook.webp)](https://www.labellerr.com)

# **Twitter Post Generator Agent**

---

[![labellerr](https://img.shields.io/badge/Labellerr-BLOG-black.svg)](https://www.labellerr.com/blog/<BLOG_NAME>)
[![Youtube](https://img.shields.io/badge/Labellerr-YouTube-b31b1b.svg)](https://www.youtube.com/@Labellerr)
[![Github](https://img.shields.io/badge/Labellerr-GitHub-green.svg)](https://github.com/Labellerr/Hands-On-Learning-in-Computer-Vision)

This notebook implements an automated system for generating Twitter/X posts from blog content or YouTube videos using CrewAI and Google's Gemini models. Given a URL, one agent extracts and summarizes the content, and a second agent drafts a concise post of at most 280 characters.

## Dependencies Installation

This cell lists the required Python packages (uncomment the `pip` line to install them):
- crewai: Framework for creating AI agent workflows
- python-dotenv: For managing environment variables
- requests: For making HTTP requests
- beautifulsoup4: For parsing HTML content
- youtube-transcript-api: For extracting YouTube video transcripts

In [182]:
# ===== Install Dependencies =====
# !pip install crewai python-dotenv requests beautifulsoup4 youtube-transcript-api


## Import Required Libraries and Modules

This cell imports all necessary libraries for:
- CrewAI components (Agent, Task, Crew, Process, LLM) and the `tool` decorator
- Web scraping and HTTP requests (requests, BeautifulSoup)
- Environment configuration (os, dotenv)
- Regular-expression matching for video IDs (re)
- YouTube transcript extraction (YouTubeTranscriptApi)

In [183]:
# ===== Imports and Environment setup =====
from crewai import Agent, Task, Crew, Process, LLM
from crewai.tools import tool
import requests
from bs4 import BeautifulSoup
import os
from dotenv import load_dotenv
import re
from youtube_transcript_api import YouTubeTranscriptApi


## API Key Configuration

This cell loads the Google Gemini API key from a .env file. Make sure to:
1. Create a .env file in the same directory as this notebook
2. Add your Gemini API key: `GEMINI_API_KEY=your_key_here`
3. Keep the .env file secure and never commit it to version control

In [184]:
# ===== Load API Key from .env =====
# Make sure you have a .env file with a line: GEMINI_API_KEY=your_key_here
load_dotenv()
GEMINI_API_KEY = os.getenv('GEMINI_API_KEY')
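
`os.getenv` returns `None` when the variable is missing, so a misconfigured `.env` file only surfaces later as an authentication error. The sketch below shows one way to fail fast instead. The helper name `require_env` is hypothetical and not part of the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty.

    `require_env` is a hypothetical helper, not part of the notebook or any library.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add a line '{name}=your_key_here' to your .env file."
        )
    return value
```

After `load_dotenv()`, the assignment above could then be written as `GEMINI_API_KEY = require_env('GEMINI_API_KEY')`.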


## Blog Content Extraction Tool

This tool extracts text from blog posts and web articles:
- Makes an HTTP request to the provided URL (15-second timeout)
- Parses the HTML with BeautifulSoup
- Extracts text from paragraph (`<p>`) tags
- Falls back to the full body text if the paragraphs yield fewer than 200 characters
- Limits output to 4000 characters
- Returns an `ERROR extracting content: ...` string instead of raising when the request or parsing fails

In [185]:
# ===== Blog Content Extractor Tool =====
@tool('Content Extractor')
def extract_content_from_url(url: str) -> str:
    '''Extract main textual info from blog URLs.'''
    try:
        resp = requests.get(url, timeout=15)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, 'html.parser')
        text = '\n'.join([p.get_text(separator=' ', strip=True) for p in soup.find_all('p')])
        if len(text) < 200 and soup.body:
            text = soup.body.get_text(separator=' ', strip=True)
        return text[:4000]
    except Exception as e:
        return f"ERROR extracting content: {e}"
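
The paragraph-first, fall-back-to-everything logic can be tried offline without a network call. The sketch below swaps BeautifulSoup for the standard library's `html.parser` so that it runs with no extra packages. The names `TextCollector` and `extract_text` are hypothetical. Skipping `<script>` and `<style>` content and falling back to all document text (rather than only `<body>`) are assumptions of this sketch, not behavior of the tool above.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text inside <p> tags and, separately, all visible text."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # one string per <p> element
        self.all_text = []    # every non-empty text node outside script/style
        self._in_p = False
        self._skip = 0        # nesting depth inside <script>/<style>
        self._buffer = []     # text nodes of the paragraph being read

    def flush_paragraph(self):
        """Store the current paragraph buffer, if any, and reset it."""
        if self._buffer:
            self.paragraphs.append(" ".join(self._buffer))
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "p":
            self.flush_paragraph()  # tolerate an unclosed previous <p>
            self._in_p = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = max(0, self._skip - 1)
        elif tag == "p":
            self.flush_paragraph()
            self._in_p = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip:
            return
        self.all_text.append(text)
        if self._in_p:
            self._buffer.append(text)


def extract_text(html: str, min_chars: int = 200, max_chars: int = 4000) -> str:
    """Prefer <p> text; fall back to all text when paragraphs are too short."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    collector.flush_paragraph()
    text = "\n".join(collector.paragraphs)
    if len(text) < min_chars:
        text = " ".join(collector.all_text)
    return text[:max_chars]
```

The defaults of 200 and 4000 mirror the thresholds used by the tool above; lowering `min_chars` makes it easy to exercise both branches on short test strings.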


## YouTube Transcript Extraction Tool

This tool extracts the transcript from a YouTube video:
- Supports multiple URL formats (watch, youtu.be, embed, shorts, live, mobile)
- Extracts the 11-character video ID using regex patterns
- Looks for an English transcript first
- Falls back to the auto-generated English transcript if that lookup fails
- Returns the transcript snippets joined into a single string
- Returns an `ERROR: ...` string for invalid URLs or missing transcripts

In [186]:
@tool("YouTube Transcript Extractor")
def extract_youtube_transcript(url: str) -> str:
    """
    Extract transcript from any YouTube URL format.
    
    Supports: watch, youtu.be, embed, shorts, live, mobile URLs.
    Returns up to 4000 characters of English transcript text.
    """
    try:
        # Extract 11-char video ID from any YouTube URL format
        video_id = None
        
        # Try query parameter first (watch URLs)
        if 'v=' in url:
            video_id = re.search(r'v=([a-zA-Z0-9_-]{11})', url)
        
        # Try path-based formats (youtu.be, embed, shorts, live, v)
        if not video_id:
            video_id = re.search(r'(?:youtu\.be/|embed/|shorts/|live/|/v/)([a-zA-Z0-9_-]{11})', url)
        
        if not video_id:
            return "ERROR: Could not extract video ID from URL."
        
        video_id = video_id.group(1)
        
        api = YouTubeTranscriptApi()
        transcript_list = api.list(video_id)

        # Try fetching English transcript, fallback to auto-generated
        try:
            transcript = transcript_list.find_transcript(['en']).fetch()
        except Exception:
            transcript = transcript_list.find_generated_transcript(['en']).fetch()

        # Each entry is now a FetchedTranscriptSnippet object
        # Access its .text attribute instead of ['text']
        text = ' '.join([snippet.text for snippet in transcript])

        # Cap the output at 4000 characters, as the docstring promises
        return text[:4000]
        
    except Exception as e:
        return f"ERROR: {e}"
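
The video ID parsing can be tested on its own, without calling the transcript API. The sketch below reuses the two regex patterns from the tool in a standalone function; the name `parse_video_id` is hypothetical. Anchoring the query pattern to `?` or `&` is a small tightening assumed here, not something the original code does.

```python
import re
from typing import Optional

# Query-string form: ...watch?v=<id> or ...&v=<id>
_QUERY_ID = re.compile(r"[?&]v=([a-zA-Z0-9_-]{11})")
# Path-based forms: youtu.be/<id>, /embed/<id>, /shorts/<id>, /live/<id>, /v/<id>
_PATH_ID = re.compile(r"(?:youtu\.be/|embed/|shorts/|live/|/v/)([a-zA-Z0-9_-]{11})")


def parse_video_id(url: str) -> Optional[str]:
    """Return the 11-character YouTube video ID, or None if no pattern matches."""
    match = _QUERY_ID.search(url) or _PATH_ID.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    for candidate in (
        "https://www.youtube.com/watch?v=GWB9ApTPTv4",
        "https://youtu.be/GWB9ApTPTv4",
        "https://www.labellerr.com/blog/aios-explained/",
    ):
        print(candidate, "->", parse_video_id(candidate))
```

Keeping the parsing separate from the network call makes it straightforward to add a new URL format and check it with a one-line assertion.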

## LLM Configuration

Setting up two instances of Google's Gemini model with different configurations:
1. **Extracting LLM** (gemini-2.0-flash-exp):
   - Used for content extraction and summarization
   - Low temperature (0.1) for more focused, deterministic output

2. **Writing LLM** (gemini-2.5-flash):
   - Used for creative Twitter post generation
   - Higher temperature (0.3) for more creative output

In [187]:
# ===== LLM Setup =====
# CrewAI's LLM class takes the key as `api_key`
extracting_llm = LLM(model='gemini/gemini-2.0-flash-exp',
                     api_key=GEMINI_API_KEY,
                     temperature=0.1)
writing_llm = LLM(model='gemini/gemini-2.5-flash',
                  api_key=GEMINI_API_KEY,
                  temperature=0.3)


## Agent Definitions

Creating two specialized AI agents:

1. **Content Extractor Agent**:
   - Role: Extracts and processes content from URLs
   - Tools: Blog content and YouTube transcript extractors
   - Uses the low-temperature gemini-2.0-flash-exp model

2. **Twitter Writer Agent**:
   - Role: Crafts engaging Twitter/X posts
   - Specializes in viral content with hooks and hashtags
   - Uses the more creative gemini-2.5-flash model
   - Is instructed to keep posts within the 280-character limit

In [188]:
# ===== Agent Definitions =====
extractor_agent = Agent(
    role='Content Extractor',
    goal='Extract main points from either blog or YouTube video URLs using the appropriate extraction method.',
    backstory='Handles both web articles/blogs and YouTube videos, providing a clean summary.',
    tools=[extract_content_from_url, extract_youtube_transcript],
    llm=extracting_llm,
    verbose=True
)

twitter_writer_agent = Agent(
    role='Twitter/X Post Writer',
    goal='Draft a concise, viral Twitter/X post from the extracted content (max 280 chars, emojis, hashtags, and link).',
    backstory='Expert in generating Twitter posts with an actionable hook and hashtags.',
    llm=writing_llm,
    verbose=True
)
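
The extractor agent relies on the LLM to pick the right tool for a given URL. If a deterministic check is preferred, the host name can be inspected before the crew runs. The sketch below is one possible approach; `is_youtube_url` and the `YOUTUBE_HOSTS` set are hypothetical names, and the host list is an assumption that may need extending.

```python
from urllib.parse import urlparse

# Assumed set of YouTube host names; subdomains such as www. and m. also match.
YOUTUBE_HOSTS = {"youtube.com", "youtu.be", "youtube-nocookie.com"}


def is_youtube_url(url: str) -> bool:
    """Return True when the URL's host is a known YouTube domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == known or host.endswith("." + known) for known in YOUTUBE_HOSTS)
```

A check like this could be used to log which tool is expected to run, or to reject unsupported URLs before any model call is made.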


## Task Definitions

Configuring two sequential tasks for the workflow:

1. **Content Extraction Task**:
   - Extracts and summarizes content from the provided URL
   - Handles both blog posts and YouTube videos
   - Creates a concise summary of main points

2. **Twitter Post Writing Task**:
   - Uses the extracted content as context
   - Creates a Twitter/X post with:
     - Key points from the content
     - Appropriate emojis
     - Relevant hashtags
     - Original URL
   - Asks the model to keep the post within Twitter's 280-character limit

In [189]:
# ===== Tasks =====
extract_task = Task(
    description='Extract and summarize the core topic and points from this URL: {url}',
    expected_output='Summary of main points from the content.',
    agent=extractor_agent
)

write_task = Task(
    description='Write a Twitter/X post (max 280 chars) with key points, emojis, hashtags, and include this URL: {url}',
    expected_output='Ready-to-use Twitter/X post copy, with the post text, link, and hashtags each on a separate line.',
    agent=twitter_writer_agent,
    context=[extract_task]  # This ensures it gets the extraction results
)

## Crew Assembly

Creating the CrewAI workflow:
- Combines both agents (Content Extractor and Twitter Writer)
- Configures sequential task execution
- Enables verbose mode for detailed execution tracking
- Ensures proper data flow between tasks

In [190]:
# ===== Assemble Crew Workflow =====
twitter_crew = Crew(
    agents=[extractor_agent, twitter_writer_agent],
    tasks=[extract_task, write_task],
    process=Process.sequential,
    verbose=True
)


## Execution

Run the Twitter post generation workflow:
1. Provide a URL (blog post or YouTube video)
2. Content is extracted and summarized
3. A Twitter/X post is generated
4. The final post is printed

The example below uses a blog post URL; to run it on a YouTube video instead, uncomment the YouTube line and comment out the blog line.

In [192]:
# Run the workflow with a blog or YouTube URL
# url = "https://www.youtube.com/watch?v=GWB9ApTPTv4"
url = "https://www.labellerr.com/blog/aios-explained/"
result = twitter_crew.kickoff(inputs={'url': url})

print(result.raw)

Struggling with multi-agent AI? 🤯 AIOS (AI Agent Operating System) is your solution! It brings OS-level management to LLM-powered agents for scalable, collaborative AI.
🔗 https://www.labellerr.com/blog/aios-explained/
#AIOS #AIAutomation #MultiAgentAI #LLM #Tech
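
The 280-character limit is only requested in the prompts, so nothing in the workflow verifies it. The sketch below adds a simple check after generation. The helper name `post_length_report` is hypothetical, and plain `len()` is only an approximation: X applies its own weighting to links and some characters, so treat the result as a rough guide.

```python
def post_length_report(post: str, limit: int = 280) -> dict:
    """Report whether a post fits within `limit` characters, using plain len().

    This is an approximation; it does not reproduce X's own character weighting.
    """
    length = len(post)
    return {
        "length": length,
        "limit": limit,
        "fits": length <= limit,
        "over_by": max(0, length - limit),
    }
```

Calling `post_length_report(result.raw)` after `kickoff` returns a small dictionary that can be used to decide whether to rerun the writing task.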
