# Building a Social Media Manager with LangGraph: A Tutorial

## Overview

This tutorial guides you through creating a Social Media Manager using LangGraph, demonstrating how to build a stateful application that can process content from multiple sources and generate LinkedIn posts. The agent showcases how to handle complex workflows with multiple content sources, AI-powered content generation, and automated social media posting.

## Motivation

Managing social media content creation involves multiple steps: content sourcing, processing, generation, and posting. LangGraph provides an excellent framework for orchestrating these steps in a maintainable way. This Social Media Manager demonstrates how to create a robust workflow that can handle various content sources while maintaining clear separation of concerns.

## Key Components

1. **StateGraph**: Core workflow manager defining the content processing pipeline
2. **AgentState**: Custom type tracking conversation history, content items, and generated posts
3. **Content Sources**:
   - YouTube transcription
   - Reddit summaries
   - Towards Data Science articles
   - LinkedIn profile posts
   - Audio transcription
4. **LLM Integration**: Uses an LLM for content generation and routing decisions
5. **LinkedIn Automation**: Automated posting using Playwright
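
To make the role of `AgentState` more concrete, here is a framework-free sketch of how a graph runtime can merge the partial updates that nodes return: keys with a reducer are combined, and all other keys are overwritten. The names `REDUCERS`, `append_messages`, and `apply_update` are hypothetical and only approximate what LangGraph does with `add_messages`; treat this as an illustration, not LangGraph's implementation.

```python
from typing import Any, Callable, Dict, List


def append_messages(existing: List[str], new: List[str]) -> List[str]:
    """Reducer: keep the history and append the new messages."""
    return existing + new


# Hypothetical registry of state keys that are merged rather than overwritten.
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {"messages": append_messages}


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new state with a node's partial update applied."""
    merged = dict(state)
    for key, value in update.items():
        if key in REDUCERS and key in merged:
            merged[key] = REDUCERS[key](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["create a post from reddit"], "content_items": None}
state = apply_update(state, {"messages": ["fetched 2 posts"], "content_items": ["summary"]})
print(state)
```

This is why the node functions later in the notebook can return only the keys they changed, such as `{"content_items": [...]}`, instead of a full state object.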

## Method Details

Our Social Media Manager follows a multi-step process:

1. **Content Router**:
   - Analyzes user intent to determine content source
   - Routes to appropriate content fetching node

2. **Content Fetching**:
   - Each source has a dedicated node (YouTube, Reddit, etc.)
   - Handles authentication and content extraction
   - Processes content into a consistent format

3. **Post Generation**:
   - Uses the configured LLM (GPT-4o via OpenAI, or a Llama 3 model via Groq) to transform source content into LinkedIn-style posts
   - Maintains consistent tone and formatting

4. **LinkedIn Posting**:
   - Automated browser control for posting
   - User confirmation before posting
   - Error handling and session management

The workflow is managed by LangGraph, ensuring proper state transitions and error handling throughout the process.
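
The four steps above can be sketched without any framework as a route, fetch, generate, confirm pipeline. Everything in this block is a simplified stand-in: `fetch_reddit`, `fetch_youtube`, `generate_post`, and the keyword-based `route` are hypothetical placeholders, whereas the notebook uses LangGraph nodes, an LLM classifier, and real content sources.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for the content-fetching nodes.
def fetch_reddit(request: str) -> List[str]:
    return [f"reddit summary for: {request}"]


def fetch_youtube(request: str) -> List[str]:
    return [f"youtube transcript for: {request}"]


FETCHERS: Dict[str, Callable[[str], List[str]]] = {
    "reddit": fetch_reddit,
    "youtube": fetch_youtube,
}


def route(request: str) -> str:
    """Keyword-based router; the notebook uses an LLM classifier instead."""
    text = request.lower()
    for source in FETCHERS:
        if source in text:
            return source
    return "exit"


def generate_post(content: str) -> str:
    """Placeholder for the LLM call that rewrites content as a post."""
    return f"POST: {content}"


def run_pipeline(request: str, confirm: Callable[[str], bool]) -> List[str]:
    """Route, fetch, generate, then keep only the posts the user confirms."""
    source = route(request)
    if source == "exit":
        return []
    drafts = [generate_post(item) for item in FETCHERS[source](request)]
    return [draft for draft in drafts if confirm(draft)]


published = run_pipeline("Make a post from Reddit", confirm=lambda draft: True)
print(published)
```

Passing `confirm` as a callable keeps the user-confirmation step separate from generation, which mirrors the notebook's design of asking before anything is posted.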


## Conclusion

This Social Media Manager demonstrates LangGraph's capability to handle complex, real-world applications. The graph-based structure provides clear separation of concerns, making it easy to add new content sources or modify the posting workflow. The integration of AI for content generation and decision-making showcases how modern language models can be effectively incorporated into practical applications.

This example serves as a foundation for developers looking to build sophisticated content management systems, showing how to handle multiple data sources, state management, and automated social media interactions within a structured framework.

## Install Dependencies

In [None]:
!pip install beautifulsoup4 langchain_groq langchain_core langchain_openai langgraph selenium rich langchain_anthropic openai praw python-dotenv youtube_transcript_api gradio langserve PyPDF2 playwright lxml sse_starlette


In [None]:
# install playwright
!playwright install

!apt-get update
!apt-get install -y \
    libwoff1 \
    libharfbuzz-icu0 \
    libgstreamer-plugins-base1.0-0 \
    libgstreamer-gl1.0-0 \
    libgstreamer-plugins-bad1.0-0 \
    libenchant-2-2 \
    libsecret-1-0 \
    libhyphen0 \
    libmanette-0.2-0

### Setup and Imports
First, let's import the necessary modules and set up the environment. Create a `.env` file in the project root and add your API keys. OpenAI gives the best results. If you use Groq instead, set the `GROQ_API_KEY` environment variable and note that the model selected below supports only an 8k context window.
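
As a minimal sketch of the provider choice described above, the helper below prefers OpenAI when its key is present and falls back to Groq. The function names `select_provider` and `find_missing` are hypothetical and not part of the notebook; the environment is passed in as a mapping so the logic can be checked without real keys.

```python
from typing import List, Mapping


def select_provider(env: Mapping[str, str]) -> str:
    """Prefer OpenAI when its key is set, otherwise fall back to Groq."""
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("GROQ_API_KEY"):
        return "groq"
    raise ValueError("Either GROQ_API_KEY or OPENAI_API_KEY must be set")


def find_missing(env: Mapping[str, str], required: List[str]) -> List[str]:
    """List required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


example_env = {"GROQ_API_KEY": "placeholder", "LINKEDIN_EMAIL": "me@example.com"}
print(select_provider(example_env))
print(find_missing(example_env, ["LINKEDIN_EMAIL", "LINKEDIN_PASSWORD"]))
```

In the notebook you would pass `os.environ` after calling `load_dotenv()`.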


In [None]:
# Standard library imports
import asyncio
import os
import re
from typing import Any, Annotated, Dict, List, Optional, Tuple
from uuid import uuid4

# Third-party imports
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from IPython.display import display, Image
from langchain_core.messages import AIMessage, AnyMessage, HumanMessage, SystemMessage, ToolMessage
from langchain_core.runnables.graph import MermaidDrawMethod
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph
from langgraph.graph.message import add_messages
from openai import OpenAI
from langchain_groq import ChatGroq
from playwright.async_api import (
    Page,
    TimeoutError,
    async_playwright,
    Browser,
    Playwright
)
import praw
from pydantic import BaseModel, Field
import requests
from youtube_transcript_api import YouTubeTranscriptApi

# Load environment variables
load_dotenv()

groq_key = os.getenv('GROQ_API_KEY')
openai_key = os.getenv('OPENAI_API_KEY')

if not groq_key and not openai_key:
    raise ValueError("Either GROQ_API_KEY or OPENAI_API_KEY must be set")

if openai_key:
    print("OPENAI_API_KEY is set in environment variables")
    model = ChatOpenAI(temperature=0, model="gpt-4o-2024-08-06") 
    os.environ["OPENAI_API_KEY"] = openai_key
elif groq_key:
    print("GROQ_API_KEY is set in environment variables")  # never print the key itself
    model = ChatGroq(temperature=0, model="llama3-groq-70b-8192-tool-use-preview", base_url="https://api.groq.com")
    os.environ["GROQ_API_KEY"] = groq_key
   
# The LinkedIn, LangChain, and PRAW settings are already in os.environ after load_dotenv().
# Assigning os.environ[name] = os.getenv(name) raises a TypeError when a variable is unset,
# so report missing settings instead of re-assigning them.
# PRAW setup: https://praw.readthedocs.io/en/stable/getting_started/quick_start.html
expected_vars = [
    "LINKEDIN_EMAIL", "LINKEDIN_PASSWORD", "LINKEDIN_PROFILE_NAME",
    "LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_ENDPOINT", "LANGCHAIN_PROJECT",
    "PRAW_CLIENT_ID", "PRAW_CLIENT_SECRET", "PRAW_USER_AGENT", "PRAW_USERNAME", "PRAW_PASSWORD",
]
unset_vars = [name for name in expected_vars if not os.getenv(name)]
if unset_vars:
    print(f"Warning: missing environment variables: {', '.join(unset_vars)}")


### Define Pydantic Schemas for Structured LLM Outputs

The nodes use these schemas to get structured outputs from the LLM, which keeps routing deterministic.

In [4]:
class AudioTranscriptionDecision(BaseModel):
    """
    Whether the user wants to create a post from audio
    """
    transcribe_audio: bool = Field(description="Whether to create a post from audio")
    reason: str = Field(description="Reason for the decision")

class YoutubeTranscriptionDecision(BaseModel):
    """
    Whether the user wants to create a post from youtube
    If the url is not provided, url MUST be set to 'URL NOT PROVIDED'
    """
    transcribe_youtube: bool = Field(description="Whether to create a post from YouTube")
    reason: str = Field(description="Reason for the decision")
    url: str = Field(description="The YouTube URL to parse. If the url is not provided, url MUST be set to 'URL NOT PROVIDED'", default="URL NOT PROVIDED")

class RedditSummaryDecision(BaseModel):
    """
    Whether the user wants to create a post from reddit
    """
    summarize_reddit: bool = Field(description="Whether to create a post from Reddit")
    reason: str = Field(description="Reason for the decision")

class TowardsDataScienceDecision(BaseModel):
    """
    Whether the user wants to create a post from towardsdatascience
    """
    fetch_tds_articles: bool = Field(description="Whether to create a post from Towards Data Science")
    reason: str = Field(description="Reason for the decision")

class LinkedInProfileDecision(BaseModel):
    """
    Whether the user wants to create a post from linkedin profile
    """
    fetch_linkedin_posts: bool = Field(description="Whether to create a post from LinkedIn profile")
    reason: str = Field(description="Reason for the decision")

class ExitDecision(BaseModel):
    """Model for exit decision"""
    should_exit: bool = Field(description="Whether the user wants to exit the application")
    reason: str = Field(description="Reason for the exit decision")

class UserIntentClassification(BaseModel):
    """
    Use this to classify the user's intent: whether they want to create a post from Reddit,
    Towards Data Science, a LinkedIn profile, audio, or YouTube, or whether they want to exit.
    """
    reddit_summary_decision: RedditSummaryDecision
    towards_data_science_decision: TowardsDataScienceDecision
    linkedin_profile_decision: LinkedInProfileDecision
    audio_transcription_decision: AudioTranscriptionDecision
    youtube_transcription_decision: YoutubeTranscriptionDecision
    exit_decision: ExitDecision


class YouTubeURLParser(BaseModel):
    """
    Parse the YouTube URL from the user's input
    """
    url: str = Field(description="The YouTube URL to parse")

class RedditFetchParams(BaseModel):
    """
    Parse the number of posts and subreddit from the user's input
    """
    post_count: int = Field(description="The number of posts to fetch from Reddit")
    subreddit: str = Field(description="The subreddit to fetch posts from")


class ContentItem(BaseModel):
    """
    A model representing a single content item.

    This class encapsulates a piece of content as a string, which could be text from
    various sources like LinkedIn posts, Reddit posts, or transcribed content.

    Attributes:
        content (str): The actual content text
    """
    content: str = Field(description="The actual content")

class LinkedInPostDecision(BaseModel):
    """
    A model representing a decision about posting content to LinkedIn.

    This class contains information about whether content should be posted to LinkedIn,
    along with confidence level and reasoning for the decision.

    Attributes:
        should_post (bool): Whether the content should be posted to LinkedIn
        confidence (float): A value between 0 and 1 indicating confidence in the decision
        reasoning (str): Detailed explanation for why this decision was made
    """
    should_post: bool = Field(description="Whether the user wants to post to LinkedIn")
    confidence: float = Field(description="Confidence level of the decision", ge=0, le=1)
    reasoning: str = Field(description="Explanation for the decision")

class AgentState(BaseModel):
    """
    A model representing the current state of the agent.

    This class tracks the ongoing conversation, planned actions, collected content,
    and generated posts throughout the agent's execution.

    Attributes:
        messages (List[AnyMessage]): History of messages in the conversation
        next_action (Optional[str]): The next action the agent should take
        content_items (Optional[List[ContentItem]]): Collection of content gathered
        generated_posts (Optional[List[str]]): Posts that have been generated
    """
    messages: Annotated[List[AnyMessage], add_messages] = Field(default_factory=list)
    next_action: Optional[str] = None
    content_items: Optional[List[ContentItem]] = None
    generated_posts: Optional[List[str]] = None


## Browser Actions

This section explains the browser actions used in the Social Media Manager application. These actions interact with the LinkedIn website to perform tasks such as logging in and creating posts.

### Initialize Browser

This action starts a Chromium browser instance using Playwright. It sets up a new browser context and a new page within that context.
This allows us to interact with the browser for web scraping and automated tasks.

### Close Browser

This action closes the browser instance that was previously opened using Playwright. This is important to release resources and prevent the browser from running unnecessarily.
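
One way to guarantee that the close step always runs is to pair open and close in an async context manager. The sketch below uses fake open and close functions so it runs without a browser; `managed`, `fake_open`, and `fake_close` are hypothetical names, and in the notebook the equivalent guarantee comes from a `try`/`finally` around `initialize_browser` and `close_browser`.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Awaitable, Callable, List


@asynccontextmanager
async def managed(
    open_fn: Callable[[], Awaitable[Any]],
    close_fn: Callable[[Any], Awaitable[None]],
) -> AsyncIterator[Any]:
    """Open a resource, hand it to the caller, and always close it afterwards."""
    resource = await open_fn()
    try:
        yield resource
    finally:
        await close_fn(resource)


# Fake resource so the sketch runs without Playwright.
events: List[str] = []


async def fake_open() -> str:
    events.append("opened")
    return "page"


async def fake_close(resource: str) -> None:
    events.append("closed")


async def demo() -> None:
    try:
        async with managed(fake_open, fake_close) as page:
            events.append(f"using {page}")
            raise RuntimeError("simulated scraping failure")
    except RuntimeError:
        events.append("error handled")


# In a notebook, use `await demo()` instead, because an event loop is already running.
asyncio.run(demo())
print(events)
```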

### Login to LinkedIn

This action uses the `LoginPage` class to log in to the LinkedIn website. It navigates to the login page and enters the provided credentials (LinkedIn email and password) to authenticate the user. Once the login process is complete, it waits for the user to be redirected to the LinkedIn feed, confirming a successful login.


### Create Post

This action creates a new LinkedIn post using the `FeedPage` class. It first waits for the feed to load then clicks a button to start the post creation process. Then it fills the provided content into the text area and clicks the "Post" button to publish the post on LinkedIn.
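
Browser steps like these can fail transiently, for example when the editor is slow to appear, so the posting code retries up to three times. The helper below is a generic sketch of that retry idea; `retry_async` and `flaky_post` are hypothetical names, and the linear backoff is an assumption rather than something the notebook does.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def retry_async(
    action: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Run `action`, retrying on any exception for up to `max_retries` attempts."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            return await action()
        except Exception as error:  # broad on purpose: UI steps fail in many ways
            last_error = error
            if attempt < max_retries:
                await asyncio.sleep(delay_seconds * attempt)  # linear backoff
    raise RuntimeError(f"Gave up after {max_retries} attempts") from last_error


calls = {"count": 0}


async def flaky_post() -> str:
    """Fails twice, then succeeds, to mimic a slow-loading editor."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("editor not visible yet")
    return "posted"


# In a notebook, use `await retry_async(flaky_post)` instead of asyncio.run.
result = asyncio.run(retry_async(flaky_post))
print(result, calls["count"])
```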

### Get LinkedIn Posts

This action retrieves LinkedIn posts from a specific profile using the `ProfilePage` class. It navigates to the user's recent activity page and uses BeautifulSoup to extract text from the posts displayed. The extracted post content is then returned as a list of ContentItem objects. It also handles scrolling the page to load more posts.

These browser actions are central to the Social Media Manager: they automate the LinkedIn interactions behind post creation and content extraction. Through the Playwright library, the agent mimics user behavior inside the browser.
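
To illustrate the extraction step, the sketch below pulls post text out of a small HTML sample using only the standard library. It is a simplified stand-in: the notebook uses BeautifulSoup and also filters containers by their `data-urn` attribute, the class `PostTextExtractor` is hypothetical, and the sample markup is invented for the example rather than copied from LinkedIn.

```python
from html.parser import HTMLParser
from typing import List


class PostTextExtractor(HTMLParser):
    """Collect the text inside every <div class="update-components-text">."""

    def __init__(self) -> None:
        super().__init__()
        self.posts: List[str] = []
        self._depth = 0  # div nesting depth inside a matching container
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1  # nested div inside a post body
        elif "update-components-text" in classes:
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace and skip empty posts
                text = " ".join("".join(self._chunks).split())
                if text:
                    self.posts.append(text)

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


sample_html = """
<div class="feed-shared-update-v2" data-urn="urn:li:activity:1">
  <div class="update-components-text"><span>First   post</span></div>
</div>
<div class="feed-shared-update-v2" data-urn="urn:li:activity:2">
  <div class="update-components-text">   </div>
</div>
<div class="feed-shared-update-v2" data-urn="urn:li:activity:3">
  <div class="update-components-text">Second <div>nested</div> post</div>
</div>
"""

extractor = PostTextExtractor()
extractor.feed(sample_html)
posts = extractor.posts[:5]  # same cap of five posts as the notebook
print(posts)
```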


In [5]:
class FeedPage:
    """
    Class for interacting with LinkedIn feed page and creating posts.

    Attributes:
        page (Page): Playwright page object
        start_post_button (Locator): Locator for the "Start post" button
        post_text_area (Locator): Locator for the post text input area
        post_button (Locator): Locator for the "Post" button
    """
    def __init__(self, page: Page):
        self.page = page
        # Updated selectors
        self.start_post_button = page.locator("button[class*='share-box-feed-entry__trigger']")
        self.post_text_area = page.locator("div[role='textbox']")
        self.post_button = page.locator("button[class*='share-actions__primary-action']")

    async def create_post(self, content: str):
        """
        Creates a new LinkedIn post with the provided content.

        Args:
            content (str): The text content to post

        Raises:
            TimeoutError: If any element takes too long to appear
            Exception: If post creation fails for any reason
        """
        try:
            print("Waiting for feed to load...")
            await self.wait_for_feed_load()

            print("Clicking 'Start post' button...")
            await self.start_post_button.wait_for(state="visible", timeout=10000)
            await self.start_post_button.click()
            await asyncio.sleep(2)

            print("Waiting for post editor...")
            await self.page.wait_for_selector("div[role='textbox']", state="visible", timeout=10000)

            print("Filling post content...")
            # Try multiple selector strategies
            text_area = await self.page.wait_for_selector("div[role='textbox'], div[contenteditable='true'], .ql-editor",
                                                        state="visible",
                                                        timeout=10000)
            if text_area:
                await text_area.fill(content)
            else:
                raise Exception("Could not find post text area")

            await asyncio.sleep(2)

            print("Clicking post button...")
            post_button = await self.page.wait_for_selector(
                "button[class*='share-actions__primary-action'], button[class*='share-box_actions']",
                state="visible",
                timeout=10000
            )
            if post_button:
                await post_button.click()
            else:
                raise Exception("Could not find post button")

            # Wait for post to complete
            await asyncio.sleep(5)
            print("Post created successfully!")

        except TimeoutError as e:
            print(f"Timeout error: {str(e)}")
            await self.page.screenshot(path="post_error.png")
            raise Exception(f"Failed to create post: Timeout - {str(e)}")
        except Exception as e:
            print(f"Error creating post: {str(e)}")
            await self.page.screenshot(path="post_error.png")
            raise Exception(f"Failed to create post: {str(e)}")

    async def wait_for_feed_load(self):
        """
        Waits for the LinkedIn feed to load by checking for feed indicators.

        Raises:
            Exception: If feed fails to load within timeout
        """
        try:
            # Wait for multiple possible feed indicators
            await self.page.wait_for_selector(
                ".feed-shared-update-v2, .share-box-feed-entry__trigger",
                state="visible",
                timeout=15000
            )
        except TimeoutError:
            print("Feed load timeout - taking screenshot for debugging")
            await self.page.screenshot(path="feed_load_error.png")
            raise Exception("Feed failed to load: Timeout")

class LoginPage:
    """
    Class for handling LinkedIn login process.

    Attributes:
        page (Page): Playwright page object
        email_input (Locator): Locator for email input field
        password_input (Locator): Locator for password input field
        login_button (Locator): Locator for login submit button
        pin_input (Locator): Locator for verification code input
    """
    def __init__(self, page):
        self.page = page
        self.email_input = page.get_by_label("Email or Phone")
        self.password_input = page.get_by_label("Password")
        self.login_button = page.locator('button[data-litms-control-urn="login-submit"]')
        self.pin_input = page.locator('input[name="pin"]')  # For verification code

    async def login(self):
        """
        Performs LinkedIn login process with credentials from environment variables.
        Handles verification code if required.

        Raises:
            Exception: If login fails for any reason
        """
        try:
            print("Starting LinkedIn login process...")
            username, password = os.getenv("LINKEDIN_EMAIL"), os.getenv("LINKEDIN_PASSWORD")

            print("Navigating to LinkedIn login page...")
            await self.page.goto("https://www.linkedin.com/login", timeout=15000)
            await asyncio.sleep(2)

            print("Filling login form...")
            await self.email_input.fill(username)
            await asyncio.sleep(1)
            await self.password_input.fill(password)
            await asyncio.sleep(1)

            print("Clicking login button...")
            await self.login_button.click()

            # Wait for either verification page or successful login
            try:
                # Check for verification code input
                verification_selector = 'input[name="pin"], input[name="verification-code"]'
                await self.page.wait_for_selector(verification_selector, timeout=10000)

                print("\nVerification code required!")
                print("Please check your email/phone for the verification code")
                verification_code = input("Enter the verification code: ")

                # Fill in verification code
                verification_input = await self.page.query_selector(verification_selector)
                await verification_input.fill(verification_code)

                # Click submit button (different selectors possible)
                submit_button = await self.page.query_selector('button[type="submit"]')
                if submit_button:
                    await submit_button.click()

                # Wait for successful login after verification
                print("Waiting for successful login after verification...")

            except TimeoutError:
                # No verification required, continue with normal login flow
                print("No verification code required, continuing...")

            # Final check for successful login
            try:
                await self.page.wait_for_function(
                    """() => {
                        return window.location.href.includes('/feed') ||
                               window.location.href.includes('/checkpoint') ||
                               window.location.href.includes('/home')
                    }""",
                    timeout=30000
                )
                print("Successfully logged in!")

            except TimeoutError:
                current_url = self.page.url  # `url` is a property, not a coroutine
                print(f"Login failed. Current URL: {current_url}")
                await self.page.screenshot(path="login_error.png")
                raise Exception("Login process timed out")

        except Exception as e:
            print(f"Login failed with error: {str(e)}")
            await self.page.screenshot(path="login_error.png")
            raise Exception(f"Login failed: {str(e)}")

async def login_to_linkedin(page: Page) -> None:
    """Helper function to login to LinkedIn using LoginPage class"""
    login_page = LoginPage(page)
    await login_page.login()

class ProfilePage:
    """
    Class for interacting with LinkedIn profile pages and retrieving posts.

    Attributes:
        page (Page): Playwright page object
        base_url (str): Base URL for LinkedIn profiles
        linkedin_profile_name (str): Username of the profile to scrape
    """
    def __init__(self, page: Page):
        self.page = page
        self.base_url = "https://www.linkedin.com/in"
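        # Read the profile from LINKEDIN_PROFILE_NAME, falling back to the original default
        self.linkedin_profile_name = os.getenv("LINKEDIN_PROFILE_NAME", "shreyshahh")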

    async def get_linkedin_posts(self) -> List[ContentItem]:
        """
        Retrieves recent LinkedIn posts from a user's profile.

        Returns:
            List[ContentItem]: List of posts as ContentItem objects
        """
        await self.page.goto(f"{self.base_url}/{self.linkedin_profile_name}/recent-activity/all/")
        await asyncio.sleep(3)
        for _ in range(2):
            await self.page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
            await asyncio.sleep(2)

        linkedin_soup = BeautifulSoup(await self.page.content(), "lxml")
        containers = [c for c in linkedin_soup.find_all("div", {"class": "feed-shared-update-v2"})
                     if "activity" in c.get("data-urn", "")]

        posts = []
        for container in containers:
            element = container.find("div", {"class": "update-components-text"})
            if element and element.text.strip():
                # ContentItem only defines `content`, so no title is passed
                posts.append(ContentItem(content=element.text.strip()))
        return posts[:5]

async def initialize_browser(headless: bool = True) -> Tuple[Playwright, Browser, Page]:
    """
    Initializes a Playwright browser instance with appropriate settings.

    Args:
        headless (bool): Whether to run browser in headless mode

    Returns:
        Tuple[Playwright, Browser, Page]: Initialized Playwright objects
    """
    playwright = await async_playwright().start()

    # Configure browser for Colab environment
    browser = await playwright.chromium.launch(
        headless=headless,  # Keep True on Colab, which has no display
        args=[
            '--no-sandbox',
            '--disable-dev-shm-usage',
            '--disable-blink-features=AutomationControlled'
        ]
    )

    # Create context with realistic browser settings
    context = await browser.new_context(
        viewport={'width': 1920, 'height': 1080},
        user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36'
    )

    page = await context.new_page()
    return playwright, browser, page

async def close_browser(playwright: Playwright, browser: Browser) -> None:
    """Closes browser and stops playwright instance"""
    await browser.close()
    await playwright.stop()



def fetch_url_content(url):
    """
    Fetches content from a URL using requests.

    Args:
        url (str): URL to fetch content from

    Returns:
        Optional[bytes]: Response content if successful, None otherwise
    """
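    try:
        # A timeout keeps a slow or unreachable site from hanging the notebook
        response = requests.get(url, timeout=10)
    except requests.RequestException:
        return None
    return response.content if response.status_code == 200 else None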

def parse_article_content(article_url):
    """
    Parses article content from a URL, removing unwanted elements.

    Args:
        article_url (str): URL of article to parse

    Returns:
        Optional[str]: Cleaned article content if successful, None otherwise
    """
    if not (content := fetch_url_content(article_url)):
        return None

    article_soup = BeautifulSoup(content, "html.parser")
    full_content = "\n".join(p.get_text() for p in article_soup.find_all("p"))

    unwanted = ["Sign up\nSign in\nSign up\nSign in\nMariya Mansurova\nFollow\nTowards Data Science\n--\nListen\nShare",
                """\n--\n--\nTowards Data Science\nData & Product Analytics Lead at Wise | ClickHouse Evangelist\nHelp\nStatus\nAbout\nCareers\nPress\nBlog\nPrivacy\nTerms\nText to speech\nTeams"""]

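    # Remove each unwanted snippet in turn; joining per-snippet copies would duplicate the article
    for snippet in unwanted:
        full_content = full_content.replace(snippet, "")
    return re.sub(r"[^\x00-\x7F]+", "", full_content)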



## Define Prompts for Summarizing and Creating Posts

Add examples of your own writing to the prompt so the model can imitate your style. Provide at least 5 examples.
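
The advice about supplying several writing samples can be sketched as a small template helper. The names `post_template` and `build_prompt` are hypothetical, and the template here is a shortened stand-in for the notebook's prompt. Because `str.format` does not re-scan substituted values, braces inside a topic or example are safe; only literal braces in the template itself would need doubling.

```python
from typing import List

# Shortened, hypothetical template; the notebook's real prompt is longer.
post_template = (
    "WRITER POSTS:\n{examples}\n\n"
    "Create a LinkedIn post about this topic in the same style:\n{topic}\n"
)


def build_prompt(examples: List[str], topic: str) -> str:
    """Join the writing samples with a separator line and fill in the topic."""
    separator = "\n" + "_" * 20 + "\n"
    return post_template.format(examples=separator.join(examples), topic=topic)


prompt_text = build_prompt(["Post one", "Post two"], "LangGraph {state} handling")
print(prompt_text)
```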

In [6]:

prompt = """
Analyze these LinkedIn posts:

WRITER POSTS:
______________________
A lot of people are not going to like this.

AI employees are taking over phone calls:

This is Bland AI. And it's changing everything.

If businesses don't adapt to this new tech,
they will be left behind.

What does that mean?

→ AI handles millions of calls 24/7.
→ AI talks in any language or voice.
→ AI integrates with data systems seamlessly.
→ AI customized for customer service, HR, or sales.

Bland AI is leading this change. Period.

If you haven't already,
do it before others.

♻️ Repost this if you think it's the future.

PS: If you want to stay updated with genAI

1. Scroll to the top.
2. Follow Shrey Shah to never miss a post.
_____________________________

Create a LinkedIn post about this topic in the same style:
{topic}

Guidelines:
- Match the tone and structure
- Use similar formatting (bullet points, emojis)
- Include hashtags
- Add links as [text](url) if provided
- Focus on the main topic
- Keep it professional

Write only the final post content.
"""

reddit_summarization_prompt = """
Summarize this Reddit content:

Title: {title}
Content: {body}
Comments: {comments}

Guidelines:
- Extract key points and insights
- Include any links/references
- Focus on the main topic
- Make it LinkedIn-friendly
- 3-5 paragraphs
- No Reddit-specific terms
- Professional tone

Write only the final summary.
"""

## Define Agent Functions
Now we'll define the main functions that our agent will use.

### Core Functions:

- **`fetch_tds_articles`:** Retrieves articles from Towards Data Science and prepares them for potential posting.
- **`fetch_linkedin_profile_posts`:** Extracts recent posts from a specified LinkedIn profile.
- **`transcribe_audio`:** Transcribes audio files (e.g., podcasts, voice recordings) using OpenAI's Whisper.
- **`transcribe_youtube`:** Transcribes YouTube videos using the YouTubeTranscriptApi.
- **`summarize_reddit`:** Fetches posts from a given subreddit and generates a concise summary for each one.
- **`create_post`:** Generates a social media post based on a provided content item (e.g., article, audio transcript, Reddit summary).
- **`should_post_to_linkedin`:** Determines if the user wishes to publish a specific post to LinkedIn, based on user prompt.
- **`post_to_linkedin`:** Publishes a social media post to the user's LinkedIn profile using the Playwright library to interact with the browser.
- **`determine_next_action`:** Determines the next action in the workflow based on the current agent state.
- **`user_intent_classification`:** Classifies user intent from conversation messages using an AI model.
- **`content_router`:** Routes the workflow based on classified user intent and manages content source selection.
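
Transcribing a YouTube video first requires the video ID, and user-supplied URLs come in several shapes. The helper below is a sketch of one way to extract the ID with the standard library; `extract_video_id` is a hypothetical name, and the set of supported hosts and path prefixes is an assumption rather than an exhaustive list.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed set of hostnames; extend as needed.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "music.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in {"shorts", "embed", "live"}:
            return parts[1]
    return None


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=42s"))
```

Returning `None` instead of raising lets the calling node decide how to respond when no usable URL was provided.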


In [7]:
# Fetch articles from Towards Data Science
def fetch_tds_articles(state: AgentState) -> AgentState:
    print("Fetching articles from Towards Data Science...")
    page_content = fetch_url_content("https://towardsdatascience.com/latest")
    if not page_content:
        print("Failed to fetch TDS content")
        return {"content_items": []}

    soup = BeautifulSoup(page_content, "html.parser")
    content_items = [ContentItem(content=content) for article in soup.find_all("div", class_="postArticle", limit=5)
                    if (link_tag := article.find("a", {"data-action": "open-post"}))
                    and (content := parse_article_content(link_tag["href"]))]

    print(f"Found {len(content_items)} TDS articles")
    return {"content_items": content_items}

async def fetch_linkedin_profile_posts(state: AgentState) -> AgentState:
    print("Fetching LinkedIn profile posts...")
    # Headless mode is required on Colab, which has no display
    playwright, browser, page = await initialize_browser(headless=True)
    try:
        await login_to_linkedin(page)
        posts = await ProfilePage(page).get_linkedin_posts()
        print(f"Found {len(posts)} LinkedIn posts")
        return {"content_items": posts}
    finally:
        await close_browser(playwright, browser)

# Transcribe audio file
async def transcribe_audio(state: AgentState, audio_file: str = "./audio.mp3", openai_model: str = "whisper-1") -> AgentState:
    print(f"Transcribing audio file: {audio_file}")
    with open(audio_file, "rb") as audio:
        transcription = OpenAI().audio.transcriptions.create(model=openai_model, file=audio, response_format="text")
    print("Audio transcription complete")
    return {"content_items": [ContentItem(content=transcription)]}

# Transcribe YouTube video
async def transcribe_youtube(state: AgentState) -> AgentState:
    print("Transcribing YouTube video...")
    user_input = state.messages[-1].content if state.messages else ""
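    parsed = model.with_structured_output(YouTubeURLParser).invoke([
        SystemMessage(content="Extract the YouTube URL from the user's message"),
        HumanMessage(content=user_input),
    ])
    # Accept watch?v=<id> and youtu.be/<id> URLs and ignore trailing parameters such as &t=
    match = re.search(r"(?:v=|youtu\.be/)([\w-]{11})", parsed.url)
    if not match:
        raise ValueError(f"Could not extract a video ID from: {parsed.url}")
    video_id = match.group(1)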
    print(f"Processing video ID: {video_id}")
    transcript = " ".join(entry["text"] for entry in YouTubeTranscriptApi.get_transcript(video_id))
    return {"messages": [AIMessage(content=transcript)], "content_items": [ContentItem(content=transcript)]}


# Summarize Reddit posts
async def summarize_reddit(state: AgentState, post_count: int = 2, subreddit: str = "LangChain") -> AgentState:
    print(f"Summarizing {post_count} posts from r/{subreddit}")
    reddit = praw.Reddit(**{k.lower()[5:]: v for k, v in os.environ.items() if k.startswith("PRAW_")})
    content_items = []
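    # The [1:] slice drops the first hot entry, so request one extra to still get post_count posts
    for submission in list(reddit.subreddit(subreddit).hot(limit=post_count + 1))[1:]:
        print(f"Processing submission: {submission.title}")
        # Only the first five comments are used below, so drop "load more" stubs without expanding them
        submission.comments.replace_more(limit=0)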
        summary = model.invoke([HumanMessage(content=reddit_summarization_prompt.format(
            title=submission.title,
            body=submission.selftext or "No content",
            comments="\n".join(c.body for c in submission.comments.list()[:5])
        ))]).content
        content_items.append(ContentItem(content=summary))
    return {"content_items": content_items}

# Create social media post
async def create_post(state: AgentState) -> AgentState:
    generated_posts: List[str] = []
    if state.content_items:
        print(f"Creating posts from {len(state.content_items)} content items")
        for i, content_item in enumerate(state.content_items, 1):
            print(f"Generating post {i}...")
            final_prompt = prompt.format(topic=content_item.content)
            messages = [
                HumanMessage(content=final_prompt),
            ]
            response: AIMessage = await model.ainvoke(messages)
            generated_posts.append(response.content)
    return {"generated_posts": generated_posts}

# Check if post should be published to LinkedIn
async def should_post_to_linkedin(post: str) -> bool:
    print("Checking if post should be published to LinkedIn...")
    user_input = input(f"Do you want to post this content to LinkedIn?\n\n{post}\n\nEnter 'yes' to post or 'no' to skip: ")
    analysis = await model.with_structured_output(LinkedInPostDecision).ainvoke([
        SystemMessage(content="Analyze the user's response to determine if they want to post the content to LinkedIn. Provide a decision, confidence level, and reasoning."),
        HumanMessage(content=f"User's response: {user_input}")
    ])
    print(f"Post decision: {analysis.should_post}")
    return analysis.should_post

# Post content to LinkedIn
async def post_to_linkedin(state: AgentState) -> Dict[str, Any]:
    if not state.generated_posts:
        print("No posts to publish")
        return {"messages": [AIMessage(content="No posts were generated, so nothing was published.")]}

    print("Initializing LinkedIn posting process...")
    playwright, browser, page = await initialize_browser(headless=True)

    try:
        print("Logging into LinkedIn...")
        await login_to_linkedin(page)

        # Wait for navigation after login
        await asyncio.sleep(5)

        print("Navigating to LinkedIn feed...")
        await page.goto("https://www.linkedin.com/feed/", timeout=15000)
        await asyncio.sleep(3)

        feed_page = FeedPage(page)

        for i, post in enumerate(state.generated_posts, 1):
            print(f"Processing post {i}/{len(state.generated_posts)}")

            # Ask for confirmation once per post, so a failed attempt does not re-prompt the user
            if not await should_post_to_linkedin(post):
                print("Post skipped")
                continue

            max_retries = 3
            for attempt in range(max_retries):
                try:
                    print(f"Attempt {attempt + 1} to create post...")
                    await feed_page.create_post(post)
                    print("Successfully posted to LinkedIn")
                    break
                except Exception as e:
                    print(f"Attempt {attempt + 1} failed: {e}")
                    if attempt == max_retries - 1:
                        raise
                    await asyncio.sleep(5)

            # Wait between posts
            await asyncio.sleep(5)

    except Exception as e:
        print(f"Error in LinkedIn posting process: {str(e)}")
        await page.screenshot(path="linkedin_error.png")
        raise
    finally:
        await close_browser(playwright, browser)

    return {"messages": [AIMessage(content="Post created on LinkedIn. Do you want to create another post from any other source?")]}
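
The retry loop inside `post_to_linkedin` can be factored into a small reusable helper. The following is a sketch under assumptions: `retry_async` and its parameters are hypothetical names, not part of the notebook or of Playwright, and it retries on any exception with a fixed delay, mirroring the loop above.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_async(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    delay_seconds: float = 5.0,
) -> T:
    """Await operation(), retrying on any exception for up to max_retries attempts."""
    for attempt in range(1, max_retries + 1):
        try:
            return await operation()
        except Exception as exc:
            print(f"Attempt {attempt} failed: {exc}")
            if attempt == max_retries:
                # Out of attempts: surface the last error to the caller.
                raise
            await asyncio.sleep(delay_seconds)
    # Only reachable when max_retries < 1.
    raise ValueError("max_retries must be at least 1")
```

With such a helper, the posting step could read `await retry_async(lambda: feed_page.create_post(post))`, keeping the user confirmation outside the retried operation.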

# Determine the next action in the workflow
def determine_next_action(state: AgentState) -> str:
    """
    Determines the next action in the workflow based on the current agent state and user intent.

    Args:
        state (AgentState): The current state containing:
            - messages: List of conversation messages
            - next_action: The next action to be taken

    Returns:
        str: The next action to take or END if should exit

    Examples:
        >>> # Continue case
        >>> state = AgentState(next_action="GET CONTENT FROM YOUTUBE")
        >>> determine_next_action(state)
        "GET CONTENT FROM YOUTUBE"

        >>> # Exit case
        >>> state = AgentState(messages=[HumanMessage(content="no thanks, I'm done")])
        >>> determine_next_action(state)
        "__end__"
    """
    # Check for exit keywords in the last message, matching whole words only
    # (a substring test would also fire on words such as "another" or "recommend")
    if state.messages:
        last_message = state.messages[-1].content.lower()
        exit_keywords = {'exit', 'quit', 'done', 'no', 'stop', 'bye', 'goodbye', 'end'}
        words = {word.strip(".,!?;:'\"") for word in last_message.split()}

        if words & exit_keywords:
            print("Exit request detected")
            return END

    print(f"Next action determined: {state.next_action or END}")
    return state.next_action or END
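
Keyword-based exit detection is sensitive to how the match is done: a plain substring test such as `keyword in text` also fires inside longer words. The sketch below compares the two approaches. The function names are hypothetical and chosen for illustration only.

```python
import re

EXIT_KEYWORDS = {"exit", "quit", "done", "no", "stop", "bye", "goodbye", "end"}


def wants_exit(message: str) -> bool:
    """True if any exit keyword appears as a whole word in the message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    return bool(words & EXIT_KEYWORDS)


def wants_exit_substring(message: str) -> bool:
    """Substring check, shown only for comparison."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in EXIT_KEYWORDS)


request = "Create another post from Reddit"
print(wants_exit(request))            # whole-word match
print(wants_exit_substring(request))  # substring match ("another" contains "no")
```

Whole-word matching still treats a request like "no more than two posts" as an exit, so the LLM-based classification in `user_intent_classification` remains the more reliable signal.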

# Classify user intent from messages
async def user_intent_classification(state: AgentState) -> UserIntentClassification:
    """
    Classifies user intent including exit requests.
    """
    print("Classifying user intent...")
    system_message = """Classify the user's intent based on their messages:
    1. Determine if they want to exit (look for words like 'exit', 'quit', 'done', 'no', 'stop', 'bye')
    2. If not exiting, determine if they want to create a post from:
       - audio
       - YouTube
       - Reddit
       - Towards Data Science
       - LinkedIn profile
    Extract any relevant information from the user's messages to help with classification.
    """

    filtered_messages = [msg for msg in state.messages if not isinstance(msg, ToolMessage)]
    result = await model.with_structured_output(UserIntentClassification).ainvoke([
        SystemMessage(content=system_message),
        *filtered_messages
    ])
    print(f"Classification result: {result}")
    return result

# Route content based on user intent
async def content_router(state: AgentState) -> Dict[str, Any]:
    """
    Routes the workflow based on classified user intent and manages content source selection.
    Includes exit handling.
    """
    print("Routing content based on user intent...")

    # Check if this is the first message
    if not state.messages:
        return {
            "messages": [
                AIMessage(content=(
                    "Hello, what kind of post do you want to create?\n"
                    "Here are the options:\n"
                    "1. youtube\n"
                    "2. reddit\n"
                    "3. towardsdatascience\n"
                    "4. audio transcript\n"
                    "5. linkedin profile\n"
                    "Or type 'exit' to quit."
                ))
            ],
            "next_action": END
        }

    user_intent = await user_intent_classification(state)

    # Check for exit intent first
    if user_intent.exit_decision.should_exit:
        return {
            "messages": [AIMessage(content="Thank you for using the service. Goodbye!")],
            "next_action": END
        }

    # Handle YouTube URL request
    if user_intent.youtube_transcription_decision.transcribe_youtube:
        if user_intent.youtube_transcription_decision.url == "URL NOT PROVIDED":
            return {
                "messages": [AIMessage(content="Please provide a YouTube URL")],
                "next_action": END
            }
        else:
            return {"next_action": "GET CONTENT FROM YOUTUBE"}

    # Map intents to actions as ordered (flag, action) pairs; a dict keyed by the
    # boolean flags would collapse to at most two entries (True and False)
    intent_mapping = [
        (user_intent.audio_transcription_decision.transcribe_audio, "GET CONTENT FROM AUDIO"),
        (user_intent.reddit_summary_decision.summarize_reddit, "GET CONTENT FROM REDDIT"),
        (user_intent.towards_data_science_decision.fetch_tds_articles, "GET CONTENT FROM TOWARDS DATA SCIENCE"),
        (user_intent.linkedin_profile_decision.fetch_linkedin_posts, "GET CONTENT FROM LINKEDIN PROFILE"),
    ]

    # Find the first true intent and get its action
    for intent_condition, action in intent_mapping:
        if intent_condition:
            print(f"Selected next action: {action}")
            return {"next_action": action}

    # Default action if no intent matches
    default_action = "GET CONTENT FROM TOWARDS DATA SCIENCE"
    print(f"No specific intent matched. Selected default action: {default_action}")
    return {"next_action": default_action}
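
Python dicts keep only one entry per distinct key, and boolean flags have just two possible values, so an ordered list of (flag, action) pairs is the safer shape for a "first match wins" lookup. The sketch below illustrates the difference. The helper `first_matching_action` and the sample flags are hypothetical stand-ins for the classifier's decisions.

```python
from typing import List, Optional, Tuple


def first_matching_action(candidates: List[Tuple[bool, str]]) -> Optional[str]:
    """Return the action paired with the first truthy flag, or None."""
    return next((action for flag, action in candidates if flag), None)


# Hypothetical flags standing in for the classifier's decisions.
flags = [
    (False, "GET CONTENT FROM AUDIO"),
    (True, "GET CONTENT FROM REDDIT"),
    (True, "GET CONTENT FROM LINKEDIN PROFILE"),
]

# A dict keyed by the same booleans keeps one entry per distinct key,
# so the later True entry overwrites the earlier one.
collapsed = {flag: action for flag, action in flags}

print(first_matching_action(flags))  # first truthy pair in list order
print(collapsed[True])               # last True entry written to the dict
```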

## Create and Compile the Graph
Now we'll create our LangGraph workflow and compile it.

In [None]:
router_paths = {
    "GET CONTENT FROM AUDIO": "GET CONTENT FROM AUDIO",
    "GET CONTENT FROM YOUTUBE": "GET CONTENT FROM YOUTUBE",
    "GET CONTENT FROM REDDIT": "GET CONTENT FROM REDDIT",
    "GET CONTENT FROM TOWARDS DATA SCIENCE": "GET CONTENT FROM TOWARDS DATA SCIENCE",
    "GET CONTENT FROM LINKEDIN PROFILE": "GET CONTENT FROM LINKEDIN PROFILE",
    "CREATE POST": "CREATE POST",
    END: END,
}

# Build the workflow graph
def build_workflow() -> StateGraph:
    print("Building workflow graph...")
    workflow = StateGraph(AgentState)
    workflow.add_node("ROUTER", content_router)
    workflow.add_node("GET CONTENT FROM AUDIO", transcribe_audio)
    workflow.add_node("GET CONTENT FROM YOUTUBE", transcribe_youtube)
    workflow.add_node("GET CONTENT FROM REDDIT", summarize_reddit)
    workflow.add_node("GET CONTENT FROM TOWARDS DATA SCIENCE", fetch_tds_articles)
    workflow.add_node("GET CONTENT FROM LINKEDIN PROFILE", fetch_linkedin_profile_posts)
    workflow.add_node("CREATE POST", create_post)
    workflow.add_node("POST TO LINKEDIN", post_to_linkedin)

    workflow.set_entry_point("ROUTER")
    workflow.add_conditional_edges("ROUTER", determine_next_action, router_paths)

    for source in ["GET CONTENT FROM AUDIO", "GET CONTENT FROM YOUTUBE", "GET CONTENT FROM REDDIT",
                  "GET CONTENT FROM TOWARDS DATA SCIENCE", "GET CONTENT FROM LINKEDIN PROFILE"]:
        workflow.add_edge(source, "CREATE POST")

    workflow.add_edge("CREATE POST", "POST TO LINKEDIN")
    workflow.add_edge("POST TO LINKEDIN", END)

    print("Workflow graph built successfully")
    return workflow

workflow_builder: StateGraph = build_workflow()
memory: MemorySaver = MemorySaver()
workflow: Any = workflow_builder.compile(checkpointer=memory)
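
As a rough mental model of the graph wired up in `build_workflow`, the sketch below walks the same topology in plain Python: one conditional edge out of the router, resolved through a path map, followed by fixed edges until `END`. It is not how LangGraph executes a graph internally. The `walk` and `choose` helpers are hypothetical, only two of the five sources are listed for brevity, and `"__end__"` stands in for LangGraph's `END` sentinel.

```python
from typing import Callable, Dict, List

END = "__end__"  # stand-in for langgraph's END sentinel

State = Dict[str, str]


def walk(
    entry: str,
    choose: Callable[[State], str],
    path_map: Dict[str, str],
    edges: Dict[str, str],
    state: State,
) -> List[str]:
    """Return the node names visited for one pass through the graph."""
    visited = [entry]
    # Conditional edge: the chooser returns a key, the path map turns it into a node.
    current = path_map[choose(state)]
    while current != END:
        visited.append(current)
        # Fixed edges: each remaining node has exactly one successor.
        current = edges[current]
    return visited


def choose(state: State) -> str:
    """Mirror of the router's fallback: use next_action, or END when it is missing."""
    return state.get("next_action") or END


sources = ["GET CONTENT FROM YOUTUBE", "GET CONTENT FROM REDDIT"]
path_map = {name: name for name in sources}
path_map[END] = END

edges = {name: "CREATE POST" for name in sources}
edges["CREATE POST"] = "POST TO LINKEDIN"
edges["POST TO LINKEDIN"] = END

print(walk("ROUTER", choose, path_map, edges, {"next_action": "GET CONTENT FROM REDDIT"}))
```

A state without `next_action` ends the pass right after the router, which matches the cases where the router only replies with a question.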



## Display the Graph Structure

In [None]:

display(
    Image(
        workflow.get_graph().draw_mermaid_png(
            draw_method=MermaidDrawMethod.API,
        )
    )
)

## Run the Graph

In [None]:
user_request = "Create a post from towardsdatascience"
result = None
config = {"configurable": {"thread_id": str(uuid4())}}
default = AgentState()
default.messages.append(
    HumanMessage(content=user_request)
)
print("Starting main workflow loop...")
while True:
    try:
        result = await workflow.ainvoke(input=result if result else default, config=config)

        # Check if we should exit
        if result.get("next_action") == END and any(
            msg.content.lower() == "thank you for using the service. goodbye!"
            for msg in result.get("messages", [])
        ):
            print("Exiting workflow")
            break

        user_input = input(result["messages"][-1].content + "\n(Type 'exit' to quit): ")
        result["messages"].append(HumanMessage(content=user_input))

    except Exception as e:
        print(f"Error in workflow: {str(e)}")
        break

## Use Case Examples

In [27]:
# Choose one of the following requests and use it in the run cell above.
user_request = "Create a post from linkedin"  # the request can also state how many posts to create

# OR

user_request = "Create a post from reddit"  # the request can also state how many posts to create

# OR

user_request = "Create a post from towardsdatascience"  # the request can also state how many posts to create

# OR

user_request = "Create a post from audio"  # upload an audio file first and update the file path

# OR

user_request = "Create a post from youtube"
