<a href="https://colab.research.google.com/github/vrangayyan6/GenAI/blob/main/Multi_Agent_LangGraph_Gemini_Google.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using AI to Streamline Research for Content Creation
## The Problem
Content creators, researchers, and marketers face a significant challenge when developing comprehensive material on specialized topics. The traditional research process is both time-consuming and labor-intensive:

- They must formulate effective search queries
- Sift through numerous search results manually
- Evaluate the relevance and credibility of sources
- Extract and organize key information
- Synthesize findings into coherent content
- Repeat this process multiple times to fill knowledge gaps

This workflow can take hours or even days, delaying content production and limiting the number of topics a team can cover effectively. For small teams or individual creators without research assistants, this bottleneck severely impacts productivity.

## The Solution
This notebook showcases how LangChain + LangGraph and other tools can be used to automate and optimize the research pipeline using agents ([see graph](https://drive.google.com/file/d/1qAq8IjjfSuVslKCnuHljdiVwcJXhlzZR/view?usp=drive_link))

Key capabilities:

- Query Optimization: The AI refines user queries to match search engine algorithms for better results (as shown in the format_search method)
- Automated Information Gathering: Instead of manual searching, the AI conducts multiple search queries and aggregates the results
- Intelligent Gap Analysis: The system evaluates the completeness of research and automatically identifies missing information (via the EditorAgent)
- Iterative Research: The AI conducts multiple rounds of research until sufficient information is gathered, targeting different aspects of the topic with each iteration
- Content Synthesis: When research is complete, the AI transforms raw research into well-structured content (via the WriterAgent)

## Real-World Impact
This AI-powered research workflow reduces what might take hours into minutes. Content creators can focus on refining and adding their unique perspective to AI-generated drafts rather than spending time on initial research and organization. The solution is particularly valuable for:

- Marketing teams needing to create content across multiple product lines
- Researchers exploring new domains quickly
- Educational content creators covering diverse topics
- Small businesses without dedicated research staff

By increasing the number of search results (as you've requested), the system becomes even more effective, gathering a wider range of perspectives and information in each research iteration.

## Environment Setup
Before we dive in, we prepare a custom Python environment and install the required libraries:

In [1]:
# Create a custom install directory
import os
# import shutil
# custom_path = "/kaggle/working/custom_env"
# if os.path.isdir(custom_path):
#     shutil.rmtree(custom_path)

# os.makedirs(custom_path, exist_ok=True)

# Install your desired packages into that path   --no-deps --target={custom_path}
!pip install -q langchain_community langgraph langgraph-prebuilt \
langchain  langsmith langchain_experimental langchain-google-genai langchain-google-community \
google-search-results langgraph-checkpoint ormsgpack filetype langchain_core langgraph_sdk \
pyppeteer requests beautifulsoup4 xxhash google-generativeai

# Prepend custom path to sys.path so Python loads from there
# import sys
# sys.path.insert(0, custom_path)

## Configure API Keys
We securely fetch API keys from Kaggle Secrets. You must add the following secrets in the Add-ons > Secrets menu:

GOOGLESEARCH_API_KEY — for Google Search API

GOOGLE_CSE_ID — Custom Search Engine ID

LANGCHAIN_API_KEY — for LangChain’s services

In [2]:
from google.colab import userdata

os.environ["GOOGLE_API_KEY"] = userdata.get("GOOGLESEARCH_API_KEY")
os.environ["GOOGLE_CSE_ID"] = userdata.get("GOOGLE_CSE_ID")
os.environ["LANGCHAIN_API_KEY"] = userdata.get('LANGCHAIN_API_KEY')

## Enable Tracing
LangChain offers an optional tracing feature that helps in debugging complex chains and workflows.

In [3]:
os.environ["LANGCHAIN_TRACING_V2"] = "true"

## Setup Data Storage
We initialize a simple in-memory list to store results from the agent workflow:

In [4]:
final_result = []  # List to store results from each step of the process

## Define Shared State
We use a TypedDict class to define and track the shared state between agents. This makes it easier to debug and enforce consistency.

In [5]:
from typing_extensions import TypedDict
from typing import Annotated
from langgraph.graph.message import add_messages

class State(TypedDict):
    """
    Manages the workflow's shared state as it moves between agents.

    Fields:
        query (list): History of query messages (as LangGraph messages)
        url (list): Links returned from the research agent
        research (list): Results returned from the research agent
        content (str): The generated blog post content
        content_ready (bool): Flag indicating if the content is complete
        iteration_count (int): Number of times the research step has run
    """
    query: Annotated[list, add_messages]
    url: Annotated[list, add_messages]
    research: Annotated[list, add_messages]
    content: str
    content_ready: bool
    iteration_count: int

## Initialize the Language Model
We use Google's Gemini model via LangChain’s wrapper.



In [6]:
import google.generativeai as genai

for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-001
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-flash-latest
models/gemini-1.5-flash-001
models/gemini-1.5-flash-001-tuning
models/gemini-1.5-flash
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-1.5-flash-8b-exp-0827
models/gemini-1.5-flash-8b-exp-0924
models/gemini-2.5-pro-exp-03-25
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-04-17
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemini-2.0-flash-think

In [7]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash-lite")

## Research Agent
format_search(): Uses Gemini to transform natural queries into more search-optimized forms.

search(): Executes a search via GoogleSearchRun and stores results in state.

🔁 Iteration results are logged in final_result for each loop of query → search.

In [8]:
import time
from langchain_google_community import GoogleSearchAPIWrapper
# from langchain_google_community import GoogleSearchRun
# from googlesearch import search
from langchain_core.messages import HumanMessage
import requests
from bs4 import BeautifulSoup

class ResearchAgent:
    """
    Agent responsible for optimizing search queries and fetching research results.
    Acts as the initial step in the blog post creation workflow.
    """
    def format_search(self, query: str) -> str:
        """
        Optimizes a search query to improve search result relevance.

        Args:
            query (str): The original search query provided by the user

        Returns:
            str: An optimized version of the query for better search results
        """
        prompt = (
            "You are an expert at optimizing search queries for Google. "
            "Your task is to take a given query and return an optimized version of it, making it more likely to yield relevant results. "
            "Do not include any explanations or extra text, only the optimized query.\n\n"
            "Example:\n"
            "Original: best laptop 2023 for programming\n"
            "Optimized: top laptops 2023 for coding\n\n"
            "Example:\n"
            "Original: how to train a puppy not to bite\n"
            "Optimized: puppy training tips to prevent biting\n\n"
            "Now optimize the following query:\n"
            f"Original: {query}\n"
            "Optimized:"
        )

        time.sleep(2)
        response = llm.invoke(prompt)  # Use LLM to optimize the query
        return response.content

    def fetch_full_content(self, url: str) -> str:
        try:
            print(f"getting page content for {url}")
            headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36'}
            response = requests.get(url, headers=headers, timeout=30)
            response.raise_for_status()  # Raise an exception for bad status codes
            soup = BeautifulSoup(response.content, 'html.parser')
            # Extract text content - you might need to adjust the selectors
            text_parts = soup.find_all('p')  # get all paragraph text
            full_text = "\n".join([part.get_text() for part in text_parts])
            if not full_text.strip():
                full_text = soup.get_text()[:5000]
            return f"{full_text[:5000]} " # Limit content length
        except requests.exceptions.RequestException as e:
            return f"URL: {url}\nError fetching content: {e}"
        except Exception as e:
            return f"URL: {url}\nError parsing content: {e}"

    def search(self, state: State):
        """
        Performs a Google search using the optimized query and updates the state.

        Args:
            state (State): Current workflow state containing the query

        Returns:
            dict: Updated state with research results
        """
        # Initialize the Google Search API wrapper first
        search_wrapper = GoogleSearchAPIWrapper(k=10)  # Configure to return top 10 results

        # Track execution time for performance monitoring
        start_time = time.perf_counter()
        optimized_query = self.format_search(state.get('query', "")[-1].content)  # Get and optimize the latest query
        end_time = time.perf_counter()

        for _ in range(3):
            raw_results = search_wrapper.results(optimized_query, 1)  # Execute the search

            if raw_results and raw_results[0].get('link'):
                url = raw_results[0]['link']
                full_content = self.fetch_full_content(url)
                time.sleep(1) # Be respectful of website rate limits
                if full_content!="":
                    final_result.append({"subheader": f"Research Iteration", "content": [full_content], "time": time.perf_counter() - time.perf_counter()}) # Correct the time calculation
                    return {"research": full_content, "url": url, "query": optimized_query}
                else:
                    print("No content found")
            else:
                print("No url found")

        return {"query": optimized_query}  # Return updated state

## Editor Agent
- Uses Gemini to inspect accumulated research.

- If results are sufficient, it sets content_ready = True.

- If insufficient, it generates a new improved query and continues the loop.

🔐 Includes a hard limit of 10 iterations to prevent infinite loops.

In [9]:
class EditorAgent:
    """
    Agent responsible for evaluating research results and determining if more research is needed.
    Acts as the decision-making component in the workflow.
    """
    def evaluate_research(self, state: State):
        """
        Evaluates if the current research is sufficient to write a comprehensive blog post.

        Args:
            state (State): Current workflow state containing queries and research

        Returns:
            dict: Updated state with evaluation results and either content_ready flag or new query
        """
        # Combine all existing queries and research results
        query = '\n'.join(message.content for message in state.get("query"))
        research = '\n'.join(message.content for message in state.get("research"))

        # Track iteration count to prevent infinite loops
        iteration_count = state.get("iteration_count", 1)

        if iteration_count is None:
            iteration_count = 1

        # Hard limit of 10 iterations to prevent endless loops
        if iteration_count >= 3:
            return {"content_ready": True}  # Force completion after 10 iterations

        prompt = (
            "You are an expert editor. Your task is to evaluate the research based on the query. "
            "If the information is sufficient to create a comprehensive and accurate blog post, respond with 'sufficient'. "
            "If the information is not sufficient, respond with 'insufficient' and provide a new, creative query suggestion to improve the results. "
            "If the research results appear repetitive or not diverse enough, think about a very different kind of question that could yield more varied and relevant information. "
            "Consider the depth, relevance, and completeness of the information when making your decision.\n\n"
            "Example 1:\n"
            "Used queries: What are the benefits of a Mediterranean diet?\n"
            "Research: The Mediterranean diet includes fruits, vegetables, whole grains, and healthy fats.\n"
            "Evaluation: Insufficient\n"
            "New query: Detailed health benefits of a Mediterranean diet\n\n"
            "Example 2:\n"
            "Used queries: How does solar power work?\n"
            "Research: Solar power works by converting sunlight into electricity using photovoltaic cells.\n"
            "Evaluation: Sufficient\n\n"
            "Example 3:\n"
            "Used queries: Effects of climate change on polar bears?\n"
            "Research: Climate change is reducing sea ice, affecting polar bear habitats.\n"
            "Evaluation: Insufficient\n"
            "New query: How are polar bears adapting to the loss of sea ice due to climate change?\n\n"
            "Now evaluate the following:\n"
            f"Used queries: {query}\n"
            f"Research: {research}\n\n"
            "Evaluation (sufficient/insufficient):\n"
            "New query (if insufficient):"
        )

        time.sleep(2)

        # Track execution time for performance monitoring
        start_time = time.perf_counter()
        response = llm.invoke(prompt)  # Use LLM to evaluate research quality
        end_time = time.perf_counter()

        evaluation = response.content.strip()

        # Record this iteration's evaluation for logging/debugging
        final_result.append({"subheader": f"Editor Evaluation Iteration", "content": evaluation, "time": end_time - start_time})

        # Parse response to determine next steps
        if "new query:" in evaluation.lower():
            # Extract new query suggestion if research is insufficient
            new_query = evaluation.split("New query:", 1)[-1].strip()
            print(f"New query suggestion: {new_query}")
            return {"query": [new_query], "iteration_count": iteration_count + 1, "evaluation": evaluation}
        else:
            # Research is sufficient, proceed to content creation
            return {"content_ready": True, "evaluation": evaluation}

## Writer Agent
- Combines query and research into a single blog post.

- Uses Gemini to generate detailed, structured content (intro, body, conclusion).

In [10]:
class WriterAgent:
    """
    Agent responsible for writing the final blog post based on the research.
    Acts as the final step in the workflow once research is deemed sufficient.
    """
    def write_blogpost(self, state: State):
        """
        Generates a comprehensive blog post based on the query and research.

        Args:
            state (State): Current workflow state containing query and research data

        Returns:
            dict: Updated state with the final blog post content
        """
        query = state.get("query")[0].content  if isinstance(state.get("query", [""]), list) else state.get("query", "") # Get the original query
        research = "\n".join([f"[{i+1}] {message.content} \n" for i, message in enumerate(state.get("research", []))])  # Combine all research results
        references = "\n".join([f"[{i+1}] {message.content} \n" for i, message in enumerate(state.get("url", []))])

        prompt = (
            "You are an expert blog post writer. Your task is to take a given query and context, and write a comprehensive, engaging, and informative short blog post about it. "
            "Make sure to include an introduction, main body with detailed information, and a conclusion. Use a friendly and accessible tone, and ensure the content is well-structured and easy to read. "
            "Apply best practices to cite the sources you used by referring to the number in the square brackets. "
            "Do not add the References section, I will add it at the bottom of the blog post."
            "Do not add anything other content not in the sources.\n\n"
            f"Query: {query}\n\n"
            f"Context:\n{research}\n\n"
            f"**References:**\n{references}"
            "Write a detailed and engaging blog post based on the above query and context."
        )

        time.sleep(2)

        response = llm.invoke(prompt)  # Use LLM to generate the blog post

        blogpost = f"{response.content}\n\n**References:**\n\n{references}"

        return {"content": blogpost}

## Define the LangGraph
🔄 This creates a loop between search_agent → editor_agent → (search or writer).

In [11]:
from langgraph.graph import StateGraph, START, END

# Initialize the StateGraph
graph = StateGraph(State)  # Create workflow graph using the State class as schema

# Add agent nodes to the graph
graph.add_node("search_agent", ResearchAgent().search)  # Research node
graph.add_node("writer_agent", WriterAgent().write_blogpost)  # Writer node
graph.add_node("editor_agent", EditorAgent().evaluate_research)  # Editor node

# Set the entry point of the workflow
graph.set_entry_point("search_agent")  # Start with search

# Add direct edge from search to editor
graph.add_edge("search_agent", "editor_agent")

# Add conditional edges from editor based on research evaluation
graph.add_conditional_edges(
    "editor_agent",
    lambda state: "accept" if state.get("content_ready") else "revise",  # Decision function
    {
        "accept": "writer_agent",  # If research is sufficient, proceed to writing
        "revise": "search_agent"   # If more research needed, loop back to search
    }
)

# Add edge from writer to end the workflow
graph.add_edge("writer_agent", END)

# Compile the graph to make it executable
graph = graph.compile()

The following code renders the graph. ([see graph rendered on https://mermaid.live/](https://drive.google.com/file/d/1qAq8IjjfSuVslKCnuHljdiVwcJXhlzZR/view?usp=drive_link))

In [12]:
from IPython.display import Image

# Image(graph.get_graph().draw_mermaid_png())

## Main Invocation Function
This function:

- Starts the LangGraph with the user’s query

- Automatically loops through agents until content_ready

- Returns the final blog post

In [13]:
def invoke_graph(user_prompt):
    """
    Main function to execute the entire blog post generation workflow.

    Args:
        user_prompt (str): The initial user query to research and write about

    Returns:
        str: The final blog post content
    """
    # Track total execution time
    start_time = time.perf_counter()
    blogpost = graph.invoke({"query": user_prompt})  # Start the workflow with user prompt
    end_time = time.perf_counter()
    print(f"Total time in completing the workflow: {end_time - start_time} seconds\n\n")  # Add spacing for readability in console output
    return blogpost["content"]  # Return the final blog post content

## Provide your prompt

In [14]:
user_prompt = "Act as an expert in Financial Services, explain in detail on separately managed accounts using 5000 words or more."

## View response
You will see the Google search results in the first section, and the blogpost generated in the next.

In [15]:
from IPython.display import Markdown

result = invoke_graph(user_prompt)
Markdown(result)

getting page content for https://careers.seic.com/global/en/job/R0032000/Specialist-I-Separately-Managed-Accounts
New query suggestion: What are the key benefits and drawbacks of separately managed accounts (SMAs) for different investor profiles (e.g., high-net-worth individuals, institutional investors)? Provide a detailed comparison of SMAs to other investment vehicles (e.g., mutual funds, ETFs, hedge funds), covering topics such as cost, transparency, tax efficiency, customization, and risk management. Include real-world examples and case studies.
getting page content for https://www.linkedin.com/in/andrew-wilson-600b55
New query suggestion: Analyze the current market landscape for Separately Managed Accounts (SMAs), including the latest trends in investment strategies, fee structures, and technological advancements. Focus on how these trends are impacting different investor segments (high-net-worth individuals, institutional investors, and retail investors) and the challenges and o



Total time in completing the workflow: 1780.5810141789998 seconds




## Diving Deep into Separately Managed Accounts (SMAs): A Comprehensive Guide

Are you looking for a more personalized and potentially tax-efficient way to invest? Have you heard of Separately Managed Accounts (SMAs) and wondered if they might be right for you? This blog post will provide a comprehensive overview of SMAs, exploring their features, benefits, and considerations to help you determine if they align with your financial goals.

### What are Separately Managed Accounts?

Unlike mutual funds where your money is pooled with other investors' funds, a Separately Managed Account (SMA) is a portfolio of investments managed specifically for you [1]. This means your investments are held in your name, and the portfolio is tailored to your individual financial situation, investment objectives, and risk tolerance. SMAs are typically managed by professional investment managers, often affiliated with brokerage firms or independent advisory firms.

### Key Features of SMAs:

*   **Customization:** This is the hallmark of an SMA. The investment manager can customize the portfolio to your specific needs, unlike a mutual fund that follows a standardized investment strategy. This customization can include asset allocation, security selection, and tax-management strategies.
*   **Direct Ownership:** You directly own the securities in your portfolio. This provides greater transparency and control compared to investing in a fund.
*   **Tax Efficiency:** SMAs often offer greater tax efficiency than mutual funds. The investment manager can actively manage capital gains and losses to minimize your tax liability. They can also utilize tax-loss harvesting, selling investments that have lost value to offset gains elsewhere in the portfolio.
*   **Transparency:** You typically receive regular reports detailing the holdings, performance, and transactions within your account. This provides a clear understanding of how your investments are performing.
*   **Professional Management:** SMAs are managed by experienced investment professionals who have the expertise to make informed investment decisions.

### Benefits of Investing in SMAs:

*   **Personalized Investment Strategy:** The primary benefit is the ability to create an investment strategy tailored to your unique circumstances. This includes aligning the portfolio with your risk tolerance, time horizon, and financial goals.
*   **Potential for Higher Returns:** Skilled investment managers can potentially generate higher returns by actively managing the portfolio and selecting investments that align with your objectives.
*   **Tax Optimization:** As mentioned, SMAs offer opportunities for tax-efficient investing, which can significantly impact your after-tax returns.
*   **Enhanced Communication and Service:** You typically have direct access to your investment manager, providing opportunities for personalized advice and ongoing communication.
*   **Greater Control:** While you delegate the investment decisions, you maintain control over your investments and can provide input on your investment strategy.

### Considerations Before Investing in an SMA:

*   **Minimum Investment Requirements:** SMAs typically have higher minimum investment requirements than other investment vehicles, such as mutual funds or ETFs. This can range from $100,000 to several million dollars [1].
*   **Fees:** SMAs typically charge an annual management fee, usually a percentage of the assets under management (AUM). This fee can vary depending on the investment manager and the complexity of the portfolio.
*   **Performance:** While professional management can enhance returns, there is no guarantee of outperformance. It's crucial to evaluate the investment manager's track record and investment strategy.
*   **Complexity:** SMAs can be more complex than other investment options. It's essential to understand the investment strategy and how it aligns with your goals.
*   **Due Diligence:** Before investing in an SMA, it's essential to conduct thorough due diligence. This includes researching the investment manager, reviewing their track record, understanding their investment strategy, and assessing their fees.

### Who is an SMA Right For?

SMAs are generally best suited for investors who:

*   Have a significant amount of investable assets.
*   Seek a highly personalized investment strategy.
*   Desire tax-efficient investing.
*   Want direct access to a professional investment manager.
*   Prefer a high degree of transparency and control over their investments.

### Finding the Right Investment Manager:

Selecting the right investment manager is critical to the success of your SMA. Consider the following factors:

*   **Experience and Credentials:** Look for an investment manager with a proven track record and relevant certifications, such as the Chartered Financial Analyst (CFA) designation.
*   **Investment Strategy:** Ensure the manager's investment strategy aligns with your financial goals and risk tolerance.
*   **Fees and Expenses:** Understand the fee structure and all associated costs.
*   **Communication and Service:** Assess the manager's communication style and the level of service they provide.
*   **References:** Ask for references from existing clients to gauge their satisfaction with the manager's services.

### Conclusion:

Separately Managed Accounts offer a compelling investment solution for high-net-worth individuals seeking personalized investment management, tax efficiency, and direct ownership of their investments. By carefully considering your financial goals, risk tolerance, and the factors outlined in this guide, you can determine if an SMA is the right choice for you. Remember to conduct thorough due diligence and choose an investment manager with the expertise and experience to help you achieve your financial objectives.

**References:**

[1] https://careers.seic.com/global/en/job/R0032000/Specialist-I-Separately-Managed-Accounts 

[2] https://www.linkedin.com/in/andrew-wilson-600b55 

[3] https://www.investmentadviser.org/wp-content/uploads/2024/06/Snapshot2024_FINAL.pdf 


In [16]:
print(result)

## Diving Deep into Separately Managed Accounts (SMAs): A Comprehensive Guide

Are you looking for a more personalized and potentially tax-efficient way to invest? Have you heard of Separately Managed Accounts (SMAs) and wondered if they might be right for you? This blog post will provide a comprehensive overview of SMAs, exploring their features, benefits, and considerations to help you determine if they align with your financial goals.

### What are Separately Managed Accounts?

Unlike mutual funds where your money is pooled with other investors' funds, a Separately Managed Account (SMA) is a portfolio of investments managed specifically for you [1]. This means your investments are held in your name, and the portfolio is tailored to your individual financial situation, investment objectives, and risk tolerance. SMAs are typically managed by professional investment managers, often affiliated with brokerage firms or independent advisory firms.

### Key Features of SMAs:

*   **Customiz

# Blogpost for Gen AI Intensive Course Capstone 2025Q1
This is the link to the blogpost https://www.linkedin.com/pulse/supercharging-content-creation-ai-research-agents-vinod-v--pwjte

# YouTube video
This is the link to the YouTube video https://youtu.be/XUpenymdGDQ