We are going to build a ReAct agent that uses an LLM for planning (generating research questions) and then uses a web search tool to gather information, finally compiling a report.

Steps:

Set up the environment and install necessary packages.

Use an LLM (we'll use OpenAI's GPT model via LangChain) for generating research questions.

Use a web search tool (we'll use Tavily Search API) to search the web for each question and extract information.

Implement the agent flow: generate questions -> search for each -> compile report.

Structure the report.

We'll use:

LangChain: for LLM integration and agent tools.

LangGraph: optional. It could define the agent workflow as a graph, but this linear flow does not need it.

Tavily: for web search.

However, note that the assignment mentions using the "Taxily" library (which might be a typo for Tavily). We'll use Tavily.

Since the assignment requires building from scratch, we'll create a custom agent without using the built-in LangChain agents, but we'll use LangChain's LLM and Tavily tool.

Steps in code:

Set up environment variables for OpenAI and Tavily.

Create a function to generate research questions using an LLM.

Create a function to perform web search for a given question and extract the top results (we'll get the top 3 for each question).

Create a function to format the search results for a question.

After gathering information for all questions, create a function to generate the report.

We do not use LangGraph for this simple linear flow, although it could be added later to support more complex workflows.

Let's break down:

Agent Class:

__init__(self, topic: str, llm, search_tool)

generate_questions(self) -> List[str]

search_question(self, question: str) -> List[dict]

Let's break down how to build this ReAct agent in Google Colab using LangChain and Tavily, keeping in mind the constraints of the free Gemini model.

Important Considerations for using the free Gemini model:

The free Gemini model might have limitations on the complexity of prompts and the amount of text it can process in one go. You might need to adjust the size of the report you generate.
API rate limits might apply.
You will need to use the Google Generative AI library for interacting with Gemini, not the OpenAI API as mentioned in your prompt.
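
One way to cope with the rate limits and input-size limits noted above is to retry failed calls with exponential backoff and to cap the text passed to the model. The sketch below is illustrative only: `call_with_backoff`, `truncate_text`, and their default values are hypothetical helpers, not part of LangChain or the Gemini SDK. Which exception a rate-limited call raises depends on the client library, so this version simply retries on any exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Give up after the last attempt and surface the original error.
            if attempt == max_attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... base_delay before trying again.
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("max_attempts must be at least 1")


def truncate_text(text: str, max_chars: int = 2000) -> str:
    """Cap text at max_chars characters, marking the cut with an ellipsis."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "..."
```

A LangChain call could then be wrapped as `call_with_backoff(lambda: chain.invoke(topic))`, where `chain` and `topic` stand for whatever you are invoking, and long search snippets could be passed through `truncate_text` before they reach the report.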

In [16]:
!pip install -qU google-generativeai langchain langchain_core tavily-python langchain_google_genai

Steps in Code:

Set up environment variables: Store your API keys securely rather than pasting them into the notebook. In Colab, you can save them as Colab Secrets and copy them into os.environ at runtime.
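
As a rough sketch of keeping keys out of the notebook source, the hypothetical helper below reads a key from an environment variable and falls back to an interactive prompt. `require_api_key` is an illustrative name, not a Colab or LangChain function. In Colab you could instead populate the variable from a Colab Secret before calling it.

```python
import os
from getpass import getpass
from typing import Callable


def require_api_key(name: str, prompt: Callable[[str], str] = getpass) -> str:
    """Return the key stored in os.environ[name], prompting once if missing."""
    key = os.environ.get(name, "").strip()
    if not key:
        # getpass hides the typed value, so the key never appears in cell output.
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} was not provided")
        # Cache it for libraries that read the variable directly.
        os.environ[name] = key
    return key
```

Calling `require_api_key("GOOGLE_API_KEY")` and `require_api_key("TAVILY_API_KEY")` once near the top of the notebook would make a missing key fail early, before any model or search call is made.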

Authenticate with Google Generative AI:

In [18]:
import os
import google.generativeai as genai
from google.colab import userdata

# Add your API key as a Colab Secret named GOOGLE_API_KEY in the sidebar;
# never hardcode the key in the notebook.
# Learn more on https://colab.research.google.com/notebooks/basic_features.ipynb#secrets
os.environ["GOOGLE_API_KEY"] = userdata.get("GOOGLE_API_KEY")

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

Create the Agent Class:

In [19]:
from typing import List, Dict
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from tavily import TavilyClient
class ResearchAgent:
        def __init__(self, topic: str, llm, search_tool):
            self.topic = topic
            self.llm = llm
            self.search_tool = search_tool

        def generate_questions(self) -> List[str]:
            prompt = PromptTemplate.from_template(
                "Generate 3-5 research questions about the topic: {topic}"
            )
            chain = {"topic": RunnablePassthrough()} | prompt | self.llm | StrOutputParser()
            questions_string = chain.invoke(self.topic)
            # Simple parsing of questions assuming each question is on a new line
            questions = [q.strip() for q in questions_string.split('\n') if q.strip()]
            return questions

        def search_question(self, question: str) -> List[Dict]:
            try:
                results = self.search_tool.search(query=question, max_results=3)
                return results['results']
            except Exception as e:
                print(f"Error during search for question '{question}': {e}")
                return []

        def format_results(self, results: List[Dict]) -> str:
            formatted_output = ""
            for i, result in enumerate(results):
                formatted_output += f"Source {i+1}:\n"
                formatted_output += f"Title: {result.get('title', 'N/A')}\n"
                formatted_output += f"URL: {result.get('url', 'N/A')}\n"
                formatted_output += f"Content: {result.get('content', 'N/A')}\n\n"
            return formatted_output

        def compile_report(self, research_data: Dict[str, str]) -> str:
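            # Build the template from adjacent string literals so the report
            # text is not indented by the surrounding code.
            report_template = (
                "Research Report on: {topic}\n\n"
                "Introduction:\n"
                'This report summarizes information gathered through research on the topic "{topic}".\n\n'
                "Research Findings:\n\n"
                "{findings}\n"
                "Conclusion:\n"
                'This research provides insights into "{topic}" based on the gathered information.\n'
            )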

            findings_section = ""
            for question, results_text in research_data.items():
                findings_section += f"Question: {question}\n"
                findings_section += results_text
                findings_section += "---\n\n"

            report = report_template.format(topic=self.topic, findings=findings_section)
            return report

        def run(self):
            print(f"Starting research on: {self.topic}")
            questions = self.generate_questions()
            print("Generated research questions:")
            for q in questions:
                print(f"- {q}")

            research_data = {}
            for question in questions:
                print(f"\nSearching for: {question}")
                results = self.search_question(question)
                formatted_results = self.format_results(results)
                research_data[question] = formatted_results
                print("Search complete.")

            report = self.compile_report(research_data)
            return report
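
Models often return numbered, bold-formatted questions with trailing commentary, which the simple line split in `generate_questions` passes through unchanged. A possible clean-up step is sketched below. `parse_questions` is a hypothetical helper, not part of the class above, and its heuristics (strip list markers and `**`, keep text up to the first question mark, skip lines without one) are assumptions that may need tuning for other models.

```python
import re
from typing import List

# Leading list markers such as "1.", "2)", "-", "*" or "•" plus whitespace.
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def parse_questions(raw: str, max_questions: int = 5) -> List[str]:
    """Extract clean question strings from free-form LLM output."""
    questions: List[str] = []
    for line in raw.splitlines():
        text = _LIST_MARKER.sub("", line).replace("**", "").strip()
        # Skip preambles and blank lines that contain no question.
        if "?" not in text:
            continue
        # Drop anything after the first question mark (e.g. commentary).
        questions.append(text[: text.index("?") + 1])
        if len(questions) == max_questions:
            break
    return questions
```

Inside `generate_questions`, the list comprehension could then be replaced by `parse_questions(questions_string)`, so that the search tool receives plain questions rather than markdown-decorated lines.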

Instantiate and Run the Agent:

In [20]:
import os
from google.colab import userdata
from langchain_google_genai import ChatGoogleGenerativeAI

# Add your Tavily API key as a Colab Secret named TAVILY_API_KEY in the sidebar;
# never hardcode the key in the notebook.
# Learn more on https://colab.research.google.com/notebooks/basic_features.ipynb#secrets
os.environ["TAVILY_API_KEY"] = userdata.get("TAVILY_API_KEY")

# GOOGLE_API_KEY is expected to be set already by the authentication cell above.

# Use ChatGoogleGenerativeAI rather than the raw genai.GenerativeModel so that
# the LLM is compatible with LangChain Runnables.
llm = ChatGoogleGenerativeAI(model='gemini-1.5-flash-latest', google_api_key=os.environ.get("GOOGLE_API_KEY"))

# ResearchAgent and TavilyClient are defined/imported in the previous cell.
tavily_search = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

topic = "Impact of AI on the job market"
agent = ResearchAgent(topic, llm, tavily_search)
final_report = agent.run()

print("\n--- Final Report ---")
print(final_report)

Starting research on: Impact of AI on the job market
Generated research questions:
- 1. **To what extent does the adoption of AI-driven automation differentially impact job displacement across various skill levels and occupational sectors, and what are the mediating factors (e.g., education, training, geographic location)?**  (This question explores the uneven impact of AI and seeks to identify factors that exacerbate or mitigate displacement.)
- 2. **How effectively do current reskilling and upskilling initiatives address the job market shifts caused by AI, and what are the key barriers to successful workforce adaptation?** (This focuses on the policy and practical responses to AI-driven job changes.)
- 3. **What is the relationship between AI adoption and the creation of new job roles, and what are the required skills and qualifications for these emerging positions?** (This explores the potential for AI to generate new employment opportunities.)
- 4. **How does the increasing prevale