# Knowledge Synthesis Platform - Research Agent Demo

This notebook walks through the enhanced research agent, focusing on two features: decomposing complex queries into subtasks and visualizing publication trends.

## Setup

First, let's import the necessary libraries and initialize the research agent.

In [None]:
import os
import sys
from dotenv import load_dotenv
from IPython.display import HTML, display

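# Add the app directory to the path so we can import from it.
# 'app' is resolved relative to the current working directory,
# so run this notebook from the project root.
sys.path.append(os.path.abspath('app'))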

# Load environment variables
load_dotenv()

# Import the ResearchAgent
from services.research_agent import ResearchAgent
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Check if OpenAI API key is set
if not os.getenv('OPENAI_API_KEY'):
    raise ValueError("OPENAI_API_KEY environment variable is not set. Please set it in the .env file.")

# Initialize the LLM and embeddings
llm = ChatOpenAI(model='gpt-4', temperature=0)
embeddings = OpenAIEmbeddings()

# Initialize the research agent
research_agent = ResearchAgent(llm=llm, embeddings=embeddings)

print("Research agent initialized successfully!")


## Example Query: Analyzing Machine Learning for Climate Change

Let's test the agent with a complex query that requires multiple steps and tools.

In [None]:
query = "Analyze recent machine learning applications for climate change, summarize key papers, identify datasets used, and show publication trends."

print("Executing research query...")
response = research_agent.research(query)
print("Research completed!")


## Display Research Results

Now let's examine the results of our research query.

In [None]:
# Display the response text
print("\nRESEARCH RESPONSE:\n")
print(response['text'])

# Display the sources used
print("\nSOURCES:\n")
for source in response['sources']:
    print(f"- {source}")

# Display datasets identified (if any)
if response.get('datasets'):
    print("\nDATASETS IDENTIFIED:\n")
    for dataset in response['datasets']:
        print(f"- {dataset}")

# Display evaluation scores
print("\nEVALUATION SCORES:\n")
print(f"Faithfulness: {response['evaluation']['faithfulness']['score']}/10")
print(f"Relevance: {response['evaluation']['relevance']['score']}/10")
print(f"Overall: {response['evaluation']['overall_score']}/10")
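
The cell above indexes straight into `response`, so a missing `sources` or `evaluation` key would raise a `KeyError`. The following is a minimal sketch of a more tolerant formatter, assuming the response is a plain dict with the same keys used above (`text`, `sources`, `datasets`, `evaluation`). The helper names `format_research_response` and `_format_score` are hypothetical and not part of `ResearchAgent`.

```python
from typing import Any, Mapping, Optional


def _format_score(score: Optional[Any]) -> str:
    """Render a score out of 10, or 'n/a' when it is missing."""
    return "n/a" if score is None else f"{score}/10"


def format_research_response(response: Mapping[str, Any]) -> str:
    """Build a printable report from a response dict, tolerating missing keys.

    Assumption: the dict layout mirrors the keys accessed in the cell above.
    """
    lines = ["RESEARCH RESPONSE:", str(response.get("text", "(no text returned)"))]

    lines += ["", "SOURCES:"]
    sources = response.get("sources") or []
    if sources:
        lines.extend(f"- {source}" for source in sources)
    else:
        lines.append("- (none reported)")

    # Datasets are optional, so the section is omitted when empty.
    datasets = response.get("datasets") or []
    if datasets:
        lines += ["", "DATASETS IDENTIFIED:"]
        lines.extend(f"- {dataset}" for dataset in datasets)

    lines += ["", "EVALUATION SCORES:"]
    evaluation = response.get("evaluation") or {}
    for label, key in (("Faithfulness", "faithfulness"), ("Relevance", "relevance")):
        score = (evaluation.get(key) or {}).get("score")
        lines.append(f"{label}: {_format_score(score)}")
    lines.append(f"Overall: {_format_score(evaluation.get('overall_score'))}")

    return "\n".join(lines)
```

Calling `print(format_research_response(response))` would then print whatever fields are present and mark missing scores as `n/a` instead of failing partway through.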


## Visualize Publication Trends

The agent's `analyze_publication_trends` method returns an HTML visualization that we can render inline. Here we request trends for machine learning in climate change with `years_back=5`.

In [None]:
# Generate publication trend visualization
trend_html = research_agent.analyze_publication_trends("machine learning climate change", years_back=5)

# Display the visualization
display(HTML(trend_html))
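
The notebook does not show how `analyze_publication_trends` works internally. As a rough illustration of the general idea, the sketch below counts publications per year over a fixed window and renders the counts as a small HTML bar chart. Everything here is an assumption for illustration: the function names `count_by_year` and `render_trend_html`, the interpretation of `years_back` as an inclusive window ending at `end_year`, and the list-of-years input format.

```python
from collections import Counter
from html import escape


def count_by_year(years, end_year, years_back=5):
    """Count publications per year over the window ending at end_year (inclusive).

    Assumption: `years` is an iterable of integer publication years.
    """
    start_year = end_year - years_back + 1
    counts = Counter(y for y in years if start_year <= y <= end_year)
    # Include years with zero publications so gaps stay visible in the chart.
    return {year: counts.get(year, 0) for year in range(start_year, end_year + 1)}


def render_trend_html(counts, title):
    """Render yearly counts as a minimal HTML bar chart (one table row per year)."""
    # Avoid division by zero when there are no publications at all.
    peak = max(counts.values(), default=0) or 1
    rows = []
    for year, n in counts.items():
        width = round(100 * n / peak)  # bar width as a percentage of the peak year
        rows.append(
            f'<tr><td>{year}</td>'
            f'<td style="width:200px">'
            f'<div style="background:#4c78a8;height:12px;width:{width}%"></div></td>'
            f'<td>{n}</td></tr>'
        )
    return f"<h4>{escape(title)}</h4><table>{''.join(rows)}</table>"
```

The returned string can be passed to `display(HTML(...))` in the same way as `trend_html` above. The title is escaped because it may come from a user-supplied query.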


## Examining Query Decomposition

Let's look at how the agent decomposes a complex query into subtasks.

In [None]:
# Example complex query
complex_query = "Compare deep learning and traditional machine learning approaches for climate prediction, identify the most commonly used datasets, and analyze which methods perform better for different climate variables."

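# Decompose the query. _decompose_query is a private method (note the leading
# underscore); we call it directly here only to inspect the resulting plan.
subtasks = research_agent._decompose_query(complex_query)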

# Display the subtasks
print("QUERY DECOMPOSITION:\n")
for i, task in enumerate(subtasks, 1):
    print(f"Task {i}: {task['description']}")
    print(f"Tool: {task['tool']}")
    print(f"Input: {task['input']}")
    if task.get('dependencies'):
        print(f"Dependencies: {task['dependencies']}")
    print()
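
Subtasks that list `dependencies` have to run after the tasks they depend on. The sketch below shows one way such an ordering could be derived with the standard library's `graphlib` (Python 3.9+). It assumes that each entry in `dependencies` is the 1-based number of another subtask, matching the `Task {i}` numbering printed above. The actual format used by `ResearchAgent` is not shown in this notebook, and `execution_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def execution_order(subtasks):
    """Return 1-based task numbers in an order that respects dependencies.

    Assumption: task['dependencies'] is a list of 1-based task numbers.
    """
    # Map each task number to the set of task numbers it must wait for.
    graph = {
        number: set(task.get("dependencies") or [])
        for number, task in enumerate(subtasks, 1)
    }

    # Reject references to tasks that do not exist in the plan.
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"Unknown dependencies: {sorted(unknown)}")

    # static_order() yields prerequisites first and raises graphlib.CycleError
    # (a ValueError subclass) if the tasks depend on each other in a cycle.
    return list(TopologicalSorter(graph).static_order())
```

For a plan where task 2 depends on task 1 and task 3 depends on both, `execution_order(subtasks)` returns the numbers with every prerequisite ahead of the task that needs it. Tasks without dependencies on each other may appear in any relative order.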


## Conclusion

This notebook demonstrates the enhanced capabilities of our research agent, including:

1. Improved query decomposition for complex multi-step requests
2. Enhanced visualization for publication trends
3. Integration of multiple tools (web search, arXiv, document processing)
4. Evaluation metrics for response quality

The agent handles complex research queries by breaking them into manageable subtasks and orchestrating their execution to produce a combined research result.