# Research Paper Analysis with OpenAGI

This notebook demonstrates how to use OpenAGI to automate the extraction, analysis, and summarization of research papers using **two specialized workers**.  


## 🔧 Components Used  
- **PDFLoaderTool**: Loads and extracts content from a research paper.  
- **Workers**:  
  - **PDF Content Extractor** – Extracts and structures content from the document.  
  - **Research Summary Writer** – Synthesizes insights and creates a structured summary.  
- **Admin Agent**: Orchestrates the workflow, using OpenAGI's task planning system.  




In [None]:
pip install openagi

Collecting openagi
  Downloading openagi-0.2.9.9-py3-none-any.whl.metadata (9.0 kB)
Collecting azure-identity<2.0.0,>=1.15.0 (from openagi)
  Downloading azure_identity-1.19.0-py3-none-any.whl.metadata (80 kB)
Collecting chromadb<0.6.0,>=0.5.0 (from openagi)
  Downloading chromadb-0.5.23-py3-none-any.whl.metadata (6.8 kB)
Collecting duckduckgo-search<7.0.0,>=6.1.0 (from openagi)
  Downloading duckduckgo_search-6.4.2-py3-none-any.whl.metadata (25 kB)
Collecting faiss-cpu<2.0.0,>=1.8.0 (from openagi)
  Downloading faiss_cpu-1.10.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (4.4 kB)
Collecting fake-useragent<2.0.0,>=1.5.1 (from openagi)
  Downloading fake_useragent-1.5.1-py3-none-any.whl.metadata (15 kB)
Collecting fastapi<0.111.0,>=0.110.0 (from openagi)
  Downloading fastapi-0.110.3-py3-none-any.whl.metadata (24 kB)
Collecting google-search-results<3.0.0,>=2.

In [None]:
import os
from openagi.actions.tools.document_loader import PDFLoaderTool
from openagi.actions.files import WriteFileAction
from openagi.agent import Admin
from openagi.llms.azure import AzureChatOpenAIModel
from openagi.memory import Memory
from openagi.planner.task_decomposer import TaskPlanner
from openagi.worker import Worker

In [None]:
PDFLoaderTool.set_config({
    "filename": "/content/DeepSeek_R1.pdf"  # Path to the research paper PDF to analyze
})
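
Because the loader is configured with a hard-coded path, a missing or misnamed file would only surface once the agents start running. A minimal sketch of a pre-flight check using only the standard library; `check_pdf_path` is a hypothetical helper for illustration, not part of OpenAGI:

```python
from pathlib import Path


def check_pdf_path(filename: str) -> Path:
    """Return the path if it points to an existing .pdf file, else raise."""
    path = Path(filename)
    # Reject anything that does not look like a PDF before touching the disk.
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {path.name}")
    # Fail early if the file has not been uploaded yet.
    if not path.is_file():
        raise FileNotFoundError(f"PDF not found: {path}")
    return path
```

Calling `check_pdf_path("/content/DeepSeek_R1.pdf")` before `PDFLoaderTool.set_config` would make an upload problem visible immediately instead of in the middle of the agent run.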

In [None]:
# Set Azure environment variables (replace the placeholder with your own key; never commit real credentials)
os.environ["AZURE_OPENAI_API_KEY"] = "<your-azure-openai-api-key>"
os.environ["AZURE_BASE_URL"] = "https://internkey.openai.azure.com/"
os.environ["AZURE_DEPLOYMENT_NAME"] = "intern-gpt4"
os.environ["AZURE_MODEL_NAME"] = "gpt-4o-mini"
os.environ["AZURE_OPENAI_API_VERSION"] = "2023-05-15"

config = AzureChatOpenAIModel.load_from_env_config()
llm = AzureChatOpenAIModel(config=config)
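
The model configuration is read from environment variables, so an unset or empty variable is an easy mistake to make. A small sketch that reports which of the variables used in this cell are missing; `REQUIRED_AZURE_VARS` and `missing_azure_vars` are hypothetical names introduced here, not OpenAGI APIs:

```python
import os

# Variable names taken from the configuration cell in this notebook.
REQUIRED_AZURE_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_BASE_URL",
    "AZURE_DEPLOYMENT_NAME",
    "AZURE_MODEL_NAME",
    "AZURE_OPENAI_API_VERSION",
)


def missing_azure_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name)]
```

Running `missing_azure_vars()` before `AzureChatOpenAIModel.load_from_env_config()` gives a readable list of what still needs to be set; an empty list means all five variables are present.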

In [None]:
# Content Extractor Worker
extractor = Worker(
    role="PDF Content Extractor",
    instructions="""
    Extract and structure content from the research paper PDF.
    Focus on:
    1. Identifying different sections (Abstract, Introduction, Methodology, Results, Discussion)
    2. Extracting tables, figures, and their captions
    3. Organizing references and citations
    """,
    actions=[PDFLoaderTool],
)

# Research Summarizer Worker
summarizer = Worker(
    role="Research Summary Writer",
    instructions="""
    Create a comprehensive research summary:
    1. Synthesize key findings and implications
    2. Evaluate the significance of the research
    3. Identify potential applications and future directions
    4. Create a structured summary document
    """,
    actions=[PDFLoaderTool],
)

# Admin agent: plans the tasks and coordinates the two workers
admin = Admin(
    planner=TaskPlanner(human_intervene=False),
    memory=Memory(),
    actions=[PDFLoaderTool],
    llm=llm,
)
admin.assign_workers([extractor, summarizer])

In [None]:
# Run the analysis
res = admin.run(
    query="Analyze the uploaded research paper content.",
    description="""
    Perform a comprehensive analysis of the research paper:
    1. First, extract and structure all content from the PDF
    2. Then, conduct a detailed analysis of the methodology
    3. Finally, create a comprehensive summary with key insights
    Focus on understanding the research approach, findings, and implications.
    Provide a summary of key findings.
    """
)
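
The value returned by `admin.run` can also be kept on disk rather than only printed. A minimal sketch, assuming the result converts cleanly to text with `str()`; `save_summary` and the default file name `research_summary.md` are hypothetical choices for illustration:

```python
from pathlib import Path


def save_summary(text, path: str = "research_summary.md") -> Path:
    """Write the summary text to a UTF-8 file and return its path."""
    out = Path(path)
    # Create the target directory if it does not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(str(text), encoding="utf-8")
    return out
```

For example, `save_summary(res)` would write the summary next to the notebook so it survives a kernel restart.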

In [None]:
print(res)

```markdown
# Key Findings

## Models
- **DeepSeek-R1-Zero**: This model was trained using large-scale reinforcement learning without any supervised fine-tuning. It showcases impressive reasoning abilities but struggles with readability.
- **DeepSeek-R1**: An enhanced version that incorporates multi-stage training and utilizes cold-start data, achieving performance levels similar to those of OpenAI models.

## Performance
- **Benchmarks**:
  - **DeepSeek-R1**: Demonstrates performance comparable to OpenAI-o1-1217 on reasoning tasks.
  - **Open Source**: Both DeepSeek-R1-Zero and DeepSeek-R1, along with several dense models, have been made available as open-source.

# Significance
The research marks a notable advancement in AI reasoning capabilities, highlighting the effectiveness of reinforcement learning techniques without the need for prior supervised training.

# Applications
- Tasks in Natural Language Processing that require robust reasoning capabilities.
- Open-source models that