# Together's Open Deep Research Cookbook: An Efficient and Open Source Implementation with Multi-Step Web Search

Authors: Shang Zhu, Federico Bianchi, Zhichao Li, Ben Athiwaratkun, Albert Meixner, James Zou

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/togethercomputer/together-cookbook/blob/main/Agents/Together_Open_Deep_Research_CookBook.ipynb)

## Introduction

Welcome to Together's Open Deep Research Cookbook! This guide demonstrates how to answer complex research questions with comprehensive, evidence-based reports by combining large language models (LLMs) with structured web search techniques.

We break down the research process into clear, modular steps that can be understood, customized, and improved independently. This modularity makes advanced research capabilities accessible and adaptable to diverse research needs.

### Together's Open Deep Research Workflow

<img src="https://github.com/vinid/data/blob/master/loop.png?raw=true">


Our Deep Research framework offers several key advantages:

1. **Progressive Exploration**: Starting with initial queries derived from your research topic, the system intelligently explores information, identifies gaps, and refines its search strategy.

2. **Evidence-Based Synthesis**: All findings are grounded in verified sources with proper citations, reducing the hallucinations common in pure LLM-based question answering.

3. **Adaptive Search with Completeness Evaluation**: The system evaluates information gaps against research goals and generates targeted follow-up queries for missing specifics. This ensures resources focus precisely on completing your research rather than collecting redundant information.

4. **Source Filtering**: Not all information is equally valuable - our system evaluates and prioritizes the most relevant and reliable sources.

5. **Structured Output**: Results are presented as well-organized reports with clear sections, cohesive narratives, and proper citations.

### Cookbook Contributions

- Set up a flexible research environment with configurable parameters
- Process complex topics through a multi-stage research pipeline
- Generate high-quality reports for the given research topic

The functional design pattern we showcase makes each step transparent and customizable, allowing you to adapt the system to specific domains or research styles. Each function handles a specific part of the research process, from query generation to answer synthesis, with explicit inputs and outputs that can be tested and improved independently.

Whether you're a researcher, student, analyst, or curious explorer, this cookbook provides the tools to transform your research questions into comprehensive, evidence-based reports that would normally require hours of manual work.

Let's begin our exploration of AI-powered research!

## Install packages

In [1]:
%%capture
!pip install tavily-python together markdown

## Set Up API Keys

Enter your Together AI and Tavily API keys below. You can get API keys from:
- [Together AI](https://docs.together.ai/docs/quickstart)
- [Tavily](https://docs.tavily.com/documentation/quickstart)

In [None]:
# Enter your API keys here
TOGETHER_API_KEY = "YOUR_TOGETHER_API_KEY"  # Replace with your Together AI API key
TAVILY_API_KEY = "YOUR_TAVILY_API_KEY"      # Replace with your Tavily API key
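
As an alternative to pasting keys into the notebook, one option is to read them from environment variables. The sketch below is a hypothetical helper, not part of the original cookbook: the function name `load_api_key` and the variable names `TOGETHER_API_KEY` and `TAVILY_API_KEY` are assumptions about how you export the keys.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key stored under `name`, or raise a clear error if it is unset."""
    # `env` defaults to the process environment; passing a dict makes testing easy
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes both variables were exported before starting the notebook):
# TOGETHER_API_KEY = load_api_key("TOGETHER_API_KEY")
# TAVILY_API_KEY = load_api_key("TAVILY_API_KEY")
```

Failing early with a named variable tends to be easier to debug than an authentication error raised later by one of the API clients.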

## Initialize necessary functions and configuration

In [2]:
from dataclasses import dataclass
from typing import Literal, Optional
from pydantic import BaseModel, Field
import json
import asyncio
from typing import List
from together import AsyncTogether, Together
from tavily import AsyncTavilyClient

### Data Model Definitions

Before diving into the research functionality, we need to define the data structures that will represent our research plan, search results, and source evaluation. These models provide a structured way to handle the different types of data flowing through our research pipeline.

The data model consists of several key components:

1. **ResearchPlan**: Structures the initial research plan with search queries generated by the LLM
2. **SourceList**: Captures the filtered list of relevant sources chosen by the LLM
3. **SearchResult**: Represents an individual search result with its metadata and content
4. **SearchResults**: A collection of search results with utilities for manipulation and display

These structured data models allow us to maintain organization throughout the research process, from query generation to final report synthesis.

The first two are pydantic `BaseModel` classes, which are used with JSON mode. The latter two are plain dataclasses.

In [3]:
class ResearchPlan(BaseModel):
    """
    Structured representation of a research plan with search queries.

    Used to parse the LLM's planning output into a structured format
    that can be easily processed by the research pipeline.
    """
    queries: list[str] = Field(description="A list of search queries to thoroughly research the topic")

class SourceList(BaseModel):
    """
    Structured representation of filtered source indices.

    Used to parse the LLM's source evaluation output into a structured
    format that identifies which search results should be retained.
    """
    sources: list[int] = Field(description="A list of source numbers from the search results")

@dataclass
class SearchResult:
    """
    Container for an individual search result with its metadata and content.

    Holds both the original content and the filtered/processed content
    that's relevant to the research topic.
    """
    title: str
    link: str
    content: str
    filtered_raw_content: Optional[str] = None

    def __str__(self):
        """(For Report Generation and Completeness Evaluation) String representation with title, link and refined content."""
        return (
            (f"Title: {self.title}\n" f"Link: {self.link}\n" f"Refined Content: {self.filtered_raw_content}")
            if self.filtered_raw_content
            else (f"Title: {self.title}\n" f"Link: {self.link}\n" f"Raw Content: {self.content[:1000]}")
        )

    def short_str(self):
        """(For Filtering ONLY) Abbreviated string representation with truncated raw content."""
        return f"Title: {self.title}\nLink: {self.link}\nRaw Content: {self.content[:1000]}"


@dataclass
class SearchResults:
    """
    Collection of search results with utilities for manipulation and display.

    Provides methods for combining result sets, deduplication, and
    different string representations for processing and display.
    """
    results: list[SearchResult]

    def __str__(self):
        """Detailed string representation of all search results with indices."""
        return "\n\n".join(f"[{i+1}] {str(result)}" for i, result in enumerate(self.results))

    def __add__(self, other):
        """Combine two SearchResults objects by concatenating their result lists."""
        return SearchResults(self.results + other.results)

    def short_str(self):
        """Abbreviated string representation of all search results with indices."""
        return "\n\n".join(f"[{i+1}] {result.short_str()}" for i, result in enumerate(self.results))

    def dedup(self):
        """
        Remove duplicate search results based on URL.

        Returns a new SearchResults object with unique entries.
        """
        def deduplicate_by_link(results):
            seen_links = set()
            unique_results = []

            for result in results:
                if result.link not in seen_links:
                    seen_links.add(result.link)
                    unique_results.append(result)

            return unique_results

        return SearchResults(deduplicate_by_link(self.results))
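
To see how `+` and `dedup()` interact, here is a small self-contained sketch. It uses trimmed stand-ins for the two dataclasses above so that it runs on its own, and the titles and `example.com` URLs are made-up illustration data.

```python
from dataclasses import dataclass
from typing import Optional


# Trimmed stand-ins for the classes defined above (string helpers omitted)
@dataclass
class SearchResult:
    title: str
    link: str
    content: str
    filtered_raw_content: Optional[str] = None


@dataclass
class SearchResults:
    results: list

    def __add__(self, other):
        return SearchResults(self.results + other.results)

    def dedup(self):
        seen, unique = set(), []
        for result in self.results:
            if result.link not in seen:
                seen.add(result.link)
                unique.append(result)
        return SearchResults(unique)


# Hypothetical results from two different queries that share one URL
batch_a = SearchResults([
    SearchResult("Page A", "https://example.com/a", "raw text A"),
    SearchResult("Page B", "https://example.com/b", "raw text B"),
])
batch_b = SearchResults([
    SearchResult("Page B (again)", "https://example.com/b", "raw text B, second fetch"),
    SearchResult("Page C", "https://example.com/c", "raw text C"),
])

merged = (batch_a + batch_b).dedup()
print([r.title for r in merged.results])  # ['Page A', 'Page B', 'Page C']
```

Because deduplication keys on the link and keeps the first occurrence, the order in which result sets are combined decides which copy of a page survives.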

## Deep Research Configuration Parameters

This section defines the key parameters that control the Deep Research process. These configuration settings allow you to customize the research behavior, model selection, and output format according to your specific needs.

The parameters include:

- **Model Selection**: Defines which LLMs to use for different stages of the research process
- **Research Budget**: Controls the number of research cycles performed, with higher values enabling more thorough but time-consuming exploration
- **Query Limits**: Sets boundaries on how many search queries to generate and execute
- **Source Management**: Configures the maximum number of sources considered in the final synthesis (this avoids ultra-long inputs that may exceed the model's context window)
- **Prompt Templates**: Defines the instruction prompts used for the selected models


Adjust these parameters to balance depth, speed, and resource usage according to your research priorities.
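
To get a feel for how the budget and query limits translate into API usage, the following back-of-the-envelope sketch counts worst-case calls for the query generation, search, and refinement steps. The helper `estimate_call_budget` is hypothetical, `results_per_query` is an assumed upper bound on results returned per search, and the later source-filtering and answer-generation calls are not counted. It assumes `max_queries` is positive, since a non-positive value disables the limit.

```python
def estimate_call_budget(budget: int, max_queries: int, results_per_query: int) -> dict:
    """Worst-case call counts for query generation, search, and refinement."""
    cycles = 1 + budget                        # initial search plus `budget` refinement cycles
    searches = max_queries * cycles            # one search API call per query
    summaries = searches * results_per_query   # at most one summarization call per result
    planning = 1 + budget                      # one plan, then one evaluation per cycle
    parsing = 1 + budget                       # each plan or evaluation is parsed once
    return {
        "search_calls": searches,
        "summary_calls": summaries,
        "planning_calls": planning,
        "parsing_calls": parsing,
    }


# budget=2 and max_queries=2 match this cookbook's configuration;
# 10 results per query is an assumption
print(estimate_call_budget(budget=2, max_queries=2, results_per_query=10))
# {'search_calls': 6, 'summary_calls': 60, 'planning_calls': 3, 'parsing_calls': 3}
```

The summarization calls dominate, so lowering `max_queries` or `budget` is usually the most direct way to reduce cost. The real count is often lower because research can stop early and results without content are skipped.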

### Disclaimers ⚠️

**For readability**, we provide rather simple prompts here, but we encourage you to optimize them as needed.

**This workflow involves multiple LLM calls and search API calls**, so please be mindful of resource usage, which is controlled by the parameters in the following code block.

**Be aware** that this code makes API requests to external services. Specifically, you will need a Tavily API key (get it [here](https://docs.tavily.com/documentation/quickstart)) and a Together API key (get it [here](https://docs.together.ai/docs/quickstart)).

In [None]:
# Research Configuration
# ======================

# Model Selection
# --------------
# Specialized models for different stages of the research pipeline
planning_model = "Qwen/Qwen2.5-72B-Instruct-Turbo"  # Used for research planning and evaluation
json_model = "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo"  # Used for structured data parsing
summary_model = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # Used for web content summarization
answer_model = "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"  # Used for final answer synthesis

# Resource Allocation
# ------------------
# Parameters controlling research depth and breadth
budget = 2  # Number of research refinement cycles to perform (in addition to the initial search operation)
max_queries = 2  # Maximum number of search queries per research cycle
max_sources = 10  # Maximum number of sources to include in final synthesis
max_tokens = 8192 # Maximum number of tokens in the generated report

# System Prompts
# -------------
# Instructions for each stage of the research process
prompts = {
    # Planning: Generates initial research queries
    "planning_prompt": """You are a strategic research planner with expertise in breaking down complex
                         questions into logical search steps. Generate focused, specific, and self-contained queries that
                         will yield relevant information for the research topic.""",

    # Plan Parsing: Extracts structured data from planning output
    "plan_parsing_prompt": """Extract search queries that should be executed.""",

    # Content Processing: Identifies relevant information from search results
    "raw_content_summarizer_prompt": """Extract and synthesize only the information relevant to the research
                                       topic from this content. Preserve specific data, terminology, and
                                       context while removing irrelevant information.""",

    # Completeness Evaluation: Determines if more research is needed
    "evaluation_prompt": """Analyze these search results against the original research goal. Identify
                          specific information gaps and generate targeted follow-up queries to fill
                          those gaps. If no significant gaps exist, indicate that research is complete.""",

    # Evaluation Parsing: Extracts structured data from evaluation output
    "evaluation_parsing_prompt": """Extract follow-up search queries from the evaluation. If no follow-up queries are needed, return an empty list.""",

    # Source Filtering: Selects most relevant sources
    "filter_prompt": """Evaluate each search result for relevance, accuracy, and information value
                       related to the research topic. At the end, you need to provide a list of
                       source numbers with the rank of relevance. Remove the irrelevant ones.""",
    # Source Parsing: Extracts structured data from filtering output
    "source_parsing_prompt": """Extract the source list that should be included.""",

    # Answer Generation: Creates final research report
    "answer_prompt": """Create a comprehensive, publication-quality markdown research report based exclusively
                       on the provided sources. The report should include: title, introduction, analysis (multiple sections with insights titles)
                       and conclusions, references. Use proper citations (source with link; using \n\n \\[Ref. No.\\] to improve format),
                       organize information logically, and synthesize insights across sources. Include all relevant details while
                       maintaining readability and coherence. In each section, you MUST write in plain
                       paragraphs and NEVER describe the content following bullet points or key points (1,2,3,4... or point X: ...)
                       to improve the report readability."""
}

# Initialize Clients
# ------------------
# Initialize the Together and Tavily clients with your API keys
together_client = AsyncTogether(api_key= TOGETHER_API_KEY)
tavily_client = AsyncTavilyClient(api_key= TAVILY_API_KEY)

## Building the Research Pipeline

### STEP 1: Query Generation - Breaking Down the Research Question

In this first stage, we transform the broad research topic into specific, targeted search queries. This crucial step determines what information we'll gather and sets the foundation for the entire research process.


In [8]:
async def generate_initial_queries(topic: str, together_client: AsyncTogether, max_queries: int, planning_model: str, json_model: str, prompts: dict) -> List[str]:
    """Step 1: Generate initial research queries based on the topic"""
    queries = await generate_research_queries(topic, together_client, planning_model, json_model, prompts)
    if max_queries > 0:
        queries = queries[:max_queries]
    print(f"\n\nInitial queries: {queries}")

    if len(queries) == 0:
        print("ERROR: No initial queries generated")
        return []

    return queries

async def generate_research_queries(topic: str, together_client: AsyncTogether, planning_model: str, json_model: str, prompts: dict) -> list[str]:
    """Generate research queries for a given topic using LLM"""
    PLANNING_PROMPT = prompts["planning_prompt"]

    planning_response = await together_client.chat.completions.create(
        model=planning_model,
        messages=[
            {"role": "system", "content": PLANNING_PROMPT},
            {"role": "user", "content": f"Research Topic: {topic}"}
        ]
    )
    plan = planning_response.choices[0].message.content

    print(f"Generated plan: {plan}")

    SEARCH_PROMPT = prompts["plan_parsing_prompt"]

    json_response = await together_client.chat.completions.create(
        model=json_model,
        messages=[
            {"role": "system", "content": SEARCH_PROMPT},
            {"role": "user", "content": f"Plan to be parsed: {plan}"}
        ],
        response_format={"type": "json_object", "schema": ResearchPlan.model_json_schema()}
    )

    response_json = json_response.choices[0].message.content
    plan = json.loads(response_json)
    return plan["queries"]

**In the `generate_research_queries` function,** we use a two-model approach: the planning model drafts a research strategy in free text, then the JSON model restructures it into a list of queries that conforms to the `ResearchPlan` schema. This gives both open-ended planning and reliable structured output.

**The `generate_initial_queries` function** wraps this process, applying query limits and providing additional validation and logging before returning the final set of targeted search queries.
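
The parsing step above calls `json.loads` and indexes `"queries"` directly, which raises if the model returns malformed JSON or omits the key. A more defensive variant could look like the sketch below. The helper `parse_queries` is hypothetical and uses only the standard library; it is one possible approach, not the cookbook's own method.

```python
import json


def parse_queries(response_json: str) -> list[str]:
    """Return the `queries` list from a JSON-mode reply, or [] if it is malformed."""
    try:
        payload = json.loads(response_json)
    except (json.JSONDecodeError, TypeError):
        return []
    if not isinstance(payload, dict):
        return []
    queries = payload.get("queries", [])
    if not isinstance(queries, list):
        return []
    # Keep only non-empty strings, trimmed, in their original order
    return [q.strip() for q in queries if isinstance(q, str) and q.strip()]


print(parse_queries('{"queries": ["AI in quality control", "  ", 42]}'))
# ['AI in quality control']
print(parse_queries("not valid json"))
# []
```

Returning an empty list keeps the caller's existing "no queries generated" check as the single place where this failure is handled.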

Let's demonstrate this process with a practical example, using the research topic: "The impact of artificial intelligence on the manufacturing industry." This multifaceted subject provides an excellent case study for showing how our system breaks down complex topics into targeted queries.

**Note: In rare cases no initial queries are generated; rerunning the cell typically resolves the issue.**
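
If you would rather automate the rerun, a small retry wrapper is one option. This is a sketch under assumptions: `retry_until_nonempty` is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```python
import asyncio
from typing import Awaitable, Callable, List


async def retry_until_nonempty(
    make_call: Callable[[], Awaitable[List[str]]],
    attempts: int = 3,
    delay_s: float = 1.0,
) -> List[str]:
    """Call `make_call` until it returns a non-empty list or attempts run out."""
    for attempt in range(1, attempts + 1):
        result = await make_call()
        if result:
            return result
        print(f"Attempt {attempt} of {attempts} returned no queries")
        if attempt < attempts:
            await asyncio.sleep(delay_s)
    return []


# Usage in the notebook (each retry repeats the planning and parsing LLM calls):
# initial_queries = await retry_until_nonempty(
#     lambda: generate_initial_queries(
#         topic=research_topic,
#         together_client=together_client,
#         max_queries=max_queries,
#         planning_model=planning_model,
#         json_model=json_model,
#         prompts=prompts,
#     )
# )
```

Passing a zero-argument callable rather than a coroutine matters here, because a coroutine object can only be awaited once.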

In [9]:
research_topic="The impact of artificial intelligence on the manufacturing industry"

initial_queries = await generate_initial_queries(
    topic=research_topic,
    together_client=together_client,
    max_queries=max_queries,
    planning_model=planning_model,
    json_model=json_model,
    prompts=prompts
)

Generated plan: To effectively research the impact of artificial intelligence (AI) on the manufacturing industry, you can break down the topic into several focused, specific, and self-contained queries. Each query will help gather relevant information that contributes to a comprehensive understanding of the topic. Here are the queries:

1. **Overview of AI in Manufacturing:**
   - What are the key applications of AI in the manufacturing industry?
   - How has the adoption of AI in manufacturing evolved over the past decade?

2. **Economic Impact:**
   - What are the economic benefits of AI in manufacturing, such as cost reduction and efficiency gains?
   - How has AI influenced the job market in the manufacturing sector, including job creation and displacement?

3. **Operational Efficiency:**
   - How does AI improve production efficiency and quality control in manufacturing?
   - What are the specific AI technologies (e.g., machine learning, robotics, predictive maintenance) used to e

Below are the targeted queries generated by our system, demonstrating how the general research topic has been decomposed into specific, searchable questions. Each query addresses a distinct aspect of AI's impact on the manufacturing industry:

In [10]:
initial_queries

['What are the key applications of AI in the manufacturing industry?',
 'How has the adoption of AI in manufacturing evolved over the past decade?']

### STEP 2: Information Gathering - Searching and Processing Web Content

Now that we have our targeted queries, we move to the information gathering phase. This step involves two critical processes:

1. **Web Search**: Executing our queries against web sources to retrieve relevant information
2. **Content Processing**: Filtering and summarizing the raw content to extract what's most relevant to our research, using an LLM (`summary_model`).

#### Let's first examine a single search operation (using [Tavily Search API](https://tavily.com/) as an example) to understand the core mechanics:

In [15]:
async def tavily_search(query: str, tavily_client: AsyncTavilyClient, prompts: dict, together_client: AsyncTogether, summary_model: str) -> SearchResults:
    """Perform a single Tavily search with rate-limited summarization"""
    print(f'Perform Tavily search with query: {query}')

    response = await tavily_client.search(query, include_raw_content=True)
    print(f"Tavily Responded with {len(response['results'])} results (Tavily returning None will be ignored for summarization)")

    RAW_CONTENT_SUMMARIZER_PROMPT = prompts["raw_content_summarizer_prompt"]

    # Create a list of tasks for summarization and store corresponding result info
    summarization_tasks = []
    result_info = []
    for result in response["results"]:
        if result["raw_content"] is None or result["raw_content"].strip() == "":
            continue
        task = summarize_content(result["raw_content"], query, RAW_CONTENT_SUMMARIZER_PROMPT, together_client, summary_model)
        summarization_tasks.append(task)
        result_info.append(result)

    # Execute tasks serially with a delay to avoid rate limits
    summarized_contents = []
    for task in summarization_tasks:
        try:
            summary = await task
            summarized_contents.append(summary)
        except Exception as e:
            print(f"Error while summarizing content: {e}")
            summarized_contents.append("Summary not available")
        await asyncio.sleep(1.2)  # Delay to avoid exceeding the rate limit (1 QPS)

    formatted_results = []
    for result, summarized_content in zip(result_info, summarized_contents):
        formatted_results.append(
            SearchResult(
                title=result["title"],
                link=result["url"],
                content=result["raw_content"],
                filtered_raw_content=summarized_content,
            )
        )
    return SearchResults(formatted_results)


async def summarize_content(raw_content: str, query: str, prompt: str, together_client: AsyncTogether, summary_model: str) -> str:
    """Summarize content asynchronously using the LLM with error handling"""
    print("Summarizing content asynchronously using the LLM")
    try:
        summarize_response = await together_client.chat.completions.create(
            model=summary_model,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": f"<Raw Content>{raw_content}</Raw Content>\n\n<Research Topic>{query}</Research Topic>"}
            ]
        )
        return summarize_response.choices[0].message.content
    except Exception as e:
        print(f"Error in summarize_content: {e}")
        return "Summary not available"
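
The cell above summarizes results one at a time with a fixed delay, which suits a 1 QPS limit. If your account allows several requests in flight, a concurrency cap is a possible alternative. The sketch below swaps in a different technique (an `asyncio.Semaphore`) and does not enforce a per-second rate; `summarize_all` and its `summarize` argument are hypothetical names, with `summarize` standing in for a call such as `summarize_content` with its other arguments already bound.

```python
import asyncio
from typing import Awaitable, Callable, List


async def summarize_all(
    texts: List[str],
    summarize: Callable[[str], Awaitable[str]],
    max_concurrent: int = 2,
    fallback: str = "Summary not available",
) -> List[str]:
    """Summarize every text with at most `max_concurrent` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(text: str) -> str:
        async with semaphore:
            try:
                return await summarize(text)
            except Exception as exc:
                # Mirror the cookbook's behavior: log and substitute a placeholder
                print(f"Error while summarizing content: {exc}")
                return fallback

    # gather preserves input order, so summaries line up with `texts`
    return list(await asyncio.gather(*(guarded(t) for t in texts)))
```

Whether this is faster in practice depends on the rate limits of your plan, so the serial version remains the safer default.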

To demonstrate the search and content processing mechanism, let's execute a single search using the first query from our generated list. This allows us to examine in detail how the system retrieves and processes information before we implement the full multi-query research workflow:

In [16]:
search_results=await tavily_search(initial_queries[0], tavily_client, prompts, together_client, summary_model)

Perform Tavily search with query: What are the key applications of AI in the manufacturing industry?
Tavily Responded with 10 results (Tavily returning None will be ignored for summarization)
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM


Below is an example of the first retrieved and summarized result. Note how the system has not only found relevant content but also processed it to extract information specifically related to our research topic. This demonstrates the content filtering capabilities that make our research pipeline more effective than simple search retrieval:

In [17]:
print(f"Title: {search_results.results[0].title}\n\nLink: {search_results.results[0].link}\n\nContent: {search_results.results[0].filtered_raw_content[:1000]}[...]")

Title: AI in Manufacturing : Key Use Cases and benefits for 2025 and Beyond

Link: https://www.softlabsgroup.com/blogs/ai-in-manufacturing/

Content: ## Key Applications of AI in Manufacturing

The integration of Artificial Intelligence (AI) in the manufacturing industry is transforming operations, enhancing efficiency, and driving innovation. Here are the key applications of AI in manufacturing:

### 1. **Predictive Maintenance & Equipment Health Monitoring**
- **Application**: AI analyzes sensor data to predict equipment failures, reducing unexpected downtime and extending machine life.
- **Benefits**: Proactive maintenance scheduling, cost reduction, and increased equipment reliability.

### 2. **Automated Quality Inspection & Defect Detection**
- **Application**: AI-powered vision systems inspect production lines in real-time to detect defects and anomalies.
- **Benefits**: Minimized rework, reduced waste, and ensured high-quality output.

### 3. **Supply Chain & Inventory Optimiza

#### Executing the Complete Initial Search Process

Now that we understand how a single search works, let's implement the parallel search process that launches all our initial queries concurrently. This approach improves efficiency by:

1. Running multiple search operations concurrently
2. Processing the results of different queries side by side (within a single query, summaries are still generated one at a time to respect rate limits)
3. Combining findings into a single, deduplicated result set


**Disclaimer: This step involves multiple LLM runs over long web texts, so please be mindful of resource usage, as configured at the beginning.**

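Because each step is a plain function, `perform_search` can reuse `tavily_search` unchanged for every query, which demonstrates a key advantage of our functional pipeline architecture: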

In [18]:
async def perform_search(queries: List[str], tavily_client: AsyncTavilyClient, prompts: dict, together_client: AsyncTogether, summary_model: str) -> SearchResults:
    """Execute searches for all queries in parallel"""
    tasks = [tavily_search(query, tavily_client, prompts, together_client, summary_model) for query in queries]
    results_list = await asyncio.gather(*tasks)

    combined_results = SearchResults([])
    for results in results_list:
        combined_results = combined_results + results

    combined_results_dedup=combined_results.dedup()
    print(f"Search complete, found {len(combined_results_dedup.results)} results after deduplication")
    return combined_results_dedup
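
Note that `asyncio.gather` propagates the first exception by default, so a single failed search would abort the whole batch. One way to tolerate partial failures is `return_exceptions=True`, sketched below with plain lists in place of `SearchResults`; `gather_tolerant` and `search_one` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List


async def gather_tolerant(
    queries: List[str],
    search_one: Callable[[str], Awaitable[list]],
) -> list:
    """Run one search per query concurrently and skip queries whose search raised."""
    outcomes = await asyncio.gather(
        *(search_one(query) for query in queries),
        return_exceptions=True,  # failures come back as exception objects
    )
    combined = []
    for query, outcome in zip(queries, outcomes):
        if isinstance(outcome, BaseException):
            print(f"Search failed for {query!r}: {outcome}")
            continue
        combined.extend(outcome)
    return combined
```

The trade-off is that failures become silent unless you log them, so the sketch prints each failed query before moving on.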

In [19]:
initial_results=await perform_search(initial_queries, tavily_client, prompts, together_client, summary_model)

Perform Tavily search with query: What are the key applications of AI in the manufacturing industry?
Perform Tavily search with query: How has the adoption of AI in manufacturing evolved over the past decade?
Tavily Responded with 10 results (Tavily returning None will be ignored for summarization)
Summarizing content asynchronously using the LLM
Tavily Responded with 5 results (Tavily returning None will be ignored for summarization)
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content asynchronously using the LLM
Summarizing content as

Below is the complete set of results from our parallel search operations, consolidated into a single collection. This demonstrates how our system gathers diverse information across multiple queries while maintaining organization. The results include sources addressing different aspects of our research topic, providing a foundation for comprehensive analysis.

In [20]:
print(f"First 10000 characters of {len(initial_results.results)} results:\n\n {str(initial_results)[:10000]}\n\n...\n\n Last 10000 characters of {len(initial_results.results)} results:\n\n {str(initial_results)[-10000:]}\n\n")

First 10000 characters of 7 results:

 [1] Title: AI in Manufacturing : Key Use Cases and benefits for 2025 and Beyond
Link: https://www.softlabsgroup.com/blogs/ai-in-manufacturing/
Refined Content: ## Key Applications of AI in Manufacturing

The integration of Artificial Intelligence (AI) in the manufacturing industry is transforming operations, enhancing efficiency, and driving innovation. Here are the key applications of AI in manufacturing:

### 1. **Predictive Maintenance & Equipment Health Monitoring**
- **Application**: AI analyzes sensor data to predict equipment failures, reducing unexpected downtime and extending machine life.
- **Benefits**: Proactive maintenance scheduling, cost reduction, and increased equipment reliability.

### 2. **Automated Quality Inspection & Defect Detection**
- **Application**: AI-powered vision systems inspect production lines in real-time to detect defects and anomalies.
- **Benefits**: Minimized rework, reduced waste, and ensured high-quality ou

### STEP 3: Iterative Refinement - Addressing Information Gaps

One of the most powerful aspects of our research pipeline is its ability to identify information gaps and adaptively refine the research process. Rather than relying solely on initial queries, the system:

1. Evaluates the completeness of currently gathered information
2. Identifies specific knowledge gaps related to the research topic
3. Generates targeted follow-up queries to fill those gaps
4. Iteratively expands the research until sufficient information is obtained

**Disclaimer: This step involves multiple LLM runs over long web texts, so please be mindful of resource usage, as configured at the beginning.**

This iterative approach mimics how expert researchers work - continuously assessing what's known, what's missing, and how to fill those gaps systematically. Let's examine how this process works:

In [21]:
async def conduct_iterative_research(topic: str, initial_results: SearchResults, all_queries: List[str],
                                  budget: int, max_queries: int, tavily_client: AsyncTavilyClient, together_client: AsyncTogether,
                                  planning_model: str, json_model: str, summary_model: str, prompts: dict) -> tuple[SearchResults, List[str]]:
    """
    Conduct iterative research within budget to refine results.

    Args:
        topic: The research topic
        initial_results: Results from initial search
        all_queries: List of all queries used so far
        budget: Maximum number of follow-up iterations
        max_queries: Maximum number of queries to use per iteration
        tavily_client: The Tavily client for web search
        together_client: The Together AI client for LLM operations
        planning_model: Model to use for evaluation
        json_model: Model to use for JSON parsing
        summary_model: Model to use for summarization
        prompts: Dictionary of prompt templates

    Returns:
        Tuple of (final results, all queries used)
    """
    results = initial_results

    for _ in range(0, budget):
        # Evaluate if more research is needed using the independent function
        additional_queries = await evaluate_research_completeness(
            topic, results, all_queries, together_client, planning_model, json_model, prompts
        )

        # Exit if research is complete
        if not additional_queries:
            print("No need for additional research")
            break

        # Limit the number of queries if needed
        if max_queries > 0:
            additional_queries = additional_queries[:max_queries]
        print("================================================\n\n")
        print(f"Additional queries from evaluation parser: {additional_queries}\n\n")
        print("================================================\n\n")

        # Expand research with new queries
        new_results = await perform_search(
            additional_queries,
            tavily_client,
            prompts,
            together_client,
            summary_model
        )

        results = results + new_results
        all_queries.extend(additional_queries)

    return results, all_queries

async def evaluate_research_completeness(topic: str, results: SearchResults, queries: List[str],
                                       together_client: AsyncTogether, planning_model: str, json_model: str, prompts: dict) -> list[str]:
    """
    Evaluate if the current search results are sufficient or if more research is needed.

    Args:
        topic: The research topic
        results: Current search results
        queries: List of queries already used
        together_client: The Together AI client for LLM operations
        planning_model: Model to use for evaluation
        json_model: Model to use for JSON parsing
        prompts: Dictionary of prompt templates

    Returns:
        List of additional queries needed or empty list if research is complete
    """
    # Format the search results for the LLM
    formatted_results = str(results)

    EVALUATION_PROMPT = prompts["evaluation_prompt"]

    evaluation_response = await together_client.chat.completions.create(
        model=planning_model,
        messages=[
            {"role": "system", "content": EVALUATION_PROMPT},
            {"role": "user", "content": (
                f"<Research Topic>{topic}</Research Topic>\n\n"
                f"<Search Queries Used>{queries}</Search Queries Used>\n\n"
                f"<Current Search Results>{formatted_results}</Current Search Results>"
            )}
        ]
    )
    evaluation = evaluation_response.choices[0].message.content

    print("================================================\n\n")
    print(f"Evaluation:\n\n {evaluation}")

    EVALUATION_PARSING_PROMPT = prompts["evaluation_parsing_prompt"]

    json_response = await together_client.chat.completions.create(
        model=json_model,
        messages=[
            {"role": "system", "content": EVALUATION_PARSING_PROMPT},
            {"role": "user", "content": f"Evaluation to be parsed: {evaluation}"}
        ],
        response_format={"type": "json_object", "schema": ResearchPlan.model_json_schema()}
    )

    response_json = json_response.choices[0].message.content
    evaluation = json.loads(response_json)
    return evaluation["queries"]
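
The control flow of the loop above can be seen in isolation with a minimal, network-free sketch. Here `fake_evaluate` and `fake_search` are hypothetical stand-ins for `evaluate_research_completeness` and `perform_search`, and results are plain lists rather than `SearchResults`. It only illustrates how `budget` and `max_queries` bound the work. In a notebook, replace `asyncio.run(...)` with a plain `await`.

```python
import asyncio

async def fake_evaluate(results, round_idx):
    # Hypothetical stand-in for evaluate_research_completeness:
    # asks for three follow-up queries in round 0, then reports completion.
    if round_idx == 0:
        return ["follow-up A", "follow-up B", "follow-up C"]
    return []

async def fake_search(queries):
    # Hypothetical stand-in for perform_search: one fake result per query.
    return [f"result for {q}" for q in queries]

async def iterate(initial_results, initial_queries, budget, max_queries):
    results = list(initial_results)
    all_queries = list(initial_queries)  # copy so the caller's list is not mutated
    for round_idx in range(budget):
        additional = await fake_evaluate(results, round_idx)
        if not additional:
            break  # the evaluator found no remaining gaps
        if max_queries > 0:
            additional = additional[:max_queries]  # cap queries per iteration
        results += await fake_search(additional)
        all_queries.extend(additional)
    return results, all_queries

demo_results, demo_queries = asyncio.run(
    iterate(["seed result"], ["seed query"], budget=3, max_queries=2)
)
print(demo_queries)  # ['seed query', 'follow-up A', 'follow-up B']
```

With `budget=3` the loop still stops after the second evaluation because no further queries are requested. Note that the sketch copies the query list, whereas `conduct_iterative_research` extends the list passed in as `all_queries` in place.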

In [22]:
results, all_queries = await conduct_iterative_research(
    topic=research_topic,
    initial_results=initial_results,
    all_queries=initial_queries,
    budget=budget,
    max_queries=max_queries,
    tavily_client=tavily_client,
    together_client=together_client,
    planning_model=planning_model,
    json_model=json_model,
    summary_model=summary_model,
    prompts=prompts,
)



Evaluation:

 ### Analysis of Search Results

#### Key Findings:
1. **Key Applications of AI in Manufacturing:**
   - **Predictive Maintenance:** AI is used to predict equipment failures, reducing downtime and maintenance costs.
   - **Quality Control:** AI-powered vision systems detect defects and ensure high-quality products.
   - **Supply Chain Management:** AI optimizes inventory levels, demand forecasting, and logistics.
   - **Automation and Robotics:** AI enables cobots to work alongside humans, enhancing productivity and safety.
   - **Digital Twins:** AI simulates production processes to optimize workflows and predict performance.
   - **Energy Management:** AI optimizes energy consumption, reducing costs and improving sustainability.
   - **Custom Manufacturing:** AI allows for mass customization without slowing down production.

2. **Evolution of AI Adoption:**
   - **Past Decade:** The adoption of AI has seen significant growth, particularly with the advent of Industry 4.

Below is the comprehensive collection of all queries executed and results gathered throughout our research process. This dataset represents:

1. Initial queries generated from the research topic
2. Follow-up queries identified through gap analysis
3. All retrieved content after preprocessing and relevance filtering
4. Source metadata for proper attribution

This consolidated view demonstrates the breadth and depth of information our system has gathered, providing the foundation for our final research synthesis:

In [23]:
all_queries

['What are the key applications of AI in the manufacturing industry?',
 'How has the adoption of AI in manufacturing evolved over the past decade?',
 'What are specific case studies of AI adoption in manufacturing with detailed metrics on cost savings, efficiency improvements, and reduction in downtime?',
 'What are the exact percentage improvements in production efficiency and quality due to AI in manufacturing?',
 'What are the long-term environmental and economic impacts of AI adoption in manufacturing?',
 'How do manufacturers ensure the sustainability of AI-driven processes over time?']

In [24]:
print(f"First 10000 characters of {len(results.results)} results:\n\n {str(results)[:10000]}\n\n...\n\n Last 10000 characters of {len(results.results)} results:\n\n {str(results)[-10000:]}\n\n")

First 10000 characters of 20 results:

 [1] Title: AI in Manufacturing : Key Use Cases and benefits for 2025 and Beyond
Link: https://www.softlabsgroup.com/blogs/ai-in-manufacturing/
Refined Content: ## Key Applications of AI in Manufacturing

The integration of Artificial Intelligence (AI) in the manufacturing industry is transforming operations, enhancing efficiency, and driving innovation. Here are the key applications of AI in manufacturing:

### 1. **Predictive Maintenance & Equipment Health Monitoring**
- **Application**: AI analyzes sensor data to predict equipment failures, reducing unexpected downtime and extending machine life.
- **Benefits**: Proactive maintenance scheduling, cost reduction, and increased equipment reliability.

### 2. **Automated Quality Inspection & Defect Detection**
- **Application**: AI-powered vision systems inspect production lines in real-time to detect defects and anomalies.
- **Benefits**: Minimized rework, reduced waste, and ensured high-quality o

### STEP 4: Content Filtering and Prioritization - Optimizing Information Quality

After gathering comprehensive information through our iterative search process, we face a new challenge: not all content is equally valuable or relevant. This step addresses the critical task of quality control by:

1. Evaluating each source for relevance, credibility, and information value
2. Prioritizing the most pertinent content while filtering out noise
3. Organizing information to create an optimal foundation for synthesis

This filtering process improves the quality of our final research output by ensuring we synthesize only the most valuable information. The system acts as a discerning research assistant, making informed judgments about which sources deserve attention:

In [25]:
async def filter_results(topic: str, results: SearchResults, together_client: AsyncTogether, json_model: str, max_sources: int, prompts: dict) -> tuple[SearchResults, SourceList]:
    """
    Filter and rank search results based on relevance to the research topic.

    Args:
        topic: The research topic
        results: Search results to filter
        together_client: The Together AI client for LLM operations
        json_model: Model to use for filtering
        max_sources: Maximum number of sources to keep (-1 for unlimited)
        prompts: Dictionary of prompt templates

    Returns:
        Tuple of (filtered results, source list with indices)
    """
    # Format the search results for the LLM, without the raw content
    formatted_results = results.short_str()

    FILTER_PROMPT = prompts["filter_prompt"]

    SOURCE_PARSING_PROMPT = prompts['source_parsing_prompt']

    llm_filter_response = await together_client.chat.completions.create(
        model=json_model,
        messages=[
            {"role": "system", "content": FILTER_PROMPT},
            {"role": "user", "content": (
                f"<Research Topic>{topic}</Research Topic>\n\n"
                f"<Current Search Results>{formatted_results}</Current Search Results>"
            )}
        ]
    )

    llm_filter_response_content = llm_filter_response.choices[0].message.content
    print(f'filter response: {llm_filter_response_content}')

    json_response = await together_client.chat.completions.create(
        model=json_model,
        messages=[
            {"role": "system", "content": SOURCE_PARSING_PROMPT},
            {"role": "user", "content": f"<FILTER_RESPONSE>{llm_filter_response_content}</FILTER_RESPONSE>"}
        ],
        response_format={"type": "json_object", "schema": SourceList.model_json_schema()}
    )

    response_json = json_response.choices[0].message.content
    evaluation = json.loads(response_json)
    sources = evaluation["sources"]

    print(f'sources ranked by relevance {sources} (we will keep maximum of {max_sources} sources, as defined by the user)')

    if max_sources > 0:
        sources = sources[:max_sources]

    # Filter the results based on the source list
    filtered_results = [results.results[i] for i in sources if i < len(results.results)]

    return SearchResults(filtered_results), sources

async def process_search_results(topic: str, results: SearchResults, together_client: AsyncTogether, json_model: str, max_sources: int, prompts: dict) -> SearchResults:
    """Step 4: Process search results by deduplicating and filtering"""
    # Deduplicate results
    results = results.dedup()
    print(f"Deduplication complete, kept {len(results.results)} results")

    # Filter results
    filtered_results, sources = await filter_results(topic, results, together_client, json_model, max_sources, prompts)
    print(f"LLM Filtering complete, kept {len(filtered_results.results)} results")

    return filtered_results

**Note: In rare cases, no results were kept after LLM filtering. Rerunning the cell typically resolves the issue.**
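
Instead of rerunning the cell by hand, the retry can be automated. The following is a minimal sketch, not part of the original cookbook: `retry_until_nonempty` and `flaky_filter` are hypothetical names, and `flaky_filter` merely simulates a filtering call that comes back empty once. In a notebook, replace `asyncio.run(...)` with a plain `await`.

```python
import asyncio

async def retry_until_nonempty(make_attempt, size=len, max_attempts=3):
    """Await make_attempt() until size(result) > 0, at most max_attempts times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = None
    for attempt in range(1, max_attempts + 1):
        result = await make_attempt()
        if size(result) > 0:
            break
    return result, attempt

# Hypothetical stand-in for the filtering step: empty on the first call only.
_calls = {"n": 0}

async def flaky_filter():
    _calls["n"] += 1
    return [] if _calls["n"] == 1 else ["source 4", "source 6"]

kept, attempts = asyncio.run(retry_until_nonempty(flaky_filter))
print(kept, attempts)  # ['source 4', 'source 6'] 2
```

With the real pipeline, `make_attempt` could be `lambda: process_search_results(research_topic, results, together_client, json_model, max_sources, prompts)` and `size` could be `lambda r: len(r.results)`. Each retry repeats the LLM calls, so keep `max_attempts` small.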

In [26]:
processed_results=await process_search_results(research_topic, results, together_client, json_model, max_sources, prompts)

Deduplication complete, kept 18 results
filter response: Based on the provided search results, I have evaluated each result for relevance, accuracy, and information value related to the research topic "The impact of artificial intelligence on the manufacturing industry." Here is the list of source numbers with their rank of relevance:

**Highly Relevant (1-5)**

1. [4] Title: How is AI being used in Manufacturing | IBM - This article provides a comprehensive overview of AI applications in manufacturing, including its benefits and potential use cases.
2. [6] Title: Artificial Intelligence in manufacturing: State of the art ... - This article presents a detailed analysis of AI in manufacturing, covering its current state, perspectives, and future directions.
3. [8] Title: Stats & Impact of AI in Manufacturing [2025] - This article provides statistics and insights on the impact of AI in manufacturing, including its growth rate and potential benefits.
4. [12] Title: AI-Powered Innovations 

### STEP 5: Research Synthesis - Generating the Comprehensive Final Report

We've now reached the culmination of our research process. After carefully gathering, refining, and filtering information through multiple stages, we're ready to synthesize everything into a cohesive, publication-quality report. This final step:

1. Integrates information from all selected sources into a unified narrative
2. Structures content logically with clear sections and progression of ideas
3. Provides proper citations for all factual claims and insights
4. Balances breadth and depth to create a thorough yet readable analysis
5. Preserves nuance and context from the original sources

The synthesis process uses our specialized answer model, which excels at long-form, coherent content generation while maintaining factual accuracy. Let's generate our final research report:

In [27]:
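def remove_thinking_tags_from_answer(answer: str) -> str:
    """Remove content within <think> tags"""
    while True:
        start = answer.find("<think>")
        # Search for the closing tag only after the opening tag; a stray earlier
        # "</think>" would otherwise make this loop grow the string forever
        end = answer.find("</think>", start) if start != -1 else -1
        if end == -1:
            break
        answer = answer[:start] + answer[end + len("</think>"):]
    return answer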

async def generate_research_answer(topic: str, results: SearchResults, together_client: AsyncTogether, answer_model: str, prompts: dict, max_tokens: int, remove_thinking_tags: bool = True) -> str:
    """
    Generate a comprehensive answer to the research topic based on the search results.

    Args:
        topic: The research topic
        results: Filtered search results to use for answer generation
        together_client: The Together AI client for LLM operations
        answer_model: Model to use for answer generation
        prompts: Dictionary of prompt templates
        max_tokens: Maximum number of tokens in the answer
        remove_thinking_tags: Whether to remove <think> tags from the answer

    Returns:
        Detailed research answer as a string
    """
    formatted_results = str(results)

    ANSWER_PROMPT = prompts["answer_prompt"]

    answer_response = await together_client.chat.completions.create(
        model=answer_model,
        messages=[
            {"role": "system", "content": ANSWER_PROMPT},
            {"role": "user", "content": f"Research Topic: {topic}\n\nSearch Results:\n{formatted_results}"}
        ],
        max_tokens=max_tokens
    )

    answer = answer_response.choices[0].message.content

    # Handle potential error cases before any string processing
    if answer is None or not isinstance(answer, str):
        print("ERROR: No answer generated")
        return "No answer generated"

    # Remove <think> tokens for reasoning models
    if remove_thinking_tags:
        answer = remove_thinking_tags_from_answer(answer)

    return answer.strip()
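
One edge case worth considering: if a reasoning model hits `max_tokens` before emitting `</think>`, a remover that only handles paired tags leaves the partial reasoning in the answer. Whether this occurs depends on the model and token budget, so treat the following as a sketch under that assumption. `strip_think_blocks` is a hypothetical helper, not part of the cookbook.

```python
import re

# Paired blocks: non-greedy so each <think>...</think> is removed separately
_CLOSED = re.compile(r"<think>.*?</think>", flags=re.DOTALL)
# Unclosed block: everything from a dangling <think> to the end of the text
_UNCLOSED = re.compile(r"<think>.*\Z", flags=re.DOTALL)

def strip_think_blocks(text: str) -> str:
    """Remove paired <think> blocks, then any trailing unclosed one."""
    text = _CLOSED.sub("", text)
    text = _UNCLOSED.sub("", text)
    return text.strip()

print(strip_think_blocks("<think>plan the report</think>Report body"))  # Report body
print(repr(strip_think_blocks("Intro <think>cut off mid-thought")))     # 'Intro'
```

An answer that consists only of an unclosed reasoning block becomes an empty string, which a caller can treat as a signal to raise `max_tokens` and retry.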

In [28]:
research_answer = await generate_research_answer(research_topic, processed_results, together_client, answer_model, prompts, max_tokens)

In [29]:
from IPython.display import Markdown
display(Markdown(research_answer))

The Impact of Artificial Intelligence on the Manufacturing Industry
===========================================================

Introduction
------------

The manufacturing industry is undergoing a significant transformation with the integration of Artificial Intelligence (AI). AI is revolutionizing production processes, enhancing efficiency, and reducing costs. This report provides an overview of the impact of AI on the manufacturing industry, highlighting key applications, benefits, and future outlook.

Analysis
--------

### Key Applications of AI in Manufacturing

AI is being applied in various areas of manufacturing, including predictive maintenance, quality control, automation and robotics, supply chain optimization, and engineering collaboration and workflow management [1]. Predictive maintenance uses AI-driven algorithms to monitor equipment performance, predict failures, and reduce unplanned downtime. Quality control is enhanced through advanced machine vision systems powered by deep learning algorithms, detecting defects and deviations with high accuracy.

The adoption of AI in manufacturing has led to significant improvements in production efficiency and quality. For instance, AI-powered predictive maintenance can reduce unexpected downtimes by up to 50% [4]. Additionally, AI-driven quality control can reduce quality inspection errors by up to 95% and quality variability by up to 45% [5].

### Evolution of AI Adoption in Manufacturing

The adoption of AI in manufacturing has evolved significantly over the past decade, with leading manufacturers pushing the boundaries of AI adoption and scaling impact across entire production networks [2]. The maturity of AI has reached unprecedented levels, empowering machines with specialized intelligence to perform complex tasks. The Global Lighthouse Network has made significant progress in AI adoption, with 60% of Lighthouses implementing AI-based use cases in 2023, up from 11% in 2019.

### Case Studies of AI Adoption in Manufacturing

Several case studies demonstrate the benefits of AI adoption in manufacturing. For example, Siemens AG implemented AI-driven predictive maintenance, reducing unplanned downtime by 50% and saving €10 million annually in maintenance costs [3]. General Electric (GE) adopted AI-powered Predix platform, reducing unexpected downtime by 40% and cutting maintenance expenses by 20%. Other companies, such as Toyota Motor Corporation, Boeing, and Intel Corporation, have also achieved significant benefits through AI adoption.

### Benefits and Future Outlook

The adoption of AI in manufacturing has the potential to revolutionize production processes, enhance efficiency, and reduce costs. As AI continues to evolve, it is essential for manufacturers to upskill their workforce to effectively collaborate with intelligent systems and navigate ethical considerations surrounding data privacy and job displacement [1]. The future outlook for AI in manufacturing is promising, with potential benefits including increased efficiency, cost reduction, improved decision-making, and increased safety.

Conclusions
----------

The impact of AI on the manufacturing industry is significant, with key applications in predictive maintenance, quality control, automation and robotics, supply chain optimization, and engineering collaboration and workflow management. The adoption of AI has led to substantial improvements in production efficiency and quality, with benefits including cost savings, reduced downtime, and improved product quality. As AI continues to evolve, manufacturers must be prepared to adapt and leverage its potential to remain competitive.

References
----------

[1] The Rise of AI in Manufacturing: 2025 Trends, Tools & Real-World Impact. https://www.authentise.com/post/the-rise-of-ai-in-manufacturing-2025-trends-tools-real-world-impact

[2] Adopting AI in manufacturing at speed and scale | McKinsey. https://www.mckinsey.org/capabilities/operations/our-insights/adopting-ai-at-speed-and-scale-the-4ir-push-to-stay-competitive

[3] How can AI be Used in Manufacturing? [15 Case Studies] [2025]. https://digitaldefynd.com/IQ/ai-use-in-manufacturing-case-studies/

[4] AI in Manufacturing in 2025 - thoughtminds.io. https://www.thoughtminds.io/ai-in-manufacturing/

[5] How artificial intelligence is reshaping manufacturing operations from ... https://omdia.tech.informa.com/blogs/2025/march/how-artificial-intelligence-is-reshaping-manufacturing-operations-from-the-factory-floor-to-the-cloud

[6] AI in Manufacturing: 12 key Use Cases, Examples, and more - MultiQoS. https://multiqos.com/blogs/ai-in-manufacturing/

[7] AI in Manufacturing: The Smart Revolution in Industry. https://sigmatechnology.com/articles/the-application-of-ai-in-manufacturing/

[8] How is AI being used in Manufacturing | IBM. https://www.ibm.com/think/topics/ai-in-manufacturing

[9] Artificial Intelligence in manufacturing: State of the art .... https://www.sciencedirect.com/science/article/pii/S000785062400115X

[10] Eight AI Case Studies Demonstrate the Potential of AI in Manufacturing. https://rstartec.com/insights/eight-ai-case-studies-demonstrate-the-potential-of-ai-in-manufacturing/

#### Exporting and Sharing Research Results

The final step in our research pipeline is to format and export the results for sharing, preservation, and presentation (e.g., as HTML, PDF, or slides). Enjoy!


In [30]:
import markdown

# Convert to HTML
report_html = markdown.markdown(research_answer)

filename="research_report.html"
with open(filename, "w", encoding="utf-8") as f:
    f.write(report_html)

print(f"Research report saved as {filename}")


Research report saved as research_report.html
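
`markdown.markdown()` returns an HTML fragment rather than a complete document, so the saved file has no `<head>` or declared character set, and a browser may guess the wrong encoding for characters such as `€`. A minimal sketch of wrapping the fragment in a standalone page follows. `wrap_html_fragment` is a hypothetical helper and the sample fragment is illustrative only.

```python
import html

def wrap_html_fragment(fragment: str, title: str) -> str:
    """Wrap an HTML fragment in a minimal standalone UTF-8 document."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"  # escape &, <, > in the title
        "</head>\n"
        f"<body>\n{fragment}\n</body>\n"
        "</html>\n"
    )

page = wrap_html_fragment(
    "<h1>Report</h1><p>Savings of €10 million</p>",
    "AI & Manufacturing",
)
print(page)
```

In the export cell, `wrap_html_fragment(report_html, research_topic)` could be written to the file in place of `report_html`.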


## Limitations and Considerations

While the Deep Research approach offers powerful research capabilities, it's important to understand its limitations and constraints:

### Information Constraints
- **Recency Gap**: Search-based research is limited by the indexing latency of search engines. Very recent events or publications may not be captured.
- **Access Limitations**: Some valuable information sources may be behind paywalls or otherwise inaccessible to the search system.
- **Domain Specificity**: Highly specialized or technical domains may have limited indexing in general search engines.

### Processing Considerations
- **Query Dependency**: The quality of results is heavily influenced by the initial and follow-up queries generated. Poorly formulated queries can lead to information gaps.
- **Source Reliability**: While the system attempts to prioritize credible sources, it cannot perfectly assess the accuracy or bias of all content.
- **Content Extraction Challenges**: Complex formats like tables, charts, or specialized notation may not be accurately captured in the content processing stage.

### Resource Requirements
- **API Costs**: The system makes multiple calls to search and LLM APIs, which can incur significant costs for extensive research topics.
- **Time Efficiency**: The iterative, multi-step nature of the process means it takes longer than simple RAG or direct LLM approaches.
- **Model Dependencies**: The quality of outputs depends on having access to capable LLMs for different stages of the pipeline.
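
The API cost point can be made concrete with a rough upper bound on calls for the stages that follow the initial search: the iterative loop, filtering, and synthesis. This is a sketch under stated assumptions: one search API call per follow-up query, and `summaries_per_query` LLM summarization calls per query, which is a hypothetical parameter because the real number depends on the internals of `perform_search`. Initial planning and the initial search are excluded.

```python
def max_calls_after_initial_search(budget: int, max_queries: int,
                                   summaries_per_query: int = 0) -> dict:
    """Upper bound on API calls for the loop, filtering, and final answer."""
    if budget < 0 or max_queries <= 0 or summaries_per_query < 0:
        raise ValueError("budget and summaries_per_query must be >= 0, max_queries > 0")
    search_calls = budget * max_queries       # assumed: one search call per query
    llm_calls = budget * 2                    # evaluation + JSON parsing per iteration
    llm_calls += search_calls * summaries_per_query  # assumed summarization calls
    llm_calls += 2                            # filtering: ranking + JSON parsing
    llm_calls += 1                            # final answer generation
    return {"search_calls": search_calls, "llm_calls": llm_calls}

print(max_calls_after_initial_search(budget=2, max_queries=2, summaries_per_query=5))
# {'search_calls': 4, 'llm_calls': 27}
```

The bound is only reached when every evaluation requests at least `max_queries` follow-up queries. The loop exits early once the evaluator returns none.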

### Ethical Considerations
- **Source Attribution**: While the system attempts to provide proper citations, it may not always perfectly attribute information to original creators.
- **Content Bias**: The system may inadvertently amplify biases present in the indexed content or search ranking algorithms.
- **Privacy Implications**: Research on certain topics may involve information with privacy sensitivities.

Understanding these limitations helps set appropriate expectations and identify situations where this approach should be complemented with other research methodologies. For critical research needs, human review and validation remain essential.

## Acknowledgements

This cookbook relies on several key technologies and services that make advanced research capabilities possible:

**[Tavily Search API](https://tavily.com/)**: Our cookbook leverages the Tavily API for efficient, high-quality web search functionality. Tavily provides comprehensive search results with raw content access, enabling our system to perform detailed content analysis. We are grateful to the Tavily team for providing this essential capability that forms the foundation of our information gathering process.

**[Together AI](https://together.ai/)**: The LLM-powered components of our system are built using Together AI's API, which provides access to state-of-the-art language models optimized for different aspects of the research pipeline. Together AI's infrastructure enables the sophisticated analysis, planning, and synthesis capabilities that drive our research system.

**Open Source Community**: This work builds upon numerous open-source libraries, tools, and research that have advanced the fields of natural language processing, information retrieval, and AI-assisted research. We acknowledge the broader community whose ongoing contributions make projects like this possible.

If you use this cookbook in your own work or research, please consider acknowledging both Tavily and Together AI for their enabling technologies.


## Related

Open-source developers across the tech landscape have launched several meaningful attempts to recreate the capabilities of OpenAI's Deep Research. Some notable ones are:

* [Hugging Face](https://huggingface.co/blog/open-deep-research)
* [Open Deep Research](https://github.com/nickscamara/open-deep-research)