# Search Agent Cookbook with Seed1.8

This cookbook implements a **Search Agent** with **medium reasoning effort** using the **Seed1.8 model**. It integrates web search and link reading tools to retrieve, summarize, and answer queries.

### Core Features

1. Single-Turn Inference
- Processes a single user query and provides a response.
- Supports tool integration for online search and content summarization.

2. Multi-Turn Inference
- Handles multi-step reasoning to accomplish complex tasks.
- Manages conversation history using a memory condenser to optimize performance.

3. Memory Management
- Implements a FIFO-based memory condenser that drops the oldest assistant and tool messages, so the system prompt, the user messages, and the most recent conversation history are retained during multi-turn inference.

4. Force Answer
- When `max_steps` is reached, the agent is forced to provide a final answer without calling any tools.

### 0. Prerequisites
- API Key: [Obtain an API key](https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey) for the Seed1.8 model.
- Model Endpoint: Confirm the base URL of the Ark API (this cookbook uses `https://ark.cn-beijing.volces.com/api/v3`).
- Activate the Model: Activate the Seed1.8 model in [the Ark Console](https://console.volcengine.com/ark/region:ark+cn-beijing/openManagement).
- Install Dependencies: Install required Python packages.


In [None]:
pip install -r ../requirements.txt

In [2]:
# Copyright 2025 Bytedance Ltd. and/or its affiliates.
# SPDX-License-Identifier: Apache-2.0

# Please set the API key here
import os

os.environ['ARK_API_KEY'] = 'your_ark_api_key'
os.environ['ARK_MODEL_ENDPOINT'] = 'doubao-seed-1-8-251215'
# Base URL of the Ark API; the request cell below uses the full /chat/completions URL
base_url = "https://ark.cn-beijing.volces.com/api/v3"
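A missing environment variable otherwise only surfaces later as a failed request. The following is a minimal sketch of a fail-fast check; `require_env` is a hypothetical helper, not part of the Ark SDK, and it cannot detect a placeholder value such as `your_ark_api_key`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage: set a demo value (only if none exists) so the example runs on its own.
os.environ.setdefault("ARK_MODEL_ENDPOINT", "doubao-seed-1-8-251215")
print(require_env("ARK_MODEL_ENDPOINT"))
```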

### 1. Define Tool Schemas
Define the schemas for the tools available to the agent: web search and reading URL content. Each schema gives a structured description of the tool's function and parameters, enabling the model to understand how to call the tool to fulfill user requests.

In [4]:
# tool schemas
tool_schemas = [
    {
        "type": "function",
        "function": {
            "name": "Search",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    }
                },
                "required": [
                    "query"
                ]
            },
            "description": "This is an online search tool. Input a search query, and it returns a list of web pages with corresponding summary information."
        }
    },
    {
        "type": "function",
        "function": {
            "name": "Open_url",
            "description": "This is a link browsing tool that can open links (such as web pages, PDFs, etc.) and summarize all relevant information on the page based on the description of the requirements. It is recommended to call this tool for all valuable links to obtain information. Valuable links include but are not limited to: 1. URLs explicitly provided in the task; 2. URLs with relevant summaries provided in search results; 3. URLs contained in the content returned by previous Open_url calls that are judged to possibly contain useful information. Please avoid constructing links out of thin air as much as possible.",
            "parameters": {
                "type": "object",
                "properties": {
                    "description": {
                        "type": "string",
                        "description": "Requirement description text, detailing the content you want to obtain within the current URL."
                    },
                    "url": {
                        "type": "string",
                        "description": "Target link, should be a complete URL (starting with http)."
                    }
                },
                "required": ["description", "url"]
            }
        }
    }
]
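The schemas only describe the tools; executing them is left to the caller. Below is a minimal sketch of how a tool call returned by the model might be routed to a Python function. `mock_search`, `mock_open_url`, `TOOL_REGISTRY`, and `dispatch_tool_call` are hypothetical names, and the mock functions stand in for real search and page-reading backends.

```python
import json
from typing import Callable


def mock_search(query: str) -> str:
    """Hypothetical stand-in for a real search backend."""
    return json.dumps([{"title": f"Result for {query}", "url": "https://example.com"}])


def mock_open_url(description: str, url: str) -> str:
    """Hypothetical stand-in for a real page reader."""
    return f"Summary of {url} focused on: {description}"


# Keys must match the "name" fields in the tool schemas.
TOOL_REGISTRY: dict[str, Callable[..., str]] = {
    "Search": mock_search,
    "Open_url": mock_open_url,
}


def dispatch_tool_call(tool_call: dict) -> str:
    """Run the function named in a tool call and return its text output."""
    name = tool_call["function"]["name"]
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        # The arguments arrive as a JSON-encoded string.
        arguments = json.loads(tool_call["function"]["arguments"])
        return handler(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, or missing/unexpected parameters.
        return f"Error: invalid arguments for '{name}': {exc}"


call = {
    "id": "call_demo",
    "type": "function",
    "function": {"name": "Search", "arguments": '{"query": "weather in Beijing"}'},
}
print(dispatch_tool_call(call))
```

Returning errors as text rather than raising lets the model see the failure in the next turn and adjust its call.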

### 2. Prepare Inputs

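Prepare the inputs for the model call: a request helper (`single_turn_inference`), the endpoint and credentials, and the initial system and user messages.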

In [None]:

import json
import os
import requests

def single_turn_inference(messages, tool_schemas=None):
    """Send one chat completion request and return the parsed JSON (or an error dict)."""
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }

    data = {
        "model": model,
        "messages": messages,
        "max_tokens": 4096,
        "top_p": 0.9,
        # Avoid a mutable default argument; fall back to an empty tool list
        "tools": tool_schemas or [],
        "temperature": 0.0,
        "reasoning_effort": "medium"
    }

    response = requests.post(base_url, headers=headers, json=data)
    try:
        payload = response.json()
    except ValueError:
        # The body is not valid JSON (for example, an HTML error page)
        payload = None
    print(f"Response: {payload if payload is not None else response.text}")

    if response.status_code == 200 and payload is not None:
        return payload

    print(f"Request failed with status code {response.status_code}")
    return {
        "error": f"Request failed with status code {response.status_code}",
        "details": response.text
    }

        
base_url = "https://ark.cn-beijing.volces.com/api/v3/chat/completions"
api_key = os.getenv('ARK_API_KEY')
model = os.getenv('ARK_MODEL_ENDPOINT')

messages = [
    {
        "role": "system",
        "content": "You are an agent designed to accomplish tasks.\n\nYour goal is to provide a comprehensive, objective, and well-researched answer to the user's query. You have access to multiple tools. Please read each tool's description and parameters carefully. There is no limit on how many times tools can be used. Please reason and respond in the same language as the user's question, unless the user explicitly requests otherwise. Put your final answer between <answer> and </answer>."
    },
    {
        "role": "user",
        "content": "Search weather in Beijing"
    }
]
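The system prompt asks the model to put its final answer between `<answer>` and `</answer>`. A minimal sketch of how that answer might be pulled out of the reply text follows; `extract_answer` is a hypothetical helper, and the fallback for an unclosed tag is an assumption about how truncated replies could be handled.

```python
import re
from typing import Optional


def extract_answer(content: str) -> Optional[str]:
    """Return the text inside <answer>...</answer>, or None if no tag is present."""
    match = re.search(r"<answer>(.*?)</answer>", content, flags=re.DOTALL)
    if match:
        return match.group(1).strip()
    # Fall back to an opening tag without a closing tag (e.g. a truncated reply).
    _, found, tail = content.partition("<answer>")
    return tail.strip() if found else None


print(extract_answer("Reasoning... <answer>Sunny, 25 degrees</answer>"))
```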

### 3. Single-Turn Inference

This section demonstrates a single-turn interaction with the model: the prepared inputs are sent to the model, which then decides whether to call a tool or provide a direct answer.
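The two possible outcomes can be told apart from the `finish_reason` and `tool_calls` fields of the reply. The sketch below assumes the OpenAI-style response shape shown in this notebook's outputs; `interpret_choice` and the `kind` labels are hypothetical names used only for illustration.

```python
import json


def interpret_choice(choice: dict) -> dict:
    """Classify one entry of `choices` as a tool request or a direct answer."""
    message = choice.get("message", {})
    tool_calls = message.get("tool_calls") or []
    if choice.get("finish_reason") == "tool_calls" and tool_calls:
        actions = [
            {
                "action_type": call["function"]["name"],
                # Arguments arrive as a JSON-encoded string.
                "action_inputs": json.loads(call["function"]["arguments"]),
            }
            for call in tool_calls
        ]
        return {"kind": "tool_calls", "actions": actions}
    # Anything else is treated as a direct answer.
    return {"kind": "answer", "content": message.get("content") or ""}


sample = {
    "finish_reason": "tool_calls",
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": "call_demo",
                "type": "function",
                "function": {"name": "Search", "arguments": '{"query":"weather in Beijing"}'},
            }
        ],
    },
}
print(interpret_choice(sample))
```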


In [6]:
# Single-turn Model inference
response = single_turn_inference(messages, tool_schemas)

# Output model prediction results
print("\n" + "=" * 40)
print("🌟 🌟 🌟  **Model Prediction**  🌟 🌟 🌟")
print("=" * 40)
print(json.dumps(response, indent=4))  # Format JSON output

# Parse tool calls
actions = []
print("\n" + "=" * 40)
print("🔧 **Tool Calls** 🔧")
print("=" * 40)
# The message has no "tool_calls" entry when the model answers directly
for tool_call in response["choices"][0]["message"].get("tool_calls") or []:
    print("\n📌 Tool Call:")
    print(json.dumps(tool_call, indent=4, ensure_ascii=False))  # Format JSON output
    function_name = tool_call["function"]["name"]
    function_arguments = tool_call["function"]["arguments"]
    function_arguments = json.loads(function_arguments)
    actions.append(
        {
            "action_type": function_name,
            "action_inputs": function_arguments
        }
    )

Response: {'choices': [{'finish_reason': 'tool_calls', 'index': 0, 'logprobs': None, 'message': {'content': '', 'reasoning_content': 'Got it, the user wants to search the weather in Beijing. First, I need to use the Search tool with the query "weather in Beijing". Let me make sure the parameters are correct, the query should be exactly that. Alright, let\'s call the Search function now.', 'role': 'assistant', 'tool_calls': [{'function': {'arguments': '{"query":"weather in Beijing"}', 'name': 'Search'}, 'id': 'call_15e4fxv7ek54guj4q4f9fgsm', 'type': 'function'}]}}], 'created': 1765521711, 'id': '021765521709307cd417e3a5d5f9b5e363d704a28c424921c4382', 'model': 'doubao-seed-1-8-preview-251115', 'service_tier': 'default', 'object': 'chat.completion', 'usage': {'completion_tokens': 86, 'prompt_tokens': 670, 'total_tokens': 756, 'prompt_tokens_details': {'cached_tokens': 0}, 'completion_tokens_details': {'reasoning_tokens': 58}}}

üåü üåü üåü  **Model Prediction**  üåü üåü üåü
{
    "c

### 4. Multi-Turn Inference

Multi-turn inference runs in a loop for at most `max_steps` steps. In each turn, the model assesses the conversation history and decides whether to provide a final answer or call a tool (e.g., `Search`, `Open_url`) to gather more information. The tool's output is then added to the history, enriching the context for the next turn. To manage context length, a `FIFOCondenser` discards the oldest assistant and tool messages while always keeping system and user messages, simulating a short-term memory.
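The FIFO idea can be sketched independently of the notebook's `FIFOCondenser` class. The version below is a simplified variant, not the class used in this cookbook: `trim_history` is a hypothetical function that keeps every system and user message plus the `keep` most recent assistant and tool messages.

```python
def trim_history(messages: list, keep: int) -> list:
    """Keep all system/user messages plus the `keep` most recent other messages."""
    # Indices of messages that are allowed to be dropped, oldest first.
    droppable = [i for i, m in enumerate(messages) if m["role"] not in ("system", "user")]
    to_drop = set(droppable[:-keep]) if keep > 0 else set(droppable)
    return [m for i, m in enumerate(messages) if i not in to_drop]


history = [
    {"role": "system", "content": "s"},
    {"role": "user", "content": "u"},
    {"role": "assistant", "content": "a1"},
    {"role": "tool", "content": "t1"},
    {"role": "assistant", "content": "a2"},
    {"role": "tool", "content": "t2"},
]
print([m["content"] for m in trim_history(history, keep=2)])
```

One caveat with any trimming of this kind: dropping an assistant message that requested a tool call while keeping the matching tool message can leave an orphaned tool result, so it may be safer to choose `keep` so that such pairs are dropped together.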

This iterative “think, act, observe” cycle mirrors human problem-solving. It allows the agent to recover from failed steps and adjust its strategy, which can improve task success rates on multi-step tasks.
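The shape of this cycle can be illustrated without any network call. The sketch below replaces the model with `scripted_model`, a hypothetical stand-in that requests one search and then answers; `run_loop` and the mock observation text are likewise assumptions made for illustration.

```python
import json


def scripted_model(messages: list) -> dict:
    """Hypothetical stand-in for the model: search once, then answer."""
    has_observation = any(m["role"] == "tool" for m in messages)
    if not has_observation:
        return {
            "finish_reason": "tool_calls",
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [{
                    "id": "call_demo_1",
                    "type": "function",
                    "function": {"name": "Search", "arguments": json.dumps({"query": "demo"})},
                }],
            },
        }
    return {
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "<answer>done</answer>"},
    }


def run_loop(question: str, max_steps: int = 3) -> list:
    """Run the think/act/observe cycle and return the full message history."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        choice = scripted_model(messages)  # think
        messages.append(choice["message"])
        if choice["finish_reason"] == "stop":
            break
        for call in choice["message"]["tool_calls"]:  # act
            observation = f"mock result for {call['function']['arguments']}"
            # observe: one tool message per tool call, matched by id
            messages.append({"role": "tool", "tool_call_id": call["id"], "content": observation})
    return messages


history = run_loop("demo question")
print([m["role"] for m in history])
```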


In [None]:
# Multi-turn Model inference

class FIFOCondenser:
    """Short-term memory: drops the oldest assistant/tool messages once the history is too long."""

    def __init__(self, max_queue_length: int = 9999, min_queue_length: int = 1) -> None:
        self.max_queue_length = max_queue_length
        self.min_queue_length = min_queue_length

    def condense(
            self, conversations: list[dict], **kwargs
    ) -> list[dict]:
        # Nothing to do while the history fits within the upper bound
        if len(conversations) <= self.max_queue_length:
            return conversations

        new_conversations = []
        # Drop enough of the oldest messages to shrink the history to min_queue_length.
        # The caller keeps the full history, so the count depends on its current length.
        num_2b_popped = len(conversations) - self.min_queue_length
        for conversation in conversations:
            # System and user messages are always kept
            if conversation["role"] in ["system", "user"]:
                new_conversations.append(conversation)
                continue
            if num_2b_popped > 0:
                num_2b_popped -= 1
                continue
            new_conversations.append(conversation)

        return new_conversations


def multi_turn_inference(instruction, max_steps=5):
    """
    Perform multi-turn inference to accomplish a task based on the given instruction.

    Args:
        instruction (str): The user's query or task instruction.
        max_steps (int): Maximum number of steps allowed for the inference process.

    Returns:
        str: The final answer provided by the agent.
    """
    # Initialize the conversation with system and user messages
    messages = [
        {
            "role": "system",
            "content": (
                "You are an agent designed to accomplish tasks.\n\n"
                "Your goal is to provide a comprehensive, objective, and well-researched answer to the user's query. "
                "You have access to multiple tools. Please read each tool's description and parameters carefully. "
                "There is no limit on how many times tools can be used. Please reason and respond in the same language "
                "as the user's question, unless the user explicitly requests otherwise. "
                "Put your final answer between <answer> and </answer>."
            )
        },
        {
            "role": "user",
            "content": instruction
        }
    ]

    # Initialize variables
    finished = False
    cur_step = 0
    answer = None
    # Short-term memory condenser; the recommended settings are max_queue_length=30, min_queue_length=20
    fifo_condenser = FIFOCondenser(max_queue_length=30, min_queue_length=20)

    while not finished and cur_step < max_steps:
        print(f"\n{'=' * 40}")
        print(f"🔄 Step {cur_step + 1}/{max_steps}")
        print(f"{'=' * 40}")

        # If it's the last step, force the model to output the final answer
        if cur_step + 1 == max_steps:
            messages.append(
                {
                    "role": "user",
                    "content": (
                        "You must absolutely not perform any FunctionCall, tool invocation, Search, Open_url, code execution, or similar actions. "
                        "You can only answer the original question based on the information already retrieved and your own internal knowledge. "
                        "Please put your final answer only inside <answer>...</answer> tags, and do not output anything else outside the tags. "
                        "If you attempt to call any tool or FunctionCall, it will be considered a mistake."
                    )
                }
            )

        # Condense short-term memory to manage message history
        condensed_messages = fifo_condenser.condense(messages)

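        # Perform single-turn inference
        response = single_turn_inference(condensed_messages, tool_schemas)
        if "choices" not in response:
            # The request failed and single_turn_inference returned an error dict
            print(f"Request error: {response.get('error')}")
            break
        messages.append(response["choices"][0]["message"])  # Append the model's response to the conversation history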

        # Check the finish reason to determine the next action
        finish_reason = response["choices"][0]["finish_reason"]
        if finish_reason == "stop":
            # The model has provided the final answer
            finished = True
            # Strip both answer tags so only the answer text is returned
            answer = response["choices"][0]["message"]["content"].replace("<answer>", "").replace("</answer>", "").strip()
            print("✅ Final answer received.")
        elif finish_reason == "tool_calls":
            # The model has requested tool usage
            tool_calls = response["choices"][0]["message"]["tool_calls"]
            for tool_call in tool_calls:
                tool_name = tool_call["function"]["name"]
                tool_args = json.loads(tool_call["function"]["arguments"])
                tool_call_id = tool_call["id"]
                # Mock tool response (replace with actual tool execution if needed)
                tool_response = f"🔧 Tool response from: {tool_name}({json.dumps(tool_args, ensure_ascii=False)})"
                # Each tool call gets its own tool message, matched by tool_call_id
                messages.append({"role": "tool", "content": tool_response, "tool_call_id": tool_call_id})
                print(f"🔧 Tool call made: {tool_response}")
        else:
            # Handle unknown finish reasons
            print(f"⚠️ Unknown finish_reason: {finish_reason}")
            break

        # Print the response for debugging
        print(f"📝 Response: {json.dumps(response, indent=4, ensure_ascii=False)}")
        cur_step += 1

    # Final output
    if finished:
        assert answer is not None
        print(f"\n🎉 Task completed in {cur_step} steps!")
        print(f"✅ Final Answer: {answer}")
    else:
        print(f"\n❌ Failed to complete the task within {cur_step} steps.")

    return answer


# Example usage
answer = multi_turn_inference("Search the history news about the stock market. If the tool is not working, answer with your own knowledge.")


🔄 Step 1/5
Response: {'choices': [{'finish_reason': 'tool_calls', 'index': 0, 'logprobs': None, 'message': {'content': '', 'reasoning_content': 'Got it, let\'s tackle this query about historical stock market news. First, I should start by searching for key historical stock market events and their related news. Let me use the Search tool with a query like "major historical stock market news events timeline" to get a good overview.', 'role': 'assistant', 'tool_calls': [{'function': {'arguments': '{"query":"major historical stock market news events timeline"}', 'name': 'Search'}, 'id': 'call_n1u1lyqx1wxzakbko9aapxzx', 'type': 'function'}]}}], 'created': 1765521713, 'id': '021765521711773dae410d06a911c4f7bd0a80e7e433f8a6939f9', 'model': 'doubao-seed-1-8-preview-251115', 'service_tier': 'default', 'object': 'chat.completion', 'usage': {'completion_tokens': 90, 'prompt_tokens': 688, 'total_tokens': 778, 'prompt_tokens_details': {'cached_tokens': 0}, 'completion_tokens_details': {'reasoni