<a href="https://colab.research.google.com/github/Git-Godssoldier/GenAI_Agents/blob/main/EductationaL%20Search.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

The tutorial will leverage:

- **Perplexity Sonar** for search
- **Mistral's OCR model** for document understanding
- **An easy crypto data API** for financial metrics
- **Gemini-2.0-flash** for basic tasks
- **GPT-4.5-preview** for synthesis tasks
- **LangGraph** for AI agent workflows
- **LangSmith** for observability (optional)


# **DeepSearch Tutorial:** AI-Powered Crypto Company Historical Performance Analysis

Welcome to this hands-on tutorial where we will use **DeepSearch** – an AI-driven search and analysis approach – to evaluate a crypto company's historical performance. We’ll build a **Jupyter/Colab-style** pipeline that **searches, reads, and reasons** through various data sources (web info, documents, and market data) to produce an investment-oriented analysis. The target audience includes financial analysts, data scientists, investors, and even junior crypto enthusiasts with **no AI background**, so we’ll explain every step in clear terms.

**What is DeepSearch?** It’s an advanced technique where an AI agent iteratively **searches for information, analyzes it, and combines insights** to answer complex queries ([DeepSearch by Jina.ai: The breakthrough in AI search for complex queries! - ai-rockstars.com](https://ai-rockstars.com/jina-ai-deeepsearch/#:~:text=DeepSearch%20is%20characterized%20by%20an,and%20quality%20of%20information%20processing)). Think of it as a smart analyst that can scour the web, read reports, and do calculations before giving you a well-founded answer.

This goes beyond standard one-shot Q&A by looping through **read-search-reason** cycles until it gathers high-quality evidence ([DeepSearch by Jina.ai: The breakthrough in AI search for complex queries! - ai-rockstars.com](https://ai-rockstars.com/jina-ai-deeepsearch/#:~:text=DeepSearch%20is%20characterized%20by%20an,and%20quality%20of%20information%20processing)).

DeepSearch systems often accept very large context (hundreds of thousands of tokens) and take longer to respond (tens of seconds) because they’re doing deeper analysis ([DeepSearch by Jina.ai: The breakthrough in AI search for complex queries! - ai-rockstars.com](https://ai-rockstars.com/jina-ai-deeepsearch/#:~:text=DeepSearch%20is%20characterized%20by%20an,and%20quality%20of%20information%20processing)). The result, however, is a **comprehensive, citation-backed** answer – much like a detailed research report.

In this tutorial, we’ll recreate that DeepSearch magic for analyzing a crypto company's performance. We’ll leverage several AI tools and models in a pipeline orchestrated with **LangGraph**, a framework for building multi-step AI workflows.

By the end, you’ll see how to:

- **Set up the environment** with all necessary libraries and API keys.
- **Retrieve background data** using _Perplexity Sonar_, a search engine API that provides real-time info with citations ([Sonar by Perplexity](https://sonar.perplexity.ai/#:~:text=Build%20with%20the%20best%20AI,Get%20started%20in%20minutes)).
- **Extract insights from reports** using _Mistral’s OCR model_, which can read scanned financial documents and understand their content.
- **Fetch structured metrics** via a simple crypto data API (for prices, volumes, etc.).
- **Combine AI agents** (using a fast reasoning model like _Gemini-2.0-Flash_ and a powerful synthesis model like _GPT-4.5-Preview_) to analyze the gathered data.
- **Orchestrate the workflow** with LangGraph, allowing the AI to decide what tool to use at each step (search, OCR, data fetch) and when to stop and deliver results.
- **Monitor and debug** the process using _LangSmith_ observability tools (to trace what the AI is doing under the hood).
- **Present the final insights** in a concise summary that could aid investment decisions.

We’ll walk through the code step-by-step. **No prior AI knowledge is required.**

Let’s dive in!

## **1. Setting Up the Environment**

First, we need to install and import the necessary libraries and tools. This includes:

- **LangGraph** – for orchestrating our AI agents in a graph workflow.
- **LangChain/Community** components – to provide the building blocks for tools (LangGraph is part of the LangChain ecosystem).
- **OpenAI SDK** (or similar) – to access GPT-4.5 (we’ll use the OpenAI interface for simplicity).
- **Perplexity’s Sonar API** – for web search capabilities.
- **OCR tools** – like `pytesseract` for basic OCR (Optical Character Recognition) and possibly integration with Mistral’s model.
- **Crypto data API client** – e.g., a CoinGecko API wrapper for market data.
- **LangSmith** – for observability (optional, used for tracing agent steps).

We'll also ensure any system dependencies (like Tesseract OCR engine) are installed, if needed.

**Before running the code below:** Have the required API keys ready:

- A **Perplexity Sonar API key** (sign up on Perplexity AI for developer access).
- An **OpenAI API key** (for GPT-4.5, if using OpenAI’s service).
- (Optional) A **LangSmith API key** if you plan to use LangSmith for tracing.
- (Optional) A **CoinGecko API key** (CoinGecko’s public API doesn’t require one for basic use.)

Now let's install and import everything:

```bash
!pip install langgraph langchain langchain-openai openai pycoingecko pytesseract Pillow
# If using Jupyter, you might need to install system packages for OCR:
# For example, on Debian/Ubuntu:
# !apt-get update && apt-get install -y tesseract-ocr libtesseract-dev
```

```python
# Import necessary libraries
import os

from langchain_core.tools import tool
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI            # chat model wrapper with tool-calling support
from langgraph.prebuilt import create_react_agent  # prebuilt tool-calling agent graph

import openai
from pycoingecko import CoinGeckoAPI
import pytesseract
from PIL import Image

# If using LangSmith for observability
from langchain_core.tracers import LangChainTracer

# Set API keys (replace the placeholders with actual keys)
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # GPT-4.5; read by openai and langchain-openai
SONAR_API_KEY = "YOUR_SONAR_API_KEY"                  # Perplexity Sonar
# (Assume Perplexity Sonar uses Bearer token authentication through an OpenAI-compatible interface)
```

We installed:

- `langgraph` and `langchain` for agent orchestration.
- `langchain-openai` for LangChain's OpenAI chat model wrapper.
- `openai` for GPT model access.
- `pycoingecko` for crypto data.
- `pytesseract` and `Pillow` for OCR (to handle image-to-text conversion).

We then imported:

- The `tool` decorator from LangChain (`langchain_core`) and `create_react_agent` from LangGraph to define our agent’s tools and workflow.
- `ChatPromptTemplate` and `MessagesPlaceholder` to help build dynamic prompts for the agent (especially for the Oracle, which we’ll define later).
- `ChatOpenAI`, LangChain's OpenAI chat model class, for convenience (though we might call the API directly via `openai`).
- The OpenAI Python SDK and other tools like the CoinGecko API client and pytesseract for OCR.

Finally, we set up API keys for OpenAI and Sonar. Sonar offers a developer API for search; it can be used via a REST call or via an OpenAI-compatible interface (Perplexity designed it to feel like an OpenAI chat completion with a custom model) ([Perplexity Sonar API: Reliable, Scalable, and Developer-Friendly](https://www.analyticsvidhya.com/blog/2025/01/perplexity-sonar-api/#:~:text=client%20%3D%20OpenAI%28api_key%3DYOUR_API_KEY%2C%20base_url%3D)). In this tutorial, we simulate its usage with a mock.

**Note:** If you actually run this notebook, remember to **insert your real API keys** above. Do not share them publicly.

## **2. Building the DeepSearch Pipeline with LangGraph**

Now that our environment is ready, let's outline the **DeepSearch pipeline**. The workflow consists of multiple steps (or tools) that the AI can use, orchestrated by a central decision-maker agent (called the “Oracle” in Pinecone's LangGraph research-agent tutorial ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=Defining%20the%20Oracle))).

**Key components of our pipeline:**

- **Oracle (Decision Maker):** A GPT-4.5-powered agent that will decide which tool to use at each step, based on the user’s query and intermediate results. This is the brain of the operation, orchestrating the flow ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=Defining%20the%20Oracle)).
- **Tool 1 – Web Search (Sonar):** Searches the web for relevant info (news, articles, blogs) about the crypto company. Powered by Perplexity Sonar API for up-to-date, citation-backed results ([Sonar by Perplexity](https://sonar.perplexity.ai/#:~:text=Build%20with%20the%20best%20AI,Get%20started%20in%20minutes)).
- **Tool 2 – Document OCR & QA (Mistral OCR):** Reads a scanned financial report or document related to the company (e.g., an annual report PDF) and extracts key info. We’ll use an OCR engine plus an LLM (like Mistral’s Pixtral model) to interpret the text ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR)).
- **Tool 3 – Financial Data Fetch (Crypto API):** Pulls quantitative data (price history, trading volume, etc.) via a crypto data API (CoinGecko in our case). This gives us structured performance metrics.
- **Tool 4 – Quick Analysis (Gemini-2.0-Flash):** A fast LLM used for small tasks or calculations. _Gemini 2.0 Flash_ is Google’s efficient model designed for low-latency tasks ([Gemini 2.0 model updates: 2.0 Flash, Flash-Lite, Pro Experimental](https://blog.google/technology/google-deepmind/gemini-model-updates-february-2025/#:~:text=Experimental%20blog,reason%20through%20more%20complex%20problems)) ([The next chapter of the Gemini era for developers - Google Developers Blog](https://developers.googleblog.com/en/the-next-chapter-of-the-gemini-era-for-developers/#:~:text=3,party%20functions%20via%20function%20calling)). We might use it to, say, calculate percentage changes or categorize sentiment quickly.
- **Tool 5 – Final Answer Synthesizer (Report Generator):** This is not a tool that fetches new data, but rather a formatter that compiles all findings into a coherent report. We will use GPT-4.5 here to synthesize the final answer, ensuring a well-structured output for the user.

Using LangGraph, we can define each of these as nodes in a graph. The Oracle node will have edges connecting to each tool node. On each iteration, the Oracle decides which tool is needed next (it might search first, then fetch data, then perhaps search again, etc., in a loop) ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=action%20steps,step%20like%20searching%20the%20internet)). This is similar to the ReAct (Reason+Act) loop but implemented as a graph for more control. The process continues until the Oracle decides it has enough info and calls the Final Answer tool to produce the report ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=)) ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=%40tool%28,in%20the%20form%20of%20a)).
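To make the loop concrete before introducing any library, here is a minimal, library-free sketch of the Oracle pattern. It is not LangGraph's implementation: `TOOLS`, `run_oracle_loop`, and `scripted_decide` are hypothetical names, and a hard-coded policy stands in for the LLM's decisions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the real tools; each takes and returns a string.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_web": lambda q: f"search results for {q}",
    "fetch_market_data": lambda a: f"market data for {a}",
    "final_report": lambda s: s,
}

Step = Tuple[str, str, str]  # (tool name, tool argument, tool output)


def run_oracle_loop(decide, query: str, max_steps: int = 5) -> str:
    """Ask `decide` for the next (tool, argument) until it picks final_report."""
    scratchpad: List[Step] = []
    for _ in range(max_steps):
        tool_name, tool_arg = decide(query, scratchpad)
        output = TOOLS[tool_name](tool_arg)
        scratchpad.append((tool_name, tool_arg, output))
        if tool_name == "final_report":
            return output
    # Safety valve so a confused policy cannot loop forever.
    return "Stopped: step limit reached before a final report."


def scripted_decide(query: str, scratchpad: List[Step]) -> Tuple[str, str]:
    """Toy policy standing in for the LLM: search, fetch data, then report."""
    if len(scratchpad) == 0:
        return "search_web", query
    if len(scratchpad) == 1:
        return "fetch_market_data", "Binance"
    evidence = "; ".join(step[2] for step in scratchpad)
    return "final_report", f"Report based on: {evidence}"


print(run_oracle_loop(scripted_decide, "Binance performance"))
```

In the real pipeline, the LLM plays the role of `scripted_decide`, and the scratchpad corresponds to the message history that the agent framework keeps between steps. The `max_steps` cap is a design choice worth keeping in any agent loop, since it bounds cost if the model never decides to finish.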

Let's implement each tool one by one using the `@tool` decorator, which comes from LangChain (`langchain_core.tools`) and works directly with LangGraph agents.

### **2.1 Defining Tools for Data Retrieval and Analysis**

#### **Tool 1: Web Search with Perplexity Sonar**

This tool will accept a search query (string) and return a summary of results. Under the hood, it will call the Sonar API. Perplexity’s Sonar API provides **real-time web search** with results that often include brief snippets and citations ([Sonar by Perplexity](https://sonar.perplexity.ai/#:~:text=Build%20with%20the%20best%20AI,Get%20started%20in%20minutes)). We can use it similarly to how we’d use an LLM with tools – in fact, Sonar has a chat completion interface where the `model="sonar-pro"` can be used with the OpenAI client ([Perplexity Sonar API: Reliable, Scalable, and Developer-Friendly](https://www.analyticsvidhya.com/blog/2025/01/perplexity-sonar-api/#:~:text=client%20%3D%20OpenAI%28api_key%3DYOUR_API_KEY%2C%20base_url%3D)).

For simplicity, we’ll implement a mock version: it will print that it’s searching and return a dummy snippet. In a real setting, you would use `requests` or the OpenAI client with `base_url="https://api.perplexity.ai"` to get actual results ([Perplexity Sonar API: Reliable, Scalable, and Developer-Friendly](https://www.analyticsvidhya.com/blog/2025/01/perplexity-sonar-api/#:~:text=client%20%3D%20OpenAI%28api_key%3DYOUR_API_KEY%2C%20base_url%3D)).

```python
@tool("search_web")
def search_web(query: str) -> str:
    """Search the web for information on the query using Perplexity Sonar."""
    # In a real implementation, call the Sonar API through the OpenAI client (openai>=1.0):
    # client = openai.OpenAI(api_key=SONAR_API_KEY, base_url="https://api.perplexity.ai")
    # response = client.chat.completions.create(
    #     model="sonar-pro",
    #     messages=[{"role": "user", "content": query}],
    # )
    # return response.choices[0].message.content

    print(f"[Sonar] Searching for: {query}")
    # Dummy placeholder response
    return f"Top results for '{query}': ... (snippet of relevant info) ..."
```

Explanation: We define `search_web` as a tool using `@tool("search_web")`. The docstring acts as the description for the agent. The commented-out lines show how a real integration would look: Sonar is used like an OpenAI chat completion by specifying `model="sonar-pro"` and pointing the client's base URL at Perplexity’s endpoint ([Perplexity Sonar API: Reliable, Scalable, and Developer-Friendly](https://www.analyticsvidhya.com/blog/2025/01/perplexity-sonar-api/#:~:text=client%20%3D%20OpenAI%28api_key%3DYOUR_API_KEY%2C%20base_url%3D)). For now, this function just prints a log line and returns a dummy string.

The agent will use this tool whenever it needs more information from the web. For example, if the user asks about _"Binance’s historical performance"_, the Oracle might call `search_web("Binance historical performance key events")` as a first step.
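If you want to move from the mock to a live call, one possible shape is sketched below. It assumes an OpenAI-compatible client object (anything exposing `chat.completions.create`); `sonar_search` and the fake client are hypothetical names introduced here so the sketch runs offline, and the real client construction is shown only in a comment.

```python
from types import SimpleNamespace


def sonar_search(client, query: str, model: str = "sonar-pro") -> str:
    """Send one user message to an OpenAI-compatible chat endpoint; return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content


# Real usage (assumes the `openai` package, version 1.0 or later, and a valid key):
#   from openai import OpenAI
#   client = OpenAI(api_key=SONAR_API_KEY, base_url="https://api.perplexity.ai")
#   print(sonar_search(client, "Binance historical performance key events"))


class _FakeCompletions:
    """Offline stand-in that mimics the shape of a chat completion response."""

    def create(self, model, messages):
        text = f"[{model}] echo: {messages[-1]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=_FakeCompletions()))
print(sonar_search(fake_client, "Binance history"))
```

Passing the client in as an argument keeps the function easy to test and lets the same code talk to Perplexity or to any other OpenAI-compatible endpoint.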

#### **Tool 2: Document OCR and Q&A with Mistral’s Model**

This tool will handle a scanned document input. Let’s say we have a PDF of a company's annual report or a screenshot of a financial statement. The tool’s job is to extract text from it (OCR) and possibly pull relevant insights (like key financial figures or management commentary).

We’ll break it into two sub-tasks:

- **OCR extraction:** Use `pytesseract` (Tesseract OCR) to get raw text from an image or PDF page.
- **LLM analysis:** Use an LLM to either summarize that text or answer specific questions from it.

In practice, **Mistral’s Pixtral-12B model** could handle both steps in one go, since it’s a multimodal model that understands images and text together ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR)). Pixtral-12B can take an image input and directly answer questions about it or summarize it, thanks to its 128k token context that can encompass large documents ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR)) ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=,your%20specific%20OCR%20use%20cases)). For our tutorial, we’ll simulate the two-step process: extract text then summarize via GPT (for simplicity of implementation).

Let's implement a simplified version. We’ll assume the input is a path to an image (or a page from a PDF converted to image). Our tool will return a summary of that document’s content.

```python
@tool("analyze_document")
def analyze_document(image_path: str) -> str:
    """Extract text from a scanned document and summarize key points."""
    # Step 1: OCR to get text
    try:
        img = Image.open(image_path)
        text = pytesseract.image_to_string(img)
    except Exception as e:
        return f"Error: Document could not be read ({e})."

    if not text.strip():
        return "Error: No text found in document."

    # Step 2: Summarize or extract specific info using an LLM (GPT-4.5 here for brevity)
    prompt = (
        "You are a financial analyst. You have extracted the following text from a document:\n\n"
        + text[:1000] + "...\n\n"  # take first 1000 chars for demo
        + "Summarize the key insights relevant to company performance."
    )
    # In a real scenario, call the GPT-4.5 API through the OpenAI client (openai>=1.0):
    # client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
    # summary_response = client.chat.completions.create(
    #     model="gpt-4.5-preview",
    #     messages=[{"role": "user", "content": prompt}]
    # )
    # summary = summary_response.choices[0].message.content
    # For this demo, we'll mock the summary:
    summary = "The document indicates revenue grew 20% YoY and mentions expanding user base and new product launches..."
    return summary
```

Explanation: Our `analyze_document` tool takes an `image_path` (for simplicity, treating each page as an image file). It opens the image using PIL and runs Tesseract OCR to get the text. We then construct a prompt that asks the model to act as a financial analyst and summarize key insights about performance from that text. In a real run, we would send this prompt to a GPT-4.5 model through the OpenAI chat completions API to get the summary. Here, we just return a mocked summary string for demonstration.

This tool could be invoked by the agent if, for example, the user query or a prior step indicated there’s a relevant PDF (say _“Q4 2022 Financial Statement”_) that the agent should read. The integration of an OCR+LLM aligns with how **Mistral’s OCR model** might be used – in fact, Mistral AI’s vision-language model can directly handle such tasks, outperforming many closed-source models on document question-answering ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR)).
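The demo above sends only the first 1000 characters of OCR text to the model. For longer documents, one common alternative is to split the text into overlapping chunks and summarize each one. The helper below is a minimal sketch of that idea; `chunk_text` and its default sizes are assumptions for illustration, not part of any library used in this tutorial.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


ocr_text = "x" * 2500  # placeholder for text returned by pytesseract
for i, chunk in enumerate(chunk_text(ocr_text)):
    print(f"chunk {i}: {len(chunk)} characters")
```

Each chunk could then be summarized separately and the partial summaries combined in a final LLM call. Character-based splitting is the simplest option; splitting on paragraph or page boundaries usually gives cleaner chunks for financial reports.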

#### **Tool 3: Fetching Financial Data via Crypto API**

Next, we create a tool to fetch structured data like historical prices, trading volumes, market cap, etc., for the crypto company’s token or related asset. For instance, if the company is _Binance_, we might pull data for the BNB token; if it’s _Coinbase_, maybe we get Coinbase stock prices (though that’s not via a crypto API, but let’s assume our “crypto company” has a token).

We’ll use the CoinGecko API via the `pycoingecko` client. CoinGecko provides free public endpoints for cryptocurrency data. We might use:

- `get_coin_history_by_id(coin_id, date)` to get data on a specific date,
- or `get_coin_market_chart_range_by_id` for a date range of prices.

For demonstration, we’ll fetch the current price and a few historical data points for a given coin.

```python
from datetime import datetime, timedelta, timezone

# Initialize the CoinGecko API client
cg = CoinGeckoAPI()

@tool("fetch_market_data")
def fetch_market_data(asset: str) -> str:
    """Fetch historical market data (price & market cap) for the given crypto asset."""
    # In a real implementation, map asset name to CoinGecko ID. For example:
    # if asset.lower() == "binance" or asset.lower() == "bnb":
    #     coin_id = "binancecoin"
    # elif asset.lower() == "bitcoin" or asset.lower() == "btc":
    #     coin_id = "bitcoin"
    # else:
    #     attempt to find coin_id via cg.search or a predefined mapping.
    coin_id = "binancecoin" if asset.lower() in ("binance", "bnb") else asset.lower()

    try:
        # Fetch current data
        current = cg.get_price(ids=coin_id, vs_currencies='usd', include_market_cap='true')
        price_now = current[coin_id]['usd']
        mc_now = current[coin_id]['usd_market_cap']  # a missing key is caught below

        # Fetch historical data for 1 year ago (as an example); CoinGecko expects dd-mm-yyyy
        date_year_ago = (datetime.now(timezone.utc) - timedelta(days=365)).strftime('%d-%m-%Y')
        hist = cg.get_coin_history_by_id(id=coin_id, date=date_year_ago)
        price_then = hist['market_data']['current_price']['usd']
        mc_then = hist['market_data']['market_cap']['usd']
    except Exception as e:
        # Covers network errors, unknown coin IDs, and missing fields in the response
        return f"Error fetching data for {asset}: {e}"

    result = (f"{asset} price now: ${price_now:,.2f}, market cap: ${mc_now:,.0f}\n"
              f"{asset} price one year ago: ${price_then:,.2f}, market cap: ${mc_then:,.0f}")
    return result
```

Explanation: We created `fetch_market_data` to get some key numbers. This function tries to map the `asset` name to CoinGecko’s coin ID (for example, “Binance” -> “binancecoin”). Then it uses the `pycoingecko` client:

- `get_price` to get current price and market cap in USD.
- `get_coin_history_by_id` for one year ago to get historical price and market cap on that date.

It returns a formatted string comparing the two. This is a simplification; in a real analysis, we might fetch a whole range of data and compute metrics like 1-year return, volatility, etc., but a single point comparison keeps it simple for now.
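The richer metrics mentioned above (return, volatility) can be computed from a plain list of daily prices. The sketch below uses only the standard library; the function names are hypothetical, the price series is made-up sample data, and annualizing with 365 periods is an assumption based on crypto markets trading every day.

```python
import math
from statistics import stdev
from typing import List


def total_return(prices: List[float]) -> float:
    """Simple return from the first to the last price, as a fraction."""
    return (prices[-1] - prices[0]) / prices[0]


def annualized_volatility(prices: List[float], periods_per_year: int = 365) -> float:
    """Standard deviation of daily log returns, scaled to a yearly figure."""
    if len(prices) < 3:
        raise ValueError("need at least 3 prices to estimate volatility")
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return stdev(log_returns) * math.sqrt(periods_per_year)


def max_drawdown(prices: List[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction (0.0 if none)."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)
        worst = min(worst, (price - peak) / peak)
    return worst


sample_prices = [250.0, 260.0, 240.0, 280.0, 300.0]  # illustrative values only
print(f"Total return: {total_return(sample_prices):.2%}")
print(f"Annualized volatility: {annualized_volatility(sample_prices):.2%}")
print(f"Max drawdown: {max_drawdown(sample_prices):.2%}")
```

With real data, the price list would come from a range query such as `get_coin_market_chart_range_by_id`, which (to our knowledge) returns `[timestamp, price]` pairs under a `prices` key; check the response shape against the CoinGecko documentation before relying on it.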

If the agent calls `fetch_market_data("Binance")`, it might receive something like:

```
Binance price now: $300.00, market cap: $50,000,000
Binance price one year ago: $250.00, market cap: $40,000,000
```

This textual info can then be used by the analysis to comment on growth.
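The name-to-ID mapping that `fetch_market_data` only hints at in its comments could be sketched as a small lookup table. `KNOWN_COIN_IDS` and `resolve_coin_id` are hypothetical names; the table covers just a few well-known assets and falls back to the lowercased input for anything else.

```python
# Small, hand-maintained alias table (an assumption for illustration).
KNOWN_COIN_IDS = {
    "binance": "binancecoin",
    "bnb": "binancecoin",
    "bitcoin": "bitcoin",
    "btc": "bitcoin",
    "ethereum": "ethereum",
    "eth": "ethereum",
}


def resolve_coin_id(asset: str) -> str:
    """Map a company name, token name, or ticker to a CoinGecko coin ID."""
    key = asset.strip().lower()
    # Unknown names fall through unchanged; the API call will fail if the ID is wrong.
    return KNOWN_COIN_IDS.get(key, key)


for name in ["Binance", " BTC ", "Solana"]:
    print(name.strip(), "->", resolve_coin_id(name))
```

A production version would more likely query the data provider's search endpoint for unknown names instead of guessing, but a static table is enough for a tutorial that analyzes a handful of assets.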

#### **Tool 4: Quick Analysis with Gemini-2.0-Flash**

The _Gemini-2.0-Flash_ model is known as Google’s “workhorse” model for agentic tasks, providing fast responses with decent reasoning ([Gemini: Try Deep Research and Gemini 2.0 Flash Experimental](https://blog.google/products/gemini/google-gemini-deep-research/#:~:text=Earlier%20this%20year%2C%20we%20shared,read%20insights)). We’ll include a tool that uses a hypothetical `gemini_flash` model to do a quick computation or categorization.

For example, we might ask this tool to calculate the percentage change between two numbers or to classify sentiment from text. This isn’t strictly necessary (GPT-4.5 could do it too), but it illustrates using a specialized model for efficiency.

Let’s create a small tool to compute percentage change given old and new values:

```python
@tool("calc_percentage_change")
def calc_percentage_change(old_value: float, new_value: float) -> str:
    """Calculate the percent change from old_value to new_value."""
    try:
        old = float(old_value)
        new = float(new_value)
    except (TypeError, ValueError):
        return "Invalid numbers."
    if old == 0:
        return "Cannot calculate percentage change from 0."
    change = ((new - old) / old) * 100.0
    direction = "increase" if change >= 0 else "decrease"
    return f"{direction} of {abs(change):.2f}%"
```

Explanation: The `calc_percentage_change` tool simply takes two numbers (as strings or floats), converts them, and returns the percent change formatted. If `old_value=250` and `new_value=300`, it would return “increase of 20.00%”.

We might not explicitly call this in our final agent workflow, but it’s an example of a micro-tool where using a smaller or faster model could be advantageous if we had one (Gemini Flash could handle such tasks quickly without bothering the more powerful GPT-4.5).

#### **Tool 5: Final Answer Synthesizer**

Finally, we need a tool to compile everything into a final report. The trick is to use function calling to force the LLM to output in a certain format – basically, the LLM “calls” this final_answer function with arguments that are supposed to be parts of the report (introduction, main findings, conclusion, sources, etc.) ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=%40tool%28,the%20form%20of%20a%20research)) ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=,4%20paragraphs)).

To keep it simple, we won’t split the report into many sections. The final tool accepts a single string (the compiled report text) and returns it; the agent treats this call as the end of execution.

```python
@tool("final_report", return_direct=True)  # return_direct: stop the agent loop and return this tool's output
def final_report(summary: str) -> str:
    """Final report compiled for the user."""
    # This tool just returns the summary as the final answer.
    return summary
```

This is simplistic – effectively, we’ll rely on GPT-4.5 to produce a good `summary` text and call this tool with that text, ending the agent’s work.
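If you later want the sectioned format described above (introduction, findings, conclusion, sources), the function-calling trick amounts to giving the final tool one argument per section. The plain-Python sketch below shows only the formatting side; `build_report` and its parameter names are assumptions for illustration, and the sample strings are dummy values.

```python
from typing import List


def build_report(introduction: str, findings: List[str],
                 conclusion: str, sources: List[str]) -> str:
    """Assemble report sections into one readable text block."""
    lines = ["Introduction:", introduction, "", "Key findings:"]
    lines += [f"- {finding}" for finding in findings]
    lines += ["", "Conclusion:", conclusion]
    if sources:  # omit the section entirely when nothing was cited
        lines += ["", "Sources:"]
        lines += [f"- {source}" for source in sources]
    return "\n".join(lines)


report = build_report(
    introduction="This report reviews the asset's performance over one year.",
    findings=["Price rose 20% (dummy figure)", "User base expanded (dummy figure)"],
    conclusion="Performance was positive in the illustrative data.",
    sources=["search_web result", "fetch_market_data result"],
)
print(report)
```

Declaring these parameters on a tool makes the model fill in every section explicitly, which tends to produce more consistent reports than asking for free-form text.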

Now we have all our tools defined: `search_web`, `analyze_document`, `fetch_market_data`, `calc_percentage_change`, `final_report`. Next, let's integrate them into a **LangGraph agent workflow**.

### **2.2 Orchestrating the Agent with LangGraph**

We will construct an agent graph where:

- The **Oracle node** (the central decision maker) uses GPT-4.5 and has access to all the above tools (functions).
- Each tool is a node that actually executes the function when the Oracle decides to call it.
- After a tool executes, control goes back to the Oracle (this looping continues).
- Eventually, the Oracle decides to call `final_report` with the compiled answer, which ends the process.

We need to set up the Oracle’s prompt to know about the tools. LangGraph can handle this by providing the tool schema to the LLM (similar to OpenAI function calling where the model knows what functions are available).

Let's set up the Oracle:

```python
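# Define the Oracle LLM (GPT-4.5 preview) for decision making.
# ChatOpenAI (from the langchain-openai package) is a chat model that supports tool calling.
from langchain_openai import ChatOpenAI

# The model name is an assumption; substitute any tool-calling chat model you have access to.
oracle_llm = ChatOpenAI(model="gpt-4.5-preview", temperature=0.2)  # lower temp for more deterministic decisions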

# Create a prompt template for the Oracle
system_message = """
You are an expert AI agent orchestrator (the "Oracle"). Your job is to analyze the user's query about a crypto company's historical performance and decide step-by-step how to gather information using available tools.

Available tools:
- search_web(query: str): use this to search the web for background info.
- analyze_document(image_path: str): use this to extract and summarize info from a scanned document.
- fetch_market_data(asset: str): use this to get historical price/market cap data for a crypto asset.
- calc_percentage_change(old_value: float, new_value: float): use this to calculate percent changes.
- final_report(summary: str): use this to present the final findings to the user.

Important:
1. Use tools to gather evidence. You can call multiple tools in sequence if needed (e.g., search the web, then fetch data).
2. After each tool, I (the system) will provide you the result. Incorporate that into your reasoning for next steps.
3. When you have enough information to answer the query comprehensively, call final_report() with a well-structured summary of findings.
4. The final report should include an introduction, key points on performance (both qualitative and quantitative), and a conclusion. It should be easy to read for a non-technical user.
5. Do not call final_report until you have gathered sufficient info.

Now begin.
"""

messages = [
    {"role": "system", "content": system_message},
    # We'll insert {"role": "user", "content": user_query} at runtime, and a scratchpad for steps.
]
```

We created a system prompt instructing the Oracle how to behave. We list all tools and their usage (this is akin to giving function definitions to the model). We emphasize an iterative approach: gather evidence with tools, get results, decide next step, and only finalize when ready. This guidance ensures the model knows the overall plan (search if needed, use data, etc., then final_report).

We’ll use this prompt when building the agent. LangGraph keeps the running message history (the model's tool calls and each tool's result) in the graph state, which plays the role of the “scratchpad” that ReAct uses to track Thoughts and Actions.

Now, let's build the agent graph:

```python
from langgraph.prebuilt import create_react_agent

# Build the Oracle agent from the LLM and tools
tools = [search_web, analyze_document, fetch_market_data, calc_percentage_change, final_report]

agent = create_react_agent(
    oracle_llm,  # must be a chat model that supports tool calling (e.g. ChatOpenAI)
    tools,
    # The system prompt is prepended to the conversation on every model call.
    # Recent LangGraph releases name this argument `prompt`;
    # older releases used `state_modifier` instead.
    prompt=system_message,
)
```

What we did:

- We passed our GPT-4.5 Oracle model and the list of tool functions to `create_react_agent`, LangGraph's prebuilt helper for tool-calling agents.
- Provided the system prompt (which includes tool descriptions).
- Left the choice of tool to the model: the Oracle can use tools freely, calling several in sequence, one at a time ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=The%20Oracle%20consists%20of%20an,that%20separately%20in%20our%20graph)).

Under the hood, the helper builds a graph with a model node and a tool-execution node and loops between them until the model stops requesting tools. A hand-built graph gives more control over each edge, but the prebuilt agent lets us just run it with an input.

Before we run, let's set up LangSmith tracing (if you want to monitor):

```python
# (Optional) Enable LangSmith tracing for observability.
# Set these before the agent runs; LangChain and LangGraph pick them up automatically.
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "YOUR_LANGSMITH_API_KEY"
os.environ["LANGCHAIN_PROJECT"] = "DeepSearchCryptoAnalysis"
```

With a valid LangSmith API key, this records the agent’s steps (each tool call and each LLM call) to the LangSmith dashboard under the `DeepSearchCryptoAnalysis` project for debugging or analysis. It’s optional: skip this cell and the agent runs without tracing.

Alright, let’s test the agent on a sample query!

## **3. Running the DeepSearch Agent on a Query**

For our demonstration, let's assume we want to analyze **Binance** (a major crypto exchange and ecosystem) and its performance over the past years. A user might ask:

**User Query:** “Analyze Binance’s historical performance and key factors that influenced it.”

We’ll feed this to our agent and see how it proceeds. (In a real run, the agent would call the actual APIs and models; here we expect to see our print statements and dummy data as we set up.)

```python
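# Define a user query
user_query = "Analyze Binance’s historical performance and the key factors that influenced it."

# Run the agent: a compiled LangGraph graph is invoked with a dict holding the message list
final_state = agent.invoke({"messages": [{"role": "user", "content": user_query}]})

# The last message in the returned state carries the final answer
result = final_state["messages"][-1].content
print("\n\n=== Final Output ===\n", result)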
```

When you run this, you should see the agent going through steps (the prints we added will show up). It might do something like:

1. Call `search_web` with a query about Binance:
    - We’ll see `[Sonar] Searching for: Binance historical performance key factors` (for instance).
    - It returns a snippet like “Top results... (some info)...”.
2. The Oracle LLM reads that snippet (which we gave as a return) and decides maybe to get numeric data:
    - Calls `fetch_market_data("Binance")`.
    - That fetches current vs year-ago price and market cap. Suppose it prints something like:

        ```
        Binance price now: $300.00, market cap: $50,000,000
        Binance price one year ago: $250.00, market cap: $40,000,000
        ```

    - That is returned to the Oracle.
3. Perhaps the Oracle wants more context and decides to read a document:
    - Calls `analyze_document("binance_report_q4_2022.png")` (if such a path was provided in the query or found via search).
    - Our dummy implementation returns a summary like “revenue grew 20% YoY, expanding user base...”.
4. Now with web info, numbers, and doc summary, the Oracle (GPT-4.5) has enough to compile an answer.
    - It calls `final_report(...)` with a nicely formatted summary string.
    - That triggers the final_report tool which just returns that string as the final result.

The agent manages all of these steps. In our code, `result` captures whatever `final_report` returned, which is the final answer.
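The steps above can be sketched as a plain Python loop. This is a minimal simulation, not the LangGraph implementation: the tool names match the tutorial, but their return values are placeholders, and `scripted_oracle` and `run_agent` are hypothetical stand-ins for the Oracle LLM and the graph runner.

```python
# Dummy tools mirroring the tutorial's tool names; outputs are placeholders.
def search_web(query):
    return "Top results... (some info)..."

def fetch_market_data(name):
    return f"{name} price now: $300.00; one year ago: $250.00"

def analyze_document(path):
    return "revenue grew 20% YoY, expanding user base..."

def final_report(summary):
    return summary

TOOLS = {
    "search_web": search_web,
    "fetch_market_data": fetch_market_data,
    "analyze_document": analyze_document,
    "final_report": final_report,
}

def scripted_oracle(query, history):
    """Hypothetical Oracle: follows a fixed plan instead of calling an LLM."""
    plan = [
        ("search_web", query),
        ("fetch_market_data", "Binance"),
        ("analyze_document", "binance_report_q4_2022.png"),
    ]
    if len(history) < len(plan):
        return plan[len(history)]
    # All evidence gathered: hand the combined notes to final_report.
    notes = " | ".join(output for _, output in history)
    return ("final_report", f"Report for '{query}': {notes}")

def run_agent(query, max_steps=10):
    """Call tools chosen by the oracle until final_report is reached."""
    history = []
    for _ in range(max_steps):
        tool_name, tool_arg = scripted_oracle(query, history)
        output = TOOLS[tool_name](tool_arg)
        if tool_name == "final_report":
            return output, history
        history.append((tool_name, output))
    raise RuntimeError("Agent did not produce a final report within max_steps")

result, steps = run_agent("Binance historical performance key factors")
print([name for name, _ in steps])
print(result)
```

The `max_steps` guard is a design choice worth keeping in a real agent too, since an LLM-driven Oracle is not guaranteed to call `final_report` on its own.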

For demonstration, since we used dummy values, let's simulate what the **Final Output** might look like:

```text
=== Final Output ===
**Binance Historical Performance Analysis**

**Introduction:**
Binance has grown from a startup exchange into one of the largest players in the crypto industry. This analysis reviews its historical performance and key factors influencing its growth.

**Growth Metrics:**
- *Market Growth:* Binance Coin (BNB) price increased by **20%** over the past year (from ~$250 to ~$300), with market capitalization rising from about $40B to $50B, reflecting investor confidence and platform growth.
- *User Base & Volume:* The company’s user base expanded significantly, as noted in the 2022 annual report, contributing to higher trading volumes and revenue (report indicated **20% YoY revenue growth** ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR))).

**Key Influencing Factors:**
- *Product Launches:* Binance's introduction of new products (like futures trading and staking services) attracted more users and increased engagement on the platform.
- *Market Conditions:* Bullish crypto market phases (e.g., late 2020 and 2021) boosted Binance’s transaction volumes and fee revenues, while bear markets tested its resilience.
- *Regulatory Environment:* Regulatory crackdowns in certain countries posed challenges. Binance’s agility in responding (such as implementing stricter KYC and regional compliance efforts) helped it maintain global operations.
- *Security Incidents:* Past security incidents (like the 2019 hack) led to short-term dips in performance. However, Binance’s quick use of its SAFU insurance fund and improved security measures helped restore user trust, allowing performance to recover.

**Conclusion:**
Over the years, Binance demonstrated robust growth driven by product innovation and overall crypto market expansion. While facing challenges like regulatory scrutiny and security issues, the company’s proactive responses have largely mitigated long-term negative impacts. As a result, Binance has solidified its position with sustained user growth, expanding market capitalization, and a central role in the cryptocurrency ecosystem.

*Sources:* This summary is based on web search results ([Sonar by Perplexity](https://sonar.perplexity.ai/#:~:text=Build%20with%20the%20best%20AI,Get%20started%20in%20minutes)), Binance’s 2022 financial reports (OCR extracted) ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR)), and historical market data from CoinGecko.
```

_(The above is an example of what the final answer might contain, combining qualitative and quantitative insights. In a live run, exact wording will vary, and the agent would include real citations if available.)_
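The growth figures in the sample output are simple percentage changes. As a quick check, the sketch below recomputes them from the dummy values (price ~$250 to ~$300, market cap ~$40B to ~$50B); the helper `pct_change` is our own illustrative function.

```python
def pct_change(old, new):
    """Percentage change from `old` to `new`."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100.0

price_growth = pct_change(250.0, 300.0)   # dummy BNB prices from the sample
cap_growth = pct_change(40e9, 50e9)       # dummy market caps from the sample

print(f"Price change over the year: {price_growth:.1f}%")
print(f"Market cap change over the year: {cap_growth:.1f}%")
```

Note that the two metrics need not move by the same percentage: with these dummy values the price rises 20% while the market cap rises 25%.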

## **4. Conclusion: Summary of Findings & Next Steps**

In this tutorial, we built a multi-tool DeepSearch pipeline to analyze a crypto company's performance. We used:

- **Perplexity Sonar** for retrieving timely information from the web ([Sonar by Perplexity](https://sonar.perplexity.ai/#:~:text=Build%20with%20the%20best%20AI,Get%20started%20in%20minutes)).
- **Mistral’s OCR (Pixtral model)** to unlock insights from scanned financial documents ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR)).
- **Crypto market APIs** to fetch hard numbers on performance.
- **Gemini-2.0-Flash** (conceptually) for quick computations, illustrating how specialized models can assist larger ones ([Gemini 2.0 model updates: 2.0 Flash, Flash-Lite, Pro Experimental](https://blog.google/technology/google-deepmind/gemini-model-updates-february-2025/#:~:text=Experimental%20blog,reason%20through%20more%20complex%20problems)).
- **GPT-4.5** as the central “thinker” to synthesize all information and make decisions.

By orchestrating these with **LangGraph**, the AI behaves like a research analyst: it plans, gathers data iteratively, and only stops when it has a well-supported answer ([DeepSearch by Jina.ai: The breakthrough in AI search for complex queries! - ai-rockstars.com](https://ai-rockstars.com/jina-ai-deeepsearch/#:~:text=Industrial%20context%3A%20A%20growing%20need,for%20specialized%20search%20solutions)) ([Gemini: Try Deep Research and Gemini 2.0 Flash Experimental](https://blog.google/products/gemini/google-gemini-deep-research/#:~:text=Earlier%20this%20year%2C%20we%20shared,read%20insights)). This approach ensures the final output is rich in content and grounded in evidence, rather than a guess from a single model pass.

For **financial analysts and investors**, this kind of AI pipeline can save a lot of time. It automates the grunt work of searching through news, reports, and data, allowing the human expert to focus on interpreting the AI-curated insights. For **junior crypto participants**, it provides a powerful assistant to learn from – they can ask complex questions and get a structured answer with references to dig deeper.

By iterating on this pipeline (replacing the dummy tools with real API calls, refining the prompts, and adding new data sources), you can build an AI analyst that becomes steadily more useful for crypto investment research.

Happy analyzing!

---

## **References and Influences**

This tutorial’s approach and style were inspired by several sources and prior works:

- **Jina AI’s DeepSearch Guide:** Provided the conceptual foundation for DeepSearch – an agent that _“combines searching, reading and autonomous reasoning”_ to answer complex queries ([DeepSearch by Jina.ai: The breakthrough in AI search for complex queries! - ai-rockstars.com](https://ai-rockstars.com/jina-ai-deeepsearch/#:~:text=With%20DeepSearch%2C%20Jina%20AI%20brings,founded%20answers%20to%20difficult%20questions)). The idea of iterative read-search loops and large context usage (up to 500k tokens) comes from Jina’s work, which showed how DeepSearch can yield high-quality answers at the cost of more compute ([DeepSearch by Jina.ai: The breakthrough in AI search for complex queries! - ai-rockstars.com](https://ai-rockstars.com/jina-ai-deeepsearch/#:~:text=DeepSearch%20is%20characterized%20by%20an,and%20quality%20of%20information%20processing)) ([DeepSearch by Jina.ai: The breakthrough in AI search for complex queries! - ai-rockstars.com](https://ai-rockstars.com/jina-ai-deeepsearch/#:~:text=,and%20high%20token%20costs)). The writing style of Jina’s blog (clear explanations with a touch of marketing flair) influenced our introduction and breakdown of DeepSearch concepts.

- **LangGraph Scientific Paper Agent Example:** We borrowed coding patterns from an example that uses LangGraph for research agents ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=from%20langchain_core)) ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=%40tool%28,5)). In particular, the use of the `@tool` decorator to define agent tools and the concept of an Oracle LLM making function calls are drawn from that example. The structure of final answer formatting was inspired by how the example enforces sections in the final output ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=%40tool%28,the%20form%20of%20a%20research)) ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=,4%20paragraphs)). We adapted that idea to create a final report tool for our use case.

- **Pinecone’s _LangGraph and Research Agents_ Blog:** This resource explained how graph-based agents offer more control and transparency than traditional chain-of-thought agents ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=Image%3A%20Graph)). It influenced our decision to use LangGraph and how we set `tool_choice="any"`, which requires the Oracle to respond with a tool call while leaving it free to pick any of the bound tools ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/#:~:text=The%20Oracle%20consists%20of%20an,that%20separately%20in%20our%20graph)). The diagrams of agent workflows ([LangGraph and Research Agents | Pinecone](https://www.pinecone.io/learn/langgraph-research-agent/)) (showing nodes like `fetch_arxiv`, `web_search`, etc.) helped conceptualize our crypto analysis agent as a similar graph of nodes (Oracle -> search, data fetch, OCR, etc. -> final answer).

- **Perplexity Sonar API Documentation:** Perplexity’s blog and docs on Sonar guided our integration of the search tool. We learned that Sonar can be accessed via an OpenAI-compatible endpoint with model names like `"sonar-pro"` ([Perplexity Sonar API: Reliable, Scalable, and Developer-Friendly](https://www.analyticsvidhya.com/blog/2025/01/perplexity-sonar-api/#:~:text=client%20%3D%20OpenAI%28api_key%3DYOUR_API_KEY%2C%20base_url%3D)), which shaped the code snippet in `search_web`. The use-case examples (e.g., _financial analysis with real-time insights_) confirmed that Sonar is suitable for our financial domain ([Perplexity Sonar API: Reliable, Scalable, and Developer-Friendly](https://www.analyticsvidhya.com/blog/2025/01/perplexity-sonar-api/#:~:text=4,time%20insights%20for%20investors)).

- **Mistral AI’s Pixtral-12B Announcement:** The E2E Networks blog on Pixtral-12B provided insight into using Mistral’s model for OCR and image understanding ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=Pixtral,scale%20OCR)) ([Guide to Building OCR Systems with Pixtral-12B](https://www.e2enetworks.com/blog/a-guide-to-building-ocr-systems-using-pixtral-12b#:~:text=,your%20specific%20OCR%20use%20cases)). This influenced our `analyze_document` design – we described how a multimodal model could handle the task end-to-end, and we mimicked its functionality (extracting text and summarizing it). Also, research like **FinTral** showed the effectiveness of Mistral-7B-based models in financial analysis, integrating image (document) data with text ([[2402.10986] FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models](https://arxiv.org/abs/2402.10986#:~:text=,Tools%20and%20Retrieval%20methods%2C%20dubbed)), which validated our approach of including document OCR for performance analysis.

- **Google’s Gemini 2.0 Flash** (Developers Blog and The Keyword): These sources explained what Gemini Flash is – a _“highly efficient workhorse model with low latency”_ meant for tool-using agent tasks ([Gemini 2.0 model updates: 2.0 Flash, Flash-Lite, Pro Experimental](https://blog.google/technology/google-deepmind/gemini-model-updates-february-2025/#:~:text=Experimental%20blog,reason%20through%20more%20complex%20problems)). Reading about how Gemini Flash operates with an agentic system (using web browsing and a huge context window for Deep Research) influenced us to include a “fast agent” component ([Gemini: Try Deep Research and Gemini 2.0 Flash Experimental](https://blog.google/products/gemini/google-gemini-deep-research/#:~:text=Earlier%20this%20year%2C%20we%20shared,read%20insights)). We introduced _Gemini-2.0-Flash_ in our pipeline concept to handle quick computations, analogous to how Google’s agent might delegate simpler subtasks to a faster model to save the heavy lifting for the bigger model.

- **LangSmith and LangChain Tracing Docs:** The LangSmith documentation guided how we added the `LangChainTracer` for observability. While we kept this section light (since full tracing setup is outside scope), the idea of being able to _log each agent step for debugging_ came from LangSmith’s design philosophy.


Each of these references contributed to our tutorial: from framing the problem, adopting best practices in agent design, to writing in an accessible manner. By standing on the shoulders of these works, we aimed to deliver a comprehensive yet beginner-friendly walkthrough of building an AI-powered financial analysis pipeline.