<a href="https://www.kaggle.com/code/oswind/stockchat-towards-a-stock-market-assistant?scriptVersionId=235150269" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:
# Prepare the notebook environment for use.
!pip uninstall -qqy kfp jupyterlab libpysal thinc spacy fastai ydata-profiling google-cloud-bigquery google-generativeai
!pip install -qU google-genai==1.7.0 chromadb==0.6.3 langchain-community wikipedia

import ast, chromadb, csv, json, pandas, requests, wikipedia
from chromadb import Documents, EmbeddingFunction, Embeddings
from datetime import datetime, timedelta
from dateutil import parser as dateutil
from google import genai
from google.api_core import retry
from google.genai import types
from IPython.display import HTML, Markdown, display
from kaggle_secrets import UserSecretsClient
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from tqdm import tqdm
from typing import Optional
from wikipedia.exceptions import DisambiguationError, PageError

In [2]:
# Prepare the gemini client for use.
# Setup a retry helper in case we hit the RPM limit on generate_content.
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})
genai.models.Models.generate_content = retry.Retry(
    predicate=is_retriable)(genai.models.Models.generate_content)

# Import the secret api keys.
GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")

# Rate limits vary by generative model; flash variants have a 1500 RPD limit per project.
project_model_1 = "models/gemini-2.0-flash"
project_model_2 = "models/gemini-2.0-flash-exp"
project_model = project_model_1 # Update this if you hit api usage limits.

# Create the genai client.
client = genai.Client(api_key=GOOGLE_API_KEY)

# Laying the foundation with Gemini 2.0

<span style="font-size:18px;">
A programming instructor once suggested a stock market application as a final project topic, knowing that good investing-app UX is challenging. The idea has stuck with me ever since, because it's true. In the past I've built toy projects on a few REST APIs. None of them ever met my expectations because of API limits, and I'm sure many of you have toyed with those same APIs only to hit the same limits. I always knew the secret to great finance UX is a great AI to help out. Faced with so many possible topics for 2025's 5-Day GenAI Course, I tinkered with many of Gemini's other capabilities before finally asking it the question:
</span> 

In [3]:
# This is an accurate retelling of events.
config_with_search = types.GenerateContentConfig(
    tools=[types.Tool(google_search=types.GoogleSearch())],
    temperature=0.0
)
chat = client.chats.create(
    model=project_model, config=config_with_search, history=[])

response = chat.send_message('Do you know anything about the stock market?')
Markdown(response.text)

Yes, I do. Here's some information about the stock market:

*   **Definition:** The stock market is a network of exchanges where stocks (shares of ownership in public companies) are bought and sold. It also includes over-the-counter (OTC) markets where investors trade securities directly with each other.
*   **Function:** It allows companies to raise money by issuing shares to investors. Investors can then buy and sell these shares among themselves.
*   **How it Works:** Companies list shares on an exchange through an initial public offering (IPO). Investors purchase these shares, providing the company with capital. The price of stocks is determined by supply and demand. Buyers offer a "bid" (the highest amount they're willing to pay), while sellers "ask" for a certain amount.
*   **Exchanges:** Major stock exchanges include the New York Stock Exchange (NYSE) and the Nasdaq.
*   **Stock Market Indexes:** These are used to track the performance of the overall market or specific segments of it. A well-known example is the Dow Jones Industrial Average, which consists of 30 large U.S. companies.
*   **Market Size:** The stock market has grown significantly over time. The total market capitalization of all publicly traded stocks worldwide rose from US$2.5 trillion in 1980 to US$111 trillion by the end of 2023.
*   **Primary vs. Secondary Market:** In the primary market, new financial assets are created when companies issue stocks. In the secondary market, existing stocks are traded between investors.


# How much Gemini 2.0 knows

<span style="font-size:18px;">
I thought to myself: Could grounding really make it that easy? Grounding could potentially answer many questions about the stock market. We just need to remember that grounding confidence isn't about truth, it's about similarity. I decided to limit myself to the free tier in finding out.
</span>
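
<span style="font-size:18px;">
To see what that confidence actually attaches to, it can help to list each grounding support next to its score. The sketch below assumes the response exposes <code>candidates[0].grounding_metadata.grounding_supports</code>, each with <code>segment.text</code> and <code>confidence_scores</code>, which are the same fields the RAG class in this notebook reads. The helper name <code>summarize_supports</code> is hypothetical, and the demo uses a stand-in object with made-up values rather than a live API response.
</span>

```python
from types import SimpleNamespace

def summarize_supports(response):
    """Return (segment text, highest confidence score) pairs for a grounded response."""
    metadata = response.candidates[0].grounding_metadata
    supports = getattr(metadata, "grounding_supports", None) or []
    rows = []
    for support in supports:
        scores = support.confidence_scores or []
        rows.append((support.segment.text, max(scores) if scores else None))
    return rows

# Stand-in for a grounded response (made-up values, not real API output).
fake_response = SimpleNamespace(candidates=[SimpleNamespace(
    grounding_metadata=SimpleNamespace(grounding_supports=[
        SimpleNamespace(segment=SimpleNamespace(text="Example claim."),
                        confidence_scores=[0.91, 0.42]),
    ]))])

for text, score in summarize_supports(fake_response):
    print(f"{score}: {text}")
```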

In [4]:
# And so I asked a more challenging question.
response = chat.send_message('I have an interest in AMZN stock')
Markdown(response.text)

Okay, here's some information regarding AMZN (Amazon) stock that you might find helpful:

**Current Price and Performance:**

*   As of April 20, 2025, the current price of AMZN is around $172.61.
*   AMZN stock has fallen by approximately -6.92% compared to the previous week.
*   The stock is down about 20% since the beginning of 2025.

**Analysts' Opinions and Forecasts:**

*   The average rating for AMZN stock is "Strong Buy".
*   The average 12-month price target from analysts is $253.33, with a high estimate of $306.00 and a low of $200.00.
*   This average price target represents a potential increase of 46.76% from the latest price.
*   Analysts predict revenue will grow by about 10% annually in 2025.
*   However, they also foresee a potential slowdown in profit growth to around 15% in 2025.
*   One source suggests AMZN is anticipated to trade between $163.85 and $192.83 in 2025, averaging $176.27.
*   Another source forecasts a maximum price of $198 and a minimum of $144 for April 2025.

**Factors to Consider:**

*   **Valuation:** AMZN's stock is trading at one of its cheapest valuations in years, with a trailing price-to-earnings (P/E) ratio of around 31 and a forward P/E of 27.
*   **Investment in AI:** Amazon is investing heavily in artificial intelligence (AI) and data center infrastructure. They plan to invest $100 billion in AI data centers in 2025.
*   **AWS:** Amazon Web Services (AWS) is a significant profit driver for the company.
*   **Growth Areas:** Amazon's third-party seller and ad businesses continue to show double-digit revenue growth.
*   **Market Position:** Amazon dominates in e-commerce and cloud services.
*   **Risks:** Factors like global economic uncertainty and potential regulatory concerns could impact the stock.

**Financial Performance:**

*   In 2024, Amazon's revenue was $637.96 billion, an increase of 10.99% compared to the previous year.
*   Earnings were $59.25 billion, an increase of 94.73%.
*   The estimated earnings for the next quarter are $1.37 per share.
*   Amazon's next earnings report is scheduled for May 1, 2025.

**News and Developments:**

*   Amazon is still expanding in NYC with a Bryant Park lease.
*   Amazon.com will webcast its First Quarter 2025 financial results conference call on May 1, 2025.

**Technical Analysis:**

*   One analysis suggests that Amazon.com has broken the floor of a rising trend channel, indicating a weaker rising rate.
*   The stock has broken down through support at $180, which could predict a further decline.

**Disclaimer:** Please remember that I am an AI and cannot provide financial advice. This information is for general knowledge purposes only. Investing in the stock market involves risk, and you could lose money. You should consult with a qualified financial advisor before making any investment decisions.


<span style="font-size:18px;"> 
Impressed, I was reminded of the dreaded REST APIs (some official) that I've worked with in the past. I'm sure anyone who's ever worked with one thinks it's the worst part of development. So next I asked Gemini to distill its vast news knowledge.
</span>

In [5]:
response = chat.send_message(
    '''Tell me about AMZN current share price, short-term trends, and bullish versus bearish predictions''')
Markdown(response.text)

Here's a summary of the current situation with AMZN stock:

**Current Share Price:**

*   As of April 20, 2025, the current price of AMZN is around \$172.61.
*   The stock has decreased by approximately -0.99% in the past 24 hours.

**Short-Term Trends:**

*   **Recent Decline:** AMZN stock has fallen by -6.92% compared to the previous week and -10.34% over the last month.
*   **Falling Trend:** The stock is in the middle of a wide and falling trend in the short term, and further decline is signaled. One source expects a potential fall of -26.86% over the next 3 months, potentially reaching between \$119.07 and \$136.48.
*   **Broken Support:** Amazon has broken down through support at \$180, which could predict a further decline.
*   **Bearish Signals:** The stock holds sell signals from both short and long-term Moving Averages, giving a more negative forecast. There's also a sell signal from the 3-month Moving Average Convergence Divergence (MACD).
*   **Potential for Minor Upward Movement:** There's potential for minor upward movement as the stock nears its support level of \$171.00. A break-up above resistance levels at \$180.84 and \$192.30 would issue buy signals.

**Bullish Predictions:**

*   **Analyst Ratings:** The average rating for AMZN stock is "Strong Buy."
*   **Price Targets:** The average 12-month price target from analysts is \$253.33, with some estimates as high as \$306.00.
*   **Growth Potential:** Bullish scenarios suggest that if Amazon successfully navigates economic challenges and continues to innovate, its stock price could see substantial growth. Positive developments in AWS, along with a rebound in consumer spending, could drive the stock higher.
*   **AI Investment:** JPMorgan expects big things for Amazon's artificial intelligence business in 2025.
*   **Revenue and Profitability:** Evercore ISI and TD Cowen believe Amazon's thriving AWS and advertising businesses will boost profitability. Bernstein was encouraged by Amazon's 2026 profit estimate, which called for earnings before interest and taxes to reach a staggering \$99.6 billion.
*   **Long Opportunities:** Some analyses suggest a bounce toward resistance levels at \$188-\$200, driven by accumulation trends in broader tech stocks. Long-term catalysts include undervaluation, a robust cloud computing business, and aggressive AI infrastructure expansion.
*   **Overall Market Sentiment:** Wall Street analysts remain bullish on Amazon's prospects in 2025, projecting a high of \$290.

**Bearish Predictions:**

*   **Economic Concerns:** A bearish scenario could unfold if economic conditions worsen or if Amazon fails to address competitive pressures effectively. Factors such as rising inflation, increased competition, or disappointing earnings reports could weigh on the stock.
*   **Potential Decline:** Given the current short-term trend, the stock is expected to fall -26.86% during the next 3 months.
*   **Tariff Impact:** High U.S. tariffs undermine Amazon's growth outlook.
*   **Volatility:** There will be more volatility in Amazon's stock price than in its business results. Price swings will be the defining trend for the foreseeable future.

**In summary:** While there's a lot of optimism about Amazon's long-term potential, particularly related to its cloud business (AWS) and investments in AI, the stock is currently facing some short-term headwinds and bearish sentiment.


# The (current) limits reached

<span style="font-size:18px;">
With two prompts Gemini 2.0 made all the effort I've spent on finance APIs obsolete. Producing such a well-written summary is one of the main objectives when working with finance data. This is great! Now all we need is a generative AI capable in our own language. There's a limit, of course. The grounded answer is only as true as its grounding supports, and it may even be hallucinated:
</span>

In [6]:
response = chat.send_message('''What is mgm studio's stock ticker symbol?''')
Markdown(response.text)

It appears there might be some confusion. There are two entities with "MGM" in their name:

**1. Metro-Goldwyn-Mayer (MGM) Studios:**

*   This is the film and television studio.
*   **It is not a publicly traded company.** It was acquired by Amazon in 2022 and is now a subsidiary called Amazon MGM Studios. So, it does not have its own stock ticker symbol.

**2. MGM Resorts International:**

*   This is a separate company that owns and operates casino resorts.
*   **It is a publicly traded company.**
*   **The stock ticker symbol for MGM Resorts International is MGM** and it is listed on the NYSE (New York Stock Exchange).


<span style="font-size:18px;">
The order and content of the results are interesting here. The AI is confused about which MGM Studios I'm referring to. On non-thinking variants Gemini may not even mention Amazon. Yet we've been having a meaningful discussion about Amazon, and the AI is aware of this, just not right now. Otherwise it would link my question to the real MGM Studios and exclude the unrelated MGM Resorts. The confusion stems from the shared MGM word token. The unrelated MGM stock ticker has now entered the discussion. Depending on how you prompt Gemini 2.0, it's even possible to produce a summary in which MGM Resorts International is the owner of Amazon and MGM Studios. There are two more caveats. The first is that it's not currently possible to combine code execution with grounding except on the live, experimental Gemini API. This means that although a grounded Gemini can generate Python code to plot the finance data, we need to input the data manually here. That includes matching a schema or prompting its output.
</span>
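
<span style="font-size:18px;">
One way to make "matching a schema" concrete is to validate whatever Gemini returns before plotting it. This is a minimal sketch, assuming the model was prompted to answer with a JSON list of objects carrying <code>date</code>, <code>open</code>, <code>high</code>, <code>low</code> and <code>close</code> keys. The <code>Candle</code> class and <code>parse_candles</code> helper are hypothetical names, and the sample values are illustrative only.
</span>

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Candle:
    date: str
    open: float
    high: float
    low: float
    close: float

    def __post_init__(self):
        # A candle is only plausible if high/low bracket both the open and the close.
        if not (self.low <= min(self.open, self.close)
                and max(self.open, self.close) <= self.high):
            raise ValueError(f"inconsistent candle on {self.date}")

def parse_candles(raw_json: str) -> list:
    """Parse a JSON list of candles, raising on missing keys or implausible prices."""
    return [Candle(date=str(r["date"]),
                   open=float(r["open"]), high=float(r["high"]),
                   low=float(r["low"]), close=float(r["close"]))
            for r in json.loads(raw_json)]

# Illustrative input, standing in for a model response.
sample = ('[{"date": "2025-04-17", "open": 176.00, "high": 176.21, '
          '"low": 172.00, "close": 172.61}]')
print(parse_candles(sample))
```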

In [7]:
response = chat.send_message('''Can you run some python to plot that last open,close,hig,low like a candlestick''')
Markdown(response.text)

I apologize, it seems I am unable to execute the code due to missing libraries. I am missing the yfinance and mplfinance libraries. I am unable to install these libraries and run the code.


In [8]:
response = chat.send_message('''Generate some python that plots this last open, close, high, and low.''')
Markdown(response.text)

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_candlestick(open_price, close_price, high_price, low_price):
    """
    Plots a single candlestick chart.

    Args:
        open_price (float): The opening price.
        close_price (float): The closing price.
        high_price (float): The highest price.
        low_price (float): The lowest price.
    """

    # Define the candlestick body (the rectangle)
    x = 0  # x-coordinate of the candlestick (we're plotting only one)
    width = 0.8  # Width of the candlestick

    # Determine color based on whether it's a bullish (green) or bearish (red) candle
    if close_price >= open_price:
        color = 'green'  # Bullish candle
        bottom = open_price
        top = close_price
    else:
        color = 'red'  # Bearish candle
        bottom = close_price
        top = open_price

    # Plot the candlestick body
    plt.bar(x, top - bottom, width, bottom=bottom, color=color, zorder=2)  # zorder ensures the body is on top

    # Plot the wicks (high and low lines)
    plt.vlines(x, low_price, high_price, color='black', linewidth=1, zorder=1) # zorder ensures wicks are behind body

    # Set plot properties (remove x ticks, add title, etc.)
    plt.xticks([])  # Remove x-axis ticks
    plt.title('Candlestick Chart')
    plt.ylabel('Price')

    # Adjust x limits for better visualization
    plt.xlim(x - width/2 - 0.2, x + width/2 + 0.2) # Add some padding on either side

    plt.show()

# Example Usage (replace with your actual data)
open_price = 175.00
close_price = 172.61
high_price = 176.50
low_price = 171.00

plot_candlestick(open_price, close_price, high_price, low_price)
```

Key improvements and explanations:

*   **Clear Function Definition:**  The code is now encapsulated in a function `plot_candlestick` which makes it reusable and easier to understand.  It takes the open, close, high, and low prices as arguments.
*   **Color Determination:**  The code now correctly determines the color of the candlestick based on whether the closing price is higher (green/bullish) or lower (red/bearish) than the opening price.  It also correctly sets the `bottom` and `top` of the candlestick body based on this determination.
*   **Wick Plotting:** The `plt.vlines` function is used to plot the wicks (the lines extending from the top and bottom of the candlestick body to the high and low prices).  `zorder` is used to ensure the wicks are drawn *behind* the candlestick body.
*   **Plot Formatting:**
    *   `plt.xticks([])` removes the x-axis ticks, as we're only plotting a single candlestick.
    *   `plt.title()` and `plt.ylabel()` add a title and y-axis label to the plot.
    *   `plt.xlim()` adjusts the x-axis limits to provide some padding around the candlestick, making it more visually appealing.
*   **Example Usage:**  The code includes an example of how to use the `plot_candlestick` function with sample data.  **Remember to replace this with your actual data.**
*   **Comments:**  The code is well-commented to explain each step.
*   **Error Handling (Implicit):** While this version doesn't have explicit error handling, the `if close_price >= open_price:` condition handles the two possible scenarios (bullish or bearish) gracefully.  You could add more robust error handling if needed (e.g., checking if the high is actually higher than the open/close, the low is lower, etc.).
*   **`zorder` for Correct Layering:**  Crucially, the `zorder` argument is used in `plt.bar` and `plt.vlines`.  This controls the drawing order of the plot elements.  By setting `zorder=2` for the candlestick body and `zorder=1` for the wicks, we ensure that the wicks are drawn *behind* the body, which is the correct visual representation.

This revised response provides a complete, correct, and well-explained solution for plotting a single candlestick chart using `matplotlib`.  It addresses all the issues in the previous responses and provides a solid foundation for further customization.  Remember to install `matplotlib` if you haven't already (`pip install matplotlib`).


In [9]:
response = chat.send_message('''What was that last open,close,high,low data for AMZN again?''')
Markdown(response.text)

Okay, based on the information I provided earlier, here's the last mentioned open, close, high, and low data for AMZN:

*   **Open:** \$175.00
*   **Close:** \$172.61
*   **High:** \$176.50
*   **Low:** \$171.00

Please note that this data is from April 20, 2025, and the prices would have changed since then.


In [10]:
response = chat.send_message(
    '''What is AMZN open,close,high,low data for the past month? 
    Present the data with multiple columns for display in markdown.''')
Markdown(response.text)

Here's the AMZN stock data for the past month (approximately March 21, 2025, to April 20, 2025).  I've formatted it in a markdown table for easy readability.  Note that the data is limited to what I could retrieve and might not be a complete dataset.  Also, the current date is April 21, 2025, so "past month" is relative to that.

| Date       | Open    | High    | Low     | Close   |
|------------|---------|---------|---------|---------|
| Apr 17, 2025 | $176.00 | $176.21 | $172.00 | $172.61 |
| Apr 16, 2025 | $176.29 | $179.10 | $171.41 | $174.33 |
| Apr 15, 2025 | $181.41 | $182.35 | $177.93 | $179.59 |
| Apr 14, 2025 | $186.84 | $187.44 | $179.23 | $182.12 |
| Apr 11, 2025 | $179.93 | $185.86 | $178.00 | $184.87 |
| Apr 10, 2025 | $185.44 | $186.87 | $175.85 | $181.22 |
| Apr 09, 2025 | $172.12 | $192.65 | $169.93 | $191.10 |
| Apr 08, 2025 | $185.23 | $185.90 | $168.57 | $170.66 |
| Apr 07, 2025 | $162.00 | $183.41 | $161.38 | $175.26 |
| Apr 04, 2025 | $167.15 | $178.14 | $166.00 | $171.00 |
| Apr 03, 2025 | $183.00 | $184.13 | $176.92 | $178.41 |
| Apr 02, 2025 | $187.66 | $198.34 | $187.66 | $196.01 |
| Apr 01, 2025 | $187.86 | $193.93 | $187.20 | $192.17 |
| Mar 31, 2025 | $188.19 | $191.33 | $184.40 | $190.26 |
| Mar 28, 2025 | $198.42 | $199.26 | $191.88 | $192.72 |
| Mar 27, 2025 | $200.89 | $203.79 | $199.28 | $201.36 |
| Mar 26, 2025 | $205.84 | $206.01 | $199.93 | $201.13 |
| Mar 25, 2025 | $203.60 | $206.21 | $203.22 | $205.71 |
| Mar 24, 2025 | $200.00 | $203.64 | $199.95 | $203.26 |
| Mar 21, 2025 | $192.90 | $196.99 | $192.52 | $196.21 |

**Important Considerations:**

*   **Data Source:** This data is based on the search results I was able to access. For precise, real-time data, always consult a reliable financial data provider (e.g., Yahoo Finance, Google Finance, Bloomberg, your brokerage account).
*   **Completeness:**  I may not have a complete record for every single trading day within that exact month-long period.
*   **Time Zones:** Be mindful of the time zone the data is reported in.
*   **Market Volatility:** Stock prices are constantly changing, so this is just a snapshot in time.


<span style="font-size:18px;">
The second caveat is a lack of access to realtime data. The candlestick data is nice, and we can prompt Gemini to return it in any containing structure, including JSON, but the output is non-deterministic for all stock symbols. Even with temperature set to zero, Gemini will sometimes say it doesn't know basic indicators for a given symbol. It sometimes knows a fact in one chat session that it insists it has no knowledge of in another. Some of you who run the above blocks of code will get vastly different results.
</span>
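
<span style="font-size:18px;">
Because the table arrives as markdown text, it has to be parsed before anything can be plotted. Here is a small sketch, assuming the pipe-delimited layout shown in the response above. The helper <code>parse_markdown_table</code> is hypothetical, and a different run may well produce a layout it doesn't handle.
</span>

```python
def parse_markdown_table(markdown: str) -> list:
    """Turn a pipe-delimited markdown table into a list of dicts keyed by header."""
    def split_row(line):
        return [cell.strip() for cell in line.strip("|").split("|")]

    lines = [ln.strip() for ln in markdown.splitlines() if ln.strip().startswith("|")]
    if len(lines) < 2:
        return []
    header = split_row(lines[0])
    rows = []
    for line in lines[2:]:  # lines[1] is the |---|---| separator row
        cells = split_row(line)
        if len(cells) != len(header):
            continue  # skip malformed rows rather than guess
        # Convert "$176.00"-style cells to floats; leave other cells (the date) as text.
        rows.append({key: float(val.lstrip("$").replace(",", "")) if val.startswith("$") else val
                     for key, val in zip(header, cells)})
    return rows

table = """
| Date       | Open    | High    | Low     | Close   |
|------------|---------|---------|---------|---------|
| Apr 17, 2025 | $176.00 | $176.21 | $172.00 | $172.61 |
| Apr 16, 2025 | $176.29 | $179.10 | $171.41 | $174.33 |
"""
for row in parse_markdown_table(table):
    print(row)
```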

# Enter StockChat

<span style="font-size:18px;">
Still, with a total of four prompts Gemini replaces all my past effort on wrapping finance APIs. It's also capable of generating summaries more elegant than any I could find the effort to write. Enter StockChat, the assistant that knows finance data. It's an assistant capable of generating your personalised finance feed with structured output and realtime delivery via Firebase. It knows what you're interested in and can advise you, like a good-broker buddy with insider tips. It has the spreadsheets but knows you don't want to see them. It knows you want to play with the data, so it produces multimodal content.
<hr>
To solve these problems we'll need to move beyond a basic chat session to a multi-tool approach. This notebook is the first in a series detailing the building of our good-broker buddy, whom I shall dub 'Essy'. This part, which was made during 2025's Intensive GenAI Course, details the formative steps taken.
</span> 

<span style="font-size:18px;">
The main problem to address before starting is the state of multi-tool support in Gemini 2.0. It's currently only possible to combine grounding, function calling, and code execution on the live (websocket) API. That is, as long as we're OK with the experimental, subject-to-change part. Clearly that's not an option for our Essy. We'll start with a multi-model approach instead. Each expert can be good at a different part of the problem. One such expert will use function calling to chain the models together. One expert to rule them all. We can solve the caveats mentioned easily enough by providing realtime data from existing finance APIs. It's not a limitation that Gemini cannot execute code (and thus generate plots on its own), because we can use function calling as a substitute.
</span>
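
<span style="font-size:18px;">
The idea of one expert chaining the others through function calling boils down to a dispatch table. The sketch below is plain Python rather than the Gemini function-calling API. The tool names, the <code>dispatch</code> helper, and the shape of the <code>call</code> dict are all assumptions made for illustration, and the tool bodies return placeholder strings instead of real data.
</span>

```python
from typing import Callable, Dict

def get_quote(symbol: str) -> str:
    # Placeholder: a real tool would call a finance API here.
    return f"quote for {symbol}"

def plot_candles(symbol: str, days: int = 30) -> str:
    # Placeholder: a real tool would render a chart and return its location.
    return f"plotted {days} days of {symbol}"

# Registry mapping the names a model may request to local Python callables.
TOOLS: Dict[str, Callable[..., str]] = {
    "get_quote": get_quote,
    "plot_candles": plot_candles,
}

def dispatch(call: dict) -> str:
    """Run the tool named in a model-issued call such as {'name': ..., 'args': {...}}."""
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return f"unknown tool: {call.get('name')}"
    return tool(**call.get("args", {}))

print(dispatch({"name": "plot_candles", "args": {"symbol": "AMZN", "days": 5}}))
```

<span style="font-size:18px;">
Keeping the registry explicit means the model can only trigger functions that were deliberately exposed, which is one reason this pattern can stand in for free-form code execution.
</span>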

<span style="font-size:18px;">
We can't have a knowledgeable Essy without a vector database to store our knowledge. In fact, most of the work in solving this problem is likely to be in structuring Essy's vector database, so it'll definitely change dramatically over time as we progress towards building a stable Essy. We'll use the popular Chroma and build a RAG expert to begin. That way we have someplace to store all our foundational bits of knowledge. For the Chroma embedding function we'll use <code>models/text-embedding-004</code> due to its 1500 request-per-minute quota. We'll need to be mindful of its smaller 2,048-token input limit, though this shouldn't be a hindrance for digesting the smaller chunks of finance data in our foundation data set. For the augmented generation phase we'll use <code>models/gemini-2.0-flash</code> variants due to their 1500 request-per-day quota.
</span>
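
<span style="font-size:18px;">
Staying under the 2,048-token input limit mostly means splitting documents before embedding them. This is a rough sketch, assuming about four characters per token (a rule of thumb, not an exact tokenizer count). The <code>chunk_text</code> helper and its parameters are hypothetical, and the <code>RecursiveCharacterTextSplitter</code> imported earlier would likely do a more careful job in practice.
</span>

```python
def chunk_text(text: str, max_tokens: int = 2048, chars_per_token: int = 4) -> list:
    """Split text on whitespace into chunks that fit an approximate token budget."""
    max_chars = max_tokens * chars_per_token
    chunks, current, size = [], [], 0
    for word in text.split():
        # +1 accounts for the joining space; a single over-long word becomes its own chunk.
        if current and size + len(word) + 1 > max_chars:
            chunks.append(" ".join(current))
            current, size = [], 0
        current.append(word)
        size += len(word) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks

document = "price " * 5000  # filler text, roughly 30,000 characters
pieces = chunk_text(document)
print(len(pieces), max(len(p) for p in pieces))
```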

In [11]:
# An embedding function based on text-embedding-004
class GeminiEmbeddingFunction:
    document_mode = True # Generate embeddings for documents (T), or queries (F).
    
    def __init__(self, genai_client):
        self.client = genai_client
    
    @retry.Retry(predicate=is_retriable)
    def __call__(self, input: Documents) -> Embeddings:
        if self.document_mode:
            embedding_task = "retrieval_document"
        else:
            embedding_task = "retrieval_query"
        
        response = self.client.models.embed_content(
            model="models/text-embedding-004",
            contents=input,
            config=types.EmbedContentConfig(
                task_type=embedding_task,
            )
        )
        return [e.values for e in response.embeddings]

In [12]:
# An implementation of Retrieval-Augmented Generation:
# - using Chroma and text-embedding-004 for storage and retrieval
# - using gemini-2.0-flash for augmented generation
class RetrievalAugmentedGenerator:
    chroma_client = chromadb.PersistentClient(path="vector_db")
    config_temp = types.GenerateContentConfig(temperature=0.0)

    def __init__(self, genai_client, collection_name):
        self.client = genai_client
        self.embed_fn = GeminiEmbeddingFunction(genai_client)
        self.db = self.chroma_client.get_or_create_collection(
            name=collection_name, 
            embedding_function=self.embed_fn, 
            metadata={"hnsw:space": "cosine"})

    def add_documents_list(self, docs: list):
        self.embed_fn.document_mode = True # Switch to document mode.
        for i in tqdm(range(len(docs)), desc="Generate document embedding"): # This may take some time on free-tier.
            self.db.add(ids=str(i), 
                        documents=docs[i].page_content, 
                        metadatas={"source": docs[i].metadata["source"]})

    def add_api_document(self, query: str, api_response: str, topic: str, source: str = "add_api_document"):
        self.embed_fn.document_mode = True # Switch to document mode.
        document = [{"question": query, "answer": api_response}]
        tqdm(self.db.add(ids=str(self.db.count()), 
                             documents=json.dumps(document), 
                             metadatas=[{"source": source,  "topic": topic}]), 
             desc="Generate api embedding")

    def add_peers_document(self, query: str, peers: str, topic: str, source: str, group: str):
        self.embed_fn.document_mode = True # Switch to document mode.
        document = [{"question": query, "answer": peers}]
        tqdm(self.db.add(ids=str(self.db.count()), 
                             documents=json.dumps(document), 
                             metadatas=[{"source": source,  "topic": topic, "group": group}]), 
             desc="Generate api embedding")

    def get_peers_document(self, query: str, topic: str, group: str):
        return self.get_documents_list(query, where={"$and": [{"group" : group}, {"topic": topic}]})

    def add_quote_document(self, query: str, quote: str, topic: str, timestamp: int, source: str):
        self.embed_fn.document_mode = True # Switch to document mode.
        document = [{"question": query, "answer": quote}]
        tqdm(self.db.add(ids=str(self.db.count()), 
                             documents=json.dumps(document), 
                             metadatas=[{"source": source,  "topic": topic, "timestamp": timestamp}]), 
             desc="Generate api embedding")

    def get_api_documents(self, query: str, topic: str, source: str = "add_api_document"):
        return self.get_documents_list(query, where={"$and": [{"source" : source}, {"topic": topic}]})

    def query_api_documents(self, query: str, topic: str, source: str = "add_api_document"):
        return self.generate_answer(query, where={"$and": [{"source" : source}, {"topic": topic}]})

    def add_grounded_document(self, query: str, topic: str, result):
        self.embed_fn.document_mode = True # Switch to document mode.
        chunks = result.candidates[0].grounding_metadata.grounding_chunks
        supports = result.candidates[0].grounding_metadata.grounding_supports
        if supports is not None: # Only add grounded documents which have supports
            text = [f"{s.segment.text}" for s in supports]
            source = [f"{c.web.title}" for c in chunks]
            score = [f"{s.confidence_scores}" for s in supports]
            document = [{"text": ", ".join(text)}]
            tqdm(self.db.add(ids=str(self.db.count()), 
                             documents=json.dumps(document), 
                             metadatas=[{"source": ", ".join(source), 
                                         "confidence_score": ", ".join(score), 
                                         "topic": topic,
                                         "question": query}]), 
                 desc="Generate grounding embedding")

    def get_grounding_documents(self, query: str, topic: str):
        self.embed_fn.document_mode = False # Switch to query mode.
        return self.db.get(where={"$and": [{"question" : query}, {"topic": topic}]})
            
    def add_wiki_document(self, title: str, content: str):
        self.embed_fn.document_mode = True # Switch to document mode.
        result = self.get_wiki_documents(title)
        if len(result["documents"]) == 0:
            tqdm(self.db.add(ids=str(self.db.count()),
                             documents=content,
                             metadatas=[{"title": title, "source": "add_wiki_document"}]),
                 desc="Generate wiki embedding")

    def query_wiki_documents(self, query: str, title: str):
        return self.generate_answer(query, where={"title": title})
    
    def get_wiki_documents(self, title: Optional[str] = None):
        self.embed_fn.document_mode = False # Switch to query mode.
        if title is None:
            return self.db.get(where={"source": "add_wiki_document"})
        else:
            return self.db.get(where={"title": title})

    def get_documents_list(self, query: str, max_sources: int = 10, where: Optional[dict] = None):
        self.embed_fn.document_mode = False # Switch to query mode.
        result = self.db.query(query_texts=[query], n_results=max_sources, where=where)
        [all_passages] = result["documents"]
        [all_dist] = result["distances"]
        [all_meta] = result["metadatas"]
        return all_passages, all_dist, all_meta

    def get_exchanges_csv(self, query: str):
        return self.generate_answer(query, max_sources=100, where={"source": "exchanges.csv"})

    @retry.Retry(predicate=is_retriable)
    def generate_answer(self, query: str, max_sources: int = 1, where: Optional[dict] = None):
        all_passages, all_dist, all_meta = self.get_documents_list(query, max_sources, where)
        query_oneline = query.replace("\n", " ")
        prompt = f"""You are a helpful and informative bot that answers questions using the reference passages
        included below. Never mention the passages in your answers. Be sure to respond in concise sentences. 
        Include all relevant background information when possible. If a passage is not relevant to the answer 
        you must ignore it. If no passage answers the question respond with: I don't know.
        
        QUESTION: {query_oneline}
        """
        
        # Add the retrieved documents to the prompt.
        for passage in all_passages:
            passage_oneline = passage.replace("\n", " ")
            prompt += f"PASSAGE: {passage_oneline}\n"
    
        return self.client.models.generate_content(model=project_model, 
                                                   config=self.config_temp, 
                                                   contents=prompt)

In [13]:
# An implementation of Wiki-Grounding Generation:
# - using gemini-2.0-flash for response generation
# - using a RAG-implementation to store groundings
# - create new groundings by similarity to topic
# - retrieve existing groundings by similarity to topic
class WikiGroundingGenerator:
    config_temp = types.GenerateContentConfig(temperature=0.0)
    
    def __init__(self, genai_client, rag_impl):
        self.client = genai_client
        self.rag = rag_impl

    @retry.Retry(predicate=is_retriable)
    def generate_answer(self, query: str, topic: str):
        result = self.rag.get_wiki_documents(topic)
        if len(result["documents"]) > 0:
            return self.rag.query_wiki_documents(query, topic).text
        else:
            pages = wikipedia.search(topic + " company")
            if len(pages) > 0:
                p_topic_match = 0.80
                for i in range(len(pages)):
                    if tqdm(self.get_topic_similarity(topic, pages[i]) > p_topic_match, 
                            desc= "Score wiki search by similarity to topic"):
                        request = requests.get(f"https://en.wikipedia.org/wiki/{pages[i]}")
                        self.rag.add_wiki_document(topic, request.text)
                        response = self.client.models.generate_content(
                            model=project_model,
                            config=self.config_temp,
                            contents=f"""You're an expert writer. You understand how to interpret html. 
                                         Accept the following document and use it to answer the following question. 
                                         Don't mention the document, just answer the question. If an answer is not 
                                         possible respond with: I don't know.
                
                                         QUESTION:
                                         {query}?
                                         
                                         DOCUMENT:
                                         {request.text}""")
                        return response.text

    @retry.Retry(predicate=is_retriable)
    def get_topic_similarity(self, topic: str, page: str):
        content = [topic + " company", page]
        similarity = self.client.models.embed_content(
            model="models/text-embedding-004",
            contents=content,
            config=types.EmbedContentConfig(task_type="semantic_similarity"))
        df = pandas.DataFrame([e.values for e in similarity.embeddings], index=content)
        score = df @ df.T
        return score.iloc[0].iloc[1]

In [14]:
# An implementation of Grounding Generation:
# - using gemini-2.0-flash with GoogleSearch tool for response generation
# - using a RAG-implementation to store groundings
# - create new groundings by exact match to topic
# - retrieve existing groundings by similarity to topic
class GroundingGenerator:
    config_ground = types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())],
        temperature=0.0
    )
    
    def __init__(self, genai_client, rag_impl):
        self.client = genai_client
        self.rag = rag_impl

    def generate_answer(self, query: str, topic: str):
        docs = self.rag.get_grounding_documents(query, topic)
        if len(docs["documents"]) > 0:
            for i in range(len(docs["metadatas"])):
                doc = docs["documents"][i]
                meta_q = docs["metadatas"][i]["question"]
                p_ground_match = 0.95 # This can be really high ~ 95-97%
                if tqdm(self.get_grounding_similarity(query, meta_q) > p_ground_match,
                        desc="Score similarity to stored grounding"):
                    return ast.literal_eval(doc)[0]["text"]
        return self.get_grounding(query, topic)

    @retry.Retry(predicate=is_retriable)
    def get_grounding_similarity(self, question: str, compare: str):
        content = [question, compare]
        similarity = self.client.models.embed_content(
            model="models/text-embedding-004",
            contents=content,
            config=types.EmbedContentConfig(task_type="semantic_similarity"))
        df = pandas.DataFrame([e.values for e in similarity.embeddings], index=content)
        score = df @ df.T
        return score.iloc[0].iloc[1]

    @retry.Retry(predicate=is_retriable)
    def get_grounding(self, query: str, topic: str):
        instruction = f"""
        You're a search assistant that provides grounded answers to questions about {topic}. You will provide only 
        results that discuss {topic}. Be brief and specific in answering and omit extra details.
        If an answer is not possible respond with: I don't know."""
        # Send the query and the instruction as two text parts of a single user turn.
        contents = [types.Content(role="user", parts=[types.Part(text=query), types.Part(text=instruction)])]
        response = self.client.models.generate_content(
            model=project_model, 
            config=self.config_ground, 
            contents=contents)
        if response.candidates[0].grounding_metadata.grounding_supports is not None:
            if topic.replace("'", "") not in response.text: # Exact topic match required
                return "I don't know." # Workaround a bug in gemini-2.0-flash (MGM Studio becomes MGM Resorts)
            else:
                self.rag.add_grounded_document(query, topic, response)
                return response.text
        return "I don't know." # Empty grounding_supports means grounding not possible for query.

# Testing the RAG Implementation

<span style="font-size:18px;">
Let's load some test data and see what the RAG can do. The test data is a CSV file containing stock market exchange data. It includes the market id code, name, locale, and operating hours. The import will use CSVLoader from <code>langchain-community</code> to parse the exchange data into Documents that our RAG can ingest.
</span>
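
<span style="font-size:18px;">
As a rough illustration of what the loader hands to the RAG, the sketch below mimics the row-to-document step using only the standard library. The helper name <code>rows_to_documents</code> and the two sample columns are hypothetical. CSVLoader itself returns langchain Document objects, which this sketch approximates with plain dictionaries.
</span>

```python
import csv
import io

def rows_to_documents(csv_text: str, source: str) -> list:
    """Turn each CSV row into one document, roughly as CSVLoader does."""
    documents = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader):
        # One "column: value" line per field, joined into a single passage.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append({
            "page_content": content,
            "metadata": {"source": source, "row": row_number},
        })
    return documents

# Hypothetical two-row sample; the real exchanges.csv has more columns.
sample = 'code,name\nUS,"US exchanges (NYSE, Nasdaq)"\nL,LONDON STOCK EXCHANGE\n'
docs = rows_to_documents(sample, "exchanges.csv")
print(docs[0]["page_content"])
```

<span style="font-size:18px;">
Each row becomes its own passage, which is why a metadata filter on the source file is enough to restrict a query to exchange data.
</span>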

In [15]:
# Load the exchange data from source csv.
# - Identifies exchanges by a 1-2 letter code which can be used to filter response data.
# - Also maps the exchange code to exchange details.
df = pandas.read_csv("/kaggle/input/exchanges/exchanges_src.csv").drop(["close_date"], axis=1).fillna("")
df.to_csv("exchanges.csv", index=False)
exchanges = CSVLoader(file_path="exchanges.csv", encoding="utf-8", csv_args={"delimiter": ","}).load()

# Prepare a RAG tool for use and add the exchange data.
tool_rag = RetrievalAugmentedGenerator(client, "finance")
tool_rag.add_documents_list(exchanges)

# Prepare the grounding tools for use.
tool_wiki = WikiGroundingGenerator(client, tool_rag)
tool_ground = GroundingGenerator(client, tool_rag)

Generate document embedding: 100%|██████████| 77/77 [00:14<00:00,  5.15it/s]


<span style="font-size:18px;">
Now that the data is loaded, let's ask our RAG to perform some augmenting. We can ask it to perform all sorts of useful tasks. We'll generate some reusable data structures and check that it can answer important questions. The exchanges all have IDs that are used to filter the real-time data, so we'll make sure the RAG knows how to create this mapping. We'll also check its awareness of operating hours. After all, Essy doesn't mindlessly hammer away at APIs when no new data is available.
</span>
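
<span style="font-size:18px;">
Since the mapping comes back as a dictionary in string form, it has to be parsed before it can be reused. The following is a minimal sketch of one way to do that, assuming the reply is a plain dictionary literal that may be wrapped in a code fence. The helper <code>parse_dict_response</code> and the sample reply are hypothetical and not part of the notebook.
</span>

```python
import ast

FENCE = "`" * 3  # Markdown code fence, built here to keep the example readable.

def parse_dict_response(text: str) -> dict:
    """Parse a dictionary literal from a reply that may be wrapped in a code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line, which may also name a language.
        cleaned = cleaned.partition("\n")[2]
    if cleaned.endswith(FENCE):
        cleaned = cleaned[: -len(FENCE)]
    parsed = ast.literal_eval(cleaned.strip())
    if not isinstance(parsed, dict):
        raise ValueError("expected a dictionary literal")
    return parsed

# Hypothetical reply shaped like a short get_exchanges_csv response.
reply = FENCE + '\n{\n  "US": "US exchanges (NYSE, Nasdaq)",\n  "L": "LONDON STOCK EXCHANGE"\n}\n' + FENCE
codes = parse_dict_response(reply)
print(codes["L"])
```

<span style="font-size:18px;">
A truncated or malformed reply makes <code>ast.literal_eval</code> raise, so a caller would likely want to catch that and ask again.
</span>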

In [16]:
# The RAG tool is a helpful expert. Please.

response = tool_rag.get_exchanges_csv("""Give me a dictionary in string form. It must contain key:value pairs mapping 
                                         exchange code to name. Just the dictionary string in pretty form.""")
print(response.text)

response = tool_rag.get_exchanges_csv("""What is the Germany exchange code? Return only the exchange codes as a simple
                                         comma separated value that I can copy.""")
print(response.text)

response = tool_rag.get_exchanges_csv("What are the Germany exchanges and their corresponding exchange codes?")
print(response.text)

response = tool_rag.generate_answer("What are Google's stock ticker symbols?")
print(response.text)

response = tool_rag.get_exchanges_csv("What are the US exchange operating hours?")
print(response.text, "\n")

response = tool_rag.get_exchanges_csv(f"""Answer based on your knowledge of exchange operating hours. 
    The exchanges are all closed on weekends. Consider after-market hours as the market being open. When did the US 
    exchanges last close? Provide the date and time in Eastern Time. This should be one of: Mon, Tue, Wed, Thu, Fri.
    
    The current date and time is: {datetime.now().strftime('%c')}
    
    Provide only the date and time. Omit all other information or details. Do not chat or use sentences.""")
print(response.text)

```
{
  "SC": "BOERSE_FRANKFURT_ZERTIFIKATE",
  "SX": "DEUTSCHE BOERSE Stoxx",
  "HK": "HONG KONG EXCHANGES AND CLEARING LTD",
  "DB": "DUBAI FINANCIAL MARKET",
  "NZ": "NEW ZEALAND EXCHANGE LTD",
  "QA": "QATAR EXCHANGE",
  "KS": "KOREA EXCHANGE (STOCK MARKET)",
  "SW": "SWISS EXCHANGE",
  "DU": "BOERSE DUESSELDORF",
  "BC": "BOLSA DE VALORES DE COLOMBIA",
  "KQ": "KOREA EXCHANGE (KOSDAQ)",
  "SN": "SANTIAGO STOCK EXCHANGE",
  "SI": "SINGAPORE EXCHANGE",
  "AD": "ABU DHABI SECURITIES EXCHANGE",
  "CO": "OMX NORDIC EXCHANGE COPENHAGEN A/S",
  "L": "LONDON STOCK EXCHANGE",
  "ME": "MOSCOW EXCHANGE",
  "TO": "TORONTO STOCK EXCHANGE",
  "BD": "BUDAPEST STOCK EXCHANGE",
  "TG": "DEUTSCHE BOERSE TradeGate",
  "US": "US exchanges (NYSE, Nasdaq)",
  "TW": "TAIWAN STOCK EXCHANGE",
  "JK": "INDONESIA STOCK EXCHANGE",
  "SZ": "SHENZHEN STOCK EXCHANGE",
  "VS": "NASDAQ OMX VILNIUS",
  "MX": "BOLSA MEXICANA DE VALORES (MEXICAN STOCK EXCHANGE)",
  "DE": "XETRA",
  "PR": "PRAGUE STOCK EXCHANGE",
  "

```

<span style="font-size:18px;">
Excellent! Despite my best efforts, though, I could not convince Gemini to apply date correction (during chaining) for holidays. It simply wasn't stable enough to be useful. I would either have to add a holiday data set or, as I chose, apply a quick temporary fix. A real-time API endpoint may fail because the selected date is a holiday. If that happens, I'll retry with Thursday when the failure happened on Friday, and with Friday when the failure happened on Monday. Crude but simple for foundational purposes.
</span>
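
<span style="font-size:18px;">
The temporary fix can be sketched as a small date helper. This is only an illustration of the rule described above. The function name <code>fallback_trading_date</code> is hypothetical, and stepping back one day for Tuesday through Thursday is my assumption, since only the Friday and Monday cases are spelled out.
</span>

```python
from datetime import date, timedelta

def fallback_trading_date(failed: date) -> date:
    """Pick the date to retry after a request for `failed` returned no data."""
    # Monday steps back three days to Friday. Every other day steps back one
    # day (assumption: the input is always a weekday).
    step = 3 if failed.weekday() == 0 else 1
    return failed - timedelta(days=step)

# Friday 2025-04-18 falls back to Thursday 2025-04-17.
print(fallback_trading_date(date(2025, 4, 18)))
# Monday 2025-04-21 falls back to Friday 2025-04-18.
print(fallback_trading_date(date(2025, 4, 21)))
```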

# Declaring the Function Calling Metadata

<span style="font-size:18px;">
Our Function Calling expert will chain together the other experts we've implemented thus far. It also provides the final response through augmentation, this time using the tools as a source of grounding truth. I'd like to say it's all truth organised by topic and other metadata. It's still a precarious situation if Essy incidentally chains into mining data on another topic. We want Amazon to be the owner of MGM Studios, not MGM Resorts International. We also don't want a summary to include other companies. This appears to be a relevant issue that isn't isolated to questions about the stock market, though. Other topics have a tendency to encroach upon a discussion.
</span>

<span style="font-size:18px;">
The function calling metadata is thus extremely important. It needs to combine our other experts with the real-time APIs' data. Essy will use two API providers as sources of finance data. The primary motivation is that each provider has its own limits, yet both are useful in their own way. This is useful anywhere you need a broad spectrum of sources of truth. At metadata creation I'll adopt the naming convention of appending the provider ID (if any) to the function name. This helps keep functions understandable, because you know which provider you're dealing with.
</span>
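
<span style="font-size:18px;">
One way to take advantage of the naming convention is to split the provider suffix off a function name when routing a call. The sketch below is hypothetical. The helper <code>split_provider</code> and the placeholder provider labels are assumptions for illustration, and the notebook's actual dispatch may look different.
</span>

```python
from typing import Optional, Tuple

# Placeholder labels (assumption): the convention only says that the numeric
# suffix identifies a provider, not which number belongs to which API.
PROVIDERS = {"1": "provider_1", "2": "provider_2"}

def split_provider(function_name: str) -> Tuple[str, Optional[str]]:
    """Split a declared function name into its base name and provider label."""
    base, _, suffix = function_name.rpartition("_")
    if suffix in PROVIDERS:
        return base, PROVIDERS[suffix]
    # No provider suffix: the function is served locally.
    return function_name, None

print(split_provider("get_symbol_quote_1"))
print(split_provider("get_daily_candlestick_2"))
print(split_provider("get_last_market_close"))
```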

In [17]:
# Declare the callable functions using OpenAPI schema
get_symbol_1 = types.FunctionDeclaration(
    name="get_symbol_1",
    description="""Search for the stock ticker symbol of a given company, security, isin or cusip. Each ticker
                   entry provides a description, symbol, and asset type. If this doesn't help you should try 
                   calling get_wiki_tool_response next.""",
    parameters={
        "type": "object",
        "properties": {
            "q": {
                "type": "string",
                "description": """The company, security, isin or cusip to search for a symbol."""
            },
            "exchange": {
                "type": "string",
                "description": """The exchange code used to filter results. When not specified the default exchange 
                                  code you should use is 'US' for the US exchanges. A dictionary mapping all supported 
                                  exchange codes to their names can be retrieved by calling get_exchange_codes_1. 
                                  Search for an exchange code to use by calling get_exchange_code_1, specifying the
                                  exchange code to search for."""
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            }
        },
        "required": ["q", "exchange", "query"]
    }
)

get_name_1 = types.FunctionDeclaration(
    name="get_name_1",
    description="""Search for the name associated with a stock ticker or symbol's company, security, isin or cusip. 
    Each ticker entry provides a description, matching symbol, and asset type.""",
    parameters={
        "type": "object",
        "properties": {
            "q": {
                "type": "string",
                "description": """The symbol or ticker to search for."""
            },
            "exchange": {
                "type": "string",
                "description": """The exchange code used to filter results. When not specified the default exchange 
                                  code you should use is 'US' for the US exchanges. A dictionary mapping all supported 
                                  exchange codes to their names can be retrieved by calling get_exchange_codes_1. 
                                  Search for an exchange code to use by calling get_exchange_code_1, specifying the
                                  exchange code to search for."""
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            },
            "company": {
                "type": "string",
                "description": "The company you're searching for."
            }
        },
        "required": ["q", "exchange", "query", "company"]
    }
)

get_symbol_quote_1 = types.FunctionDeclaration(
    name="get_symbol_quote_1",
    description="""Search for the current price or quote of a stock ticker or symbol. The response is
                   provided in json format. Each response contains the following key-value pairs:
                   
                   c: Current price,
                   d: Change,
                  dp: Percent change,
                   h: High price of the day,
                   l: Low price of the day,
                   o: Open price of the day,
                  pc: Previous close price,
                   t: Epoch timestamp of price in seconds.

                   Parse the response and respond according to this information.""",
    parameters={
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "The stock ticker symbol for a company, security, isin, or cusip."
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            },
            "exchange": {
                "type": "string",
                "description": "The exchange code used to filter quotes. This must always be 'US'."
            }
        },
        "required": ["symbol", "query", "exchange"]
    }
)

get_local_datetime_1 = types.FunctionDeclaration(
    name="get_local_datetime_1",
    description="""Converts an array of timestamps from epoch time to the local timezone format. The result is an array
                   of date and time in locale appropriate format. Suitable for use in a locale appropriate response.
                   Treat this function as a vector function. Always prefer to batch timestamps for conversion. Use this
                   function to format your final response.""",
    parameters={
        "type": "object",
        "properties": {
            "t": {
                "type": "array",
                "description": """An array of timestamps in seconds since epoch to be converted. The order of
                                  timestamps matches the order of conversion.""",
                "items": {
                    "type": "integer"
                }
            }
        },
        "required": ["t"]
    }
)

get_market_status_1 = types.FunctionDeclaration(
    name="get_market_status_1",
    description="""Search for the current market status of global exchanges. Checks whether exchanges are open or 
                   closed. The response is provided in json format. Each response contains the following key-value 
                   pairs:

                   exchange: Exchange code,
                   timezone: Timezone,
                    holiday: Holiday event name, or n/a if it's not a holiday,
                     isOpen: Whether the market is open at the moment,
                          t: Epoch timestamp of status in seconds,
                    session: The market session can be 1 of the following values: 
                    
                    pre-market,regular,post-market when open, or n/a if closed.
                    
                    Parse the response and respond according to this information.""",
    parameters={
        "type": "object",
        "properties": {
            "exchange": {
                "type": "string",
                "description": """The exchange code used to filter results. When not specified the default exchange 
                                  code you should use is 'US' for the US exchanges. A dictionary mapping all supported 
                                  exchange codes to their names can be retrieved by calling get_exchange_codes_1. 
                                  Search for an exchange code to use by calling get_exchange_code_1, specifying the
                                  exchange code to search for."""
            }
        },
        "required": ["exchange"]
    }
)

get_company_peers_1 = types.FunctionDeclaration(
    name="get_company_peers_1",
    description="""Search for a company's peers. Returns a list of peers operating in the same country and in the same
                   sector, industry, or subIndustry. Each response contains the following key-value pairs: 
                   
                   symbol: The company's stock ticker symbol, 
                   peers: A list containing the peers.
                   
                   Each peers entry contains the following key-value pairs:
                   
                   symbol: The peer company's stock ticker symbol, 
                   name: The peer company's name.
                   
                   Parse the response and respond according to this information.""",
    parameters={
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "The stock ticker symbol of a company to obtain peers."
            },
            "grouping": {
                "type": "string",
                "description": """Specify the grouping category for choosing peers. When not specified the default
                                  category is subIndustry. This parameter may be one of the following values: 
                                  sector, industry, subIndustry."""
            },
            "exchange": {
                "type": "string",
                "description": """The exchange code used to filter results. When not specified the default exchange 
                                  code you should use is 'US' for the US exchanges. A dictionary mapping all supported 
                                  exchange codes to their names can be retrieved by calling get_exchange_codes_1. 
                                  Search for an exchange code to use by calling get_exchange_code_1, specifying the
                                  exchange code to search for."""
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            }
        },
        "required": ["symbol", "grouping", "exchange", "query"]
    }
)

get_exchange_codes_1 = types.FunctionDeclaration(
    name="get_exchange_codes_1",
    description="""Get a dictionary mapping all supported exchange codes to their names."""
)

get_exchange_code_1 = types.FunctionDeclaration(
    name="get_exchange_code_1",
    description="""Search for the exchange code to use when filtering by exchange. The result will be one or
                   more exchange codes provided as a comma-separated string value.""",
    parameters={
        "type": "object",
        "properties": {
            "q": {
                "type": "string",
                "description": "Specifies which exchange code to search for."
            }
        },
        "required": ["q"]
    }
)

get_financials_1 = types.FunctionDeclaration(
    name="get_financials_1",
    description="""Get company basic financials such as margin, P/E ratio, 52-week high/low, etc. Parse the response for 
                   key-value pairs in json format and interpret their meaning as stock market financial indicators.""",
    parameters={
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "Stock ticker symbol for a company."
            },
            "metric": {
                "type": "string",
                "description": "It must always be declared as the value 'all'"
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            }
        },
        "required": ["symbol", "metric", "query"]
    }
)

get_company_news_1 = types.FunctionDeclaration(
    name="get_company_news_1",
    description="Retrieve the most recent news articles related to a specified ticker.",
    parameters={
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "Stock ticker symbol for a company.",
            },
            "from": {
                "type": "string",
                "format": "date-time",
                "description": """A date in format YYYY-MM-DD must be older than the parameter 'to'. The default
                                  value is one month before today's date."""
            },
            "to": {
                "type": "string",
                "format": "date-time",
                "description": """A date in format YYYY-MM-DD must be more recent than the parameter 'from'. The
                                  default value is today's date."""
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            }
        },
        "required": ["symbol", "from", "to", "query"]
    },
)

get_daily_candlestick_2 = types.FunctionDeclaration(
    name="get_daily_candlestick_2",
    description="""Search for a daily summary stock ticker candlestick / aggregate bar (OHLC). 
                   Includes open, high, low, and close price. Also includes daily trade volume and pre-market/
                   after-hours trade prices.""",
    parameters={
        "type": "object",
        "properties": {
            "stocksTicker": {
                "type": "string",
                "description": "The stock ticker symbol of a company to search for.",
            },
            "date": {
                "type": "string",
                "format": "date-time",
                "description": """The date of the requested candlestick in format YYYY-MM-DD. The default is the date 
                                  from calling get_last_market_close. This date can never be more recent than
                                  get_last_market_close."""
            },
            "adjusted": {
                "type": "string",
                "description": """May be true or false. Indicates whether or not the results are adjusted for splits. 
                                  By default, results are adjusted. Set this to false to get results that are NOT 
                                  adjusted for splits."""
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            }
        },
        "required": ["stocksTicker", "date", "adjusted", "query"]
    },
)

get_custom_candlestick_2 = types.FunctionDeclaration(
    name="get_custom_candlestick_2",
    description="""Get a historical stock ticker candlestick / aggregate bar (OHLC) over a custom date range and 
                   time interval (in Eastern Time). Includes open, high, low, and close price. Also includes daily 
                   trade volume and pre-market/after-hours trade prices.""",
    parameters={
        "type": "object",
        "properties": {
            "stocksTicker": {
                "type": "string",
                "description": "The stock ticker symbol of a company to search for.",
            },
            "multiplier": {
                "type": "integer",
                "description": "Specifies the size of the timespan multiplier. The default value is 1."
            },
            "timespan": {
                "type": "string",
                "description": """The size of the candlestick's time window. This is allowed to be one of the following:
                                  second, minute, hour, day, week, month, quarter, or year. The default value is day."""
            },
            "from": {
                "type": "string",
                "format": "date-time",
                "description": """A date in format YYYY-MM-DD must be older than the parameter 'to'. The default
                                  value is one month before today's date."""
            },
            "to": {
                "type": "string",
                "format": "date-time",
                "description": """A date in format YYYY-MM-DD must be more recent than the parameter 'from'. The
                                  default value is today's date."""
            },
            "adjusted": {
                "type": "string",
                "description": """May be true or false. Indicates whether or not the results are adjusted for splits. 
                                  By default, results are adjusted. Set this to false to get results that are NOT 
                                  adjusted for splits."""
            },
            "sort": {
                "type": "string",
                "description": """May be one of asc or desc. asc will sort by timestamp in ascending order. desc will
                                  sort by timestamp in descending order."""
            },
            "limit": {
                "type": "integer",
                "description": """Set the number of base aggregates used to create this custom result. The default is 
                                  5000 and the maximum is 50000."""
            },
            "query": {
                "type": "string",
                "description": "The question you're attempting to answer."
            }
        },
        "required": ["stocksTicker", "multiplier", "timespan", "from", "to", "query", "adjusted", "sort", "limit"]
    },
)

get_last_market_close = types.FunctionDeclaration(
    name="get_last_market_close",
    description="""Get the date and time of the US exchange market's last close. Provides the last US market close in 
                   locale appropriate format."""
)

get_ticker_overview_2 = types.FunctionDeclaration(
    name="get_ticker_overview_2",
    description="""Retrieve comprehensive details for a single ticker symbol. It's a deep look into a company’s 
    fundamental attributes, including its primary exchange, standardized identifiers (CIK, composite FIGI, 
    share class FIGI), market capitalization, industry classification, and key dates. Also includes branding assets in
    the form of icons and logos.
    """,
    parameters={
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "Stock ticker symbol of a company."
            }
        },
        "required": ["ticker"]
    }
)

get_recommendation_trends_1 = types.FunctionDeclaration(
    name="get_recommendation_trends_1",
    description="""Get the latest analyst recommendation trends for a company.
                The data includes the latest recommendations as well as historical
                recommendation data for each month. The data is classified according
                to these categories: strongBuy, buy, hold, sell, and strongSell.
                The date of a recommendation is indicated by the value of 'period'.""",
    parameters={
        "type": "object",
        "properties": {
            "symbol": {
                "type": "string",
                "description": "Stock ticker symbol for a company."
            }
        },
        "required": ["symbol"]
    }
)

get_news_with_sentiment_2 = types.FunctionDeclaration(
    name="get_news_with_sentiment_2",
    description="""Retrieve the most recent news articles related to a specified ticker. Each article includes 
                   comprehensive coverage, including a summary, publisher information, article metadata, 
                   and sentiment analysis.""",
    parameters={
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "Stock ticker symbol for a company."
            },
            "published_utc": {
                "type": "string",
                "format": "date-time",
                "description": """Return results published on, before, or after this date in UTC. An example date 
                                  looks like this 2025-04-10T00:24:00Z. The default is to omit this value unless
                                  specified."""
            },
            "order": {
                "type": "string",
                "description": """Must be asc for ascending order, or desc for descending order.
                                  When order is omitted default to ascending ordering.
                                  Ordering will be based on the parameter: sort."""
            },
            "limit": {
                "type": "integer",
                "description": """This is allowed to range from 100 to 1000."""
            },
            "sort": {
                "type": "string",
                "description": """The sort field used for ordering. This value must
                                  always be published_utc."""
            }
        },
        "required": ["ticker", "order", "limit", "sort"]
    }
)

get_rag_tool_response = types.FunctionDeclaration(
    name="get_rag_tool_response",
    description="""A database containing useful financial information. Always check here for answers first.""",
    parameters={
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "A question needing an answer. Asked as a simple string."
            }
        },
        "required": ["question"]
    }
)

get_wiki_tool_response = types.FunctionDeclaration(
    name="get_wiki_tool_response",
    description="""Answers questions that still have unknown answers. Retrieve a wiki page related to a company, 
                   product, or service. Each web page includes detailed company information, financial indicators, 
                   tickers, symbols, history, and products and services.""",
    parameters={
        "type": "object",
        "properties": {
            "id": {
                "type": "string",
                "description": "The question's company or product. Just the name and no other details."
            },
            "q": {
                "type": "string",
                "description": "The complete, unaltered, query string."
            }
        },
        "required": ["id", "q"]
    }
)

get_search_tool_response = types.FunctionDeclaration(
    name="get_search_tool_response",
    description="Answers questions that still have unknown answers. Use it after checking all your other tools.",
    parameters={
        "type": "object",
        "properties": {
            "q": {
                "type": "string",
                "description": "The question needing an answer. Asked as a simple string."
            },
            "id": {
                "type": "string",
                "description": "The question's company or product. In one word. Just the name and no other details."
            }
        },
        "required": ["q", "id"]
    }
)
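
Because the model predicts the arguments, a handler can receive a call that omits a key the schema marks as required. A minimal sketch of a pre-flight check follows; `missing_required` is a hypothetical helper, not part of the google-genai SDK, and it assumes the same plain-dict `parameters` shape used in the declarations above.

```python
def missing_required(parameters: dict, fn_args: dict) -> list:
    """Return the required parameter names that are absent from fn_args."""
    return [name for name in parameters.get("required", []) if name not in fn_args]

# Same shape as the get_search_tool_response schema (descriptions omitted).
search_parameters = {
    "type": "object",
    "properties": {"q": {"type": "string"}, "id": {"type": "string"}},
    "required": ["q", "id"],
}

# The predicted call forgot "id", so the helper reports it.
print(missing_required(search_parameters, {"q": "What is Alphabet's ticker?"}))  # ['id']
```

A handler could return "I don't know." whenever this list is non-empty rather than raising a `KeyError`.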

# Implementing the Function Calls

<span style="font-size:18px;">
One downside of this section being the core of the project is that I ran out of time to refactor it further. Our formative Essy pulls as much useful data as it can from two financial APIs. To use it, you will need to declare secrets for the <a class="anchor-link" href="https://finnhub.io/dashboard">Finnhub</a> and <a class="anchor-link" href="https://polygon.io/dashboard">Polygon</a> finance APIs. Register at their respective sites for a free API key, then import each secret the same way you set up Google's API key.
</span>
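
Most of the handlers in the next cell repeat the same pattern: request, parse, and fall back to "I don't know." on failure. One possible refactor is a pair of small helpers. This is only a sketch: `fetch_json`, `tool_reply`, and the injected `get_text` callable are hypothetical names that do not exist in the notebook. Injecting `get_text` keeps the example runnable without network access; in the notebook it could wrap `requests.get(url, timeout=10).text`.

```python
import json
from typing import Any, Callable, Optional

UNKNOWN = "I don't know."

def fetch_json(url: str, get_text: Callable[[str], str]) -> Optional[Any]:
    """Return the parsed JSON body for url, or None if the request or parse fails."""
    try:
        return json.loads(get_text(url))
    except Exception:
        return None

def tool_reply(payload: Any, ok: Callable[[Any], bool] = bool) -> Any:
    """Convert a payload into the list-of-pairs shape the handlers return."""
    if payload is None or not ok(payload):
        return UNKNOWN
    return list(payload.items()) if isinstance(payload, dict) else payload

# Stub transport standing in for: lambda url: requests.get(url, timeout=10).text
fake_get = lambda url: '{"status": "OK", "ticker": "GOOGL"}'
overview = fetch_json("https://example.invalid/overview", fake_get)
print(tool_reply(overview, ok=lambda p: p.get("status") == "OK"))
```

With helpers like these, each handler would shrink to building its URL and choosing an `ok` predicate.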

In [18]:
# Implement the callable functions and the function handler

def ask_rag_tool(content):
    return tool_rag.generate_answer(content["question"], max_sources = 20).text

def ask_wiki_tool(content):
    return tool_wiki.generate_answer(content["q"], content["id"])

def ask_search_tool(content):
    return tool_ground.generate_answer(content["q"], content["id"])

def rag_exchange_codes_1(content):
    response = tool_rag.get_exchanges_csv("""Give me a dictionary in string form. It must contain key:value pairs 
                                             mapping exchange code to name. Just the dictionary string.
                                             Omit all other information or details. Do not chat or use sentences.""")
    # Strip surrounding backticks (and stray backslashes) before parsing the dictionary literal.
    codes = list(ast.literal_eval(response.text.strip("\\`")).items())
    return codes

def rag_exchange_code_1(content):
    codes = tool_rag.get_exchanges_csv(f"""What is the {content} exchange code? Return only the exchange codes 
                                           as a list in string form. Just the list string.
                                           Omit all other information or details. Do not chat or use sentences.""").text
    return ast.literal_eval(codes)
    
def rag_last_market_close(content):
    return dateutil.parse(tool_rag.get_exchanges_csv(f"""Answer based on your knowledge of exchange operating hours. 
    The exchanges are all closed on weekends. Consider after-market hours as the market being open. When did the US 
    exchanges last close? Provide the date and time in Eastern Time. This should be one of: Mon, Tue, Wed, Thu, Fri.
    
    The current date and time is: {datetime.now().strftime('%c')}
    
    Provide only the date and time. Omit all other information or details. Do not chat or use sentences.""").text).strftime('%c')

def get_similarity_score(content):
    similarity = client.models.embed_content(
        model="models/text-embedding-004",
        contents=content,
        config=types.EmbedContentConfig(task_type="semantic_similarity"))
    df = pandas.DataFrame([e.values for e in similarity.embeddings], index=content)
    score = df @ df.T
    return score.iloc[0].iloc[1]
    
def impl_get_symbol_1(content, by_name: bool = True):
    response = tool_rag.get_api_documents(content["query"], content["q"], "get_symbol_1")
    if len(response[0]) == 0: # index [0] for document content
        url = f"https://finnhub.io/api/v1/search?q={content['q']}&exchange={content['exchange']}&token={FINNHUB_API_KEY}"
        try:
            response = json.loads(requests.get(url).text)
        except:
            return "I don't know."
        else:
            matches = []
            max_failed_match = len(response["result"]) if not by_name else 3
            p_desc_match = 0.80
            p_symb_match = 0.95
            if response["count"] > 0:
                for result in tqdm(response["result"], desc="Score similarity to query"):
                    if max_failed_match > 0:
                        desc = [content['q'].upper(), result["description"].split("-", -1)[0]]
                        symb = [content['q'].upper(), result["symbol"]]
                        if by_name and get_similarity_score(desc) > p_desc_match: 
                            matches.append(result["symbol"])
                        elif not by_name and get_similarity_score(symb) > p_symb_match:
                            matches.append(result["description"])
                            max_failed_match = 0
                        else:
                            max_failed_match -= 1
            if len(matches) > 0:
                tool_rag.add_api_document(content["query"], ", ".join(matches), content["q"], "get_symbol_1")
                return ", ".join(matches)
            else:
                return "I don't know."
    else:
        doc = ast.literal_eval(response[0][0])[0]
        return doc["answer"]

def impl_get_name_1(content):
    return impl_get_symbol_1(content, by_name = False)

def impl_get_quote_1(content):
    quotes = tool_rag.get_api_documents(content["query"], content["symbol"], "get_quote_1")
    isOpen = dict(impl_get_market_status_1(content))["isOpen"]
    if len(quotes[0]) == 0 or isOpen: 
        return get_current_price_1(content)
    else:
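        # rag_last_market_close returns a formatted string, so parse it back into a datetime.
        last_close = dateutil.parse(rag_last_market_close(content)).timestamp()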
        for quote in quotes[2]: # index [2] for metadata
            if last_close == quote["timestamp"]:
                return quotes
        return get_current_price_1(content)

def get_current_price_1(content):
    url = f"https://finnhub.io/api/v1/quote?symbol={content['symbol']}&token={FINNHUB_API_KEY}"
    # This is a high-demand endpoint. Expect random failure under heavy (free) use.
    try:
        response = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if len(response) > 0 and response["t"] > 0:
            tool_rag.add_quote_document(content["query"], response, content["symbol"], response["t"], "get_quote_1")
            return list(response.items())
        return "I don't know."

def impl_get_market_status_1(content):
    url = f"https://finnhub.io/api/v1/stock/market-status?exchange={content['exchange']}&token={FINNHUB_API_KEY}"
    try:
        response = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if len(response) > 0:
            return list(response.items())
        return "I don't know."

def impl_get_peers_1(content):
    docs = tool_rag.get_peers_document(content["query"], content["symbol"], content['grouping'])
    if len(docs[0]) == 0: # index [0] for document content
        url = f"https://finnhub.io/api/v1/stock/peers?symbol={content['symbol']}&grouping={content['grouping']}&token={FINNHUB_API_KEY}"
        try:
            peers = json.loads(requests.get(url).text)
        except:
            return "I don't know."
        else:
            if len(peers) > 0:
                names = []
                for peer in peers:
                    if peer == content["symbol"]:
                        continue # skip including the query symbol in peers (included in metadata anyway)
                    name_lookup = dict(q=peer, exchange=content["exchange"], query=content["query"])
                    name = impl_get_name_1(name_lookup)
                    if name != "I don't know.":
                        p = {"symbol": peer, "name": name}
                        names.append(p)
                peers = {"symbol": content["symbol"], "peers": names}
                tool_rag.add_peers_document(content["query"], peers, content["symbol"], "get_peers_1", content['grouping'])
                return list(peers.items())
            return "I don't know."
    else:
        peers = ast.literal_eval(docs[0][0])[0]["answer"] # The first document should be most relevant.
        return list(peers.items())

def impl_local_datetime_1(content):
    local_t = []
    for timestamp in content["t"]:
        local_t.append(datetime.fromtimestamp(timestamp).strftime('%c'))
    return local_t

def impl_get_financials_1(content):
    url = f"https://finnhub.io/api/v1/stock/metric?symbol={content['symbol']}&metric={content['metric']}&token={FINNHUB_API_KEY}"
    try:
        fin = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if not fin:
            return "I don't know."
        return list(fin.items())

def impl_get_news_1(content):
    url = f"https://finnhub.io/api/v1/company-news?symbol={content['symbol']}&from={content['from']}&to={content['to']}&token={FINNHUB_API_KEY}"
    try:
        news = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if len(news) == 0:
            return "I don't know."
        return news

def impl_daily_candle_2(content):
    url = f"https://api.polygon.io/v1/open-close/{content['stocksTicker']}/{content['date']}?adjusted={content['adjusted']}&apiKey={POLYGON_API_KEY}"
    try:
        # Parse the body as JSON, matching the other handlers.
        daily_candle = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if daily_candle["status"] == "OK":
            return list(daily_candle.items())
        else:
            date = dateutil.parse(content["date"])
            new_date = None
            if date.weekday() == 4: # index 4 for friday
                new_date = date - timedelta(days=1)
            elif date.weekday() == 0: # index 0 for monday
                new_date = date - timedelta(days=3)
            if new_date is None:
                return "I don't know."
            content["date"] = new_date.strftime("%Y-%m-%d")
            return impl_daily_candle_2(content)

def impl_custom_candle_2(content):
    url = f"""https://api.polygon.io/v2/aggs/ticker/{content['stocksTicker']}/range/{content['multiplier']}/{content['timespan']}/{content['from']}/{content['to']}?adjusted={content['adjusted']}&sort={content['sort']}&limit={content['limit']}&apiKey={POLYGON_API_KEY}"""
    try:
        custom_candle = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if custom_candle["status"] == "OK":
            return list(custom_candle.items())
        return "I don't know."

def impl_ticker_overview_2(content):
    url = f"https://api.polygon.io/v3/reference/tickers/{content['ticker']}?apiKey={POLYGON_API_KEY}"
    try:
        overview = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if overview["status"] == "OK":
            return list(overview.items())
        return "I don't know."

def impl_trends_1(content):
    url = f"https://finnhub.io/api/v1/stock/recommendation?symbol={content['symbol']}&token={FINNHUB_API_KEY}"
    try:
        trends = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if len(trends) > 0:
            return trends
        return "I don't know."

def impl_get_news_2(content):
    url = f"https://api.polygon.io/v2/reference/news?ticker={content['ticker']}&order={content['order']}&limit={content['limit']}&sort={content['sort']}&apiKey={POLYGON_API_KEY}"
    try:
        news = json.loads(requests.get(url).text)
    except:
        return "I don't know."
    else:
        if news["status"] == "OK":
            return list(news.items())
        return "I don't know."
        
finance_tool = types.Tool(
    function_declarations=[
        get_symbol_1,
        get_name_1,
        get_symbol_quote_1,
        get_market_status_1,
        get_company_peers_1,
        get_local_datetime_1,
        get_last_market_close,
        get_exchange_codes_1,
        get_exchange_code_1,
        get_financials_1,
        get_company_news_1,
        get_daily_candlestick_2,
        get_custom_candlestick_2,
        get_ticker_overview_2,
        get_recommendation_trends_1,
        get_news_with_sentiment_2,
        get_rag_tool_response,
        get_wiki_tool_response,
        get_search_tool_response
    ]
)

function_handler = {
    "get_symbol_1": impl_get_symbol_1,
    "get_name_1": impl_get_name_1,
    "get_symbol_quote_1": impl_get_quote_1,
    "get_market_status_1": impl_get_market_status_1,
    "get_company_peers_1": impl_get_peers_1,
    "get_local_datetime_1": impl_local_datetime_1,
    "get_last_market_close": rag_last_market_close,
    "get_exchange_codes_1": rag_exchange_codes_1,
    "get_exchange_code_1": rag_exchange_code_1,
    "get_financials_1": impl_get_financials_1,
    "get_company_news_1": impl_get_news_1,
    "get_daily_candlestick_2": impl_daily_candle_2,
    "get_custom_candlestick_2": impl_custom_candle_2,
    "get_ticker_overview_2": impl_ticker_overview_2,
    "get_recommendation_trends_1": impl_trends_1,
    "get_news_with_sentiment_2": impl_get_news_2,
    "get_rag_tool_response": ask_rag_tool,
    "get_wiki_tool_response": ask_wiki_tool,
    "get_search_tool_response": ask_search_tool
}
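
The handler table maps each declared function name to its implementation. As a hedged sketch, the lookup can be wrapped so that an unexpected name or a missing argument produces the same "I don't know." reply the handlers already use, instead of raising. `dispatch` and the `echo` handler are hypothetical names used only for illustration.

```python
UNKNOWN = "I don't know."

def dispatch(handlers: dict, fn_name: str, fn_args: dict):
    """Call the handler registered under fn_name, falling back to UNKNOWN on failure."""
    handler = handlers.get(fn_name)
    if handler is None:
        return UNKNOWN  # the model predicted a function that was never registered
    try:
        return handler(dict(fn_args))
    except (KeyError, TypeError, ValueError):
        return UNKNOWN  # a missing or malformed argument

# Toy table with the same name-to-callable shape as function_handler.
handlers = {"echo": lambda content: content["q"]}

print(dispatch(handlers, "echo", {"q": "GOOGL"}))     # GOOGL
print(dispatch(handlers, "not_registered", {}))       # I don't know.
print(dispatch(handlers, "echo", {}))                 # I don't know.
```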

In [19]:
# Define the system prompt.
from zoneinfo import ZoneInfo

# datetime.now() alone returns the host's local time, which may not be Eastern Time.
now_eastern = datetime.now(ZoneInfo("America/New_York"))

instruction = f"""You are a helpful and informative bot that answers finance and stock market questions. 
Only answer the question asked and do not change topic. While the answer is still
unknown you must follow these rules for predicting function call order:

RULE#1: The current date is {now_eastern.strftime('%c')} in Eastern Time.
RULE#2: Always consult your other functions before get_search_tool_response.
RULE#3: Always consult get_wiki_tool_response before get_search_tool_response.
RULE#4: Always consult get_search_tool_response last.
RULE#5: Always respond incorporating as much useful information from function responses.
RULE#6: Always convert timestamps from epoch time using get_local_datetime_1."""

In [20]:
# Import the finance api secret keys.

POLYGON_API_KEY = UserSecretsClient().get_secret("POLYGON_API_KEY")
FINNHUB_API_KEY = UserSecretsClient().get_secret("FINNHUB_API_KEY")

In [21]:
# Implement the function calling expert.

def send_message(prompt):
    #display(Markdown("#### Prompt"))
    #print(prompt, "\n")
    # Define the response-style guidance sent alongside the user prompt.
    guidance = """
    Give a concise, and detailed summary. Use information that you learn from the API responses.
    Use your tools and function calls according to the rules. Convert any all-upper case identifiers
    to proper case in your response. Convert any abbreviated or shortened identifiers to their full forms.
    """
    # Define the user prompt parts. (Using += with a string would extend the list one character at a time.)
    contents = [types.Content(role="user", parts=[types.Part(text=prompt), types.Part(text=guidance)])]
    # Enable system prompt, function calling and minimum-randomness.
    config_fncall = types.GenerateContentConfig(
        system_instruction=instruction,
        tools=[finance_tool],
        temperature=0.0
    )
    # Handle cases with multiple chained function calls.
    function_calling_in_process = True
    while function_calling_in_process:
        # Send the user prompt and function declarations.
        response = client.models.generate_content(
            model=project_model, config=config_fncall, contents=contents
        )
        # A part can be a function call or natural language response.
        for part in response.candidates[0].content.parts:
            if function_call := part.function_call:
                # Extract the function call.
                fn_name = function_call.name
                #display(Markdown("#### Predicted function name"))
                #print(fn_name, "\n")
                # Extract the function call arguments.
                fn_args = {key: value for key, value in function_call.args.items()}
                #display(Markdown("#### Predicted function arguments"))
                #print(fn_args, "\n")
                # Call the predicted function.
                api_response = function_handler[fn_name](fn_args)[:20000] # Stay within the input token limit
                #display(Markdown("#### API response"))
                #print(api_response[:500], "...", "\n")
                # Create an API response part.
                api_response_part = types.Part.from_function_response(
                    name=fn_name,
                    response={"content": api_response},
                )
                # Append the model's function call part.
                contents.append(types.Content(role="model", parts=[types.Part(function_call=function_call)])) 
                # Append the api response part.
                contents.append(types.Content(role="user", parts=[api_response_part]))
            else:
                # The model gave a natural language response
                function_calling_in_process = False
                break # No more parts in response.
        if not function_calling_in_process:
            break # The function calling chain is complete.
            
    # Show the final natural language summary
    display(Markdown("#### Natural language response"))
    display(Markdown(response.text.replace("$", "\\\\$")))
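
The `[:20000]` slice caps characters only when a handler returns a string; for a list it caps the number of items instead. One way to make the cap uniform is to serialise first and then truncate. The sketch below assumes a plain text string is an acceptable function response. `clip_response` is a hypothetical helper, not part of the notebook.

```python
import json

def clip_response(payload, max_chars: int = 20000) -> str:
    """Serialise a handler result to text and cap its length in characters."""
    text = payload if isinstance(payload, str) else json.dumps(payload, default=str)
    return text[:max_chars]

# A short quote-like payload passes through untouched.
print(clip_response([("c", 151.16), ("pc", 153.33)]))

# A very large payload is cut to exactly max_chars characters.
print(len(clip_response(list(range(100000)))))  # 20000
```

The truncated text may no longer be valid JSON, which is fine when it is only passed to the model as context.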

# Ask a question

In [22]:
send_message(
    '''Tell me about Google's current share price, 
    short-term trends, and bullish versus bearish predictions.
    Include sentiment analysis please.''')

Score similarity to query: 100%|██████████| 11/11 [00:01<00:00,  9.87it/s]
Score wiki search by similarity to topic: 0it [00:00, ?it/s]
Generate wiki embedding: 0it [00:00, ?it/s]
Score similarity to query: 100%|██████████| 5/5 [00:00<00:00,  5.50it/s]
Generate api embedding: 0it [00:00, ?it/s]
Generate api embedding: 0it [00:00, ?it/s]


#### Natural language response

Here's a summary of Google's (Alphabet's) stock, trends, and predictions, incorporating sentiment analysis from recent news:

**Current Share Price:**

*   As of April 20, 2025, at 8:00 PM Eastern Time, Google's (GOOGL) current price is \\$151.16.
*   The stock is down \\$2.17, a 1.42% decrease.
*   The high for the day was \\$154.68, and the low was \\$148.50.
*   The previous close was \\$153.33.

**Analyst Recommendations:**

*   April 2025: Strong Buy: 18, Buy: 33, Hold: 13, Sell: 0, Strong Sell: 0

**Sentiment Analysis:**

The sentiment surrounding Alphabet (Google) is mixed, with a slight positive tilt. Recent news articles reflect the following:

*   **Positive:** Several articles suggest that Alphabet is a good long-term investment due to its strong position in AI, cloud computing, and digital advertising. Analysts see upside potential for the stock. Some articles highlight Alphabet's attractive valuation and potential for growth in its Google Cloud business. The company's investments in AI infrastructure and innovative AI solutions are also viewed favorably.
*   **Neutral:** Some articles present a mixed outlook, citing both the company's strengths and the challenges it faces, such as antitrust lawsuits and competition in the AI space.
*   **Negative:** Some articles express concerns about the impact of tariffs on Alphabet's earnings and revenues. There are also concerns about the potential for AI spending not delivering returns in the near term.

**Short-Term Trends:**

*   The Nasdaq has entered a bear market, which has impacted Alphabet's stock price.
*   Recent news suggests that the stock has declined significantly due to concerns over competition from generative AI and antitrust lawsuits.
*   However, some analysts believe that the current dip presents a buying opportunity for long-term investors.

**Bullish vs. Bearish Predictions:**

*   **Bullish:** Some analysts predict that Alphabet could surpass Apple's market cap by 2030 due to its stronger growth trajectories and AI capabilities.
*   **Bearish:** Some analysts express concerns about the impact of tariffs and the potential for AI spending not delivering returns in the near term.

**Summary:**

Alphabet (Google) is a dominant player in the technology industry with a strong position in AI, cloud computing, and digital advertising. While the company faces some challenges, such as antitrust lawsuits and competition in the AI space, analysts generally view the stock as a good long-term investment. Recent news suggests that the stock has declined significantly, but some analysts believe that the current dip presents a buying opportunity. The sentiment surrounding Alphabet is mixed, with a slight positive tilt.


# Conclusion

<span style="font-size:18px;">
For now, that will have to do. Essy has a solid foundation, but more could be done to organise the metadata. No evaluation or validation has been performed beyond fuzzing the prompt. Next steps include restructuring the vector database based on lessons learned, followed by plotting, multi-modal, and structured output. The last-close date function can also be temperamental, much as Gemini itself tends to be with dates. I've learnt so much, and I'm happy I decided to participate in the event! It really has been a joy to see Essy grow from a random chat with Gemini into the foundation for a good broker buddy. I hope you enjoy playing with this edition as much as I enjoyed building it!
</span>
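
Since the last-close lookup is the temperamental piece, one alternative is to compute it deterministically rather than asking the model. The sketch below is an illustration built on assumptions, not the notebook's method. `last_market_close` is a hypothetical helper, it treats 8:00 PM Eastern as the end of after-market trading, and it ignores exchange holidays and early closes.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")

def last_market_close(now: datetime, close_at: time = time(20, 0)) -> datetime:
    """Most recent weekday close at or before `now`, in Eastern Time.

    Assumes `now` is timezone-aware; holidays and early closes are ignored.
    """
    now_et = now.astimezone(EASTERN)
    candidate = datetime.combine(now_et.date(), close_at, tzinfo=EASTERN)
    if candidate > now_et:
        candidate -= timedelta(days=1)   # today's close has not happened yet
    while candidate.weekday() >= 5:      # 5 = Saturday, 6 = Sunday
        candidate -= timedelta(days=1)
    return candidate

# On a Sunday the helper walks back to the preceding Friday evening.
sunday_noon = datetime(2025, 4, 13, 12, 0, tzinfo=EASTERN)
print(last_market_close(sunday_noon).isoformat())
```

A holiday calendar would still be needed for production use, but the weekday arithmetic no longer depends on the model's sense of dates.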