In [9]:
from google.colab import userdata

# Access the API key stored as a secret in Colab
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')

# Use Google Search grounding

In [14]:
# Import the necessary libraries for using the Google Generative AI client.
from google import genai
from google.genai import types

# Initialize the client with the API key retrieved from Colab secrets.
client = genai.Client(api_key=GOOGLE_API_KEY)

# Define the model ID to be used for generating content.
MODEL_ID = "gemini-2.5-flash"

In [19]:
from IPython.display import HTML, Markdown

response = client.models.generate_content(
    model=MODEL_ID,
    contents='Trận đấu đội tuyển Anh với đội tuyển Latvia gần đây nhất có tỉ số là bao nhiêu?',
    config={"tools": [{"google_search": {}}]},
)

# print the response
display(Markdown(f"**Response**:\n {response.text}"))
# print the search details
print(f"Search Query: {response.candidates[0].grounding_metadata.web_search_queries}")
# urls used for grounding
print(f"Search Pages: {', '.join([site.web.title for site in response.candidates[0].grounding_metadata.grounding_chunks])}")

display(HTML(response.candidates[0].grounding_metadata.search_entry_point.rendered_content))

**Response**:
 Trận đấu gần đây nhất giữa đội tuyển Anh và đội tuyển Latvia diễn ra vào ngày 14 tháng 10 năm 2025 (rạng sáng ngày 15 tháng 10 theo giờ Việt Nam) với tỉ số là Anh 5-0 Latvia. Trận đấu này thuộc khuôn khổ vòng loại World Cup 2026 khu vực châu Âu. Với chiến thắng này, đội tuyển Anh đã chính thức giành vé tham dự World Cup 2026.

Search Query: ['kết quả trận đấu đội tuyển Anh với đội tuyển Latvia gần đây nhất', 'England vs Latvia football match scores history']
Search Pages: bongda.com.vn, laodong.vn, englandfootball.com, theguardian.com, nld.com.vn, vietnamnet.vn, vietnamplus.vn


In [15]:
from IPython.display import Markdown

response = client.models.generate_content(
    model=MODEL_ID,
    contents='Trận đấu cuối cùng của Harry Kane cho Tottenham diễn ra vào ngày bao nhiêu?',
)

# print the response
display(Markdown(response.text))

Trận đấu cuối cùng của Harry Kane cho Tottenham là vào ngày **28 tháng 5 năm 2023**.

Đó là trận đấu cuối cùng của mùa giải Premier League 2022/2023, khi Tottenham đánh bại Leeds United với tỷ số 4-1. Kane đã ghi 2 bàn trong trận đấu đó. Anh chính thức chuyển đến Bayern Munich vào tháng 8 năm 2023.

# Grounding with YouTube links

In [16]:
yt_link = "https://www.youtube.com/watch?v=XV1kOFo1C8M"

response = client.models.generate_content(
    model=MODEL_ID,
    contents= types.Content(
        parts=[
            types.Part(text="Summarize this video."),
            types.Part(
                file_data=types.FileData(
                    file_uri=yt_link
                )
            )
        ]
    )
)

Markdown(response.text)

This video introduces "Gemma Chess," an application of Google's Gemma AI model designed to bring a new dimension to the game of chess. The speaker, Ju-yeong Ji from Google DeepMind, clarifies that Gemma is not meant to replace traditional chess engines like AlphaZero, which are highly specialized for calculating the best moves. Instead, Gemma leverages its natural language processing capabilities to enhance the human experience of chess.

Key applications of Gemma in chess include:

1.  **Explaining Chess Analysis:** Gemma can interpret complex technical outputs from traditional chess engines (like PGN data and numerical evaluations) and translate them into understandable plain text explanations. It can detail the strategic and tactical reasons behind specific moves, identify major themes, and highlight potential dangers, making game analysis more accessible to players.
2.  **Storytelling:** Gemma can narrate the progression of chess games, turning a sequence of moves into an engaging story by incorporating details about players, tournaments, and the unfolding drama of the match. This aims to bring games to life in a way that mere move notation cannot.
3.  **Supporting Chess Learning:** Gemma can act as a personalized study partner, explaining chess concepts (such as openings like the Sicilian Defense or positional ideas like "passed pawns") in natural language, and even in different languages like Korean. It can provide tailored feedback based on a player's skill level, clarifying complex ideas and suggesting areas for improvement, akin to having a 24/7 personal chess coach.

In summary, Gemma enhances the chess experience by focusing on understanding, explaining, and narrating the game in a human-like manner, making learning easier, analysis more intuitive, and games more engaging for players of all levels.

In [17]:
yt_link = "https://www.youtube.com/watch?v=XV1kOFo1C8M"

response = client.models.generate_content(
    model=MODEL_ID,
    contents=types.Content(
        parts=[
            types.Part(text="In 2 paragraph, how Gemma models can help on chess games?"),
            types.Part(
                file_data=types.FileData(file_uri=yt_link)
            )
        ]
    )
)

Markdown(response.text)

Gemma models, as large language models, offer a distinct approach to chess compared to traditional engines like Stockfish or AlphaZero. While these engines excel at brute-force calculation and identifying optimal moves through deep search, Gemma's strength lies in its ability to understand, generate, and process human language. This capability allows Gemma to act as a sophisticated explainer and analyst for chess games. It can interpret complex game notations (like PGN) and provide human-understandable explanations for strategic decisions, tactical patterns, and positional nuances. Rather than just presenting a move and its numerical evaluation, Gemma can articulate *why* a move is interesting, the underlying plans, and potential dangers, effectively bridging the gap between raw computational data and human intuition. It can also be integrated with existing chess engines via function calls to query for optimal moves and then interpret those results.

Furthermore, Gemma can significantly enhance chess learning and engagement. It can serve as a personalized chess tutor, explaining openings, tactical motifs, or strategic concepts (like a passed pawn) in plain language, tailored to a user's skill level and even in different languages. Beyond analysis, Gemma can transform game data into engaging narratives, recounting the drama and key moments of famous historical matches or even a user's own games, thereby bringing the game to life in a more relatable way. This narrative capability, combined with its multilingual understanding, makes chess more accessible and enjoyable for a broader audience, fostering deeper understanding and appreciation of the game's intricacies rather than merely focusing on competitive play.

# Grounding information using URL context

In [18]:
prompt = """
  Based on https://ai.google.dev/gemini-api/docs/models, what are the key
  differences between Gemini 1.5, Gemini 2.0 and Gemini 2.5 models?
  Create a markdown table comparing the differences.
"""

config = {
    "tools": [{"url_context": {}}],
}

response = client.models.generate_content(
    contents=[prompt],
    model=MODEL_ID,
    config=config
)

display(Markdown(response.text))

The provided document details Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, Gemini 2.0 Flash, and Gemini 2.0 Flash-Lite. There is no mention of a standalone "Gemini 1.5" or "Gemini 2.0" model, but rather different versions of 2.0 and 2.5. I will compare the "Pro" and "Flash" versions within 2.0 and 2.5, as these appear to be the primary distinctions within those generations, and the "Flash-Lite" models are essentially optimized versions of the "Flash" models.

Here's a breakdown of the key differences based on the provided documentation:

| Feature           | Gemini 2.5 Pro                                                                                                                                                                                                                                                                                                 | Gemini 2.5 Flash                                                                                                                                                                                                                                                                                                                                                                           | Gemini 2.0 Flash                                                                                                                                                                                                                                                                                                                                                                    |
| :---------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Description**   | State-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context.                                                                                                                         | Our best model in terms of price-performance, offering well-rounded capabilities. Best for large scale processing, low-latency, high volume tasks that require thinking, and agentic use cases.                                                                                                                                                                                     | Second-generation workhorse model with a 1 million token context window. Delivers next-gen features and improved capabilities, including superior speed, native tool use, and a 1M token context window.                                                                                                                                                                        |
| **Model Code**    | `gemini-2.5-pro`                                                                                                                                                                                                                                                                                                 | `gemini-2.5-flash`                                                                                                                                                                                                                                                                                                                                                                       | `gemini-2.0-flash`                                                                                                                                                                                                                                                                                                                                                                        |
| **Input Token Limit** | 1,048,576                                                                                                                                                                                                                                                                                                      | 1,048,576                                                                                                                                                                                                                                                                                                                                                                                | 1,048,576                                                                                                                                                                                                                                                                                                                                                                 |
| **Output Token Limit**| 65,536                                                                                                                                                                                                                                                                                                       | 65,536                                                                                                                                                                                                                                                                                                                                                                                 | 8,192                                                                                                                                                                                                                                                                                                                                                                        |
| **Supported Data Types (Inputs)** | Audio, images, video, text, and PDF                                                                                                                                                                                                                                                              | Text, images, video, audio                                                                                                                                                                                                                                                                                                                                                             | Audio, images, video, and text                                                                                                                                                                                                                                                                                                                                                        |
| **Supported Data Types (Outputs)** | Text                                                                                                                                                                                                                                                                                             | Text                                                                                                                                                                                                                                                                                                                                                                                     | Text                                                                                                                                                                                                                                                                                                                                                                                    |
| **Thinking Capability** | Supported                                                                                                                                                                                                                                                                                                      | Supported                                                                                                                                                                                                                                                                                                                                                                                | Experimental                                                                                                                                                                                                                                                                                                                                                                        |
| **URL Context**   | Supported                                                                                                                                                                                                                                                                                                      | Supported                                                                                                                                                                                                                                                                                                                                                                                | Not supported                                                                                                                                                                                                                                                                                                                                                                     |
| **Latest Update** | June 2025                                                                                                                                                                                                                                                                                                        | June 2025                                                                                                                                                                                                                                                                                                                                                                                | February 2025                                                                                                                                                                                                                                                                                                                                                                       |
| **Knowledge Cutoff** | January 2025                                                                                                                                                                                                                                                                                                     | January 2025                                                                                                                                                                                                                                                                                                                                                                             | August 2024                                                                                                                                                                                                                                                                                                                                                                         |
| **Live API** | Not supported | Not supported (for base Gemini 2.5 Flash, but specific `gemini-2.5-flash-native-audio-preview` supports it)| Supported |

In [22]:
# get URLs retrieved for context
print(response.candidates[0].url_context_metadata)

url_metadata=[UrlMetadata(
  retrieved_url='https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf',
  url_retrieval_status=<UrlRetrievalStatus.URL_RETRIEVAL_STATUS_SUCCESS: 'URL_RETRIEVAL_STATUS_SUCCESS'>
)]


# Add PDFs by URL

In [25]:
prompt = """
  Can you give me an overview of the content of this pdf?
  https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf

"""

config = {
    "tools": [{"url_context": {}}],
}

response = client.models.generate_content(
    contents=[prompt],
    model=MODEL_ID,
    config=config
)

display(Markdown(response.text.replace('$','$')))



The provided PDF document is the "2025 Q2 Alphabet Earnings Release". However, the `browse` tool was not able to extract the actual content of the earnings release. Instead, it returned the site map for Alphabet Investor Relations. This suggests that the PDF content is not directly parsable by the tool in a way that would provide the financial details. Therefore, I cannot provide an overview of the earnings release itself.

The document at the provided URL is a 2025 Q2 Alphabet Earnings Release, but the browsing tool could only extract the site map for Alphabet Investor Relations, not the actual financial content. I am unable to provide an overview of the PDF's content based on the information extracted.


# Add images by URL

In [26]:
prompt = """
  Can you help me name of the numbered parts of that instrument, in French?
  https://upload.wikimedia.org/wikipedia/commons/thumb/4/40/Trombone.svg/960px-Trombone.svg.png

"""

config = {
    "tools": [{"url_context": {}}],
}

response = client.models.generate_content(
    contents=[prompt],
    model=MODEL_ID,
    config=config
)

display(Markdown(response.text))

Here are the names of the numbered parts of the trombone in French, based on the provided image:

1.  **Clé d'eau** (Water Key / Spit Valve)
2.  **Embouchure** (Mouthpiece)
3.  **Pavillon** (Bell)
4.  **Coulisse d'accord** (Tuning Slide)
5.  **Coulisse extérieure** (Outer Slide)
6.  **Entretoise de pavillon** (Bell Brace / Stay)
7.  **Coulisse intérieure** (Inner Slide)
8.  **Verrou de coulisse** (Slide Lock / Collar)

# Mix Search grounding and URL context

In [28]:
prompt = """
  Can you give me an overview of the content of this pdf?
  https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf
  Search on the web for the reaction of the main financial analysts, what's the trend?
"""

config = {
  "tools": [
      {"url_context": {}},
      {"google_search": {}}
  ],
}

response = client.models.generate_content(
  contents=[prompt],
  model=MODEL_ID,
  config=config
)

display(Markdown(response.text.replace('$','$')))
display(HTML(response.candidates[0].grounding_metadata.search_entry_point.rendered_content))

I am unable to provide an overview of the content of the PDF directly as the browsing tool could not extract its content. It appears to have returned the surrounding webpage information rather than the PDF's text itself.

However, I can proceed with searching for financial analysts' reactions to Alphabet's 2025 Q2 earnings release.The browsing tool was unable to provide an overview of the PDF content directly.

Regarding financial analysts' reactions to Alphabet's 2025 Q2 earnings release, the trend is largely bullish, with most analysts issuing "buy" or equivalent ratings. Alphabet reported strong financial performance, exceeding analyst expectations for revenue and earnings per share (EPS).

Key takeaways from the analysts' reactions include:
*   **Strong Financial Performance**: Alphabet's Q2 2025 revenue was $96.4 billion, a 14% year-over-year increase, beating the expected $94.0 billion. Net income rose 19% to $28.2 billion, with EPS at $2.33, surpassing estimates.
*   **AI Investments and Returns**: Alphabet is making massive investments in AI, increasing its capital expenditures forecast to $85 billion for 2025. These investments are beginning to show tangible returns, with AI Overviews now having over 2 billion monthly users. Analysts believe Google's Cloud division benefits from strong enterprise demand and AI momentum.
*   **Google Cloud Growth**: Google Cloud revenue was a significant highlight, reaching $13.6 billion (up 32% year-over-year), exceeding expectations.
*   **Advertising Revenue Focus**: Advertising revenue remains a focus, with Google's Search and Other division revenue projected to have risen about 9% to $52.93 billion. However, the rise of generative AI poses a potential threat to traditional search.
*   **Market Reactions**: Initially, the stock saw a slight dip due to concerns over soaring capital expenditures, but quickly rebounded as analysts and institutional investors processed the long-term vision.
*   **Antitrust Concerns**: Wall Street is also monitoring an antitrust case ruling expected in the near future that could potentially impact Google's Chrome browser.