##### Copyright 2025 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Search re-ranking using Gemini embeddings

<a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Search_reranking_using_embeddings.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=30/></a>

<!-- Princing warning Badge -->
<table>
  <tr>
    <!-- Emoji -->
    <td bgcolor="#f5949e">
      <font size=30>⚠️</font>
    </td>
    <!-- Text Content Cell -->
    <td bgcolor="#f5949e">
      <h3><font color=black>This notebook requires paid tier rate limits to run properly.<br>  
(cf. <a href="https://ai.google.dev/pricing#veo2"><font color='#217bfe'>pricing</font></a> for more details).</font></h3>
    </td>
  </tr>
</table>

This notebook demonstrates the use of embeddings to re-rank search results. This walkthrough will focus on the following objectives:



1.   Setting up your development environment and API access to use Gemini.
2.   Using Gemini's function calling support to access the Wikipedia API.
3.   Embedding content via Gemini API.
4.   Re-ranking the search results.


This is how you will implement search re-ranking:


1.   The user will make a search query.
2.   You will use Wikipedia API to return the relevant search results.
3.   The search results will be embedded and their relevance will be evaluated by calculating distance metrics like cosine similarity.
4.   The most relevant search result will be returned as the final answer.

> The non-source code materials in this notebook are licensed under Creative Commons - Attribution-ShareAlike CC-BY-SA 4.0, https://creativecommons.org/licenses/by-sa/4.0/legalcode.

## Setup

First, download and install the Gemini API Python library.

In [1]:
!pip install -U -q google-genai

Also install the `wikipedia` package that will be used during this tutorial.

In [2]:
!pip install -U -q wikipedia

Note: The [`wikipedia` package](https://pypi.org/project/wikipedia/) notes that it was "designed for ease of use and simplicity, not for advanced use", and that production or heavy use should instead "use [Pywikipediabot](http://www.mediawiki.org/wiki/Manual:Pywikipediabot) or one of the other more advanced [Python MediaWiki API wrappers](http://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot#Python)".

In [11]:
import json
import textwrap

from google import genai
from google.genai import types

import wikipedia
from wikipedia.exceptions import DisambiguationError, PageError

import numpy as np

from IPython.display import Markdown

def to_markdown(text):
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

### Grab an API Key

Before you can use the Gemini API, you must first obtain an API key. If you don't already have one, create a key with one click in Google AI Studio.

<a class="button button-primary" href="https://aistudio.google.com/app/apikey" target="_blank" rel="noopener noreferrer">Get an API key</a>

In Colab, add the key to the secrets manager under the "🔑" in the left panel. Give it the name `GEMINI_API_KEY`.

Once you have the API key, pass it to the SDK. You can do this in two ways:

* Put the key in the `GEMINI_API_KEY` environment variable (the SDK will automatically pick it up from there).
* Pass the key to `genai.Client(api_key=...)`

In [None]:
from google.colab import userdata

# Or use `os.getenv('GEMINI_API_KEY')` to fetch an environment variable.
GEMINI_API_KEY=userdata.get('GEMINI_API_KEY')

client = genai.Client(api_key=GEMINI_API_KEY)

### Select the model to be used

In [7]:
MODEL_ID = "gemini-2.5-flash" # @param ["gemini-2.5-flash-lite-preview-06-17", "gemini-2.5-flash", "gemini-2.5-pro"] {"allow-input":true, isTemplate: true}

## Define tools

As stated earlier, this tutorial uses Gemini's function calling support to access the Wikipedia API. Please refer to the [docs](https://ai.google.dev/docs/function_calling) to learn more about function calling.

### Define the search function

To cater to the search engine needs, you will design this function in the following way:


*   For each search query, the search engine will use the `wikipedia.search` method to get relevant topics.
*   From the relevant topics, the engine will choose `n_topics(int)` top candidates and will use `gemini-2.5-flash` to extract relevant information from the page.
*   The engine will avoid duplicate entries by maintaining a search history.


In [8]:
def wikipedia_search(search_queries: list[str]) -> list[str]:
  """Search wikipedia for each query and summarize relevant docs."""
  n_topics=3
  search_history = set() # tracking search history
  search_urls = []
  summary_results = []

  for query in search_queries:
    print(f'Searching for "{query}"')
    search_terms = wikipedia.search(query)

    print(f"Related search terms: {search_terms[:n_topics]}")
    for search_term in search_terms[:n_topics]: # select first `n_topics` candidates
      if search_term in search_history: # check if the topic is already covered
        continue

      print(f'Fetching page: "{search_term}"')
      search_history.add(search_term) # add to search history

      try:
        # extract the relevant data by using `gemini-2.0-flash` model
        page = wikipedia.page(search_term, auto_suggest=False)
        url = page.url
        print(f"Information Source: {url}")
        search_urls.append(url)
        page = page.content
        response = client.models.generate_content(
            model=MODEL_ID,
            contents=textwrap.dedent(f"""\
                Extract relevant information
                about user's query: {query}
                From this source:

                {page}

                Note: Do not summarize. Only Extract and return the relevant information
        """))

        urls = [url]
        if response.candidates[0].citation_metadata:
          extra_citations = response.candidates[0].citation_metadata.citation_sources
          extra_urls = [source.url for source in extra_citations]
          urls.extend(extra_urls)
          search_urls.extend(extra_urls)
          print("Additional citations:", response.candidates[0].citation_metadata.citation_sources)
        try:
          text = response.text
        except ValueError:
          pass
        else:
          summary_results.append(text + "\n\nBased on:\n  " + ',\n  '.join(urls))

      except DisambiguationError:
        print(f"""Results when searching for "{search_term}" (originally for "{query}")
        were ambiguous, hence skipping""")
        continue

      except PageError:
        print(f'{search_term} did not match with any page id, hence skipping.')
        continue
        
      except:
        print(f'{search_term} did not match with any page id, hence skipping.')
        continue

  print(f"Information Sources:")
  for url in search_urls:
    print('    ', url)

  return summary_results


In [9]:
example = wikipedia_search(["What are LLMs?"])

Searching for "What are LLMs?"
Related search terms: ['Large language model', 'Retrieval-augmented generation', 'Gemini (chatbot)']
Fetching page: "Large language model"
Information Source: https://en.wikipedia.org/wiki/Large_language_model
Fetching page: "Retrieval-augmented generation"
Information Source: https://en.wikipedia.org/wiki/Retrieval-augmented_generation
Fetching page: "Gemini (chatbot)"
Information Source: https://en.wikipedia.org/wiki/Gemini_(chatbot)
Information Sources:
     https://en.wikipedia.org/wiki/Large_language_model
     https://en.wikipedia.org/wiki/Retrieval-augmented_generation
     https://en.wikipedia.org/wiki/Gemini_(chatbot)


Here is what the search results look like:

In [10]:
from IPython.display import display

for e in example:
  display(to_markdown(e))

> A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation.
> The largest and most capable LLMs are generative pretrained transformers (GPTs).
> LLMs can be fine-tuned for specific tasks or guided by prompt engineering.
> These models acquire predictive power regarding syntax, semantics, and ontologies inherent in human language corpora, but they also inherit inaccuracies and biases present in the data they are trained in.
> An LLM is a type of foundation model (large X model) trained on language.
> LLMs are generally based on the transformer architecture.
> Typically, LLMs are trained with single- or half-precision floating point numbers (float32 and float16).
> The qualifier "large" in "large language model" is inherently vague, as there is no definitive threshold for the number of parameters required to qualify as "large".
> 
> Based on:
>   https://en.wikipedia.org/wiki/Large_language_model

> Large language models (LLMs) are a type of model that rely on static training data. They have pre-existing training data and an internal representation of this data. LLMs can generate responses, output, or synthesize answers to user queries. However, LLMs can provide incorrect information, generate misinformation, or hallucinate. They may also struggle to recognize when they lack sufficient information to provide a reliable response, or misinterpret the context of information they retrieve.
> 
> Based on:
>   https://en.wikipedia.org/wiki/Retrieval-augmented_generation

> *   **Definition/Nature:**
>     *   Generative artificial intelligence chatbots (like Gemini and ChatGPT) are "based on" large language models (LLMs).
>     *   Gemini is described as a "multimodal and more powerful LLM touted as the company's 'largest and most capable AI model'."
> *   **Examples of LLMs mentioned:**
>     *   GPT-3 family
>     *   LaMDA (a prototype LLM)
>     *   PaLM (a newer and more powerful LLM from Google)
>     *   Gemini
> 
> Based on:
>   https://en.wikipedia.org/wiki/Gemini_(chatbot)

### Pass the tools to the model

If you pass a list of functions to the `GenerativeModel`'s `tools` argument,
it will extract a schema from the function's signature and type hints, and then pass schema along to the API calls. In response the model may return a `FunctionCall` object asking to call the function.

Note: This approach only handles annotations of `AllowedTypes = int | float | str | dict | list['AllowedTypes']`

The request to the Gemini model will keep a reference to the function inself, so that it _can_ execute the function locally later.

## Generate supporting search queries

In order to have multiple supporting search queries to the user's original query, you will ask the model to generate more such queries. This would help the engine to cover the asked question on comprehensive levels.

In [16]:
instructions = """You have access to the Wikipedia API which you will be using
to answer a user's query. Your job is to generate a list of search queries which
might answer a user's question. Be creative by using various key-phrases from
the user's query. To generate variety of queries, ask questions which are
related to  the user's query that might help to find the answer. The more
queries you generate the better are the odds of you finding the correct answer.
Here is an example:

user: Tell me about Cricket World cup 2023 winners.

function_call: wikipedia_search(['What is the name of the team that
won the Cricket World Cup 2023?', 'Who was the captain of the Cricket World Cup
2023 winning team?', 'Which country hosted the Cricket World Cup 2023?', 'What
was the venue of the Cricket World Cup 2023 final match?', 'Cricket World cup 2023',
'Who lifted the Cricket World Cup 2023 trophy?'])

The search function will return a list of article summaries, use these to
answer the  user's question.

Here is the user's query: {query}
"""

In order to yield creative and a more random variety of questions, you will set the model's temperature parameter to a value higher. Values can range from [0.0,1.0], inclusive. A value closer to 1.0 will produce responses that are more varied and creative, while a value closer to 0.0 will typically result in more straightforward responses from the model.

## Enable automatic function calling and call the API

Now start a new chat with `enable_automatic_function_calling=True`. With it enabled, the `genai.ChatSession` will handle the back and forth required to call the function, and return the final response:

In [15]:
tools = [wikipedia_search]

config = types.GenerateContentConfig(
    temperature=0.6,
    tools=tools
)

In [18]:
chat = client.chats.create(
    model="gemini-2.5-flash",
    config=config
)

query = "Explain how deep-sea life survives."

res = chat.send_message(instructions.format(query=query))

Searching for "Deep-sea life survival strategies"
Related search terms: ['Sea of Thieves', 'Thalassophobia', 'Whalefall (novel)']
Fetching page: "Sea of Thieves"
Information Source: https://en.wikipedia.org/wiki/Sea_of_Thieves
Fetching page: "Thalassophobia"
Information Source: https://en.wikipedia.org/wiki/Thalassophobia
Fetching page: "Whalefall (novel)"
Information Source: https://en.wikipedia.org/wiki/Whalefall_(novel)
Searching for "Adaptations of deep-sea organisms"
Related search terms: ['Deep-sea fish', 'Deep sea', 'Deep-sea gigantism']
Fetching page: "Deep-sea fish"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish
Fetching page: "Deep sea"
Information Source: https://en.wikipedia.org/wiki/Deep_sea
Fetching page: "Deep-sea gigantism"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_gigantism
Searching for "How do deep-sea creatures survive extreme pressure?"
Related search terms: ['Deep-sea community', 'Deep sea', 'Deep-sea fish']
Fetching page: "Deep-s

In [19]:
to_markdown(res.text)



> Deep-sea life has evolved remarkable adaptations to survive the harsh conditions of its environment, characterized by immense pressure, perpetual darkness, extremely cold temperatures, and scarce food resources.
> 
> Here's how deep-sea organisms manage to thrive:
> 
> 1.  **Pressure Adaptations:**
>     *   **Internal Pressure Equalization:** Deep-sea creatures maintain an internal body pressure that is equal to the external hydrostatic pressure, preventing them from being crushed.
>     *   **Flexible Bodies:** Many species have gelatinous, watery flesh with minimal bone structure and reduced tissue density. This allows their bodies to compress without damage.
>     *   **Absence of Gas-Filled Spaces:** Most deep-sea fish lack swim bladders (gas-filled organs used for buoyancy in shallower waters) as these would collapse under pressure. Instead, some use lipid-rich tissues or have hydrofoil-like fins for lift.
>     *   **Molecular Adaptations:** At a cellular level, their proteins and enzymes are specially adapted to function under high pressure, often being more rigid or having modified structures (e.g., increased salt bridges in actin, higher proportion of unsaturated fatty acids in cell membranes to maintain fluidity). Some use osmolytes like Trimethylamine N-oxide (TMAO) to protect proteins.
> 
> 2.  **Food Acquisition:**
>     *   **Marine Snow:** In the vast majority of the deep sea, the primary food source is "marine snow"—a continuous shower of organic detritus (dead organisms, fecal pellets, etc.) falling from the productive upper layers of the ocean. Organisms filter this snow or scavenge larger food falls.
>     *   **Chemosynthesis:** Around hydrothermal vents and cold seeps, life thrives independently of sunlight through chemosynthesis. Specialized bacteria and archaea convert chemical compounds (like hydrogen sulfide and methane) from the Earth's interior into organic matter. These microorganisms form the base of the food web, supporting dense communities of unique organisms, often through symbiotic relationships (e.g., tube worms hosting chemosynthetic bacteria).
>     *   **Efficient Feeding:** Due to food scarcity, many deep-sea fish have slow metabolisms and unspecialized diets, preferring to "sit and wait" for prey. They often possess large, hinged, and extensible jaws with sharp, recurved teeth to engulf prey of their own size or larger.
> 
> 3.  **Light and Sensory Adaptations:**
>     *   **Bioluminescence:** In the absence of sunlight, many deep-sea creatures produce their own light through bioluminescence. This is used for various purposes: attracting prey (like the anglerfish's glowing lure), finding mates, deterring predators (e.g., by startling or counter-illuminating their undersides to blend with faint overhead light), and communication.
>     *   **Enhanced Vision:** While some deep-sea fish are blind, others have exceptionally large, tubular eyes with highly sensitive rod cells that are adapted to detect the faintest flickers of bioluminescence or silhouettes against the dim light from above.
>     *   **Other Senses:** Given the limited utility of sight, deep-sea organisms heavily rely on other senses. They possess highly developed lateral line systems to detect changes in water pressure and vibrations, an acute sense of smell (olfactory system) to locate food or mates, and sensitive inner ears. Many also have long feelers or tentacles to navigate and find prey in the darkness.
> 
> 4.  **Metabolic and Physical Adaptations:**
>     *   **Slow Metabolism:** To conserve energy in a food-scarce environment, deep-sea organisms generally have very slow metabolisms and often grow slowly and live long lives.
>     *   **Body Shape and Movement:** Their bodies are often elongated with weak, watery muscles and minimal skeletal structures, which allows them to remain suspended in water with little energy expenditure. Their body shapes are generally better suited for periodic bursts of swimming rather than continuous movement.
>     *   **Deep-Sea Gigantism:** Some deep-sea species exhibit gigantism, growing much larger than their shallow-water relatives. This is thought to be an adaptation to colder temperatures, food scarcity (larger size improves foraging ability and metabolic efficiency), and reduced predation pressure.
> 
> 5.  **Reproduction:**
>     *   Finding a mate in the vast, dark deep sea can be challenging. Adaptations include hermaphroditism (being both male and female) or unique reproductive strategies, such as the parasitic male anglerfish, which permanently attaches to the female, ensuring a mate is always available.

That looks like it worked. You can go through the chat history to see the details of what was sent and received in the function calls:

In [28]:
for content in chat._comprehensive_history:
  part = content.parts[0]

  print(f'{content.role} -> ', end='')
  print(json.dumps(part.to_json_dict(), indent=2))
  print('---' * 20)


user -> {
  "text": "You have access to the Wikipedia API which you will be using\nto answer a user's query. Your job is to generate a list of search queries which\nmight answer a user's question. Be creative by using various key-phrases from\nthe user's query. To generate variety of queries, ask questions which are\nrelated to  the user's query that might help to find the answer. The more\nqueries you generate the better are the odds of you finding the correct answer.\nHere is an example:\n\nuser: Tell me about Cricket World cup 2023 winners.\n\nfunction_call: wikipedia_search(['What is the name of the team that\nwon the Cricket World Cup 2023?', 'Who was the captain of the Cricket World Cup\n2023 winning team?', 'Which country hosted the Cricket World Cup 2023?', 'What\nwas the venue of the Cricket World Cup 2023 final match?', 'Cricket World cup 2023',\n'Who lifted the Cricket World Cup 2023 trophy?'])\n\nThe search function will return a list of article summaries, use these to\nans

In the chat history you can see all 4 steps:

1. The user sent the query.
2. The model replied with a `genai.protos.FunctionCall` calling the `wikipedia_search` with a number of relevant searches.
3. Because you set `enable_automatic_function_calling=True` when creating the `genai.ChatSession`, it  executed the search function and returned the list of article summaries to the model.
4. Folliwing the instructions in the prompt, the model generated a final answer based on those summaries.


## [Optional] Manually execute the function call

If you want to understand what happened behind the scenes, this section executes the `FunctionCall` manually to demonstrate.

In [29]:
chat = client.chats.create(
    model=MODEL_ID,
    config=config
)

In [30]:
result = chat.send_message(instructions.format(query=query))

Searching for "How do deep-sea creatures adapt to high pressure?"
Related search terms: ['Deep sea', 'Deep-sea fish', 'Oceanic zone']
Fetching page: "Deep sea"
Information Source: https://en.wikipedia.org/wiki/Deep_sea
Fetching page: "Deep-sea fish"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish
Fetching page: "Oceanic zone"
Information Source: https://en.wikipedia.org/wiki/Oceanic_zone
Searching for "What are the food sources for deep-sea life?"
Related search terms: ['Deep-sea fish', 'Deep-sea wood', 'Deep sea']
Fetching page: "Deep-sea wood"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_wood
Searching for "How do deep-sea organisms survive in the absence of sunlight?"
Related search terms: ['Deep-sea community', 'Hydrothermal vent', 'Abyssal zone']
Fetching page: "Deep-sea community"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_community
Fetching page: "Hydrothermal vent"
Information Source: https://en.wikipedia.org/wiki/Hydrothermal_vent


Initially the model returns a FunctionCall:

In [59]:
fc = result.automatic_function_calling_history[1].parts[0].function_call
fc = fc.to_json_dict()
print(json.dumps(fc, indent=2))


{
  "args": {
    "search_queries": [
      "How do deep-sea creatures adapt to high pressure?",
      "What are the food sources for deep-sea life?",
      "How do deep-sea organisms survive in the absence of sunlight?",
      "What are the unique adaptations of deep-sea animals?",
      "Deep-sea hydrothermal vents and life",
      "Deep-sea chemosynthesis",
      "Bioluminescence in deep-sea creatures",
      "Deep-sea extremophiles survival mechanisms",
      "How do deep-sea fish regulate buoyancy?",
      "What is marine snow and its role in deep-sea ecosystems?"
    ]
  },
  "name": "wikipedia_search"
}


In [62]:
fc['name']

'wikipedia_search'

Call the function with generated arguments to get the results.

In [63]:
summaries = wikipedia_search(**fc['args'])

Searching for "How do deep-sea creatures adapt to high pressure?"
Related search terms: ['Deep sea', 'Deep-sea fish', 'Oceanic zone']
Fetching page: "Deep sea"
Information Source: https://en.wikipedia.org/wiki/Deep_sea
Fetching page: "Deep-sea fish"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish
Fetching page: "Oceanic zone"
Information Source: https://en.wikipedia.org/wiki/Oceanic_zone
Searching for "What are the food sources for deep-sea life?"
Related search terms: ['Deep-sea fish', 'Deep-sea wood', 'Deep sea']
Fetching page: "Deep-sea wood"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_wood
Searching for "How do deep-sea organisms survive in the absence of sunlight?"
Related search terms: ['Deep-sea community', 'Hydrothermal vent', 'Abyssal zone']
Fetching page: "Deep-sea community"
Information Source: https://en.wikipedia.org/wiki/Deep-sea_community
Fetching page: "Hydrothermal vent"
Information Source: https://en.wikipedia.org/wiki/Hydrothermal_vent


Now send the `FunctionResult` to the model.

In [65]:
response = chat.send_message(
    [genai.types.Part.from_function_response(
        name='wikipedia_search',
        response={'result': summaries}
    )]
)

to_markdown(response.text)



> Deep-sea life has evolved a remarkable array of adaptations to survive in an environment characterized by extreme pressure, perpetual darkness, scarce food, and often unique chemical conditions.
> 
> Here's how they manage to thrive:
> 
> 1.  **Adapting to Immense Pressure:**
>     *   **Internal Pressure Equalization:** Deep-sea organisms maintain an internal pressure that matches the crushing external hydrostatic pressure, preventing their bodies from being compressed.
>     *   **Protein Stability:** Their proteins have evolved unique structural changes, such as increased salt bridges and more rigid globular structures, to maintain functionality under high pressure. Specific osmolytes like Trimethylamine N-oxide (TMAO) are also abundant in their cells to protect proteins from destabilization.
>     *   **Membrane Fluidity:** To counteract the reduced fluidity of cell membranes under pressure, they increase the proportion of unsaturated fatty acids in their membrane lipids, ensuring essential biological processes can occur efficiently.
>     *   **Skeletal and Tissue Modifications:** Many deep-sea fish have reduced tissue density, minimal bone structure, and gelatinous, water-filled bodies. This helps them with buoyancy and withstands the constant pressure. Unlike shallow-water fish, most deep-sea species do not have gas-filled swim bladders, as these would be crushed by the pressure. Some may develop a gelatinous layer for buoyancy and to reduce drag.
> 
> 2.  **Finding Food and Energy in Perpetual Darkness:**
>     *   **Marine Snow and Large Food Falls:** With no sunlight for photosynthesis, the primary food source for many deep-sea creatures is "marine snow"—a continuous shower of organic particles (dead organisms, waste, detritus) sinking from the productive upper ocean layers. Larger, less frequent food sources include "whale falls," where the carcasses of dead whales provide a massive, long-lasting feast for specialized communities.
>     *   **Chemosynthesis:** At hydrothermal vents and cold seeps, entire ecosystems flourish independently of sunlight. Here, specialized bacteria perform chemosynthesis, using chemical compounds (like hydrogen sulfide, methane, and other minerals) released from the Earth's interior as an energy source to produce organic matter. These chemosynthetic bacteria form the base of the food web, supporting a diverse array of life, including giant tube worms, clams, and mussels, many of which host these bacteria in symbiotic relationships within their bodies.
>     *   **Vertical Migration:** Some mid-water (mesopelagic) species undertake daily vertical migrations, ascending to the more food-rich surface waters at night to feed and returning to the darker, safer depths during the day.
> 
> 3.  **Coping with Darkness and Sensing the Environment:**
>     *   **Sensory Adaptations:** Many deep-sea organisms have either evolved extremely large, highly sensitive eyes to detect the faintest light (including bioluminescence) or have lost their eyes entirely, relying instead on enhanced senses of touch, smell, and pressure changes to navigate, find food, and avoid predators.
>     *   **Bioluminescence:** The ability to produce their own light (bioluminescence) is widespread. It's used for various purposes: attracting mates or prey, startling or distracting predators (e.g., creating a "smokescreen"), and camouflage through counter-illumination (matching the faint downwelling light to hide their silhouette).
>     *   **Coloration:** Due to the absence of light, vibrant colors are not visible. Many deep-sea creatures are transparent, black, or red. Red appears black in the deep ocean because red light wavelengths are absorbed quickly by water and do not penetrate to these depths, providing effective camouflage.
> 
> 4.  **Metabolic and Behavioral Strategies:**
>     *   **Slow Metabolism:** To conserve energy in an environment with limited food resources, many deep-sea organisms have significantly slower metabolic rates and require less oxygen.
>     *   **Energy Conservation:** Movement is often slow and deliberate. Reproduction rates are also typically slow, which reduces competition for scarce resources.
>     *   **Efficient Feeding:** Many species have large, flexible mouths and expandable stomachs, allowing them to consume large, infrequent meals when prey is encountered, maximizing the intake from rare opportunities.
> 
> 5.  **Extremophile Specializations:**
>     *   Microorganisms thriving in the most extreme deep-sea conditions (extremophiles) exhibit highly specialized survival mechanisms. Their amino acid compositions allow their proteins to function under extreme temperatures and pressures. They have unique metabolic pathways, such as sulfur oxidation, hydrocarbon bioremediation, or even "eating rocks" (like pyrite), enabling them to derive energy from unusual chemical sources in their harsh habitats.

## Re-ranking the search results

Helper function to embed the content:

In [69]:
EMBEDDINGS_MODEL_ID = "gemini-embedding-001"  # @param ["gemini-embedding-001", "text-embedding-004"] {"allow-input": true, "isTemplate": true}

def get_embeddings(content: list[str]) -> np.ndarray:
  embeddings = client.models.embed_content(
    model=EMBEDDINGS_MODEL_ID,
    contents=content,
    config=types.EmbedContentConfig(
      task_type='SEMANTIC_SIMILARITY'
    )
  )
  return embeddings.embeddings[0].values

Please refer to the [embeddings guide](https://ai.google.dev/gemini-api/docs/embeddings) for more information on embeddings.

Your next step is to define functions that you can use to calculate similarity scores between two embedding vectors. These scores will help you decide which embedding vector is the most relevant vector to the user's query.


You will now implement cosine similarity as your metric. Here returned embedding vectors will be of unit length and hence their L1 norm (`np.linalg.norm()`) will be ~1. Hence, calculating cosine similarity is esentially same as calculating their dot product score.

In [76]:
def dot_product(a: np.ndarray, b: np.ndarray):
  return (np.array(a) @ np.array(b).T)

### Similarity with user's query

Now it's time to find the most relevant search result returned by the Wikipedia API.

Use Gemini API to get embeddings for user's query and search results.

In [70]:
search_res = get_embeddings(summaries)
embedded_query = get_embeddings([query])

Calculate similarity score:

In [77]:
sim_value = dot_product(search_res, embedded_query)

using `np.argmax` best candidate is selected.

**Users's Input:** Explain how deep-sea life survives.

**Answer:**

In [79]:
Markdown(summaries[np.argmax(sim_value)])

Here's the extracted information about how deep-sea creatures adapt to high pressure:

*   Deep-sea fish have different adaptations in their proteins, anatomical structures, and metabolic systems to survive in the Deep sea, where the inhabitants have to withstand great amount of hydrostatic pressure.
*   Deep-sea organisms must have the ability to maintain a well-regulated metabolic system in the face of high pressures.
*   Deep sea species must undergo physiological and structural adaptations to preserve protein functionality against pressure, as hydrostatic pressure affects both protein folding and assembly and enzymatic activity.
*   Some Deep-sea fish developed pressure tolerance through the change in mechanism of their α-actin, a protein essential for different cellular functions and a main component for muscle fiber.
*   In some species that live in depths greater than 5 km (3.1 mi), C.armatus and C.yaquinae have specific substitutions (e.g., Q137K and V54A from C.armatus or I67P from C.yaquinae) on the active sites of α-Actin, which are predicted to have importance in pressure tolerance.
*   These specific substitutions result in significant changes in the salt bridge patterns of the protein, allowing for better stabilization in ATP binding and subunit arrangement. Deep sea fish have more salt bridges in their actins compared to fish inhabiting the upper zones of the sea.
*   Specific osmolytes were found to be abundant in deep sea fish under high hydrostatic pressure. For certain chondrichthyans, Trimethylamine N-oxide (TMAO) increased with depth, replacing other osmolytes and urea. TMAO is able to protect proteins from high hydrostatic pressure destabilizing them.
*   The Mariana hadal snailfish developed modification in the Osteocalcin(burlap) gene, where premature termination was found, resulting in an open skull and cartilage-based bone formation. This is an adaptation because closed skulls common in surface organisms cannot withstand the enforcing stress of high hydrostatic pressure, and common bone developments cannot maintain structural integrity under constant high pressure.

Based on:
  https://en.wikipedia.org/wiki/Deep_sea

### Similarity with Hypothetical Document Embeddings (HyDE)

Drawing inspiration from [Gao et al](https://arxiv.org/abs/2212.10496) the objective here is to generate a template answer to the user's query using `gemini-2.5-flash`'s internal knowledge. This hypothetical answer will serve as a baseline to calculate relevance of all the search results.

In [81]:
res = client.models.generate_content(
    model=MODEL_ID,
    contents=f"""Generate a hypothetical answer
to the user's query by using your own knowledge. Assume that you know everything
about the said topic. Do not use factual information, instead use placeholders
to complete your answer. Your answer should feel like it has been written by a human.

query: {query}""")

to_markdown(res.text)

> Oh, deep-sea life, that's truly fascinating! It's like a whole other planet down there, and they've evolved some utterly incredible ways to just... exist.
> 
> First off, the **[immense environmental force]** is immense, right? It's not like anything we experience up here. So, deep-sea creatures have these really special **[type of internal structure or biological compound]** that prevents their cells from just collapsing. Their **[specific body parts or tissues]** are incredibly flexible yet strong, designed to withstand those crushing forces. It’s not just about being squishy; it's about their **[molecular processes]** being able to function perfectly under that extreme compression.
> 
> Then there's the **[lack of primary resource]**, which is probably the biggest challenge after pressure. No sunlight means no photosynthesis. So, they've come up with some truly ingenious ways to get energy. Many just rely on whatever **[type of organic matter]** floats down from the **[upper ocean layer]**, basically scavenging. Others, near **[geological features]**, have developed **[type of metabolic process]** where they get energy from **[specific chemical compounds A]** and **[specific chemical compounds B]** spewing out of the Earth – completely independent of the sun! It's wild.
> 
> And speaking of no light, they don't have typical eyes in many cases, or if they do, they're adapted to pick up the faintest **[type of light emission]**. Many rely on other senses entirely, like detecting **[subtle environmental cues]** or **[specific chemical signals]** in the water. Plus, a lot of them use **[biological light production]** for communication, attracting mates, or even luring prey. Imagine a world where your only light source is your own body or your neighbor's!
> 
> The **[temperature extreme]** is another hurdle. It's usually bone-chillingly cold down there. So, they often have **[biological compounds]** in their bodies that act like natural antifreeze, keeping their **[bodily fluids]** from freezing solid. Their **[metabolic rate]** is also often incredibly slow, which helps them conserve energy in such a resource-scarce and cold environment.
> 
> Finding a mate in the vast, dark emptiness? That's a puzzle! They've got strategies like releasing **[specific chemical attractants]** that drift for miles, or using unique patterns of **[biological light]** to signal their presence. Some are even **[reproductive strategy]**, so they don't have to worry about finding another gender.
> 
> It really goes to show how life finds a way, doesn't it? They've just adapted their entire existence around these extreme conditions.

Use Gemini API to get embeddings for the baseline answer and compare them with search results

In [82]:
hypothetical_ans = get_embeddings([res.text])

Calculate similarity scores to rank the search results

In [83]:
sim_value = dot_product(search_res, hypothetical_ans)

In [84]:
sim_value

0.8530243497361725

using `np.argmax` best candidate is selected.

**Users's Input:** Explain how deep-sea life survives.

**Answer:**

In [85]:
to_markdown(summaries[np.argmax(sim_value)])

> Here's the extracted information about how deep-sea creatures adapt to high pressure:
> 
> *   Deep-sea fish have different adaptations in their proteins, anatomical structures, and metabolic systems to survive in the Deep sea, where the inhabitants have to withstand great amount of hydrostatic pressure.
> *   Deep-sea organisms must have the ability to maintain a well-regulated metabolic system in the face of high pressures.
> *   Deep sea species must undergo physiological and structural adaptations to preserve protein functionality against pressure, as hydrostatic pressure affects both protein folding and assembly and enzymatic activity.
> *   Some Deep-sea fish developed pressure tolerance through the change in mechanism of their α-actin, a protein essential for different cellular functions and a main component for muscle fiber.
> *   In some species that live in depths greater than 5 km (3.1 mi), C.armatus and C.yaquinae have specific substitutions (e.g., Q137K and V54A from C.armatus or I67P from C.yaquinae) on the active sites of α-Actin, which are predicted to have importance in pressure tolerance.
> *   These specific substitutions result in significant changes in the salt bridge patterns of the protein, allowing for better stabilization in ATP binding and subunit arrangement. Deep sea fish have more salt bridges in their actins compared to fish inhabiting the upper zones of the sea.
> *   Specific osmolytes were found to be abundant in deep sea fish under high hydrostatic pressure. For certain chondrichthyans, Trimethylamine N-oxide (TMAO) increased with depth, replacing other osmolytes and urea. TMAO is able to protect proteins from high hydrostatic pressure destabilizing them.
> *   The Mariana hadal snailfish developed modification in the Osteocalcin(burlap) gene, where premature termination was found, resulting in an open skull and cartilage-based bone formation. This is an adaptation because closed skulls common in surface organisms cannot withstand the enforcing stress of high hydrostatic pressure, and common bone developments cannot maintain structural integrity under constant high pressure.
> 
> Based on:
>   https://en.wikipedia.org/wiki/Deep_sea

You have now created a search re-ranking engine using embeddings!

## Next steps

I hope you found this example helpful! Check out more examples in the [Gemini API cookbook](https://github.com/google-gemini/cookbook/) to learn more.