##### Copyright 2024 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Search Wikipedia using ReAct

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google-gemini/cookbook/blob/gemini-1.5-archive/examples/Search_Wikipedia_using_ReAct.ipynb"><img src="https://github.com/google-gemini/cookbook/blob/gemini-1.5-archive/images/colab_logo_32px.png?raw=1" />Run in Google Colab</a>
  </td>
</table>

This notebook is a minimal implementation of [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/abs/2210.03629) with the Google `gemini-1.5-flash` model. You'll use ReAct prompting to configure a model to search Wikipedia to find the answer to a user's question.


In this walkthrough, you will learn how to:


1.   Set up your development environment and API access to use Gemini.
2.   Use a ReAct few-shot prompt.
3.   Use the newly prompted model for multi-turn conversations (chat).
4.   Connect the model to the **Wikipedia API**.
5.  Have conversations with the model (try asking it questions like "how tall is the Eiffel Tower?") and watch it search Wikipedia.


> Note: The non-source code materials on this page are licensed under Creative Commons - Attribution-ShareAlike CC-BY-SA 4.0, https://creativecommons.org/licenses/by-sa/4.0/legalcode.

### Background

  


[ReAct](https://arxiv.org/abs/2210.03629) is a prompting method which allows language models to create a trace of their thinking processes and the steps required to answer a user's questions. This improves human interpretability and trustworthiness. ReAct prompted models generate Thought-Action-Observation triplets for every iteration, as you'll soon see. Let's get started!

## Setup


In [1]:
!pip install -q "google-generativeai>=0.7.2"

In [2]:
!pip install -q wikipedia

  Preparing metadata (setup.py) ... [?25l[?25hdone
  Building wheel for wikipedia (setup.py) ... [?25l[?25hdone


Note: The [`wikipedia` package](https://pypi.org/project/wikipedia/) notes that it was "designed for ease of use and simplicity, not for advanced use", and that production or heavy use should instead "use [Pywikipediabot](http://www.mediawiki.org/wiki/Manual:Pywikipediabot) or one of the other more advanced [Python MediaWiki API wrappers](http://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot#Python)".

In [3]:
import re
import os

import wikipedia
from wikipedia.exceptions import DisambiguationError, PageError

import google.generativeai as genai

To run the following cell, your API key must be stored it in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) quickstart for an example.

In [4]:
from google.colab import userdata
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)

## The ReAct prompt

The prompts used in the paper are available at [https://github.com/ysymyth/ReAct/tree/master/prompts](https://github.com/ysymyth/ReAct/tree/master/prompts)

Here, you will be working with the following ReAct prompt with a few minor adjustments.

> Note: The prompt and in-context examples used here are borrowed from [https://github.com/ysymyth/ReAct](https://github.com/ysymyth/ReAct) which is published under a [MIT license](https://opensource.org/licenses/MIT).

In [14]:
model_instructions = """Bir soru-cevap görevini, Düşünce (Thought), Eylem (Action) ve Gözlem (Observation) adımları arasında geçiş yaparak çöz.
- Düşünce, mevcut durumu akıl yürüterek analiz eder.
- Gözlem, bir eylemin çıktısından elde edilen bilgiyi anlamaktır.
- Eylem ise üç tipten biri olabilir:

(1) <search>başlık</search>: Verilen başlığı Vikipedi'de arar ve ilk paragrafı döner. Eğer bulunamazsa benzer başlık önerileri sunar.
(2) <lookup>anahtar kelime</lookup>: Mevcut bağlamda geçen, verilen anahtar kelimeyi içeren bir sonraki cümleyi döner. (Tam eşleşme yapar, kelimeyi kısa tut.)
(3) <finish>cevap</finish>: Cevabı döner ve görevi sonlandırır.
"""


### Few-shot prompting to enable in-context learning with Gemini


While large language models show good understanding of the instructions they are prompted with, they still may perform poorly on complex tasks in a zero-shot setting. Hence, you will now provide a few examples along with your prompt to steer the model's output according to your needs. This in-context learning improves the model's performance significantly.

In [23]:
examples = """
İşte bazı örnekler.

Soru
Pikachu’nun evrimleştiği Pokémon hangisidir?

Düşünce 1
Pikachu’nun evrim geçirdiği Pokémon’u bulmalıyım. Bu yüzden Pikachu’yu aramalıyım.

Eylem 1
<search>Pikachu</search>

Gözlem 1
Pikachu, Elektrik tipi bir Pokémon’dur. Pichu’dan evrimleşir ve bir Thunder Stone (Yıldırım Taşı) ile Raichu’ya evrimleşir.

Düşünce 2
Pikachu, Raichu’ya evrimleşiyormuş. Bu sorunun cevabı Raichu’dur.

Eylem 2
<finish>Raichu</finish>

Soru
Charizard hangi Pokémon’un evrimleşmiş hâlidir?

Düşünce 1
Charizard’ın evrim zincirini öğrenmeliyim. Charizard’ı arayarak hangi Pokémon’dan evrimleştiğini bulabilirim.

Eylem 1
<search>Charizard</search>

Gözlem 1
Charizard, Fire/Flying (Ateş/Uçan) tipi bir Pokémon’dur. Charmander’dan evrimleşen Charmeleon’un son evrimidir.

Düşünce 2
Charizard, Charmeleon’dan evrimleşmiş, o da Charmander’dan geliyor. Ancak doğrudan bir önceki evrim Charmeleon’dur.

Eylem 2
<finish>Charmeleon</finish>

Soru
Mewtwo hangi yöntemle oluşturulmuştur?

Düşünce 1
Mewtwo’nun nasıl yaratıldığını öğrenmem gerekiyor. Mewtwo’yu aramalıyım.

Eylem 1
<search>Mewtwo</search>

Gözlem 1
Mewtwo, Pokémon evreninde genetik mühendislik ile Mew’den klonlanarak oluşturulmuş bir efsanevi Pokémon’dur.

Düşünce 2
Mewtwo, Mew’den genetik mühendislik ile oluşturulmuştur. Cevap budur.

Eylem 2
<finish>Genetik mühendislik ile Mew’den klonlanarak</finish>
"""


Copy the instructions along with examples in a file called `model_instructions.txt`

In [24]:
ReAct_prompt = model_instructions + examples
with open('model_instructions.txt', 'w') as f:
  f.write(ReAct_prompt)

## The Gemini-ReAct pipeline

### Setup

You will now build an end-to-end pipeline to facilitate multi-turn chat with the ReAct-prompted Gemini model.

In [25]:
class ReAct:
  def __init__(self, model: str, ReAct_prompt: str | os.PathLike):
    """Prepares Gemini to follow a `Few-shot ReAct prompt` by imitating
    `function calling` technique to generate both reasoning traces and
    task-specific actions in an interleaved manner.

    Args:
        model: name to the model.
        ReAct_prompt: ReAct prompt OR path to the ReAct prompt.
    """
    self.model = genai.GenerativeModel(model)
    self.chat = self.model.start_chat(history=[])
    self.should_continue_prompting = True
    self._search_history: list[str] = []
    self._search_urls: list[str] = []

    try:
      # try to read the file
      with open(ReAct_prompt, 'r') as f:
        self._prompt = f.read()
    except FileNotFoundError:
      # assume that the parameter represents prompt itself rather than path to the prompt file.
      self._prompt = ReAct_prompt

  @property
  def prompt(self):
    return self._prompt

  @classmethod
  def add_method(cls, func):
    setattr(cls, func.__name__, func)

  @staticmethod
  def clean(text: str):
    """Helper function for responses."""
    text = text.replace("\n", " ")
    return text

### Define tools


As instructed by the prompt, the model will be generating **Thought-Action-Observation** traces, where every **Action** trace could be one of the following tokens:


1.   </search/> : Perform a Wikipedia search via external API.
2.   </lookup/> : Lookup for specific information on a page with the Wikipedia API.
3.   </finish/> : Stop the execution of the model and return the answer.

If the model encounters any of these tokens, the model should make use of the `tools` made available to the model. This understanding of the model to leverage acquired toolsets to collect information from the external world is often referred to as **function calling**. Therefore, the next goal is to imitate this function calling technique in order to allow ReAct prompted Gemini model to access the external groundtruth.

The Gemini API supports function calling and you could use this feature to set up your tools. However, for this tutorial, you will learn to simulate it using `stop_sequences` parameter.


Define the tools:

#### Search
Define a method to perform Wikipedia searches

In [26]:
@ReAct.add_method
def search(self, query: str):
    """Perfoms search on `query` via Wikipedia api and returns its summary.

    Args:
        query: Search parameter to query the Wikipedia API with.

    Returns:
        observation: Summary of Wikipedia search for `query` if found else
        similar search results.
    """
    observation = None
    query = query.strip()
    try:
      # try to get the summary for requested `query` from the Wikipedia
      observation = wikipedia.summary(query, sentences=4, auto_suggest=False)
      wiki_url = wikipedia.page(query, auto_suggest=False).url
      observation = self.clean(observation)

      # if successful, return the first 2-3 sentences from the summary as model's context
      observation = self.model.generate_content(f'Retun the first 2 or 3 \
      sentences from the following text: {observation}')
      observation = observation.text

      # keep track of the model's search history
      self._search_history.append(query)
      self._search_urls.append(wiki_url)
      print(f"Information Source: {wiki_url}")

    # if the page is ambiguous/does not exist, return similar search phrases for model's context
    except (DisambiguationError, PageError) as e:
      observation = f'Could not find ["{query}"].'
      # get a list of similar search topics
      search_results = wikipedia.search(query)
      observation += f' Similar: {search_results}. You should search for one of those instead.'

    return observation

#### Lookup
Look for a specific phrase on the Wikipedia page.

In [27]:
@ReAct.add_method
def lookup(self, phrase: str, context_length=200):
    """Searches for the `phrase` in the lastest Wikipedia search page
    and returns number of sentences which is controlled by the
    `context_length` parameter.

    Args:
        phrase: Lookup phrase to search for within a page. Generally
        attributes to some specification of any topic.

        context_length: Number of words to consider
        while looking for the answer.

    Returns:
        result: Context related to the `phrase` within the page.
    """
    # get the last searched Wikipedia page and find `phrase` in it.
    page = wikipedia.page(self._search_history[-1], auto_suggest=False)
    page = page.content
    page = self.clean(page)
    start_index = page.find(phrase)

    # extract sentences considering the context length defined
    result = page[max(0, start_index - context_length):start_index+len(phrase)+context_length]
    print(f"Information Source: {self._search_urls[-1]}")
    return result

#### Finish
Instruct the pipline to terminate its execution.

In [28]:
@ReAct.add_method
def finish(self, _):
  """Finishes the conversation on encountering <finish> token by
  setting the `self.should_continue_prompting` flag to `False`.
  """
  self.should_continue_prompting = False
  print(f"Information Sources: {self._search_urls}")

### Stop tokens and function calling imitation

Now that you are all set with function definitions, the next step is to instruct the model to interrupt its execution upon encountering any of the action tokens. You will make use of the `stop_sequences` parameter from [`genai.GenerativeModel.GenerationConfig`](https://ai.google.dev/api/python/google/generativeai/GenerationConfig) class to instruct the model when to stop. Upon encountering an action token, the pipeline will simply extract what specific token from the `stop_sequences` argument terminated the model's execution, and then call the appropriate **tool** (function).

The function's response will be added to model's chat history for continuing the context link.

In [29]:
@ReAct.add_method
def __call__(self, user_question, max_calls: int=8, **generation_kwargs):
  """Starts multi-turn conversation with the chat models with function calling

  Args:
      max_calls: max calls made to the model to get the final answer.

      generation_kwargs: Same as genai.GenerativeModel.GenerationConfig
              candidate_count: (int | None) = None,
              stop_sequences: (Iterable[str] | None) = None,
              max_output_tokens: (int | None) = None,
              temperature: (float | None) = None,
              top_p: (float | None) = None,
              top_k: (int | None) = None

  Raises:
      AssertionError: if max_calls is not between 1 and 8
  """

  # hyperparameter fine-tuned according to the paper
  assert 0 < max_calls <= 8, "max_calls must be between 1 and 8"

  if len(self.chat.history) == 0:
    model_prompt = self.prompt.format(question=user_question)
  else:
    model_prompt = user_question

  # stop_sequences for the model to immitate function calling
  callable_entities = ['</search>', '</lookup>', '</finish>']

  generation_kwargs.update({'stop_sequences': callable_entities})

  self.should_continue_prompting = True
  for idx in range(max_calls):

    self.response = self.chat.send_message(content=[model_prompt],
              generation_config=generation_kwargs, stream=False)

    for chunk in self.response:
      print(chunk.text, end=' ')

    response_cmd = self.chat.history[-1].parts[-1].text

    try:
      # regex to extract <function name writen in between angular brackets>
      cmd = re.findall(r'<(.*)>', response_cmd)[-1]
      print(f'</{cmd}>')
      # regex to extract param
      query = response_cmd.split(f'<{cmd}>')[-1].strip()
      # call to appropriate function
      observation = self.__getattribute__(cmd)(query)

      if not self.should_continue_prompting:
        break

      stream_message = f"\nObservation {idx + 1}\n{observation}"
      print(stream_message)
      # send function's output as user's response
      model_prompt = f"<{cmd}>{query}</{cmd}>'s Output: {stream_message}"

    except (IndexError, AttributeError) as e:
      model_prompt = "Please try to generate thought-action-observation traces \
      as instructed by the prompt."

### Test ReAct prompted Gemini model

In [30]:
gemini_ReAct_chat = ReAct(model='gemini-1.5-flash', ReAct_prompt='model_instructions.txt')
# Note: try different combinations of generational_config parameters for variational results
gemini_ReAct_chat("Pokémon evreninde Mewtwo nasıl yaratılmıştır?", temperature=0.2)

Soru:  Bulbasaur'un evrimleşmiş hali nedir?

Düşünce 1: Bulbasaur'un evrim zincirini bulmam gerekiyor.  Bunun için Vikipedi'de Bulbasaur'u aramam lazım.

Eylem 1: <search>Bulbasaur </search>
Information Source: https://en.wikipedia.org/wiki/Bulbasaur

Observation 1
Bulbasaur ( ), known as Fushigidane (Japanese: フシギダネ) in Japan, is a fictional Pokémon species in Nintendo and Game Freak's Pokémon franchise.  First introduced in the video games Pokémon Red and Blue, it was created by Atsuko Nishida with the design finalized by Ken Sugimori.

Düşünce 2:  Gözlem 1, Bulbasaur'un evrimleşmiş halini belirtmiyor.  Daha fazla bilgi bulmak için Bulbasaur'un evrim aşamalarını aramam gerekiyor veya Bulbasaur'un evrimleşmiş halini doğrudan aramayı deneyebilirim.

Eylem 2: <search>Bulbasaur evolution

 </search>

Observation 2
Could not find ["Bulbasaur evolution"]. Similar: ['Bulbasaur', 'List of generation I Pokémon', 'Squirtle', 'Pokémon: Mewtwo Strikes Back – Evolution', 'Charizard', 'Eevee', 'Di

Now, try asking the same question to `gemini-1.5-flash` model without the ReAct prompt.

In [31]:
gemini_ReAct_chat.model.generate_content("Pikachu ismi ne anlama geliyor ve nereden geliyor?").text

'Pikachu ismi Japonca\'dan geliyor.  "Pika" kelimesi, farelerin ve diğer küçük kemirgenlerin çıkardığı sesi taklit ederken, "chu" ise elektrikli bir şok sesini taklit eder.  Yani, ismin kendisi, Pikachu\'nun görünüşünü ve yeteneklerini (elektrik gücü) açıkça yansıtıyor.  Dolayısıyla, isminin direkt bir anlamı yoktur, ancak onomatopeik (ses taklidi) bir bileşimdir.\n'

## Summary

The ReAct prompted Gemini model is grounded by external information sources and hence is less prone to hallucination. Furthermore, Thought-Action-Observation  traces generated by the model enhance human interpretability and trustworthiness by allowing users to witness the model's reasoning process for answering the user's query.


## Next steps


Head over to this [Streamlit app](https://mayochat.streamlit.app/) to interact with a ReAct prompted Gemini bot built with this code.