This notebook shows how to perform [Retrieval Augmented Generation (RAG)](https://arxiv.org/abs/2005.11401) using `magentic` and the `wikipedia` API. RAG provides the LLM with context that it can use when generating its response. This approach allows us to insert new or private information that was not present in the model's training data. The Wikipedia API is used here for demonstration, but the methods shown are applicable to any data source.

In [1]:
import wikipedia

from magentic import prompt, prompt_chain

Define a standard prompt for explaining a topic.

In [2]:
@prompt("Tell me about {topic} in simple terms. If you do not know, just say 'I do not know'.")
def explain_standard(topic: str) -> str:
    ...

The model can explain topics that were in the training data (e.g. Treaty of Versailles), but it cannot explain (or fabricates an explanation for) topics that were not in the training data (e.g. Wagner Coup).


In [3]:
explain_standard("Treaty of Versailles")

'The Treaty of Versailles was a peace agreement signed in 1919 after World War I. It was meant to decide what would happen to the countries involved in the war. Germany, which was blamed for starting the war, had to accept full responsibility and pay a lot of money as reparations. They also had to give up some of their land and reduce their military. Many people in Germany were unhappy with the treaty, and some believe it contributed to the start of World War II.'

In [4]:
explain_standard("Wagner Coup")

'I\'m sorry, but I do not have any information about a "Wagner Coup" in my database. It\'s possible that you may be referring to something specific that I\'m not aware of.'

Wikipedia has up-to-date information on most topics, so it can be used to provide context to the LLM.

In [5]:
wikipedia.summary("Wagner Coup")

'On 23 June 2023, the Wagner Group, a Russian government-funded paramilitary and private military company, staged a rebellion after a period of increasing tensions between the Russian Ministry of Defence and the leader of Wagner, Yevgeny Prigozhin.\nWhile Prigozhin was supportive of the Russian invasion of Ukraine, he had previously publicly criticized Defense Minister Sergei Shoigu and Chief of the General Staff Valery Gerasimov, blaming them for the country\'s military shortcomings and accusing them of handing over "Russian territories" to Ukraine. Prigozhin portrayed the rebellion as a response to an alleged attack on his forces by the ministry, and demanded that Shoigu and Gerasimov be turned over to him. In a televised address on 24 June, Russian president Vladimir Putin denounced Wagner\'s actions as treason and pledged to quell the rebellion.\nPrigozhin\'s forces took control of Rostov-on-Don and the headquarters of the Southern Military District in the city. An armored column o
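
A retrieved summary like the one above can run to several paragraphs. If the prompt needs to stay small, one option is to trim the text at a sentence boundary before passing it on. The sketch below is only an illustration of that idea: `truncate_context` and the character budget are hypothetical and are not part of `magentic` or `wikipedia`.

```python
def truncate_context(text: str, max_chars: int) -> str:
    """Keep whole sentences from the start of text, up to max_chars characters."""
    if len(text) <= max_chars:
        return text
    # Find the last sentence boundary that fits inside the budget.
    cut = text.rfind(". ", 0, max_chars)
    # Fall back to a hard cut if no sentence boundary fits.
    return text[: cut + 1] if cut != -1 else text[:max_chars]


sample = "First sentence. Second sentence. Third sentence."
short = truncate_context(sample, 35)
print(short)
```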

If the topics we receive are always present on Wikipedia, then it might make sense to always query Wikipedia and add this information to the context.

In [6]:
@prompt("""Based on the following context, tell me about {topic} in simple terms.
If you do not know, just say 'I do not know'.
        
context
---
{context}
""")
def explain_using_context(topic: str, context: str) -> str:
    ...


def explain_using_wikipedia_context(topic: str) -> str:
    context = wikipedia.summary(topic)
    return explain_using_context(topic, context=context)


explain_using_wikipedia_context("Wagner Coup")

"The Wagner Coup was a rebellion that happened in Russia in June 2023. The Wagner Group, a paramilitary and private military company funded by the Russian government, staged the rebellion. The leader of Wagner, Yevgeny Prigozhin, was unhappy with the Russian Ministry of Defence and publicly criticized them. He accused them of causing military problems and giving away Russian territories to Ukraine. Prigozhin's forces took control of Rostov-on-Don and the headquarters of the Southern Military District. They also advanced towards Moscow with an armored column. The Russian military tried to stop them with air attacks, but the rebels had anti-aircraft systems and were able to repel the attacks. Before reaching Moscow, a settlement was brokered by the Belarusian president, and Prigozhin agreed to end the rebellion. The case of armed rebellion against Prigozhin was closed, and the charges were dropped. Several people were killed and injured during the rebellion."

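To see what the model actually receives in the cell above, the sketch below renders the same template with plain `str.format`. It assumes `magentic` fills the `{topic}` and `{context}` placeholders from the function arguments in a similar way. `TEMPLATE` and `render_prompt` are hypothetical names used only for illustration, the context string is a shortened stand-in, and no LLM is called.

```python
# Illustrative only: mimics how the prompt text might be assembled.
# TEMPLATE and render_prompt are hypothetical names, not part of magentic.
TEMPLATE = """Based on the following context, tell me about {topic} in simple terms.
If you do not know, just say 'I do not know'.

context
---
{context}
"""


def render_prompt(topic: str, context: str) -> str:
    """Fill the template placeholders with the given arguments."""
    return TEMPLATE.format(topic=topic, context=context)


rendered = render_prompt(
    "Wagner Coup",
    "On 23 June 2023, the Wagner Group staged a rebellion.",  # stand-in context
)
print(rendered)
```

Printing the rendered prompt is a quick way to check that the retrieved text lands where the template expects it.
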
Alternatively, if we only need to query Wikipedia in some cases, we can provide a `wikipedia_summary` function for the LLM to call as needed, which avoids unnecessary Wikipedia queries. When Wikipedia _is_ queried, two calls are made to the LLM: one in which it decides to query Wikipedia, and one to generate the explanation from the Wikipedia summary.

In [7]:
def wikipedia_summary(topic: str) -> str:
    print(f" - Searching Wikipedia for {topic!r}")
    return wikipedia.summary(topic)


@prompt_chain(
    "Tell me about {topic} in simple terms. Only query wikipedia if you do not already know about {topic}.",
    functions=[wikipedia_summary],
)
def explain_using_function_calling(topic: str) -> str:
    ...


explain_using_function_calling("Wagner Coup")

 - Searching Wikipedia for 'Wagner Coup'


'The Wagner Coup was a rebellion that took place on June 23, 2023, led by the Wagner Group, a Russian government-funded paramilitary and private military company. The coup was sparked by tensions between the Russian Ministry of Defence and the leader of Wagner, Yevgeny Prigozhin.\n\nPrigozhin, who supported the Russian invasion of Ukraine, had criticized Defense Minister Sergei Shoigu and Chief of the General Staff Valery Gerasimov for the country\'s military shortcomings. He accused them of giving away "Russian territories" to Ukraine. Prigozhin claimed that the rebellion was in response to an alleged attack on his forces by the ministry and demanded that Shoigu and Gerasimov be handed over to him.\n\nRussian President Vladimir Putin denounced Wagner\'s actions as treason and vowed to suppress the rebellion. Prigozhin\'s forces took control of Rostov-on-Don and the headquarters of the Southern Military District. They also advanced towards Moscow with an armored column, repelling air a
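
The two LLM calls can be made visible with a toy loop. This is a simplified model of what a function-calling chain does, not `magentic`'s implementation: `FunctionCall`, `fake_model`, and `run_chain` are hypothetical names, and the "model" is a stub that always requests one lookup before answering.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class FunctionCall:
    """A request from the (fake) model to run a function."""
    name: str
    argument: str


def fake_model(messages: list[str]) -> Union[FunctionCall, str]:
    """Stand-in for an LLM: asks for a lookup first, then answers."""
    if not any(m.startswith("function result: ") for m in messages):
        return FunctionCall(name="lookup", argument="Wagner Coup")
    return "Summary based on: " + messages[-1].removeprefix("function result: ")


def run_chain(question: str, functions: dict[str, Callable[[str], str]]) -> tuple[str, int]:
    """Call the model until it returns text; return the answer and the number of model calls."""
    messages = [question]
    calls = 0
    while True:
        response = fake_model(messages)
        calls += 1
        if isinstance(response, str):
            return response, calls
        # Run the requested function and feed its result back to the model.
        result = functions[response.name](response.argument)
        messages.append(f"function result: {result}")


answer, llm_calls = run_chain(
    "Tell me about Wagner Coup",
    {"lookup": lambda topic: f"stub article about {topic}"},
)
print(answer, llm_calls)
```

Each function call the model requests costs one extra round trip, which is why an unnecessary lookup adds latency and cost.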

Despite the function call being optional, the LLM might decide to use it when it doesn't need to!

In [8]:
explain_using_function_calling("Treaty of Versailles")

 - Searching Wikipedia for 'Treaty of Versailles'


"The Treaty of Versailles was a peace treaty signed in 1919 to end World War I. It was signed in the Palace of Versailles, France. The treaty was important because it ended the state of war between Germany and most of the Allied Powers. Germany was blamed for causing the war and was required to accept responsibility for the damage caused. The treaty also required Germany to disarm, give up territory, and pay reparations to the countries that had been affected by the war. The treaty was controversial because some people thought it was too harsh on Germany, while others thought it was too lenient. The treaty's terms had long-lasting effects, including economic collapse in Germany and the rise of the Nazi Party, which eventually led to World War II."

One way to combat this is to create a simple explain function that returns `None` for unknown answers, then explicitly check whether the model was able to respond without the Wikipedia context, and call the LLM with the Wikipedia context if not. Essentially, we take some responsibility away from the LLM and put it in code instead. This approach makes the most sense if the function for fetching context is slow or expensive, or is needed only infrequently.

In [9]:
@prompt("Tell me about {topic} in simple terms. If you do not know, return null.")
def explain_or_none(topic: str) -> str | None:
    ...


def explain_final(topic: str) -> str:
    explanation = explain_or_none(topic)
    if not explanation:
        print(" - Received null response")
        return explain_using_wikipedia_context(topic)
    return explanation

explain_final("Wagner Coup")

 - Received null response


"The Wagner Coup was a rebellion that happened in Russia in June 2023. The Wagner Group, a paramilitary and private military company funded by the Russian government, staged the rebellion. The leader of Wagner, Yevgeny Prigozhin, was unhappy with the Russian Ministry of Defence and publicly criticized them. He accused them of causing military problems and giving away Russian territories to Ukraine. Prigozhin's forces took control of Rostov-on-Don and the headquarters of the Southern Military District. They also advanced towards Moscow with an armored column. The Russian military tried to stop them, but the rebels had anti-aircraft systems that repelled the attacks. Before reaching Moscow, a settlement was brokered by the Belarusian president, and Prigozhin agreed to end the rebellion. The case of armed rebellion against the rebels was closed, and several people were killed or injured during the conflict."

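The same fallback logic can be written independently of Wikipedia, in line with the point that these methods apply to any data source. The sketch below passes the three steps in as callables and uses stubs in place of the LLM and the data source, so it runs without network access. `explain_with_fallback` and the `stub_*` functions are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional


def explain_with_fallback(
    topic: str,
    explain: Callable[[str], Optional[str]],
    fetch_context: Callable[[str], str],
    explain_with_context: Callable[[str, str], str],
) -> str:
    """Try the cheap path first; fetch context only if the model returned nothing."""
    explanation = explain(topic)
    if explanation:
        return explanation
    return explain_with_context(topic, fetch_context(topic))


# Stubs standing in for the LLM and the data source.
known = {"Treaty of Versailles": "A peace treaty signed in 1919."}
fetched: list[str] = []


def stub_explain(topic: str) -> Optional[str]:
    return known.get(topic)


def stub_fetch(topic: str) -> str:
    fetched.append(topic)  # record each retrieval so we can count them
    return f"stub context for {topic}"


def stub_explain_with_context(topic: str, context: str) -> str:
    return f"{topic}: explained using {context!r}"


first = explain_with_fallback("Treaty of Versailles", stub_explain, stub_fetch, stub_explain_with_context)
second = explain_with_fallback("Wagner Coup", stub_explain, stub_fetch, stub_explain_with_context)
print(fetched)
```

Only the unknown topic triggers a retrieval, so the slow or expensive fetch runs just when it is needed.
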
In [10]:
explain_final("Treaty of Versailles")

"The Treaty of Versailles was a peace treaty signed after World War I. It was signed in 1919 in the Palace of Versailles in France. The treaty placed the blame for the war on Germany and its allies. It imposed heavy reparations on Germany, meaning they had to pay a lot of money to the countries that were affected by the war. The treaty also limited the size of Germany's military and took away some of its territory. Many people believe that the harsh terms of the treaty contributed to the rise of Adolf Hitler and the start of World War II."