# 1. RAG Workshop: Introduction

In this workshop, we will use notebooks and Python scripts to learn interactively about Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG).

Large language models are machine learning models trained on large amounts of data. They can generate human-like text, answer questions, and more.


## Getting started with Jupyter Notebook
First of all, let's make sure you understand the Jupyter Notebook interface.
In Jupyter, a notebook is made of cells that contain either text or code.
You can type any Python code in a cell and press Shift + Enter to run it.

Interact with the cell below and run it multiple times to see the results.

In [43]:

a = globals().get("a")  # None on the first run, when `a` does not exist yet
a = a if a is not None else 2
a = a + 2
print(a)


from datetime import datetime
datetime.now().strftime('%Y-%m-%d %H:%M:%S')


56


'2024-11-10 17:46:55'

## MistralAI

In this workshop we will use MistralAI models. The concept is the same as with OpenAI (ChatGPT) or Anthropic (Claude).

You can run Mistral models on your own computer or use the cloud version.
For this workshop we will use the cloud version, so we don't need to download large models.

In [15]:
import os
from mistralai import Mistral

mistral_api_key = os.getenv('MISTRAL_API_KEY')
mistral_client = Mistral(api_key=mistral_api_key)
# name of the specific Mistral model we want to use
model_name = "mistral-small-latest"
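If `MISTRAL_API_KEY` is not set, `os.getenv` silently returns `None` and the problem only shows up later, when the first request fails. A minimal sketch of an early check, assuming you want a clear error message instead (`require_api_key` is a hypothetical helper, not part of the `mistralai` library):

```python
import os


def require_api_key(var_name="MISTRAL_API_KEY"):
    """Return the key stored in the environment variable, or raise a clear error."""
    key = os.getenv(var_name)
    if not key:
        # Covers both a missing variable (None) and an empty string.
        raise RuntimeError(f"Environment variable {var_name} is not set.")
    return key


# Possible usage in the cell above (hypothetical):
# mistral_api_key = require_api_key()
```

Failing early like this keeps the error close to its cause, which helps when several people set up their environment at the same time.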

In [27]:
def call_mistral_model(message):
    # send a single user message to the model
    response = mistral_client.chat.complete(
        model=model_name,
        messages=[
            {
                "role": "user",
                "content": message,
            }
        ],
    )
    # extract only the text from the response
    response_text = response.choices[0].message.content
    return response_text

print(call_mistral_model("hello! What is your name?"))

Hello! I don't have a name, but you can call me Assistant if you'd like. How can I help you today?


## TODO: Simple Q&A RAG

Large language models (LLMs) can sometimes hallucinate, that is, present false information as fact, for example when their training data is outdated or does not cover the topic. Retrieval-Augmented Generation (RAG) mitigates this by adding external information to the prompt. In this task, we will create a simple Q&A RAG that utilizes knowledge from a PDF to enrich its answers.

Now that we have the text, we can begin enriching the prompt to make our LLM even smarter!

In [28]:
def create_rag_prompt(message, context):
    return f"""Answer the question only using the provided content.

        Context: {context}

        User Question: {message}

        Be helpful and friendly. If the information cannot be found respond with "I don't know"
        """  

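To see exactly what the model receives, it can help to print the enriched prompt before sending it. The sketch below repeats the template from the cell above so that it runs on its own; the question and context are hypothetical sample values chosen only for illustration.

```python
def create_rag_prompt(message, context):
    # Same template as in the cell above, repeated so this snippet runs on its own.
    return f"""Answer the question only using the provided content.

        Context: {context}

        User Question: {message}

        Be helpful and friendly. If the information cannot be found respond with "I don't know"
        """


# Hypothetical sample values, used only to display the resulting prompt.
example_prompt = create_rag_prompt(
    message="In which room does the workshop take place?",
    context="The workshop takes place in room 101.",
)
print(example_prompt)
```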
Below you can compare how the LLM's answer changes depending on the information you provide.

In [30]:
text = """
The weather in Berlin on the 10th of December of 2027 will be 10 degrees Celsius.
"""


def compare_llm_answers(message):
    generic_response = call_mistral_model(message)
    
    rag_prompt = create_rag_prompt(message=message, context=text)
    rag_response = call_mistral_model(rag_prompt)

    print(f"GENERIC RESPONSE:\n {generic_response}")
    print("-" * 10)
    print(f"RAG RESPONSE:\n {rag_response}")

compare_llm_answers("What will be the weather in Berlin on the 10th of December of 2027?")

GENERIC RESPONSE:
 I'm an assistant that operates solely on the data it has been trained on up until 2021, and I don't have real-time or future weather data. Therefore, I can't provide information on the weather in Berlin on December 10, 2027. To find out the weather forecast for a specific date, I would recommend checking a reliable weather website or application closer to the desired date.
----------
RAG RESPONSE:
 The weather in Berlin on the 10th of December, 2027, is expected to be 10 degrees Celsius.


That's it! A RAG enriches the prompt with additional information about the topic before the model generates a response. The external information can come from various sources besides PDFs, such as Google search results, social media posts, and more. With that, we’ve built a simple Q&A RAG. In the next chapter, we will scale it up to include even more context.
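When the context comes from several sources, something has to decide which piece goes into the prompt. A minimal sketch of that retrieval step, assuming a naive word-overlap score (real systems typically use embeddings instead; `tokenize`, `pick_best_snippet`, and the sample `snippets` are hypothetical and not part of the workshop code):

```python
import re


def tokenize(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"\w+", text.lower()))


def pick_best_snippet(question, snippets):
    """Return the snippet that shares the most words with the question."""
    question_words = tokenize(question)
    return max(snippets, key=lambda s: len(question_words & tokenize(s)))


# Hypothetical sample data for illustration only.
snippets = [
    "The weather in Berlin on the 10th of December of 2027 will be 10 degrees Celsius.",
    "The workshop uses the mistral-small-latest model.",
]
best = pick_best_snippet("What will be the weather in Berlin?", snippets)
print(best)
```

The selected snippet could then be passed as `context` to `create_rag_prompt`, so the prompt carries only the most relevant text.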