### Introduction to RAG:

What is **RAG**?

RAG stands for Retrieval-Augmented Generation, a technique in natural language processing (NLP) and artificial intelligence (AI) in which a language model is given relevant external information at query time so that it can ground its answer in that information. The workflow is commonly broken into three steps: Retrieve, Augment, and Generate.

- **Retrieve:** This component involves retrieving relevant information from a knowledge base, database, or external sources. The goal is to gather context and background information related to the user's query or input.
- **Augment:** In this component, the retrieved information is augmented with additional data, such as user preferences, conversation history, or external knowledge. This step helps to refine the understanding of the user's intent and provides more context for generating a response.
- **Generate:** The final component involves generating a response based on the retrieved and augmented information. This response is typically generated using a machine learning model, such as a language model or a text generator.
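
The three steps above can be sketched without any API call. This is a minimal illustration, not a reference implementation: the function names, the toy `DOCUMENTS` list, the word-overlap scoring, and the stand-in `llm` callable are all assumptions made for this example.

```python
import re

# Toy knowledge base; the second document is made up for illustration.
DOCUMENTS = [
    "Labyrinth is available now for £599.",
    "The office is open from 9am to 5pm on weekdays.",
]

def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents):
    """Return the document that shares the most words with the query."""
    query_words = tokenize(query)
    return max(documents, key=lambda doc: len(query_words & tokenize(doc)))

def augment(query, context):
    """Combine the retrieved context and the query into one prompt."""
    return f"Context: {context}\nQuestion: {query}"

def generate(prompt, llm):
    """Pass the prompt to any callable that maps a prompt to a reply."""
    return llm(prompt)

query = "How much does Labyrinth cost?"
context = retrieve(query, DOCUMENTS)
prompt = augment(query, context)
# Stand-in for a real model call: it only reports the prompt length.
reply = generate(prompt, llm=lambda p: f"(model reply to {len(p)} characters)")
print(prompt)
print(reply)
```

In a real pipeline the `llm` callable would wrap an API request, and word overlap would usually be replaced by a stronger retrieval method.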

#### **Why RAG is needed:**
**Improved accuracy:** RAG helps to improve the accuracy of conversational AI models by providing more context and relevant information.

**Personalization:** By incorporating user preferences and conversation history, RAG enables more personalized responses.

**Efficient knowledge retrieval:** RAG allows for efficient retrieval of knowledge from external sources, reducing the need for manual knowledge updates.

**Flexibility:** The RAG framework can be applied to various conversational AI applications, such as chatbots, voice assistants, and language translation systems.

---
**What we do here:**
- Make an API call to an LLM
- Build the tiniest RAG

**Tools:**
- anthropic


### Take a look at

[Anthropic cookbook](https://github.com/anthropics/anthropic-cookbook/)

[Anthropic API reference](https://docs.anthropic.com/en/api/getting-started)

### Python API

[Getting started](https://github.com/anthropics/courses/blob/master/anthropic_api_fundamentals/01_getting_started.ipynb)

[Messages format](https://github.com/anthropics/courses/blob/master/anthropic_api_fundamentals/02_messages_format.ipynb)

[Models](https://github.com/anthropics/courses/blob/master/anthropic_api_fundamentals/03_models.ipynb)

[Parameters](https://github.com/anthropics/courses/blob/master/anthropic_api_fundamentals/04_parameters.ipynb)

[Streaming](https://github.com/anthropics/courses/blob/master/anthropic_api_fundamentals/05_Streaming.ipynb)

[Vision](https://github.com/anthropics/courses/blob/master/anthropic_api_fundamentals/06_vision.ipynb)

---


### 🍜 New ingredients!

- `os`: a standard-library module that provides functions for interacting with the operating system.
- `load_dotenv`: a function from the `dotenv` library that loads environment variables from a `.env` file.
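
Once the variables are loaded, it can help to fail early when a key is missing and to avoid printing the full secret. The sketch below is one possible way to do that; `require_env`, `mask`, and the `DEMO_API_KEY` variable are hypothetical names used only for this example.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is unset."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

def mask(secret, visible=4):
    """Replace all but the last few characters with asterisks."""
    return "*" * max(len(secret) - visible, 0) + secret[-visible:]

# Demo with a dummy variable so the sketch runs without a real key.
os.environ["DEMO_API_KEY"] = "sk-demo-1234567890"
print("api_key -> ", mask(require_env("DEMO_API_KEY")))
```

The same helpers would work with `ANTHROPIC_API_KEY` after `load_dotenv` has run.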

In [None]:
import os
from dotenv import load_dotenv

# load .env
load_dotenv(dotenv_path='../.env')

# get API key
api_key = os.getenv('ANTHROPIC_API_KEY')

# confirm the key was loaded without printing the secret itself
print("api_key loaded -> ", api_key is not None)

---
### Does the LLM know everything?

Let's check whether the LLM knows the current date.

* the model reads instructions from the system role
* the model reads the question from the user role

---

### 🍜 New ingredients!

##### Role
When using Claude, you can dramatically improve its performance by using the system parameter to give it a role. This technique, known as role prompting, is the most powerful way to use system prompts with Claude.

The right role can turn Claude from a general assistant into your virtual domain expert!


##### Content
The content is the actual text of a message. Each entry in `messages` pairs a role (such as `user`) with its content.
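
To make the role and content pairing concrete, here is a small sketch that assembles the `system` and `messages` arguments in the same shape used by the code cells in this notebook. The helper name `build_request` and its `history` parameter are assumptions for illustration, not part of the `anthropic` library.

```python
def build_request(system_prompt, user_prompt, history=None):
    """Assemble the system prompt and message list for a chat request."""
    # Copy earlier turns so the caller's list is not modified.
    messages = list(history or [])
    # The newest user message always goes last.
    messages.append({"role": "user", "content": user_prompt})
    return {"system": system_prompt, "messages": messages}

request = build_request(
    system_prompt="You are a helpful assistant.",
    user_prompt="What is the current date?",
)
print(request["messages"])
```

The resulting dictionary could then be unpacked into the call, for example `client.messages.create(model=..., max_tokens=..., **request)`.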

##### Good to read

[Giving Claude a role with a system prompt](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/system-prompts)


In [None]:
from anthropic import Anthropic

client = Anthropic(api_key=api_key)

system_prompt = "You are a helpful assistant. If you don't know the answer, please respond with 'I don't know'."

user_prompt = "What is the current date?"

# make a response
response = client.messages.create(
    model="claude-3-haiku-20240307",
    system=system_prompt,
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": user_prompt
    }]
)

# print the response
print(response.content[0].text)

---
### Current date

Let's add some knowledge to our context.

Here we use a function that returns the current date.

---

### 🍜 New ingredients!

<b>Retrieve</b> - we call <b>date.today()</b> and put its result into the prompt as context with relevant data.


In [None]:
from anthropic import Anthropic
from datetime import date

client = Anthropic(api_key=api_key)

system_prompt = "You are a helpful assistant. If you don't know the answer, please respond with 'I don't know'."

question = "What is the current date?"

# end the context sentence with ". " so it does not run into the question
user_prompt = (f"You know that the current date is {date.today()}. "
            + question)

# make a response
response = client.messages.create(
    model="claude-3-haiku-20240307",
    system=system_prompt,
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": user_prompt
    }]
)

print("LLM      -> ", response.content[0].text)
print("function -> ", date.today())

---
### Specific text

Let's add some custom text and ask about the data in it.

---

### 🍜 New ingredients!

<b>query</b> - the question we ask

<b>context</b> - the relevant data we have
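
When there is more than one piece of context, the query and each document can be wrapped in tags before they go into the prompt. The following is only a sketch: the `build_prompt` helper, the per-document `<document>` tag, the second example document, and the closing instruction are assumptions for this example rather than a required format.

```python
def build_prompt(query, documents):
    """Wrap the query and each context document in XML-style tags."""
    joined = "\n".join(f"<document>{doc}</document>" for doc in documents)
    return (
        f"<query>\n{query}\n</query>\n"
        f"<documents>\n{joined}\n</documents>\n"
        "Answer using only the documents above."
    )

prompt = build_prompt(
    query="How much does Labyrinth cost?",
    documents=[
        "Labyrinth is available now for £599.",
        "Delivery takes three working days.",  # made-up second document
    ],
)
print(prompt)
```

Building the string from short pieces also avoids the leading whitespace that an indented triple-quoted string carries into the prompt.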

---

In [None]:
from anthropic import Anthropic

client = Anthropic(api_key=api_key)

context = "Labyrinth is available now for £599."

query = "How much does Labyrinth cost?"

user_prompt = (f"""
            You have been tasked with helping us to answer the following query: 
            <query>
            {query}
            </query>
            You have access to the following documents which are meant to provide context as you answer the query:
            <documents>
            {context}
            </documents>
            Please remain faithful to the underlying context, and only deviate from it if you are 100% sure that you know the answer already. 
            Answer the question now, and avoid providing preamble such as 'Here is the answer', etc
            """
            )

# make a response
response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": user_prompt
    }]
)

print("LLM      -> ", response.content[0].text)
print("context  -> ", context)

### Well done!

---
### Conclusion
In this tutorial, we looked at all three key components of the RAG pipeline:
1. **Retrieve**: our "database" is a single string with specific knowledge about a price, and we use the whole string as the relevant chunk.
2. **Augment**: we wrap the query and the retrieved context in a prompt that asks the model to stay faithful to the context.
3. **Generate**: we send the whole prompt to the LLM through an API call to produce a response.


In the next tutorial, we will use a Vector Database to retrieve specific parts with relevant information that fit the query.

---
made with <3 by 
[dima dem](https://github.com/dimadem/) |42London