# RAG Introduction (PROTOTYPE)

This tutorial covers the basics of Retrieval Augmented Generation (RAG).

## Environment Setup

Follow our [tutorial on Apptainer](https://www.deeplearningwizard.com/language_model/containers/hpc_containers_apptainer/) to get started. Work through it up to the [Ollama section](https://www.deeplearningwizard.com/language_model/containers/hpc_containers_apptainer/#ollama-gemma-workloads), where you run `ollama serve` and `ollama run gemma:7b` successfully. Then run `apptainer shell --nv --nvccli apptainer_container_0.1.sif`, followed by `jupyter lab`, to open and run this notebook.

!!! info  "Directory Guide"

    When you shell into the Apptainer `.sif` container, navigate as you normally would to the Deep Learning Wizard repository that you cloned. You will likely need to `cd ..` up a few directories before you reach the right folder.
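
Before opening the notebook, it can help to confirm that the Ollama server started by `ollama serve` is reachable from inside the container. The sketch below is one possible check, not part of the Apptainer tutorial: the helper name `ollama_is_up` is hypothetical, and the address `http://localhost:11434` assumes Ollama is listening on its default port.

```python
import http.client
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Connection errors and timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    # Assumption: the server runs on the default port of the same machine.
    print("Ollama reachable:", ollama_is_up())
```

If this prints `False`, check that `ollama serve` is still running in another shell before creating the `Ollama` client in the next section.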

## Import and Test LLM

In [10]:
# Import the Ollama class from the llama_index.llms.ollama module.
from llama_index.llms.ollama import Ollama

# Create an instance of the Ollama class. The "gemma:7b" argument specifies the model to be used.
llm = Ollama(model="gemma:7b")

In [11]:
# Set the prompt
prompt = "What is Retrieval Augmented Generation (RAG)?"

# Set the temperature
# Higher values (e.g., 1.0) make the output more random, while lower values (e.g., 0.2) make it more deterministic
temperature = 0.8

# Set the top_k
# This parameter restricts sampling to the top_k most likely tokens at each step of the generation process
top_k = 100

# Generate the response
response = llm.complete(prompt, temperature=temperature, top_k=top_k)

print(response)

Retrieval Augmented Generation (RAG) is a language modeling technique that uses the technique of retrieval to improve the performance of generative models. Instead of learning from scratch, RAG models leverage existing text data and information sources to generate new text content.

Here's a breakdown of the key concepts:

**Retrieval:**
- The model extracts relevant information from existing text sources like documents, articles, or code.
- This information is retrieved based on the specific task and query.

**Augmented Generation:**
- The retrieved information is used to augment the model's training data.
- This can involve adding new text examples or fine-tuning existing ones.

**Benefits:**

- **Transfer learning:** RAG models can benefit from knowledge transfer by leveraging existing text data.
- **Task-specificity:** They can be fine-tuned for specific tasks, improving performance compared to general-purpose language models.
- **Data efficiency:** They require less training data 
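
The `temperature` and `top_k` arguments passed to `llm.complete` can be illustrated with a small, self-contained sketch. It follows the textbook formulation (keep the `top_k` highest-scoring tokens, divide their scores by the temperature, apply a softmax). It is assumed, not verified here, that Ollama applies the same idea internally. The function name and the toy scores are invented for illustration.

```python
import math


def top_k_temperature_probs(
    logits: dict[str, float], temperature: float, top_k: int
) -> dict[str, float]:
    """Return the sampling distribution after top-k filtering and temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # Keep only the top_k highest-scoring tokens.
    kept = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Divide each score by the temperature, then apply a numerically stable softmax.
    scaled = [(tok, score / temperature) for tok, score in kept]
    max_score = max(s for _, s in scaled)
    exps = [(tok, math.exp(s - max_score)) for tok, s in scaled]
    total = sum(e for _, e in exps)
    return {tok: e / total for tok, e in exps}


# Hypothetical next-token scores, invented for illustration.
toy_logits = {"retrieval": 2.0, "generation": 1.0, "banana": -1.0, "the": 0.5}

sharp = top_k_temperature_probs(toy_logits, temperature=0.2, top_k=3)
flat = top_k_temperature_probs(toy_logits, temperature=1.0, top_k=3)
print("temperature=0.2:", sharp)
print("temperature=1.0:", flat)
```

With `top_k=3`, the lowest-scoring token is never sampled. The lower temperature concentrates more probability on the highest-scoring token, which is why low values give more deterministic output.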

## Basic LLM

In [4]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

ImportError: cannot import name 'VectorStoreIndex' from 'llama_index.core' (/opt/conda/lib/python3.11/site-packages/llama_index/core/__init__.py)

In [5]:
from llama_index import ServiceContext


In [6]:
from llama_index.chat_engine import SimpleChatEngine


In [7]:
from llama_index.core import SimpleDirectoryReader

ImportError: cannot import name 'SimpleDirectoryReader' from 'llama_index.core' (/opt/conda/lib/python3.11/site-packages/llama_index/core/__init__.py)

In [13]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.embeddings import resolve_embed_model
from llama_index.llms.ollama import Ollama

documents = SimpleDirectoryReader("data").load_data()

# BGE small English embedding model (bge-small-en-v1.5)
Settings.embed_model = resolve_embed_model("local:BAAI/bge-small-en-v1.5")

# ollama
Settings.llm = Ollama(model="mistral", request_timeout=30.0)

index = VectorStoreIndex.from_documents(
    documents,
)

ImportError: cannot import name 'VectorStoreIndex' from 'llama_index.core' (/opt/conda/lib/python3.11/site-packages/llama_index/core/__init__.py)
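
Although the cell above fails on import, the idea behind a vector index can be sketched without `llama_index`. The snippet below is a toy stand-in, not the library's implementation: it uses word counts in place of a learned embedding model, ranks a few hypothetical documents by cosine similarity to the query, and prepends the best match to the prompt. All function names and sample texts are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system uses a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], top_n: int = 1) -> list[str]:
    """Return the top_n documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_n]


def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the question with retrieved context before it is sent to an LLM."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical documents, invented for illustration.
docs = [
    "The emperor paraded through the town wearing no clothes at all.",
    "Ollama serves local language models over an HTTP API.",
    "Apptainer containers package software for HPC clusters.",
]
query = "What did the emperor wear?"
print(build_prompt(query, docs))
```

The retrieved text is added to the prompt at query time, and the model's weights are not changed. A learned embedding model such as the one configured above would be expected to match paraphrases that plain word overlap misses.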

In [10]:
import requests
from bs4 import BeautifulSoup

# Make a request to the website
url = 
res = requests.get(url)
html_page = res.content

# Use BeautifulSoup to parse the HTML content
soup = BeautifulSoup(html_page, 'html.parser')

# Find all text within the HTML, but exclude certain elements
text = soup.find_all(string=True)  # "string" replaces the deprecated "text" argument
output = ''
blacklist = ['[document]', 'noscript', 'header', 'html', 'meta', 'head', 'input', 'script']

for t in text:
    if t.parent.name not in blacklist:
        output += '{} '.format(t)

print(output)


The Emperor's New Clothes .bg-amlit-test{color:#333;background-color:#fdfcdc} Home Short Stories Short Story of the Day 100 Great Short Stories 20 Great American Short Stories Children's Stories Favorite Fairy Tales Short Stories for Middle School Short Stories for High School Teachers' Resources Useful Idioms Study Guides Mystery Stories Science Fiction Dystopian Stories Winter Sports Stories Russian Stories Morality Tales 75  Short  Short Stories 50 Great Feel-Good Stories Civil War Stories World War I Stories Dog Stories Foodie Stories Favorite Short Story Collections Gothic, Ghost, Horror & Weird Library Halloween Stories Christmas Stories Complete Short Story Library Children Short Stories for Children Fairy Tales by Age Mother Goose Just So Stories Feel-Good Children's Stories Children's Christmas Stories The Children's Library Halloween Stories for Children Grimm's Fairy Tales Children's Poems Lullabies Books for Young Readers Pre-K Wordplay! Winnie The Pooh Classroom Reading Sh
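
The output above still contains CSS (`.bg-amlit-test{...}`), because `style` is not in the blacklist and only the direct parent of each text node is checked. The sketch below shows one alternative using only the standard library's `html.parser` instead of BeautifulSoup. It skips everything nested inside a set of tags. The class name, the tag set, and the sample HTML are assumptions made for illustration.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping anything nested inside the tags in SKIP."""

    SKIP = {"script", "style", "noscript", "head"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical sample page, invented for illustration.
sample = """<html><head><title>Demo</title><style>.x{color:#333}</style></head>
<body><h1>The Emperor's New Clothes</h1><script>var a = 1;</script>
<p>Many years ago there lived an emperor.</p></body></html>"""

print(visible_text(sample))
```

Navigation menus are ordinary visible text, so they would still appear in the result. Removing them requires site-specific selection of the story's container element.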

