# RAG Pipeline with LlamaIndex

In this notebook we build a basic RAG pipeline with LlamaIndex. The pipeline has the following steps:

1. Setup LLM and Embedding Model.
2. Download Data.
3. Load Data.
4. Index Data.
5. Create Query Engine.
6. Querying.

### Installation

In [None]:
!pip install llama-index
!pip install llama-index-llms-anthropic
!pip install llama-index-embeddings-huggingface

Collecting llama-index
  Downloading llama_index-0.10.56-py3-none-any.whl (6.8 kB)
Collecting llama-index-agent-openai<0.3.0,>=0.1.4 (from llama-index)
  Downloading llama_index_agent_openai-0.2.9-py3-none-any.whl (13 kB)
Collecting llama-index-cli<0.2.0,>=0.1.2 (from llama-index)
  Downloading llama_index_cli-0.1.12-py3-none-any.whl (26 kB)
Collecting llama-index-core==0.10.56 (from llama-index)
  Downloading llama_index_core-0.10.56-py3-none-any.whl (15.5 MB)
Collecting llama-index-embeddings-openai<0.2.0,>=0.1.5 (from llama-index)
  Downloading llama_index_embeddings_openai-0.1.11-py3-none-any.whl (6.3 kB)
Collecting llama-index-indices-managed-llama-cloud>=0.2.0 (from llama-index)
  Downloading llama_index_indices_managed_llama_cloud-0.2.5-py3-none-any.whl (9.3 kB)
Collecting llama-index-legacy<0.10.0,>=0.9.48 (from llama-index)
  Downloading llama_index_le

### Setup API Keys

In [None]:
# !pip list

# !pip show llama-index
# !pip show llama-index-llms-anthropic
# !pip show llama-index-embeddings-huggingface

import subprocess
import sys

def get_package_version(package_name):
    """Return the installed version of package_name, or None if pip cannot find it."""
    # Run pip through the current interpreter so the notebook's own environment is queried
    result = subprocess.run(
        [sys.executable, "-m", "pip", "show", package_name],
        capture_output=True,
        text=True,
    )

    # Search the output for the "Version:" line
    for line in result.stdout.splitlines():
        if line.startswith("Version:"):
            return line.split(":", 1)[1].strip()

    # Return None if the version was not found
    return None

# Get versions of specific packages
llama_index_version = get_package_version("llama-index")
llms_anthropic_version = get_package_version("llama-index-llms-anthropic")
embeddings_version = get_package_version("llama-index-embeddings-huggingface")

print(f"llama-index: {llama_index_version}")
print(f"llama-index-llms-anthropic: {llms_anthropic_version}")
print(f"llama-index-embeddings-huggingface: {embeddings_version}")


llama-index: 0.10.56
llama-index-llms-anthropic: 0.1.15
llama-index-embeddings-huggingface: 0.2.2
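
The same check can be done without spawning a `pip` subprocess. The sketch below uses the standard-library `importlib.metadata` module; the helper name `installed_version` is an assumption made for illustration and is not part of LlamaIndex.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment
        return None


for name in (
    "llama-index",
    "llama-index-llms-anthropic",
    "llama-index-embeddings-huggingface",
):
    print(f"{name}: {installed_version(name)}")
```

Because this reads metadata from the interpreter that runs the notebook, it reports what `import` will actually see, which may differ from a `pip` found elsewhere on the `PATH`.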


In [None]:
import os
os.environ['ANTHROPIC_API_KEY'] = 'sk-ant-api03---'
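
Hard-coding a key in a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming an interactive session, the key can be read from the environment and prompted for only when it is missing. The helper name `ensure_api_key` and its `prompt` parameter are hypothetical.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="ANTHROPIC_API_KEY", prompt=getpass):
    """Return the key from the environment, prompting for it only if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so it is not echoed into the cell output
        key = prompt(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once before constructing the LLM is enough, since the value is stored back into `os.environ`.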

### Setup LLM and Embedding model

We will use Anthropic's `Claude 3 Opus` model as the LLM and the Hugging Face `BAAI/bge-base-en-v1.5` model for embeddings.

In [None]:
from llama_index.llms.anthropic import Anthropic
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

In [None]:
llm = Anthropic(temperature=0.0, model='claude-3-opus-20240229')
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512
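
`Settings.chunk_size` caps how large each indexed chunk may be. To give a feel for what that means, here is a toy sketch that splits text into fixed-size windows with an optional overlap. It counts whitespace-separated words rather than tokenizer tokens and ignores sentence boundaries, so it is only an illustration and not LlamaIndex's own splitter. The function name `chunk_tokens` is hypothetical.

```python
def chunk_tokens(text, chunk_size=512, overlap=0):
    """Split text into windows of at most chunk_size whitespace-separated words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    tokens = text.split()
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_tokens(sample, chunk_size=512, overlap=20)
print(len(chunks), [len(c.split()) for c in chunks])
```

Smaller chunks give more precise retrieval hits but less context per hit; larger chunks do the opposite.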

### Download Data

In [None]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
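
A failed `wget ... -O` can leave an empty file behind, and the later cells would then load it without complaint. As a hedged alternative, the sketch below downloads with the standard library and raises if the request fails or the body is empty. The helper name `download_file` is hypothetical.

```python
import urllib.request
from pathlib import Path


def download_file(url, dest):
    """Download url to dest, raising if the request fails or the body is empty."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # urlopen raises urllib.error.HTTPError for responses such as 404
    with urllib.request.urlopen(url) as response:
        data = response.read()
    if not data:
        raise ValueError(f"empty response from {url}")
    # Write only after a successful, non-empty download
    dest.write_bytes(data)
    return dest
```

It would be called with the essay URL and `'data/paul_graham/paul_graham_essay.txt'` as `dest`; nothing is written to disk unless the download succeeds.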



In [None]:
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
)

### Load Data

In [None]:
documents = SimpleDirectoryReader("./data/paul_graham").load_data()

### Index Data

In [None]:
index = VectorStoreIndex.from_documents(
    documents,
)

### Create Query Engine

In [None]:
query_engine = index.as_query_engine(similarity_top_k=3)
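
`similarity_top_k=3` asks the retriever for the three chunks whose embeddings are most similar to the query embedding. The toy sketch below shows that idea with hand-written 2-D vectors, assuming cosine similarity as the scoring function. The vectors, chunk names, and helper functions are made up for illustration and do not come from the index built above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return the k (chunk_id, score) pairs with the highest similarity."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk_id)
        for chunk_id, vec in chunk_vecs.items()
    ]
    scored.sort(reverse=True)  # highest score first
    return [(chunk_id, score) for score, chunk_id in scored[:k]]


# Hypothetical embeddings; real ones have hundreds of dimensions
chunk_vecs = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.8, 0.6],
    "chunk_c": [0.0, 1.0],
    "chunk_d": [-1.0, 0.0],
}
print(top_k([1.0, 0.0], chunk_vecs, k=3))
```

Only the selected chunks are passed to the LLM as context, so a larger `k` trades longer prompts for a better chance of including the relevant passage.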

### Test Query

In [None]:
response = query_engine.query("What did author do growing up?")

In [None]:
print(response)


Growing up, the author learned to program computers. He started programming on an IBM 1401 when he was 13 years old in 1968. The author found programming addictive and spent a lot of time doing it during his teenage years.
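
To see which chunks the answer was based on, the retrieved nodes can be inspected through `response.source_nodes`. The helper below is a small sketch that only relies on each item exposing `.score` and `.node.get_content()`; the name `summarize_sources` is hypothetical.

```python
def summarize_sources(source_nodes, max_chars=200):
    """Return (score, snippet) pairs for the retrieved nodes."""
    rows = []
    for item in source_nodes:
        text = item.node.get_content()
        # Collapse whitespace so each snippet prints on one line
        snippet = " ".join(text.split())[:max_chars]
        rows.append((item.score, snippet))
    return rows
```

With the query above, `summarize_sources(response.source_nodes)` should return one row per retrieved chunk, three in this notebook given `similarity_top_k=3`.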


## MULTI DOC

In [4]:
!pip show llama_index

In [3]:
from llama_index.llms.anthropic import Anthropic
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

ModuleNotFoundError: No module named 'llama_index'

In [1]:
# NOTE: This is ONLY necessary in a Jupyter notebook.
# Details: Jupyter runs an event loop behind the scenes.
#          Starting another event loop to make async queries would nest the two,
#          which is normally not allowed; nest_asyncio permits it for convenience.
import nest_asyncio

nest_asyncio.apply()

import logging
import sys

# Set up the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # Set logger level to INFO

# Clear out any existing handlers
logger.handlers = []

# Set up the StreamHandler to output to sys.stdout (Colab's output)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)  # Set handler level to INFO

# Add the handler to the logger
logger.addHandler(handler)

from IPython.display import display, HTML
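
The manual logger setup above can likely be shortened. As a sketch, assuming Python 3.8 or newer, `logging.basicConfig` with `force=True` removes any existing root handlers and attaches a single stream handler. One difference: the handler level stays at its default here, and only the root logger is set to `INFO`. The wrapper name `configure_stdout_logging` is hypothetical.

```python
import logging
import sys


def configure_stdout_logging(level=logging.INFO):
    """Reset the root logger so records at `level` and above go to sys.stdout."""
    # force=True discards handlers that were installed earlier (Python 3.8+)
    logging.basicConfig(stream=sys.stdout, level=level, force=True)
    return logging.getLogger()


root = configure_stdout_logging()
root.info("logging configured")
```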

In [2]:
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512

ModuleNotFoundError: No module named 'llama_index'