# RAG Pipeline with LlamaIndex

In this notebook we build a basic RAG (retrieval-augmented generation) pipeline with LlamaIndex. The pipeline has the following steps:

1. Set up LLM and Embedding Model.
2. Download Data.
3. Load Data.
4. Index Data.
5. Create Query Engine.
6. Querying.

### Installation

In [3]:
%pip install llama-index
%pip install llama-index-llms-bedrock
%pip install llama-index-embeddings-bedrock
%pip install llama-index-embeddings-huggingface

[notice] A new release of pip is available: 23.3.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
Collecting llama-index-llms-bedrock
  Using cached llama_index_llms_bedrock-0.1.5-py3-none-any.whl.metadata (687 bytes)
Collecting boto3<2.0.0,>=1.34.26 (from llama-index-llms-bedrock)
  Using cached boto3-1.34.70-py3-none-any.whl.metadata (6.6 kB)
Collecting botocore<1.35.0,>=1.34.70 (from boto3<2.0.0,>=1.34.26->llama-index-llms-bedrock)
  Using cached botocore-1.34.70-py3-none-any.whl.metadata (5.7 kB)
Collecting s3transfer<0.11.0,>=0.10.0 (from boto3<2.0.0,>=1.34.26->llama-index-llms-bedrock)
  Using cached s3transfer-0.10.1-py3-none-any.whl.metadata (1.7 kB)
Using cached llama_index_llms_bedrock-0.1.5-py3-none-any.whl (8.0 kB)
Downloading boto3-1.34.70-py3-none-any.w

### Setup and imports

In [4]:
from llama_index.core import ( 
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage
)
from llama_index.core.settings import Settings
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding, Models

### Setup LLM and Embedding model

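We will use Claude 2 (`anthropic.claude-v2`) as the LLM and Amazon Titan (`amazon.titan-embed-text-v1`) as the embedding model, both served through Amazon Bedrock.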

In [5]:
llm = Bedrock(model="anthropic.claude-v2")
embed_model = BedrockEmbedding(model="amazon.titan-embed-text-v1")

In [6]:
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512
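
`Settings.chunk_size = 512` limits each chunk to roughly 512 tokens when documents are split into nodes for indexing. The sketch below illustrates the general idea of fixed-size chunking with overlap in plain Python. It is not LlamaIndex's splitter: it counts whitespace-separated words rather than tokens and ignores sentence boundaries, and `chunk_words` with its `overlap` parameter is a hypothetical helper introduced only for illustration.

```python
def chunk_words(text, chunk_size=512, overlap=20):
    """Split text into overlapping windows of at most chunk_size words.

    Simplified illustration only: words stand in for tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(chunk)
```

Smaller chunks tend to give more precise retrieval hits but less context per hit, so the chunk size is usually tuned together with `similarity_top_k`.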

### Download Data

In [15]:
%cd ~
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

/root
--2024-03-26 19:27:35--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’


2024-03-26 19:27:35 (49.6 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]



In [16]:
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
)

### Load Data

In [17]:
documents = SimpleDirectoryReader("./data/paul_graham").load_data()

### Index Data

In [18]:
index = VectorStoreIndex.from_documents(
    documents,
)
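
Building the index calls the embedding model for every chunk, so it can be worth saving the result. The setup cell imports `StorageContext` and `load_index_from_storage`, which are not used elsewhere in this notebook. A minimal sketch of how they could be used to persist and reload the index follows. The helper name `load_or_build_index` and the `./storage` directory are assumptions chosen for illustration, and the import is placed inside the function so the definition can be read without the package installed.

```python
import os


def load_or_build_index(documents, persist_dir="./storage"):
    """Reload a persisted index if one exists; otherwise build and save it.

    `persist_dir` is a hypothetical location chosen for this sketch.
    """
    from llama_index.core import (
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if os.path.isdir(persist_dir):
        # Rebuild the storage context from disk and load the saved index.
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)

    # No saved copy yet: embed the documents, then write the index to disk.
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)
    return index
```

With this helper, `index = load_or_build_index(documents)` would replace the cell above. The same `Settings.llm` and `Settings.embed_model` should be configured before reloading, because queries against the reloaded index still need to embed the question.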

### Create Query Engine

In [19]:
query_engine = index.as_query_engine(similarity_top_k=3)
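
`similarity_top_k=3` asks the retriever to pass the three chunks most similar to the query on to the LLM. The sketch below shows the idea behind that ranking using cosine similarity over toy vectors. It is a conceptual illustration, not LlamaIndex's retriever: the function names and the two-dimensional vectors are made up, and real embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return (chunk_index, score) pairs for the k most similar chunks."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(chunk_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


if __name__ == "__main__":
    chunks = [[1, 0], [0, 1], [1, 1], [-1, 0]]
    print(top_k([1, 0], chunks, k=3))
```

A larger `k` gives the LLM more context to draw on, at the cost of a longer prompt and a higher chance of including loosely related chunks.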

### Test Query

In [21]:
response = query_engine.query("What did the author do growing up?")
print(response)

Based on the provided context, it seems the author worked on writing short stories and programming in his youth before college. Specifically, the context mentions:

- The author wrote short stories as a beginning writer, which he says were awful and lacked much plot. 

- The author's first experience with programming was using an early version of Fortran on an IBM 1401 computer that was located in his junior high school basement. He tried writing programs to calculate pi approximations and other things that didn't require input data.

So in summary, the main things the author worked on outside of school before college were writing short stories and learning some basic programming on an early computer.
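
To check which retrieved chunks an answer is based on, the response's `source_nodes` can be inspected. The helper below is a minimal sketch, assuming each item exposes `.score`, `.node.get_content()`, and `.node.metadata` as in `llama_index.core`, and that the reader attached a `file_name` metadata key. `summarize_sources` itself is a hypothetical name introduced here.

```python
def summarize_sources(response, max_chars=80):
    """Collect file name, similarity score, and a short preview per source chunk."""
    rows = []
    for item in response.source_nodes:
        # Flatten newlines so each preview fits on one line.
        text = item.node.get_content().replace("\n", " ")
        rows.append(
            {
                # Assumption: the loader stored the file name in metadata.
                "file": item.node.metadata.get("file_name", "unknown"),
                "score": item.score,
                "preview": text[:max_chars],
            }
        )
    return rows
```

Calling `summarize_sources(response)` after the query above should return one entry per retrieved chunk, three in this notebook given `similarity_top_k=3`.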
