# Retrieval Augmented Generation
Author: Cristian Velandia

Creation Date: 2024-03-02

RAG using the Pinecone vector database, OpenAI GPT-3.5 (ChatGPT), and OpenAI embeddings.

In [1]:
from pinecone import Pinecone, ServerlessSpec
from openai import OpenAI
from tqdm.auto import tqdm

import ast
import pandas as pd
import json

In [2]:
# Load API keys from a local JSON file (the file handle is closed automatically)
with open('personal_creds.json') as f:
    creds = json.load(f)

PINECONE_API_KEY = creds["PINECONE_API_KEY"]
OPENAI_API_KEY = creds["OPENAI_API_KEY"]
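Reading keys from a JSON file works for a local notebook. As a minimal sketch, a loader could prefer environment variables and fall back to the file. The helper name `load_api_key` is hypothetical, and the JSON layout is assumed to match `personal_creds.json` as used above.

```python
import json
import os
from pathlib import Path


def load_api_key(name, creds_path="personal_creds.json"):
    """Return the key from the environment if set, otherwise from the JSON file."""
    value = os.environ.get(name)
    if value:
        return value
    path = Path(creds_path)
    if not path.exists():
        raise KeyError(f"{name} not found in environment and {creds_path} is missing")
    with path.open() as fh:
        creds = json.load(fh)
    if name not in creds:
        raise KeyError(f"{name} not found in environment or {creds_path}")
    return creds[name]
```

With this in place, `PINECONE_API_KEY = load_api_key("PINECONE_API_KEY")` would behave like the cell above when no environment variable is set.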

## Setup Pinecone
### Log into API

In [3]:
pinecone = Pinecone(api_key=PINECONE_API_KEY)

# Define a unique, readable index name
INDEX_NAME = 'aws-docs-vdb-index-openai'

### Create Index for Vector storage

In [4]:
# Delete the index if it already exists
if INDEX_NAME in [index.name for index in pinecone.list_indexes()]:
    pinecone.delete_index(INDEX_NAME)

# Recreate the index; once created, it is visible in the Pinecone console
pinecone.create_index(
    name=INDEX_NAME, dimension=1536, metric='dotproduct',
    spec=ServerlessSpec(cloud='aws', region='us-west-2'),
)
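The index is created with `dimension = 1536`, which matches the length of `text-embedding-ada-002` vectors, and `metric = 'dotproduct'`. To illustrate why the dot product can be a reasonable choice, the sketch below shows that for unit-length vectors it equals cosine similarity. The toy 3-dimensional vectors are made up for the example, and the assumption that the stored embeddings are unit-normalized should be verified against your own data.

```python
import math


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return dot(a, b) / (norm(a) * norm(b))


u = normalize([3.0, 4.0, 0.0])
v = normalize([1.0, 2.0, 2.0])

# For unit-length vectors both scores agree up to floating-point error
print(round(dot(u, v), 6), round(cosine(u, v), 6))
```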

In [5]:
# Create the index object
index = pinecone.Index(INDEX_NAME)

### Load Previously Created Dataset
Load the dataset containing the corpus and its embedding vectors so they can be uploaded (upserted) to Pinecone.

In [6]:
data_path = r"D:\Documents\GitHub\knowledge_pal_assistant\2_outputs\vectors.parquet"
vectors = pd.read_parquet(data_path, engine = "pyarrow")

In [7]:
vectors.head()

Unnamed: 0,id,page_content,metadata,tokens,nostopw_page_content,token_length,index_x,index_y,vect
0,0-0,# AWS::Events::Rule SageMakerPipelineParameter...,{'Header 1': 'AWS::Events::Rule SageMakerPipel...,"[#, AWS, :, :Events, :, :Rule, SageMakerPipeli...",# AWS : :Events : :Rule SageMakerPipelineParam...,39,0,0-0,"[-0.036196205765008926, 0.024841105565428734, ..."
1,0-1,"## Syntax<a name=""aws-properties-events-rule-s...",{'Header 1': 'AWS::Events::Rule SageMakerPipel...,"[#, #, Syntax, <, a, name=, '', aws-properties...",# # Syntax < name= '' aws-properties-events-ru...,114,1,0-1,"[-0.03795338422060013, 0.021498069167137146, 0..."
2,0-2,"## Properties<a name=""aws-properties-events-ru...",{'Header 1': 'AWS::Events::Rule SageMakerPipel...,"[#, #, Properties, <, a, name=, '', aws-proper...",# # Properties < name= '' aws-properties-event...,164,2,0-2,"[-0.03936085104942322, 0.019204849377274513, 0..."
3,1-0,# Automating Amazon SageMaker with Amazon Even...,{'Header 1': 'Automating Amazon SageMaker with...,"[#, Automating, Amazon, SageMaker, with, Amazo...",# Automating Amazon SageMaker Amazon EventBrid...,315,3,1-0,"[-0.0347835049033165, -0.01207298319786787, -0..."
4,1-1,"## Training job state change<a name=""eventbrid...",{'Header 1': 'Automating Amazon SageMaker with...,"[#, #, Training, job, state, change, <, a, nam...",# # Training job state change < name= '' event...,445,4,1-1,"[-0.035871874541044235, 0.0021414232905954123,..."


In [8]:
len(ast.literal_eval(vectors["vect"][0]))

1536
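The `vect` column stores each embedding as a string, which is why `ast.literal_eval` is applied before taking its length. A small validation pass along these lines could catch malformed rows before upload. `parse_vector` and `EXPECTED_DIM` are hypothetical names, and the sample string is made up.

```python
import ast

EXPECTED_DIM = 1536  # assumed to match the dimension of the Pinecone index


def parse_vector(raw, expected_dim=EXPECTED_DIM):
    """Parse a string-encoded list of numbers and check its length."""
    values = ast.literal_eval(raw)
    if not isinstance(values, list):
        raise ValueError("vector is not a list")
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} values, got {len(values)}")
    return [float(v) for v in values]


# Toy example with a 3-dimensional vector
print(parse_vector("[0.1, -0.2, 0.3]", expected_dim=3))
```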

In [9]:
# Clean the metadata: drop the header fields
def pop_keys(d, key):
    """Return a copy of d without key (no error if the key is absent)."""
    tmp = d.copy()
    tmp.pop(key, None)
    return tmp

vectors["metadata"] = vectors["metadata"].apply(pop_keys, args=("Header 1",))
vectors["metadata"] = vectors["metadata"].apply(pop_keys, args=("Header 2",))

### Upsert embeddings to Pinecone 

In [10]:
# Accumulate records and upload them in batches
prepped = []

# Iterate through the data
for i, row in tqdm(vectors.iterrows(), total=vectors.shape[0]):
    prepped.append({'id': row['id'], 'values': ast.literal_eval(row['vect']), 'metadata': row['metadata']})

    if len(prepped) >= 250:
        index.upsert(prepped)  # Upsert a batch of 250 vectors
        prepped = []

# Upsert any remaining vectors that did not fill a complete batch
if prepped:
    index.upsert(prepped)


  0%|          | 0/500 [00:00<?, ?it/s]
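The loop above sends vectors to Pinecone in batches of 250. The batching logic can be sketched independently of Pinecone as a generator that also yields the final, partially filled batch. `chunked` is a hypothetical helper, not part of the Pinecone client.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items, including the last partial one."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush whatever is left after the loop
        yield batch


# 7 records in batches of 3 give batch sizes 3, 3 and 1
sizes = [len(b) for b in chunked(range(7), 3)]
print(sizes)
```

With such a helper, the upload reduces to calling `index.upsert(batch)` once per yielded batch.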

In [13]:
# Describe the uploaded index
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 500}},
 'total_vector_count': 500}

### Augment GPT-3.5 Queries
This section connects to the OpenAI API for both the embeddings and the LLM, then builds and tests the prompts. It exercises the full RAG flow using the Pinecone and OpenAI APIs.
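Conceptually, the retrieval step scores every stored vector against the query embedding and keeps the best matches. The sketch below mimics that with an in-memory list and a plain dot product. It is an illustration only, not how Pinecone implements search, and the ids, vectors, and texts are invented.

```python
import heapq


def top_k(query_vec, records, k=2):
    """Return the k records with the highest dot-product score against query_vec."""
    def score(record):
        return sum(q * v for q, v in zip(query_vec, record["values"]))

    best = heapq.nlargest(k, records, key=score)
    return [{"id": r["id"], "score": score(r), "metadata": r["metadata"]} for r in best]


records = [
    {"id": "0-0", "values": [1.0, 0.0], "metadata": {"text": "about encryption"}},
    {"id": "0-1", "values": [0.0, 1.0], "metadata": {"text": "about pricing"}},
    {"id": "1-0", "values": [0.7, 0.7], "metadata": {"text": "about endpoints"}},
]

# Keep the two best-scoring records and collect their text as context
matches = top_k([0.9, 0.1], records, k=2)
context = [m["metadata"]["text"] for m in matches]
print(context)
```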

In [14]:
openai_client = OpenAI(api_key=OPENAI_API_KEY)

In [22]:
# First step: retrieve the relevant information from the vector DB. The query below stands in for the question a user would ask.
query = "How to check if an endpoint is KMS encrypted?"

embeddings = openai_client.embeddings.create(input = [query], model = "text-embedding-ada-002")


In [24]:
res = index.query(vector = embeddings.data[0].embedding, top_k = 5, include_metadata = True)

context = [r['metadata']['text'] for r in res['matches']]
print('\n'.join(context)) #Visualize output

## Properties<a name="aws-properties-sagemaker-modelcard-securityconfig-properties"></a>  
`KmsKeyId`  <a name="cfn-sagemaker-modelcard-securityconfig-kmskeyid"></a>
A AWS Key Management Service [key ID](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#key-id-key-id) used to encrypt a model card\.
*Required*: No
*Type*: String
*Update requires*: [Replacement](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-replacement)
 'data source =  D:\Documents\GitHub\knowledge_pal_assistant\0_data\aws-properties-sagemaker-modelcard-securityconfig.md'
## Syntax<a name="aws-properties-sagemaker-modelcard-securityconfig-syntax"></a>  
To declare this entity in your AWS CloudFormation template, use the following syntax:  
### JSON<a name="aws-properties-sagemaker-modelcard-securityconfig-syntax.json"></a>  
```
{
"[KmsKeyId](#cfn-sagemaker-modelcard-securityconfig-kmskeyid)" : String
}
```  
### YAML<a name="aws-p

In [25]:
message = [{
    "role" : "system",
    "content" : "You are Knowledge pal, an expert developer assistant. Your goal is to answer questions about cloud services, coding in different languages, and provide a detailed response everytime. You will provide the source of the answer you are giving and 3 more related sources, those sources can be a path or URL. Answer combining your knowledge with the context provided. If information is not clear or the context is not enough to give an answer, tell the user that you don't have the answer. \n Context:\n {0} \n\n Question: {1}\n Answer: ".format("\n\n---\n\n".join(context), query)
}]
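The cell above packs the instructions, the context, and the question into a single system message. One alternative, shown here as a sketch rather than the notebook's method, keeps the instructions in the system message and moves the context and question into a user message. `build_messages` is a hypothetical helper, and the shortened instruction text is a placeholder.

```python
def build_messages(context_chunks, question, instructions):
    """Assemble a chat payload: instructions as system, context and question as user."""
    context_block = "\n\n---\n\n".join(context_chunks)
    user_content = f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"
    return [
        {"role": "system", "content": instructions},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    ["chunk one", "chunk two"],
    "How to check if an endpoint is KMS encrypted?",
    "You are Knowledge pal, an expert developer assistant.",
)
print(messages[1]["content"])
```

Separating the roles keeps the retrieved text apart from the instructions; whether it improves answers for this dataset would need to be tested.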

In [26]:
knowledge_pal_response = openai_client.chat.completions.create(
    model = "gpt-3.5-turbo",
    messages = message,
    temperature = 0,
    max_tokens = 636,
    top_p = 1,
    frequency_penalty = 0,
    presence_penalty = 0,
    stop = None
)


In [27]:
print('-' * 80)
print(knowledge_pal_response.choices[0].message.content)

--------------------------------------------------------------------------------
To check if an endpoint is KMS encrypted in AWS SageMaker, you can inspect the `KmsKeyId` property associated with the endpoint. If the `KmsKeyId` property is specified with a valid AWS Key Management Service (KMS) key ID, it indicates that the endpoint is encrypted using that specific KMS key.

Here is the syntax to declare the `KmsKeyId` property in an AWS CloudFormation template:

### JSON
```json
{
  "KmsKeyId": "your_KMS_key_ID_here"
}
```

### YAML
```yaml
KmsKeyId: your_KMS_key_ID_here
```

By providing a valid KMS key ID in the `KmsKeyId` property, you ensure that the endpoint is encrypted using that specific KMS key.

Sources:
1. [AWS CloudFormation Documentation - Update Behaviors](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-update-behaviors.html#update-replacement)
2. [AWS Key Management Service (KMS) Concepts](https://docs.aws.amazon.com/kms/latest/d