# Semantic Cache

This is an example of a semantic cache. A semantic cache uses semantic search to match an incoming key against the stored keys and return the associated value. Tools such as RAG can use it to cache answers and return the cached answer for all variations of the same question.

Steps:

1. Setup: embed the questions and store the Q&A pairs
2. Retrieval (R): embed the question asked and do a similarity search
3. Augment (A): optionally rerank and add to the response
4. Generate (G): ask the LLM to answer the question based on the retrieved chunks
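
The setup and retrieval steps above can be sketched without a database or an embedding service. This is a minimal illustration only: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `ToySemanticCache` is an in-memory stand-in for the vector table used later in this notebook.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    words = text.lower().replace("?", "").split()
    return Counter(words)


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class ToySemanticCache:
    def __init__(self):
        self.entries = []  # list of (question, answer, vector)

    def put(self, question, answer):
        # Setup: embed the question and store it next to its answer
        self.entries.append((question, answer, toy_embed(question)))

    def nearest(self, question):
        # Retrieval: embed the incoming question and find the closest stored one
        if not self.entries:
            return None
        qvec = toy_embed(question)
        best = min(self.entries, key=lambda e: cosine_distance(e[2], qvec))
        return best[0], best[1], cosine_distance(best[2], qvec)


cache = ToySemanticCache()
cache.put("What is the largest continent by land area?", "Asia")
cache.put("What river is the longest in the world?", "The Nile")

# A reworded question still lands on the stored river entry
question, answer, distance = cache.nearest("Which is the longest river in the world?")
print(answer, round(distance, 3))
```

A real embedding model captures meaning rather than word overlap, so it would also match paraphrases that share few words with the stored question; the toy vectors here only make the flow visible.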

Questions? Use the #generative-ai-users or #igiu-innovation-lab Slack channel.


### Set up variables

In [1]:
from oci.generative_ai_inference import GenerativeAiInferenceClient
from oci.generative_ai_inference.models import OnDemandServingMode, EmbedTextDetails,CohereChatRequest, ChatDetails
import oracledb
import array
import oci
import os,json
from dotenv import load_dotenv
from envyaml import EnvYAML


#####
#make sure your sandbox.yaml and .env files are set up for your environment. You might have to specify the full path depending on your `cwd`
#####
SANDBOX_CONFIG_FILE = "sandbox.yaml"
load_dotenv()

LLM_MODEL = "cohere.command-a-03-2025" 

EMBED_MODEL = "cohere.embed-multilingual-v3.0"
llm_service_endpoint= "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"

tablename_prefix = None
compartmentId = None


## chunks we want to query against

In [2]:

# The input text to vectorize
qa_pairs = [
    {
        "question": "What is the largest continent by land area?",
        "answer": "Asia is the largest continent, covering about 30% of Earth's land area. It is home to diverse cultures, languages, and ecosystems."
    },
    {
        "question": "Which country has the longest coastline in the world?",
        "answer": "Canada has the longest coastline, stretching over 202,080 kilometers. Its vast coastlines are along the Atlantic, Pacific, and Arctic Oceans."
    },
    {
        "question": "What river is the longest in the world?",
        "answer": "The Nile River is traditionally considered the longest river, flowing over 6,650 kilometers through northeastern Africa. It passes through countries like Egypt and Sudan."
    },
    {
        "question": "Which desert is the largest hot desert in the world?",
        "answer": "The Sahara Desert is the largest hot desert, covering approximately 9.2 million square kilometers. It spans across North Africa from the Atlantic Ocean to the Red Sea."
    },
    {
        "question": "What is the smallest country in the world by land area?",
        "answer": "Vatican City is the smallest country, with an area of just 44 hectares (110 acres). It serves as the spiritual and administrative center of the Roman Catholic Church."
    },
    {
        "question": "Which mountain is the highest in the world above sea level?",
        "answer": "Mount Everest is the highest mountain above sea level, standing at 8,848 meters (29,029 feet). It is part of the Himalayas and located on the border between Nepal and China."
    },
    {
        "question": "What ocean is the deepest in the world?",
        "answer": "The Pacific Ocean is the deepest ocean, with an average depth of about 4,280 meters. The Mariana Trench within it reaches depths of over 10,900 meters."
    },
    {
        "question": "Which two continents are entirely located in the Southern Hemisphere?",
        "answer": "Australia and Antarctica are entirely located in the Southern Hemisphere. Both continents have unique ecosystems and climates."
    },
    {
        "question": "What country has the most time zones?",
        "answer": "France has the most time zones when including its overseas territories. In total, it spans 12 different time zones across various regions worldwide."
    },
    {
        "question": "Which lake is considered the world's largest by surface area?",
        "answer": "Lake Superior, part of North America's Great Lakes, is often considered the largest freshwater lake by surface area. It covers approximately 82,100 square kilometers."
    }
]


## open database connection

In [3]:

scfg = EnvYAML(SANDBOX_CONFIG_FILE)
# this notebook reads the "oci" and "db" sections of the sandbox configuration
if scfg is None or "oci" not in scfg or "db" not in scfg:
    raise RuntimeError("Invalid sandbox configuration.")

config = oci.config.from_file(os.path.expanduser(scfg["oci"]["configFile"]), scfg["oci"]["profile"])
compartmentId = scfg["oci"]["compartment"]
tablename_prefix = scfg["db"]["tablePrefix"]
wallet = os.path.expanduser(scfg["db"]["walletPath"])  # expand "~" so the driver gets a real path

db = oracledb.connect(
    config_dir=wallet,
    user=scfg["db"]["username"],
    password=scfg["db"]["password"],
    dsn=scfg["db"]["dsn"],
    wallet_location=wallet,
    wallet_password=scfg["db"]["walletPass"],
)
cursor = db.cursor()

## create tables 

In [4]:
sql = [
	f"""drop table if exists {tablename_prefix}_semantic_cache purge"""	,
 
 
	f"""
		create table {tablename_prefix}_semantic_cache (
		id number,
		question varchar2(4000),
		answer varchar2(4000),
		embedding vector,
		primary key (id)
	)"""
	]
 
for s in sql : 
	cursor.execute(s)

### set up LLM client 

In [5]:

# create an LLM client
llm_client = GenerativeAiInferenceClient(
				config=config, 
				service_endpoint=llm_service_endpoint, 
				retry_strategy=oci.retry.NoneRetryStrategy(),
				timeout=(10,240))	

## Create embeddings

In [6]:
embed_text_detail = EmbedTextDetails()
embed_text_detail.serving_mode = OnDemandServingMode(model_id="cohere.embed-english-v3.0")
embed_text_detail.truncate = embed_text_detail.TRUNCATE_END
embed_text_detail.input_type = embed_text_detail.INPUT_TYPE_SEARCH_DOCUMENT
embed_text_detail.compartment_id = compartmentId
embed_text_detail.inputs = [pair["question"] for pair in qa_pairs] 

response = llm_client.embed_text(embed_text_detail)
embeddings = response.data.embeddings


## insert embedding in database

In [7]:
for i in range(len(embeddings)):
    cursor.execute(f"insert into {tablename_prefix}_semantic_cache values (:1, :2, :3, :4)", 
                   [i,qa_pairs[i]['question'],qa_pairs[i]['answer'], array.array("f",embeddings[i])])
    print(f"inserted {i}-{qa_pairs[i]}")

print("committing")
db.commit()

inserted 0-{'question': 'What is the largest continent by land area?', 'answer': "Asia is the largest continent, covering about 30% of Earth's land area. It is home to diverse cultures, languages, and ecosystems."}
inserted 1-{'question': 'Which country has the longest coastline in the world?', 'answer': 'Canada has the longest coastline, stretching over 202,080 kilometers. Its vast coastlines are along the Atlantic, Pacific, and Arctic Oceans.'}
inserted 2-{'question': 'What river is the longest in the world?', 'answer': 'The Nile River is traditionally considered the longest river, flowing over 6,650 kilometers through northeastern Africa. It passes through countries like Egypt and Sudan.'}
inserted 3-{'question': 'Which desert is the largest hot desert in the world?', 'answer': 'The Sahara Desert is the largest hot desert, covering approximately 9.2 million square kilometers. It spans across North Africa from the Atlantic Ocean to the Red Sea.'}
inserted 4-{'question': 'What is the 
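
The insert cell above wraps each embedding in `array.array("f", ...)` before binding it. A minimal sketch of what that conversion does, using a made-up three-element vector (real embeddings have many more dimensions): type code `"f"` stores each value as a 32-bit float, which halves the storage compared with `"d"` at the cost of a small rounding error. How the driver maps the type code to the vector column's storage format is not shown here and should be checked against the python-oracledb documentation.

```python
import array

embedding = [0.1, -0.25, 0.5]  # made-up values for illustration

vec32 = array.array("f", embedding)  # 32-bit floats, as used in the insert cell
vec64 = array.array("d", embedding)  # 64-bit floats, for comparison

print(vec32.typecode, vec32.itemsize)  # bytes per element
print(vec64.typecode, vec64.itemsize)

# 0.1 is not exactly representable, so the 32-bit copy differs slightly
print(abs(vec32[0] - 0.1))
print(abs(vec64[0] - 0.1))
```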

## read the table 

In [8]:
cursor.execute(f"select id,question,answer from {tablename_prefix}_semantic_cache")
for row in cursor:
	print(f"{row[0]}:{row[1]}:{[row[2]]}")

6:What ocean is the deepest in the world?:['The Pacific Ocean is the deepest ocean, with an average depth of about 4,280 meters. The Mariana Trench within it reaches depths of over 10,900 meters.']
5:Which mountain is the highest in the world above sea level?:['Mount Everest is the highest mountain above sea level, standing at 8,848 meters (29,029 feet). It is part of the Himalayas and located on the border between Nepal and China.']
2:What river is the longest in the world?:['The Nile River is traditionally considered the longest river, flowing over 6,650 kilometers through northeastern Africa. It passes through countries like Egypt and Sudan.']
4:What is the smallest country in the world by land area?:['Vatican City is the smallest country, with an area of just 44 hectares (110 acres). It serves as the spiritual and administrative center of the Roman Catholic Church.']
1:Which country has the longest coastline in the world?:['Canada has the longest coastline, stretching over 202,080 

## Ask a question

Ask questions similar to the questions above. Change the wording and see if the semantic cache returns the right answer.

In [9]:
query = input("Ask a question: ").strip().lower()
q = [query]  # the embed call expects a list of inputs



## embed the query

We need to do the "R" part of RAG: retrieve. We retrieve in the following steps:
1. embed the query text
2. do a similarity search to find the text similar to it
3. optionally rerank it
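
The similarity search in the steps above ranks stored vectors by a distance metric. As a rough sketch in plain Python, with made-up 3-dimensional vectors, the following computes cosine distance, negated dot product, and Euclidean distance by hand. The helper names are illustrative only; in this notebook the database performs the distance calculation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    # 0 = same direction, 1 = orthogonal, 2 = opposite; ignores vector length
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    # straight-line distance; sensitive to vector length
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negated_dot(a, b):
    # negated so that, like the other two, smaller means more similar
    return -dot(a, b)


query = [1.0, 0.0, 0.0]
stored = {
    "same direction": [2.0, 0.0, 0.0],
    "orthogonal": [0.0, 3.0, 0.0],
    "opposite": [-1.0, 0.0, 0.0],
}

for name, vec in stored.items():
    print(name,
          cosine_distance(query, vec),
          euclidean_distance(query, vec),
          negated_dot(query, vec))
```

Note that the "same direction" vector has a cosine distance of zero even though it is twice as long as the query, while its Euclidean distance is not zero. Which metric suits a given embedding model is worth checking against that model's documentation.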

In [10]:

# embed

embed_text_detail.inputs = q
embed_text_detail.input_type = EmbedTextDetails.INPUT_TYPE_SEARCH_QUERY

response = llm_client.embed_text(embed_text_detail)
vec = array.array("f",response.data.embeddings[0])




In [11]:
# similarity search of embedded text

# There are multiple distance metrics: COSINE, DOT, EUCLIDEAN; try different ones
# Try adding a constraint on the distance (e.g. < 0.5); the threshold needs to be tuned based on your data
cursor.execute(f"""
	select id,question,answer, vector_distance(embedding, :1, COSINE) d 
	from {tablename_prefix}_semantic_cache
	order by d
	fetch first 10 rows only
	""", [vec])

rows =[]
for row in cursor:
	r = [row[0], row[1], row[2], row[3]]
	print(r)
	rows.append(r)


[0, 'What is the largest continent by land area?', "Asia is the largest continent, covering about 30% of Earth's land area. It is home to diverse cultures, languages, and ecosystems.", 0.6209977709219853]
[7, 'Which two continents are entirely located in the Southern Hemisphere?', 'Australia and Antarctica are entirely located in the Southern Hemisphere. Both continents have unique ecosystems and climates.', 0.6243235005566692]
[8, 'What country has the most time zones?', 'France has the most time zones when including its overseas territories. In total, it spans 12 different time zones across various regions worldwide.', 0.7809420502825789]
[6, 'What ocean is the deepest in the world?', 'The Pacific Ocean is the deepest ocean, with an average depth of about 4,280 meters. The Mariana Trench within it reaches depths of over 10,900 meters.', 0.7925850409935986]
[5, 'Which mountain is the highest in the world above sea level?', 'Mount Everest is the highest mountain above sea level, standi

### optionally rerank

In [15]:
# look at cohere reranking example 

## print the response 

In [12]:

print("**************************Chat Result**************************")
print (f"Answer is {rows[0][2]}")
print ("\n other answers:\n")
for chunk in rows[0:3]: 
	print(f"{chunk[3]}:{chunk[1]}")

**************************Chat Result**************************
Answer is Asia is the largest continent, covering about 30% of Earth's land area. It is home to diverse cultures, languages, and ecosystems.

 other answers:

0.6209977709219853:What is the largest continent by land area?
0.6243235005566692:Which two continents are entirely located in the Southern Hemisphere?
0.7809420502825789:What country has the most time zones?
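
The cell above always returns the closest row, even though its distance in this run is about 0.62. A sketch of gating the result on a distance threshold follows. The 0.5 cut-off is the starting value mentioned in the search cell's comment and is an assumption to tune against your own data, and `pick_cached_answer` is a hypothetical helper name. The rows are shortened copies of the search output shown earlier.

```python
def pick_cached_answer(rows, max_distance=0.5):
    """Return the closest answer if it is near enough, else None (a cache miss).

    rows: lists of [id, question, answer, distance], sorted by ascending distance.
    """
    if not rows:
        return None
    best = rows[0]
    return best[2] if best[3] < max_distance else None


rows = [
    [0, "What is the largest continent by land area?",
     "Asia is the largest continent, covering about 30% of Earth's land area.",
     0.6209977709219853],
    [7, "Which two continents are entirely located in the Southern Hemisphere?",
     "Australia and Antarctica are entirely located in the Southern Hemisphere.",
     0.6243235005566692],
]

# With the default threshold the best row (distance ~0.62) is treated as a miss
print(pick_cached_answer(rows))

# A looser threshold accepts it
print(pick_cached_answer(rows, max_distance=0.7))
```

A threshold that is too loose returns wrong cached answers; one that is too tight sends nearly every question to the LLM, so it is worth checking against a handful of known paraphrases and unrelated questions.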


## close the database connections

In [13]:
cursor.close()
db.close()


## Exercise

Add a semantic cache to your RAG agent and see the performance difference.
1. Check whether the question has already been answered
   * decide on the threshold for the similarity distance
2. If the distance is under the threshold, return the matching answer
3. If not, ask the agent the question
   * store the answer
4. Try asking a differently worded question and see if it hits the cache
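
One possible shape for the exercise, sketched with in-memory stand-ins so that it runs on its own. Everything here is hypothetical scaffolding: `lookup` and `store` stand for the vector search and insert shown earlier in this notebook, `ask_agent` stands for your RAG agent, and the normalized-string dictionary below only imitates a semantic lookup.

```python
def answer_with_cache(question, lookup, store, ask_agent, max_distance=0.5):
    """Cache-aside flow: return (answer, from_cache)."""
    hit = lookup(question)  # expected to return (answer, distance) or None
    if hit is not None and hit[1] < max_distance:
        return hit[0], True
    answer = ask_agent(question)  # cache miss: fall back to the agent
    store(question, answer)       # remember the answer for next time
    return answer, False


# --- in-memory stand-ins for demonstration only ---
_cache = {}
agent_calls = []


def lookup(question):
    key = question.strip().lower()
    return (_cache[key], 0.0) if key in _cache else None


def store(question, answer):
    _cache[question.strip().lower()] = answer


def ask_agent(question):
    agent_calls.append(question)
    return f"(agent answer to: {question})"


first = answer_with_cache("What river is the longest?", lookup, store, ask_agent)
second = answer_with_cache("what river is the longest?  ", lookup, store, ask_agent)
print(first[1], second[1], len(agent_calls))
```

Timing the two branches, for example with `time.perf_counter()`, is one way to see the performance difference the exercise asks about.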