## Introduction to Azure Redis Cache for Semantic Search Capabilities 🧠🔎

Unlock the power of **semantic caching** by integrating Azure Redis Cache with Azure OpenAI models! In this lab, you'll set up a semantic cache for LLM responses, so that repeated or semantically similar prompts can be answered from Redis instead of calling the model again, reducing both latency and cost.

---

![Azure Redis Cache Semantic Search](redis-semantic-caching-scaled.jpg)

---

### 🚀 Install Required Packages

Let's start by installing all the Python packages needed for this lab:  
- `openai` for LLM access  
- `langchain` for LLM orchestration  
- `langchain-openai` for the Azure OpenAI chat and embeddings classes  
- `langchain-community` for the `RedisSemanticCache` integration  
- `redis` (pinned to 4.5.5) for cache connectivity  
- `tiktoken` for tokenization  
- `python-dotenv` for environment variable management

---

In [None]:
%pip install openai langchain langchain-openai langchain-community redis==4.5.5 tiktoken python-dotenv

### 🛠️ Load Environment & Initialize Variables

Load your environment variables and set up all the configuration needed to connect to Azure OpenAI and Redis.  
This includes API keys, deployment names, and Redis connection info.

---


In [None]:
# Standard library
import os
import time

# Third-party packages; the LangChain classes are imported in the cells that use them
import openai
import redis
from dotenv import load_dotenv

print("Loading environment variables...")
load_dotenv()
api_version = "2023-05-15"
azure_openai_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
azure_openai_api_key = os.getenv("AZURE_OPENAI_API_KEY")
llm_deployment_name = os.getenv("LLM_DEPLOYMENT_NAME")
llm_model_name = os.getenv("LLM_MODEL_NAME")
embeddings_deployment_name = os.getenv("EMBEDDINGS_DEPLOYMENT_NAME")
embeddings_model_name = os.getenv("EMBEDDINGS_MODEL_NAME")

print(f"Azure OpenAI Endpoint: {azure_openai_endpoint}")
# Never print the key itself; only confirm that it was loaded
print(f"Azure OpenAI API key loaded: {bool(azure_openai_api_key)}")
print(f"LLM Deployment Name: {llm_deployment_name}")
print(f"Embeddings Deployment Name: {embeddings_deployment_name}")

redis_endpoint = os.getenv("REDIS_ENDPOINT")
redis_password = os.getenv("REDIS_PASSWORD")
print(f"Redis Endpoint: {redis_endpoint}")
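
Because `os.getenv` silently returns `None` for anything missing from your `.env` file, a quick check at this point can save a confusing error later. The following is a minimal sketch, not part of the original lab: `find_missing_vars` and `mask_secret` are hypothetical helper names, and the variable list simply mirrors the `os.getenv` calls above.

```python
import os

# Names taken from the os.getenv calls in this lab.
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "LLM_DEPLOYMENT_NAME",
    "LLM_MODEL_NAME",
    "EMBEDDINGS_DEPLOYMENT_NAME",
    "EMBEDDINGS_MODEL_NAME",
    "REDIS_ENDPOINT",
    "REDIS_PASSWORD",
]


def find_missing_vars(names, env=None):
    """Return the names that are unset or empty in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


def mask_secret(value, visible=4):
    """Hide all but the last few characters so a secret can be logged safely."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


missing = find_missing_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

If you want to confirm which key was picked up, `mask_secret(os.getenv("AZURE_OPENAI_API_KEY"))` shows only its last four characters.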

### 🤖 Initialize LLM & Embeddings

Create your Azure OpenAI LLM and Embeddings objects using the loaded configuration.  
These will be used for generating responses and semantic similarity calculations.

---

In [None]:
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings

llm = AzureChatOpenAI(
    openai_api_key=azure_openai_api_key,
    azure_endpoint=azure_openai_endpoint,
    api_version=api_version,
    azure_deployment=llm_deployment_name,
    model=llm_model_name,
)

embeddings = AzureOpenAIEmbeddings(
    openai_api_key=azure_openai_api_key,
    azure_endpoint=azure_openai_endpoint,
    api_version=api_version,
    azure_deployment=embeddings_deployment_name,
    model=embeddings_model_name,
)


### 🗄️ Connect to Redis & Set Up Semantic Cache

Build the Redis connection URL and configure LangChain to use Redis as a semantic cache.  
This enables fast retrieval of similar responses and reduces redundant LLM calls.
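
One detail to watch when building the URL: Azure Redis access keys can contain characters such as `/`, `+`, and `=`, which may confuse URL parsing if they are pasted in raw. The sketch below percent-encodes the password first. `build_redis_url` is a hypothetical helper, not part of the lab, and it assumes the host and port are supplied separately, with 6380 as the TLS port typically used by Azure Cache for Redis.

```python
from urllib.parse import quote


def build_redis_url(host, password, port=6380, use_tls=True):
    """Build a redis:// or rediss:// URL with a percent-encoded password."""
    scheme = "rediss" if use_tls else "redis"
    # safe="" makes quote() encode "/" as well, which it otherwise leaves alone
    encoded_password = quote(password, safe="")
    return f"{scheme}://:{encoded_password}@{host}:{port}"


# Placeholder host and key, for illustration only
print(build_redis_url("example.redis.cache.windows.net", "a/b+c="))
```

To my knowledge, redis-py decodes the password again when it parses the URL, but verify this against the version you have installed.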

---

In [None]:
from langchain.globals import set_llm_cache
from langchain_community.cache import RedisSemanticCache

# Example for Azure Redis with TLS enabled (the "rediss://" scheme).
# REDIS_ENDPOINT is expected in host:port form; the TLS port is typically 6380.
redis_url = f"rediss://:{redis_password}@{redis_endpoint}"

semantic_cache = RedisSemanticCache(
    redis_url=redis_url,
    embedding=embeddings,
    score_threshold=0.05,
)

set_llm_cache(semantic_cache)
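
To build intuition for what `score_threshold` controls, here is a toy in-memory version of the idea. It is only a sketch under stated assumptions: it assumes the threshold acts as a maximum cosine distance (lower means stricter), it uses hand-written 2-D vectors instead of real embeddings, and `ToySemanticCache` is a hypothetical class, not the `RedisSemanticCache` implementation.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


class ToySemanticCache:
    """Stores (vector, response) pairs and returns the closest one within a threshold."""

    def __init__(self, score_threshold):
        self.score_threshold = score_threshold
        self.entries = []  # list of (vector, response) tuples

    def update(self, vector, response):
        self.entries.append((vector, response))

    def lookup(self, vector):
        best = None  # (distance, response) of the closest qualifying entry
        for stored_vector, response in self.entries:
            distance = cosine_distance(vector, stored_vector)
            if distance <= self.score_threshold and (best is None or distance < best[0]):
                best = (distance, response)
        return None if best is None else best[1]


cache = ToySemanticCache(score_threshold=0.05)
cache.update([1.0, 0.0], "kitten poem")

near_hit = cache.lookup([0.99, 0.05])  # almost the same direction as the stored vector
far_miss = cache.lookup([0.6, 0.8])    # cosine distance 0.4, well above the threshold
print(near_hit, far_miss)
```

A small threshold such as 0.05 only matches prompts whose embeddings are nearly identical. Raising it makes the cache more forgiving, at the risk of returning an answer to a different question.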


### 📝 Generate a Poem About Cute Kittens (First Query)

Let's generate a poem about cute kittens!  
This first call goes to the LLM and stores the result in the semantic cache.
When you rerun the same prompt, or send one that is close enough to it, the response should come back noticeably faster because it is served from Redis.

---

In [None]:
%%time
# Chat models take a plain string through invoke() and return a message object
response = llm.invoke("Please write a poem about cute kittens.")
print(response.content)

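### 📝 Generate a Poem About Cute Puppies (Second Query)

Now, generate a poem about cute puppies.  
This is a new prompt, but it is semantically similar to the first one. If its embedding falls within `score_threshold` of the cached kitten prompt, the response is served from the cache; otherwise it goes to the LLM and is cached as a new entry.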

---

In [None]:
%%time
response = llm.invoke("Please write a poem about cute puppies.")
print(response.content)

In [None]:
%%time
response = llm.invoke("Please write a poem about pets")
print(response.content)
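
The `%%time` magic reports each cell separately. If you would rather collect the timings and compare them side by side, something like the following could work. It is a small sketch with hypothetical helpers (`timed_call`, `summarize`), and `fake_llm` is a stand-in so the block runs without any Azure resources.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def summarize(timings):
    """Format (label, seconds) pairs, comparing each against the first entry."""
    baseline = timings[0][1]
    rows = []
    for label, seconds in timings:
        speedup = baseline / seconds if seconds > 0 else float("inf")
        rows.append(f"{label}: {seconds:.3f}s ({speedup:.1f}x vs first call)")
    return rows


def fake_llm(prompt):
    """Stand-in for llm.invoke so the example is self-contained."""
    time.sleep(0.01)
    return f"echo: {prompt}"


result, elapsed = timed_call(fake_llm, "hello")
for row in summarize([("first call", elapsed)]):
    print(row)
```

In the notebook, you would pass `llm.invoke` instead of `fake_llm`, for example `timed_call(llm.invoke, "Please write a poem about cute kittens.")`, and append each `(label, seconds)` pair to a list before calling `summarize`.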