Semantic Cache Documentation: https://python.langchain.com/docs/integrations/llms/llm_caching#redis-cache
LLM Documentation: https://python.langchain.com/docs/integrations/llms/azure_openai


In [19]:
%pip install openai langchain redis tiktoken

Collecting tiktoken
  Using cached tiktoken-0.5.1-cp39-cp39-win_amd64.whl.metadata (6.8 kB)
Collecting regex>=2022.1.18 (from tiktoken)
  Using cached regex-2023.10.3-cp39-cp39-win_amd64.whl.metadata (41 kB)
Using cached tiktoken-0.5.1-cp39-cp39-win_amd64.whl (760 kB)
Using cached regex-2023.10.3-cp39-cp39-win_amd64.whl (269 kB)
Installing collected packages: regex, tiktoken
Successfully installed regex-2023.10.3 tiktoken-0.5.1
Note: you may need to restart the kernel to use updated packages.


In [94]:
import openai
import redis
import os
import langchain
from langchain.llms import AzureOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.globals import set_llm_cache
from langchain.cache import RedisSemanticCache
import time


openai_api_base="<your-azure-open-ai-endpoint>"
openai_api_type="azure" # required when you use the Azure OpenAI service
openai_api_key="<your-azure-openai-key>"
openai_api_version="2023-05-15"

# Make sure you have an LLM deployed in your Azure OpenAI account. This example uses the GPT-3.5 Turbo Instruct model, in a deployment named "gpt35instruct".
llm = AzureOpenAI(
    deployment_name="gpt35instruct",
    model_name="gpt-35-turbo-instruct",
    openai_api_key=openai_api_key,
    openai_api_base=openai_api_base,
    openai_api_version=openai_api_version,
)
# Make sure you have an embeddings model deployed in your Azure OpenAI account. This example uses the text-embedding-ada-002 model, in a deployment named "textembedding".
embeddings = OpenAIEmbeddings(
    deployment="textembedding",
    model="text-embedding-ada-002",
    openai_api_key=openai_api_key,
    openai_api_base=openai_api_base,
    openai_api_type=openai_api_type,  # tell the embeddings client to target Azure rather than openai.com
    openai_api_version=openai_api_version,
)


REDIS_ENDPOINT = "<your-azure-redis-endpoint>" # must include the port, e.g. "redisdemo.eastus.redisenterprise.cache.azure.net:10000"
REDIS_PASSWORD = "<your-redis-password>"

# create a connection string for the Redis Vector Store. Uses Redis-py format: https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url
# This example assumes TLS is enabled. If it is not, use "redis://" instead of "rediss://".
redis_url = "rediss://:" + REDIS_PASSWORD + "@"+ REDIS_ENDPOINT

# Set up the semantic cache for your LLM
set_llm_cache(RedisSemanticCache(redis_url=redis_url, embedding=embeddings, score_threshold=0.05))

# Note: score_threshold controls how sensitive the semantic cache is. The lower the value, the closer a new prompt must be to a cached prompt, so a cached result is less likely to be used.
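
To make the effect of score_threshold concrete, here is a minimal sketch of a threshold-based lookup. It is not how RedisSemanticCache is implemented. It assumes the threshold is a maximum cosine distance between the new prompt's embedding and a cached one, and the names cosine_distance and lookup, as well as the 3-dimensional vectors, are made up for illustration.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def lookup(query_vec, cache, score_threshold):
    """Return the closest cached response, or None if even the closest
    entry is farther away than score_threshold (hypothetical helper)."""
    best_response, best_distance = None, float("inf")
    for cached_vec, cached_response in cache:
        distance = cosine_distance(query_vec, cached_vec)
        if distance < best_distance:
            best_response, best_distance = cached_response, distance
    return best_response if best_distance <= score_threshold else None

# Made-up "embeddings"; real embedding vectors have far more dimensions.
cache = [([1.0, 0.0, 0.0], "cached puppy poem")]
near = [0.99, 0.1, 0.0]  # points almost the same way as the cached vector
far = [0.0, 1.0, 0.0]    # orthogonal to the cached vector

print(lookup(near, cache, score_threshold=0.05))   # close enough: cache hit
print(lookup(near, cache, score_threshold=0.001))  # stricter threshold: miss
print(lookup(far, cache, score_threshold=0.05))    # unrelated prompt: miss
```

The same near-duplicate vector is a hit at 0.05 and a miss at 0.001, which is the trade-off the note above describes: a lower threshold means fewer cached results are reused.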

In [99]:
%%time
response = llm("Please write a poem about cute puppies")
print(response)



Oh, the sight of a cute puppy,
Makes my heart skip a beat,
With their wagging tails and floppy ears,
They are simply too sweet.

Their tiny paws and curious eyes,
Filled with innocence and glee,
They bring joy to our lives,
In every wag and bark we see.

Their fur, so soft and fluffy,
A delight to touch and hold,
They bring warmth to our hearts,
Even when the world is cold.

Their playful nature knows no bounds,
As they chase their tails in delight,
Their energy is infectious,
A true bundle of pure delight.

They snuggle close and give wet kisses,
With their unconditional love,
They make our days brighter,
And our spirits soar above.

With every clumsy step they take,
And every toy they chew,
They teach us to appreciate,
The simple joys in life, so true.

Oh, how I adore these cute puppies,
With their boundless love and grace,
They are a reminder of the beauty,
In this world, and every place.

So let us cherish these furry friends,
And hold them close with care,
For they are more tha
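
The %%time magic reports the wall time of a single cell. To compare a first call with a semantically similar follow-up in one place, a small timing helper can be used instead. The sketch below is only an illustration: timed_call and fake_llm are hypothetical names, and fake_llm is a stand-in that sleeps briefly so the snippet runs without Azure credentials.

```python
import time

def timed_call(fn, prompt):
    """Call fn(prompt) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(prompt)
    return result, time.perf_counter() - start

def fake_llm(prompt):
    """Hypothetical stand-in for `llm`; sleeps to mimic network latency."""
    time.sleep(0.05)
    return f"response to: {prompt}"

first, first_s = timed_call(fake_llm, "Please write a poem about cute puppies")
second, second_s = timed_call(fake_llm, "Write a poem about adorable puppies")

print(f"first call:  {first_s:.3f} s")
print(f"second call: {second_s:.3f} s")
```

With the cache configured as above, passing llm in place of fake_llm times the real calls. Whether the second prompt is served from the cache depends on score_threshold and on how close the two prompt embeddings are.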

In [102]:
# Count the tokens used by the prompt and by the response
import tiktoken
encoding = tiktoken.get_encoding('cl100k_base')
response_tokens = len(encoding.encode(response))
query_tokens = len(encoding.encode("Please write a poem about cute puppies"))
print(response_tokens)
print(query_tokens)

235
7
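
These two counts give a rough idea of what one uncached call consumes. The sketch below adds them up and converts them to a cost. The per-1,000-token prices are placeholders, not actual Azure OpenAI rates, and call_cost is a hypothetical helper; substitute the rates for your own deployment.

```python
# Token counts printed by the cell above.
response_tokens = 235
query_tokens = 7

# Placeholder prices in USD per 1,000 tokens (assumptions, not real rates).
PROMPT_PRICE_PER_1K = 0.0015
COMPLETION_PRICE_PER_1K = 0.002

def call_cost(prompt_tokens, completion_tokens):
    """Estimate the cost of one LLM call from its token counts."""
    return (prompt_tokens * PROMPT_PRICE_PER_1K
            + completion_tokens * COMPLETION_PRICE_PER_1K) / 1000

total_tokens = query_tokens + response_tokens
cost = call_cost(query_tokens, response_tokens)

print(f"tokens per uncached call: {total_tokens}")
print(f"estimated cost per uncached call: ${cost:.6f}")
```

A cache hit avoids the completion tokens, which dominate here (235 of 242). The lookup itself presumably still needs an embedding of the prompt, so a hit is cheaper rather than free.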
