# Redis Cache for LangChain

This notebook demonstrates how to use the `RedisCache`, `RedisSemanticCache`, and `LangCacheSemanticCache` classes from the langchain-redis package to implement caching for LLM responses.

## Setup

First, let's install the required dependencies and ensure we have a Redis instance running.

In [5]:
%pip install -qU langchain-core langchain-redis "langchain-openai>=1.0.3" "redis<7.0"
%pip install -qU -e "../[langcache]"

Note: you may need to restart the kernel to use updated packages.


Ensure you have a Redis server running. You can start one using Docker with:

```
docker run -d -p 6379:6379 redis:latest
```

Or install and run Redis locally according to your operating system's instructions.
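
Before running the cells below, it can help to confirm that something is listening at the Redis address. The following is a minimal sketch using only the Python standard library; `redis_reachable` is a hypothetical helper name, not part of langchain-redis, and it only checks that the TCP port accepts connections, not that the service behind it is actually Redis.

```python
import socket
from urllib.parse import urlparse


def redis_reachable(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the host/port in `url` succeeds.

    Hypothetical helper for illustration: it does not speak the Redis
    protocol, so an open port of any kind will report True.
    """
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 6379  # 6379 is the default Redis port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


print(f"Redis reachable: {redis_reachable('redis://localhost:6379')}")
```

If this prints `False`, the Docker container or local server is probably not running, or it is bound to a different host or port.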

In [6]:
# ruff: noqa: T201
import os

# Use the environment variable if set, otherwise default to localhost
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
print(f"Connecting to Redis at: {REDIS_URL}")

Connecting to Redis at: redis://localhost:6379


## Importing Required Libraries

In [7]:
import time
from getpass import getpass

from langchain_core.globals import set_llm_cache
from langchain_core.outputs import Generation
from langchain_openai import OpenAI, OpenAIEmbeddings

from langchain_redis import RedisCache, RedisSemanticCache, LangCacheSemanticCache

### Set OpenAI API key

In [8]:
# Check if OPENAI_API_KEY is already set in the environment
openai_api_key = os.getenv("OPENAI_API_KEY")

if not openai_api_key:
    print("OpenAI API key not found in environment variables.")
    openai_api_key = getpass("Please enter your OpenAI API key: ")

    # Set the API key for the current session
    os.environ["OPENAI_API_KEY"] = openai_api_key
    print("OpenAI API key has been set for this session.")
else:
    print("OpenAI API key found in environment variables.")

OpenAI API key found in environment variables.


## Using RedisCache

In [9]:
# Initialize RedisCache
redis_cache = RedisCache(redis_url=REDIS_URL)

# Set the cache for LangChain to use
set_llm_cache(redis_cache)

# Initialize the language model
llm = OpenAI(temperature=0)


# Function to measure execution time
def timed_completion(prompt):
    start_time = time.time()
    result = llm.invoke(prompt)
    end_time = time.time()
    return result, end_time - start_time


# First call (not cached)
prompt = "Explain the concept of caching in three sentences."
result1, time1 = timed_completion(prompt)
print(f"First call (not cached):\nResult: {result1}\nTime: {time1:.2f} seconds\n")

# Second call (should be cached)
result2, time2 = timed_completion(prompt)
print(f"Second call (cached):\nResult: {result2}\nTime: {time2:.2f} seconds\n")

print(f"Speed improvement: {time1 / time2:.2f}x faster")

# Clear the cache
redis_cache.clear()
print("Cache cleared")

First call (not cached):
Result: 

Caching is the process of storing frequently accessed data in a temporary storage location for faster retrieval. This helps to reduce the time and resources needed to access the data from its original source. Caching is commonly used in computer systems, web browsers, and databases to improve performance and efficiency.
Time: 0.68 seconds

Second call (cached):
Result: 

Caching is the process of storing frequently accessed data in a temporary storage location for faster retrieval. This helps to reduce the time and resources needed to access the data from its original source. Caching is commonly used in computer systems, web browsers, and databases to improve performance and efficiency.
Time: 0.00 seconds

Speed improvement: 525.11x faster
Cache cleared
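
The second call returns almost instantly because the response is looked up by an exact match on the prompt and the model configuration instead of being regenerated. The sketch below illustrates that idea with a plain in-memory dictionary. It is not how `RedisCache` is implemented: the `ExactMatchCache` class, the key scheme, and `fake_llm` are assumptions made up for this example.

```python
import hashlib
from typing import Callable, Dict, List


class ExactMatchCache:
    """Toy in-memory stand-in for an exact-match LLM cache (not RedisCache)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    @staticmethod
    def _key(prompt: str, llm_string: str) -> str:
        # Hash the prompt together with the model settings so that a change
        # to either one produces a different cache key.
        raw = f"{llm_string}\x00{prompt}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get_or_compute(
        self, prompt: str, llm_string: str, compute: Callable[[str], str]
    ) -> str:
        key = self._key(prompt, llm_string)
        if key not in self._store:
            # Cache miss: call the (expensive) model and remember the answer.
            self._store[key] = compute(prompt)
        return self._store[key]


llm_calls: List[str] = []


def fake_llm(text: str) -> str:
    """Stand-in for a real model call; records each invocation."""
    llm_calls.append(text)
    return f"answer to: {text}"


toy_cache = ExactMatchCache()
settings = "model=a,temperature=0"
first = toy_cache.get_or_compute("Explain caching.", settings, fake_llm)
second = toy_cache.get_or_compute("Explain caching.", settings, fake_llm)  # hit
third = toy_cache.get_or_compute("Explain caching!", settings, fake_llm)  # miss

print(f"Model was called {len(llm_calls)} times for 3 requests")
```

Note that changing a single character in the prompt causes a miss, which is the limitation the semantic caches in the following sections are meant to address.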


## Using RedisSemanticCache

In [10]:
# Initialize RedisSemanticCache
embeddings = OpenAIEmbeddings()
semantic_cache = RedisSemanticCache(
    redis_url=REDIS_URL, embeddings=embeddings, distance_threshold=0.2
)

# Set the cache for LangChain to use
set_llm_cache(semantic_cache)


# Function to test semantic cache
def test_semantic_cache(prompt):
    start_time = time.time()
    result = llm.invoke(prompt)
    end_time = time.time()
    return result, end_time - start_time


# Original query
original_prompt = "What is the capital of France?"
result1, time1 = test_semantic_cache(original_prompt)
print(f"Original query:\nPrompt: {original_prompt}\n")
print(f"Result: {result1}\nTime: {time1:.2f} seconds\n")

# Semantically similar query
similar_prompt = "Can you tell me the capital city of France?"
result2, time2 = test_semantic_cache(similar_prompt)
print(f"Similar query:\nPrompt: {similar_prompt}\n")
print(f"Result: {result2}\nTime: {time2:.2f} seconds\n")

print(f"Speed improvement: {time1 / time2:.2f}x faster")

# Clear the semantic cache
semantic_cache.clear()
print("Semantic cache cleared")

Original query:
Prompt: What is the capital of France?

Result: 

The capital of France is Paris.
Time: 1.37 seconds

Similar query:
Prompt: Can you tell me the capital city of France?

Result: 

The capital of France is Paris.
Time: 0.39 seconds

Speed improvement: 3.48x faster
Semantic cache cleared
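
The `distance_threshold` argument controls how close a new prompt's embedding must be to a cached prompt's embedding to count as a hit. The sketch below shows the general idea with hand-written three-dimensional vectors in place of real embeddings. It assumes cosine distance and a "less than or equal" comparison; both are assumptions for illustration, and the `semantic_lookup` helper is hypothetical, not part of langchain-redis.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def semantic_lookup(
    query_vec: Sequence[float],
    entries: List[Tuple[Sequence[float], str]],
    distance_threshold: float,
) -> Optional[str]:
    """Return the nearest cached answer if it is within the threshold."""
    best_answer: Optional[str] = None
    best_distance = float("inf")
    for vec, answer in entries:
        distance = cosine_distance(query_vec, vec)
        if distance < best_distance:
            best_answer, best_distance = answer, distance
    return best_answer if best_distance <= distance_threshold else None


# Toy "embeddings": real ones have hundreds or thousands of dimensions.
toy_entries = [([1.0, 0.0, 0.0], "The capital of France is Paris.")]
near_query = [0.9, 0.1, 0.0]         # distance of roughly 0.006
borderline_query = [0.85, 0.53, 0.0]  # distance of roughly 0.15
far_query = [0.0, 1.0, 0.0]           # distance of 1.0

for name, vec in [
    ("near", near_query),
    ("borderline", borderline_query),
    ("far", far_query),
]:
    print(name, "| threshold 0.2:", semantic_lookup(vec, toy_entries, 0.2),
          "| threshold 0.1:", semantic_lookup(vec, toy_entries, 0.1))
```

In this toy setup the borderline query hits at a threshold of 0.2 but misses at 0.1, which is the trade-off to keep in mind when tuning: a looser threshold yields more hits but raises the chance of returning an answer to a different question.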


## Using LangCacheSemanticCache

Redis LangCache is a managed service that provides a semantic cache for LLM applications. It handles embedding generation and vector search for you, so you can focus on your application logic. See the [LangCache documentation](https://redis.io/docs/latest/develop/ai/langcache/) to learn more.

**NOTE:** To run these LangCache examples, you must first create a LangCache instance in Redis Cloud. [Get started with a free Redis Cloud account today](https://redis.io/docs/latest/operate/rc/langcache/#get-started-with-langcache-on-redis-cloud).

In [11]:
# Check if LangCache API key and cache ID are already set in the environment
langcache_api_key = os.getenv("LANGCACHE_API_KEY")
langcache_cache_id = os.getenv("LANGCACHE_CACHE_ID")


if not langcache_api_key or not langcache_cache_id:
    print("LangCache API key or cache ID not found in environment variables.")
    if not langcache_api_key:
        langcache_api_key = getpass("Please enter your LangCache API key: ")
    if not langcache_cache_id:
        langcache_cache_id = input("Please enter your LangCache cache ID: ")

    # Set the API key and cache ID for the current session
    os.environ["LANGCACHE_API_KEY"] = langcache_api_key
    os.environ["LANGCACHE_CACHE_ID"] = langcache_cache_id
    print("LangCache API key and cache ID have been set for this session.")
else:
    print("LangCache API key and cache ID found in environment variables.")


if not langcache_api_key or not langcache_cache_id:
    print("Not running LangCache examples because we do not have an API key and cache ID.")
    exit(0)

# Initialize LangCacheSemanticCache
semantic_cache = LangCacheSemanticCache(
    cache_id=langcache_cache_id,
    api_key=langcache_api_key,
    distance_threshold=0.2
)

# Set the cache for LangChain to use
set_llm_cache(semantic_cache)


# Function to test semantic cache
def test_semantic_cache(prompt):
    start_time = time.time()
    result = llm.invoke(prompt)
    end_time = time.time()
    return result, end_time - start_time


# Original query
original_prompt = "What is the capital of France?"
result1, time1 = test_semantic_cache(original_prompt)
print(f"Original query:\nPrompt: {original_prompt}\n")
print(f"Result: {result1}\nTime: {time1:.2f} seconds\n")


# Semantically similar query
similar_prompt = "Can you tell me the capital city of France?"
result2, time2 = test_semantic_cache(similar_prompt)
print(f"Similar query:\nPrompt: {similar_prompt}\n")
print(f"Result: {result2}\nTime: {time2:.2f} seconds\n")
print(f"(Similar query) Speed improvement: {time1 / time2:.2f}x faster")

# Clear the semantic cache
semantic_cache.clear()
print("Semantic cache cleared")

LangCache API key and cache ID found in environment variables.
Original query:
Prompt: What is the capital of France?

Result: 

The capital of France is Paris.
Time: 2.02 seconds

Similar query:
Prompt: Can you tell me the capital city of France?

Result: 

The capital of France is Paris.
Time: 0.15 seconds

(Similar query) Speed improvement: 13.78x faster
Semantic cache cleared


## Advanced Usage

### Custom TTL (Time-To-Live)

In [12]:
# Initialize RedisCache with custom TTL
ttl_cache = RedisCache(redis_url=REDIS_URL, ttl=5)  # 5 seconds TTL

# Update a cache entry
ttl_cache.update("test_prompt", "test_llm", [Generation(text="Cached response")])

# Retrieve the cached entry
cached_result = ttl_cache.lookup("test_prompt", "test_llm")
print(f"Cached result: {cached_result[0].text if cached_result else 'Not found'}")

# Wait for TTL to expire
print("Waiting for TTL to expire...")
time.sleep(6)

# Try to retrieve the expired entry
expired_result = ttl_cache.lookup("test_prompt", "test_llm")
if expired_result:
    print(f"Result after TTL: {expired_result[0].text}")
else:
    print("Not found (expired)")

Cached result: Cached response
Waiting for TTL to expire...
Not found (expired)
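
The behavior above (found immediately, gone after the TTL elapses) can be pictured with a small in-memory model. This is only a sketch: in the real setup the expiry is handled by the Redis server, whereas the hypothetical `TTLStore` class below checks expiry lazily on read and uses an injectable clock so the example runs without actually waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLStore:
    """Toy key-value store with per-entry expiry (not how Redis implements it)."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str) -> None:
        # Store the value together with the time at which it stops being valid.
        self._data[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            # Entry has outlived its TTL: drop it and report a miss.
            del self._data[key]
            return None
        return value


# A fake clock lets the example "advance time" instantly.
fake_now = [0.0]
store = TTLStore(ttl_seconds=5, clock=lambda: fake_now[0])

store.set("test_prompt", "Cached response")
before_expiry = store.get("test_prompt")

fake_now[0] = 6.0  # simulate the 6-second sleep from the cell above
after_expiry = store.get("test_prompt")

print(f"Before expiry: {before_expiry}")
print(f"After expiry: {after_expiry}")
```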


### Customizing RedisSemanticCache

In [13]:
# Initialize RedisSemanticCache with custom settings
custom_semantic_cache = RedisSemanticCache(
    redis_url=REDIS_URL,
    embeddings=embeddings,
    distance_threshold=0.1,  # Stricter similarity threshold
    ttl=3600,  # 1 hour TTL
    name="custom_cache",  # Custom cache name
)

# Test the custom semantic cache
set_llm_cache(custom_semantic_cache)

test_prompt = "What's the largest planet in our solar system?"
result, _ = test_semantic_cache(test_prompt)
print(f"Original result: {result}")

# Try a slightly different query
similar_test_prompt = "Which planet is the biggest in the solar system?"
similar_result, _ = test_semantic_cache(similar_test_prompt)
print(f"Similar query result: {similar_result}")

# Clean up
custom_semantic_cache.clear()

Original result: 

The largest planet in our solar system is Jupiter.
Similar query result: 

The largest planet in our solar system is Jupiter.
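
The cell above discards the timings, so identical output alone does not show whether the second answer came from the cache or from a fresh model call. One way to check is to time both calls and compare. The sketch below uses `time.perf_counter()` and guards against dividing by a zero duration; `timed_call`, `speedup`, and `slow_echo` are hypothetical names introduced for this example, and `slow_echo` merely stands in for an LLM call.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def speedup(first_seconds: float, second_seconds: float) -> float:
    """Ratio of the two durations, or infinity if the second is not positive."""
    if second_seconds <= 0:
        return float("inf")
    return first_seconds / second_seconds


def slow_echo(text: str) -> str:
    """Stand-in for an uncached model call."""
    time.sleep(0.05)
    return text


echoed, elapsed = timed_call(slow_echo, "hello")
print(f"Result: {echoed}, took {elapsed:.3f} seconds")
print(f"Speedup of 2.0s vs 0.5s: {speedup(2.0, 0.5):.2f}x")
```

In the notebook, passing `llm.invoke` and a prompt to a helper like this for both queries would make a large drop in latency on the second call visible, which suggests a cache hit.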


## Conclusion

This notebook demonstrated the usage of `RedisCache`, `RedisSemanticCache`, and `LangCacheSemanticCache` from the langchain-redis package. `RedisCache` serves repeated identical prompts from Redis, while the two semantic caches can also serve prompts that are worded differently but fall within the configured `distance_threshold`. In the runs above, cached calls returned noticeably faster than the original calls because they avoided a redundant LLM API request. Options such as `ttl`, `distance_threshold`, and `name` let you control how long entries live, how strict matching is, and how caches are separated.