# CrateDB Caches

## About
Caching outcomes of LLM conversations improves performance and decreases costs.
[LangChain's caching subsystem] covers two different popular caching strategies.
You can use CrateDB for caching LLM responses, choosing either the exact-match
`CrateDBCache` or the vector-similarity based `CrateDBSemanticCache`.

### Standard Cache
The standard cache looks for an exact match of the user prompt. It does not use
semantic caching, so it does not require a vector search index on the cache
table. It avoids invoking the LLM when the supplied prompt is exactly the same
as one already encountered.
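
The exact-match behaviour can be illustrated with a minimal in-memory sketch.
It is not the `CrateDBCache` implementation: `ExactMatchCache`, `cached_invoke`,
and `fake_llm` are hypothetical names used only for illustration, and a plain
dictionary stands in for the database table. The `lookup`/`update` method names
are modelled on LangChain's cache interface.

```python
from typing import Callable, Dict, List, Optional, Tuple


class ExactMatchCache:
    """Toy cache keyed by the exact prompt text and an LLM identifier."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        # Only a character-for-character identical prompt produces a hit.
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, response: str) -> None:
        self._store[(prompt, llm_string)] = response


def cached_invoke(
    cache: ExactMatchCache,
    llm: Callable[[str], str],
    prompt: str,
    llm_string: str = "demo-model",  # hypothetical model identifier
) -> str:
    """Return the cached response if present, otherwise call the LLM."""
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    response = llm(prompt)
    cache.update(prompt, llm_string, response)
    return response


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; records every invocation."""
    calls.append(prompt)
    return f"answer to: {prompt}"


cache = ExactMatchCache()
cached_invoke(cache, fake_llm, "What is the answer to everything?")   # miss
cached_invoke(cache, fake_llm, "What is the answer to everything?")   # hit
cached_invoke(cache, fake_llm, "What is the answer to everything ?")  # miss
print(len(calls))  # 2
```

Note that the third prompt differs only by one space, yet it still misses the
cache. This is the limitation the semantic cache is meant to address.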

### Semantic Cache
Semantic caching allows users to retrieve cached responses based on semantic
similarity between the user input and previously cached prompts. Under the hood,
it uses CrateDB as both a cache and a vector store. It needs an appropriate
vector search index to be defined in order to work.
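
The idea of a similarity-based lookup can be sketched as follows. This is a
simplified illustration, not how `CrateDBSemanticCache` is implemented:
`toy_embed` is a hypothetical letter-frequency stand-in for a real embedding
model, `min_similarity` is an assumed parameter name, and a Python list
replaces the vector store.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_embed(text: str) -> List[float]:
    """Hypothetical embedding: a 26-dimensional letter-frequency vector.

    A real setup would call an embedding model such as OpenAIEmbeddings.
    """
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


class SemanticCache:
    """Toy cache that returns the closest stored response above a threshold."""

    def __init__(self, min_similarity: float = 0.9) -> None:
        self.min_similarity = min_similarity
        self._entries: List[Tuple[List[float], str]] = []

    def update(self, prompt: str, response: str) -> None:
        self._entries.append((toy_embed(prompt), response))

    def lookup(self, prompt: str) -> Optional[str]:
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self._entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        # Treat the lookup as a miss unless the best match is close enough.
        return best_response if best_score >= self.min_similarity else None


cache = SemanticCache(min_similarity=0.9)
cache.update("What is the answer to everything?", "42")

near = cache.lookup("What's the answer to everything?")  # close wording
far = cache.lookup("How do I bake bread?")               # unrelated prompt
print(near, far)
```

The threshold trades recall against the risk of returning an answer to a
different question, so it is worth tuning against real prompts.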

### CrateDB
[CrateDB] is a distributed and scalable SQL database for storing and analyzing
massive amounts of data in near real-time, even with complex queries. It is
PostgreSQL-compatible, is based on Lucene, and inherits from Elasticsearch.
[CrateDB Cloud] is a fully-managed cloud database available in AWS, Azure,
and GCP.

CrateDB has native support for Vector Search. Use [CrateDB Vector Search] to
semantically cache prompts and responses.
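
As a rough sketch of what vector search looks like at the SQL level, the
snippet below only builds and prints two statements; it does not connect to a
database. The table name `testdrive.prompt_cache` and its columns are
hypothetical and are not the schema used by the LangChain adapter. The
`FLOAT_VECTOR` type and the `KNN_MATCH` function are taken from the CrateDB
documentation; please consult it for the exact semantics of your version.

```python
from typing import Sequence

# Hypothetical table name, chosen only for this illustration.
TABLE = "testdrive.prompt_cache"


def create_table_sql(dimensions: int) -> str:
    """Build a CREATE TABLE statement with a vector column."""
    return (
        f"CREATE TABLE IF NOT EXISTS {TABLE} ("
        "prompt TEXT, response TEXT, "
        f"embedding FLOAT_VECTOR({dimensions}))"
    )


def knn_query_sql(vector: Sequence[float], k: int = 1) -> str:
    """Build a k-nearest-neighbour query for the given query vector."""
    literal = "[" + ", ".join(repr(float(x)) for x in vector) + "]"
    return (
        f"SELECT prompt, response, _score FROM {TABLE} "
        f"WHERE KNN_MATCH(embedding, {literal}, {k}) "
        "ORDER BY _score DESC"
    )


print(create_table_sql(3))
print(knn_query_sql([0.1, 0.2, 0.3], k=2))
```

In practice the query vector would come from the same embedding model that
produced the stored vectors, and it would be passed as a bound parameter
rather than being formatted into the statement.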

[CrateDB]: https://cratedb.com/database
[CrateDB Cloud]: https://cratedb.com/database/cloud
[CrateDB Vector Search]: https://cratedb.com/docs/guide/feature/search/vector/
[LangChain's caching subsystem]: https://python.langchain.com/docs/integrations/llm_caching/

## Installation

Install the most recent version of the LangChain CrateDB adapter,
and a few other packages that are needed by the tutorial.

In [1]:
%pip install --upgrade langchain-cratedb langchain-openai

## Prerequisites
Because this notebook uses OpenAI's APIs, you need to supply an authentication
token. Either set the environment variable `OPENAI_API_KEY` before starting the
notebook, or configure your token in the cell below.

In [2]:
import os

_ = os.environ.setdefault(
    "OPENAI_API_KEY", "sk-XJZ7pfog5Gp8Kus8D--invalid--0CJ5lyAKSefZLaV1Y9S1"
)

## Usage

### CrateDBCache

The standard cache `CrateDBCache` uses LangChain's `SQLAlchemyCache` under the hood.

In [3]:
import sqlalchemy as sa
from langchain.globals import set_llm_cache
from langchain_openai import ChatOpenAI

from langchain_cratedb import CrateDBCache

# Configure standard cache.
engine = sa.create_engine("crate://crate@localhost:4200/?schema=testdrive")
set_llm_cache(CrateDBCache(engine))

# Invoke LLM conversation.
llm = ChatOpenAI(model_name="gpt-3.5-turbo")
answer = llm.invoke("What is the answer to everything?")
print(answer.content)

# Turn off cache.
set_llm_cache(None)

The answer to everything is subjective and can vary depending on individual beliefs and perspectives. Some may say the answer to everything is love, others may say it is knowledge or understanding. Ultimately, there may not be one definitive answer to everything.


### CrateDBSemanticCache

The semantic cache `CrateDBSemanticCache` uses `CrateDBVectorStore` under the hood.

In [40]:
import sqlalchemy as sa
from langchain.globals import set_llm_cache
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

from langchain_cratedb import CrateDBSemanticCache

# Configure semantic cache.
engine = sa.create_engine("crate://crate@localhost:4200/?schema=testdrive")
set_llm_cache(
    CrateDBSemanticCache(
        embedding=OpenAIEmbeddings(),
        connection=engine,
        search_threshold=1.0,
    )
)

# Invoke LLM conversation.
llm = ChatOpenAI(model_name="chatgpt-4o-latest")
answer = llm.invoke("What is the answer to everything?")
print(answer.content)

# Turn off cache.
set_llm_cache(None)

Ah, you're referencing the famous science fiction series *The Hitchhiker's Guide to the Galaxy* by Douglas Adams! In the story, the supercomputer Deep Thought determines that the "Answer to the Ultimate Question of Life, the Universe, and Everything" is **42**. However, the actual "Ultimate Question" itself is unknown, leading to much cosmic humor and philosophical pondering.

So, the answer is **42** — but what the question is, well, that's a whole other mystery! 😊
