# Model caches

This notebook covers how to cache results of individual LLM calls using different caches.

In [None]:
from langchain.globals import set_llm_cache
from langchain_openai import OpenAI

# To make the caching really obvious, let's use a slower model.
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", n=2, best_of=2, openai_api_key="OPEN-AI-KEY")

## `In Memory` Cache

In [None]:
from langchain_community.cache import InMemoryCache

set_llm_cache(InMemoryCache())

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time it is, so it goes faster
llm("Tell me a joke")
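
To make the speed-up above concrete, here is a minimal sketch of exact-match caching in plain Python. It is not LangChain's implementation: `ToyExactCache`, `cached_call`, and `fake_model` are hypothetical names, and only the `lookup`/`update` method names and the `(prompt, llm_string)` key mirror the `BaseCache` interface.

```python
from typing import Callable, Dict, List, Optional, Tuple


class ToyExactCache:
    """Toy exact-match cache keyed on (prompt, llm_string)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], List[str]] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[List[str]]:
        # Return the cached generations, or None on a miss.
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, generations: List[str]) -> None:
        self._store[(prompt, llm_string)] = generations


def cached_call(
    cache: ToyExactCache,
    prompt: str,
    llm_string: str,
    model: Callable[[str], List[str]],
) -> List[str]:
    """Call the model only when the exact (prompt, llm_string) pair is new."""
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    result = model(prompt)
    cache.update(prompt, llm_string, result)
    return result


# Hypothetical stand-in for a slow model; it records every real call.
calls: List[str] = []


def fake_model(prompt: str) -> List[str]:
    calls.append(prompt)
    return [f"echo: {prompt}"]


cache = ToyExactCache()
first = cached_call(cache, "Tell me a joke", "model-a", fake_model)
second = cached_call(cache, "Tell me a joke", "model-a", fake_model)
print(first == second, len(calls))
```

The second call never reaches `fake_model`. Because the model configuration string is part of the key, the same prompt sent with a different configuration would miss the cache.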

## `SQLite` Cache

In [None]:
!rm .langchain.db

In [None]:
# We can do the same thing with a SQLite cache
from langchain_community.cache import SQLiteCache

set_llm_cache(SQLiteCache(database_path=".langchain.db"))

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time it is, so it goes faster
llm("Tell me a joke")
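
The difference from the in-memory cache is that entries survive a kernel restart because they live in a database file. The sketch below shows that idea with the standard-library `sqlite3` module. The table name `toy_llm_cache` and the helper functions are assumptions made for illustration, with column names borrowed from the custom SQLAlchemy schema shown later in this notebook; this is not necessarily the schema `SQLiteCache` creates.

```python
import sqlite3
from typing import List, Optional


def open_cache(path: str) -> sqlite3.Connection:
    """Open (or create) a toy cache table in the given SQLite database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS toy_llm_cache ("
        "prompt TEXT NOT NULL, llm TEXT NOT NULL, idx INTEGER NOT NULL, "
        "response TEXT, PRIMARY KEY (prompt, llm, idx))"
    )
    return conn


def lookup(conn: sqlite3.Connection, prompt: str, llm: str) -> Optional[List[str]]:
    """Return all cached responses for the pair in order, or None on a miss."""
    rows = conn.execute(
        "SELECT response FROM toy_llm_cache "
        "WHERE prompt = ? AND llm = ? ORDER BY idx",
        (prompt, llm),
    ).fetchall()
    return [row[0] for row in rows] or None


def update(conn: sqlite3.Connection, prompt: str, llm: str, responses: List[str]) -> None:
    """Store one row per generation; the connection context manager commits."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO toy_llm_cache VALUES (?, ?, ?, ?)",
            [(prompt, llm, i, text) for i, text in enumerate(responses)],
        )


# ":memory:" keeps the demo self-contained; pass a file path to persist entries.
conn = open_cache(":memory:")
miss = lookup(conn, "Tell me a joke", "model-a")
update(conn, "Tell me a joke", "model-a", ["joke one", "joke two"])
hit = lookup(conn, "Tell me a joke", "model-a")
print(miss, hit)
```

One row per generation (the `idx` column) lets a call made with `n=2` store and return both completions.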

## `Upstash Redis` Cache

### Standard Cache
Use [Upstash Redis](https://upstash.com) to cache prompts and responses with a serverless HTTP API.

In [None]:
from langchain_community.cache import UpstashRedisCache
from upstash_redis import Redis

URL = "<UPSTASH_REDIS_REST_URL>"
TOKEN = "<UPSTASH_REDIS_REST_TOKEN>"

set_llm_cache(UpstashRedisCache(redis_=Redis(url=URL, token=TOKEN)))

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time it is, so it goes faster
llm("Tell me a joke")

## `Redis` Cache

### Standard Cache
Use [Redis](/docs/integrations/providers/redis) to cache prompts and responses.

In [None]:
# We can do the same thing with a Redis cache
# (make sure your local Redis instance is running first before running this example)
from langchain_community.cache import RedisCache
from redis import Redis

set_llm_cache(RedisCache(redis_=Redis()))

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time it is, so it goes faster
llm("Tell me a joke")

### Semantic Cache
Use [Redis](/docs/integrations/providers/redis) to cache prompts and responses and evaluate hits based on semantic similarity.

In [None]:
from langchain_community.cache import RedisSemanticCache
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    RedisSemanticCache(redis_url="redis://localhost:6379", embedding=OpenAIEmbeddings())
)

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time, while not a direct hit, the question is semantically similar to the original question,
# so it uses the cached result!
llm("Tell me one joke")
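
The semantic hit above depends on comparing embeddings rather than strings. The following sketch illustrates that mechanism with a deliberately crude embedding (word counts) and a similarity threshold. `toy_embed`, `ToySemanticCache`, and the `0.7` threshold are hypothetical choices for this example; a real semantic cache uses a learned embedding model and a vector index.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def toy_embed(text: str) -> Counter:
    """Crude stand-in for an embedding model: lower-cased word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToySemanticCache:
    """Returns the closest stored response if it clears the threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self._entries: List[Tuple[Counter, str]] = []

    def update(self, prompt: str, response: str) -> None:
        self._entries.append((toy_embed(prompt), response))

    def lookup(self, prompt: str) -> Optional[str]:
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


semantic_cache = ToySemanticCache(threshold=0.7)
semantic_cache.update("Tell me a joke", "cached joke")

similar = semantic_cache.lookup("Tell me one joke")  # shares 3 of 4 words
unrelated = semantic_cache.lookup("What is the capital of France?")
print(similar, unrelated)
```

The threshold is the main tuning knob: set it too low and unrelated prompts return stale answers, set it too high and the cache behaves like an exact-match cache.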

## `GPTCache`

We can use [GPTCache](https://github.com/zilliztech/GPTCache) for exact-match caching or to cache results based on semantic similarity.

Let's first start with an example of exact match:

In [None]:
import hashlib

from gptcache import Cache
from gptcache.manager.factory import manager_factory
from gptcache.processor.pre import get_prompt
from langchain_community.cache import GPTCache


def get_hashed_name(name):
    return hashlib.sha256(name.encode()).hexdigest()


def init_gptcache(cache_obj: Cache, llm: str):
    hashed_llm = get_hashed_name(llm)
    cache_obj.init(
        pre_embedding_func=get_prompt,
        data_manager=manager_factory(manager="map", data_dir=f"map_cache_{hashed_llm}"),
    )


set_llm_cache(GPTCache(init_gptcache))
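
As a small illustration of why the LLM string is hashed before it is used in `data_dir`: the SHA-256 hex digest is always 64 hexadecimal characters, so each distinct model configuration gets its own directory name that is safe to use on a filesystem. The configuration strings below are made up for the example.

```python
import hashlib


def get_hashed_name(name: str) -> str:
    """Same helper as in the cell above: SHA-256 hex digest of the LLM string."""
    return hashlib.sha256(name.encode()).hexdigest()


# Hypothetical LLM configuration strings; real ones are typically much longer.
config_a = "openai model=gpt-3.5-turbo-instruct n=2"
config_b = "openai model=gpt-3.5-turbo-instruct n=1"

dir_a = f"map_cache_{get_hashed_name(config_a)}"
dir_b = f"map_cache_{get_hashed_name(config_b)}"

print(len(get_hashed_name(config_a)))
print(dir_a != dir_b)
```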

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time it is, so it goes faster
llm("Tell me a joke")

Let's now show an example of similarity caching:

In [None]:
import hashlib

from gptcache import Cache
from gptcache.adapter.api import init_similar_cache
from langchain_community.cache import GPTCache


def get_hashed_name(name):
    return hashlib.sha256(name.encode()).hexdigest()


def init_gptcache(cache_obj: Cache, llm: str):
    hashed_llm = get_hashed_name(llm)
    init_similar_cache(cache_obj=cache_obj, data_dir=f"similar_cache_{hashed_llm}")


set_llm_cache(GPTCache(init_gptcache))

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# This is an exact match, so it finds it in the cache
llm("Tell me a joke")

In [None]:
%%time
# This is not an exact match, but semantically within distance so it hits!
llm("Tell me joke")

## `MongoDB Atlas` Cache

[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully-managed cloud database available in AWS, Azure, and GCP. It has native support for Vector Search on the MongoDB document data.
Use [MongoDB Atlas Vector Search](/docs/integrations/providers/mongodb_atlas) to semantically cache prompts and responses.

### `MongoDBCache`
An abstraction to store a simple cache in MongoDB. This does not use Semantic Caching, nor does it require an index to be made on the collection before generation.

To import this cache:

```python
from langchain_mongodb.cache import MongoDBCache
```


To use this cache with your LLMs:
```python
from langchain_core.globals import set_llm_cache
from langchain_mongodb.cache import MongoDBCache

mongodb_atlas_uri = "<YOUR_CONNECTION_STRING>"
COLLECTION_NAME = "<YOUR_CACHE_COLLECTION_NAME>"
DATABASE_NAME = "<YOUR_DATABASE_NAME>"

set_llm_cache(MongoDBCache(
    connection_string=mongodb_atlas_uri,
    collection_name=COLLECTION_NAME,
    database_name=DATABASE_NAME,
))
```


### `MongoDBAtlasSemanticCache`
Semantic caching allows users to retrieve cached prompts based on semantic similarity between the user input and previously cached results. Under the hood, it uses MongoDB Atlas as both a cache and a vector store.
The MongoDBAtlasSemanticCache inherits from `MongoDBAtlasVectorSearch` and needs an Atlas Vector Search Index defined to work. Please look at the [usage example](/docs/integrations/vectorstores/mongodb_atlas) on how to set up the index.

To import this cache:
```python
from langchain_mongodb.cache import MongoDBAtlasSemanticCache
```

To use this cache with your LLMs:
```python
from langchain_core.globals import set_llm_cache
from langchain_mongodb.cache import MongoDBAtlasSemanticCache

# Use any embedding provider; OpenAIEmbeddings is shown here as one option.
from langchain_openai import OpenAIEmbeddings

mongodb_atlas_uri = "<YOUR_CONNECTION_STRING>"
COLLECTION_NAME = "<YOUR_CACHE_COLLECTION_NAME>"
DATABASE_NAME = "<YOUR_DATABASE_NAME>"

set_llm_cache(MongoDBAtlasSemanticCache(
    embedding=OpenAIEmbeddings(),
    connection_string=mongodb_atlas_uri,
    collection_name=COLLECTION_NAME,
    database_name=DATABASE_NAME,
))
```

For more resources on using `MongoDBAtlasSemanticCache`, see the [MongoDB blog post introducing semantic caching](https://www.mongodb.com/blog/post/introducing-semantic-caching-dedicated-mongodb-lang-chain-package-gen-ai-apps).

## `Momento` Cache
Use [Momento](/docs/integrations/providers/momento) to cache prompts and responses.

This cache requires the `momento` package. Run the cell below to install it:

In [None]:
%pip install --upgrade --quiet  momento

You'll need a Momento auth token to use this class. You can pass it to a `momento.CacheClient` if you instantiate that directly, pass it as the named parameter `auth_token` to `MomentoCache.from_client_params`, or set it as the environment variable `MOMENTO_AUTH_TOKEN`.

In [None]:
from datetime import timedelta

from langchain_community.cache import MomentoCache

cache_name = "langchain"
ttl = timedelta(days=1)
set_llm_cache(MomentoCache.from_client_params(cache_name, ttl))
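
The `ttl` argument above gives every cached entry a lifetime. The sketch below shows, under simplifying assumptions, what time-to-live expiry means for lookups. `ToyTTLCache` and its injectable `clock` are hypothetical and unrelated to Momento's client; the clock is injected only so the example can simulate a day passing without sleeping.

```python
import time
from datetime import timedelta
from typing import Callable, Dict, Optional, Tuple


class ToyTTLCache:
    """Toy cache whose entries expire a fixed time after they are written."""

    def __init__(self, ttl: timedelta, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_seconds = ttl.total_seconds()
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def update(self, prompt: str, response: str) -> None:
        # Remember when this entry stops being valid.
        self._store[prompt] = (self._clock() + self._ttl_seconds, response)

    def lookup(self, prompt: str) -> Optional[str]:
        entry = self._store.get(prompt)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            # Expired entries are evicted lazily, on the next lookup.
            del self._store[prompt]
            return None
        return response


# Simulated clock (in seconds) so the demo does not have to wait a day.
now = [0.0]
ttl_cache = ToyTTLCache(ttl=timedelta(days=1), clock=lambda: now[0])

ttl_cache.update("Tell me a joke", "cached joke")
fresh = ttl_cache.lookup("Tell me a joke")

now[0] = timedelta(days=1).total_seconds()  # exactly one day later
expired = ttl_cache.lookup("Tell me a joke")
print(fresh, expired)
```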

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time it is, so it goes faster
# When run in the same region as the cache, latencies are single digit ms
llm("Tell me a joke")

## `SQLAlchemy` Cache

You can use `SQLAlchemyCache` to cache with any SQL database supported by `SQLAlchemy`.

In [None]:
# from langchain_community.cache import SQLAlchemyCache
# from sqlalchemy import create_engine

# engine = create_engine("postgresql://postgres:postgres@localhost:5432/postgres")
# set_llm_cache(SQLAlchemyCache(engine))

### Custom SQLAlchemy Schemas

In [None]:
# You can define your own declarative SQLAlchemyCache child class to customize the schema used for caching. For example, to support high-speed fulltext prompt indexing with Postgres, use:

from langchain_community.cache import SQLAlchemyCache
from sqlalchemy import Column, Computed, Index, Integer, Sequence, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy_utils import TSVectorType

Base = declarative_base()


class FulltextLLMCache(Base):  # type: ignore
    """Postgres table for fulltext-indexed LLM Cache"""

    __tablename__ = "llm_cache_fulltext"
    id = Column(Integer, Sequence("cache_id"), primary_key=True)
    prompt = Column(String, nullable=False)
    llm = Column(String, nullable=False)
    idx = Column(Integer)
    response = Column(String)
    prompt_tsv = Column(
        TSVectorType(),
        Computed("to_tsvector('english', llm || ' ' || prompt)", persisted=True),
    )
    __table_args__ = (
        Index("idx_fulltext_prompt_tsv", prompt_tsv, postgresql_using="gin"),
    )


engine = create_engine("postgresql://postgres:postgres@localhost:5432/postgres")
set_llm_cache(SQLAlchemyCache(engine, FulltextLLMCache))

## `Cassandra` caches

> [Apache Cassandra®](https://cassandra.apache.org/) is a NoSQL, row-oriented, highly scalable and highly available database. Starting with version 5.0, the database ships with [vector search capabilities](https://cassandra.apache.org/doc/trunk/cassandra/vector-search/overview.html).

You can use Cassandra for caching LLM responses, choosing from the exact-match `CassandraCache` or the (vector-similarity-based) `CassandraSemanticCache`.

Let's see both in action. The next cells guide you through the (little) required setup, and the following cells showcase the two available cache classes.

### Required dependency

In [None]:
%pip install --upgrade --quiet "cassio>=0.1.4"

### Connect to the DB

The Cassandra caches shown on this page can be used with Cassandra as well as other derived databases, such as Astra DB, which use the CQL (Cassandra Query Language) protocol.

> DataStax [Astra DB](https://docs.datastax.com/en/astra-serverless/docs/vector-search/quickstart.html) is a managed serverless database built on Cassandra, offering the same interface and strengths.

Depending on whether you connect to a Cassandra cluster or to Astra DB through CQL, you will provide different parameters when instantiating the cache (through initialization of a CassIO connection).

#### Connecting to a Cassandra cluster

You first need to create a `cassandra.cluster.Session` object, as described in the [Cassandra driver documentation](https://docs.datastax.com/en/developer/python-driver/latest/api/cassandra/cluster/#module-cassandra.cluster). The details vary (e.g. with network settings and authentication), but this might be something like:

In [None]:
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

You can now set the session, along with your desired keyspace name, as a global CassIO parameter:

In [None]:
import cassio

CASSANDRA_KEYSPACE = input("CASSANDRA_KEYSPACE = ")

cassio.init(session=session, keyspace=CASSANDRA_KEYSPACE)

#### Connecting to Astra DB through CQL

In this case you initialize CassIO with the following connection parameters:

- the Database ID, e.g. `01234567-89ab-cdef-0123-456789abcdef`
- the Token, e.g. `AstraCS:6gBhNmsk135....` (it must be a "Database Administrator" token)
- Optionally a Keyspace name (if omitted, the default one for the database will be used)

In [None]:
import getpass

ASTRA_DB_ID = input("ASTRA_DB_ID = ")
ASTRA_DB_APPLICATION_TOKEN = getpass.getpass("ASTRA_DB_APPLICATION_TOKEN = ")

desired_keyspace = input("ASTRA_DB_KEYSPACE (optional, can be left empty) = ")
if desired_keyspace:
    ASTRA_DB_KEYSPACE = desired_keyspace
else:
    ASTRA_DB_KEYSPACE = None

In [None]:
import cassio

cassio.init(
    database_id=ASTRA_DB_ID,
    token=ASTRA_DB_APPLICATION_TOKEN,
    keyspace=ASTRA_DB_KEYSPACE,
)

### Cassandra: Exact cache

This will avoid invoking the LLM when the supplied prompt is _exactly_ the same as one encountered already:

In [None]:
from langchain_community.cache import CassandraCache
from langchain_core.globals import set_llm_cache

set_llm_cache(CassandraCache())

In [None]:
%%time

print(llm.invoke("Why is the Moon always showing the same side?"))

In [None]:
%%time

print(llm.invoke("Why is the Moon always showing the same side?"))

### Cassandra: Semantic cache

This cache will do a semantic similarity search and return a hit if it finds a cached entry that is similar enough. For this, you need to provide an `Embeddings` instance of your choice.

In [None]:
from langchain_openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()

In [None]:
from langchain_community.cache import CassandraSemanticCache
from langchain_core.globals import set_llm_cache

set_llm_cache(
    CassandraSemanticCache(
        embedding=embedding,
        table_name="my_semantic_cache",
    )
)

In [None]:
%%time

print(llm.invoke("Why is the Moon always showing the same side?"))

In [None]:
%%time

print(llm.invoke("How come we always see one face of the moon?"))

#### Attribution statement

>Apache Cassandra, Cassandra and Apache are either registered trademarks or trademarks of the [Apache Software Foundation](http://www.apache.org/) in the United States and/or other countries.

## `Astra DB` Caches

You can easily use [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) as an LLM cache, with either the "exact" or the "semantic-based" cache.

Make sure you have a running database (it must be a Vector-enabled database to use the Semantic cache) and get the required credentials on your Astra dashboard:

- the API Endpoint looks like `https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com`
- the Token looks like `AstraCS:6gBhNmsk135....`

In [None]:
import getpass

ASTRA_DB_API_ENDPOINT = input("ASTRA_DB_API_ENDPOINT = ")
ASTRA_DB_APPLICATION_TOKEN = getpass.getpass("ASTRA_DB_APPLICATION_TOKEN = ")

### Astra DB exact LLM cache

This will avoid invoking the LLM when the supplied prompt is _exactly_ the same as one encountered already:

In [None]:
from langchain.globals import set_llm_cache
from langchain_astradb import AstraDBCache

set_llm_cache(
    AstraDBCache(
        api_endpoint=ASTRA_DB_API_ENDPOINT,
        token=ASTRA_DB_APPLICATION_TOKEN,
    )
)

In [None]:
%%time

print(llm.invoke("Is a true fakery the same as a fake truth?"))

In [None]:
%%time

print(llm.invoke("Is a true fakery the same as a fake truth?"))

### Astra DB Semantic cache

This cache will do a semantic similarity search and return a hit if it finds a cached entry that is similar enough. For this, you need to provide an `Embeddings` instance of your choice.

In [None]:
from langchain_openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()

In [None]:
from langchain_astradb import AstraDBSemanticCache

set_llm_cache(
    AstraDBSemanticCache(
        api_endpoint=ASTRA_DB_API_ENDPOINT,
        token=ASTRA_DB_APPLICATION_TOKEN,
        embedding=embedding,
        collection_name="demo_semantic_cache",
    )
)

In [None]:
%%time

print(llm.invoke("Are there truths that are false?"))

In [None]:
%%time

print(llm.invoke("Is it possible that something false can be also true?"))

## Azure Cosmos DB Semantic Cache

You can use this integrated [vector database](https://learn.microsoft.com/en-us/azure/cosmos-db/vector-database) for caching.

In [None]:
from langchain_community.cache import AzureCosmosDBSemanticCache
from langchain_community.vectorstores.azure_cosmos_db import (
    CosmosDBSimilarityType,
    CosmosDBVectorSearchType,
)
from langchain_openai import OpenAIEmbeddings

# Read more about Azure CosmosDB Mongo vCore vector search here https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search

NAMESPACE = "langchain_test_db.langchain_test_collection"
CONNECTION_STRING = (
    "Please provide your azure cosmos mongo vCore vector db connection string"
)

DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")

# Default value for these params
num_lists = 3
dimensions = 1536
similarity_algorithm = CosmosDBSimilarityType.COS
kind = CosmosDBVectorSearchType.VECTOR_IVF
m = 16
ef_construction = 64
ef_search = 40
score_threshold = 0.9
application_name = "LANGCHAIN_CACHING_PYTHON"


set_llm_cache(
    AzureCosmosDBSemanticCache(
        cosmosdb_connection_string=CONNECTION_STRING,
        cosmosdb_client=None,
        embedding=OpenAIEmbeddings(),
        database_name=DB_NAME,
        collection_name=COLLECTION_NAME,
        num_lists=num_lists,
        similarity=similarity_algorithm,
        kind=kind,
        dimensions=dimensions,
        m=m,
        ef_construction=ef_construction,
        ef_search=ef_search,
        score_threshold=score_threshold,
        application_name=application_name,
    )
)

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time it is, so it goes faster
llm("Tell me a joke")

## Azure Cosmos DB NoSQL Semantic Cache

You can use this integrated [vector database](https://learn.microsoft.com/en-us/azure/cosmos-db/vector-database) for caching.

In [None]:
from azure.cosmos import CosmosClient, PartitionKey
from langchain_community.cache import AzureCosmosDBNoSqlSemanticCache
from langchain_openai import OpenAIEmbeddings

URI = "COSMOSDB_URI"
KEY = "COSMOSDB_KEY"
client = CosmosClient(URI, credential=KEY)

indexing_policy = {
    "indexingMode": "consistent",
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": '/"_etag"/?'}],
    "vectorIndexes": [{"path": "/embedding", "type": "quantizedFlat"}],
}

vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/embedding",
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 1536,
        }
    ]
}

partition_key = PartitionKey(path="/id")
cosmos_container_properties_test = {"partition_key": partition_key}
cosmos_database_properties_test = {}

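# Create the LLM cache, using the client defined above and an embedding
# model whose output size matches the 1536 dimensions in the policy.
set_llm_cache(
    AzureCosmosDBNoSqlSemanticCache(
        cosmos_client=client,
        embedding=OpenAIEmbeddings(),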
        vector_embedding_policy=vector_embedding_policy,
        indexing_policy=indexing_policy,
        cosmos_container_properties=cosmos_container_properties_test,
        cosmos_database_properties=cosmos_database_properties_test,
    )
)

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time the same prompt is already in the cache, so it goes faster
llm("Tell me a joke")

## `Elasticsearch` Cache
A caching layer for LLMs that uses Elasticsearch.

First install the LangChain integration with Elasticsearch.

In [None]:
%pip install -U langchain-elasticsearch

Use the class `ElasticsearchCache`.

Simple example:

In [None]:
from langchain.globals import set_llm_cache
from langchain_elasticsearch import ElasticsearchCache

set_llm_cache(
    ElasticsearchCache(
        es_url="http://localhost:9200",
        index_name="llm-chat-cache",
        metadata={"project": "my_chatgpt_project"},
    )
)

The `index_name` parameter also accepts aliases. This lets you use
[ILM: Manage the index lifecycle](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html),
which we suggest considering for managing retention and controlling cache growth.

Look at the class docstring for all parameters.

### Index the generated text

The cached data won't be searchable by default.
You can customize how the Elasticsearch document is built in order to add indexed text fields,
for example to hold the text generated by the LLM.

This can be done by subclassing `ElasticsearchCache` and overriding methods.
The new cache class can also be applied to a pre-existing cache index:

In [None]:
import json
from typing import Any, Dict, List

from langchain.globals import set_llm_cache
from langchain_core.caches import RETURN_VAL_TYPE
from langchain_elasticsearch import ElasticsearchCache


class SearchableElasticsearchCache(ElasticsearchCache):
    @property
    def mapping(self) -> Dict[str, Any]:
        mapping = super().mapping
        mapping["mappings"]["properties"]["parsed_llm_output"] = {
            "type": "text",
            "analyzer": "english",
        }
        return mapping

    def build_document(
        self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE
    ) -> Dict[str, Any]:
        body = super().build_document(prompt, llm_string, return_val)
        body["parsed_llm_output"] = self._parse_output(body["llm_output"])
        return body

    @staticmethod
    def _parse_output(data: List[str]) -> List[str]:
        return [
            json.loads(output)["kwargs"]["message"]["kwargs"]["content"]
            for output in data
        ]


set_llm_cache(
    SearchableElasticsearchCache(
        es_url="http://localhost:9200", index_name="llm-chat-cache"
    )
)

When overriding the mapping and the document building, 
please only make additive modifications, keeping the base mapping intact.
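
To see what `_parse_output` in the subclass above does, the sketch below runs the same extraction on a hand-written JSON string. The sample payload is hypothetical: it only reproduces the `kwargs -> message -> kwargs -> content` nesting that the subclass reads, not a full serialized generation. `parse_output_safe` is an assumed defensive variant, not part of `ElasticsearchCache`.

```python
import json
from typing import List


def parse_output(data: List[str]) -> List[str]:
    """Same extraction as `_parse_output` above, as a free function."""
    return [
        json.loads(output)["kwargs"]["message"]["kwargs"]["content"]
        for output in data
    ]


def parse_output_safe(data: List[str]) -> List[str]:
    """Hypothetical variant that skips entries lacking the expected nesting."""
    contents = []
    for output in data:
        try:
            contents.append(json.loads(output)["kwargs"]["message"]["kwargs"]["content"])
        except (KeyError, TypeError):
            continue
    return contents


# Hand-written sample that mimics only the keys the parser reads.
chat_like = json.dumps(
    {"kwargs": {"message": {"kwargs": {"content": "Why did the chicken cross the road?"}}}}
)
# A payload without a "message" key, to show the failure mode.
other = json.dumps({"kwargs": {"text": "plain completion"}})

parsed = parse_output([chat_like])
parsed_safe = parse_output_safe([chat_like, other])
print(parsed, parsed_safe)
```

The strict version raises `KeyError` on the second payload, so the override as written assumes every cached entry has the chat-message shape.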

## Optional Caching
You can also turn off caching for specific LLMs. In the example below, even though global caching is enabled, we turn it off for a specific LLM.

In [None]:
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", n=2, best_of=2, cache=False)

In [None]:
%%time
llm("Tell me a joke")

In [None]:
%%time
llm("Tell me a joke")
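
Both calls above take the full model latency because this LLM bypasses the global cache. The sketch below mimics that opt-out in plain Python. `ToyLLM` and `GLOBAL_CACHE` are hypothetical, and the flag is simplified to a boolean, so this is an illustration of the behaviour rather than LangChain's logic.

```python
from typing import Dict

# Hypothetical process-wide cache shared by every model that opts in.
GLOBAL_CACHE: Dict[str, str] = {}


class ToyLLM:
    """Toy model with a per-instance switch for the shared cache."""

    def __init__(self, cache: bool = True) -> None:
        self.cache = cache
        self.calls = 0  # number of real (uncached) model calls

    def invoke(self, prompt: str) -> str:
        if self.cache and prompt in GLOBAL_CACHE:
            return GLOBAL_CACHE[prompt]
        self.calls += 1
        result = f"response #{self.calls} to {prompt!r}"
        if self.cache:
            GLOBAL_CACHE[prompt] = result
        return result


cached_llm = ToyLLM(cache=True)
uncached_llm = ToyLLM(cache=False)

for _ in range(2):
    cached_llm.invoke("Tell me a joke")
    uncached_llm.invoke("Tell me a joke")

print(cached_llm.calls, uncached_llm.calls)
```

The opted-out model neither reads from nor writes to the shared cache, so its results never affect other models.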

## Optional Caching in Chains
You can also turn off caching for particular nodes in chains. Note that because of certain interfaces, it's often easier to construct the chain first, and then edit the LLM afterwards.

As an example, we will load a summarizer map-reduce chain. We will cache results for the map step, but not for the combine step.

In [None]:
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
no_cache_llm = OpenAI(model_name="gpt-3.5-turbo-instruct", cache=False)

In [None]:
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter()

In [None]:
with open("../../how_to/state_of_the_union.txt") as f:
    state_of_the_union = f.read()
texts = text_splitter.split_text(state_of_the_union)

In [None]:
from langchain.chains.summarize import load_summarize_chain
from langchain_core.documents import Document

docs = [Document(page_content=t) for t in texts[:3]]

In [None]:
chain = load_summarize_chain(llm, chain_type="map_reduce", reduce_llm=no_cache_llm)

In [None]:
%%time
chain.run(docs)

When we run it again, we see that it runs substantially faster but the final answer is different. This is due to caching at the map steps, but not at the reduce step.

In [None]:
%%time
chain.run(docs)
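
The timing difference between the two runs can be reproduced with a toy pipeline in which only the map step is cached. Everything here (`map_step`, `reduce_step`, `call_log`) is a hypothetical stand-in for the chain. Unlike a real LLM, the toy reduce step is deterministic, so it shows which steps are recomputed but not the change in the final answer.

```python
from typing import Dict, List

map_cache: Dict[str, str] = {}
call_log: List[str] = []  # records every step that actually does work


def map_step(chunk: str) -> str:
    """Summarize one chunk, reusing a cached summary when available."""
    if chunk in map_cache:
        return map_cache[chunk]
    call_log.append("map")
    summary = f"summary({chunk})"  # stand-in for an LLM call
    map_cache[chunk] = summary
    return summary


def reduce_step(summaries: List[str]) -> str:
    """Combine the summaries; deliberately never cached."""
    call_log.append("reduce")
    return " | ".join(summaries)


def run(chunks: List[str]) -> str:
    return reduce_step([map_step(chunk) for chunk in chunks])


chunks = ["part one", "part two", "part three"]
first_run = run(chunks)
second_run = run(chunks)

print(call_log.count("map"), call_log.count("reduce"))
```

Across two runs the map work happens three times in total, once per chunk, while the reduce work happens on every run.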

In [None]:
!rm .langchain.db sqlite.db

## OpenSearch Semantic Cache
Use [OpenSearch](https://python.langchain.com/docs/integrations/vectorstores/opensearch/) as a semantic cache to cache prompts and responses and evaluate hits based on semantic similarity.

In [None]:
from langchain_community.cache import OpenSearchSemanticCache
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    OpenSearchSemanticCache(
        opensearch_url="http://localhost:9200", embedding=OpenAIEmbeddings()
    )
)

In [None]:
%%time
# The first time, it is not yet in cache, so it should take longer
llm("Tell me a joke")

In [None]:
%%time
# The second time, while not a direct hit, the question is semantically similar to the original question,
# so it uses the cached result!
llm("Tell me one joke")

## Cache classes: summary table

**Cache** classes are implemented by inheriting the [BaseCache](https://api.python.langchain.com/en/latest/caches/langchain_core.caches.BaseCache.html) class.

This table lists the derived classes with links to the API Reference.


| Namespace 🔻 | Class |
|------------|---------|
| langchain_astradb.cache | [AstraDBCache](https://api.python.langchain.com/en/latest/cache/langchain_astradb.cache.AstraDBCache.html) |
| langchain_astradb.cache | [AstraDBSemanticCache](https://api.python.langchain.com/en/latest/cache/langchain_astradb.cache.AstraDBSemanticCache.html) |
| langchain_community.cache | [AstraDBCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.AstraDBCache.html) |
| langchain_community.cache | [AstraDBSemanticCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.AstraDBSemanticCache.html) |
| langchain_community.cache | [AzureCosmosDBSemanticCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.AzureCosmosDBSemanticCache.html) |
| langchain_community.cache | [CassandraCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.CassandraCache.html) |
| langchain_community.cache | [CassandraSemanticCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.CassandraSemanticCache.html) |
| langchain_community.cache | [GPTCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.GPTCache.html) |
| langchain_community.cache | [InMemoryCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.InMemoryCache.html) |
| langchain_community.cache | [MomentoCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.MomentoCache.html) |
| langchain_community.cache | [OpenSearchSemanticCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.OpenSearchSemanticCache.html) |
| langchain_community.cache | [RedisSemanticCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.RedisSemanticCache.html) |
| langchain_community.cache | [SQLAlchemyCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.SQLAlchemyCache.html) |
| langchain_community.cache | [SQLAlchemyMd5Cache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.SQLAlchemyMd5Cache.html) |
| langchain_community.cache | [UpstashRedisCache](https://api.python.langchain.com/en/latest/cache/langchain_community.cache.UpstashRedisCache.html) |
| langchain_core.caches | [InMemoryCache](https://api.python.langchain.com/en/latest/caches/langchain_core.caches.InMemoryCache.html) |
| langchain_elasticsearch.cache | [ElasticsearchCache](https://api.python.langchain.com/en/latest/cache/langchain_elasticsearch.cache.ElasticsearchCache.html) |
| langchain_mongodb.cache | [MongoDBAtlasSemanticCache](https://api.python.langchain.com/en/latest/cache/langchain_mongodb.cache.MongoDBAtlasSemanticCache.html) |
| langchain_mongodb.cache | [MongoDBCache](https://api.python.langchain.com/en/latest/cache/langchain_mongodb.cache.MongoDBCache.html) |
