# Caching
LangChain provides an optional caching layer for chat models. This is useful for two reasons:

- It can save you money by reducing the number of API calls you make to the LLM provider, if you often request the same completion.
- It can speed up your application, because a cached response is returned without a round trip to the LLM provider.

In [None]:
# Load environment variables (such as OPENAI_API_KEY) from a local .env file
from dotenv import load_dotenv
load_dotenv()

In [None]:
from langchain.globals import set_llm_cache
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")

## In Memory Cache

In [None]:
%%time
from langchain.cache import InMemoryCache

set_llm_cache(InMemoryCache())

# The first time, the response is not yet in the cache, so the call should take longer
llm.invoke("Tell me a joke")

In [None]:
%%time
# The second time, the response is already in the cache, so the call returns faster
llm.invoke("Tell me a joke")
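To illustrate why the second call returns quickly, here is a minimal sketch of an in-memory cache in plain Python. It is not LangChain's implementation: `DictCache`, `FakeModel`, and `cached_call` are hypothetical names, and the model is simulated with a short sleep instead of a real API call. The only assumption is that responses are stored under a key built from the model's identity and the prompt.

```python
import time


class DictCache:
    """Hypothetical in-memory cache keyed on (model name, prompt)."""

    def __init__(self):
        self._store = {}

    def lookup(self, model_name, prompt):
        # Returns None on a cache miss
        return self._store.get((model_name, prompt))

    def update(self, model_name, prompt, response):
        self._store[(model_name, prompt)] = response


class FakeModel:
    """Stand-in for a chat model; sleeps to simulate network latency."""

    def __init__(self, name):
        self.name = name
        self.calls = 0

    def generate(self, prompt):
        self.calls += 1
        time.sleep(0.05)
        return f"response to: {prompt}"


def cached_call(model, cache, prompt):
    # Check the cache first; only call the model on a miss
    hit = cache.lookup(model.name, prompt)
    if hit is not None:
        return hit
    response = model.generate(prompt)
    cache.update(model.name, prompt, response)
    return response


model = FakeModel("fake-model")
cache = DictCache()

start = time.perf_counter()
first = cached_call(model, cache, "Tell me a joke")
first_elapsed = time.perf_counter() - start

start = time.perf_counter()
second = cached_call(model, cache, "Tell me a joke")
second_elapsed = time.perf_counter() - start

print(f"first: {first_elapsed:.4f}s, second: {second_elapsed:.4f}s, model calls: {model.calls}")
```

Because the key includes the model name, the same prompt sent to a differently configured model would be a miss in this sketch. The cache lives only in the Python process, so it is empty again after a restart.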

## SQLite Cache

In [None]:
# Remove any existing cache file; -f avoids an error if the file does not exist
!rm -f .langchain.db

In [None]:
# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache

set_llm_cache(SQLiteCache(database_path=".langchain.db"))

In [None]:
%%time
# The first time, the response is not yet in the cache, so the call should take longer
llm.invoke("Tell me a joke")

In [None]:
%%time
# The second time, the response is already in the cache, so the call returns faster
llm.invoke("Tell me a joke")
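Unlike the in-memory cache, a SQLite-backed cache lives in a file on disk, so entries can outlive the Python process. The following is a rough sketch of that idea using only the standard-library `sqlite3` module. The table name `demo_cache`, its columns, and the helper functions are hypothetical and do not reflect the schema LangChain uses in `.langchain.db`; the sketch writes to a temporary file so it does not touch that database.

```python
import os
import sqlite3
import tempfile


def open_cache(path):
    """Open (or create) a hypothetical on-disk cache table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS demo_cache ("
        "model TEXT, prompt TEXT, response TEXT, "
        "PRIMARY KEY (model, prompt))"
    )
    return conn


def lookup(conn, model, prompt):
    row = conn.execute(
        "SELECT response FROM demo_cache WHERE model = ? AND prompt = ?",
        (model, prompt),
    ).fetchone()
    return row[0] if row else None


def update(conn, model, prompt, response):
    conn.execute(
        "INSERT OR REPLACE INTO demo_cache VALUES (?, ?, ?)",
        (model, prompt, response),
    )
    conn.commit()


# Use a temporary file so the sketch does not overwrite a real cache
db_path = os.path.join(tempfile.mkdtemp(), "demo_cache.db")

# First "session": store a response, then close the connection
conn = open_cache(db_path)
update(conn, "fake-model", "Tell me a joke", "placeholder response")
conn.close()

# Second "session": reopen the file and read the entry back
conn = open_cache(db_path)
persisted = lookup(conn, "fake-model", "Tell me a joke")
missing = lookup(conn, "fake-model", "Tell me a story")
conn.close()

print(persisted, missing)
```

Closing and reopening the connection stands in for restarting the application: the stored response is still found, while a prompt that was never cached returns `None`.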