# Caching

- Author: [Joseph](https://github.com/XaviereKU)
- Peer Review : [Teddy Lee](https://github.com/teddylee777), [BAEM1N](https://github.com/BAEM1N)
- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/04-Model/02-Cache.ipynb) [![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/04-Model/02-Cache.ipynb)

## Overview

`LangChain` provides an optional caching layer for LLMs.

This is useful for two reasons:
- When requesting the same completions multiple times, it can **reduce the number of API calls** to the LLM provider and thus save costs.
- By **reducing the number of API calls** to the LLM provider, it can **shorten the running time of the application.**
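
To make the two points above concrete, here is a minimal sketch of the idea behind an LLM cache. It is not LangChain's implementation: `fake_llm_call`, `cached_call`, and the `(prompt, model_name)` cache key are hypothetical names chosen for illustration, and the `time.sleep` call only simulates network latency.

```python
import time

# Counts how many times the (simulated) provider is actually called.
api_calls = 0


def fake_llm_call(prompt, model_name):
    """Hypothetical stand-in for a provider call; the sleep simulates latency."""
    global api_calls
    api_calls += 1
    time.sleep(0.05)
    return f"[{model_name}] answer to: {prompt}"


# Toy cache keyed on the prompt and the model name (an assumption of this sketch).
_cache = {}


def cached_call(prompt, model_name="toy-model"):
    key = (prompt, model_name)
    if key not in _cache:
        # Cache miss: pay for the call once and remember the answer.
        _cache[key] = fake_llm_call(prompt, model_name)
    return _cache[key]


first = cached_call("Summarize South Korea")
second = cached_call("Summarize South Korea")  # served from the cache
third = cached_call("Summarize Japan")         # different prompt, so a new call
print(api_calls)
```

Only two simulated provider calls are made for three requests, because the repeated prompt is answered from the dictionary.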

In this tutorial, we will use the `gpt-4o-mini` model through the OpenAI API and two kinds of cache, **InMemoryCache** and **SQLite Cache**.  
At the end of each section, we will compare wall times before and after caching.

Optionally, we will also use a local LLM served with vLLM.

### Table of Contents

- [Overview](#overview)
- [Environment Setup](#environment-setup)
- [InMemoryCache](#inmemorycache)
- [SQLite Cache](#sqlite-cache)
- [(Optional) With local model](#optional-with-local-model)
- [(Optional) InMemoryCache + Local LLM](#optional-inmemorycache--local-llm)
- [(Optional) SQLite Cache + Local LLM](#optional-sqlite-cache--local-llm)
----

## Environment Setup

Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.

**[Note]**
- `langchain-opentutorial` is a package that provides easy-to-use environment setup tools, helper functions, and utilities for the tutorials.
- See the [`langchain-opentutorial`](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) repository for more details.

In [1]:
%%capture --no-stderr
!pip install langchain-opentutorial

In [2]:
# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "langsmith",
        "langchain",
        "langchain_core",
        "langchain_community",
        "langchain_openai",
        # "vllm", # this is for optional section
    ],
    verbose=False,
    upgrade=False,
)

In [3]:
# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "OPENAI_API_KEY": "Your API KEY",
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "Caching",
    }
)

Environment variables have been set successfully.


In [4]:
# Alternatively, load environment variables from a .env file with load_dotenv
from dotenv import load_dotenv


load_dotenv()

False

In [5]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

# Create model
llm = ChatOpenAI(model_name="gpt-4o-mini")

# Generate prompt
prompt = PromptTemplate.from_template(
    "Summarize {country} in about 200 characters"
)

# Create chain
chain = prompt | llm

In [6]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response.content)

South Korea is a highly developed country in East Asia known for its technological advancements, vibrant culture, and economic prosperity. It has a rich history, beautiful landscapes, and a strong emphasis on education. The country is also a major player in the global economy and a leading exporter of electronics, automobiles, and other goods.
CPU times: total: 46.9 ms
Wall time: 1.2 s



## InMemoryCache
First, we cache the answer to a question using `InMemoryCache`.

In [18]:
from langchain_core.globals import set_llm_cache
from langchain_core.caches import InMemoryCache

# Set InMemoryCache
set_llm_cache(InMemoryCache())

In [8]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response.content)

South Korea is a technologically advanced country known for its fast-paced lifestyle, vibrant culture, and delicious cuisine. It is a leader in industries such as electronics, automotive, and entertainment. The country also has a rich history and beautiful landscapes, making it a popular destination for tourists.
CPU times: total: 0 ns
Wall time: 996 ms


Now we invoke the chain with the same question.

In [10]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response.content)

South Korea is a technologically advanced country known for its fast-paced lifestyle, vibrant culture, and delicious cuisine. It is a leader in industries such as electronics, automotive, and entertainment. The country also has a rich history and beautiful landscapes, making it a popular destination for tourists.
CPU times: total: 0 ns
Wall time: 3 ms
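
The `%%time` magic used above is only available in IPython and Jupyter. In a plain Python script, the same before-and-after comparison can be sketched with `time.perf_counter`. The `timed` helper and the `slow_square` function below are hypothetical names for illustration; `slow_square` merely stands in for a slow call such as `chain.invoke`.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def slow_square(x):
    """Hypothetical slow function standing in for a real LLM call."""
    time.sleep(0.02)
    return x * x


value, seconds = timed(slow_square, 7)
print(f"result={value}, wall time={seconds * 1000:.1f} ms")
```

With the chain from this tutorial, the call would look like `timed(chain.invoke, {"country": "South Korea"})`, run once before and once after the answer has been cached.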


Note that calling `set_llm_cache(InMemoryCache())` again replaces the cache with a new, empty one, so the cached answer is lost and the wall time increases again.

In [11]:
set_llm_cache(InMemoryCache())

In [12]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response.content)

South Korea is a tech-savvy, modern country known for its vibrant culture, delicious cuisine, and booming economy. It is a highly developed nation with advanced infrastructure, high standards of living, and a strong emphasis on education. The country also has a rich history and is famous for its K-pop music and entertainment industry.
CPU times: total: 0 ns
Wall time: 972 ms


## SQLite Cache
Now, we cache the answer to the same question by using SQLite cache.

In [13]:
from langchain_community.cache import SQLiteCache
from langchain_core.globals import set_llm_cache
import os

# Create cache directory (no error if it already exists)
os.makedirs("cache", exist_ok=True)

# Set SQLiteCache
set_llm_cache(SQLiteCache(database_path="cache/llm_cache.db"))

In [14]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response.content)

South Korea is a technologically advanced country in East Asia, known for its booming economy, vibrant pop culture, and rich history. It is home to K-pop, Samsung, and delicious cuisine like kimchi. The country also faces tensions with North Korea and strives for reunification.
CPU times: total: 31.2 ms
Wall time: 953 ms


Now we invoke the chain with the same question.

In [15]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response.content)

South Korea is a technologically advanced country in East Asia, known for its booming economy, vibrant pop culture, and rich history. It is home to K-pop, Samsung, and delicious cuisine like kimchi. The country also faces tensions with North Korea and strives for reunification.
CPU times: total: 375 ms
Wall time: 375 ms


Note that with SQLite Cache, setting the cache again does not delete the stored entries, because they are saved in the database file.

In [16]:
set_llm_cache(SQLiteCache(database_path="cache/llm_cache.db"))

In [17]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response.content)

South Korea is a technologically advanced country in East Asia, known for its booming economy, vibrant pop culture, and rich history. It is home to K-pop, Samsung, and delicious cuisine like kimchi. The country also faces tensions with North Korea and strives for reunification.
CPU times: total: 0 ns
Wall time: 4.01 ms
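
The different behavior of the two caches when they are set again can be sketched with the standard library alone. This is not how LangChain implements either cache: `ToyMemoryStore`, `ToySQLiteStore`, and the `toy_cache` table are hypothetical names, and the database is written to a temporary directory rather than to `cache/llm_cache.db`.

```python
import os
import sqlite3
import tempfile


class ToyMemoryStore:
    """Entries live in a dict owned by this object."""

    def __init__(self):
        self._data = {}

    def get(self, prompt):
        return self._data.get(prompt)

    def put(self, prompt, response):
        self._data[prompt] = response


class ToySQLiteStore:
    """Entries live in a SQLite file, so they outlive any single object."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS toy_cache "
            "(prompt TEXT PRIMARY KEY, response TEXT)"
        )

    def get(self, prompt):
        row = self._conn.execute(
            "SELECT response FROM toy_cache WHERE prompt = ?", (prompt,)
        ).fetchone()
        return row[0] if row else None

    def put(self, prompt, response):
        with self._conn:  # commits the transaction on success
            self._conn.execute(
                "INSERT OR REPLACE INTO toy_cache VALUES (?, ?)",
                (prompt, response),
            )

    def close(self):
        self._conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "toy_cache.db")

memory = ToyMemoryStore()
memory.put("q", "a")
memory = ToyMemoryStore()      # analogous to setting a new InMemoryCache
memory_hit = memory.get("q")   # None: the new object starts empty

disk = ToySQLiteStore(db_path)
disk.put("q", "a")
disk.close()
disk = ToySQLiteStore(db_path)  # analogous to re-creating the cache on the same file
disk_hit = disk.get("q")        # "a": read back from the file
disk.close()
```

The in-memory store forgets its entry once the object is replaced, while the file-backed store returns it again after being reopened on the same path.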


## (Optional) With local model
In this optional section, we use `docker` to serve a local LLM.
Note that we used Miniconda to simplify setting up the Python environment.

### Device & Serving information - Windows
- CPU : AMD 5600X
- OS : Windows 10 Pro
- RAM : 32 GB
- GPU : NVIDIA 3080Ti, 12GB VRAM
- CUDA : 12.6
- Driver Version : 560.94
- Docker Image : nvidia/cuda:12.4.1-cudnn-devel-ubuntu20.04
- model : Qwen/Qwen2.5-0.5B-Instruct
- Python version : 3.10
- docker run script :
    ```
    docker run -itd --name vllm --gpus all --entrypoint /bin/bash -p 6001:8888 nvidia/cuda:12.4.1-cudnn-devel-ubuntu20.04
    ```
- vllm serving script : 
    ```
    python3 -m vllm.entrypoints.openai.api_server --model='Qwen/Qwen2.5-0.5B-Instruct' --served-model-name 'qwen-2.5' --port 8888 --host 0.0.0.0 --gpu-memory-utilization 0.80 --max-model-len 4096 --swap-space 1 --dtype bfloat16 --tensor-parallel-size 1 
    ```

### Device & Serving information - macOS
- Device : M2 MacBook Air 15
- RAM : 16GB
- macOS : Sequoia 15.1.1
- Docker Image : Built from the [docker image](https://docs.vllm.ai/en/latest/getting_started/arm-installation.html) instructions in the official vLLM documentation.

In [18]:
from langchain_community.llms import VLLMOpenAI

# Create the model using the OpenAI-compatible class VLLMOpenAI
llm = VLLMOpenAI(
    model="qwen-2.5", openai_api_key="EMPTY", openai_api_base="http://localhost:6001/v1"
)

# Generate prompt
prompt = PromptTemplate.from_template(
    "Summarize {country} in about 200 characters"
)

# Create chain
chain = prompt | llm

## (Optional) InMemoryCache + Local LLM
As in the InMemoryCache section above, we set `InMemoryCache`.

In [19]:
from langchain_core.globals import set_llm_cache
from langchain_core.caches import InMemoryCache

# Set InMemoryCache
set_llm_cache(InMemoryCache())

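Invoke the chain with the local LLM. Note that we print **response**, not **response.content**, because `VLLMOpenAI` is a completion-style LLM that returns a plain string, whereas `ChatOpenAI` returns a message object.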

In [20]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response)

.
South Korea is a country in East Asia, with a population of approximately 55.2 million as of 2023. It borders North Korea to the east, Japan to the northeast, and China to the southeast. The country is known for its advanced technology, leading industries, and significant contributions to South Korean culture. It is often referred to as the "Globe and a Couple" due to its diverse landscapes, rich history, and frontiers with neighboring countries. South Korea's economy is growing, with a strong technological sector and a strong economy, making it a significant player on the global stage. Overall, South Korea is a significant global player, with a rich history, advanced technology, and a cultural influence. With its advanced technology and unique culture, South Korea is a fascinating country to explore. Its diverse landscapes, rich history, and remarkable economic performance have made it a popular destination for travelers. South Korea's contribution to the global economy and its stro

Now we invoke the chain again with the same question.

In [21]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response)

.
South Korea is a country in East Asia, with a population of approximately 55.2 million as of 2023. It borders North Korea to the east, Japan to the northeast, and China to the southeast. The country is known for its advanced technology, leading industries, and significant contributions to South Korean culture. It is often referred to as the "Globe and a Couple" due to its diverse landscapes, rich history, and frontiers with neighboring countries. South Korea's economy is growing, with a strong technological sector and a strong economy, making it a significant player on the global stage. Overall, South Korea is a significant global player, with a rich history, advanced technology, and a cultural influence. With its advanced technology and unique culture, South Korea is a fascinating country to explore. Its diverse landscapes, rich history, and remarkable economic performance have made it a popular destination for travelers. South Korea's contribution to the global economy and its stro
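
Because the chat model and the local completion model return different types, code that has to work with both chains can normalize the result first. The helper below is a minimal sketch under the assumption that a chat response exposes its text as a string in `.content`; `to_text` and `FakeMessage` are hypothetical names used only for illustration.

```python
def to_text(response):
    """Return the text of a response, whether it is a message-like object or a str."""
    if isinstance(response, str):
        # Completion-style LLMs return the generated text directly.
        return response
    # Chat models return a message object that exposes `.content`.
    return response.content


class FakeMessage:
    """Hypothetical stand-in for a chat message, used to exercise the helper."""

    def __init__(self, content):
        self.content = content


chat_text = to_text(FakeMessage("hello from a chat model"))
llm_text = to_text("hello from a completion model")
print(chat_text)
print(llm_text)
```

With such a helper, `print(to_text(response))` would work for both the `ChatOpenAI` chain and the `VLLMOpenAI` chain.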

## (Optional) SQLite Cache + Local LLM
As in the SQLite Cache section above, we set SQLite Cache.  
Note that we name the database **vllm_cache.db** to distinguish it from the cache used in the SQLite Cache section.

In [23]:
from langchain_community.cache import SQLiteCache
from langchain_core.globals import set_llm_cache
import os

# Create cache directory (no error if it already exists)
os.makedirs("cache", exist_ok=True)

# Set SQLiteCache
set_llm_cache(SQLiteCache(database_path="cache/vllm_cache.db"))

Invoke the chain with the local LLM. Again, note that we print **response**, not **response.content**.

In [24]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response)

.

South Korea, a nation that prides itself on its history, culture, and natural beauty. Known for its bustling cityscapes, scenic valleys, and delicious cuisine. A major player in South East Asia and a global hub for technology, fashion, and entertainment. Home to industries like electronics, automotive, and media. With a strong economy, South Korea is among the top economies in the world, known for its efficient and inclusive societies. A country that has been a significant player in global politics for decades. The country is also home to many influential figures like Kim Jong-un and Kim Jong-un, who have led North Korea and the country’s military. Known for its national sports, including football (soccer), baseball, and gymnastics. South Korea is also home to many museums, art galleries, and historical sites, showcasing the country’s rich cultural heritage. The country is a leader in technology, with many leading companies based in the South Korean capital, Seoul. The South Korean 

Now we invoke the chain again with the same question.

In [25]:
%%time
# Invoke chain
response = chain.invoke({"country": "South Korea"})
print(response)

.

South Korea, a nation that prides itself on its history, culture, and natural beauty. Known for its bustling cityscapes, scenic valleys, and delicious cuisine. A major player in South East Asia and a global hub for technology, fashion, and entertainment. Home to industries like electronics, automotive, and media. With a strong economy, South Korea is among the top economies in the world, known for its efficient and inclusive societies. A country that has been a significant player in global politics for decades. The country is also home to many influential figures like Kim Jong-un and Kim Jong-un, who have led North Korea and the country’s military. Known for its national sports, including football (soccer), baseball, and gymnastics. South Korea is also home to many museums, art galleries, and historical sites, showcasing the country’s rich cultural heritage. The country is a leader in technology, with many leading companies based in the South Korean capital, Seoul. The South Korean 