<a href="https://colab.research.google.com/github/colinmcnamara/Learning_Langchain_Pub/blob/main/caching_embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip -q install langchain openai tiktoken faiss-gpu
# use `pip install -q faiss-cpu` instead if your instance has no GPU


# Caching Embeddings

Embeddings can be stored or temporarily cached to avoid needing to recompute them.

Caching embeddings can be done using a `CacheBackedEmbeddings`.

The cache backed embedder is a wrapper around an embedder that caches
embeddings in a key-value store.

The text is hashed and the hash is used as the key in the cache.
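
As a minimal sketch of that idea (not LangChain's actual implementation), a key can be derived by hashing the text and prefixing the result with a namespace. The SHA-1 digest, the UUID step, and the `SKETCH_UUID_NAMESPACE` constant are assumptions made for illustration, and `cache_key` is a hypothetical helper.

```python
import hashlib
import uuid

# Hypothetical UUID namespace, chosen only for this sketch.
SKETCH_UUID_NAMESPACE = uuid.UUID(int=1985)


def cache_key(text: str, namespace: str = "") -> str:
    """Return a deterministic key: the namespace followed by a UUID of the text's hash."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return namespace + str(uuid.uuid5(SKETCH_UUID_NAMESPACE, digest))


key_a = cache_key("hello", namespace="text-embedding-ada-002")
key_b = cache_key("hello", namespace="text-embedding-ada-002")
key_c = cache_key("goodbye", namespace="text-embedding-ada-002")

# The same text always maps to the same key; different texts map to different keys.
print(key_a == key_b, key_a == key_c)  # True False
```

Because the key depends only on the text and the namespace, a later request for the same text can find the stored embedding without calling the embedder.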


The main supported way to initialize a `CacheBackedEmbeddings` is `from_bytes_store`. It takes the following parameters:

- underlying_embedder: The embedder to use for embedding.
- document_embedding_cache: The cache to use for storing document embeddings.
- namespace: (optional, defaults to `""`) The namespace to use for the document cache. It prevents collisions with other caches that share the same store. For example, set it to the name of the embedding model used.

**Attention**: Be sure to set the `namespace` parameter so that the same text embedded with different embedding models does not collide in the cache.
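
The collision can be reproduced with a toy cache in plain Python. Everything in this sketch is hypothetical: `embed_with_cache`, `model_small`, and `model_large` are stand-ins, and the key is simply the namespace plus the raw text rather than a hash.

```python
from typing import Callable, Dict, List

Vector = List[float]


def embed_with_cache(
    text: str,
    embed_fn: Callable[[str], Vector],
    store: Dict[str, Vector],
    namespace: str = "",
) -> Vector:
    """Return the cached vector for `text`, computing and storing it on a miss."""
    key = namespace + text  # simplified key; a real cache would hash the text
    if key not in store:
        store[key] = embed_fn(text)
    return store[key]


# Two hypothetical "models" that produce vectors of different sizes.
def model_small(text: str) -> Vector:
    return [float(len(text))] * 2


def model_large(text: str) -> Vector:
    return [float(len(text))] * 4


# Without a namespace, the second model silently receives the first model's vector.
shared: Dict[str, Vector] = {}
embed_with_cache("hello", model_small, shared)
wrong = embed_with_cache("hello", model_large, shared)
print(len(wrong))  # 2, although model_large produces 4 values

# With one namespace per model, each model gets its own cache entry.
separated: Dict[str, Vector] = {}
a = embed_with_cache("hello", model_small, separated, namespace="small:")
b = embed_with_cache("hello", model_large, separated, namespace="large:")
print(len(a), len(b))  # 2 4
```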

In [None]:
!pip install -q colab_env

  Preparing metadata (setup.py) ... done
  Building wheel for colab_env (setup.py) ... done


In [None]:
import colab_env
import langchain
import openai
import os

Mounted at /content/gdrive


In [None]:
openai_api_key=os.environ['OPENAI_API_KEY']

In [None]:
from langchain.storage import InMemoryStore, LocalFileStore, RedisStore
from langchain.embeddings import OpenAIEmbeddings, CacheBackedEmbeddings

## Using with a vectorstore

First, let's look at an example that stores embeddings on the local file system and uses a FAISS vector store for retrieval.

In [None]:
from langchain.document_loaders import TextLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

In [None]:
underlying_embeddings = OpenAIEmbeddings()

In [None]:
fs = LocalFileStore("./cache/")

cached_embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings, fs, namespace=underlying_embeddings.model
)

The cache is empty prior to embedding:

In [None]:
list(fs.yield_keys())

[]

Load the document, split it into chunks, embed each chunk and load it into the vector store.

In [None]:
raw_documents = TextLoader("/content/gdrive/MyDrive/state_of_the_union.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)
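
To make the splitting step concrete, here is a rough sketch of separator-based chunking in plain Python. It is an approximation written for illustration, not `CharacterTextSplitter`'s implementation: `split_text` is a hypothetical helper, and the blank-line separator is an assumption.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, separator: str = "\n\n") -> List[str]:
    """Split on `separator`, then pack consecutive pieces into chunks of at most `chunk_size` characters."""
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Keep growing the current chunk (an oversized single piece is kept whole).
            current = candidate
        else:
            # The next piece would overflow the chunk, so start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
chunks = split_text(sample, chunk_size=10)
print(chunks)  # ['aaaa\n\nbbbb', 'cccc']
```

Each resulting chunk is embedded separately, so each chunk also gets its own cache entry.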

Create the vector store:

In [None]:
%%time
db = FAISS.from_documents(documents, cached_embedder)

CPU times: user 1.14 s, sys: 105 ms, total: 1.24 s
Wall time: 3.3 s


If we create the vector store again, it is much faster because the embeddings are read from the cache instead of being recomputed.

In [None]:
%%time
db2 = FAISS.from_documents(documents, cached_embedder)

CPU times: user 59.4 ms, sys: 3.19 ms, total: 62.6 ms
Wall time: 106 ms


Here are some of the cache keys that were created:

In [None]:
list(fs.yield_keys())[:5]

['text-embedding-ada-0025ba09d7e-6a58-5c76-b038-5d8636e5ea25',
 'text-embedding-ada-00281426526-23fe-58be-9e84-6c7c72c8ca9a',
 'text-embedding-ada-002b793db35-a909-5ba0-8c51-314dc776017d',
 'text-embedding-ada-00201dbc21f-5e4c-5fb5-8d13-517dbe7a32d4',
 'text-embedding-ada-002464862c8-03d2-5854-b32c-65a075e612a2']

## In Memory

This section shows how to set up an in-memory cache for embeddings. This type of cache is primarily
useful for unit tests or prototyping. Do **not** use it if the embeddings need to persist, because its contents are lost when the process exits.

In [None]:
store = InMemoryStore()

In [None]:
underlying_embeddings = OpenAIEmbeddings()
embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings, store, namespace=underlying_embeddings.model
)

In [None]:
%%time
embeddings = embedder.embed_documents(["hello", "goodbye"])

CPU times: user 31.4 ms, sys: 4.28 ms, total: 35.7 ms
Wall time: 2.78 s


The second call takes only a few milliseconds because the embeddings are looked up in the cache instead of being recomputed.

In [None]:
%%time
embeddings_from_cache = embedder.embed_documents(["hello", "goodbye"])

CPU times: user 2.58 ms, sys: 0 ns, total: 2.58 ms
Wall time: 2.66 ms


In [None]:
embeddings == embeddings_from_cache

True
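
The lookup behaviour timed above can be sketched without any API calls. In this toy version, `CountingEmbedder` and `ToyCachedEmbedder` are hypothetical classes written for illustration; the cache is a plain dict keyed by the raw text, and only the texts missing from it are passed to the underlying embedder.

```python
from typing import Dict, List, Optional

Vector = List[float]


class CountingEmbedder:
    """Hypothetical stand-in for a real embedder; records every batch it is asked to embed."""

    def __init__(self) -> None:
        self.calls: List[List[str]] = []

    def embed_documents(self, texts: List[str]) -> List[Vector]:
        self.calls.append(list(texts))
        return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


class ToyCachedEmbedder:
    """Looks each text up in a dict first and embeds only the misses."""

    def __init__(self, underlying: CountingEmbedder, store: Dict[str, Vector]) -> None:
        self.underlying = underlying
        self.store = store

    def embed_documents(self, texts: List[str]) -> List[Vector]:
        vectors: List[Optional[Vector]] = [self.store.get(t) for t in texts]
        missing = [i for i, v in enumerate(vectors) if v is None]
        if missing:
            fresh = self.underlying.embed_documents([texts[i] for i in missing])
            for i, vec in zip(missing, fresh):
                self.store[texts[i]] = vec  # remember for next time
                vectors[i] = vec            # keep the original order
        return vectors


underlying = CountingEmbedder()
cached = ToyCachedEmbedder(underlying, store={})

first = cached.embed_documents(["hello", "goodbye"])
second = cached.embed_documents(["hello", "goodbye"])  # served entirely from the cache
third = cached.embed_documents(["hello", "welcome"])   # only "welcome" is a miss

print(underlying.calls)  # [['hello', 'goodbye'], ['welcome']]
```

The second call never reaches the underlying embedder, which is why the cached run above is so much faster than the first.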

## File system

This section covers how to use a file system store.

In [None]:
fs = LocalFileStore("./test_cache/")

In [None]:
embedder2 = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings, fs, namespace=underlying_embeddings.model
)

In [None]:
%%time
embeddings = embedder2.embed_documents(["hello", "goodbye"])

CPU times: user 14.5 ms, sys: 1.27 ms, total: 15.8 ms
Wall time: 522 ms


In [None]:
%%time
embeddings = embedder2.embed_documents(["hello", "goodbye"])

CPU times: user 3.83 ms, sys: 880 µs, total: 4.71 ms
Wall time: 16.4 ms


Here are the keys of the embeddings that have been persisted to the directory `./test_cache`.

Notice that each key starts with the `namespace` passed to the embedder, which here is the model name.

In [None]:
list(fs.yield_keys())

['text-embedding-ada-002e885db5b-c0bd-5fbc-88b1-4d1da6020aa5',
 'text-embedding-ada-0026ba52e44-59c9-5cc9-a084-284061b13c80']
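
Why a file system store survives a restart can be shown with a small standard-library sketch. `ToyFileStore` is a hypothetical class, not `LocalFileStore` itself; its `mset`, `mget`, and `yield_keys` methods only mirror the names of the key-value store interface, and serializing vectors as JSON bytes is an assumption made for this example.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator, List, Optional, Sequence, Tuple


class ToyFileStore:
    """Hypothetical byte store that keeps one file per key under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def mset(self, pairs: Sequence[Tuple[str, bytes]]) -> None:
        for key, value in pairs:
            (self.root / key).write_bytes(value)

    def mget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        values: List[Optional[bytes]] = []
        for key in keys:
            path = self.root / key
            values.append(path.read_bytes() if path.exists() else None)
        return values

    def yield_keys(self) -> Iterator[str]:
        for path in sorted(self.root.iterdir()):
            yield path.name


with tempfile.TemporaryDirectory() as tmp:
    store = ToyFileStore(tmp)
    vector = [0.1, 0.2, 0.3]
    store.mset([("demo-model-key-1", json.dumps(vector).encode("utf-8"))])

    # A new store object pointed at the same directory sees the same data.
    reopened = ToyFileStore(tmp)
    keys = list(reopened.yield_keys())
    restored = json.loads(reopened.mget(["demo-model-key-1"])[0].decode("utf-8"))
    missing = reopened.mget(["absent"])[0]

print(keys)      # ['demo-model-key-1']
print(restored)  # [0.1, 0.2, 0.3]
print(missing)   # None
```

Unlike the in-memory store, the data lives in files, so a fresh process pointed at the same directory starts with a warm cache.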