# GPTCache and Weaviate ♻️

This notebook shows how to configure GPTCache to use Weaviate as its vector store.

## Library Imports

In [1]:
from gptcache import cache
from gptcache.manager import get_data_manager, CacheBase, VectorBase
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation
from gptcache.embedding import OpenAI
import weaviate
import os
import sqlite3
from gptcache.adapter import openai
import timeit

## Configuration

### Use OpenAI for the embedding

In [2]:
openai_embedding_fn = OpenAI().to_embeddings

### Use SQLite to cache the requests/responses

In [3]:
cache_base = CacheBase("sqlite")


#### See what is currently in the SQLite database:

In [4]:
def dump_cache_content():
    # CacheBase("sqlite") stores its data in sqlite.db in the working directory
    connection = sqlite3.connect("sqlite.db")
    try:
        cursor = connection.cursor()
        tables = cursor.execute('SELECT * FROM sqlite_master WHERE type="table"').fetchall()

        for table in tables:
            # Print the table name as a delimiter (column 1 of sqlite_master is the name)
            print(f"Results for table {table[1]}:")
            print("------------------------")

            # Fetch and print every row in the table
            for row in cursor.execute(f'SELECT * FROM "{table[1]}"').fetchall():
                print(row)

            # Print a blank line to separate the output for each table
            print()
    finally:
        # Always release the database handle, even if a query fails
        connection.close()

In [5]:
# The database is currently empty

dump_cache_content()

Results for table gptcache:
------------------------



### Connect to Weaviate

In [None]:
url = os.getenv("WEAVIATE_URL")  # URL of your Weaviate instance
api_key = os.getenv("WEAVIATE_API_KEY")  # authentication key; leave unset if authentication is not configured
# Only build an auth config when a key is actually set
auth_config = weaviate.AuthApiKey(api_key=api_key) if api_key else None
vector_base = VectorBase("weaviate", url=url, auth_client_secret=auth_config)

#### Create a Weaviate client to query the database outside of GPTCache

In [None]:
weaviate_client = weaviate.Client(url=url, auth_client_secret=auth_config)

#### Fetch the class schema to test the connection

In [None]:
weaviate_class = "GPTCache"
weaviate_client.schema.get(class_name=weaviate_class)

#### Confirm the Weaviate database is empty

In [None]:
weaviate_client.data_object.get(class_name=weaviate_class)

### Initialize the cache

In [None]:
data_manager = get_data_manager(cache_base, vector_base)

cache.init(
    embedding_func=openai_embedding_fn,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(max_distance=1)
)

cache.set_openai_key()

Note:

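In `similarity_evaluation`, we set `max_distance=1` so that the similarity threshold calculation works with this evaluation metric and cosine distance (Weaviate's default distance metric). The threshold is scaled to the evaluator's range, so `max_distance` determines how small a search distance must be to count as a cache hit.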

References:

1. [Calculating rank threshold](https://github.com/zilliztech/GPTCache/blob/03a059704443961ae5b6ca243e3edc2dc15aeb2a/gptcache/adapter/adapter.py#L98C1-L107C10)

2. [Applying the rank threshold](https://github.com/zilliztech/GPTCache/blob/03a059704443961ae5b6ca243e3edc2dc15aeb2a/gptcache/adapter/adapter.py#L158C1-L176C18)
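
To make the note above concrete, here is a minimal sketch of how we read the threshold logic in the two references. It is a simplified re-implementation in plain Python, not GPTCache's actual code: the function names are ours, and the `0.8` value for `similarity_threshold`, the clamping behaviour, and a `cache_factor` of 1 are assumptions based on our reading of the linked commit.

```python
def distance_to_rank(distance, max_distance=1.0):
    """Turn a search distance into a rank where higher means more similar.

    Assumption: mirrors SearchDistanceEvaluation, which clamps the distance
    to [0, max_distance] and returns max_distance - distance.
    """
    clamped = min(max(distance, 0.0), max_distance)
    return max_distance - clamped


def rank_threshold(min_rank, max_rank, similarity_threshold=0.8, cache_factor=1.0):
    """Scale the similarity threshold into the evaluator's rank range."""
    threshold = (max_rank - min_rank) * similarity_threshold * cache_factor
    # Keep the threshold inside [min_rank, max_rank]
    return min(max(threshold, min_rank), max_rank)


def is_cache_hit(distance, max_distance=1.0, similarity_threshold=0.8):
    """A candidate counts as a hit when its rank reaches the threshold."""
    threshold = rank_threshold(0.0, max_distance, similarity_threshold)
    return distance_to_rank(distance, max_distance) >= threshold


# Try a few hypothetical cosine distances
for d in (0.05, 0.15, 0.5):
    print(f"distance={d}: cache hit = {is_cache_hit(d)}")
```

Under these assumptions, with `max_distance=1` a cached answer is reused only when the cosine distance to the new question is at most roughly 0.2.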

### Calculate the time it takes to query the LLM

In [None]:
def timeit_decorator(func):
    def wrapper(*args, **kwargs):
        # Time the execution of the function
        start_time = timeit.default_timer()
        result = func(*args, **kwargs)
        end_time = timeit.default_timer()

        # Print the time taken
        print(f"Time taken to run {func.__name__}: {end_time - start_time:.2f} seconds")

        return result

    return wrapper

In [None]:
@timeit_decorator
def get_openai_response(question):
    # Call the OpenAI API to get a response
    result = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
    )
    # Extract the response from the API result
    response = result["choices"][0]["message"]["content"]

    # Return the response
    return response

## Query Time

#### Let's first ask "Who is Barack Obama?"

In [None]:
question = "Who is Barack Obama?"

get_openai_response(question)

#### Let's ask the same question again and note how long the response takes

In [None]:
get_openai_response(question)

# Notice how the response time dropped from 4.12 seconds to 0.56 seconds on the second call

#### Let's rephrase the same question:

In [None]:
rephrase_question = "Tell me more about Barack Obama"
get_openai_response(rephrase_question)

# This question is very similar to the previous one, and the response is still very quick

#### Now let's look at the content stored in the SQLite database:

In [None]:
dump_cache_content()

# The question and its answer are now stored in the database

In [None]:
# To also return the stored embedding vectors, pass with_vector=True:
# weaviate_client.data_object.get(class_name=weaviate_class, with_vector=True)

weaviate_client.data_object.get(class_name=weaviate_class)

### Examples that didn't perform very well

##### Starting with real people

In [None]:
new_question = "Who is Joe Biden?"

get_openai_response(new_question)

In [None]:
another_new_question = "Who is Taylor Swift?"

get_openai_response(another_new_question)

In [None]:
get_openai_response("Who is Miley Cyrus?")

##### Trying with fictional characters

In [None]:
get_openai_response("Who is Antman?")

In [None]:
get_openai_response("Who is Spiderman?")

## Questions

1. Why does the cache treat questions about Barack Obama, Joe Biden, Taylor Swift, and Miley Cyrus as semantically similar?

2. How can we tweak semantic caching so that "Who is Barack Obama?" and "Who is Joe Biden?" are semantically distinct? Do we need a more sophisticated distance evaluation metric?
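
As one possible starting point for question 2, the sketch below shows how the largest accepted distance shrinks as the similarity threshold rises. It assumes the hit rule `max_distance - distance >= max_distance * similarity_threshold`, which is our reading of the adapter code referenced earlier with a minimum rank of 0. `max_accepted_distance` is a hypothetical helper, not part of GPTCache.

```python
def max_accepted_distance(similarity_threshold, max_distance=1.0):
    """Largest search distance that still counts as a cache hit.

    Assumption: a hit requires
    max_distance - distance >= max_distance * similarity_threshold,
    which rearranges to distance <= max_distance * (1 - similarity_threshold).
    """
    return max_distance * (1.0 - similarity_threshold)


# Compare a few candidate thresholds (values chosen only for illustration)
for threshold in (0.8, 0.9, 0.95):
    limit = max_accepted_distance(threshold)
    print(f"similarity_threshold={threshold}: accept distances up to {limit:.2f}")
```

If your GPTCache version exposes `similarity_threshold` on its `Config` object, passing a higher value to `cache.init` would be one way to try this. Whether a stricter threshold alone is enough to separate "Who is Barack Obama?" from "Who is Joe Biden?" remains an open question.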