<a href="https://colab.research.google.com/github/Kiana-M/notebooks/blob/main/vector-db/gptcache.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chat

[This example](https://gptcache.readthedocs.io/en/latest/bootcamp/openai/chat.html) shows how to chat with GPT. It is based on the [OpenAI Example](https://platform.openai.com/docs/guides/chat/introduction); the difference is that it also shows how to cache responses for exact and similar matches with **gptcache**. The change is small: you only need one extra step to initialize the cache.


In [1]:
! pip install -q gptcache


In [2]:
! pip install --upgrade openai

Collecting openai
  Downloading openai-0.27.8-py3-none-any.whl (73 kB)
Installing collected packages: openai
Successfully installed openai-0.27.8


Before running the example, make sure the `OPENAI_API_KEY` environment variable is set; you can check with `echo $OPENAI_API_KEY`. If it is not set, use `export OPENAI_API_KEY=YOUR_API_KEY` on Unix/Linux/macOS or `set OPENAI_API_KEY=YOUR_API_KEY` on Windows.

> In Colab, we can set the environment variable from Python through `os.environ`.

In [4]:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"  # placeholder: never commit a real key to a shared notebook
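
Writing a key directly into a cell makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative, assuming an interactive session, the helper below reuses the key if the environment variable is already set and otherwise prompts for it without echoing. The name `load_api_key` and its `prompt` parameter are hypothetical and not part of gptcache or openai.

```python
import os
from getpass import getpass


def load_api_key(env_var="OPENAI_API_KEY", prompt=getpass):
    """Return the key from the environment, prompting only if it is missing.

    `prompt` is any callable taking a message and returning a string;
    it defaults to getpass so the typed key is not echoed in the cell output.
    """
    key = os.environ.get(env_var)
    if not key:
        key = prompt(f"Enter {env_var}: ")
        os.environ[env_var] = key  # make it visible to libraries that read the env
    return key
```

Calling `load_api_key()` once at the top of the notebook is enough; later cells that read `OPENAI_API_KEY` from the environment will find it.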

The following code shows how to use gptcache and how much it speeds up responses. It consists of three parts: the original OpenAI usage, the exact match cache, and the similar search cache.


## OpenAI API original usage

In [6]:
import time
import openai


def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']


question = 'what is faiss?'

# OpenAI API original usage
start_time = time.time()
response = openai.ChatCompletion.create(
  model='gpt-3.5-turbo',
  messages=[
    {
        'role': 'user',
        'content': question
    }
  ],
)
print(f'Question: {question}')
print("Time consuming: {:.2f}s".format(time.time() - start_time))
print(f'Answer: {response_text(response)}\n')

Question: what is faiss?
Time consuming: 2.79s
Answer: Faiss is an open-source library for efficient similarity search and clustering of dense vectors. It was developed by Facebook's AI Research (FAIR) team to optimize and accelerate large-scale similarity searches for machine learning applications.

Faiss utilizes techniques such as inverted indexing, hierarchical clustering, product quantization, and Polysemous codes to efficiently handle similarity search tasks. It provides both CPU and GPU implementations, allowing users to choose the best option for their hardware setup.

The library is widely used in various domains, including image and video search, recommendation systems, natural language processing, and computer vision tasks. It offers fast and memory-efficient solutions for searching and clustering large datasets with high-dimensional vectors, enabling efficient nearest neighbor searches and similarity-based retrieval operations.



## OpenAI API + GPTCache, exact match cache

Initialize the cache to run GPTCache and import `openai` from `gptcache.adapter`; this automatically sets up the map data manager, which matches prompts exactly. For more details, refer to [build your cache](https://gptcache.readthedocs.io/en/dev/usage.html#build-your-cache).

If you ask ChatGPT the exact same question twice, the answer to the second request is served from the cache without calling ChatGPT again.
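
Conceptually, an exact-match cache behaves like a dictionary keyed on the prompt text. The sketch below is a toy illustration of that idea only; it is not GPTCache's implementation, and `ExactMatchCache` and `fake_llm` are hypothetical names standing in for the real cache and the real API call.

```python
class ExactMatchCache:
    """Toy exact-match cache: the prompt string itself is the lookup key."""

    def __init__(self, llm_call):
        self._llm_call = llm_call  # function: prompt -> answer
        self._store = {}
        self.hits = 0
        self.misses = 0

    def ask(self, prompt):
        if prompt in self._store:
            self.hits += 1
            return self._store[prompt]
        self.misses += 1
        answer = self._llm_call(prompt)  # only reached on a cache miss
        self._store[prompt] = answer
        return answer


calls = []


def fake_llm(prompt):
    """Stand-in for the slow remote call; records every real request."""
    calls.append(prompt)
    return f"answer to: {prompt}"


exact_cache = ExactMatchCache(fake_llm)
first = exact_cache.ask("what is faiss?")
second = exact_cache.ask("what is faiss?")  # identical string: served from the dict
third = exact_cache.ask("What is FAISS?")   # different string: a miss
```

Note that the third request misses even though it asks the same thing, because the key is the literal string. That limitation is what the similar search cache addresses.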

In [9]:
import time


def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']

print("Cache loading.....")

# To use GPTCache, that's all you need
# -------------------------------------------------
from gptcache import cache
from gptcache.adapter import openai

cache.init()
cache.set_openai_key()
# -------------------------------------------------

question = "what are some common exclusion criteria for phase 3 NSCLC trials for patients with metastatic, EGFR"
for _ in range(2):
    start_time = time.time()
    response = openai.ChatCompletion.create(
      model='gpt-3.5-turbo',
      messages=[
        {
            'role': 'user',
            'content': question
        }
      ],
    )
    print(f'Question: {question}')
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
    print(f'Answer: {response_text(response)}\n')

Cache loading.....
Question: what are some common exclusion criteria for phase 3 NSCLC trials for patients with metastatic, EGFR
Time consuming: 5.28s
Answer: There are several common exclusion criteria for phase 3 NSCLC trials in patients with metastatic EGFR:

1. Prior systemic therapy: Patients who have received prior systemic therapy for advanced NSCLC may be excluded as they may have already been exposed to different treatment regimens that could impact the outcomes of the trial.

2. Prior EGFR-targeted therapy: Patients who have previously received EGFR-targeted therapy, such as tyrosine kinase inhibitors (TKIs) like gefitinib, erlotinib, or osimertinib, may be excluded to avoid off-target effects and potential drug resistance.

3. Presence of brain metastases: Patients with active or untreated brain metastases may be excluded due to the potential for compromised neurological function and the need for specific treatment strategies for central nervous system involvement.

4. Non-m

## OpenAI API + GPTCache, similar search cache

Initialize the cache with `embedding_func` to generate embeddings for the text, `data_manager` to manage the cache data, and `similarity_evaluation` to evaluate similarity. For more details, refer to [build your cache](https://gptcache.readthedocs.io/en/dev/usage.html#build-your-cache).

Once ChatGPT has answered a question, the answers to subsequent similar questions can be retrieved from the cache without requesting ChatGPT again.
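
The three pieces map onto a simple lookup loop: embed the question, find the closest stored embedding, and reuse its answer if the score clears a threshold. The sketch below illustrates that flow with a toy bag-of-words embedding and cosine similarity. It is an assumption-laden stand-in, not GPTCache's code: `embed`, `cosine`, `SimilarityCache`, `fake_llm`, and the `0.5` threshold are all hypothetical, and a real setup would use the ONNX model and a vector index instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SimilarityCache:
    def __init__(self, llm_call, threshold=0.5):
        self._llm_call = llm_call
        self._threshold = threshold
        self._entries = []  # list of (embedding, answer) pairs

    def ask(self, prompt):
        """Return (answer, was_cache_hit)."""
        query = embed(prompt)
        best_score, best_answer = 0.0, None
        for vec, answer in self._entries:  # linear scan; a vector index replaces this
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        if best_answer is not None and best_score >= self._threshold:
            return best_answer, True
        answer = self._llm_call(prompt)
        self._entries.append((query, answer))
        return answer, False


llm_calls = []


def fake_llm(prompt):
    """Stand-in for the remote call; records every real request."""
    llm_calls.append(prompt)
    return f"answer to: {prompt}"


sim_cache = SimilarityCache(fake_llm, threshold=0.5)
a1, hit1 = sim_cache.ask("what is faiss?")
a2, hit2 = sim_cache.ask("can you explain what faiss is")
a3, hit3 = sim_cache.ask("give me some light-hearted movie recommendations")
```

Here the second question reuses the first answer, while the unrelated third question goes to the (fake) model. The threshold controls the trade-off: lower values give more hits but raise the risk of returning an answer to a different question.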

In [14]:
import time


def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']

from gptcache import cache
from gptcache.adapter import openai
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

print("Cache loading.....")

onnx = Onnx()
data_manager = get_data_manager(CacheBase("sqlite"), VectorBase("faiss", dimension=onnx.dimension))
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),
    )
cache.set_openai_key()

questions = [
    "what is faiss?",
    "can you explain what faiss is",
    "can you tell me more about faiss",
    "what is the purpose of faiss"
]

for question in questions:
    start_time = time.time()
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[
            {
                'role': 'user',
                'content': question
            }
        ],
    )
    print(f'Question: {question}')
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
    print(f'Answer: {response_text(response)}\n')

Cache loading.....
Question: what is faiss?
Time consuming: 2.42s
Answer: Faiss is an open-source library for efficient similarity search and clustering of dense vectors. It is designed to efficiently handle large-scale datasets with millions or billions of vectors. Faiss implements various state-of-the-art algorithms and data structures, such as Index Flat, Index IVF, and Index HNSW, to enable fast similarity searches and nearest neighbor queries. It is primarily used in the field of deep learning and information retrieval for tasks like image search, document retrieval, and recommendation systems.

Question: can you explain what faiss is
Time consuming: 0.81s
Answer: Faiss is an open-source library for efficient similarity search and clustering of dense vectors. It is designed to efficiently handle large-scale datasets with millions or billions of vectors. Faiss implements various state-of-the-art algorithms and data structures, such as Index Flat, Index IVF, and Index HNSW, to enabl

## How exact is similarity search?
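
The questions in this section include pairs that differ by a single word but ask for opposite things ("inclusion" versus "exclusion"), plus one unrelated question. As a rough illustration of why such pairs are a hard case for a similarity cache, the sketch below scores them with a toy token-overlap (Jaccard) measure. This is not what the ONNX embedding or `SearchDistanceEvaluation` computes; the helper names are hypothetical and the numbers only show how close the wording is.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Shared tokens divided by all distinct tokens across both texts."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


inclusion = ("what are some common inclusion criteria for phase 3 NSCLC "
             "trials for patients with metastatic, EGFR")
exclusion = ("what are some common exclusion criteria for phase 3 NSCLC "
             "trials for patients with metastatic, EGFR")
movies = "give me some light-hearted movie recommendations"

near_duplicate_score = jaccard(inclusion, exclusion)  # 14 shared of 16 distinct tokens
unrelated_score = jaccard(inclusion, movies)          # only "some" is shared

print(f"inclusion vs exclusion: {near_duplicate_score:.3f}")
print(f"inclusion vs movies:    {unrelated_score:.3f}")
```

A cache that relies on surface similarity alone could treat the inclusion and exclusion questions as the same request, which is what the cell below checks against the real embedding.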

In [18]:
import time


def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']

from gptcache import cache
from gptcache.adapter import openai
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

print("Cache loading.....")

onnx = Onnx()
data_manager = get_data_manager(CacheBase("sqlite"), VectorBase("faiss", dimension=onnx.dimension))
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),
    )
cache.set_openai_key()

questions = [
    "what are some common inclusion criteria for phase 3 NSCLC trials for patients with metastatic, EGFR",
    "tell me some common inclusion criteria for phase 3 NSCLC trials for patients with metastatic, EGFR",
    "what are some common inclusion criteria for phase 3 NSCLC trials for metastatic, EGFR patients",
    "what are some common exclusion criteria for phase 3 NSCLC trials for patients with metastatic, EGFR",
    "give me some light-hearted movie recommendations"
]

for question in questions:
    start_time = time.time()
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[
            {
                'role': 'user',
                'content': question
            }
        ],
    )
    print(f'Question: {question}')
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
    print(f'Answer: {response_text(response)}\n')

Cache loading.....
Question: what are some common inclusion criteria for phase 3 NSCLC trials for patients with metastatic, EGFR
Time consuming: 4.66s
Answer: Some common inclusion criteria for phase 3 NSCLC (non-small cell lung cancer) trials for patients with metastatic EGFR-mutant NSCLC (non-small cell lung cancer) may include the following:

1. Histologically or cytologically confirmed diagnosis of non-small cell lung cancer.
2. Advanced or metastatic stage of the disease.
3. Presence of EGFR (epidermal growth factor receptor) mutation.
4. Age within a certain range (e.g., 18-75 years).
5. Adequate performance status (e.g., ECOG performance status 0-2).
6. Adequate organ function (e.g., renal, hepatic, hematopoietic, etc.).
7. Availability of archived tissue or willingness to undergo a new biopsy for molecular testing.
8. Measurable or evaluable disease as per RECIST (Response Evaluation Criteria In Solid Tumors) criteria.
9. Adequate washout period since any previous anti-cancer t