In this exercise, we will build an embeddings DB and use an LLM to interact with it. This time we use the Google GenAI embedding function.

In [45]:
import chromadb
from chromadb.utils import embedding_functions
import os
from openai import OpenAI
from dotenv import load_dotenv
import json

#### Building the Embeddings DB

In [44]:
load_dotenv()  # read the API keys from a local .env file
api_key = os.getenv("GOOGLE_API_KEY")
open_api_key = os.getenv("OPENAI_API_KEY")
embedding_func = embedding_functions.GoogleGenerativeAiEmbeddingFunction(api_key=api_key)

In [25]:
CHROMA_PATH = "car_review_embeddings"
COLLECTION_NAME = "review_db"

In [19]:

client = chromadb.PersistentClient(CHROMA_PATH)

try:
    client.delete_collection(name=COLLECTION_NAME)
except ValueError:
    print("Collection does not exist. Creating Now")

collection = client.create_collection(name=COLLECTION_NAME,
                                      embedding_function=embedding_func,
                                      metadata={"hnsw:space": "cosine"},
                                      )
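
The `"hnsw:space": "cosine"` setting asks Chroma to rank neighbours by cosine distance, that is, one minus the cosine similarity of two embedding vectors. The sketch below only illustrates the metric in plain Python. The helper name `cosine_distance` is ours for illustration and is not part of the Chroma API.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Vectors pointing the same way are at distance 0, orthogonal ones at 1,
# and opposite ones at 2; the vector length does not matter.
same_direction = cosine_distance([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])
```

Smaller values therefore mean the review is closer in meaning to the query text.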

The dataset was built in the previous exercise.

In [28]:
# Load the review dataset from the JSON file
with open('archive/car_reviews.json', 'r') as openfile:
    data_dict = json.load(openfile)

In [29]:
ids = list(data_dict.keys())
# Use review_id rather than id to avoid shadowing the built-in id()
reviews = [data_dict[review_id]['review'] for review_id in ids]
metadata = [data_dict[review_id]['metadata'] for review_id in ids]

In [32]:
print(f"{ids[0]}\n{reviews[0]}\n{metadata[0]}")

review0
 On my trip to Maui I rented this van to drive to see if I would buy one in the future.    The handling and acceleration was decent for the  van with 4 kids.  It was my second day and was still driving it around and enjoying my vacation.    Was not aware of all the functions on the key controls.   Just lock and unlock.  I must have accidentally hit the open truck key and was not aware of it beenopen.  There was no beeping sound or flashing alert on the dash like my Toyota van.  I drove the car out of the car port and the trunk window got shattered because it was to high.   I believe this to be a design default.    My lost
{'Review_Title': 'No beeping alerts', 'Rating': 1.0, 'Vehicle_Year': 2017, 'Vehicle_Model': 'Dodge'}


In [22]:
collection.add(
    ids=ids,
    documents=reviews,
    metadatas=metadata,
)
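
A single `collection.add` call works for a small dataset. For a larger one, the client or the embedding API may limit how many documents can be sent at once, so it can help to add the reviews in slices. The following is a minimal sketch under that assumption. The helpers `chunked` and `add_in_batches` and the default `batch_size=100` are hypothetical choices, not part of Chroma.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def add_in_batches(collection, ids, documents, metadatas, batch_size=100):
    """Call collection.add once per batch, keeping ids, documents and metadata aligned."""
    batches = zip(
        chunked(ids, batch_size),
        chunked(documents, batch_size),
        chunked(metadatas, batch_size),
    )
    for id_batch, doc_batch, meta_batch in batches:
        collection.add(ids=id_batch, documents=doc_batch, metadatas=meta_batch)
```

With the variables from this notebook, the call would be `add_in_batches(collection, ids, reviews, metadata)`.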

To find the positive reviews, we restrict the query to documents whose __Rating__ metadata is greater than or equal to 3.

In [34]:
question = """
What's the key to great customer satisfaction
based on detailed positive reviews?
"""

good_reviews = collection.query(
    query_texts=[question],
    n_results=10,
    include=["documents"],
    where={"Rating": {"$gte": 3}},
)
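
The `where` argument is a plain dictionary, so a filter can be assembled programmatically and several conditions combined with `$and`. The sketch below builds such a dictionary from the metadata keys seen in this dataset (`Rating`, `Vehicle_Model`). The helper name `rating_filter` is hypothetical.

```python
def rating_filter(min_rating, model=None):
    """Build a Chroma `where` dict for reviews rated at least `min_rating`.

    If `model` is given, the filter also requires a matching Vehicle_Model.
    """
    conditions = [{"Rating": {"$gte": min_rating}}]
    if model is not None:
        conditions.append({"Vehicle_Model": {"$eq": model}})
    # A single condition is passed as-is; $and is only used for two or more.
    if len(conditions) == 1:
        return conditions[0]
    return {"$and": conditions}


positive_only = rating_filter(3)
positive_dodge = rating_filter(3, model="Dodge")
```

Passing `where=rating_filter(3)` to `collection.query` is equivalent to the filter used in the cell above.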

In [35]:
good_reviews['documents']

[[' Very  pleased! Great deal! So far so good!',
  ' its fun to drive',
  ' Best power for the price. And it looks great.',
  " All shoppers out there, look for the best value for your dollar, don't allow unnecessary upgrades, extended warranties, you can get these items later on your own. Take time to decide what you want.",
  ' Car is awesome!!!',
  ' Pure fun to drive need to drive more',
  ' Great practical vehicle.  Love the stow and go storage under the floor and radio with hard drive for my music.  Has front and rear heated seats and USB ports.  Kids love it!',
  ' If you purchase this car you want regreted.',
  ' gt package is the way to go,,,plenty of power, good fuel economy',
  ' This is the most fun I ever had driving a car........I wish they made it when I was younger !']]

In [36]:
reviews_str = ",".join(good_reviews["documents"][0])
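
Joining with a comma works, but several reviews contain commas themselves, so the boundaries between reviews become hard to see in the prompt. One possible alternative, shown here as a sketch, is to put each review on its own numbered line. The helper name `format_reviews` is hypothetical.

```python
def format_reviews(docs):
    """Return the reviews as a numbered list, one review per line."""
    return "\n".join(
        f"{number}. {doc.strip()}" for number, doc in enumerate(docs, start=1)
    )


example = format_reviews([" Car is awesome!!!", " its fun to drive"])
```

In this notebook the call would be `reviews_str = format_reviews(good_reviews["documents"][0])`.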

#### Interacting with LLM

Our objective is to generate a natural-language response to a question, based on context drawn from the embeddings DB we built.

In [50]:
# System prompt; the {} placeholder is filled with the retrieved reviews
context = """
You are a customer success employee at a large
car dealership. Use the following car reviews
to answer questions in bullet points: {}
"""

question = """
What's the key to great customer satisfaction
based on detailed positive reviews?
"""

In [51]:
open_ai = OpenAI(api_key=open_api_key)

ask_llm = open_ai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": context.format(reviews_str)},
        {"role": "user", "content": question},
    ],
    temperature=0,
    n=1,
)
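
The request follows a simple retrieval pattern: the retrieved reviews go into the system message and the question goes into the user message. To reuse that pattern for other questions, the message list can be built by a small function. This is a sketch only. `SYSTEM_TEMPLATE` and `build_messages` are hypothetical names, and the template text mirrors the prompt used above.

```python
SYSTEM_TEMPLATE = (
    "You are a customer success employee at a large car dealership. "
    "Use the following car reviews to answer questions in bullet points:\n"
    "{reviews}"
)


def build_messages(reviews_text, user_question):
    """Return chat messages with the reviews as system context and the question as user input."""
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE.format(reviews=reviews_text)},
        {"role": "user", "content": user_question.strip()},
    ]


messages = build_messages("1. Car is awesome!!!", "What do customers like most?")
```

The resulting list can be passed as the `messages` argument of `open_ai.chat.completions.create`.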


In [62]:
print(ask_llm.choices[0].message.content)

- Providing a great deal and best value for the price
- Offering a fun driving experience
- Avoiding unnecessary upgrades and extended warranties
- Allowing customers time to decide what they want
- Including practical features like stow and go storage, heated seats, and USB ports
- Ensuring plenty of power and good fuel economy
- Creating a fun and enjoyable driving experience
