In [None]:
# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# NVIDIA Retail Product Advisor AI Workflow


### Table of Contents
1. [Introduction](#introduction)
2. [Getting Started with LLMs](#llm)
3. [Retail Product Data (or Bring Your Own Data)](#data)
4. [Converting Product Data to Embeddings](#embedding)
5. [Retrieving the Right Products](#retrieval)
6. [Retrieval Augmented Generation (RAG)](#rag)
7. [Facilitating Conversational Flow with Function Calling & Tools](#function-calling)
8. [Product Advisor](#product-advisor)
9. [Deployment with FastAPI and React](#deployment)


### Introduction <a name="introduction"></a>

Generative AI & LLMs enable Retailers to build novel and innovative solutions that empower internal employees, reduce costs, and revolutionize the customer experience. As the world’s most advanced platform for accelerated computing, NVIDIA provides hardware and software designed to accelerate development and deployment of generative AI applications. 

Retailers have a lot of data, particularly about products, and their catalogs often run to millions of items. For customers and Retail employees alike, these vast catalogs are hard to navigate, and it is hard to get answers to questions about a product or about how different products compare.

To purchase something in a pre-internet world, a customer would go into a brick-and-mortar Retail store, where a sales associate intimately familiar with that store's products could assist and advise them. These sales associates are experts who suggest interesting products, perhaps ones the customer hasn't thought of, and guide the customer on their journey.

In a post-internet world, this personal touch has been lost, both because sales associates can't reasonably be expected to memorize millions of products and because customers now interact with websites through a web browser or mobile device. LLMs, combined with Retrieval Augmented Generation (RAG) techniques, are very capable of ingesting information about a product (or several products) and responding to customers to answer questions, provide more details, and compare and contrast different products.

LLMs and GenAI provide an incredible opportunity for Retailers to:
* Empower and augment employees and help them better understand products,
* Reduce costs of customer assistance, and
* Provide a delightful customer experience.

In the NVIDIA Retail Product Advisor AI Workflow, we'll show how to develop an LLM-powered RAG application that can ingest product catalog data, reason about how to respond or which tools to use, and retrieve appropriate products and answer questions about them. Specifically, we'll cover:

* How to use Large Language Models (LLMs)
* Using LLMs with Retail product data
* How to create embeddings from product information
* How to use those embeddings to retrieve products most similar to a given query
* How to empower LLMs to make decisions like responding normally or using tools like Search/Shopping Cart APIs
* How to put these pieces together into a single `ProductAdvisor` utility class
* How to deploy this in a FastAPI backend and interact with that backend through a chatbox in a React application

To start, we'll import several modules and libraries we'll use throughout this notebook.

In [1]:
from IPython.display import Image
from IPython.core.display import HTML
import json
from langchain.utils.math import cosine_similarity
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
import logging
import openai
from openai import OpenAI
from openai_function_calling import Function, Parameter
from openai_function_calling.tool_helpers import ToolHelpers
import os
import numpy as np
import pandas as pd
from pandas import DataFrame
from pydantic import BaseModel
import random
from typing import Optional

We'll also import several helpful utility data models, functions, and classes from the `nvretail` package included in this workflow.

In [2]:
from nvretail.cart import (
    Cart, add_to_cart_function, remove_from_cart_function, 
    modify_item_in_cart_function, view_cart_function, 
)
from nvretail.catalog import (
    Catalog, Embeddings, Product, Products, search_function
)
from nvretail.generate import (
    FunctionMetadata, FunctionResult, Message, Messages, ProductAdvisor
)

### Getting Started with LLMs <a name="llm"></a>

There are several ways to interact with LLMs. Two of the more common ways are: 

* **Self-managed** - Download an LLM, optimize it with [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM), deploy it with [Triton Inference Server](https://github.com/triton-inference-server/server) or [NeMo Inference Microservice](https://developer.nvidia.com/nemo-microservices-early-access), and interact with the deployed model using HTTP or gRPC.
* **Hosted API** - Using HTTP calls, interact with a model hosted by a provider, e.g. [NVIDIA NeMo LLM Service API](https://developer.nvidia.com/nemo-llm-service-early-access), [NVIDIA AI Playground](https://catalog.ngc.nvidia.com/ai-foundation-models), [OpenAI](https://openai.com/blog/openai-api), etc.

In this notebook and workflow, we will use the latter to get started in under two minutes while keeping the flexibility to switch to the former. Specifically, we will use the [Mixtral 8x7B Instruct](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/mixtral-8x7b) model, accessed through NVIDIA AI Playground, for generation, and an OpenAI model for function calling later in the notebook.


You can obtain API keys for both services by navigating to the links below and following the instructions.

* [NVIDIA_API_KEY](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/mixtral-8x7b/api) - Look for the "Generate Key" box on the middle right-hand side
* [OPENAI_API_KEY](https://platform.openai.com/account/api-keys) - Create an account, log in, and select API keys on the left hand side

Once you have your keys, set the following environment variables:

```bash
export NVIDIA_API_KEY=...
export OPENAI_API_KEY=...
```

Below, we configure OpenAI to use the key from the environment variable.

In [3]:
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
openai.api_key = OPENAI_API_KEY

LLMs generally provide two methods for receiving inputs and creating outputs:

* **Completion** - Given the textual input, generate N output tokens.
* **Chat** - Given a `list` of messages resembling `{"role": "user", "content": "What is your favorite color?"}`, generate N output tokens.
  * As we want to facilitate a conversational interaction between an AI Assistant and a customer, we'll use **Chat** models throughout this notebook and workflow.
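
To make the distinction concrete, the sketch below contrasts the two input shapes using plain Python data. The helper `to_completion_prompt` is hypothetical and only illustrates how a chat history could be flattened into a single string; real chat models apply their own model-specific chat template.

```python
# Completion-style input: a single string.
completion_input = "What is your favorite color?"

# Chat-style input: a list of role-tagged messages.
chat_input = [
    {"role": "system", "content": "You are a helpful AI chatbot."},
    {"role": "user", "content": "What is your favorite color?"},
]


def to_completion_prompt(messages: list[dict[str, str]]) -> str:
    """Hypothetical helper: flatten chat messages into one prompt string."""
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    # End with the assistant role so a completion model continues from there.
    lines.append("assistant:")
    return "\n".join(lines)


print(to_completion_prompt(chat_input))
```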

While self-managed LLMs and hosted LLMs provide HTTP APIs, higher-level frameworks have been developed to abstract these APIs and make it easier for developers to quickly compose more complex pipelines. Some example LLM orchestration frameworks are:

* [LangChain](https://www.langchain.com/)
* [LlamaIndex](https://www.llamaindex.ai/)
* [Haystack](https://www.haystackteam.com/)

In this notebook and workflow, we will use LangChain with its [LangChain Expression Language (LCEL)](https://python.langchain.com/docs/expression_language/). Despite this opinionated choice, swapping in another LLM orchestration framework or building more complex pipelines is straightforward.

In the code below, we create a `list` of messages: the first element is a system prompt giving instructions to the LLM, and the second is a user prompt containing an `{input}` template variable. Next, we construct our prompt from these messages, instantiate our LLM using the [NVIDIA LangChain endpoints](https://python.langchain.com/docs/integrations/providers/nvidia), and construct our chain.

The `StrOutputParser` takes the output of `llm` and parses it to a `str` type. Lastly, we invoke the chain - mapping the `input` key to the user input.
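
To build intuition for the `prompt | llm | StrOutputParser()` syntax, here is a toy sketch of pipe-style composition in plain Python. It is not LangChain's implementation: `Step`, `toy_prompt`, `toy_llm`, and `toy_parser` are hypothetical stand-ins, and the "LLM" simply echoes the last line of its input.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a chain component (not a LangChain class)."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)


# Stand-ins for prompt formatting, the LLM call, and output parsing.
toy_prompt = Step(lambda d: f"system: You are a helpful AI chatbot.\nuser: {d['input']}")
toy_llm = Step(lambda text: {"content": f"(echo) {text.splitlines()[-1]}"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_llm | toy_parser
print(toy_chain.invoke({"input": "Can you recommend a coffee mug?"}))
```

Each stage only needs to accept the previous stage's output, which is why components can be swapped without touching the rest of the chain.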

In [4]:
messages = [
    ('system', 'You are a helpful AI chatbot.'), 
    ('user', "{input}"),
]

In [5]:
user_input = "Can you recommend a coffee mug?"

prompt = ChatPromptTemplate.from_messages(messages)
llm = ChatNVIDIA(model="mixtral_8x7b")
chain = prompt | llm | StrOutputParser()
response: str = chain.invoke({"input": user_input})

In [6]:
print(response)

Of course! I'd be happy to help you find a coffee mug. Here's a recommendation that's popular on Amazon:

The Contigo Autoseal West Loop Stainless Steel Travel Mug is a highly-rated option that's perfect for coffee drinkers on-the-go. It has a vacuum-insulated stainless steel body that keeps drinks hot for up to 5 hours or cold for up to 12 hours. The Autoseal technology ensures that your drink stays inside the mug and doesn't spill, even when you're carrying it in your bag. It's also easy to clean and comes in a variety of colors.

However, if you're looking for something more unique or personalized, there are many other options available, such as mugs with funny sayings, motivational quotes, or even custom photo mugs. Just let me know if you have any specific preferences or requirements, and I can help you find the perfect mug!



We now have the basics of an LLM pipeline working. LLMs are incredibly powerful for generating text; however, they are prone to a) relying only on information they saw during training and b) hallucinating fake information and data.

We can overcome the first hurdle by taking a pre-trained model and fine-tuning it on data we expect it to see during deployment. This is a very useful strategy for improving the quality of responses and constraining the output of LLMs to a particular domain.

But this doesn't resolve the second hurdle: even if fine-tuned on a particular dataset, the LLM may still be prone to hallucinating. And if we update our product catalog and add, modify, or remove products, we will need to fine-tune again.

A simpler but powerful technique for overcoming this hurdle is to provide the LLM with information (called **context**) along with instructions on how to use it. While the LLM may still hallucinate, this technique generally constrains it to the information it has immediate access to. It also has the added benefit of allowing us to use the latest data: simply refresh the product catalog and the context provided to the LLM will change.

### Retail Product Data (or Bring Your Own Data) <a name="data"></a>

Below, we'll load Retail data. By default, this workflow provides a dataset of 100 products gathered from the NVIDIA Gear Store.

To use a different dataset, construct a CSV file using the following format:

```
category,subcategory,name,description,url,price,image

...
```

And modify the `filename` variable below.
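
Before loading a custom file, it can help to confirm the header matches the expected format. The following is a minimal sketch using only the standard library; `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names introduced here, and the sample row is made up for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = {
    "category", "subcategory", "name", "description", "url", "price", "image"
}


def missing_columns(csv_text: str) -> set[str]:
    """Return the required columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Header names are lower-cased, mirroring what `load_data` does below.
    header = {name.strip().lower() for name in (reader.fieldnames or [])}
    return REQUIRED_COLUMNS - header


# Made-up sample row, for illustration only.
sample = (
    "category,subcategory,name,description,url,price,image\n"
    'Drinkware,Mugs,Example Mug,"A plain, sturdy mug",'
    "https://example.com/mug,9.5,https://example.com/mug.jpg\n"
)
print(missing_columns(sample))
print(sorted(missing_columns("name,price\n")))
```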

In [7]:
def load_data(filename: str) -> DataFrame:
    """Load data"""
    # load data
    df = pd.read_csv(filename)
    df.columns = [i.lower() for i in df.columns]

    # process data
    df = df.astype({"price": np.float32})
    return df

In [8]:
filename = "/debug/data/gear-store.csv"
df = load_data(filename)

In [9]:
df.head()

| | category | subcategory | name | description | url | price | image |
|---|---|---|---|---|---|---|---|
| 0 | NVIDIA Electronics | Geforce | GEFORCE NOW $50 MEMBERSHIP GIFT CARD | GeForce NOW gift cards can be redeemed for eit... | https://gear.nvidia.com/GeForce-NOW-50-Members... | 50.0 | https://gear.nvidia.com/GetImage.ashx?Path=%7e... |
| 1 | NVIDIA Electronics | Geforce | NVIDIA® GEFORCE RTX™ 4090 | The NVIDIA® GeForce RTX® 4090 is the ultimate ... | https://gear.nvidia.com/NVIDIA-GeForce-RTX-409... | 1439.0 | https://gear.nvidia.com/GetImage.ashx?Path=%7e... |
| 2 | NVIDIA Electronics | Geforce | NVIDIA® GEFORCE RTX™ 4080 | The NVIDIA® GeForce RTX™ 4080 delivers the ult... | https://gear.nvidia.com/NVIDIA-GeForce-RTX-408... | 1079.0 | https://gear.nvidia.com/GetImage.ashx?Path=%7e... |
| 3 | NVIDIA Electronics | Geforce | NVIDIA® GEFORCE RTX™ 4070 | Get equipped for stellar gaming and creating w... | https://gear.nvidia.com/NVIDIA-GeForce-RTX-407... | 494.0 | https://gear.nvidia.com/GetImage.ashx?Path=%7e... |
| 4 | NVIDIA Electronics | Geforce | NVIDIA® GEFORCE™ RTX 4060TI 8GB | Game, stream, create. The GeForce RTX™ 4060 Ti... | https://gear.nvidia.com/NVIDIA-GeForce-RTX-406... | 359.0 | https://gear.nvidia.com/GetImage.ashx?Path=%7e... |


Additionally, we'll define a `pydantic.BaseModel` class to represent each product.

In [10]:
class Product(BaseModel):
    """Product"""

    name: str
    description: str
    url: str
    price: float
    image: str
    ratings: list[int]

Products = list[Product]

### Converting Product Data to Embeddings <a name="embedding"></a>

To use these products in more complex LLM pipelines, we first must convert each of them to an [embedding](https://en.wikipedia.org/wiki/Word_embedding), a numerical representation of that text.

The below helper function takes our `DataFrame`, constructs a `Product` for each row, and returns a list of products.

In [11]:
def create_products(df: DataFrame) -> Products:
    """create products"""
    products = []
    for _, row in df.iterrows():
        product = Product(
            name=row["name"],
            description=row["description"],
            url=row["url"],
            price=row["price"],
            image=row["image"],
            ratings=[random.randint(4, 5) for _ in range(random.randint(10, 50))],
        )
        products.append(product)
    return products

products: Products = create_products(df)

Next, we'll construct a prompt template and format the prompt with the respective details of each `Product`.

```
Name: {name}
Description: {description}
URL: {url}
Price: {price}
Rating: {rating}
```

In [12]:
def product_to_prompt(p: Product) -> str:
    """Converts a product to context"""
    prompt_template = """Name: {name}
    Description: {description}
    URL: {url}
    Price: {price}
    Rating: {rating}"""
    prompt = prompt_template.format(
        name=p.name,
        description=p.description,
        url=p.url,
        price=p.price,
        rating=round(sum(p.ratings) / len(p.ratings), 2),
    )
    return prompt

products_prompts: list[str] = [product_to_prompt(p) for p in products]

Next, we'll convert each of these prompts to an embedding. Specifically, we'll use the [NVIDIA Retrieval QA Embedding](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/nvolve-40k) model accessed through NVIDIA AI Playground and wrapped using the [NVIDIA LangChain endpoints](https://python.langchain.com/docs/integrations/providers/nvidia).

The NVIDIA Retrieval QA Embedding Model is a transformer encoder - a finetuned version of E5-Large-Unsupervised, with 24 layers and an embedding size of 1024, which is trained on private and public datasets as described in the Dataset and Training section. It supports a maximum input of 512 tokens.

Embedding models for text retrieval are typically trained using a bi-encoder architecture. This involves encoding a pair of sentences (for example, query and chunked passages) independently using the embedding model. Contrastive learning is used to maximize the similarity between the query and the passage that contains the answer, while minimizing the similarity between the query and sampled negative passages not useful to answer the question.
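
The contrastive objective described above is often written as an InfoNCE-style loss. The sketch below shows one common formulation on toy similarity scores. The exact loss, temperature, and negative-sampling strategy used to train this particular model are not stated here, so `info_nce_loss`, `temperature=0.05`, and the scores are illustrative assumptions.

```python
import math


def info_nce_loss(
    pos_score: float, neg_scores: list[float], temperature: float = 0.05
) -> float:
    """Negative log-probability of the positive passage among all candidates."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Subtract the max logit for numerical stability (log-sum-exp trick).
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(logit - max_logit) for logit in logits)
    )
    return log_sum_exp - logits[0]


# Query is much closer to the positive passage than to the negatives: low loss.
print(info_nce_loss(0.9, [0.2, 0.1]))
# Query is closer to a negative than to the positive: high loss.
print(info_nce_loss(0.2, [0.9, 0.1]))
```

Minimizing this quantity pushes the query embedding toward the passage that answers it and away from the sampled negatives.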

We'll instantiate our `embedder`, specifying `model_type="passage"` to take advantage of the contrastive learning capabilities of the NVOLVE QA 40K model. Lastly, we'll use the `.embed_documents(...)` method to transform our list of prompts into a list of embeddings.

In [13]:
embedder = NVIDIAEmbeddings(model="nvolveqa_40k", model_type="passage")

In [14]:
Embeddings = list[list[float]]

embeddings: Embeddings = embedder.embed_documents(products_prompts)
print(len(embeddings), len(embeddings[0]))

99 1024


### Retrieving the Right Products <a name="retrieval"></a>

With our products converted to embeddings, we can now search across the product catalog using a given input `query`. We'll first convert `query` to an embedding (specifying `model_type="query"` to take advantage of the contrastive learning capabilities of the NVOLVE QA 40K model) by using the `embed_query(...)` method. Next, we'll calculate the cosine similarity between the `query_embedding` and each product embedding and identify the `top_k` most similar embeddings. Lastly, we'll identify and return the `top_k` products mapped to those `top_k` embeddings.
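
As a worked illustration of the scoring step, here is cosine similarity and top-k selection in plain Python on toy 2-D vectors. The helper names `cosine` and `top_k_indices` are hypothetical and are not part of the workflow, which uses LangChain's `cosine_similarity` and NumPy in the next cell.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_indices(query: list[float], vectors: list[list[float]], k: int) -> list[int]:
    """Indices of the k vectors most similar to the query, best first."""
    scores = [cosine(query, v) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)[:k]


# Toy 2-D "embeddings"; the real embeddings in this notebook have 1024 dimensions.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(top_k_indices([1.0, 0.1], toy_vectors, k=2))  # [0, 2]
```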

In [15]:
def search(
    query: str, embeddings: Embeddings, 
    products: Products, top_k: int = 2
    ) -> Products:
    """
    Return the `top_k` products most similar to `query`.
    This function can be made as generic and wide as possible/needed.
    """
    query = query.lower().strip()

    # embed the query with the query-side encoder
    query_embedding = NVIDIAEmbeddings(
        model="nvolveqa_40k", model_type="query"
    ).embed_query(query)
    # one similarity score per product embedding
    similarity_scores = cosine_similarity([query_embedding], embeddings)[0]
    # argpartition selects the top_k indices without fully sorting them,
    # so the returned products are not guaranteed to be ranked
    indices = list(np.argpartition(similarity_scores, -top_k)[-top_k:])
    return [products[index] for index in indices]

In [16]:
results = search("Can you recommend a coffee mug?", embeddings, products)
results

[Product(name='14 OZ. A NEW BREED OF INNOVATION MUG', description='14 oz. ceramic mug features a barrel design, large handle, matte exterior finish and gloss colored interior.\n\nProduct Details: \n\n3-5/8" H x 3-5/8 (5 w/handle)"\nHand wash recommended\nMicrowave safe', url='https://gear.nvidia.com/14-oz-A-New-Breed-of-Innovation-Mug-P567.aspx', price=9.5, image='https://gear.nvidia.com/GetImage.ashx?Path=%7e%2fAssets%2fProductImages%2fNV00-0467-LIM_Full.jpg&maintainAspectRatio=true&width=800', ratings=[4, 4, 5, 4, 4, 4, 5, 5, 5, 5, 4]),
 Product(name='14 OZ. VISUAL PURR-CEPTION MUG', description='Everyone loves cats. Keep an eye on those felines with this 14 oz. deep learning mug inspired by NVIDIA Engineer Robert Bond.\n\nProduct Details: \n\n3-5/8" H x 3-5/8 (5 w/handle)"\nHand wash recommended\nMicrowave safe\nCannot be shipped to APAC', url='https://gear.nvidia.com/14-oz-Visual-Purr-Ception-Mug-P614.aspx', price=12.0, image='https://gear.nvidia.com/GetImage.ashx?Path=%7e%2fAssets

In [17]:
Image(url= results[0].image, width=300, height=300)

To make loading product data and searching across it easier, we've created a helper `Catalog` utility class which implements several helpful methods. The full implementation of this class can be found in the `nvretail/catalog.py` module.

In [18]:
catalog = Catalog("/debug/data/gear-store.csv")

In [19]:
catalog.search("Can you recommend a coffee mug?")

[Product(name='14 OZ. A NEW BREED OF INNOVATION MUG', description='14 oz. ceramic mug features a barrel design, large handle, matte exterior finish and gloss colored interior.\n\nProduct Details: \n\n3-5/8" H x 3-5/8 (5 w/handle)"\nHand wash recommended\nMicrowave safe', url='https://gear.nvidia.com/14-oz-A-New-Breed-of-Innovation-Mug-P567.aspx', price=9.5, image='https://gear.nvidia.com/GetImage.ashx?Path=%7e%2fAssets%2fProductImages%2fNV00-0467-LIM_Full.jpg&maintainAspectRatio=true&width=800', ratings=[5, 5, 5, 5, 4, 5, 4, 4, 5, 4, 4, 4, 4, 5, 4, 4, 5, 5, 4, 5, 4, 5, 4, 4, 4, 5, 4]),
 Product(name='14 OZ. VISUAL PURR-CEPTION MUG', description='Everyone loves cats. Keep an eye on those felines with this 14 oz. deep learning mug inspired by NVIDIA Engineer Robert Bond.\n\nProduct Details: \n\n3-5/8" H x 3-5/8 (5 w/handle)"\nHand wash recommended\nMicrowave safe\nCannot be shipped to APAC', url='https://gear.nvidia.com/14-oz-Visual-Purr-Ception-Mug-P614.aspx', price=12.0, image='https:/

In [20]:
Image(url= results[0].image, width=300, height=300)

### Retrieval Augmented Generation (RAG) <a name="rag"></a>

Let's revisit the LLM chain we constructed in [2. Getting Started with LLMs](#llm).

The LLM hallucinated and recommended a coffee mug that is not in our product catalog. Now, let's use embeddings and retrieval to first retrieve the products most similar to a given user query. We can then use information about those products to create **context** and instruct the LLM to respond or answer questions using that context.

In [21]:
user_input = "Can you recommend a coffee mug?"
products = catalog.search(user_input)
print(products)

[Product(name='14 OZ. A NEW BREED OF INNOVATION MUG', description='14 oz. ceramic mug features a barrel design, large handle, matte exterior finish and gloss colored interior.\n\nProduct Details: \n\n3-5/8" H x 3-5/8 (5 w/handle)"\nHand wash recommended\nMicrowave safe', url='https://gear.nvidia.com/14-oz-A-New-Breed-of-Innovation-Mug-P567.aspx', price=9.5, image='https://gear.nvidia.com/GetImage.ashx?Path=%7e%2fAssets%2fProductImages%2fNV00-0467-LIM_Full.jpg&maintainAspectRatio=true&width=800', ratings=[5, 5, 5, 5, 4, 5, 4, 4, 5, 4, 4, 4, 4, 5, 4, 4, 5, 5, 4, 5, 4, 5, 4, 4, 4, 5, 4]), Product(name='14 OZ. VISUAL PURR-CEPTION MUG', description='Everyone loves cats. Keep an eye on those felines with this 14 oz. deep learning mug inspired by NVIDIA Engineer Robert Bond.\n\nProduct Details: \n\n3-5/8" H x 3-5/8 (5 w/handle)"\nHand wash recommended\nMicrowave safe\nCannot be shipped to APAC', url='https://gear.nvidia.com/14-oz-Visual-Purr-Ception-Mug-P614.aspx', price=12.0, image='https://

In [22]:
messages = [
    ('system', "You are an AI chatbot that helps customers. Respond only using the following context:\n{context}"), 
    ('user', "{input}"),
]

In [23]:
prompt = ChatPromptTemplate.from_messages(messages)


In [24]:
llm = ChatNVIDIA(model="mixtral_8x7b")
chain = prompt | llm | StrOutputParser()
response: str = chain.invoke(
    {"input": user_input, 
     "context": catalog.products_to_context(products)}
)

In [25]:
print(response)

Certainly! I would be happy to recommend a coffee mug from our selection.

If you're looking for a unique and innovative design, I recommend the "14 OZ. A NEW BREED OF INNOVATION MUG". This mug features a barrel design, large handle, matte exterior finish, and gloss colored interior. It has a capacity of 14 oz and is both hand wash recommended and microwave safe. You can find it on our website for $9.50 and it has a rating of 4.44 out of 5.

Alternatively, if you're a cat lover, you might be interested in the "14 OZ. VISUAL PURR-CEPTION MUG". This mug is inspired by NVIDIA Engineer Robert Bond and features a deep learning mug design with a cat on it. It also has a capacity of 14 oz, and it is hand wash recommended and microwave safe. However, please note that this mug cannot be shipped to APAC. It is available for $12.00 and it has a rating of 4.56 out of 5.

Both of these mugs are 3-5/8" H x 3-5/8 (5 w/handle)" and have similar features, the main difference is the design and price.

P

### Facilitating Conversational Flow with Function Calling & Tools <a name="function-calling"></a>

LLMs have incredible conversational capabilities and RAG allows us to parametrically retrieve and provide different context to an LLM. However, natural language conversations are very complex with many different avenues and paths. Consider the below user input:

`user_query = "Can you add three of these to my cart?"`

In [26]:
user_query = "Can you add three of these to my cart?"
products = catalog.search(user_query)

messages = [
    ('system', "You are an AI chatbot that helps customers. Respond only using the following context:\n{context}"), 
    ('user', "{input}"),
]

prompt = ChatPromptTemplate.from_messages(messages)
llm = ChatNVIDIA(model="mixtral_8x7b")
chain = prompt | llm | StrOutputParser()
response: str = chain.invoke(
    {"input": user_query, 
     "context": catalog.products_to_context(products)}
)
print(response)

Sure! I have added three of the Cotton Canvas Tote Bags to your cart. The total for the cooler and the tote bags is $117.00. You can view your cart and proceed to checkout by clicking on the cart icon at the top right corner of the page. Here is the link to your cart: <https://gear.nvidia.com/cart.aspx>

The Igloo Seadrift Cooler is a great choice for keeping your drinks and food cool for longer periods of time. It has a 36 can capacity and features MaxCold® insulation with 25% more foam. The cooler also has a classic colorblock design and several convenient pockets and compartments for storing your belongings. It is PVC and Phthalate free and has a PEVA heat-sealed lining. The cooler has a rating of 4.59 out of 5.

The Cotton Canvas Tote Bag is a versatile and stylish bag that is perfect for carrying your everyday essentials. It is made of durable cotton canvas and has a size of 15" W x 16" H. The tote bag has a rating of 4.52 out of 5.

I hope this helps! Let me know if you have any 

As the response above shows, it isn't productive to convert `user_query` to an embedding and search across the catalog: the search returned unrelated products (a cooler and tote bags), and the LLM claimed to have updated a cart it has no access to. Instead, in this scenario we want the LLM to identify the correct product from the context of the conversation, extract the correct quantity, and structure a call to a Shopping Cart API.

We can achieve this using Function Calling.

Below, we create `Cart`, a utility class emulating a shopping cart with methods like `view_cart`, `add_to_cart`, `modify_item_in_cart`, and `remove_from_cart`.

In [27]:
class Cart:
    """Shopping cart"""

    def __init__(self):
        self.items = {}

    def __str__(self) -> str:
        return str(self.items)

    def __repr__(self) -> str:
        return str(self.items)

    def reset(self):
        """Reset"""
        self.items = {}

    def view_cart(self) -> str:
        """View cart"""
        return f"The following items are in your cart: {str(self.items)}"

    def add_to_cart(self, name: str, quantity: int = 1) -> str:
        """Add to cart"""
        if name in self.items.keys():
            self.items[name] = self.items[name] + quantity
            quantity = self.items[name]
        else:
            self.items[name] = quantity
        return f"{quantity} unit(s) of of {name} have been added to your cart."

    def remove_from_cart(self, name: str) -> str:
        """Remove from cart"""
        if name in self.items.keys():
            self.items.pop(name, None)
            response = (
                f"{name} has been removed from your cart. You have no units remaining."
            )
        else:
            response = f"You don't have any units of {name} in your cart."

        return response

    def modify_item_in_cart(self, name: str, quantity: int) -> str:
        """Modify item in cart"""
        self.items[name] = quantity
        return f"You now have {quantity} unit(s) of of {name} in your cart."

We'll create and define a `Function` using the [openai-function-calling](https://github.com/jakecyr/openai-function-calling/tree/master) package.

In [28]:
add_to_cart_function = Function(
    name="add_to_cart",
    description="Add item to the customer's shopping cart",
    parameters=[
        Parameter(
            name="name",
            type="string",
            description="The name of the item to add to the customer's shopping cart",
        ),
        Parameter(
            name="quantity",
            type="integer",
            description="The quantity of that item to add to the customer's shopping cart",
        ),
    ],
    required_parameters=["name", "quantity"],
)

We'll instantiate our `cart` and create an OpenAI `client` for sending requests. We'll then define our tools and use `ToolHelpers.from_functions(tools)` to convert each tool from a `Function` to the input format required by the OpenAI API.
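
For reference, the OpenAI Chat Completions API expects each tool as a JSON-schema style description. The dictionary below is a hand-written approximation of what `add_to_cart_function` corresponds to. The variable name `add_to_cart_tool` is introduced here for illustration, and the exact output of `ToolHelpers.from_functions` may differ in detail.

```python
import json

# Hand-written approximation of the tool payload (an assumption, not the
# verified output of ToolHelpers.from_functions).
add_to_cart_tool = {
    "type": "function",
    "function": {
        "name": "add_to_cart",
        "description": "Add item to the customer's shopping cart",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "description": "The name of the item to add to the customer's shopping cart",
                },
                "quantity": {
                    "type": "integer",
                    "description": "The quantity of that item to add to the customer's shopping cart",
                },
            },
            "required": ["name", "quantity"],
        },
    },
}

print(json.dumps(add_to_cart_tool, indent=2))
```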

In [29]:
client = OpenAI()
cart = Cart()

In [30]:
messages = [{'role': 'system', 'content': 'You are a helpful AI chatbot.'}, 
            {'role': 'user', 'content': 'Can you add 3 units of 14 OZ. NVIDIA LOGO MUG to my cart?'}]

In [31]:
tools = [add_to_cart_function]

completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=messages,
    tools=ToolHelpers.from_functions(tools),
    tool_choice="auto",
    temperature=0.0,
)

In [32]:
completion

ChatCompletion(id='chatcmpl-8keOJq4GtiXcaafm27Pb1VawwJpLI', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_pDNidKPAmIFAJwHMCcW9jzsf', function=Function(arguments='{"name":"14 OZ. NVIDIA LOGO MUG","quantity":3}', name='add_to_cart'), type='function')]))], created=1706129043, model='gpt-3.5-turbo-1106', object='chat.completion', system_fingerprint='fp_b57c83dd65', usage=CompletionUsage(completion_tokens=27, prompt_tokens=112, total_tokens=139))

From the `completion` object, we can check whether a tool was called. If so, we can read the name of the tool and the arguments the LLM extracted for it.

In [33]:
fn_name = completion.choices[0].message.tool_calls[0].function.name
fn_args = json.loads(
    completion.choices[0].message.tool_calls[0].function.arguments
)
print(f"Function: {fn_name}")
print(f"Arguments: {fn_args}")

Function: add_to_cart
Arguments: {'name': '14 OZ. NVIDIA LOGO MUG', 'quantity': 3}


Writing the logic to handle which tool was called is straightforward.
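
One alternative to an `if`/`elif` chain is a dispatch table that maps tool names to callables. The sketch below is self-contained and uses a hypothetical `ToyCart` stand-in rather than the notebook's `Cart`; `handlers` and `dispatch` are likewise names introduced here for illustration.

```python
import json
from typing import Callable


class ToyCart:
    """Minimal stand-in for the notebook's `Cart` (illustration only)."""

    def __init__(self) -> None:
        self.items: dict[str, int] = {}

    def add_to_cart(self, name: str, quantity: int = 1) -> str:
        self.items[name] = self.items.get(name, 0) + quantity
        return f"Added {quantity} unit(s) of {name}."

    def view_cart(self) -> str:
        return f"Cart contents: {self.items}"


toy_cart = ToyCart()

# Map each tool name the LLM may return to the callable that implements it.
handlers: dict[str, Callable[..., str]] = {
    "add_to_cart": toy_cart.add_to_cart,
    "view_cart": toy_cart.view_cart,
}


def dispatch(tool_name: str, raw_arguments: str) -> str:
    """Look up the handler and call it with the JSON-decoded arguments."""
    handler = handlers.get(tool_name)
    if handler is None:
        return f"Unknown tool: {tool_name}"
    return handler(**json.loads(raw_arguments or "{}"))


print(dispatch("add_to_cart", '{"name": "14 OZ. NVIDIA LOGO MUG", "quantity": 3}'))
print(dispatch("view_cart", "{}"))
```

Adding a new tool then only requires registering one more entry in `handlers`.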

In [34]:
if fn_name == "add_to_cart":
    fn_result = cart.add_to_cart(**fn_args)
    fn_result = FunctionResult(fn_result=fn_result)

In [35]:
print(f"Results: {fn_result.fn_result}")

Results: 3 unit(s) of of 14 OZ. NVIDIA LOGO MUG have been added to your cart.


Below, we define several different scenarios we might encounter and provide the LLM with access to different tools it might need to respond.

* view_cart_function
* add_to_cart_function
* remove_from_cart_function
* modify_item_in_cart_function
* search_function

In [36]:
cart.reset()

user_messages = [
    {'role': 'user', 'content': 'Can help me find a coffee mug?'},
    {'role': 'user', 'content': 'Can you add NVIDIA LOGO MUG to my cart?'},
    {'role': 'user', 'content': 'Actually, can you change that to 3 units of NVIDIA LOGO MUG?'},
    {'role': 'user', 'content': 'On second thought, can you remove the NVIDIA LOGO MUG from my cart?'},
    {'role': 'user', 'content': "What's in my cart?"},
]

tools = [
    view_cart_function, add_to_cart_function, remove_from_cart_function, 
    modify_item_in_cart_function, search_function
]

def execute_tool(
    fn_metadata: FunctionMetadata, cart: Cart, 
    catalog: Catalog
) -> tuple[FunctionResult, Products]:
    """Execute the tool selected by the LLM and return its result"""
    products: Products = []

    if fn_metadata.fn_name == "view_cart":
        fn_result = cart.view_cart()
    elif fn_metadata.fn_name == "add_to_cart":
        fn_result = cart.add_to_cart(**fn_metadata.fn_args)
    elif fn_metadata.fn_name == "remove_from_cart":
        fn_result = cart.remove_from_cart(**fn_metadata.fn_args)
    elif fn_metadata.fn_name == "modify_item_in_cart":
        fn_result = cart.modify_item_in_cart(**fn_metadata.fn_args)
    elif fn_metadata.fn_name == "search":
        products = catalog.search(**fn_metadata.fn_args)
        fn_result = catalog.products_to_context(products)
    else:
        # guard against a tool name we don't implement
        fn_result = f"Unknown tool: {fn_metadata.fn_name}"
    fn_result = FunctionResult(fn_result=fn_result)

    return fn_result, products

for user_message in user_messages:
    messages = [{'role': 'system', 'content': 'You are a helpful AI chatbot.'}, user_message]
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=messages,
        tools=ToolHelpers.from_functions(tools),
        tool_choice="auto",
        temperature=0.0,
    )
    message = completion.choices[0].message
    # `tool_calls` is None when the model answers directly
    use_tool = bool(message.tool_calls)
    if not use_tool:
        # No tool was requested, so print the plain reply and move on
        print(f"Response: {message.content}")
        print("-" * 79)
        continue
    tool_call = message.tool_calls[0]
    function_metadata = FunctionMetadata(
        fn_name=tool_call.function.name,
        fn_args=json.loads(tool_call.function.arguments),
    )
    fn_result, products = execute_tool(function_metadata, cart, catalog)
    print(f"Function: {function_metadata.fn_name}")
    print(f"Arguments: {function_metadata.fn_args}")
    print(f"Results: {fn_result.fn_result}")
    print("-" * 79)
    
    

Function: search
Arguments: {'query': 'coffee mug'}
Results: Name: 14 OZ. NVIDIA LOGO MUG
        Description: 14 oz. ceramic mug features a barrel design, large handle, matte exterior finish and gloss colored interior.

Product Details: 

3-5/8" H x 3-5/8 (5 w/handle)"
Hand wash recommended 
Microwave safe
        URL: https://gear.nvidia.com/14-oz-NVIDIA-Logo-Mug-P228.aspx
        Price: 7.5
        Rating: 4.44
Name: 14 OZ. VISUAL PURR-CEPTION MUG
        Description: Everyone loves cats. Keep an eye on those felines with this 14 oz. deep learning mug inspired by NVIDIA Engineer Robert Bond.

Product Details: 

3-5/8" H x 3-5/8 (5 w/handle)"
Hand wash recommended
Microwave safe
Cannot be shipped to APAC
        URL: https://gear.nvidia.com/14-oz-Visual-Purr-Ception-Mug-P614.aspx
        Price: 12.0
        Rating: 4.56
-------------------------------------------------------------------------------
Function: add_to_cart
Arguments: {'name': 'NVIDIA LOGO MUG', 'quantity': 1}
Results: 1

### Product Advisor <a name="product-advisor"></a>

Building on the previous sections, let's put it all together. We provide a helper `ProductAdvisor` class that takes a client, a cart, and a catalog and exposes a `.chat(...)` method to make interacting with the Product Advisor easier. It also implements logic that uses the results of our tools to inform how the LLM should respond.
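
Using tool results to inform the response typically means sending the tool's output back to the model as a `tool` message and asking for a second completion. The internals of `ProductAdvisor` are not shown in this notebook, so the following is only a minimal sketch of that pattern with plain dictionaries in the OpenAI chat format. The helper name `build_followup_messages`, the call id, and the result string are hypothetical.

```python
import json


def build_followup_messages(messages, tool_call_id, fn_name, fn_args, fn_result):
    """Return a new message list that records the tool call and its result."""
    # Echo the assistant turn that requested the tool call
    assistant_message = {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": tool_call_id,
                "type": "function",
                "function": {"name": fn_name, "arguments": json.dumps(fn_args)},
            }
        ],
    }
    # Attach the tool output, linked to the call by its id
    tool_message = {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": str(fn_result),
    }
    return [*messages, assistant_message, tool_message]


history = [
    {"role": "system", "content": "You are a helpful AI chatbot."},
    {"role": "user", "content": "Can you add NVIDIA LOGO MUG to my cart?"},
]
followup = build_followup_messages(
    history,
    tool_call_id="call_abc123",  # hypothetical id; normally taken from the API response
    fn_name="add_to_cart",
    fn_args={"name": "NVIDIA LOGO MUG", "quantity": 1},
    fn_result="(placeholder for the text returned by the cart)",
)
print([m["role"] for m in followup])
```

Passing `followup` as `messages` to a second `client.chat.completions.create(...)` call would then let the model phrase a natural-language reply grounded in the tool output.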

In [37]:
client = OpenAI()
cart = Cart()
catalog = Catalog("/debug/data/gear-store.csv")

product_advisor = ProductAdvisor(client=client, cart=cart, catalog=catalog)

In [38]:
messages = [
    Message(role="system", content="You are an AI chatbot that helps customers answer questions about products."),
    Message(role="user", content="Hello!"),
]

In [39]:
# Should respond normally without calling a specific tool
messages, fn_metadata, products = product_advisor.chat(messages)

print(f"Function: {fn_metadata.fn_name}")
print(f"Arguments: {fn_metadata.fn_args}")
print(f"Response: {messages[-1].content}")

Function: None
Arguments: None
Response: Hello! I'm glad you've reached out for assistance. I'm here to help answer any questions you have about our products. What can I help you with today?


In [40]:
# Let's test different scenarios that might trigger each of our tools
user_messages = [
    "Hello!",  # respond normally
    'Can you help me find a coffee mug?',  # search
    'Can you add NVIDIA LOGO MUG to my cart?',  # add_to_cart
    'Actually, can you change that to 3 units of NVIDIA LOGO MUG?',  # modify_item_in_cart
    "What's in my cart?",  # view_cart
    'On second thought, can you remove the NVIDIA LOGO MUG from my cart?',  # remove_from_cart
]

for user_message in user_messages:
    messages = [
        Message(role="system", content="You are an AI chatbot that helps customers answer questions about products."),
        Message(role="user", content=user_message),
    ]
    messages, fn_metadata, products = product_advisor.chat(messages)

    print(f"Function: {fn_metadata.fn_name}")
    print(f"Arguments: {fn_metadata.fn_args}")
    print(f"Products: {[p.name for p in products]}")
    print(f"Response: {messages[-1].content}")
    print("-" * 79)

Function: None
Arguments: None
Products: []
Response: Hello! I'm glad you've reached out for assistance. I'm here to help answer any questions you have about our products. What can I help you with today?
-------------------------------------------------------------------------------
Function: search
Arguments: {'query': 'coffee mug'}
Products: ['14 OZ. NVIDIA LOGO MUG', '14 OZ. VISUAL PURR-CEPTION MUG']
Response: Sure, I'd be happy to help you find a coffee mug! Here are two options that you might like:

1. The 14 OZ. NVIDIA LOGO MUG is a 14 oz. ceramic mug with a barrel design, large handle, matte exterior finish, and gloss colored interior. It is hand wash recommended and microwave safe. This mug is priced at $7.50 and has a rating of 4.54 out of 5. You can find more information and purchase it here: <https://gear.nvidia.com/14-oz-NVIDIA-Logo-Mug-P228.aspx>
2. The 14 OZ. VISUAL PURR-CEPTION MUG is a 14 oz. deep learning mug inspired by NVIDIA Engineer Robert Bond. It features a cute 

### Deployment with FastAPI and React <a name="deployment"></a>

Finally, we deploy the Product Advisor using a FastAPI backend and a React application frontend.

```bash
# build and run both services
docker-compose build chatbot-service frontend-service
docker-compose up -d chatbot-service frontend-service

# inspect logs
docker-compose logs -f chatbot-service
docker-compose logs -f frontend-service
```

Navigate to [http://localhost:3000](http://localhost:3000) for the frontend.

Navigate to [http://localhost:5001/docs](http://localhost:5001/docs) for the backend API documentation.