<a href="https://colab.research.google.com/github/himanshu2s/EmployeeRegistration/blob/master/Solve_Business_Problems_with_AI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Solve Business Problems with AI

## Objective
Develop a proof-of-concept application to intelligently process email order requests and customer inquiries for a fashion store. The system should accurately categorize emails as either product inquiries or order requests and generate appropriate responses using the product catalog information and current stock status.

You are encouraged to use AI assistants (like ChatGPT or Claude) and any IDE of your choice to develop your solution. Many modern IDEs (such as PyCharm, or Cursor) can work with Jupiter files directly.

## Task Description

### Inputs

Google Spreadsheet **[Document](https://docs.google.com/spreadsheets/d/14fKHsblfqZfWj3iAaM2oA51TlYfQlFT4WKo52fVaQ9U)** containing:

- **Products**: List of products with fields including product ID, name, category, stock amount, detailed description, and season.

- **Emails**: Sequential list of emails with fields such as email ID, subject, and body.

### Instructions

- Implement all requirements using advanced Large Language Models (LLMs) to handle complex tasks, process extensive data, and generate accurate outputs effectively.
- Use Retrieval-Augmented Generation (RAG) and vector store techniques where applicable to retrieve relevant information and generate responses.
- You are provided with a temporary OpenAI API key granting access to GPT-4o, which has a token quota. Use it wisely or use your own key if preferred.
- Address the requirements in the order listed. Review them in advance to develop a general implementation plan before starting.
- Your deliverables should include:
   - Code developed within this notebook.
   - A single spreadsheet containing results, organized across separate sheets.
   - Comments detailing your thought process.
- You may use additional libraries (e.g., langchain) to streamline the solution. Use libraries appropriately to align with best practices for AI and LLM tools.
- Use the most suitable AI techniques for each task. Note that solving tasks with traditional programming methods will not earn points, as this assessment evaluates your knowledge of LLM tools and best practices.

### Requirements

#### 1. Classify emails
    
Classify each email as either a _**"product inquiry"**_ or an _**"order request"**_. Ensure that the classification accurately reflects the intent of the email.

**Output**: Populate the **email-classification** sheet with columns: email ID, category.

#### 2. Process order requests
1.   Process orders
  - For each order request, verify product availability in stock.
  - If the order can be fulfilled, create a new order line with the status “created”.
  - If the order cannot be fulfilled due to insufficient stock, create a line with the status “out of stock” and include the requested quantity.
  - Update stock levels after processing each order.
  - Record each product request from the email.
  - **Output**: Populate the **order-status** sheet with columns: email ID, product ID, quantity, status (**_"created"_**, **_"out of stock"_**).

2.   Generate responses
  - Create response emails based on the order processing results:
      - If the order is fully processed, inform the customer and provide product details.
      - If the order cannot be fulfilled or is only partially fulfilled, explain the situation, specify the out-of-stock items, and suggest alternatives or options (e.g., waiting for restock).
  - Ensure the email tone is professional and production-ready.
  - **Output**: Populate the **order-response** sheet with columns: email ID, response.

#### 3. Handle product inquiry

Customers may ask general open questions.
  - Respond to product inquiries using relevant information from the product catalog.
  - Ensure your solution scales to handle a full catalog of over 100,000 products without exceeding token limits. Avoid including the entire catalog in the prompt.
  - **Output**: Populate the **inquiry-response** sheet with columns: email ID, response.

## Evaluation Criteria
- **Advanced AI Techniques**: The system should use Retrieval-Augmented Generation (RAG) and vector store techniques to retrieve relevant information from data sources and use it to respond to customer inquiries.
- **Tone Adaptation**: The AI should adapt its tone appropriately based on the context of the customer's inquiry. Responses should be informative and enhance the customer experience.
- **Code Completeness**: All functionalities outlined in the requirements must be fully implemented and operational as described.
- **Code Quality and Clarity**: The code should be well-organized, with clear logic and a structured approach. It should be easy to understand and maintain.
- **Presence of Expected Outputs**: All specified outputs must be correctly generated and saved in the appropriate sheets of the output spreadsheet. Ensure the format of each output matches the requirements—do not add extra columns or sheets.
- **Accuracy of Outputs**: The accuracy of the generated outputs is crucial and will significantly impact the evaluation of your submission.

We look forward to seeing your solution and your approach to solving real-world problems with AI technologies.

# Prerequisites

### Configure OpenAI API Key.

In [None]:
# Install the OpenAI Python package.
%pip install openai httpx==0.27.2

**IMPORTANT: If you are going to use our custom API Key then make sure that you also use custom base URL as in example below. Otherwise it will not work.**

In [119]:
# Code example of OpenAI communication

from openai import OpenAI

client = OpenAI(
    # In order to use provided API key, make sure that models you create point to this custom base URL.
    base_url='https://47v4us7kyypinfb5lcligtc3x40ygqbs.lambda-url.us-east-1.on.aws/v1/',
    # The temporary API key giving access to ChatGPT 4o model. Quotas apply: you have 500'000 input and 500'000 output tokens, use them wisely ;)
    api_key='a0BIj000002MLxxMAG'
)

completion = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message)

ChatCompletionMessage(content='Hello! How can I assist you today?', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None, annotations=[])


In [150]:
# Code example of reading input data

import pandas as pd
from IPython.display import display

def read_data_frame(document_id, sheet_name):
    export_link = f"https://docs.google.com/spreadsheets/d/{document_id}/gviz/tq?tqx=out:csv&sheet={sheet_name}"
    print(f"Export Link: {export_link}")
    return  pd.read_csv(export_link)

# document_id = '14fKHsblfqZfWj3iAaM2oA51TlYfQlFT4WKo52fVaQ9U'
document_id = '1X6skuyHic38yajJXuHj7R7C_oD6VChN-zYlKo-6ktPw'
products_df = read_data_frame(document_id, 'products')
emails_df = read_data_frame(document_id, 'emails')

# Display first 3 rows of each DataFrame
display(products_df.head(3))
display(emails_df.head(3))

Export Link: https://docs.google.com/spreadsheets/d/1X6skuyHic38yajJXuHj7R7C_oD6VChN-zYlKo-6ktPw/gviz/tq?tqx=out:csv&sheet=products
Export Link: https://docs.google.com/spreadsheets/d/1X6skuyHic38yajJXuHj7R7C_oD6VChN-zYlKo-6ktPw/gviz/tq?tqx=out:csv&sheet=emails


Unnamed: 0,product_id,name,category,description,stock,seasons,price
0,RSG8901,Retro Sunglasses,Accessories,Transport yourself back in time with our retro...,1,"Spring, Summer",26.99
1,SWL2345,Sleek Wallet,Accessories,Keep your essentials organized and secure with...,5,All seasons,30.0
2,VSC6789,Versatile Scarf,Accessories,Add a touch of versatility to your wardrobe wi...,6,"Spring, Fall",23.0


Unnamed: 0,email_id,subject,message
0,E001,Leather Wallets,"Hi there, I want to order all the remaining LT..."
1,E002,Buy Vibrant Tote with noise,"Good morning, I'm looking to buy the VBT2345 V..."
2,E003,Need your help,"Hello, I need a new bag to carry my laptop and..."


In [51]:
# Creates a new shared Google Worksheet every invocation with the proper structure
from google.colab import auth
import gspread
from google.auth import default
from gspread_dataframe import set_with_dataframe

auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)

# This code goes after creating google client
output_document = gc.create('Solving Business Problems with AI - Output')

# Create 'email-classification' sheet
email_classification_sheet = output_document.add_worksheet(title="email-classification", rows=50, cols=2)
email_classification_sheet.update([['Email ID', 'Subject', 'Message', 'Category']], 'A1:D1')

# Create 'order-status' sheet
order_status_sheet = output_document.add_worksheet(title="order-status", rows=50, cols=4)
order_status_sheet.update([['Email ID', 'Product ID', 'Quantity', 'Status']], 'A1:D1')

# Create 'order-response' sheet
order_response_sheet = output_document.add_worksheet(title="order-response", rows=50, cols=2)
order_response_sheet.update([['Email ID', 'Response']], 'A1:B1')

# Create 'inquiry-response' sheet
inquiry_response_sheet = output_document.add_worksheet(title="inquiry-response", rows=50, cols=2)
inquiry_response_sheet.update([['Email ID', 'Response']], 'A1:B1')

# Share the spreadsheet publicly
output_document.share('', perm_type='anyone', role='reader')

# This is the solution output link, paste it into the submission form
print(f"Shareable link: https://docs.google.com/spreadsheets/d/{output_document.id}")

Export Link: https://docs.google.com/spreadsheets/d/1W6RUwDy7AvNcsO7n48oNKlsvkx9kR39NMaicvUzEwmE/gviz/tq?tqx=out:csv&sheet=email-classification
Export Link: https://docs.google.com/spreadsheets/d/1W6RUwDy7AvNcsO7n48oNKlsvkx9kR39NMaicvUzEwmE/gviz/tq?tqx=out:csv&sheet=order-status
Export Link: https://docs.google.com/spreadsheets/d/1W6RUwDy7AvNcsO7n48oNKlsvkx9kR39NMaicvUzEwmE/gviz/tq?tqx=out:csv&sheet=order-response
Export Link: https://docs.google.com/spreadsheets/d/1W6RUwDy7AvNcsO7n48oNKlsvkx9kR39NMaicvUzEwmE/gviz/tq?tqx=out:csv&sheet=inquiry-response


Unnamed: 0,Email ID,Subject,Message,Category


Shareable link: https://docs.google.com/spreadsheets/d/1W6RUwDy7AvNcsO7n48oNKlsvkx9kR39NMaicvUzEwmE




```
# This is formatted as code
```

# Task 1. Classify emails

In [52]:
# Use this section to classify emails as per the email subject/content and put the results in a google drive sheet.
from google.colab import drive
import numpy as np

def classify_email(email, subject, message):
  try:
    email = email if email else "Unknown Email"
    subject = subject if subject else "No Subject"
    message = message if message else "No Message"

    prompt = f"""
    Given the following email, classify it as either a "product inquiry" or an "order request":

    Subject: {subject}
    Message: {message}

    Respond with only one of the two categories: "product inquiry" or "order request".
    """

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=10
    )
    category = response.choices[0].message.content.strip()

    return email, subject, message, category

    # # Add data to 'email-classification' sheet
    # email_classification_sheet.append_row([email, subject, message, category])
  except Exception as e:
    print(f"🚨 Error: {e}")
    return email, subject, message, "Error"

try:
  # Set default values for null fields.
  emails_df = emails_df.replace([np.nan, np.inf, -np.inf], "")

  classified_data = []

  for _, row in emails_df.iterrows():
      processed_email, processed_subject, processed_message, category = classify_email(row["email_id"], row["subject"], row["message"])
      classified_data.append([processed_email, processed_subject, processed_message, category])

  # Add all rows at once
  email_classification_sheet.append_rows(classified_data)

  # # Classify emails
  # emails_df.apply(lambda row: classify_email(row["email_id"], row["subject"], row["message"]), axis=1);
  print(f"Emails are now classified. 🔗 Shareable link: https://docs.google.com/spreadsheets/d/{output_document.id}")
except Exception as e:
    print(f"🚨 Error: {e}")

email_classification_df = read_data_frame(output_document.id, 'email-classification')
# order_status_df = read_data_frame(output_document.id, 'order-status')
# order_response_df = read_data_frame(output_document.id, 'order-response')
# inquiry_response_df = read_data_frame(output_document.id, 'inquiry-response')

display(email_classification_df.head(3))

Emails are now classified. 🔗 Shareable link: https://docs.google.com/spreadsheets/d/1W6RUwDy7AvNcsO7n48oNKlsvkx9kR39NMaicvUzEwmE
Export Link: https://docs.google.com/spreadsheets/d/1W6RUwDy7AvNcsO7n48oNKlsvkx9kR39NMaicvUzEwmE/gviz/tq?tqx=out:csv&sheet=email-classification


Unnamed: 0,Email ID,Subject,Message,Category
0,E001,Leather Wallets,"Hi there, I want to order all the remaining LT...",order request
1,E002,Buy Vibrant Tote with noise,"Good morning, I'm looking to buy the VBT2345 V...",order request
2,E003,Need your help,"Hello, I need a new bag to carry my laptop and...",product inquiry


# Task 2. Process order requests

In [124]:
# Use this method to extract order details from the Email Message.
import json

def extract_order_details_from_gpt(email_message):
    """
    Extracts product order details from an email and ensures quantities match available stock.
    """
    # Convert product stock information to JSON format
    product_catalog = products_df[["product_id", "name", "stock"]].to_dict(orient="records")

    prompt = f"""
    You are a customer service AI assistant that extracts order details from email conversations.

    Here is an email from a customer requesting a product:

    ---
    Email Message:
    "{email_message}"
    ---

    ### Product Catalog (with Available Stock):
    {json.dumps(product_catalog, indent=2)}

    ### Your Task:
    1. Identify the product(s) the customer wants to order.
    2. Check the available stock for each requested product.
    3. If the customer requests "all remaining stock," provide the **exact available quantity** from the catalog.
    4. If no quantity is mentioned, assume **1 unit** as the default.
    5. If the product is out of stock, return `0` as the quantity.
    6. Ensure the output follows this JSON format:

    {{
        "orders": [
            {{"product_id": "XXX1234", "quantity": 5}},
            {{"product_id": "YYY5678", "quantity": 3}}
        ]
    }}

    Now, extract the order details:
    """

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300
        )

    try:
      return response.choices[0].message.content.strip()
    except json.JSONDecodeError:
      return []  # Return empty list if parsing fails

# extract_order_details_from_gpt("Hi there, I want to order all the remaining LTH0976 Leather Bifold Wallets you have in stock. I'm opening up a small boutique shop and these would be perfect for my inventory. Thank you!")

'```json\n{\n    "orders": [\n        {"product_id": "LTH0976", "quantity": 4}\n    ]\n}\n```'

In [159]:
import json

def clean_json_response(response_text):
    """ Removes Markdown code blocks from GPT responses. """
    return response_text.strip().strip("```json").strip("```").strip()

for _, row in email_classification_df.iterrows():
  if row["Category"].lower() == "order request":
    orders_json = extract_order_details_from_gpt(row["Message"])
    print(f"Extracted order details: {orders_json}")
    if orders_json: # check if the string is empty
        try:
            orders_data = json.loads(clean_json_response(orders_json))
        except json.JSONDecodeError as e:
            print(f"JSON Decode Error: {e}")
            print(f"Invalid JSON: '{orders_json}'")
    else:
        print("Received empty JSON string.")

    orders = orders_data.get("orders", [])

    # Load product inventory from Google Sheets
    products_df = products_df.replace([np.nan, np.inf, -np.inf], "")

    # Convert stock data into a dictionary (Product ID → Available Stock)
    stock_inventory = dict(zip(products_df["product_id"], products_df["stock"]))

    order_request_emails = []

    # Loop through extracted orders and access product_id and quantity
    for order in orders:
        product_id = order["product_id"]
        quantity = order["quantity"]
        print(product_id, quantity)

        if product_id and quantity:
            if product_id in stock_inventory and stock_inventory[product_id] >= quantity:
                # Order can be fulfilled
                status = "created"
                stock_inventory[product_id] -= quantity  # Update stock

                # Update stock levels in products_df
                index = products_df.index[products_df["product_id"] == product_id].tolist()

                if index:
                    products_df.at[index[0], "stock"] = stock_inventory[product_id]

                # products_df.loc[products_df["product_id"] == product_id, "stock"] = stock_inventory[product_id]

                response_message = f"Dear Customer,\n\nYour order for {quantity} unit(s) of {product_id} has been successfully processed.\n\nThank you for your business!"
            else:
                # Insufficient stock
                status = "out of stock"
                response_message = f"Dear Customer,\n\nUnfortunately, we are out of stock for {product_id}. Requested Quantity: {quantity}.\nWe apologize for the inconvenience and will notify you when restocked."

            # Append to order-status sheet
            order_status_sheet.append_row([row["Email ID"], product_id, quantity, status])

            # Append to order-response sheet
            order_response_sheet.append_row([row["Email ID"], response_message])
        else:
            print(f"⚠️ Unable to extract order details from email: {row['Email ID']}")

print(f"Orders processed, stock updated, and responses generated. 🔗 Shareable link: https://docs.google.com/spreadsheets/d/{output_document.id}")

Extracted order details: ```json
{
    "orders": [
        {"product_id": "LTH0976", "quantity": 0}
    ]
}
```
LTH0976 0
⚠️ Unable to extract order details from email: E001
Extracted order details: ```json
{
    "orders": [
        {"product_id": "VBT2345", "quantity": 0}
    ]
}
```
VBT2345 0
⚠️ Unable to extract order details from email: E002
Extracted order details: ```json
{
    "orders": [
        {"product_id": "SFT1098", "quantity": 0}
    ]
}
```
SFT1098 0
⚠️ Unable to extract order details from email: E004
Extracted order details: ```json
{
    "orders": [
        {"product_id": "CLF2109", "quantity": 0},
        {"product_id": "FZZ1098", "quantity": 0}
    ]
}
```
CLF2109 0
⚠️ Unable to extract order details from email: E007
FZZ1098 0
⚠️ Unable to extract order details from email: E007
Extracted order details: ```json
{
    "orders": [
        {"product_id": "VSC6789", "quantity": 0}
    ]
}
```
VSC6789 0
⚠️ Unable to extract order details from email: E008
Extracted order de

# Task 3. Handle product inquiry

In [163]:
import json
import pandas as pd

def find_relevant_products(message, products_df):
    """
    Searches for relevant products in the catalog based on keywords in the message.
    Returns a filtered DataFrame of matching products.
    """
    message_words = set(message.lower().split())  # Convert message to lowercase words
    filtered_products = products_df[
        products_df["name"].str.lower().apply(lambda x: any(word in x for word in message_words))
    ]

    return filtered_products

def generate_product_response(email_message):
    """
    Generates a customer response for product inquiries.
    Extracts relevant product details without exceeding token limits.
    """
    # Find matching products from catalog
    relevant_products = find_relevant_products(email_message, products_df)

    # If no relevant product found, send a generic response
    if relevant_products.empty:
        return "Dear Customer, we couldn’t find a match for your inquiry. Please provide more details."

    # Limit number of products sent to GPT (e.g., max 5 to avoid token overload)
    relevant_products = relevant_products.head(5)

    # Convert product details to JSON (Compact format to avoid token limits)
    product_details = relevant_products[["product_id", "name", "description", "price"]].to_dict(orient="records")

    prompt = f"""
    You are a customer service AI assistant for an e-commerce store.
    A customer has inquired about a product. Here is their email:

    ---
    Email Message:
    "{email_message}"
    ---

    ### Relevant Products:
    {json.dumps(product_details, indent=2)}

    ### Your Task:
    - Provide a professional and informative response to the customer.
    - Summarize the most relevant products and their details.
    - If multiple products match, suggest the best option.
    - Keep the response concise and professional.

    Example Response:
    "Dear Customer, based on your inquiry, we recommend the following product(s): ..."
    """

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300
    )

    return response.choices[0].message.content.strip()

# # Open Google Sheets
# output_document = gc.open('Solving Business Problems with AI - Output')
# inquiry_response_sheet = output_document.add_worksheet(title="inquiry-response", rows=50, cols=2)
# inquiry_response_sheet.update([['Email ID', 'Response']], 'A1:B1')

# Process inquiries
# response_message = generate_product_response("Hello, I need a new bag to carry my laptop and documents for work. My name is David and I'm having a hard time deciding which would be better - the LTH1098 Leather Backpack or the Leather Tote? Does one have more organizational pockets than the other? Any insight would be appreciated!")
# inquiry_response_sheet.append_row([row["email_id"], response_message])

for _, row in email_classification_df.iterrows():
    if row["Category"].lower() == "product inquiry":
        response_message = generate_product_response(row["Message"])
        print(response_message)
        inquiry_response_sheet.append_row([row["Email ID"], response_message])

print(f"Product inquiries processed. 🔗 Shareable link: https://docs.google.com/spreadsheets/d/{output_document.id}")


Dear David,

Thank you for reaching out to us with your inquiry about finding the perfect bag for work. I understand the importance of having a functional and stylish bag to carry your laptop and documents.

Unfortunately, it seems that the detailed information for the specific bags you're interested in, the LTH1098 Leather Backpack and the Leather Tote, is not currently available in our database. However, I can offer some general insights that might help you decide which option is best for your needs:

1. **Leather Backpack (LTH1098)**: Typically, backpacks offer more organizational pockets both inside and outside, which is great for storing items like a laptop, notebooks, pens, and small electronics securely. This might be more suitable if organization and ease of carrying heavier loads are your priorities.

2. **Leather Tote**: Totes usually offer a spacious main compartment, which might be ideal if you prefer quick access to your items. They might come with fewer organizational poc

Currently, following use cases are not taken care:

- Processing the same task repeatedly results in duplicate entries. To address this, we need to enhance the workflow to detect existing orders or responses and either skip or update them.
- The system is not updating stock levels in the primary worksheet. We can resolve this by implementing validation checks and business logic to manage these requests effectively.

Thanks for the opportuinity to work on this task.