# Feedback insights demo with Azure OpenAI GPT
Use GPT to summarise and extract insights from customer feedback

## 0. Pre-reqs
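- Python 3.10.12 was used to create this notebook
- The ability to `pip install` the following packages:
    - openai (a pre-1.0 release; the code below uses the legacy `openai.ChatCompletion` interface)
    - pandas
    - tiktoken (for counting the number of tokens in text)
    - python-dotenv (for loading keys from a `.env` file)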

In [None]:
# Install dependencies
# openai.ChatCompletion (used below) was removed in openai 1.0, so stay on a 0.x release
%pip install "openai<1"
%pip install pandas
%pip install tiktoken
%pip install python-dotenv
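
`tiktoken` is listed so that prompts can be measured before they are sent. As a hedged sketch, `count_tokens` below (a name invented for this illustration) counts tokens with the `cl100k_base` encoding, which is the one used by the GPT-3.5 Turbo and GPT-4 model families. It assumes a rough four-characters-per-token estimate is an acceptable fallback when `tiktoken` or its encoding data is unavailable.

```python
def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count the tokens in text, or estimate them if tiktoken cannot be used."""
    try:
        import tiktoken
        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(text))
    except Exception:
        # Assumption: roughly 4 characters per token for English text (rounded up)
        return (len(text) + 3) // 4

print(count_tokens("The branch staff were friendly but the queue was long."))
```

Comparing this count for the system prompt plus a comment against the model's context limit is one way to spot oversized records before calling the service.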

# 1. Load data and explore a bit
For the purposes of this demo, the data is in a CSV file and we'll import it using pandas.


In [None]:
import pandas as pd

df = pd.read_csv('CustomerFeedback.csv')

# What's in there?
df.head()

---
# 2. Explore Summarisation
### Environment setup
You'll need to create a file called `.env` that contains the following:
```
AZURE_OPENAI_SERVICE = "<name of AOAI service in portal>"
AZURE_OPENAI_KEY = "<API Key for AOAI service>"
AZURE_OPENAI_MODEL = "<e.g. gpt-35-turbo>"
AZURE_OPENAI_DEPLOYMENT = "<deployment name in AI studio>"
```
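
A quick check that all four settings are present can save a confusing authentication error later. The sketch below is one way to do it; `REQUIRED_SETTINGS`, `missing_settings` and the example values are made up for this illustration and are not part of the notebook.

```python
import os

# The four settings the .env file above is expected to define
REQUIRED_SETTINGS = [
    "AZURE_OPENAI_SERVICE",
    "AZURE_OPENAI_KEY",
    "AZURE_OPENAI_MODEL",
    "AZURE_OPENAI_DEPLOYMENT",
]

def missing_settings(env=None) -> list:
    """Return the names of required settings that are absent or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]

# Example with a hypothetical, partly filled-in configuration
example_env = {"AZURE_OPENAI_SERVICE": "my-aoai", "AZURE_OPENAI_KEY": ""}
print(missing_settings(example_env))
```

Calling `missing_settings()` with no argument, after `load_dotenv()` has run, checks the real environment instead of the example dictionary.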

### Set up the Azure OpenAI service
Along with some helper functions.

In [3]:
from dotenv import load_dotenv
import os
import openai

# Load config from .env file (instructions above)
load_dotenv()
AZURE_OPENAI_SERVICE = os.getenv('AZURE_OPENAI_SERVICE')
AZURE_OPENAI_DEPLOYMENT = os.getenv('AZURE_OPENAI_DEPLOYMENT')
AZURE_OPENAI_MODEL = os.getenv('AZURE_OPENAI_MODEL')
AZURE_OPENAI_KEY = os.getenv('AZURE_OPENAI_KEY')

# Configure OpenAI
openai.api_type = "azure"
openai.api_base = f"https://{AZURE_OPENAI_SERVICE}.openai.azure.com"
openai.api_version = "2023-07-01-preview"
openai.api_key = AZURE_OPENAI_KEY

# Helper function to call OpenAI
# `messages` is a list of {"role": ..., "content": ...} dicts (see make_messages below)
def call_gpt(messages: list[dict]) -> str:
    response = openai.ChatCompletion.create(
                    deployment_id=AZURE_OPENAI_DEPLOYMENT,
                    model=AZURE_OPENAI_MODEL,
                    messages=messages, 
                    temperature=0.7,   # lower values give more deterministic output, higher values more varied output
                    stream=False
                    )
    return response.choices[0].message.content

# Helper function to make the messages in the required ChatCompletion format
def make_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    messages = []
    messages.append({"role" : "system", "content" : system_prompt})
    messages.append({"role": "user", "content": user_prompt })
    return messages
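
Calls to a hosted model can fail transiently, for example when the service is rate limiting requests. The following is a minimal sketch of a retry wrapper with exponential backoff. It assumes, for simplicity, that every exception is worth retrying, which is not true of content-filter rejections. `call_with_retry`, its parameters and `flaky_call` are hypothetical names, not part of the `openai` package.

```python
import time

def call_with_retry(func, *args, max_attempts=3, base_delay=1.0, **kwargs):
    """Call func(*args, **kwargs), retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            time.sleep(base_delay * 2 ** (attempt - 1))

# Stand-in for call_gpt that fails twice before succeeding
attempts = {"count": 0}

def flaky_call(text):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated transient failure")
    return f"summary of: {text}"

print(call_with_retry(flaky_call, "great service", base_delay=0.01))
```

In the notebook this could wrap the real helper as `call_with_retry(call_gpt, messages)`.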


## Test summarising a single record

In [None]:
single_record_summarisation_prompt = """
The [comment] below is a user review of Better Bank services and represents their recent experience with these services. You should answer the questions below based solely on the content of [comment] and no further content should be generated.  
 
[FACTS]:  
Use the following facts to help you answer the questions:  
- Better Bank is in the banking and insurance industry.
- They offer products including Home Loans, Personal Loans, Credit Cards, Savings Accounts, Term Deposits and Insurance.
- Their competitors include Bank A, Bank B or Insurance Co A
 
[OUTPUT]:
Each question from [QUESTIONS] below is in the format "json-element-name::question".   
Format your answers as a single JSON document using the json-element-name of each question.  
 
[QUESTIONS]  
1. summary::Generate a summary of the [comment] in less than 50 words.  
2. competitors::Any Better Bank competitors. Answer [N/A] if there are no mentions.  
3. products::Any Better Bank products mentioned. Answer [N/A] if there are no mentions.
4. classifications::Classify the [comment] into one of the following categories:   
    - Wait time  
    - Price  
    - Rates  
    - Customer Service  
    - Premiums 
    - Online platform   
    - Other   

[comment] """

print("==== SYSTEM PROMPT:\n" + single_record_summarisation_prompt)
comment = 'COMMENT:' + df.iloc[0]["Comment"]
print("==== COMMENT\n" + comment)

# call GPT to summarise using the prompt
response = call_gpt(make_messages(single_record_summarisation_prompt, comment))
print("\n===== RESPONSE:\n" + response)
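
The prompt asks for four named JSON elements and a fixed set of categories, so a reply can be checked mechanically. Below is a small sketch of such a check; `check_answer` is a hypothetical helper and `sample_reply` is a made-up reply, not real model output.

```python
import json

EXPECTED_KEYS = {"summary", "competitors", "products", "classifications"}
CATEGORIES = {"Wait time", "Price", "Rates", "Customer Service",
              "Premiums", "Online platform", "Other"}

def check_answer(reply: str) -> list:
    """Return a list of problems found in a reply; an empty list means it looks fine."""
    try:
        answer = json.loads(reply)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e.msg}"]
    if not isinstance(answer, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing key: {key}" for key in sorted(EXPECTED_KEYS - set(answer))]
    if "classifications" in answer:
        category = answer["classifications"]
        if not isinstance(category, str) or category not in CATEGORIES:
            problems.append(f"unexpected category: {category!r}")
    return problems

sample_reply = ('{"summary": "Long wait on the phone.", "competitors": "N/A", '
                '"products": "Credit Cards", "classifications": "Wait time"}')
print(check_answer(sample_reply))
```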

___
# 3. Process the dataset


In [5]:
# Supporting modules 
import os
import json

# Create output folder if required
if not os.path.exists('output'):
    os.makedirs('output')

## 3.1 Single Record summarisation and output to file

This will create a file in the `output` folder.  
The file will be created **even if one (or all) records fail** JSON conversion.
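
JSON conversion often fails because the model adds a sentence or some formatting around the object rather than returning it bare. One possible mitigation, sketched below, is to pull out the span from the first `{` to the last `}` before parsing. `extract_json` is a hypothetical helper, the reply is invented, and the approach assumes each reply contains a single JSON object.

```python
import json

def extract_json(reply: str):
    """Parse the span from the first '{' to the last '}'; return None if that fails."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None

# A made-up reply with extra text around the JSON object
reply = ('Sure! Here is the JSON:\n'
         '{"Summary": "Slow app.", "Classification": "Online platform"}\n'
         'Let me know if you need anything else.')
print(extract_json(reply))
```

Records for which this still returns `None` can be logged as errors in the same way as before.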

In [None]:
# Print the total number of feedback comments in the dataframe
print(len(df), 'total feedback items')

In [None]:
# IMPORTANT: Make sure you "Execute Above Cells" using the pop-up menu at the top right of this code block

# LIMIT THIS RUN TO N RECORDS
max_records_to_process = 100
df_single_records = df.head(max_records_to_process).copy()

# Error log - define a DF for capturing any summarisation errors (e.g. content filtering)
df_errors = pd.DataFrame(columns=['CommentID', 'Date', 'ERROR'])


single_record_summarisation_prompt = """
The [comment] below is a user review of Better Bank services and represents their recent experience with these services. You should answer the questions below based solely on the content of [comment] and no further content should be generated.  
 
[FACTS]:  
Use the following facts to help you answer the questions:  
- Better Bank is in the banking and insurance industry.
- They offer products including Home Loans, Personal Loans, Credit Cards, Savings Accounts, Term Deposits and Insurance.
- Their competitors include Bank A, Bank B or Insurance Co A
 

Review the feedback [COMMENT] below, and summarise it, into a JSON schema as below:

{
    "Summary": <Generate a summary of the [comment] in less than 50 words>,
    "Competitor": <The first Better Bank competitor mentioned. Answer [N/A] if there are no mentions.>,
    "Products": <The main Better Bank product mentioned. Answer [N/A] if there are no mentions.>,
    "Classification": <Classify the [comment] into a single category: Wait time, Price, Rates, Customer Service, Premiums, Online platform, Other>
}

Double-check that the JSON schema is valid JSON.

The [COMMENT] is:
"""

print("\n=== Processing first", len(df_single_records), "single records >>>\n")

for index, row in df_single_records.iterrows():
    try:
        print(f"Summarising Survey record: {row['CommentID']}")
        df_single_records.at[index, 'JSON'] = call_gpt(make_messages(single_record_summarisation_prompt, ' COMMENT:' + row['Comment']))
    except Exception as e:
        print(f"Error summarising Survey record: {row['CommentID']}. Error: {e}")

        # remove the record from further processing
        df_single_records.drop(index, inplace=True)

        # copy the details into df_errors (store the message text, not the exception object)
        new_row = {'CommentID': row['CommentID'], 'ERROR': str(e)}
        df_errors = pd.concat([df_errors, pd.DataFrame(new_row, index=[0])], ignore_index=True)

# Error file name, defined up front because the JSON check below also writes to it
FILENAME_ERRORS_SINGLE='output/errors_summary_single_records.csv'

# if we got errors
if len(df_errors) > 0:
    df_errors.to_csv(FILENAME_ERRORS_SINGLE, index=False)
    print('\n==== Wrote ERROR file:', FILENAME_ERRORS_SINGLE, '\n')


# For each row in the DF, check the JSON column contains valid JSON, and add each of the JSON keys as a new column
for index, row in df_single_records.iterrows():
    try:
        json_data = json.loads(row['JSON'])
        for key in json_data.keys():
            df_single_records.loc[index, key] = json_data[key]
        print(f"Converted JSON for Feedback comment: {row['CommentID']}")
    except Exception as e:
        print(f"Error {e} processing JSON for Survey comment: {row['CommentID']}:")
        print(row['JSON'])
        
        # Log the error to the error file (the results themselves go to a separate file)
        new_row = {'CommentID': row['CommentID'], 'ERROR': f"Invalid JSON: {e}"}
        df_errors = pd.concat([df_errors, pd.DataFrame(new_row, index=[0])], ignore_index=True)
        df_errors.to_csv(FILENAME_ERRORS_SINGLE, index=False)
        print('\n==== Wrote file: ' + FILENAME_ERRORS_SINGLE)
        
        # We shouldn't fail the whole file because GPT mis-formatted one or more records
        continue

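# Show the first rows (a bare expression is only rendered when it is the last line of a cell)
display(df_single_records.head(10))

# Drop the JSON column (it will be missing if no record was summarised)
df_single_records.drop(columns=['JSON'], inplace=True, errors='ignore')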

# write the df_single_records DF to a CSV file
FILENAME_SINGLE='output/AOAI-FeedbackSummary.csv'
df_single_records.to_csv(FILENAME_SINGLE, index=False)
print('\n==== Wrote file: ' + FILENAME_SINGLE)
print("==== THIS DOESN'T mean that all JSON was converted successfully; please check for ERRORS above ====")
print("(ignore any 'FutureWarning' messages)\n")
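
Once the file is written, a natural first insight is how the feedback is spread across categories. A minimal sketch using only the standard library follows; the rows are invented sample data, and within the notebook `df_single_records['Classification'].value_counts()` should give a similar tally.

```python
from collections import Counter

# Invented sample rows in the shape of the output file
rows = [
    {"CommentID": 1, "Classification": "Wait time"},
    {"CommentID": 2, "Classification": "Rates"},
    {"CommentID": 3, "Classification": "Wait time"},
    {"CommentID": 4, "Classification": None},  # e.g. JSON conversion failed for this record
]

# Tally categories, grouping records without a classification together
counts = Counter(row["Classification"] or "Unclassified" for row in rows)
for category, count in counts.most_common():
    print(f"{category}: {count}")
```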


