<a href="https://colab.research.google.com/github/Mary63/ChatGPT-Prompt-Engineering-for-Intent-Prediction/blob/main/message_intent_prediction_by_ChatGPT_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **<center><font color='sky blue'>Prompt Engineering Using ChatGPT API for Intent Prediction</font></center>**

Prompt engineering has gained significant importance in AI with the rising popularity of generative models, particularly ChatGPT. Interacting with a large language model like ChatGPT involves crafting prompts, which is more complex than it may initially seem. The first interactions may feel novel, but it soon becomes apparent that mastering prompts requires practice and thoughtful consideration. Consequently, prompt engineering, the process of creating highly effective prompts, has become a sought-after skill for anyone working with large language models.

Within this repository, I will demonstrate the application of prompt engineering to predict the intents of text in a very simple way. The project focuses on a limited dataset collected from a text messaging application designed to assist individuals in need. Due to the small size of the data, additional preprocessing, such as grouping related messages based on providers or timestamps, is not feasible. However, with a larger dataset, it would be both plausible and reasonable to conduct preprocessing to enhance intent prediction accuracy.

# **<center><font color='sky blue'>Setup</font></center>**
 **<center><font color='sky blue'>Load the OpenAI API key and relevant Python libraries</font></center>**

To begin, you'll need to install the OpenAI Python library. You can do this by running the following command:

`!pip install openai`

Once the library is installed, you'll need to configure it with your OpenAI account's secret key, which can be found on the OpenAI website. There are two ways to set the secret key:

Set it as the `OPENAI_API_KEY` environment variable before using the library. In a notebook, `!export` runs in a temporary subshell, so the variable would not persist into the Python session; use the `%env` magic instead:

`%env OPENAI_API_KEY=sk-...`

Alternatively, you can assign the value directly to `openai.api_key` in your Python code:

`import openai`

`openai.api_key = "sk-..."`

Choose the method that is most convenient for you, and ensure that your secret key is properly configured before utilizing the OpenAI library.
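As a minimal sketch of the environment-variable route, the snippet below reads the key from the environment and fails early with a clear message if it is missing. `load_api_key` is a hypothetical helper introduced here for illustration; it is not part of the OpenAI library.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Hypothetical helper: raises RuntimeError when the variable is unset
    or blank, so a missing key is noticed before the first API call.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; configure it before calling the API")
    return key


# Usage, once the variable has been set:
# openai.api_key = load_api_key()
```

Failing early keeps a configuration problem from surfacing later as a less obvious authentication error.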

In [None]:
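# get_completion below uses openai.ChatCompletion, which was removed in openai 1.0, so pin an earlier release
!pip install "openai<1"
!pip install python-dotenv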

In [2]:
from google.colab import files
import pandas as pd
import io
import os
import openai
from dotenv import load_dotenv, find_dotenv

Since I am using Google Colab, I uploaded the OpenAI API key as a text file and then pointed the library at that file.

In [3]:
openai_api_key_file = files.upload()

Saving OPENAI_API_KEY.txt to OPENAI_API_KEY.txt


In [4]:
openai.api_key_path = 'OPENAI_API_KEY.txt'

In [16]:
# from dotenv import load_dotenv, find_dotenv
# _ = load_dotenv(find_dotenv()) # read local .env file


# openai.api_key  = os.getenv('OPENAI_API_KEY')


# **<center><font color='sky blue'>Loading Data</font></center>**
The data used here is a very small sample, which is not enough for further preprocessing such as grouping related messages by provider or timestamp.

In [5]:
#load the data
uploaded = files.upload()

Saving promt_engineering_intent_prediction_data.csv to promt_engineering_intent_prediction_data.csv


In [6]:
df = pd.read_csv(io.BytesIO(uploaded['promt_engineering_intent_prediction_data.csv']))
df

Unnamed: 0,provider,messageid,message,type,userid,orgid,timestamp,direction,processingtime,matchmethod,isresolved
0,provider 1,8374,How to use my smart phone?,,1,1,2023-03-23T19:33:05,in,102,StandardKeywordMatch,
1,provider 1,8375,Help with my situation,,2,1,2023-03-24T20:34:40,in,111,StandardKeywordMatch,
2,provider 1,8376,Can you remove my subscription?,,3,1,2023-03-18T16:46:12,in,91,StandardKeywordMatch,
3,provider 2,8377,Hi,,4,2,2023-03-21T15:04:35,in,36,,
4,provider 2,8378,Where is the church located,unresolved,5,2,2023-03-21T15:06:43,in,511,,False
5,provider 2,8379,At Your Best,,6,3,2023-03-27T19:57:13,in,57,StandardKeywordMatch,
6,provider 2,8380,Hi David,unresolved,7,4,2023-03-20T15:01:59,in,420,,False
7,provider 2,8381,I'm glad I did not type out my issues lol I go...,unresolved,8,4,2023-03-21T07:24:25,in,449,,False
8,provider 2,8382,,unresolved,9,5,2023-03-15T17:47:12,in,531,,False
9,provider 2,8383,pray call,,10,6,2023-03-19T18:13:40,in,80,StandardKeywordMatch,


# **<center><font color='sky blue'>Helper Function</font></center>**

In [7]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    """Send a single user prompt to the chat model and return the reply text."""
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # 0 makes the output as deterministic as possible
    )
    return response.choices[0].message["content"]
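The helper above relies on `openai.ChatCompletion`, which belongs to the pre-1.0 interface of the `openai` package. For readers on version 1.0 or later, a rough equivalent might look like the sketch below. `get_completion_v1` is a name made up for this example, and the client is passed in as an argument so the function itself does not depend on how the key is configured.

```python
def get_completion_v1(client, prompt, model="gpt-3.5-turbo"):
    """Sketch of get_completion for the openai>=1.0 client interface.

    `client` is expected to be an `openai.OpenAI()` instance (or any
    object exposing the same `chat.completions.create` method).
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the output as deterministic as possible
    )
    # In the 1.x interface the message is an object, not a dict.
    return response.choices[0].message.content


# Usage (assumes openai>=1.0 is installed and OPENAI_API_KEY is set):
# from openai import OpenAI
# print(get_completion_v1(OpenAI(), "Say hello"))
```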

# **<center><font color='sky blue'>Prompt Engineering for Intent Prediction</font></center>**

### **Output as JSON Object**

Unfortunately, the free tier of the ChatGPT API allows only three calls per minute, so here I selected just three messages to check for their intents.

In [9]:
for i in range(0, 3):
  text = df.loc[i]['message']
  prompt = f""" Identify the following items from the messages: \
  - Sentiment (positive or negative) \
  - Identify a list of emotions that the writer of the \
  message is expressing.
  - Is the person expressing anger? \
  - Is the person expressing violence? \
  - Is the person expressing hate? \
  - Is the person expressing suisidal thought? \
  - Is the person expressing marriage concerns? \
  - Is the person expressing self-harm? \
  - Is the person expressing a threat towards others? \
  - You should extract what the intent of the message is. \

  The message is in triple backticks. \
  Format your response as a JSON object with \
  "Sentiment", "Emotion", "Anger", "Violence", "Hate", \
  "Suisidal thought", "Marriage", "Self-harm", \
  "Threatening" and "Intent" as the keys.
  If the information isn't present, use "unknown" \
  as the value. \
  Include no more than five items in the list for Emotion.\
  Keep the length of the intent \
  to at most 3 words.\
  Format the Anger, Violence, Hate, Suisidal thought \
  Marriage, Self-harm, and Threatening value as a boolean.
  ```{text}```
  """

  response = get_completion(prompt)
  print(i, response)



0 {
  "Sentiment": "unknown",
  "Emotion": ["curiosity"],
  "Anger": false,
  "Violence": false,
  "Hate": false,
  "Suisidal thought": false,
  "Marriage": false,
  "Self-harm": false,
  "Threatening": false,
  "Intent": "seeking information"
}
1 {
  "Sentiment": "unknown",
  "Emotion": ["uncertainty"],
  "Anger": false,
  "Violence": false,
  "Hate": false,
  "Suisidal thought": false,
  "Marriage": false,
  "Self-harm": false,
  "Threatening": false,
  "Intent": "seeking assistance"
}
2 {
  "Sentiment": "neutral",
  "Emotion": ["request", "polite"],
  "Anger": false,
  "Violence": false,
  "Hate": false,
  "Suisidal thought": false,
  "Marriage": false,
  "Self-harm": false,
  "Threatening": false,
  "Intent": "cancel subscription"
}
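The three-calls-per-minute limit mentioned above could also be handled by spacing the calls out rather than by hand-picking three rows. The following is a sketch under that assumption. `throttled_map` is a hypothetical helper, and the `sleep` and `clock` parameters exist only so the waiting behaviour can be swapped out or inspected.

```python
import time


def throttled_map(func, items, calls_per_minute=3, sleep=time.sleep, clock=time.monotonic):
    """Apply func to each item, waiting between calls to respect a per-minute limit."""
    min_interval = 60.0 / calls_per_minute  # seconds required between two calls
    results = []
    last_call = None
    for item in items:
        if last_call is not None:
            # Wait only for the part of the interval that has not elapsed yet.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(func(item))
    return results


# Usage with the helper defined earlier (prompts is a list of prompt strings):
# responses = throttled_map(get_completion, prompts)
```

Even spacing is the simplest policy; it does not retry failed calls, so a rate-limit error raised by the API would still need separate handling.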


### **Another call to the API to get three more predictions**

In [10]:
for i in range(24, 27):
  text = df.loc[i]['message']
  prompt = f""" Identify the following items from the messages: \
  - Sentiment (positive or negative) \
  - Identify a list of emotions that the writer of the \
  message is expressing.
  - Is the person expressing anger? \
  - Is the person expressing violence? \
  - Is the person expressing hate? \
  - Is the person expressing suisidal thought? \
  - Is the person expressing marriage concerns? \
  - Is the person expressing self-harm? \
  - Is the person expressing threatening? \
  - You should extract what the intent of the message is. \

  The message is in triple backticks. \
  Format your response as a JSON object with \
  "Sentiment", "Emotion", "Anger", "Violence", "Hate", \
  "Suisidal thought", "Marriage", "Self-harm", \
  "Threatening" and "Intent" as the keys.
  If the information isn't present, use "unknown" \
  as the value. \
  Include no more than five items in the list for Emotion.\
  Keep the length of the intent \
  to at most 3 words.\
  Format the Anger, Violence, Hate, Suisidal thought \
  Marriage, Self-harm, and Threatening value as a boolean.
  ```{text}```
  """

  response = get_completion(prompt)
  print(i, response)


24 {
  "Sentiment": "positive",
  "Emotion": ["hopeful", "concerned"],
  "Anger": false,
  "Violence": false,
  "Hate": false,
  "Suisidal thought": false,
  "Marriage": false,
  "Self-harm": false,
  "Threatening": false,
  "Intent": "healing"
}
25 {
  "Sentiment": "positive",
  "Emotion": ["hopeful", "grateful", "peaceful", "reverent", "faithful"],
  "Anger": false,
  "Violence": false,
  "Hate": false,
  "Suisidal thought": false,
  "Marriage": false,
  "Self-harm": false,
  "Threatening": false,
  "Intent": "seeking divine intervention"
}
26 {
  "Sentiment": "positive",
  "Emotion": ["hopeful", "optimistic", "excited", "curious", "vulnerable"],
  "Anger": false,
  "Violence": false,
  "Hate": false,
  "Suisidal thought": false,
  "Marriage": false,
  "Self-harm": false,
  "Threatening": false,
  "Intent": "seeking love"
}
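The prompt is repeated almost verbatim in each cell, and at least one row in the data (row 8) has an empty message. One possible way to handle both is to build the prompt in a small function. In the sketch below, `build_prompt`, `INSTRUCTIONS`, and `FENCE` are names introduced for illustration, and the shortened instruction text is only a stand-in for the full prompt used above.

```python
FENCE = "`" * 3  # the triple-backtick delimiter placed around the message

# Shortened stand-in for the full instruction text used in the cells above.
INSTRUCTIONS = (
    "Identify the sentiment, up to five emotions, and the intent "
    "(at most 3 words) of the message in triple backticks. "
    "Format your response as a JSON object."
)


def build_prompt(text, instructions=INSTRUCTIONS):
    """Return the instructions followed by the delimited message."""
    # Missing messages (None, or NaN as pandas reports them) become an empty string.
    message = "" if text is None or text != text else str(text)
    # Remove backticks so the message cannot close the delimiter early.
    message = message.replace("`", "")
    return f"{instructions}\n{FENCE}{message}{FENCE}"


print(build_prompt("Where is the church located"))
```

Stripping backticks is a deliberately blunt choice: it changes the message slightly but keeps user text from being read as part of the instructions.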


### **Parse the response string into a Python object**

The result is formatted as JSON, but in Python it is still a plain string that needs to be converted into a Python object (here, a dictionary).

In [11]:
type(response)

str

In [12]:
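import ast
import json

def read_string_to_list(input_string):
    """Parse a model response into a Python object (dict or list); return None on failure."""
    if input_string is None:
        return None

    try:
        # Try strict JSON first; swapping quote characters blindly would corrupt apostrophes such as "I'm".
        return json.loads(input_string)
    except json.JSONDecodeError:
        pass

    try:
        # Fall back to Python literal syntax (single quotes, True/False/None).
        return ast.literal_eval(input_string)
    except (ValueError, SyntaxError):
        print("Error: could not parse the response as JSON or as a Python literal")
        return None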

In [13]:
intents = read_string_to_list(response)
print(intents)
print(type(intents))

{'Sentiment': 'positive', 'Emotion': ['hopeful', 'optimistic', 'excited', 'curious', 'vulnerable'], 'Anger': False, 'Violence': False, 'Hate': False, 'Suisidal thought': False, 'Marriage': False, 'Self-harm': False, 'Threatening': False, 'Intent': 'seeking love'}
<class 'dict'>
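Once the response is a dictionary, it can be worth checking that the model returned the requested keys and that the flag fields really are booleans, since the prompt also allows "unknown" as a value. The sketch below does this with a hypothetical `check_record` helper; the key names are copied from the prompt as written above.

```python
# Key names exactly as requested in the prompt (including its spelling).
EXPECTED_KEYS = [
    "Sentiment", "Emotion", "Anger", "Violence", "Hate",
    "Suisidal thought", "Marriage", "Self-harm", "Threatening", "Intent",
]
BOOLEAN_KEYS = [
    "Anger", "Violence", "Hate", "Suisidal thought",
    "Marriage", "Self-harm", "Threatening",
]


def check_record(record):
    """Return a list of problems found in one parsed response (empty if none)."""
    problems = []
    for key in EXPECTED_KEYS:
        if key not in record:
            problems.append(f"missing key: {key}")
    for key in BOOLEAN_KEYS:
        if key in record and not isinstance(record[key], bool):
            problems.append(f"{key} is not a boolean: {record[key]!r}")
    return problems


sample = {
    "Sentiment": "positive", "Emotion": ["hopeful"], "Anger": False,
    "Violence": False, "Hate": False, "Suisidal thought": False,
    "Marriage": False, "Self-harm": False, "Threatening": False,
    "Intent": "seeking love",
}
print(check_record(sample))
```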


### **Testing Different Variations of the Prompt**

Here I test a variation of the prompt that supplies a list of candidate intents for the model to consider when choosing an intent for the text:

In [14]:
intent_list = ['gratitude', 'greeting', 'user unsubscribe', 'suicidal', 'prayer request', 'appreciation', 'location inquiry' ]

for i in range(0, 3):
  text = df.loc[i]['message']
  prompt = f""" Identify the following items from the messages: \
  - Sentiment (positive or negative) \
  - Identify a list of emotions that the writer of the \
  message is expressing.
  - Is the person expressing anger? \
  - Is the person expressing violence? \
  - Is the person expressing hate? \
  - Is the person expressing suisidal thought? \
  - Is the person expressing marriage concerns? \
  - Is the person expressing self-harm? \
  - Is the person expressing threatening? \
  - You should extract what the intent of the message is. \
  - Determine whether each item in the following list of \
    intent list is an intent in the message.
  list of intents: {", ".join(intent_list)}

  The message is in triple backticks. \

  Output a python list of object with \
  "Sentiment", "Emotion", "Anger", "Violence", "Hate", \
  "Suisidal thought", "Marriage", "Self-harm", \
  "Threatening", "Intent" and "Intent from the list" as the keys.
  If the information isn't present, use "unknown" \
  as the value. \
  Include no more than five items in the list for Emotion.\
  Keep the length of the intent \
  to at most 3 words.\
  Format the Anger, Violence, Hate, Suisidal thought \
  Marriage, Self-harm, and Threatening value as a boolean.
  ```{text}```
  """

  response_2 = get_completion(prompt)
  print(i, response_2)

0 [
  {
    "Sentiment": "unknown",
    "Emotion": ["curiosity", "confusion", "frustration"],
    "Anger": False,
    "Violence": False,
    "Hate": False,
    "Suisidal thought": False,
    "Marriage": False,
    "Self-harm": False,
    "Threatening": False,
    "Intent": "location inquiry",
    "Intent from the list": True
  }
]
1 [
  {
    "Sentiment": "unknown",
    "Emotion": ["helplessness", "uncertainty", "anxiety", "hopelessness", "confusion"],
    "Anger": False,
    "Violence": False,
    "Hate": False,
    "Suisidal thought": False,
    "Marriage": False,
    "Self-harm": False,
    "Threatening": False,
    "Intent": "request",
    "Intent from the list": "unknown"
  }
]
2 [{"Sentiment": "unknown", "Emotion": ["request"], "Anger": False, "Violence": False, "Hate": False, "Suisidal thought": False, "Marriage": False, "Self-harm": False, "Threatening": False, "Intent": "user unsubscribe", "Intent from the list": "unsubscribe"}]
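In the outputs above, the "Intent from the list" field comes back in three different shapes (`True`, `"unknown"`, and `"unsubscribe"`), and the booleans use Python's `False` rather than JSON's `false`. A rough sketch of one way to post-process such output follows: it parses the string with `ast.literal_eval` and maps a returned value back onto the supplied list. `normalize_intent` is a hypothetical helper, and its substring matching is a crude heuristic, not a robust matcher.

```python
import ast

INTENT_LIST = ['gratitude', 'greeting', 'user unsubscribe', 'suicidal',
               'prayer request', 'appreciation', 'location inquiry']


def normalize_intent(value, intent_list=INTENT_LIST):
    """Map a model-returned value onto an entry of intent_list, or return None."""
    if not isinstance(value, str):
        return None  # e.g. the model answered True instead of naming an intent
    candidate = value.strip().lower()
    if not candidate:
        return None
    for intent in intent_list:
        # Accept exact matches and partial ones such as "unsubscribe".
        if candidate == intent or candidate in intent or intent in candidate:
            return intent
    return None


# Python-literal output (note the capitalised False), reduced to a few keys.
raw = '[{"Anger": False, "Intent": "user unsubscribe", "Intent from the list": "unsubscribe"}]'
records = ast.literal_eval(raw)
print(normalize_intent(records[0]["Intent from the list"]))
```

Asking the model for a JSON object, as in the earlier prompts, avoids the `False`/`false` mismatch in the first place; the fallback here only matters when the prompt requests a Python list.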
