# Inferring
In this lesson, you will infer sentiment and topics from product reviews and news articles.

## Setup

In [14]:
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key = os.getenv('OPENAI_API_KEY')


In [15]:
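def get_completion(prompt, model="gpt-3.5-turbo"):
    """Send a single user prompt to the chat model and return the reply text."""
    # openai.ChatCompletion is the pre-1.0 interface of the openai package.
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # degree of randomness of the model's output (0 = most deterministic)
    )
    return response.choices[0].message["content"]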
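
The helper above relies on `openai.ChatCompletion.create`, which belongs to the pre-1.0 interface of the `openai` package. A minimal sketch of the same helper for the client-object interface of `openai>=1.0` could look like this. The name `get_completion_v1` and the `client` parameter are assumptions made for this sketch, and a small stand-in client is used so that the example runs without an API key.

```python
from types import SimpleNamespace


def get_completion_v1(prompt, client, model="gpt-3.5-turbo"):
    """Variant of get_completion for the client-object interface.

    `client` is assumed to be an `openai.OpenAI()` instance (openai>=1.0);
    the function name is hypothetical and not part of the lesson.
    """
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message.content


# Offline stand-in that mimics only the attributes used above.
def _fake_create(model, messages, temperature):
    reply = SimpleNamespace(content="echo: " + messages[0]["content"])
    return SimpleNamespace(choices=[SimpleNamespace(message=reply)])


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)
print(get_completion_v1("hello", fake_client))  # echo: hello
```

With a real client, the call would be `get_completion_v1(prompt, OpenAI())` after `from openai import OpenAI`.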


## Product review text

In [3]:
lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""


In [4]:
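lamp_review_2 = """
I needed a nice lamp for my bedroom, and this one also had \
additional storage and a price that was not too high. \
I received it quickly; it arrived in two days. \
The lamp's pull string broke in transit, and the company happily \
sent over a new lamp, which also arrived within a few days. \
It was easy to assemble. Then I found that a part was missing, \
so I contacted their customer support and they very quickly sent me the missing part! \
In my view, this is a great company that really cares about its customers and products!!
"""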


## Sentiment (positive/negative)

In [5]:
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple quotes?

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)


The sentiment of the product review is positive.


In [6]:
prompt = f"""
What is the sentiment of the following product review, \
which is delimited with triple quotes?

Review text: '''{lamp_review_2}'''
"""
response = get_completion(prompt)
print(response)


Positive.


In [7]:
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple quotes?

Give your answer as a single word, either "positive" \
or "negative".

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)


positive
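
Constraining the answer to a single word makes the reply easy to check in code. The sketch below shows one way to normalize such a reply before using it. The function name `normalize_sentiment` is hypothetical, and the sample replies are illustrative strings rather than recorded model output.

```python
def normalize_sentiment(response):
    """Map a raw reply to 'positive' or 'negative', or None if it is anything else."""
    label = response.strip().strip('."\'').lower()
    return label if label in {"positive", "negative"} else None


print(normalize_sentiment("positive"))                    # positive
print(normalize_sentiment(" Negative.\n"))                # negative
print(normalize_sentiment("The sentiment is positive."))  # None
```

Returning `None` for anything unexpected lets the caller decide whether to retry the prompt or flag the review for manual inspection.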


In [8]:
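prompt = f"""
What is the sentiment of the following product review, \
which is delimited with triple quotes?

Give your answer as a single word, either "positive" or "negative".

Review text: '''{lamp_review_2}'''
"""
response = get_completion(prompt)
print(response)


positive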


## Identify types of emotions

In [9]:
prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)


happy, satisfied, grateful, impressed, content
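
A comma-separated reply like the one above can be turned into a Python list and tallied across several reviews. The following is a rough sketch: `parse_emotions` is a hypothetical helper, and the second entry in `replies` is a made-up reply used only to show the counting.

```python
from collections import Counter


def parse_emotions(response, limit=5):
    """Split a comma-separated reply into clean lower-case labels, keeping at most `limit`."""
    labels = [part.strip().lower() for part in response.split(",")]
    labels = [label for label in labels if label]  # drop empty pieces
    return labels[:limit]


replies = [
    "happy, satisfied, grateful, impressed, content",  # reply shown above
    "satisfied,grateful",                              # made-up second reply
]
counts = Counter(emotion for reply in replies for emotion in parse_emotions(reply))
print(counts["satisfied"], counts["happy"])  # 2 1
```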


In [10]:
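prompt = f"""
Identify a list of emotions that the writer of the following review is expressing. Include no more than five items in the list. Format your answer as a list of lower-case words separated by commas.

Review text: '''{lamp_review_2}'''
"""
response = get_completion(prompt)
print(response)


satisfied,grateful,trusting,excited,appreciative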


## Identify anger

In [11]:
prompt = f"""
Is the writer of the following review expressing anger? \
The review is delimited with triple quotes. \
Give your answer as either yes or no.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)


No
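
The same yes/no question can be asked for a batch of reviews to pick out the ones that need attention. This is a sketch under stated assumptions: `is_yes` and `find_angry_reviews` are hypothetical names, `ask` stands for any function that maps a prompt to a reply (such as `get_completion`), and `fake_ask` is a keyword-based stand-in so the example runs offline.

```python
def is_yes(response):
    """Interpret a reply such as 'Yes', 'yes.' or ' YES\n' as True."""
    return response.strip().strip('.!"\'').lower() == "yes"


def find_angry_reviews(reviews, ask):
    """Return the reviews for which `ask` answers yes to the anger question."""
    angry = []
    for review in reviews:
        prompt = (
            "Is the writer of the following review expressing anger? "
            "The review is delimited with triple quotes. "
            "Give your answer as either yes or no.\n\n"
            f"Review text: '''{review}'''"
        )
        if is_yes(ask(prompt)):
            angry.append(review)
    return angry


def fake_ask(prompt):
    # Stand-in for the model: answers by keyword only.
    return "Yes" if "furious" in prompt else "No"


print(find_angry_reviews(["Great lamp!", "I am furious, it broke."], fake_ask))
# ['I am furious, it broke.']
```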


In [12]:
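prompt = f"""
Is the writer of the following review expressing anger? The review is delimited with triple quotes. Give your answer as either "yes" or "no".

Review text: '''{lamp_review_2}'''
"""
response = get_completion(prompt)
print(response)


No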


## Extract product and company name from customer reviews

In [16]:
prompt = f"""
Identify the following items from the review text: 
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Item" and "Brand" as the keys. 
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
  
Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)


{
  "Item": "lamp",
  "Brand": "Lumina"
}
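
Because the reply is requested as JSON, it can be loaded straight into a Python dictionary. The sketch below assumes the reply is either valid JSON or something unusable; `parse_item_brand` is a hypothetical helper that falls back to "unknown" for anything missing.

```python
import json


def parse_item_brand(response):
    """Parse the JSON reply and return a dict with 'Item' and 'Brand' keys."""
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {key: data.get(key, "unknown") for key in ("Item", "Brand")}


print(parse_item_brand('{"Item": "lamp", "Brand": "Lumina"}'))
# {'Item': 'lamp', 'Brand': 'Lumina'}
print(parse_item_brand("Sorry, I cannot tell."))
# {'Item': 'unknown', 'Brand': 'unknown'}
```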


In [17]:
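prompt = f"""
Identify the following items from the review text:
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. Format your response as a JSON object with "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" as the value. Make your response as short as possible.

Review text: '''{lamp_review_2}'''
"""
response = get_completion(prompt)
print(response)


{
    "Item": "lamp",
    "Brand": "unknown"
}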


## Doing multiple tasks at once

In [18]:
prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)


{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "lamp with additional storage",
  "Brand": "Lumina"
}
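
Asking for `Anger` as a boolean means the parsed value can be used directly in a condition. As a rough sketch, a combined reply could drive a simple follow-up rule. `needs_follow_up` and the rule it applies are assumptions for illustration, not part of the lesson.

```python
import json

EXPECTED_KEYS = ("Sentiment", "Anger", "Item", "Brand")


def needs_follow_up(response):
    """Return True if the review is angry or negative; raise if keys are missing."""
    record = json.loads(response)
    missing = [key for key in EXPECTED_KEYS if key not in record]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return record["Anger"] is True or record["Sentiment"] == "negative"


reply = """
{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "lamp with additional storage",
  "Brand": "Lumina"
}
"""
print(needs_follow_up(reply))  # False
```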


In [19]:
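prompt = f"""
Identify the following items from the review text:
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. Format your response as a JSON object with "Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" as the value. Make your response as short as possible. Format the Anger value as a boolean.

Review text: '''{lamp_review_2}'''
"""
response = get_completion(prompt)
print(response)


{
    "Sentiment": "positive",
    "Anger": false,
    "Item": "lamp",
    "Brand": "unknown"
}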


## Inferring topics

In [20]:
story = """
In a recent survey conducted by the government, 
public sector employees were asked to rate their level 
of satisfaction with the department they work at. 
The results revealed that NASA was the most popular 
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings, 
stating, "I'm not surprised that NASA came out on top. 
It's a great place to work with amazing people and 
incredible opportunities. I'm proud to be a part of 
such an innovative organization."

The results were also welcomed by NASA's management team, 
with Director Tom Johnson stating, "We are thrilled to 
hear that our employees are satisfied with their work at NASA. 
We have a talented and dedicated team who work tirelessly 
to achieve our goals, and it's fantastic to see that their 
hard work is paying off."

The survey also revealed that the 
Social Security Administration had the lowest satisfaction 
rating, with only 45% of employees indicating they were 
satisfied with their job. The government has pledged to 
address the concerns raised by employees in the survey and 
work towards improving job satisfaction across all departments.
"""

In [21]:
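story_2 = """
In a survey recently conducted by the government, public sector employees were asked how satisfied they were with the department they work in.
The results showed that the National Aeronautics and Space Administration (NASA) was the most popular department, with a satisfaction rating of 95%.

NASA employee John Smith commented on the results, saying, "I'm not surprised that NASA ranked first.
It is a workplace where you work alongside excellent people and have incredible opportunities. I'm proud to be part of such an innovative organization."

NASA's management team also welcomed the results, with Director Tom Johnson saying, "We are delighted to hear that our employees are satisfied with their work at NASA.
We have a talented and dedicated team that works tirelessly to achieve our goals, and it is wonderful to see their hard work pay off."

The survey also revealed that the Social Security Administration had the lowest satisfaction rating, with only 45% of employees saying they were satisfied with their job.
The government has pledged to address the issues employees raised in the survey and to work on improving job satisfaction across all departments.
"""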


## Infer 5 topics

In [22]:
prompt = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple quotes.

Make each item one or two words long. 

Format your response as a list of items separated by commas.

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)


government survey, job satisfaction, NASA, Social Security Administration, employee concerns


In [23]:
response.split(sep=',')

['government survey',
 ' job satisfaction',
 ' NASA',
 ' Social Security Administration',
 ' employee concerns']

In [24]:
topic_list = [
    "nasa", "local government", "engineering", 
    "employee satisfaction", "federal government"
]
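
The split result above keeps the leading spaces and the original casing, so it cannot be compared with `topic_list` directly. A minimal sketch of cleaning the labels and checking them against the list, using the reply printed earlier as a literal string:

```python
response = (
    "government survey, job satisfaction, NASA, "
    "Social Security Administration, employee concerns"
)
topic_list = [
    "nasa", "local government", "engineering",
    "employee satisfaction", "federal government"
]

# Strip whitespace and lower-case each inferred topic before comparing.
inferred = [topic.strip().lower() for topic in response.split(",")]
matches = [topic for topic in topic_list if topic in inferred]

print(inferred)
print(matches)  # ['nasa']
```

Exact string matching only catches identical labels: "job satisfaction" does not match "employee satisfaction" here, even though the two are closely related.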

In [35]:
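prompt = f"""
Determine five topics that are being discussed in the following text, which is delimited by triple quotes.

Make each item one or two words long.

Separate your answers with commas.

Text sample: '''{story_2}'''
"""
response = get_completion(prompt)
print(response)
# Format your response as a list of items separated by commas.


1. Survey results
2. NASA's satisfaction rating
3. NASA employee's comments
4. NASA management team's response
5. Social Security Administration's satisfaction rating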
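
The reply above came back as a numbered list even though the prompt asked for commas, so `response.split(',')` would not separate the topics. One way to cope, sketched here with a hypothetical `parse_topics` helper, is to accept both layouts. The sample strings are illustrative.

```python
import re


def parse_topics(response):
    """Return topics from either a one-per-line list or a comma-separated reply."""
    lines = [line.strip() for line in response.strip().splitlines() if line.strip()]
    if len(lines) > 1:
        # One topic per line: remove prefixes such as "1.", "2)", "-" or "*".
        return [re.sub(r"^(\d+[.)]|[-*])\s*", "", line) for line in lines]
    return [part.strip() for part in response.split(",") if part.strip()]


print(parse_topics("1. Survey results\n2. NASA's satisfaction rating"))
# ['Survey results', "NASA's satisfaction rating"]
print(parse_topics("government survey, job satisfaction, NASA"))
# ['government survey', 'job satisfaction', 'NASA']
```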


In [42]:
topic_list_2 = ["nasa", "local government", "engineering", "employee satisfaction", "federal government"]


## Make a news alert for certain topics

In [26]:
prompt = f"""
Determine whether each item in the following list of \
topics is a topic in the text below, which
is delimited with triple quotes.

Give your answer as a list with 0 or 1 for each topic.

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)


nasa: 1
local government: 0
engineering: 0
employee satisfaction: 1
federal government: 1


In [27]:
# Parse each "topic: 0/1" line into a dict; this assumes the reply uses exactly that format.
topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\n')}
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")


ALERT: New NASA story!


In [43]:
prompt = f"""
Determine whether each item in the following list of topics is a topic in the text below, which is delimited with triple quotes.

For each topic, give your answer as 0 or 1.

List of topics: {", ".join(topic_list_2)}

Text sample: '''{story_2}'''
"""
response = get_completion(prompt)
print(response)


nasa: 1
local government: 0
engineering: 0
employee satisfaction: 1
federal government: 1


In [44]:
# Parse each "topic: 0/1" line into a dict; this assumes the reply uses exactly that format.
topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\n')}
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")


ALERT: New NASA story!
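
The one-line parsing used above raises an error as soon as the reply contains a line that is not in `topic: 0/1` form. A more forgiving version might look like the sketch below. `parse_topic_flags` and `alerts` are hypothetical helpers, and the sample reply with an extra sentence is invented to show the skipping behaviour.

```python
def parse_topic_flags(response):
    """Collect 'topic: 0/1' lines into a dict, ignoring lines in any other form."""
    flags = {}
    for line in response.splitlines():
        topic, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value in {"0", "1"}:
            flags[topic.strip().lower()] = int(value)
    return flags


def alerts(flags, watch_list):
    """Return the watched topics that were flagged with 1."""
    return [topic for topic in watch_list if flags.get(topic) == 1]


response = "nasa: 1\nlocal government: 0\nHere is my reasoning.\nfederal government: 1"
flags = parse_topic_flags(response)
print(flags)
# {'nasa': 1, 'local government': 0, 'federal government': 1}
for topic in alerts(flags, ["nasa", "engineering"]):
    print(f"ALERT: New {topic} story!")
```

Using `flags.get(topic)` instead of indexing means a watched topic that is missing from the reply simply produces no alert.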


## Try experimenting on your own!