# :cook: <font color="orange">Inferring</font>


In this lesson, you will infer sentiment and topics from product reviews and news articles.


## ⚙️ Setup

In [1]:
from util import local_settings
from env_colors import TerminalTextColor as ttc
from openai import OpenAI

print("First LLM API example")
# Show only the key prefix so the secret never ends up in the notebook output
print(f"✅ OpenAI Key loaded ({local_settings.OPENAI_API_KEY[:3]}...)")

client = OpenAI(api_key=local_settings.OPENAI_API_KEY)

def get_completion(prompt, model="gpt-3.5-turbo", temperature=0, messages=None):
    # Wrap the prompt in a single user message unless a full history is given
    if not messages:
        messages = [{"role": "user", "content": prompt}]

    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )

    return completion.choices[0].message.content

First LLM API example
✅ OpenAI Key loaded (sk-...)


## A product review

In [2]:
lamp_review = """
Needed a nice lamp for my bedroom, and this one had
additional storage and not too high of a price point.
Got it fast.  The string to our lamp broke during the
transit and the company happily sent over a new one.
Came within a few days as well. It was easy to put
together.  I had a missing part, so I contacted their
support and they very quickly got me the missing piece!
Lumina seems to me to be a great company that cares
about their customers and products!!
"""

## Sentiment

If you want to extract the sentiment, positive or negative, of a piece of text, the traditional machine learning workflow requires you to collect a labeled dataset, train a model, figure out how to deploy the model somewhere in the cloud, and then make inferences.

That can work pretty well, but it is a lot of work to go through.

Also, for every task, such as sentiment analysis versus name extraction versus something else, you have to train and deploy a separate model. One of the really nice things about large language models is that for many tasks like these, you can just write a prompt and start getting results pretty much right away.

That gives a tremendous speedup in application development.

In [4]:
prompt = f"""
What is the sentiment of the following product review,
which is delimited with triple quotes?

Review text: '''{lamp_review}'''
"""

print("--- Review (Text) ---")
print(lamp_review)

print("--- Sentiment ---")
response = get_completion(prompt)
print(response)

--- Review (Text) ---

Needed a nice lamp for my bedroom, and this one had
additional storage and not too high of a price point.
Got it fast.  The string to our lamp broke during the
transit and the company happily sent over a new one.
Came within a few days as well. It was easy to put
together.  I had a missing part, so I contacted their
support and they very quickly got me the missing piece!
Lumina seems to me to be a great company that cares 
about their customers and products!!

--- Sentiment ---
The sentiment of the product review is positive.


## Concise response - Sentiment

If you want a more concise response that is easier to post-process, you can take this prompt and add another instruction: give the answer as a single word, either positive or negative. The model then prints just "positive", which makes it easier for a piece of code to take this output, process it, and do something with it.

In [5]:
prompt = f"""
What is the sentiment of the following product review,
which is delimited with triple quotes?

Give your answer as a single word, either "positive" or "negative".

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

positive
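
Even with the single-word instruction, a reply can come back with different casing or a trailing period. A minimal sketch of a normalization step, assuming the only valid labels are "positive" and "negative" (the `normalize_sentiment` helper is hypothetical and not part of the lesson code):

```python
import string

VALID_SENTIMENTS = {"positive", "negative"}

def normalize_sentiment(response: str) -> str:
    """Map a raw model reply to 'positive', 'negative', or 'unknown'."""
    # Drop surrounding whitespace, quotes, and punctuation, then ignore case.
    label = response.strip().strip(string.punctuation).strip().lower()
    return label if label in VALID_SENTIMENTS else "unknown"

print(normalize_sentiment("positive"))
print(normalize_sentiment(' "Positive." '))
print(normalize_sentiment("It is mostly positive."))
```

Anything that is not exactly one of the two labels falls back to "unknown", so downstream code never has to guess what a longer reply meant.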


🤖 So, large language models are pretty good at extracting specific things out of a piece of text.

## Identify types of emotions

```"...Identify a list of emotions that the writer of the following review is expressing. Include no more than five items in this list."```

For a lot of customer support organizations, it's important to understand if a particular user is extremely upset.

In [6]:
prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

happy, satisfied, impressed, grateful, pleased
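
Because the prompt asks for lower-case words separated by commas, the reply can be turned into a Python list with a few lines of code. The sketch below assumes that format holds; `parse_emotions` and the `UPSET_WORDS` watch list are hypothetical names used only for illustration:

```python
from typing import List

def parse_emotions(response: str, max_items: int = 5) -> List[str]:
    """Split a comma-separated reply into clean, lower-case emotion words."""
    items = [part.strip().lower() for part in response.split(",")]
    # Drop empty entries (for example from a trailing comma) and cap the length.
    return [item for item in items if item][:max_items]

emotions = parse_emotions("happy, satisfied, impressed, grateful, pleased")
print(emotions)

# Hypothetical watch list a support team might care about.
UPSET_WORDS = {"angry", "frustrated", "upset", "furious"}
print(bool(UPSET_WORDS & set(emotions)))
```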


## <font color="red">Identify anger</font> :raised_eyebrow:

In [7]:
prompt = f"""
Is the writer of the following review expressing anger? \
The review is delimited with triple quotes. \
Give your answer as either yes or no.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

No


In [8]:
text = "This lamp from the website is a total disaster! Shoddy build quality, false brightness claims, and a nightmare assembly process. Customer service is nonexistent. Save your money and sanity – avoid this deceptive purchase. There are better options elsewhere that won't leave you regretting every penny spent."

prompt = f"""
Is the writer of the following review expressing anger? \
The review is delimited with triple quotes. \
Give your answer as either yes or no.

Review text: '''{text}'''
"""
response = get_completion(prompt)
print(response)

Yes
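
To act on a yes/no answer in code, it helps to convert it into a boolean first. A minimal sketch, assuming the model replies with "Yes" or "No" as in the two outputs above (`is_yes` is a hypothetical helper, not part of the lesson code):

```python
def is_yes(response: str) -> bool:
    """Convert a yes/no reply into a boolean, rejecting anything else."""
    answer = response.strip().strip(".!\"'").lower()
    if answer == "yes":
        return True
    if answer == "no":
        return False
    raise ValueError(f"Expected 'yes' or 'no', got: {response!r}")

# Replies copied from the two cells above.
for reply in ["No", "Yes"]:
    if is_yes(reply):
        print("Escalate this review to customer support")
    else:
        print("No escalation needed")
```

Raising on unexpected replies, instead of silently treating them as "no", keeps an angry review from slipping through when the model answers in a full sentence.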


## Extract <font color="cyan">product</font>  and <font color="cyan">company name </font> from customer reviews

:point_right: Information extraction is the part of NLP concerned with taking a piece of text and extracting the specific things you want to know from it.

The prompt below asks the model to format its response as a JSON object with "Item" and "Brand" as the keys.

The model reports that the item is a lamp and the brand is Lumina, and you can easily load this into a Python dictionary for additional processing.

In [10]:
prompt = f"""
Identify the following items from the review text:
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes.

Format your response as a JSON object with
"Item" and "Brand" as the keys.

If the information isn't present, use "unknown"
as the value.

Make your response as short as possible.

Review text: '''{lamp_review}'''
"""

print("--- REVIEW (Text) ---")
print(lamp_review)

print("--- RESPONSE ---")
response = get_completion(prompt)
print(response)

--- REVIEW (Text) ---

Needed a nice lamp for my bedroom, and this one had
additional storage and not too high of a price point.
Got it fast.  The string to our lamp broke during the
transit and the company happily sent over a new one.
Came within a few days as well. It was easy to put
together.  I had a missing part, so I contacted their
support and they very quickly got me the missing piece!
Lumina seems to me to be a great company that cares 
about their customers and products!!

--- RESPONSE ---
{
  "Item": "lamp",
  "Brand": "Lumina"
}
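
Loading the response into a Python dictionary can be sketched with the standard `json` module. The `response_text` string below is copied from the output above; in the notebook you would pass `response` instead. The `parse_review_info` helper is a hypothetical name:

```python
import json

def parse_review_info(response_text: str) -> dict:
    """Parse the model's JSON reply, failing loudly if it is not valid JSON."""
    try:
        return json.loads(response_text)
    except json.JSONDecodeError as err:
        raise ValueError(f"Model did not return valid JSON: {err}") from err

response_text = """{
  "Item": "lamp",
  "Brand": "Lumina"
}"""

review_info = parse_review_info(response_text)
print(review_info["Item"])
print(review_info["Brand"])
```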


## <font color="cyan">Doing multiple tasks at once</font> 

You can also ask for all of these in a single prompt: ```identify the following items: sentiment, whether the reviewer is expressing anger, item purchased, company that made it```.

The prompt below also tells the model to format the anger value as a boolean. The output is a JSON object in which the sentiment is positive and the anger value is false, with no quotes around it, because the prompt asked for a boolean.

In [11]:
prompt = f"""
Identify the following items from the review text:
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes.
Format your response as a JSON object with
"Sentiment", "Anger", "Item" and "Brand" as the keys.

If the information isn't present, use "unknown"
as the value.

Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "lamp",
  "Brand": "Lumina"
}
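
When one prompt returns several fields, it can be worth checking the structure before using it. The sketch below assumes the four keys requested in the prompt and treats an "unknown" anger value as `None`; `parse_review_summary` and `EXPECTED_KEYS` are hypothetical names:

```python
import json

EXPECTED_KEYS = {"Sentiment", "Anger", "Item", "Brand"}

def parse_review_summary(response_text: str) -> dict:
    """Parse the multi-field reply and check keys and the Anger type."""
    data = json.loads(response_text)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    anger = data["Anger"]
    if anger == "unknown":
        # The prompt allows "unknown" when the information is not present.
        data["Anger"] = None
    elif not isinstance(anger, bool):
        raise ValueError(f"Anger should be true or false, got: {anger!r}")
    return data

# Reply copied from the output above.
response_text = """{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "lamp",
  "Brand": "Lumina"
}"""

summary = parse_review_summary(response_text)
print(summary["Sentiment"], summary["Anger"])
```

JSON `false` becomes Python `False` after parsing, so the anger flag can be used directly in an `if` statement.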


## <font color="cyan">Inferring Topics</font>

:point_right: One of the cool applications I've seen of large language models is __**inferring topics**__.

Given a long piece of text:
- What is this piece of text about?
- What are the topics?

In [12]:
story = """
In a recent survey conducted by the government,
public sector employees were asked to rate their level
of satisfaction with the department they work at.
The results revealed that NASA was the most popular
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings,
stating, "I'm not surprised that NASA came out on top.
It's a great place to work with amazing people and
incredible opportunities. I'm proud to be a part of
such an innovative organization."

The results were also welcomed by NASA's management team,
with Director Tom Johnson stating, "We are thrilled to
hear that our employees are satisfied with their work at NASA.
We have a talented and dedicated team who work tirelessly
to achieve our goals, and it's fantastic to see that their
hard work is paying off."

The survey also revealed that the
Social Security Administration had the lowest satisfaction
rating, with only 45% of employees indicating they were
satisfied with their job. The government has pledged to
address the concerns raised by employees in the survey and
work towards improving job satisfaction across all departments.
"""

In [16]:
prompt = f"""
Determine five topics that are being discussed in the
following text, which is delimited by triple quotes.

Make each item one or two words long.

Output format:
Format your response as a list of items separated by commas like this:
Topic 1, Topic 2, ..., Topic n

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

Survey, Satisfaction, NASA, Social Security Administration, Job satisfaction


In [18]:
# Split on commas and strip the leading space left on each item
topic_list = [topic.strip() for topic in response.split(",")]

topic_list

['Survey',
 'Satisfaction',
 'NASA',
 'Social Security Administration',
 'Job satisfaction']

## 💊 <font color="cyan">Make a news alert for certain topics</font>

In [24]:
topic_list = [
    "Nasa",
    "Artificial Intelligence", # <--
    "Medicine",             # <--
    "employee satisfaction",
    "federal government"
]
topic_list

['Nasa',
 'Artificial Intelligence',
 'Medicine',
 'employee satisfaction',
 'federal government']

In [25]:
prompt = f"""
Determine whether each item in the following list of
topics is a topic in the text below, which
is delimited with triple quotes.

Give your answer as a list with 0 or 1 for each topic.

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

[1, 0, 0, 1, 1]


In [30]:
import json

# Parse the reply as data; eval() would execute whatever text the model returned
r_list = json.loads(response)

print(f"Type of response: {type(response)}")
print(f"Type of r_list:   {type(r_list)}")

r_list

Type of response: <class 'str'>
Type of r_list:   <class 'list'>


[1, 0, 0, 1, 1]

In [31]:
topic_dict = {topic: int(flag) for topic, flag in zip(topic_list, r_list)}

print(topic_dict)

if topic_dict['Nasa'] == 1:
    print("ALERT: New NASA story!")

{'Nasa': 1, 'Artificial Intelligence': 0, 'Medicine': 0, 'employee satisfaction': 1, 'federal government': 1}
ALERT: New NASA story!
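
The alert logic above relies on the model returning exactly one flag per topic. A minimal sketch of a more defensive version, assuming the reply is a JSON list of 0s and 1s (`matched_topics` is a hypothetical helper, and the reply string is copied from the output above):

```python
import json
from typing import List

def matched_topics(topic_list: List[str], response_text: str) -> List[str]:
    """Return the topics whose flag is 1, checking the reply's shape first."""
    flags = json.loads(response_text)
    if not isinstance(flags, list) or len(flags) != len(topic_list):
        raise ValueError("Expected one 0/1 flag per topic")
    return [topic for topic, flag in zip(topic_list, flags) if int(flag) == 1]

topics = [
    "Nasa",
    "Artificial Intelligence",
    "Medicine",
    "employee satisfaction",
    "federal government",
]

hits = matched_topics(topics, "[1, 0, 0, 1, 1]")
for topic in hits:
    print(f"ALERT: New {topic} story!")
```

Checking the length up front avoids silently pairing flags with the wrong topics if the model drops or adds an entry.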
