# Inferring
In this lesson, you will infer sentiment and topics from product reviews and news articles.

## Setup

In [1]:
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.getenv('OPENAI_API_KEY')

In [2]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

## Product review text

In [3]:
lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""

In [16]:
chatgpt_review = """
Brains aren't just a neural network. They consist of very chaotically connected expert systems, all of which serve one specific purpose. GI (without the A) results from the interactions between the brains expert systems. One for language, one for hunger, one for emotions, etc. When it call comes together, it results in cognition. That is so fundamentally different from how a computational neural network works as a data structure, in its extremely structured and clinically perfect way. Real neurons don't have activation functions, they have voltage potential. The propagation of signals in the brain is subject to dozens of chemicals, if not hundreds. Every neuron has a myriad of receptors for cannabinoids, neurotransmitters, and so on. If you want to simulate a real brain, you also need to simulate a physical environment for it. A simulation that includes temperature, blood content, blood pressure, chemical presence and so on. These chemicals are needed in the brain to regulate emotions and facilitate focus and alertness. The absence of all chemical influences on a neural network might keep it from ever developing AGI. We don't know these things, so I find it a little naive to claim that GPT-4 is anywhere even remotely, close to AGI or what we would consider a brain.
> It fundamentally does not get more intelligent after training is done.
"""

## Sentiment (positive/negative)

In [4]:
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple quotes?

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

The sentiment of the product review is positive.


In [17]:
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple quotes?

Review text: '''{chatgpt_review}'''
"""

response2 = get_completion(prompt)
print(response2)

The sentiment of the product review is negative.


In [5]:
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple quotes?

Give your answer as a single word, either "positive" \
or "negative".

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

positive
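Asking for a single word makes the reply easier to post-process, but the model may still vary capitalization or add punctuation. The following is a minimal sketch of a normalization step, assuming the only accepted labels are "positive" and "negative"; `normalize_sentiment` and `VALID_LABELS` are hypothetical names introduced for illustration, not part of the lesson code.

```python
import string

# Hypothetical allow-list for this sketch; extend it if the prompt allows more labels.
VALID_LABELS = {"positive", "negative"}

def normalize_sentiment(raw: str) -> str:
    """Map a raw model reply to 'positive', 'negative', or 'unknown'."""
    # Trim surrounding whitespace and punctuation, then lowercase.
    cleaned = raw.strip(string.punctuation + string.whitespace).lower()
    return cleaned if cleaned in VALID_LABELS else "unknown"

print(normalize_sentiment("positive"))
print(normalize_sentiment(" Positive.\n"))
print(normalize_sentiment("The sentiment is mixed."))
```

Anything that does not reduce to one of the expected labels is mapped to "unknown" rather than guessed, so unexpected replies stay visible downstream.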


## Identify types of emotions

In [6]:
prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

satisfied, pleased, grateful, impressed, happy


In [18]:
prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{chatgpt_review}'''
"""
response2 = get_completion(prompt)
print(response2)

chaotic, specific, different, perfect, naive


First, none of these outputs are emotions. They are adjectives taken from the comment's own wording.

Second, not all of the adjectives describe the comment itself. The comment is specific and different, but it is not chaotic, perfect, or naive.
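One way to catch replies like this automatically is to check each returned word against a fixed vocabulary of emotion labels. This is only a sketch under stated assumptions: `EMOTION_VOCAB` is a small hypothetical list chosen for illustration, and `parse_emotions` is a helper name that does not appear in the lesson.

```python
# Hypothetical vocabulary; a real application would choose its own labels.
EMOTION_VOCAB = {
    "happy", "satisfied", "grateful", "impressed", "pleased",
    "angry", "frustrated", "skeptical", "disappointed", "anxious",
}

def parse_emotions(raw: str, limit: int = 5) -> list[str]:
    """Split a comma-separated reply and keep only known emotion words."""
    words = [w.strip().lower() for w in raw.split(",")]
    return [w for w in words if w in EMOTION_VOCAB][:limit]

# The lamp review reply passes through unchanged.
print(parse_emotions("satisfied, pleased, grateful, impressed, happy"))
# The second reply contains no recognized emotion words, so the result is empty.
print(parse_emotions("chaotic, specific, different, perfect, naive"))
```

An empty result is a useful signal that the prompt needs tightening, for example by listing the allowed emotions in the prompt itself.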

## Identify anger

In [7]:
prompt = f"""
Is the writer of the following review expressing anger? \
The review is delimited with triple quotes. \
Give your answer as either yes or no.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

No


In [20]:
prompt = f"""
Is the writer of the following review expressing anger? \
The review is delimited with triple quotes. \
Give your answer as either yes or no.

Review text: '''{chatgpt_review}'''
"""
response = get_completion(prompt)
print(response)

No


## Extract product and company name from customer reviews

In [8]:
prompt = f"""
Identify the following items from the review text: 
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Item" and "Brand" as the keys. 
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
  
Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

{
  "Item": "lamp",
  "Brand": "Lumina"
}
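Because the reply is requested as JSON, it can be loaded into a Python dictionary instead of being handled as text. The sketch below assumes the reply is a single JSON object with the "Item" and "Brand" keys named in the prompt; `parse_extraction` is a hypothetical helper, and the fallback to "unknown" mirrors the prompt's own convention.

```python
import json

def parse_extraction(raw: str, keys=("Item", "Brand")) -> dict:
    """Parse a JSON reply; fall back to 'unknown' for anything missing."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}  # the reply was not valid JSON
    if not isinstance(data, dict):
        data = {}  # valid JSON, but not an object
    return {k: str(data.get(k, "unknown")) for k in keys}

# Sample reply in the same shape as the output shown above.
reply = '{\n  "Item": "lamp",\n  "Brand": "Lumina"\n}'
result = parse_extraction(reply)
print(result["Item"], result["Brand"])
```

Malformed replies produce "unknown" for every key rather than raising, which keeps a batch job running while still marking the record for review.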


In [21]:
prompt = f"""
Identify the following items from the review text: 
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Item" and "Brand" as the keys. 
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
  
Review text: '''{chatgpt_review}'''
"""
response = get_completion(prompt)
print(response)

{
  "Item": "GPT-4",
  "Brand": "unknown"
}


Yes, correct!

In [23]:
chatgpt_review2 = """
I think we have a good chance for Gemini being a massive breakthrough. Deepmind is to be taken seriously. If they really combine their Alpha X techniques with LLMs and go for scale, I'm not sure what the result will be. If I understand Demis Hassabis correctly, they are about to not just take the next token, but to perform extensive tree search, before deiciding which path to go down. In his words (paraphrasing): Not just the most probable token, but the best one.\n\nI'm holding back on the hype (and sheer potential terror of it), but it might be big. It might also turn out to be nothing. Engineering God is hard work. There are bound to be issues on the way.
"""

In [26]:
prompt = f"""
Identify the following items from the review text: 
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Item" and "Brand" as the keys. 
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
  
Review text: '''{chatgpt_review2}'''
"""
response = get_completion(prompt)
print(response)

{
  "Item": "unknown",
  "Brand": "unknown"
}


Unfortunately, this is wrong: the answer should have been Gemini and DeepMind. The miss is a window into the kind of understanding this model has, and therefore an opportunity for interpreting or retraining it.

## Doing multiple tasks at once

In [9]:
prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "lamp",
  "Brand": "Lumina"
}
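Asking for the Anger value as a boolean pays off when the reply is parsed: JSON `false` becomes Python `False` with no string comparison. The following sketch loads the combined reply into a small typed record and rejects a non-boolean Anger value. `ReviewInfo` and `parse_review_info` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass

@dataclass
class ReviewInfo:
    sentiment: str
    anger: bool
    item: str
    brand: str

def parse_review_info(raw: str) -> ReviewInfo:
    """Load the multi-field JSON reply and check that Anger is a real boolean."""
    data = json.loads(raw)
    anger = data["Anger"]
    if not isinstance(anger, bool):
        # e.g. the model returned the string "false" instead of the literal false
        raise ValueError(f"Anger should be a JSON boolean, got {anger!r}")
    return ReviewInfo(
        sentiment=data["Sentiment"],
        anger=anger,
        item=data.get("Item", "unknown"),
        brand=data.get("Brand", "unknown"),
    )

# Sample reply in the same shape as the output shown above.
reply = '{"Sentiment": "positive", "Anger": false, "Item": "lamp", "Brand": "Lumina"}'
info = parse_review_info(reply)
print(info)
```

Failing loudly on a string such as "false" matters because any non-empty string is truthy in Python, so a silent pass would record the reviewer as angry.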


In [27]:
prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{chatgpt_review}'''
"""
response = get_completion(prompt)
print(response)

{
  "Sentiment": "negative",
  "Anger": false,
  "Item": "unknown",
  "Brand": "unknown"
}


In [28]:
prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{chatgpt_review2}'''
"""
response = get_completion(prompt)
print(response)

{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "unknown",
  "Brand": "unknown"
}


## Inferring topics

In [10]:
story = """
In a recent survey conducted by the government, 
public sector employees were asked to rate their level 
of satisfaction with the department they work at. 
The results revealed that NASA was the most popular 
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings, 
stating, "I'm not surprised that NASA came out on top. 
It's a great place to work with amazing people and 
incredible opportunities. I'm proud to be a part of 
such an innovative organization."

The results were also welcomed by NASA's management team, 
with Director Tom Johnson stating, "We are thrilled to 
hear that our employees are satisfied with their work at NASA. 
We have a talented and dedicated team who work tirelessly 
to achieve our goals, and it's fantastic to see that their 
hard work is paying off."

The survey also revealed that the 
Social Security Administration had the lowest satisfaction 
rating, with only 45% of employees indicating they were 
satisfied with their job. The government has pledged to 
address the concerns raised by employees in the survey and 
work towards improving job satisfaction across all departments.
"""

In [29]:
story2 = """
Joking aside, if you can't tell the difference, does it matter? Maybe quantity will bring quality. Enough self referencing and agents going around and who knows.
"spicy" does a lot of work here.  As I understand it (which is minimal) the model comprises layers of transformers that handle varying contextual scopes and the vast quantities of training data and parameters of the system result in emergent capability we can't really explain. We can explain autocorrect.  People like to focus on the autoregressive decoding aspect and say silly things like "it's just predicting the next word", but this predicting is done by the model using the context of every word so far in the conversation. The "bag of words" it comes up with and assigns probabilities to are chosen carefully and in ways we don't understand.
Bard sucks. Unless it's gotten better in the past couple weeks. Maybe I was too hard on it, but I gave it some pretty simple coding questions and it has absolutely no idea what was going on. It's got a long way to go, at least for my purposes.
>AGI is decades, if not centuries away
"""

## Infer 5 topics

In [11]:
prompt = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple quotes.

Make each item one or two words long. 

Format your response as a list of items separated by commas.

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

1. Government survey
2. Department satisfaction rating
3. NASA
4. Social Security Administration
5. Job satisfaction improvement


In [31]:
prompt = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple quotes.

Make each item one or two words long. 

Format your response as a list of items separated by commas.

Text sample: '''{story2}'''
"""
response = get_completion(prompt)
print(response)

1. Autoregressive decoding
2. Transformers
3. Training data
4. AGI (Artificial General Intelligence)
5. Bard (referring to a specific program or system)


Pretty accurate. Items 4 and 5 are longer than the requested one or two words, but the extra detail is still helpful for the first requirement of the ask, which is identifying the topics.

In [12]:
response.split(sep=',')

['1. Government survey\n2. Department satisfaction rating\n3. NASA\n4. Social Security Administration\n5. Job satisfaction improvement']
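Splitting on commas returns a single element here because the model replied with a numbered list, one item per line, rather than the comma-separated list the prompt asked for. A more forgiving parser can accept both shapes. This is a sketch only; `parse_topics` is a hypothetical helper, and it assumes no individual topic contains a comma.

```python
import re

def parse_topics(raw: str) -> list[str]:
    """Split a reply on commas or newlines and drop leading '1. ' numbering."""
    parts = re.split(r"[,\n]", raw)
    # Remove list markers such as "1." or "2)" from the start of each item.
    topics = [re.sub(r"^\s*\d+[.)]\s*", "", p).strip() for p in parts]
    return [t for t in topics if t]

numbered = "1. Government survey\n2. Department satisfaction rating\n3. NASA"
print(parse_topics(numbered))
print(parse_topics("NASA, job satisfaction, government survey"))
```

Both calls return a clean list of topic strings, so later cells do not depend on which format the model happened to choose.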

In [13]:
topic_list = [
    "nasa", "local government", "engineering", 
    "employee satisfaction", "federal government"
]

## Make a news alert for certain topics

In [14]:
prompt = f"""
Determine whether each item in the following list of \
topics is a topic in the text below, which
is delimited with triple quotes.

Give your answer as a list with 0 or 1 for each topic.\

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

[1, 0, 0, 1, 1]


In [15]:
import json

# The model returned a bare list such as [1, 0, 0, 1, 1], not "topic: 0" lines,
# so pair each flag with its topic by position.
topic_dict = dict(zip(topic_list, json.loads(response)))
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")

ALERT: New NASA story!
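The format of this reply is not guaranteed: the model may return a bare list such as `[1, 0, 0, 1, 1]` or one `topic: value` line per topic. The following sketch accepts either shape. `parse_topic_flags` is a hypothetical helper, and the sample reply string is copied from the output shown earlier rather than fetched from the API.

```python
import json

def parse_topic_flags(raw: str, topics: list[str]) -> dict[str, int]:
    """Accept either a JSON list like [1, 0] or lines like 'nasa: 1'."""
    text = raw.strip()
    if text.startswith("["):
        flags = json.loads(text)
        if len(flags) != len(topics):
            raise ValueError("flag count does not match topic count")
        # Pair each flag with its topic by position.
        return {t: int(f) for t, f in zip(topics, flags)}
    result = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines that have no "name: value" structure
            result[name.strip()] = int(value.strip())
    return result

topics = ["nasa", "local government", "engineering",
          "employee satisfaction", "federal government"]
flags = parse_topic_flags("[1, 0, 0, 1, 1]", topics)
if flags["nasa"] == 1:
    print("ALERT: New NASA story!")
```

Checking the list length before zipping guards against a reply that silently drops or adds a flag, which would otherwise shift every topic by one position.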

## Try experimenting on your own!