# Inferring

Is like the model takes a text as input and performs some kind of analysis. This could be:

* Extracting labels
* Extracting names
* Sentiment analysis

In [48]:
import openai
import os
import creds
#import pandas as pd
import json

#from pandas import json_normalize
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

openai.api_key = creds.api_key

## Functions

A function is created in which the specified prompt will be executed.

In [49]:
def get_completion(promtp, model = "gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model = model,
        messages = messages,
        temperature = 0 # grado de aleatoriedad en el resultado del modelo
    )
    return response.choices[0].message["content"]

## Product review text

In [52]:
lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""


## Sentiment analysis

In [53]:
prompt = f"""
What is the sentiment of the following product review, which is delimited with triple backticks?

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

The sentiment of the product review is positive.


In [6]:
prompt = f"""
What is the sentiment of the following product review, which is delimited with triple backticks?

Give your answer as a single word, either "positive" \ or "negative".

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

positive


## Identify type of emotions

Large language models (LLM) are pretty good at extracting specific things out of a piece of text.

In [7]:
prompt = f"""
Identify a list of emotion that the writer of the following review is expressing. Include no more than five items in the list. Format your answer \ 
as a list of lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

satisfied, pleased, grateful, impressed, happy


### Identify specific emotions

#### Anger

In [8]:
prompt = f"""
Is the writer of the following review expressing anger? The review is delimited with triple backticks. Give your answer as either yes or no.

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

No


#### Delight

In [10]:
prompt = f"""
Is the writer of the following review expressing delight? The review is delimited with triple backticks.

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

Yes, the writer of the review is expressing delight. They mention that the company happily sent over a new string for the lamp and quickly provided a missing part. They also state that Lumina seems to be a great company that cares about their customers and products.


#### Boring

In [11]:
prompt = f"""
Is the writer of the following review expressing boring? The review is delimited with triple backticks.

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

No, the writer of the review is not expressing boredom. They are expressing satisfaction with the lamp and the company's customer service.


#### Satisfaction

In [14]:
prompt = f"""
Is the writer of the following review expressing satisfaction? The review is delimited with triple backticks.

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

Yes, the writer of the review is expressing satisfaction. They mention that the company sent a replacement for a broken string and quickly provided a missing part. They also state that Lumina seems to be a great company that cares about their customers and products.


## Information extraction

Is the part of NLP that relates to taking a piece of text and extracting certain things that you want to know from the text.

In [15]:
prompt = f"""
Identify the following items from the review text:

- Identify purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. Format your response as a JSON object with "Item" and "Brand" as the keys. \ 
If the information isn't present, use "unknown" as the value. Make your response as short as possible.

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

{
  "Item": "lamp",
  "Brand": "Lumina"
}


## Doing multiple tasks at once

In [25]:
prompt = f"""
Identify the following items from the review text:

- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. Format your response as a JSON object with "Sentiment", "Anger", "Item" and "Brand" as the keys. \ 
Is the information isn't present, use "unknown" as the value. Make your response as short as possible. Format the Anger value as boolean.

Review text: '''{lamp_review}'''
"""

response = get_completion(prompt)
print(response)

{
  "Sentiment": "positive",
  "Anger": false,
  "Item": "lamp",
  "Brand": "Lumina"
}


## Inferring topics

Given a long piece of text, identify the topics of this text. 

In [55]:
story = """
In a recent survey conducted by the government, 
public sector employees were asked to rate their level 
of satisfaction with the department they work at. 
The results revealed that NASA was the most popular 
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings, 
stating, "I'm not surprised that NASA came out on top. 
It's a great place to work with amazing people and 
incredible opportunities. I'm proud to be a part of 
such an innovative organization."

The results were also welcomed by NASA's management team, 
with Director Tom Johnson stating, "We are thrilled to 
hear that our employees are satisfied with their work at NASA. 
We have a talented and dedicated team who work tirelessly 
to achieve our goals, and it's fantastic to see that their 
hard work is paying off."

The survey also revealed that the 
Social Security Administration had the lowest satisfaction 
rating, with only 45% of employees indicating they were 
satisfied with their job. The government has pledged to 
address the concerns raised by employees in the survey and 
work towards improving job satisfaction across all departments.
"""

In [56]:
prompt = f"""
Determine 5 topics that are being discussed in the following text, which is delimited by triple backticks. \ 

Make each item 1 or 2 words long.

Format your response as a list of items separated by commas.

Text sample: '''{story}'''
"""

response = get_completion(prompt)
print(response)

1. Government survey
2. Department satisfaction
3. NASA
4. Social Security Administration
5. Job satisfaction improvement


In [57]:
prompt = f"""
Determine 5 topics that are being discussed in the following text, which is delimited by triple backticks. Sort topics by relevance. \ 

Make each item one or two words long.

Format your response as a list of items separated by commas.

Text sample: '''{story}'''
"""

response = get_completion(prompt)
print(response)

- Job satisfaction
- Government survey
- NASA
- Social Security Administration
- Employee satisfaction


Assume that we¿re a news website or something, and we track the next topics:

- nasa
- local government
- engineering
- employee satisfaction
- federal government.

We want to figure out, given a news article, which of these topics are covered in that news article.

In [58]:
topic_list = [
    "nasa", "local government", "engineering", 
    "employee satisfaction", "federal government"
]

In [65]:
prompt = f"""
Determine whether each item in the following list of topics is a topic in the text below, which is delimited with triple backticks. \ 

Give your answer as list with 0 or 1 for each topic. Give the name of each topic and separate with the answer with ":".\ 

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""

response = get_completion(prompt)
print(response)

nasa: 1
local government: 0
engineering: 0
employee satisfaction: 1
federal government: 1


The above in machine learning is sometimes called a "Zero-shot learning algorithm" because we didn't give it any training data that was label, so that's Zero-shot.

## Make a news alert for certain topic

In [66]:
topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\n')}
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")

ALERT: New NASA story!


## Experiment

In [67]:
prompt = f"""
Determine whether each item in the following list of topics is a topic in the text below, which is delimited with triple backticks. \ 

Give your answer as Json object with 0 or 1 for each topic. Give the name of each topic and separate with the answer with ":".\ 

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""

response = get_completion(prompt)
print(response)

{
  "nasa": 1,
  "local government": 0,
  "engineering": 0,
  "employee satisfaction": 1,
  "federal government": 1
}


In [70]:
topic_dict = json.loads(response)

if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")
if topic_dict['federal government'] == 1:
    print("ALERT: New FEDERAL Government story!")    

ALERT: New NASA story!
ALERT: New FEDERAL Government story!
