# Inferring
In this lesson, you will infer sentiment and topics from product reviews and news articles.

## Setup

In [20]:
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.organization = os.getenv('OPENAI_ORGANIZATION')
openai.api_key = os.getenv('OPENAI_API_KEY')

In [2]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    # Legacy interface: openai.ChatCompletion was removed in openai>=1.0
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

## Product review text

In [4]:
lamp_review = """
Needed a nice lamp for my bedroom, and this one had
additional storage and not too high of a price point.
Got it fast.  The string to our lamp broke during the
transit and the company happily sent over a new one.
Came within a few days as well. It was easy to put
together.  I had a missing part, so I contacted their
support and they very quickly got me the missing piece!
Lumina seems to me to be a great company that cares
about their customers and products!!
"""

## Sentiment (positive/negative)

In [5]:
prompt = f"""
What is the sentiment of the following product review,
which is delimited with triple backticks?

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

The sentiment of the product review is positive.


In [6]:
prompt = f"""
What is the sentiment of the following product review,
which is delimited with triple backticks?

Give your answer as a single word, either "positive"
or "negative".

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

positive


### Test mixed sentiment

In [8]:
prompt = f"""
What is the sentiment of the following product review,
which is delimited with triple backticks?

Give your answer as a single word, either "positive", "mixed"
or "negative".

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

Positive
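
Note that the model answered `Positive` with a capital letter even though the prompt listed lower-case options. A minimal sketch of normalizing such labels before comparing them is shown below; the function name `normalize_sentiment` and the allowed set are assumptions for illustration.

```python
import string

# Labels the prompt asked for; anything else is treated as an error.
ALLOWED_SENTIMENTS = {"positive", "mixed", "negative"}


def normalize_sentiment(raw, allowed=ALLOWED_SENTIMENTS):
    """Lower-case a one-word label and strip surrounding quotes or punctuation."""
    label = raw.strip(string.punctuation + string.whitespace).lower()
    if label not in allowed:
        raise ValueError(f"unexpected sentiment label: {raw!r}")
    return label


print(normalize_sentiment("Positive"))
```

Raising on unexpected labels makes it obvious when the model replies with a full sentence instead of a single word.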


## Identify types of emotions

In [9]:
prompt = f"""
Identify a list of emotions that the writer of the
following review is expressing. Include no more than
five items in the list. Format your answer as a list of
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

satisfied, pleased, grateful, impressed, content
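
Because the prompt asks for a comma-separated list, the reply can be turned into a Python list with a few lines of post-processing. This is only a sketch; `parse_emotions` is a hypothetical helper, and it assumes the model respected the requested format.

```python
def parse_emotions(raw, max_items=5):
    """Split a comma-separated reply into a cleaned list of at most max_items words."""
    items = [part.strip().lower() for part in raw.split(",")]
    items = [item for item in items if item]  # drop empty fragments
    return items[:max_items]


emotions = parse_emotions("satisfied, pleased, grateful, impressed, content")
print(emotions)
```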


## Identify anger

In [10]:
prompt = f"""
Is the writer of the following review expressing anger?
The review is delimited with triple backticks.
Give your answer as either yes or no.

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

No
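
A yes/no reply is easier to use downstream as a boolean. The sketch below assumes the reply is a single word, possibly capitalized or followed by a period; `parse_yes_no` is a hypothetical name.

```python
def parse_yes_no(raw):
    """Map a one-word yes/no reply to a boolean, ignoring case and punctuation."""
    answer = raw.strip().strip(".!\"'").lower()
    if answer == "yes":
        return True
    if answer == "no":
        return False
    raise ValueError(f"expected yes or no, got: {raw!r}")


is_angry = parse_yes_no("No")
print(is_angry)
```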


## Extract information

In [11]:
prompt = f"""
If the writer of the following review suggests an improvement that
the company could make, extract it and show it.
The review is delimited with triple backticks.

Review text: ```{lamp_review}```
"""

response = get_completion(prompt)
print(response)

Improvement suggestion: None. The review does not suggest any improvements for the company.


In [13]:
prompt = f"""
In the following review, why did the customer contact support? Please provide details.
The review is delimited with triple backticks.

Review text: ```{lamp_review}```
"""

response = get_completion(prompt)
print(response)

The customer contacted support because they had a missing part when putting the lamp together.


## Extract product and company name from customer reviews

In [14]:
prompt = f"""
Identify the following items from the review text:
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks.
Format your response as a JSON object with
"Item" and "Brand" as the keys.
If the information isn't present, use "unknown"
as the value.
Make your response as short as possible.

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

{
  "Item": "lamp",
  "Brand": "Lumina"
}


## Doing multiple tasks at once

In [15]:
prompt = f"""
Identify the following items from the review text:
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks.
Format your response as a JSON object with
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown"
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

{
    "Sentiment": "positive",
    "Anger": false,
    "Item": "lamp with additional storage",
    "Brand": "Lumina"
}
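
Since the reply is JSON, it can be loaded into a dictionary and checked against the keys the prompt requested. The following is a minimal sketch; `parse_review_json` and `EXPECTED_KEYS` are illustrative names, and the sample string simply repeats the output shown above.

```python
import json

# Keys requested in the prompt; missing ones fall back to "unknown".
EXPECTED_KEYS = ("Sentiment", "Anger", "Item", "Brand")


def parse_review_json(raw, expected_keys=EXPECTED_KEYS):
    """Parse the model's JSON reply and keep only the expected keys."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return {key: data.get(key, "unknown") for key in expected_keys}


sample = """
{
    "Sentiment": "positive",
    "Anger": false,
    "Item": "lamp with additional storage",
    "Brand": "Lumina"
}
"""
record = parse_review_json(sample)
print(record["Brand"], record["Anger"])
```

Asking for a boolean in the prompt pays off here: `json.loads` turns `false` into Python's `False` directly.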


In [16]:
prompt = f"""
Identify the following items from the review text:
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks.
Format your response as a YAML object with
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown"
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: ```{lamp_review}```
"""
response = get_completion(prompt)
print(response)

```yaml
Sentiment: positive
Anger: false
Item: lamp with additional storage
Brand: Lumina
```
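
Here the model wrapped its YAML in a code fence, which has to be removed before parsing. The sketch below strips the fence and reads flat `key: value` pairs using only the standard library. It is not a full YAML parser, and `strip_code_fence` and `parse_flat_yaml` are hypothetical helpers; for anything nested, a real YAML library would be the safer choice.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def strip_code_fence(raw):
    """Remove a leading and trailing code fence line if present."""
    lines = raw.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines)


def parse_flat_yaml(raw):
    """Parse flat 'key: value' lines; true/false become booleans."""
    result = {}
    for line in strip_code_fence(raw).splitlines():
        if ":" not in line:
            continue  # skip blank or unexpected lines
        key, _, value = line.partition(":")
        value = value.strip()
        if value.lower() in ("true", "false"):
            value = value.lower() == "true"
        result[key.strip()] = value
    return result


sample = "\n".join([
    FENCE + "yaml",
    "Sentiment: positive",
    "Anger: false",
    "Item: lamp with additional storage",
    "Brand: Lumina",
    FENCE,
])
parsed = parse_flat_yaml(sample)
print(parsed)
```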


In [0]:
amazon_review = """
Amateurs de cafés sans être de fins connaisseurs mon conjoint et moi sommes des consommateurs réguliers de Starbucks comme de café Nespresso depuis plusieurs années.
Alors une fois le partenaria lancé entre les deux marques on ne pouvait que tester !

Le positif de cette expérience :
- Avec une machine Inissia Nespresso les dosettes sont parfaitement adaptées.
- Le goût : on valide sans s'extasier. C'est assez personnel les goûts de chacun mais si on apprécie les arômes de ces dosettes assez corsées (force 6 à 11 si vous voulez un repère Nespresso) on n'a pas vraiment senti les "note de noisette" ou "note de caramel" mises en avant sur les paquets.
En revanche le "Starbucks Blonde" (jaune) a été pour moi un vrai coup de cœur. Testé en espresso, lungo et même avec une pointe de lait c'est un délice ! Ajoutez une mousse de lait et vous êtes chez Starbucks !
Pour mon conjoint, c'est le house Blend (maron clair) qui a été le plus apprécié. Vendu avec "une note de caramel" qu'il n'a pas sentie, il a particulièrement apprécié la douceur de la gorgée puis l'arrière goût plus fort qui arrive après.
Bon chacun ses préférences et ses goûts mais tout cela pour dire que ça été une bonne surprise finalement.
- l’échantillonnage : 8 sortes de cafés différents dans une offre découverte intéressante et variée. Attention toutefois à ceux qui aiment les cafés pas trop forts car ils sont classés force 6 à 11. Je nuancerai tout de même cette remarque sur la force des cafés car étant habituée aux cafés Cosi et Volluto (force 4) de Nespresso j'ai vraiment apprécié le Starbucks Blonde force 6 qui n'était pas si fort que ça.
Il y en a donc pour tous les goûts selon moi dans ce pack et les dosettes de décaféiné sont un plus (en particulier quand on a de la visite).
Le rapport qualité prix est bon 80 dosettes pour 26€ c'est moins que pour les capsules standard.

On aime moins :
- Comme pour les dosettes Nespresso le gros point noir c'est l’aluminium ! Oui les dosettes se recyclent, oui c'est le matériau qui conserve le mieux l'arôme du café, mais ça reste mauvais pour l'environnement comme pour la santé. Et oui il existe des marques qui proposent des dosettes dans d'autres matériaux mais ces capsules n'étant pas parfaitement adaptées à la machine elles réduisent sa durée de vie et la qualité du café n'est pas la même. A quand les dosettes Nespresso écolo ? :)

Malgré cela, on craque parce que le café... c'est notre point faible !
Ces dosettes ? On les recommande à tous les curieux, les amateurs de café Starbucks, Nespresso ou juste de café un peu corsé. On a aimé accompagner d'une noisette de lait, et un petit biscuit (spéculoos) ou d'un carré de chocolat noir. Elles suscitent la curiosité des proches quand on reçoit, et ont été appréciées par ceux qui les ont goûtées autour de nous.

Reçu rapidement (prime) et en bon état, rien à redire sur la livraison. On recommande !!
"""

In [19]:
prompt = f"""
Quelles sont les fautes de français dans la revue de produit suivante?

La revue est délimitée par des triples apostrophes.

Revue : '''{amazon_review}'''
"""

response = get_completion(prompt)
print(response)

- "mon conjoint et moi sommes des consommateurs réguliers de Starbucks comme de café Nespresso" : il manque l'article devant "café Nespresso", il faudrait écrire "du café Nespresso".
- "Alors une fois le partenaria lancé" : il manque une lettre, il faudrait écrire "partenariat".
- "on valide sans s'extasier" : il manque l'article devant "on", il faudrait écrire "nous validons".
- "si on apprécie les arômes de ces dosettes assez corsées" : il manque l'article devant "si", il faudrait écrire "si l'on apprécie".
- "on n'a pas vraiment senti les 'note de noisette' ou 'note de caramel'" : il manque l'article devant "note", il faudrait écrire "les notes de noisette" et "les notes de caramel".
- "c'est le house Blend (maron clair)" : il manque une lettre, il faudrait écrire "le House Blend (marron clair)".
- "Bon chacun ses préférences et ses goûts" : il manque l'article devant "chacun", il faudrait écrire "Eh bien, chacun a ses préférences et ses goûts".
- "ça été une bonne surprise" : il y 

In [25]:
prompt = f"""
Identifiez les éléments suivants dans le texte de la revue :
- Sentiment (positif ou négatif)
- L'auteur du commentaire exprime-t-il de la colère ? (vrai ou faux)
- Article acheté par l'auteur de l'évaluation
- Prix de l'article
- Entreprise qui a fabriqué l'article

L'avis est délimité par des triples apostrophes.
Formulez votre réponse sous la forme d'un objet JSON contenant les éléments suivants
"Sentiment", "Colère", "Article", "Prix" et "Marque" comme clés.
Si l'information n'est pas présente, utilisez "inconnu"
comme valeur.
Faites en sorte que votre réponse soit aussi courte que possible.
Formulez la valeur "Colère" comme une valeur booléenne.

Texte de la critique : '''{amazon_review}'''
"""

response = get_completion(prompt)
print(response)

{
  "Sentiment": "positif",
  "Colère": false,
  "Article": "dosettes de café Starbucks pour machine Nespresso",
  "Prix": "26€ pour 80 dosettes",
  "Marque": "Starbucks et Nespresso"
}
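
The extracted `"Prix"` value is free text, so any arithmetic on it needs a parsing step. As a rough sketch, the price per capsule can be derived with two regular expressions. The helper `unit_price` is hypothetical, and it assumes the value keeps the shape seen above (an amount followed by `€`, and a count followed by the word `dosettes`).

```python
import re


def unit_price(price_text):
    """Return the price per capsule in euros, or None if the text does not match."""
    price = re.search(r"(\d+(?:[.,]\d+)?)\s*€", price_text)
    count = re.search(r"(\d+)\s+dosettes", price_text)
    if not price or not count:
        return None
    total = float(price.group(1).replace(",", "."))  # accept a decimal comma
    return total / int(count.group(1))


per_capsule = unit_price("26€ pour 80 dosettes")
print(f"{per_capsule:.3f} EUR per capsule")
```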


## Inferring topics

In [27]:
story = """
In a recent survey conducted by the government,
public sector employees were asked to rate their level
of satisfaction with the department they work at.
The results revealed that NASA was the most popular
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings,
stating, "I'm not surprised that NASA came out on top.
It's a great place to work with amazing people and
incredible opportunities. I'm proud to be a part of
such an innovative organization."

The results were also welcomed by NASA's management team,
with Director Tom Johnson stating, "We are thrilled to
hear that our employees are satisfied with their work at NASA.
We have a talented and dedicated team who work tirelessly
to achieve our goals, and it's fantastic to see that their
hard work is paying off."

The survey also revealed that the
Social Security Administration had the lowest satisfaction
rating, with only 45% of employees indicating they were
satisfied with their job. The government has pledged to
address the concerns raised by employees in the survey and
work towards improving job satisfaction across all departments.
"""

## Infer 5 topics

In [28]:
prompt = f"""
Determine five topics that are being discussed in the
following text, which is delimited by triple backticks.

Make each item one or two words long.

Format your response as a list of items separated by commas.

Text sample: ```{story}```
"""
response = get_completion(prompt)
print(response)

government survey, job satisfaction, NASA, Social Security Administration, employee satisfaction


In [29]:
response.split(sep=', ')

['government survey',
 'job satisfaction',
 'NASA',
 'Social Security Administration',
 'employee satisfaction']

In [30]:
topic_list = [
    "nasa", "local government", "engineering",
    "employee satisfaction", "federal government"
]

## Make a news alert for certain topics

In [31]:
prompt = f"""
Determine whether each item in the following list of
topics is a topic in the text below, which
is delimited with triple backticks.

Give your answer as a list with 0 or 1 for each topic.

List of topics: {", ".join(topic_list)}

Text sample: ```{story}```
"""
response = get_completion(prompt)
print(response)

nasa: 1
local government: 0
engineering: 0
employee satisfaction: 1
federal government: 1


In [32]:
# Parse each "topic: flag" line of the response into a {topic: int} dict
topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.strip().split(sep='\n')}
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")

ALERT: New NASA story!
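
Splitting on `': '` works for the exact output above but breaks if the model adds a blank line, a bullet, or a closing remark. A more tolerant line parser might look like the sketch below; `parse_topic_flags` and the regular expression are assumptions about plausible output shapes, not a guaranteed format.

```python
import re

# Matches lines such as "nasa: 1" or "- nasa: 0"; other lines are ignored.
_FLAG_LINE = re.compile(r"^\W*(.+?)\s*:\s*([01])\s*$")


def parse_topic_flags(response_text):
    """Build a {topic: 0 or 1} dict from 'topic: flag' lines, skipping the rest."""
    flags = {}
    for line in response_text.splitlines():
        match = _FLAG_LINE.match(line)
        if match:
            flags[match.group(1).lower()] = int(match.group(2))
    return flags


sample = "nasa: 1\nlocal government: 0\n\n- Engineering: 0\nSummary: done\n"
flags = parse_topic_flags(sample)
print(flags)
```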


## Try experimenting on your own!

### Output JSON for news alert

In [33]:
prompt = f"""
Determine whether each item in the following list of
topics is a topic in the text below, which
is delimited with triple backticks.

Give your answer as a JSON object with 0 or 1 for each topic.

List of topics: {", ".join(topic_list)}

Text sample: ```{story}```
"""
response = get_completion(prompt)
print(response)

{
  "nasa": 1,
  "local government": 0,
  "engineering": 0,
  "employee satisfaction": 1,
  "federal government": 1
}


In [34]:
import json
topic_dict = json.loads(response)
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")

ALERT: New NASA story!
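
With the JSON form, alerting on several topics at once is a small extension. The sketch below assumes the reply is valid JSON with 0 or 1 values; `topics_to_alert` and the watchlist are illustrative names, and the sample string mirrors the output above.

```python
import json


def topics_to_alert(response_text, watchlist):
    """Return the watched topics that the model flagged with 1."""
    flags = json.loads(response_text)
    return [topic for topic in watchlist if flags.get(topic) == 1]


sample = (
    '{"nasa": 1, "local government": 0, "engineering": 0, '
    '"employee satisfaction": 1, "federal government": 1}'
)
hits = topics_to_alert(sample, ["nasa", "engineering"])
for topic in hits:
    print(f"ALERT: New {topic} story!")
```

Using `flags.get(topic)` avoids a `KeyError` when the model omits a topic from its reply.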
