# **Inferring**
In this lesson, you will infer sentiment and topics from product reviews and news articles.

## Setup

In [16]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

In [17]:
client = OpenAI(
    # This is the default and can be omitted
    api_key=OPENAI_API_KEY,
)

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message.content

## Product review text

In [18]:
lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""

## Sentiment (positive/negative)

In [None]:
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple quotes?

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

In [None]:
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple quotes?

Give your answer as a single word, either "positive" \
or "negative".

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
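
A one-word answer is convenient for downstream code, but the model may still return variations such as `Positive.` with capitalization or trailing punctuation. The following is a minimal sketch of normalizing the answer before using it. The helper `normalize_sentiment` is hypothetical (it is not part of the lesson or of the OpenAI library), and the sample strings stand in for model output rather than actual API responses.

```python
import string

def normalize_sentiment(raw):
    """Map a raw model answer to 'positive', 'negative', or 'unknown'.

    Hypothetical helper for illustration; the labels mirror the prompt above.
    """
    # Drop surrounding whitespace, quotes, and punctuation, then lower-case
    cleaned = raw.strip().strip(string.punctuation + string.whitespace).lower()
    return cleaned if cleaned in ("positive", "negative") else "unknown"

# Sample strings standing in for model output (not actual API responses)
print(normalize_sentiment("Positive."))              # positive
print(normalize_sentiment(' "negative" '))           # negative
print(normalize_sentiment("It is mostly positive"))  # unknown
```

Returning `unknown` for anything unexpected keeps a malformed answer from silently being treated as one of the two labels.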

## Identify types of emotions

In [None]:
prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

## Identify anger

In [None]:
prompt = f"""
Is the writer of the following review expressing anger? \
The review is delimited with triple quotes. \
Give your answer as either yes or no.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

## Extract product and company name from customer reviews

In [None]:
prompt = f"""
Identify the following items from the review text: 
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Item" and "Brand" as the keys. 
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
  
Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

## Doing multiple tasks at once

In [None]:
prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple quotes. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
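
Because the prompt asks for a JSON object, the response can be loaded into a Python dictionary instead of being handled as a string. The following is a minimal sketch, assuming the model may wrap its answer in a Markdown code fence. The helper `parse_json_response` is hypothetical, and the sample string is made up to match the requested keys; it is not an actual API response.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker

def parse_json_response(raw):
    """Parse a JSON object from model output, tolerating a Markdown code fence.

    Hypothetical helper; returns None if the text is not a JSON object.
    """
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]   # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]          # drop the closing fence line
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None

# Sample string standing in for model output (not an actual API response)
sample = '{"Sentiment": "positive", "Anger": false, "Item": "lamp", "Brand": "Lumina"}'
result = parse_json_response(sample)
print(result["Anger"])                  # False
print(parse_json_response("not json"))  # None
```

Asking for the Anger value as a boolean pays off here: JSON `false` becomes the Python value `False`, so it can be used directly in an `if` statement.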

## Inferring Text Topics
Another application of inference with an LLM is deducing topics from a lengthy piece of text.

This time, the sample is a fictitious newspaper article about a government survey measuring the satisfaction of workers in government agencies. The results reveal that NASA workers had the highest satisfaction rating.

In [25]:
story = """
In a recent survey conducted by the government, 
public sector employees were asked to rate their level 
of satisfaction with the department they work at. 
The results revealed that NASA was the most popular 
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings, 
stating, "I'm not surprised that NASA came out on top. 
It's a great place to work with amazing people and 
incredible opportunities. I'm proud to be a part of 
such an innovative organization."

The results were also welcomed by NASA's management team, 
with Director Tom Johnson stating, "We are thrilled to 
hear that our employees are satisfied with their work at NASA. 
We have a talented and dedicated team who work tirelessly 
to achieve our goals, and it's fantastic to see that their 
hard work is paying off."

The survey also revealed that the 
Social Security Administration had the lowest satisfaction 
rating, with only 45% of employees indicating they were 
satisfied with their job. The government has pledged to 
address the concerns raised by employees in the survey and 
work towards improving job satisfaction across all departments.
"""

The prompt asks the model for five topics discussed in the article, each one or two words long, formatted as a comma-separated list. ChatGPT returns topics such as government surveys, job satisfaction, and NASA.

In [None]:
prompt = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple quotes.

Make each item one or two words long. 

Format your response as a list of items separated by commas without numbering them.

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

In [None]:
response.split(sep=', ')
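
The `response.split(sep=', ')` call above works only when every separator is exactly a comma followed by one space. The following is a minimal sketch of a more forgiving split, assuming the answer is still a single comma-separated line. The helper `split_topics` is hypothetical, and the sample string is made up to show irregular spacing; it is not an actual API response.

```python
def split_topics(raw):
    """Split a comma-separated answer into a clean list of topic strings."""
    # Strip whitespace and a trailing period from each item; drop empty items
    items = (part.strip().rstrip(".").strip() for part in raw.split(","))
    return [item for item in items if item]

# Sample string standing in for model output (not an actual API response)
sample = "government survey,job satisfaction , NASA, employee concerns."
print(split_topics(sample))
# ['government survey', 'job satisfaction', 'NASA', 'employee concerns']
```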

## Make a news alert for certain topics

The final sample application checks which topics from a predefined list a text covers. First, define the list of possible topics:

In [28]:
topic_list = [
    "nasa", "local government", "engineering", 
    "employee satisfaction", "federal government"
]

In [None]:
# First, let's modify the prompt to get a cleaner response format
prompt = f"""
Determine whether each item in the following list of topics is a topic in the text below, which
is delimited with triple quotes.

Give your answer as comma-separated pairs of topic and number (0 or 1), with no spaces after commas.
For example: nasa:1,local government:0,engineering:0

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

In [None]:
print("\nResponse items:")
print("-" * 30)
for i in response.replace('{', '').replace('}', '').split(','):
    # Split into key and value
    key_value = i.strip().strip('"').split(':')
    if len(key_value) == 2:
        key = key_value[0].strip().strip('"')
        value = key_value[1].strip()
        print(f"{key:<20} : {value}")

In [None]:
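# Parse the "topic:0" / "topic:1" pairs, skipping any malformed items
topic_dict = {}
for item in response.split(','):
    key, sep, value = item.partition(':')
    if sep and value.strip() in ('0', '1'):
        topic_dict[key.strip().lower()] = int(value.strip())
if topic_dict.get('nasa') == 1:
    print("ALERT: New NASA story!")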

# Exercise
 - Complete the prompts similar to what we did in class. 
     - Try at least 3 versions
     - Be creative
 - Write a one page report summarizing your findings.
     - Were there variations that didn't work well, e.g., where GPT hallucinated or was wrong?
 - What did you learn?

## Version 1: Analyze the emotional undertones and workplace culture

In [None]:
# Version 1: Analyze the emotional undertones and workplace culture
prompt_v1 = f"""
Analyze the following text and provide:
1. The dominant emotions expressed by employees (list up to 3)
2. Key aspects of workplace culture mentioned
3. Rate the overall workplace atmosphere on a scale of 1-10

Format your response as:
EMOTIONS: emotion1, emotion2, emotion3
CULTURE: aspect1, aspect2, aspect3
ATMOSPHERE_SCORE: X/10

Text: '''{story}'''
"""
response = get_completion(prompt_v1)
print(response)
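
The `LABEL: value` layout requested in this prompt is straightforward to turn into structured data. The following is a minimal sketch, assuming the model follows the requested format line by line. The helper `parse_labeled_lines` is hypothetical, and the sample text is invented to match the format; it is not an actual API response.

```python
def parse_labeled_lines(raw):
    """Turn 'LABEL: value' lines into a dict keyed by label."""
    fields = {}
    for line in raw.splitlines():
        label, sep, value = line.partition(":")
        # Keep only lines that contain a colon and a non-empty label
        if sep and label.strip():
            fields[label.strip()] = value.strip()
    return fields

# Invented sample in the requested format (not an actual API response)
sample = """EMOTIONS: pride, satisfaction, enthusiasm
CULTURE: innovation, teamwork, dedication
ATMOSPHERE_SCORE: 8/10"""

fields = parse_labeled_lines(sample)
emotions = [e.strip() for e in fields["EMOTIONS"].split(",")]
score = int(fields["ATMOSPHERE_SCORE"].split("/")[0])
print(emotions)  # ['pride', 'satisfaction', 'enthusiasm']
print(score)     # 8
```

If the model adds commentary or omits a label, the dictionary lookups raise `KeyError`, which is a useful signal that the response did not follow the format.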



## Version 2: Compare and contrast departments

In [None]:
# Version 2: Compare and contrast departments
prompt_v2 = f"""
From the text below, create a comparison table between NASA and the Social Security Administration.
Format your response as:
CATEGORY | NASA | SSA
Satisfaction Rate | X% | X%
Employee Morale | (High/Medium/Low) | (High/Medium/Low)
Management Response | (Positive/Negative/None) | (Positive/Negative/None)

Text: '''{story}'''
"""
response = get_completion(prompt_v2)
print(response)



## Version 3: Future implications analysis

In [None]:
# Version 3: Future implications analysis
prompt_v3 = f"""
Based on the news story below, predict:
1. Three likely changes that might be implemented in government departments
2. Potential impact on employee retention (in 10 words or less)
3. One specific recommendation for improving satisfaction at SSA

Format as:
CHANGES:
- Change 1
- Change 2
- Change 3

RETENTION IMPACT: [your 10-word prediction]

SSA RECOMMENDATION: [your specific recommendation]

Text: '''{story}'''
"""
response = get_completion(prompt_v3)
print(response)

## Lab Report: Inferring with GPT
## Experiment Summary and Findings

### Overview
This report summarizes experiments conducted using GPT to analyze text and infer various aspects including sentiment, topics, and workplace dynamics from a given story about NASA and government employee satisfaction.

### Experimental Approaches

1. **Basic Topic Analysis**
   - Tested GPT's ability to identify main topics from text
   - Successfully identified key themes like "NASA", "employee satisfaction"
   - Format: Simple comma-separated list
   - Accuracy: Generally reliable for obvious topics

2. **Sentiment and Emotion Analysis**
   - Analyzed emotional undertones and workplace culture
   - Format: Structured output with emotions, culture aspects, and ratings
   - Results were consistent with human interpretation
   - Particularly strong at identifying positive sentiments

3. **Comparative Department Analysis**
   - Created structured comparisons between departments
   - Format: Table-style output comparing metrics
   - Effectively highlighted contrasts between agencies

### Challenges and Limitations

1. **Format Consistency**
   - Initial JSON formatting caused parsing issues
   - Solution: Simplified to basic comma-separated format
   - Learning: Clear, simple formatting instructions work better

2. **Data Interpretation**
   - GPT occasionally included information not explicitly stated
   - Example: When asked about engineering topics, sometimes inferred presence without direct mentions
   - Learning: Need to be specific about inference vs. explicit mention

3. **Response Reliability**
   - More reliable when:
     - Given specific formatting instructions
     - Asked for factual rather than interpretive analysis
     - Provided with clear evaluation criteria
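
The inference-versus-explicit-mention issue noted under Data Interpretation can be partly checked in code. The following is a minimal sketch of a literal check that flags topics whose words never appear in the source text. The helper `explicitly_mentioned` is hypothetical and deliberately crude: it ignores synonyms and paraphrases, so it can only flag candidates for manual review.

```python
import re

def explicitly_mentioned(topic, text):
    """Return True if every word of `topic` appears literally in `text`.

    Hypothetical helper: a crude literal check, not a semantic one.
    """
    text_words = set(re.findall(r"[a-z]+", text.lower()))
    topic_words = re.findall(r"[a-z]+", topic.lower())
    return bool(topic_words) and all(w in text_words for w in topic_words)

# One sentence from the article used in this lesson
excerpt = (
    "The results revealed that NASA was the most popular "
    "department with a satisfaction rating of 95%."
)

for topic in ["nasa", "engineering"]:
    label = "explicit" if explicitly_mentioned(topic, excerpt) else "inferred or absent"
    print(f"{topic:<12} : {label}")
```

A topic that the model marks as present but that this check labels "inferred or absent" is worth comparing against the source text by hand.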

### Key Learnings

1. **Prompt Engineering**
   - Clear, structured prompts produce more reliable results
   - Including example formats improves output consistency
   - Breaking complex tasks into smaller components helps accuracy

2. **Data Processing**
   - Simple data formats are easier to work with
   - Important to validate and clean GPT outputs
   - String parsing requires careful error handling

3. **Best Practices**
   - Always verify GPT's inferences against source text
   - Use structured prompts for consistent results
   - Include specific formatting instructions
   - Test multiple prompt variations for optimal results

### Conclusion
GPT demonstrated strong capabilities in topic identification and sentiment analysis, but required careful prompt engineering for optimal results. The most successful approaches used clear formatting instructions and specific evaluation criteria. Future work could focus on improving reliability for more complex analytical tasks.
