# __Demo: Transformer Applications__

# __Context:__

The cell below defines two strings, `text1` and `text2`. `text1` is a customer's negative review of an iPhone, and `text2` is a positive review. This demo shows how Transformer models handle two distinct tasks on this text: classification and generation.

In [1]:
text1 = '''Extremely disappointed with my recent iPhone purchase from Apple. The device constantly lags, and the battery life is abysmal,
barely lasting through the day. Despite the hefty price tag, the performance is far from satisfactory. Customer support has been unhelpful,
providing no viable solutions to address these persistant issues. This experience has left me regretting my decision to choose Apple,
and I expected much better from a company known for its premium products.'''

text2 = '''I recently purchased an iPhone from Apple, and it has been an absolute delight! The device runs smoothly, and the battery life is impressive, easily lasting throughout the day.
The price, though high, is justified by the excellent performance and top-notch customer support. I am thoroughly satisfied with my decision to choose Apple, and it reaffirms their reputation
for delivering premium products. Highly recommended for anyone seeking a reliable and high-performance smartphone'''

# __Task 1: Text Classification__
- Analyze customer reviews of an iPhone purchase to classify the sentiment as positive or negative.

In [5]:
from transformers import pipeline
classifier = pipeline("sentiment-analysis", model="gpt2")

import pandas as pd

# Classify the sentiment of text1
outputs1 = classifier(text1)
pd.DataFrame(outputs1)

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Device set to use cuda:0


|   | label | score |
|---|---|---|
| 0 | LABEL_1 | 0.647237 |


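__Note:__ The output displays a label with its corresponding probability. As the warning above states, however, the base `gpt2` checkpoint has no trained classification head (`score.weight` is newly initialized), so `LABEL_1` and its score of 0.647 are not a meaningful sentiment prediction. For reliable results, use a checkpoint that has been fine-tuned for sentiment classification.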

In [None]:
# Classify the sentiment of text2
outputs2 = classifier(text2)
pd.DataFrame(outputs2)
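
Because the base `gpt2` checkpoint used above has no trained classification head, a checkpoint fine-tuned for sentiment is likely a better fit for this task. The sketch below is one way to do that, not the notebook's original method. It assumes the `distilbert-base-uncased-finetuned-sst-2-english` checkpoint from the Hugging Face Hub; `build_sentiment_classifier` and `classify_reviews` are hypothetical helper names introduced for illustration.

```python
from typing import Callable, Dict, List

# Assumption: an SST-2 fine-tuned DistilBERT checkpoint from the Hugging Face Hub.
SENTIMENT_CHECKPOINT = "distilbert-base-uncased-finetuned-sst-2-english"


def build_sentiment_classifier(checkpoint: str = SENTIMENT_CHECKPOINT):
    """Create a sentiment pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so classify_reviews can be used without transformers installed.
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=checkpoint)


def classify_reviews(classifier: Callable, reviews: Dict[str, str]) -> List[dict]:
    """Run the classifier on each named review and collect one row per review."""
    rows = []
    for name, text in reviews.items():
        # A pipeline call on one string returns a list holding one
        # {"label": ..., "score": ...} dictionary.
        result = classifier(text)[0]
        rows.append({"review": name, "label": result["label"], "score": result["score"]})
    return rows
```

In the notebook this could be called as `pd.DataFrame(classify_reviews(build_sentiment_classifier(), {"text1": text1, "text2": text2}))`.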

# __Task 2: Text Generation__

- Generate a customer service response to the negative review (`text1`) using a text-generation pipeline.

In [6]:
from transformers import set_seed
set_seed(42) # Set the seed to get reproducible results

generator = pipeline("text-generation")
response = "Dear Patron, Thanks for writing in! I am sorry to hear your experience with us."
prompt = text1 + "\n\nCustomer service response:\n" + response
outputs = generator(prompt, max_length=150)
print(outputs[0]['generated_text'])

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cuda:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=150) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Extremely disappointed with my recent iPhone purchase from Apple. The device constantly lags, and the battery life is abysmal,
barely lasting through the day. Despite the hefty price tag, the performance is far from satisfactory. Customer support has been unhelpful,
providing no viable solutions to address these persistant issues. This experience has left me regretting my decision to choose Apple,
and I expected much better from a company known for its premium products.

Customer service response:
Dear Patron, Thanks for writing in! I am sorry to hear your experience with us. We have been working with you for a while to improve Apple's product experience. We received some positive feedback from reviewers on the iPhone 6, and we are working with you to improve our customer service. We are currently working with Apple to improve our customer service and to help us out. In the meantime, we have implemented other policies to deal with the issues. We have also made improvements to make the 

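__Note:__ The model continues the prompt with a fluent, on-topic reply, which shows how Transformers can produce context-aware text for automated customer service or similar applications. The reply is not reliable as written, though: it contradicts itself (for example, "working with Apple to improve our customer service") and stops mid-sentence, so output from a base model such as GPT-2 needs review before use.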
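
The printed `generated_text` repeats the whole prompt before the model's continuation. A minimal sketch for keeping only the newly generated reply is shown below; `extract_continuation` is a hypothetical helper, not part of the `transformers` API, and the strings are illustrative only. The text-generation pipeline also accepts `return_full_text=False`, which should have a similar effect.

```python
def extract_continuation(generated_text: str, prompt: str) -> str:
    """Return the text the model added after the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text if the prompt was altered (for example, truncated).
    return generated_text.strip()


# Illustrative strings only; in the notebook these would be `prompt` and
# `outputs[0]["generated_text"]`.
example_prompt = "Customer service response:\nDear Patron,"
example_output = example_prompt + " we are sorry about the battery issue."
print(extract_continuation(example_output, example_prompt))
```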

# __Conclusion__
This demo illustrates how Transformer models can be applied to real-world text, both for understanding it (sentiment classification) and for generating human-like responses to it.

In [7]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

In [8]:
import torch
torch.cuda.is_available()

True

In [12]:
# A multi-character separator requires the Python parsing engine; naming it explicitly avoids a ParserWarning
df = pd.read_csv("customer_feedback.csv", sep=", ", engine="python")


In [14]:
df.head(3)

|   | "Text | Sentiment | Source | Date/Time | User ID | Location | Confidence Score" |
|---|---|---|---|---|---|---|---|
| 0 | """I love this product!"" | Positive | Twitter | 2023-06-15 09:23:14 | @user123 | New York | 0.85" |
| 1 | """The service was terrible."" | Negative | Yelp Reviews | 2023-06-15 11:45:32 | user456 | Los Angeles | 0.65" |
| 2 | """This movie is amazing!"" | Positive | IMDb | 2023-06-15 14:10:22 | moviefan789 | London | 0.92" |
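
The stray quote characters in the column names (`"Text`, `Confidence Score"`) and in the values suggest that each line of the file is wrapped in quotes that the `", "` separator does not strip. A minimal sketch of cleaning them up follows, assuming the quotes carry no meaning. `strip_stray_quotes` is a hypothetical helper and is not applied in the cells below, which keep using the original column names.

```python
def strip_stray_quotes(value: str) -> str:
    """Remove leading and trailing double quotes and surrounding whitespace."""
    return value.strip().strip('"').strip()


# Values as they appear in the DataFrame above.
raw_columns = ['"Text', "Sentiment", 'Confidence Score"']
raw_text = '"""I love this product!""'

print([strip_stray_quotes(c) for c in raw_columns])
print(strip_stray_quotes(raw_text))
```

With pandas, the same idea could be applied through `df.rename(columns=strip_stray_quotes)` for the headers and `Series.map(strip_stray_quotes)` for the affected columns.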


In [23]:
df["Sentiment"].unique()

array(['Positive', 'Negative'], dtype=object)

In [15]:
model_name = "google/flan-t5-small"

In [16]:
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

In [17]:
tokenizer = AutoTokenizer.from_pretrained(model_name)

In [19]:
device = "cuda" if torch.cuda.is_available() else "cpu"

In [20]:
model = model.to(device)

In [47]:
# The column is named '"Text' (with a stray leading quote) because of how the CSV was parsed
data = df['"Text'].tolist()
len(data)

96

In [27]:
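def predict_sentiment(text, max_new_tokens=10):
    """Prompt the FLAN-T5 model to label one feedback string and return the decoded output."""
    # The prompt offers three labels, although this dataset only contains Positive and Negative.
    prompt = f"Classify the following customer data as Positive, Negative or Neutral \n Feedback {text} \n sentiment:"
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the single generated sequence, dropping special tokens such as the end-of-sequence marker.
    return tokenizer.decode(output[0], skip_special_tokens=True)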
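
A generative model returns free text rather than a fixed class, so its output may differ in case or include extra words. The sketch below shows one way to map the raw string onto a known label before scoring. `normalize_label` and the `"unknown"` fallback are assumptions for illustration, not part of the original notebook, and the substring match is deliberately naive (it would read "not positive" as positive).

```python
ALLOWED_LABELS = ("positive", "negative", "neutral")


def normalize_label(generated: str, allowed=ALLOWED_LABELS, fallback: str = "unknown") -> str:
    """Map free-form model output to one of the allowed lowercase labels."""
    cleaned = generated.strip().lower()
    for label in allowed:
        # Accept outputs such as "Positive", "positive.", or "Sentiment: positive".
        if label in cleaned:
            return label
    # Nothing recognizable was generated; flag it instead of guessing.
    return fallback
```

Counting how many predictions fall back to `"unknown"` gives a quick check on whether the prompt is steering the model toward the expected labels.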

In [29]:
predicted = [predict_sentiment(i) for i in data]

In [30]:
df.head(3)

|   | "Text | Sentiment | Source | Date/Time | User ID | Location | Confidence Score" |
|---|---|---|---|---|---|---|---|
| 0 | """I love this product!"" | Positive | Twitter | 2023-06-15 09:23:14 | @user123 | New York | 0.85" |
| 1 | """The service was terrible."" | Negative | Yelp Reviews | 2023-06-15 11:45:32 | user456 | Los Angeles | 0.65" |
| 2 | """This movie is amazing!"" | Positive | IMDb | 2023-06-15 14:10:22 | moviefan789 | London | 0.92" |


In [31]:
df["Predicted_llm"] = predicted

In [32]:
df.head(3)

|   | "Text | Sentiment | Source | Date/Time | User ID | Location | Confidence Score" | Predicted_llm |
|---|---|---|---|---|---|---|---|---|
| 0 | """I love this product!"" | Positive | Twitter | 2023-06-15 09:23:14 | @user123 | New York | 0.85" | positive |
| 1 | """The service was terrible."" | Negative | Yelp Reviews | 2023-06-15 11:45:32 | user456 | Los Angeles | 0.65" | negative |
| 2 | """This movie is amazing!"" | Positive | IMDb | 2023-06-15 14:10:22 | moviefan789 | London | 0.92" | positive |


In [33]:
df.shape

(96, 8)

In [40]:
# Lowercase the reference labels so they match the model's lowercase output
df["Sentiment"] = df["Sentiment"].str.lower()

In [42]:
from sklearn.metrics import accuracy_score
accuracy_score(df["Sentiment"], df["Predicted_llm"])

1.0

In [45]:
df[df["Sentiment"]==df["Predicted_llm"]].shape

(96, 8)
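
An accuracy of 1.0 means every predicted string matched its reference exactly, which the row count above confirms. To make explicit what that metric computes, and how errors would be distributed if there were any, the sketch below recomputes accuracy and tallies (true, predicted) pairs in plain Python. The helper names and the sample lists are made up for illustration and are not the notebook's data.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of positions where prediction and reference match exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str]) -> Counter:
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


# Made-up labels with one deliberate mistake in the third position.
y_true = ["positive", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]

print(accuracy(y_true, y_pred))
print(confusion_counts(y_true, y_pred))
```

On a dataset with only short, unambiguous reviews, a perfect score is plausible, but the pair counts are a more informative check once harder or neutral examples are added.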