# Text summarization with ChatGPT
In this lesson, you will summarize text with a focus on specific topics.

## Setup

In [3]:
pip install python-dotenv


Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.1


In [4]:
pip install openai




In [5]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY  = os.getenv('OPENAI_API_KEY')

In [7]:
# Reuse the key loaded from .env above rather than hard-coding it
client = OpenAI(api_key=OPENAI_API_KEY)

def get_completion(prompt, model="gpt-4o-mini"):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message.content
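
If the `.env` file is missing or the variable is misspelled, `os.getenv` silently returns `None` and the failure only shows up at the first API call. A minimal sketch of an early check follows; `require_api_key` is a hypothetical helper, not part of the lesson or of the `openai` library, and the `env` parameter exists only so the check can be tried without touching the real environment.

```python
import os


def require_api_key(name="OPENAI_API_KEY", env=None):
    """Return the key stored under `name`, or raise a clear error if it is missing."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return key


# Example with an explicit mapping instead of the real environment
print(require_api_key(env={"OPENAI_API_KEY": "sk-test"}))
```

In the notebook, the result could be passed as `api_key` when the client is created.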


## Text to summarize

In [8]:
prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look. It's \
a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price. It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her.
"""

## Summarize with a word/sentence/character limit

In [9]:
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site.

Summarize the review below, delimited by triple
backticks, in at most 30 words.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


The panda plush toy is soft, cute, and loved by my daughter, but it's smaller than expected for the price. It arrived early, allowing me to enjoy it first.
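
The model is asked for at most 30 words, but nothing in the prompt enforces that, so it can be worth checking the response. A minimal sketch, assuming that splitting on whitespace is an acceptable way to count words; `word_count` and `within_word_limit` are hypothetical helper names.

```python
def word_count(text):
    """Count whitespace-separated tokens in `text`."""
    return len(text.split())


def within_word_limit(text, max_words):
    """Return True if `text` has at most `max_words` whitespace-separated tokens."""
    return word_count(text) <= max_words


# The summary printed above, copied verbatim
summary = (
    "The panda plush toy is soft, cute, and loved by my daughter, "
    "but it's smaller than expected for the price. It arrived early, "
    "allowing me to enjoy it first."
)
print(word_count(summary), within_word_limit(summary, 30))  # prints: 29 True
```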


## Summarize with a focus on shipping and delivery

In [10]:
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
Shipping department.

Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that mention shipping and delivery of the product.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


The product arrived a day earlier than expected, enhancing the overall experience for the customer.


## Summarize with a focus on price and value

In [11]:
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
pricing department, responsible for determining the \
price of the product.

Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that are relevant to the price and perceived value.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


The panda plush toy is loved for its softness and cuteness, but perceived as overpriced for its small size compared to larger options available at the same price.


#### Comment
- Summaries include topics that are not related to the topic of focus.
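
The three prompts above differ only in the audience and the focus, so they could be generated from one template, which makes it easier to vary one element at a time. This is a sketch under that assumption; `build_summary_prompt` and its parameters are hypothetical and not part of the lesson.

```python
def build_summary_prompt(review, max_words=30, audience=None, focus=None):
    """Assemble a summary prompt with an optional audience and focus."""
    delim = "`" * 3  # triple backticks used to delimit the review
    task = ("Your task is to generate a short summary of a product "
            "review from an ecommerce site")
    if audience:
        task += f" to give feedback to the {audience}"
    instruction = ("Summarize the review below, delimited by triple "
                   f"backticks, in at most {max_words} words")
    if focus:
        instruction += f", and focusing on {focus}"
    return f"{task}.\n\n{instruction}.\n\nReview: {delim}{review}{delim}"


shipping_prompt = build_summary_prompt(
    "Arrived a day early.",
    audience="Shipping department",
    focus="any aspects that mention shipping and delivery of the product",
)
print(shipping_prompt)
```

The returned string can be passed to `get_completion` in the same way as the handwritten prompts.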

## Try "extract" instead of "summarize"

In [12]:
prompt = f"""
Your task is to extract relevant information from \
a product review from an ecommerce site to give \
feedback to the Shipping department.

From the review below, delimited by triple backticks, \
extract the information relevant to shipping and \
delivery. Limit to 30 words.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

The product arrived a day earlier than expected.


## Summarize multiple product reviews

In [13]:

review_1 = prod_review

# review for a standing lamp
review_2 = """
Needed a nice lamp for my bedroom, and this one \
had additional storage and not too high of a price \
point. Got it fast - arrived in 2 days. The string \
to the lamp broke during the transit and the company \
happily sent over a new one. Came within a few days \
as well. It was easy to put together. Then I had a \
missing part, so I contacted their support and they \
very quickly got me the missing piece! Seems to me \
to be a great company that cares about their customers \
and products.
"""

# review for an electric toothbrush
review_3 = """
My dental hygienist recommended an electric toothbrush, \
which is why I got this. The battery life seems to be \
pretty impressive so far. After initial charging and \
leaving the charger plugged in for the first week to \
condition the battery, I've unplugged the charger and \
been using it for twice daily brushing for the last \
3 weeks all on the same charge. But the toothbrush head \
is too small. I’ve seen baby toothbrushes bigger than \
this one. I wish the head was bigger with different \
length bristles to get between teeth better because \
this one doesn’t.  Overall if you can get this one \
around the $50 mark, it's a good deal. The manufactuer's \
replacements heads are pretty expensive, but you can \
get generic ones that're more reasonably priced. This \
toothbrush makes me feel like I've been to the dentist \
every day. My teeth feel sparkly clean!
"""

# review for a blender
review_4 = """
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""

reviews = [review_1, review_2, review_3, review_4]

In [14]:
for i, review in enumerate(reviews):
    prompt = f"""
    Your task is to generate a short summary of a product \
    review from an ecommerce site.

    Summarize the review below, delimited by triple \
    backticks in at most 20 words.

    Review: ```{review}```
    """

    response = get_completion(prompt)
    print(i, response, "\n")

0 Cute and soft panda plush toy, but a bit small for the price; arrived early. 

1 Great lamp with storage, quick delivery, responsive support for issues, and easy assembly. Highly recommended! 

2 Impressive battery life, but toothbrush head is too small; overall, a good deal if priced around $50. 

3 Prices increased significantly after a sale; quality seems lower, and motor issues arose after a year. 
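
For a larger batch it may be handier to collect the summaries instead of printing them. The following is a minimal sketch; `summarize_reviews` and `fake_complete` are hypothetical names, and `fake_complete` only stands in for `get_completion` so the block runs without an API call.

```python
def summarize_reviews(reviews, complete, max_words=20):
    """Return (index, summary) pairs, calling `complete` once per review."""
    delim = "`" * 3  # triple backticks used to delimit the review
    results = []
    for i, review in enumerate(reviews):
        prompt = (
            "Your task is to generate a short summary of a product "
            "review from an ecommerce site.\n\n"
            "Summarize the review below, delimited by triple "
            f"backticks, in at most {max_words} words.\n\n"
            f"Review: {delim}{review}{delim}"
        )
        results.append((i, complete(prompt).strip()))
    return results


def fake_complete(prompt):
    """Hypothetical stand-in for get_completion; makes no API call."""
    return "stub summary"


summaries = summarize_reviews(["first review", "second review"], fake_complete)
for i, summary in summaries:
    print(i, summary)
```

In the notebook, `get_completion` would be passed in place of `fake_complete`.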



# Exercise
 - Complete the prompts similar to what we did in class.
     - Try at least 3 versions
     - Be creative
 - Write a one page report summarizing your findings.
     - Were there variations that didn't work well, for example where GPT hallucinated or gave a wrong answer?
 - What did you learn?

# My prompt 1: Art history

In [None]:
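# Note: prod_review still holds the panda plush review defined earlier, so the
# delimited text does not match the Renaissance task described in the prompt.
prompt = f"""
Your task is to generate a short summary of the history of
Renaissance art, highlighting the most significant pieces.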
Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that are relevant to the topic and how it affected the world.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

The Renaissance art period saw a revival of classical styles and techniques, with significant pieces like Leonardo da Vinci's "Mona Lisa" and Michelangelo's "David" influencing art worldwide.


# My prompt 2: The history of machine learning transformers

In [15]:
history_1 = """
For many years, sequence modelling and generation was done by using plain recurrent neural networks (RNNs).
A well-cited early example was the Elman network (1990). In theory, the information from one token can propagate arbitrarily far down the sequence,
but in practice the vanishing-gradient problem leaves the model's state at the end of a long sentence without precise, extractable information about preceding tokens.

A key breakthrough was LSTM (1995),[note 1] a RNN which used various innovations to overcome the vanishing gradient problem,
allowing efficient learning of long-sequence modelling. One key innovation was the use of an attention mechanism which used
 neurons that multiply the outputs of other neurons, so-called multiplicative units.[13] Neural networks using multiplicative units were
  later called sigma-pi networks[14] or higher-order networks.[15] LSTM became the standard architecture for long sequence modelling until the
  2017 publication of Transformers. However, LSTM still used sequential processing, like most other RNNs.[note 2] Specifically,
  RNNs operate one token at a time from first to last; they cannot operate in parallel over all tokens in a sequence.

Modern Transformers overcome this problem, but unlike RNNs, they require computation time that is quadratic in the size of the context window.
The linearly scaling fast weight controller (1992) learns to compute a weight matrix for further processing depending on the input.[16]
One of its two networks has "fast weights" or "dynamic links" (1981).[17][18][19] A slow neural network learns by gradient descent to generate keys
and values for computing the weight changes of the fast neural network which computes answers to queries.[16] This was later shown to be equivalent
to the unnormalized linear Transformer.[20][21]
"""

history_2 = """ The idea of encoder-decoder sequence transduction had been developed in the early 2010s (see previous papers[22][23]).
The papers most commonly cited as the originators that produced seq2seq are two concurrently published papers from 2014.[22][23]

A 380M-parameter model for machine translation uses two long short-term memories (LSTM).[23] Its architecture consists of two parts.
The encoder is an LSTM that takes in a sequence of tokens and turns it into a vector. The decoder is another LSTM that converts the vector into a sequence of tokens. Similarly,
another 130M-parameter model used gated recurrent units (GRU) instead of LSTM.[22] Later research showed that GRUs are neither better nor worse than LSTMs for seq2seq.[24][25]

These early seq2seq models had no attention mechanism, and the state vector is accessible only after the last word of the source text was processed.
Although in theory such a vector retains the information about the whole original sentence, in practice the information is poorly preserved.
This is because the input is processed sequentially by one recurrent network into a fixed-size output vector, which is then processed by another recurrent network into an output.
If the input is long, then the output vector would not be able to contain all relevant information, degrading the output. As evidence, reversing the input sentence improved seq2seq translation.[26]

The RNNsearch model introduced an attention mechanism to seq2seq for machine translation to solve the bottleneck problem (of the fixed-size output vector),
allowing the model to process long-distance dependencies more easily. The name is because it "emulates searching through a source sentence during decoding a translation".[4]

The relative performances were compared between global (that of RNNsearch) and local (sliding window) attention model architectures for machine translation,
 finding that mixed attention had higher quality than global attention, while local attention reduced translation time.[27]

In 2016, Google Translate was revamped to Google Neural Machine Translation, which replaced the previous model based on statistical machine translation.
The new model was a seq2seq model where the encoder and the decoder were both 8 layers of bidirectional LSTM.[28]
It took nine months to develop, and it outperformed the statistical approach, which took ten years to develop.[29]
"""
history = [history_1, history_2]

In [16]:
for i, passage in enumerate(history):
    prompt = f"""
    Your task is to generate a short summary of a passage on the \
    history of transformers from a history site.

    Summarize the passage below, delimited by triple \
    backticks, in at most 20 words.

    Passage: ```{passage}```
    """

    response = get_completion(prompt)
    print(i, response, "\n")

0 Transformers revolutionized sequence modeling in 2017, overcoming RNN limitations like sequential processing and vanishing gradients. 

1 Transformers evolved from early seq2seq models, introducing attention mechanisms for improved machine translation efficiency and performance. 

