<a href="https://colab.research.google.com/github/wendywqz/ML/blob/main/GenAI_Summarizing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Set up the environment

import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key = os.getenv('OPENAI_API_KEY')  # Store the key in a local .env file, or paste the key here directly.
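
`os.getenv` returns `None` when the variable is missing, so a forgotten `.env` file only surfaces later as an authentication error. A minimal sketch of a fail-fast check, assuming a hypothetical helper named `require_env` (not part of the notebook):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell."
        )
    return value


# Usage (assumes load_dotenv has already run):
# openai.api_key = require_env("OPENAI_API_KEY")
```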

In [None]:
# Andrew mentioned that the prompt/completion paradigm is preferable for this class.
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]
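
`openai.ChatCompletion.create` belongs to the pre-1.0 `openai` package and was removed in version 1.0. For newer installs, a rough equivalent might look like the sketch below. The name `get_completion_v1` and the optional `client` parameter are assumptions added here so the function can be tried with a stand-in client; they are not part of the notebook.

```python
def get_completion_v1(prompt, model="gpt-3.5-turbo", client=None):
    """Variant of get_completion for openai>=1.0 (client-based interface)."""
    if client is None:
        # Imported lazily so the function can be exercised with a stand-in client.
        from openai import OpenAI
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the output as deterministic as possible
    )
    return response.choices[0].message.content
```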


# Summarizing

In [None]:
# Text to summarize - general summarizing

prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look. It's \
a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price. It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her.
"""


【Output】
Soft, cute panda plush toy loved by daughter, but smaller than expected for the price. Arrived early, friendly face.
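
The prompts in this section differ only in audience, focus, and word limit, so they can be assembled by one helper. A minimal sketch, assuming a hypothetical `build_summary_prompt` whose parameter names are illustrative; called without `audience` or `focus`, it yields a general summary prompt of the kind that would produce an output like the one above.

```python
def build_summary_prompt(review, max_words=30, audience=None, focus=None):
    """Assemble a summarization prompt; all parameter names are illustrative."""
    task = (
        "Your task is to generate a short summary of a product "
        "review from an ecommerce site"
    )
    if audience:
        task += f" to give feedback to the {audience}"
    instruction = (
        "Summarize the review below, delimited by triple backticks, "
        f"in at most {max_words} words"
    )
    if focus:
        instruction += f", focusing on any aspects that mention {focus}"
    return f"{task}.\n\n{instruction}.\n\nReview: ```{review}```"


# Usage (needs prod_review and get_completion from the cells above):
# print(get_completion(build_summary_prompt(prod_review)))
```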

In [None]:
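# Focus on shipping and delivery

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
Shipping department.

Summarize the review below, delimited by triple \
backticks, in at most 30 words, and focusing on any aspects \
that mention shipping and delivery of the product.

Review: ```{prod_review}```
"""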

response = get_completion(prompt)
print(response)

【Output】
The customer was pleased with the early delivery of the panda plush toy, but felt it was slightly small for the price paid.

In [None]:
# Focus on price and value

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
pricing department, responsible for determining the \
price of the product.

Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that are relevant to the price and perceived value.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


【Output】
The panda plush toy is loved for its softness and cuteness, but some customers feel it's a bit small for the price.

In [None]:
# Try 'extract' instead of 'summarize'

prompt = f"""
Your task is to extract relevant information from \
a product review from an ecommerce site to give \
feedback to the Shipping department.

From the review below, delimited by triple backticks, \
extract the information relevant to shipping and \
delivery. Limit to 30 words.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


【Output】
Feedback: The product arrived a day earlier than expected, which was a pleasant surprise. Consider offering larger options for the same price to improve customer satisfaction.
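
Word limits in a prompt are a request, not a guarantee, and the model may also drift beyond the requested topic (the output above adds a pricing suggestion to a shipping extract). A small sketch for checking the length of a response after the fact; `count_words` and `within_limit` are hypothetical helper names, and whitespace splitting is only an approximation of how a reader would count words.

```python
def count_words(text):
    """Count whitespace-separated tokens as a rough word count."""
    return len(text.split())


def within_limit(text, max_words):
    """Return True when the text stays within the requested word budget."""
    return count_words(text) <= max_words


# Usage (needs a response from get_completion):
# if not within_limit(response, 30):
#     print("Response exceeds the requested length:", count_words(response))
```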

# Summarize multiple product reviews

In [None]:
review_1 = prod_review

# review for a standing lamp
review_2 = """
Needed a nice lamp for my bedroom, and this one \
had additional storage and not too high of a price \
point. Got it fast - arrived in 2 days. The string \
to the lamp broke during the transit and the company \
happily sent over a new one. Came within a few days \
as well. It was easy to put together. Then I had a \
missing part, so I contacted their support and they \
very quickly got me the missing piece! Seems to me \
to be a great company that cares about their customers \
and products.
"""

# review for an electric toothbrush
review_3 = """
My dental hygienist recommended an electric toothbrush, \
which is why I got this. The battery life seems to be \
pretty impressive so far. After initial charging and \
leaving the charger plugged in for the first week to \
condition the battery, I've unplugged the charger and \
been using it for twice daily brushing for the last \
3 weeks all on the same charge. But the toothbrush head \
is too small. I’ve seen baby toothbrushes bigger than \
this one. I wish the head was bigger with different \
length bristles to get between teeth better because \
this one doesn’t.  Overall if you can get this one \
around the $50 mark, it's a good deal. The manufactuer's \
replacements heads are pretty expensive, but you can \
get generic ones that're more reasonably priced. This \
toothbrush makes me feel like I've been to the dentist \
every day. My teeth feel sparkly clean!
"""

# review for a blender
review_4 = """
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""

reviews = [review_1, review_2, review_3, review_4]

In [None]:
for i, review in enumerate(reviews):
    prompt = f"""
    Your task is to generate a short summary of a product \
    review from an ecommerce site.

    Summarize the review below, delimited by triple \
    backticks, in at most 20 words.

    Review: ```{review}```
    """

    response = get_completion(prompt)
    print(i, response, "\n")

【Output】

0 Summary:
Adorable panda plush loved by daughter, but small for price. Arrived early, soft and cute.

1 Great lamp with storage, fast delivery, excellent customer service for missing parts. Company cares about customers.

2 Impressive battery life, small brush head, good deal for $50, generic replacement heads available, leaves teeth feeling clean.

3 17-piece system on sale for $49, prices increased later. Base quality not as good, motor issues after a year.
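
Printing inside the loop is fine for a quick look, but collecting the summaries makes them reusable later. A minimal sketch, assuming a hypothetical `summarize_reviews` helper that takes the completion function as a parameter (`complete`), so that `get_completion` or any stand-in can be passed in:

```python
def summarize_reviews(reviews, complete, max_words=20):
    """Return one summary per review, using `complete` to call the model."""
    summaries = []
    for review in reviews:
        prompt = (
            "Your task is to generate a short summary of a product "
            "review from an ecommerce site.\n\n"
            "Summarize the review below, delimited by triple backticks, "
            f"in at most {max_words} words.\n\n"
            f"Review: ```{review}```"
        )
        summaries.append(complete(prompt).strip())
    return summaries


# Usage (needs reviews and get_completion from the cells above):
# for i, summary in enumerate(summarize_reviews(reviews, get_completion)):
#     print(i, summary, "\n")
```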

# Experiment on Coursera Reviews

In [None]:
import requests
from bs4 import BeautifulSoup

# Step 1: Fetch the web page
url = "https://www.coursera.org/learn/classification-vector-spaces-in-nlp/reviews"
response = requests.get(url, timeout=30)
response.raise_for_status()  # Stop early on an HTTP error instead of parsing an error page

# Step 2: Parse the HTML content
soup = BeautifulSoup(response.content, "html.parser")

# Step 3: Identify and extract reviews
# Assumes each review sits inside <div class="review-text">; adjust if the page markup changes
reviews = []  # Note: this replaces the product-review list defined earlier
for review in soup.find_all("div", class_="review-text"):
    reviews.append(review.get_text(strip=True))  # Extract the text and strip surrounding whitespace

print(reviews)

['Filled StarFilled StarStarStarStarBy Forest L•Jun 18, 2020Lectures are too short and the topics are overly simplified. Assignments are toy examples.This is helpful (81)', 'Filled StarFilled StarStarStarStarBy Anand R•Jun 19, 2020This course seemed rushed, and navigated across depth and breadth very unsystematically. There were errors in the assignments and instructions, and the python code in the assignments was also very non-pythonic in many places.This is helpful (57)', "Filled StarFilled StarFilled StarStarStarBy Frozhen•Jun 21, 2020I came to this specialization from Andrew's twitter post, wanted to give it a try since Andrew's DL specialization is very good. However, this course is not taught by Andrew, and the video lectures sound like the instructor is just reading the script, not really inspiring me to follow the lecture since the videos are very dry.This is helpful (41)", 'Filled StarFilled StarStarStarStarBy Juan d L•Jun 25, 2020The course is interesting, and it is built car
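
Each scraped string runs the star labels, author, date, review text, and the "This is helpful (N)" widget together, and that residue may be why several summaries further down end with "Helpful overall". A sketch of splitting the fields apart before summarizing, assuming the layout seen in the output above; `parse_review` and `REVIEW_PATTERN` are hypothetical names, and the pattern would need adjusting if Coursera changes its markup.

```python
import re

# Layout assumed from the scraped output: stars, "By <author>•<date>", body, helpful count.
REVIEW_PATTERN = re.compile(
    r"^(?P<stars>(?:Filled Star|Star)+)"
    r"By (?P<author>.+?)•"
    r"(?P<date>[A-Z][a-z]{2} \d{1,2}, \d{4})"
    r"(?P<body>.*?)"
    r"(?:This is helpful \((?P<helpful>\d+)\))?$",
    re.DOTALL,
)


def parse_review(raw):
    """Split one scraped review string into rating, author, date, body, helpful."""
    match = REVIEW_PATTERN.match(raw)
    if match is None:
        # Unknown layout: keep the raw text as the body rather than guessing.
        return {"rating": None, "author": None, "date": None, "body": raw, "helpful": None}
    helpful = match.group("helpful")
    return {
        "rating": match.group("stars").count("Filled Star"),
        "author": match.group("author"),
        "date": match.group("date"),
        "body": match.group("body").strip(),
        "helpful": int(helpful) if helpful is not None else None,
    }
```

Passing only the `body` field to the summarization prompt would keep the widget text out of the model's input.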

In [None]:
!pip install pandas
import pandas as pd

# Convert reviews to a DataFrame
reviews_with_index = list(enumerate(reviews))  # Create a list of tuples (index, review)
df = pd.DataFrame(reviews_with_index, columns=['Index', 'Review Text']) # Using reviews_with_index and labeled columns

# Display the table
#print(df)
df



|   | Index | Review Text |
|---|-------|-------------|
| 0 | 0 | Filled StarFilled StarStarStarStarBy Forest L•... |
| 1 | 1 | Filled StarFilled StarStarStarStarBy Anand R•J... |
| 2 | 2 | Filled StarFilled StarFilled StarStarStarBy Fr... |
| 3 | 3 | Filled StarFilled StarStarStarStarBy Juan d L•... |
| 4 | 4 | Filled StarFilled StarFilled StarFilled StarSt... |
| 5 | 5 | Filled StarStarStarStarStarBy Achkan S•Aug 29,... |
| 6 | 6 | Filled StarFilled StarFilled StarStarStarBy Ol... |
| 7 | 7 | Filled StarFilled StarFilled StarStarStarBy Ag... |
| 8 | 8 | Filled StarFilled StarFilled StarStarStarBy ES... |
| 9 | 9 | Filled StarFilled StarFilled StarStarStarBy Cl... |


In [None]:
for i, review_text in enumerate(reviews):
    prompt = f"""
    Your task is to generate a short summary of a product \
    review from an ecommerce site.

    Summarize the review below, delimited by triple \
    backticks, in at most 12 words.

    Review: ```{review_text}```
    """

    response = get_completion(prompt)
    print(i, response, "\n")

# Note: `for i in reviews:` makes `i` each review string rather than an index,
# so `reviews[i]` raises "TypeError: list indices must be integers or slices, not str".
# Use enumerate(reviews) as above, or put the loop variable itself into the prompt.
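
The indexing error noted in this cell can be reproduced in isolation. The short sketch below uses a made-up list (`items`) to show why iterating over a list and then indexing with the loop variable fails, along with the two usual fixes.

```python
items = ["first review", "second review"]

# Iterating directly yields the strings themselves, not their positions.
try:
    for i in items:
        items[i]  # i is a str here, so list indexing fails
except TypeError as exc:
    error_message = str(exc)

# Fix 1: use the loop variable as the value.
upper = [text.upper() for text in items]

# Fix 2: use enumerate when the position is needed as well.
pairs = [(i, text) for i, text in enumerate(items)]
```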

【Output】

0 Short lectures, simplified topics, toy assignments; helpful overall.

1 Course rushed, errors in assignments, non-pythonic code. Helpful overall.

2 Not Andrew's course, dry lectures, not inspiring, but somewhat helpful.

3 Interesting course, but grading process focuses more on programming than NLP. Unorganized Slack.

4 Helpful review, wished for clearer explanation of LSH and Approximate Hashing.

5 Too many problems with course: lack of audience focus, short videos.

6 Good starting NLP course, lacking depth, needs more intuitive lectures.

7 "Interesting course but lacking depth and challenge, too easy to pass."

8 Interesting content but assignments are too simple with pre-written code. Helpful.

9 "Easy but lacking mathematical formalism, helpful for LSH. 3.5 stars."

10 Summary: Helpful review highlighting auto grader's strict style requirements.

11 Disappointing course with unhelpful content and assignments.

12 Good coverage, tips, and demos, but lacks clarity and test cases.

13 Summary: Lacks depth, resembles MOOCs, misses university-level courses.

14 Disappointing deeplearning.ai course, found material superfluous.

15 Disappointing assignments and short videos, lacking detailed explanations. Helpful overall.

16 Excellent course, challenging assignments, eagerly awaiting next course in series.

17 "Not worth it, outdated content on deep learning methods."

18 Excellent NLP introduction with deep learning fundamentals, highly recommended.

19 "Exciting, detailed lectures with some challenges, but informative and fun."

20 Great course with in-depth coverage of algorithms and numpy/matplotlib. Highly recommended.

21 Great course covering practical and conceptual aspects, highly recommended for NLP.

22 Short and concise video lectures, lacking details on vector subspaces.

23 Course 1 review: AI grading, structured assignments, autograder issues, and coding challenges.

24 Simple course with basic concepts, good feedback, lacks depth. Helpful.
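
Some of the summaries above start with "Summary:" or arrive wrapped in double quotes, while others do not. If uniform output matters, a light post-processing step is one option. A minimal sketch, assuming a hypothetical `clean_summary` helper; it handles only the two patterns visible in the output above.

```python
def clean_summary(text):
    """Strip a leading 'Summary:' label and one pair of wrapping double quotes."""
    cleaned = text.strip()
    if cleaned.lower().startswith("summary:"):
        cleaned = cleaned[len("summary:"):].strip()
    if len(cleaned) >= 2 and cleaned[0] == cleaned[-1] == '"':
        cleaned = cleaned[1:-1].strip()
    return cleaned


# Usage (needs a response from get_completion):
# print(i, clean_summary(response), "\n")
```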