# Text summarization with ChatGPT
In this lesson, you will summarize text with a focus on specific topics.

## Setup

In [1]:
from openai import OpenAI
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # loaded from the local .env file

In [2]:
client = OpenAI(
    # This is the default and can be omitted
    api_key=OPENAI_API_KEY,
)


def get_completion(prompt, model="gpt-3.5-turbo"): # Andrew mentioned that the prompt/ completion paradigm is preferable for this class
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message.content


## Text to summarize

In [3]:
prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look. It's \
a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price. It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her.
"""



## Summarize with a word/sentence/character limit

In [4]:
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


Soft, cute panda plush toy loved by daughter, arrived early. Small for price, but friendly face and quality. Consider larger options for same cost.
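The model does not always respect a word limit exactly, so it can help to check the length of a response. The sketch below counts whitespace-separated words, which is only an approximation of how a reader would count them; `word_count` and `within_limit` are hypothetical helper names, not part of the lesson code.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in text."""
    return len(text.split())


def within_limit(text: str, max_words: int = 30) -> bool:
    """Return True when text has at most max_words words."""
    return word_count(text) <= max_words


# The summary printed above, copied here so the example runs without an API call.
summary = (
    "Soft, cute panda plush toy loved by daughter, arrived early. "
    "Small for price, but friendly face and quality. "
    "Consider larger options for same cost."
)
print(word_count(summary), within_limit(summary))
```

In the notebook, the same check could be applied directly to `response` after each call to `get_completion`.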


## Summarize with a focus on shipping and delivery

In [5]:
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
Shipping department. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words, and focusing on any aspects \
that mention shipping and delivery of the product. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


The customer was pleased with the early delivery of the panda plush toy, but felt it was slightly small for the price paid.


## Summarize with a focus on price and value

In [6]:
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
pricing department, responsible for determining the \
price of the product.  

Summarize the review below, delimited by triple 
backticks, in at most 30 words, and focusing on any aspects \
that are relevant to the price and perceived value. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)


The panda plush toy is loved for its softness and cuteness, but some customers feel it's a bit small for the price.


#### Comment
- The focused summaries still include details unrelated to the requested topic; for example, the shipping summary also mentions size and price.

## Try "extract" instead of "summarize"

In [48]:
prompt = f"""
Your task is to extract relevant information from \
a product review from an ecommerce site to give \
feedback to the Shipping department. 

From the review below, delimited by triple backticks, \
extract the information relevant to shipping and \
delivery. Limit to 30 words. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

Feedback: The shipping was faster than expected, arriving a day early. Customer suggests offering larger options for the same price.
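The shipping and pricing prompts differ only in the department, the topic, and the word limit, so they could be generated from one template. The following is a minimal sketch under that assumption; `build_extract_prompt` and its parameters are hypothetical names introduced for illustration.

```python
TICKS = "`" * 3  # triple-backtick delimiter used around the review text


def build_extract_prompt(
    review: str, department: str, topic: str, max_words: int = 30
) -> str:
    """Build an 'extract' prompt focused on one department and topic."""
    return (
        "Your task is to extract relevant information from a product review "
        f"from an ecommerce site to give feedback to the {department} department.\n\n"
        "From the review below, delimited by triple backticks, extract the "
        f"information relevant to {topic}. Limit to {max_words} words.\n\n"
        f"Review: {TICKS}{review}{TICKS}"
    )


# Example with a short made-up review; in the notebook, pass prod_review instead.
print(build_extract_prompt("Arrived a day early.", "Shipping", "shipping and delivery"))
```

The returned string can be passed to `get_completion` in the same way as the hand-written prompts.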


## Summarize multiple product reviews

In [7]:

review_1 = prod_review 

# review for a standing lamp
review_2 = """
Needed a nice lamp for my bedroom, and this one \
had additional storage and not too high of a price \
point. Got it fast - arrived in 2 days. The string \
to the lamp broke during the transit and the company \
happily sent over a new one. Came within a few days \
as well. It was easy to put together. Then I had a \
missing part, so I contacted their support and they \
very quickly got me the missing piece! Seems to me \
to be a great company that cares about their customers \
and products. 
"""

# review for an electric toothbrush
review_3 = """
My dental hygienist recommended an electric toothbrush, \
which is why I got this. The battery life seems to be \
pretty impressive so far. After initial charging and \
leaving the charger plugged in for the first week to \
condition the battery, I've unplugged the charger and \
been using it for twice daily brushing for the last \
3 weeks all on the same charge. But the toothbrush head \
is too small. I’ve seen baby toothbrushes bigger than \
this one. I wish the head was bigger with different \
length bristles to get between teeth better because \
this one doesn’t.  Overall if you can get this one \
around the $50 mark, it's a good deal. The manufactuer's \
replacements heads are pretty expensive, but you can \
get generic ones that're more reasonably priced. This \
toothbrush makes me feel like I've been to the dentist \
every day. My teeth feel sparkly clean! 
"""

# review for a blender
review_4 = """
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""

reviews = [review_1, review_2, review_3, review_4]



In [8]:
for i in range(len(reviews)):
    prompt = f"""
    Your task is to generate a short summary of a product \
    review from an ecommerce site. 

    Summarize the review below, delimited by triple \
    backticks in at most 20 words. 

    Review: ```{reviews[i]}```
    """

    response = get_completion(prompt)
    print(i, response, "\n")



0 Summary: 
Adorable panda plush loved by daughter, but small for price. Arrived early, soft and cute. 

1 Great lamp with storage, fast delivery, excellent customer service for missing parts. Company cares about customers. 

2 Impressive battery life, small toothbrush head, good deal for $50, generic replacement heads available, leaves teeth feeling clean. 

3 17-piece system on sale for $49, quality decline, motor issue after a year, price increase, customer service, brand loyalty. 
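The loop above can also be wrapped in a function that collects the summaries in a list instead of printing them. The sketch below is one possible arrangement, not the lesson's own code: `summarize_reviews` is a hypothetical name, and the completion function is passed in as an argument so that the example runs offline with a stand-in (`fake_complete`) in place of `get_completion`.

```python
from typing import Callable, List

TICKS = "`" * 3  # triple-backtick delimiter used around the review text


def summarize_reviews(
    reviews: List[str],
    complete: Callable[[str], str],
    max_words: int = 20,
) -> List[str]:
    """Summarize each review with the supplied completion function."""
    summaries = []
    for review in reviews:
        prompt = (
            "Your task is to generate a short summary of a product review "
            "from an ecommerce site.\n\n"
            "Summarize the review below, delimited by triple backticks, "
            f"in at most {max_words} words.\n\n"
            f"Review: {TICKS}{review}{TICKS}"
        )
        summaries.append(complete(prompt))
    return summaries


def fake_complete(prompt: str) -> str:
    """Hypothetical stand-in for get_completion so the sketch runs offline."""
    return f"[summary of a {len(prompt)}-character prompt]"


demo = summarize_reviews(["Great lamp.", "Tiny brush head."], fake_complete)
for i, summary in enumerate(demo):
    print(i, summary)
```

In the notebook, calling `summarize_reviews(reviews, get_completion)` would send one request per review.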



# Exercise
 - Write prompts similar to the ones we did in class. 
     - Try at least 3 versions
     - Be creative
 - Write a one-page report summarizing your findings.
     - Were there variations that didn't work well, i.e., where GPT either hallucinated or was wrong?
 - What did you learn?

In [9]:
!pip install requests beautifulsoup4 wikipedia-api



In [10]:
import requests
from bs4 import BeautifulSoup
import wikipediaapi

def fetch_text_from_url(url):
    """
    Fetches and extracts clean text content from a Wikipedia page URL using BeautifulSoup.
    :param url: str, the Wikipedia page URL
    :return: str, the extracted text content
    """
    try:
        response = requests.get(url, timeout=10)  # do not hang on a slow connection
        if response.status_code == 200:
            soup = BeautifulSoup(response.text, 'html.parser')
            # The main article body lives in the div with id "mw-content-text"
            content_div = soup.find("div", {"id": "mw-content-text"})
            if content_div is None:
                return "Failed to find the page content."
            paragraphs = content_div.find_all("p")
            text_content = "\n".join([p.get_text() for p in paragraphs])
            return text_content.strip()
        else:
            return f"Failed to fetch page. Status code: {response.status_code}"
    except Exception as e:
        return f"Error: {e}"

def fetch_text_from_api(page_title):
    """
    Fetches the full text content of a Wikipedia page using the wikipedia-api library.
    :param page_title: str, the title of the Wikipedia page
    :return: str, the full page text content
    """
    try:
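        # Recent wikipedia-api releases require a user agent; the value below is a placeholder
        wiki_wiki = wikipediaapi.Wikipedia(
            user_agent="summarization-lesson (me@example.com)", language="en"
        )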
        page = wiki_wiki.page(page_title)
        if page.exists():
            return page.text.strip()
        else:
            return f"The page '{page_title}' does not exist."
    except Exception as e:
        return f"Error: {e}"

def wiki_page(url_or_title, method='url'):
    """
    Fetches Wikipedia content for a given URL or title using the specified method.
    :param url_or_title: str, URL for 'url' method or title for 'api' method
    :param method: str, 'url' for BeautifulSoup or 'api' for wikipedia-api
    :return: str, extracted text content
    """
    if method == 'url':
        return fetch_text_from_url(url_or_title)
    elif method == 'api':
        return fetch_text_from_api(url_or_title)
    else:
        return "Invalid method. Use 'url' or 'api'."

if __name__ == "__main__":
    # Example pages
    Gandalf = wiki_page("https://en.wikipedia.org/wiki/Gandalf", method="url")
    Frodo = wiki_page("https://en.wikipedia.org/wiki/Frodo_Baggins", method="url")
    Aragorn = wiki_page("Aragorn", method="api")

    print("Gandalf:\n", Gandalf[:500], "\n\n")
    print("Frodo:\n", Frodo[:500], "\n\n")
    print("Aragorn:\n", Aragorn[:500], "\n")


Gandalf:
 Gandalf is a protagonist in J. R. R. Tolkien's novels The Hobbit and The Lord of the Rings. He is a wizard, one of the Istari order, and the leader of the Company of the Ring. Tolkien took the name "Gandalf" from the Old Norse "Catalogue of Dwarves" (Dvergatal) in the Völuspá.

As a wizard and the bearer of one of the Three Rings, Gandalf has great power, but works mostly by encouraging and persuading. He sets out as Gandalf the Grey, possessing great knowledge and travelling continually. Gandal 


Frodo:
 Frodo Baggins (Westron: Maura Labingi) is a fictional character in J. R. R. Tolkien's writings and one of the protagonists in The Lord of the Rings. Frodo is a hobbit of the Shire who inherits the One Ring from his cousin Bilbo Baggins, described familiarly as "uncle", and undertakes the quest to destroy it in the fires of Mount Doom in Mordor. He is mentioned in Tolkien's posthumously published works, The Silmarillion and Unfinished Tales.

Frodo is repeatedly wounded duri

In [12]:
Gandalf = wiki_page("https://en.wikipedia.org/wiki/Gandalf", method="url")
Frodo = wiki_page("https://en.wikipedia.org/wiki/Frodo_Baggins", method="url")
Vader = wiki_page("https://en.wikipedia.org/wiki/Darth_Vader", method="url")
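
Full Wikipedia pages can be long, and a prompt that embeds the whole page may exceed the model's input limit. One simple precaution, sketched below, is to trim the text before building the prompt. `truncate_text` is a hypothetical helper and the default of 8000 characters is an arbitrary assumption, not a documented limit.

```python
def truncate_text(text: str, max_chars: int = 8000) -> str:
    """Cut text to at most max_chars, preferring a paragraph (newline) boundary."""
    if len(text) <= max_chars:
        return text
    cut = text.rfind("\n", 0, max_chars)
    if cut <= 0:  # no usable newline before the limit: fall back to a hard cut
        cut = max_chars
    return text[:cut].rstrip()


# Small made-up example; in the notebook, pass Gandalf, Frodo, or Vader instead.
page = "First paragraph.\nSecond paragraph.\nThird paragraph."
print(truncate_text(page, max_chars=40))
```

Cutting at a newline keeps whole paragraphs, at the cost of sometimes returning noticeably less than `max_chars` characters.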

In [13]:
prompt = f"""
Your task is to generate a short summary of the character bio.

Summarize it and list the movies in which the character appears.

Bio: ```{Gandalf}```
"""

response = get_completion(prompt)
print(response)


Gandalf is a powerful wizard and a key character in J.R.R. Tolkien's novels The Hobbit and The Lord of the Rings. He is known for his wisdom, guidance, and leadership in the fight against the Dark Lord Sauron. Gandalf is associated with fire and is a member of the Istari order. He plays a crucial role in the quest to destroy the One Ring and defeat Sauron. Gandalf is portrayed by Ian McKellen in Peter Jackson's film adaptations of The Lord of the Rings and The Hobbit. 

Movies:
1. The Lord of the Rings film series (2001–2003)
2. The Hobbit film series (2012–2014)


In [14]:
prompt = f"""
Your task is to generate a short summary of the character's actions.

Summarize them and list all of his bad actions.

Bio: ```{Vader}```
"""

response = get_completion(prompt)
print(response)

Summary:
Darth Vader is a fictional character in the Star Wars franchise who starts as Anakin Skywalker and turns to the dark side to become a Sith Lord. He serves Emperor Palpatine, hunts down Jedi, and attempts to crush the Rebel Alliance. He is the father of Luke Skywalker and Leia Organa. Vader is portrayed by David Prowse physically and voiced by James Earl Jones. He is known for his iconic mask and breathing sound effect.

Bad Actions:
1. Betrays the Jedi by joining the Sith and slaughtering younglings.
2. Kills Imperial Minister Maketh Tua for trying to defect to the Rebellion.
3. Tortures and freezes Han Solo in carbonite.
4. Massacres Tusken Raiders out of grief and rage.
5. Betrays Mace Windu to save Palpatine.
6. Murders the Jedi in the Jedi Temple.
7. Lies to Vader about Padmé's death.
8. Hunts down and kills surviving Jedi.
9. Orders the construction of the Death Star.
10. Attempts to turn Luke to the dark side.
11. Engages in lightsaber duels with Obi-Wan Kenobi and Luke 

In [16]:
characters = [Vader, Gandalf, Frodo]


# Iterate over the characters list (not reviews, which has a different length)
for i in range(len(characters)):
    prompt = f"""
    Your task is to generate an introduction of the character in at most 20 words.

    Give the 3 characters he/she is the most connected to.

    Character: ```{characters[i]}```
    """

    response = get_completion(prompt)
    print(i, response, "\n")

0 Introduction: Darth Vader, the iconic Sith Lord from the Star Wars franchise, is a powerful and tragic figure in cinema history.

Most connected characters:
1. Chancellor Palpatine
2. Obi-Wan Kenobi
3. Luke Skywalker 

1 Gandalf is a wise and powerful wizard on a mission to defeat Sauron. He is connected to Frodo, Aragorn, and Bilbo. 

2 Frodo Baggins, a hobbit of the Shire, inherits the One Ring and embarks on a quest to destroy it. 

Most connected to:
1. Samwise Gamgee
2. Gandalf
3. Aragorn 