# Text Analysis

The goal of this notebook is to practice and understand prompts used for text analysis.

Examples are sourced from the [Deeplearning.ai ChatGPT Prompt Engineering for Developers](https://learn.deeplearning.ai/chatgpt-prompt-eng/lesson/4/summarizing) course.

Model: [dolly-v2-3b](https://huggingface.co/databricks/dolly-v2-3b)

In [1]:
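import os
import time

# `!export` only sets the variable in a throwaway subshell, so set it in this
# process instead, before torch is imported.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch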
from langchain import PromptTemplate, LLMChain
from langchain.llms import HuggingFacePipeline
from transformers import pipeline

# print(torch.backends.mps.is_available())
# print(torch.backends.mps.is_built())
# print(torch.cuda.is_available())

In [2]:
generate_text = pipeline(model="databricks/dolly-v2-3b", torch_dtype=torch.bfloat16,
                         trust_remote_code=True, device_map="auto", return_full_text=True)

simple_prompt = PromptTemplate(
    input_variables=["instruction"],
    template="{instruction}")

prompt_with_context = PromptTemplate(
    input_variables=["instruction", "context"],
    template="{instruction}\n\nInput:\n{context}")

hf_pipeline = HuggingFacePipeline(pipeline=generate_text)

llm_chain = LLMChain(llm=hf_pipeline, prompt=simple_prompt)
llm_context_chain = LLMChain(llm=hf_pipeline, prompt=prompt_with_context)


def get_response(input_prompt, context=None):
    if context is None:
        return llm_chain.predict(instruction=input_prompt).lstrip()
    else:
        return llm_context_chain.predict(instruction=input_prompt, context=context).lstrip()


def reply_prompt(input_prompt, context=None):
    start = time.time()
    response = get_response(input_prompt, context)
    end = time.time()
    duration_seconds = end - start
    print(response)
    print(f"\nProcessing duration: {duration_seconds:0.2f} seconds")
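
To see what the model actually receives, it can help to render the two templates by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate` (an assumption that holds for simple `{placeholder}` templates like the ones above); `render_prompt` is a hypothetical helper and is not used elsewhere in this notebook.

```python
# Plain-Python stand-ins for simple_prompt and prompt_with_context.
# Assumption: for these templates, PromptTemplate substitutes placeholders
# the same way str.format does.
SIMPLE_TEMPLATE = "{instruction}"
CONTEXT_TEMPLATE = "{instruction}\n\nInput:\n{context}"


def render_prompt(instruction, context=None):
    """Return the final prompt string for the given instruction and optional context."""
    if context is None:
        return SIMPLE_TEMPLATE.format(instruction=instruction)
    return CONTEXT_TEMPLATE.format(instruction=instruction, context=context)


rendered = render_prompt(
    "Summarize the review in one sentence.",
    context="It arrived a day early.",
)
print(rendered)
```

Without a context the instruction is passed through unchanged; with a context, the text is appended under an `Input:` label.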


## Challenge: summarize text with a focus on specific topics

In [4]:
prod_review = """
Got this panda plush toy for my daughter's birthday, who loves it and takes it everywhere. It's soft and super cute, and its face has a friendly look. It's a bit small for what I paid though. I think there might be other options that are bigger for the same price. It arrived a day earlier than expected, so I got to play with it myself before I gave it to her. """

### Summarize with a character limit

In [30]:
prompt = f"""
Your task is to summarize a text delimited by <> using at most 50 characters.

Text: <{prod_review}>
"""

reply_prompt(prompt)

The text is delimited by < and >.

This panda plush toy for my daughter's birthday, who loves it and takes it everywhere. It's soft and super cute, and its face has a friendly look. It's a bit small for what I paid though. I think there might be other options that are bigger for the same price. It arrived a day earlier than expected, so I got to play with it myself before I gave it to her.

Processing duration: 147.59 seconds
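
The reply above is far longer than the requested 50 characters and mostly repeats the review. Language models operate on tokens rather than characters, so a character limit tends to be followed only loosely. A minimal sketch of a post-hoc check follows; `check_length` is a hypothetical helper, not part of the notebook's setup, and the sample string is the first sentence of the reply's second paragraph.

```python
def check_length(summary, max_chars):
    """Report whether a summary respects a character limit (whitespace-trimmed)."""
    length = len(summary.strip())
    return {"length": length, "limit": max_chars, "ok": length <= max_chars}


# Sample input taken from the model reply shown above.
sample = (
    "This panda plush toy for my daughter's birthday, "
    "who loves it and takes it everywhere."
)
report = check_length(sample, max_chars=50)
print(report)
```

A check like this could be applied to the string returned by `get_response` before deciding whether to retry with a stricter prompt.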


### Summarize with specific focus

In [21]:
prompt = f"""
Your task is to generate a short summary of a text delimited by <> in at most 90 characters and focusing on any aspects that mention shipping and delivery of the product.

Text: <{prod_review}>
"""

reply_prompt(prompt)

The text you provided mentions shipping and delivery of the product. It might be helpful to know that the product was delivered a day earlier than expected.

Processing duration: 118.23 seconds
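
When prompts are written by hand, the delimiter named in the instruction and the one actually wrapping the text can drift apart. One way to keep them in sync is to build both from the same arguments. This is a sketch under stated assumptions: `build_delimited_prompt` and its `{delimiters}` placeholder are hypothetical, and the instruction must not contain other curly braces.

```python
def build_delimited_prompt(instruction, text, open_delim="<", close_delim=">"):
    """Build a prompt whose instruction names the same delimiters that wrap the text.

    `instruction` must contain a `{delimiters}` placeholder.
    """
    # Describe symmetric delimiters (e.g. "###") once, asymmetric ones as a pair.
    if open_delim == close_delim:
        delimiters = open_delim
    else:
        delimiters = open_delim + close_delim
    return (
        f"\n{instruction.format(delimiters=delimiters)}\n\n"
        f"Text: {open_delim}{text.strip()}{close_delim}\n"
    )


prompt = build_delimited_prompt(
    "Your task is to generate a short summary of a text delimited by {delimiters} "
    "in at most 90 characters, focusing on shipping and delivery.",
    "It arrived a day earlier than expected.",
)
print(prompt)
```

The resulting string can be passed to `reply_prompt` in the same way as the hand-written prompts.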


### Use "extract" instead of "summarize"

In [29]:
prompt = f"""
From text delimited by <> extract relevant information.

Text: <{prod_review}>
"""

reply_prompt(prompt)

* Purchase price
$
* Delivery date
< 24 hours
* Retail value
$
* Product description
< soft, cute, friendly face; small size; longer lasts; arrives a day earlier than expected
* Website
* Customer service response
< Thanks for the great review! My factory takes pride in producing the highest quality plush animals. I would definitely buy from again.>

Processing duration: 135.52 seconds

panda plush toy,$19, friendly look, arrival a day earlier than expected

Processing duration: 92.03 seconds
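
Both replies contain details that the review never states, such as a price and a customer service response. A rough way to spot such items is to check whether each extracted phrase occurs verbatim in the source text. The sketch below is deliberately naive: `find_unsupported_items` is a hypothetical helper, it assumes a comma-separated reply, and it also flags harmless paraphrases (here "arrival" versus "arrived"). The review is shortened for the example.

```python
def find_unsupported_items(extracted, source):
    """Return extracted items that do not appear verbatim (case-insensitive) in the source."""
    source_lower = source.lower()
    items = [item.strip() for item in extracted.split(",") if item.strip()]
    return [item for item in items if item.lower() not in source_lower]


# Shortened copy of the review used above.
review = (
    "Got this panda plush toy for my daughter's birthday, who loves it and takes it "
    "everywhere. It's soft and super cute, and its face has a friendly look. "
    "It arrived a day earlier than expected."
)
# Second model reply shown above.
extracted = "panda plush toy,$19, friendly look, arrival a day earlier than expected"

unsupported = find_unsupported_items(extracted, review)
print(unsupported)
```

Items returned by the check are candidates for manual review rather than confirmed errors.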
