# Introduction to Personalized Customer Contact and Resolution Using GenAI Falcon Models

---
In this demo notebook, we demonstrate how to use the Falcon 7B model to summarize and respond to customer feedback in a personalized manner. {More on team goals here}

The Falcon model is a permissively licensed ([Apache-2.0](https://jumpstart-cache-prod-us-east-2.s3.us-east-2.amazonaws.com/licenses/Apache-License/LICENSE-2.0.txt)) open source model trained on the [RefinedWeb dataset](https://huggingface.co/datasets/tiiuae/falcon-refinedweb). 

---

The notebook is organized as follows.

1. [Deploy Falcon model for inference](#1.-Deploying-Falcon-model-for-inference)
2. Summarize customer feedback
3. Generate a personalized response email
4. Other examples
5. Clean up the endpoint

## 1. Deploying Falcon model for inference

NOTE: pip may report dependency-resolver conflicts during this step, as in the output below. These errors can be ignored.

In [2]:
!pip install sagemaker --quiet --upgrade --force-reinstall
!pip install ipywidgets==7.0.0 --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
daal4py 2021.6.0 requires daal==2021.4.0, which is not installed.
spyder 5.3.3 requires pyqt5<5.16, which is not installed.
spyder 5.3.3 requires pyqtwebengine<5.16, which is not installed.
autovizwidget 0.20.5 requires pandas<2.0.0,>=0.20.1, but you have pandas 2.1.1 which is incompatible.
awscli 1.29.14 requires botocore==1.31.14, but you have botocore 1.31.58 which is incompatible.
awscli 1.29.14 requires s3transfer<0.7.0,>=0.6.0, but you have s3transfer 0.7.0 which is incompatible.
distributed 2022.7.0 requires tornado<6.2,>=6.0.3, but you have tornado 6.3.2 which is incompatible.
hdijupyterutils 0.20.5 requires pandas<2.0.0,>=0.17.1, but you have pandas 2.1.1 which is incompatible.
jupyterlab 3.4.4 requires jupyter-server~=1.16, but you have jupyter-server 2.7.0 which is incompatible.
jupyterlab-server 2

In [3]:
model_id = "huggingface-llm-falcon-7b-instruct-bf16"

#### Deploy the model (this step may take a few minutes)

In [4]:
%%time
from sagemaker.jumpstart.model import JumpStartModel

my_model = JumpStartModel(model_id=model_id)
predictor = my_model.deploy()



sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
----------------------!
CPU times: user 1.37 s, sys: 243 ms, total: 1.62 s
Wall time: 11min 37s


#### Quick test to make sure the model is responding

In [5]:
%%time


prompt = "Tell me about Amazon SageMaker."

payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.9,
        "temperature": 0.8,
        "max_new_tokens": 1024,
        "stop": ["<|endoftext|>", "</s>"]
    }
}

response = predictor.predict(payload)
print(response[0]["generated_text"])


Amazon SageMaker is a cloud-based service that allows machine learning enthusiasts to easily build, train, and deploy machine learning models without requiring any specific technical expertise.
CPU times: user 14.4 ms, sys: 0 ns, total: 14.4 ms
Wall time: 1.23 s


#### Define function to query the model

In [6]:
def query_endpoint(payload):
    """Query the endpoint, print the input and output, and return the generated text"""
    response = predictor.predict(payload)
    print(f"\033[1m Input:\033[0m {payload['inputs']}")
    print(f"\033[1m Output:\033[0m {response[0]['generated_text']}")
    return response[0]['generated_text']
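
Every request in this notebook has the same shape: an `inputs` string plus a `parameters` dictionary. As a minimal sketch, a small helper can build that dictionary so each cell only states what differs. `build_payload` is a hypothetical name introduced here for illustration; it is not part of the SageMaker SDK.

```python
def build_payload(prompt, **parameters):
    """Return a request body in the {"inputs": ..., "parameters": ...} shape used in this notebook.

    Hypothetical convenience helper, not part of the SageMaker SDK.
    """
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    # Copy the keyword arguments so later changes by the caller do not leak in
    return {"inputs": prompt, "parameters": dict(parameters)}


payload = build_payload("Tell me about Amazon SageMaker.", max_new_tokens=64, temperature=0.8)
print(payload)
```

The resulting dictionary can be passed to `query_endpoint(payload)` in the same way as the hand-written payloads.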

## 2. Summarize customer review

#### 2a. Enter the customer feedback statement here in the format `review = """{customer review}"""`
NOTE: Use three double quotes at the beginning and end to ensure that any single or double quotes in the customer review are processed correctly.

In [7]:
review = """I have been very frustrated with the Att / Directv websites for several months or more!!! The only time the site lets me navigate to see my bill details is after I pay the bill and as long as I don't navigate away from that page. Otherwise, for rest  of the billing cycle I cannot see any bill details or history of bills paid. I called this week to return to paper billing  which is not ideal and very wasteful, but I cannot spend as much time as I do now online to navigate your website. I tried to order another remote because some of the number pads do not work. Your site redirects or says access denied. I'm beyond frustrated with your system software """
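
To illustrate the note about triple double quotes, here is a minimal sketch with a made-up review (the text of `sample_review` is hypothetical) that contains both quote styles and needs no escaping:

```python
# Hypothetical review text containing both an apostrophe and double quotes.
# Triple double quotes let it be pasted as-is, without backslash escapes.
sample_review = """The agent said "we'll call you back" but nobody did. I'm still waiting."""

print('"' in sample_review, "'" in sample_review)
```

One caveat: if the review itself ends with a double quote, leave a space before the closing `"""`, otherwise Python reads the string as ending one character early and reports a syntax error.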

#### 2b. Use the Model to summarize the review

In [8]:
# Summarization

payload = {
    "inputs":f"""{review}. 
            Summarize the review above:""",
    "parameters":{
        "max_new_tokens":200
        }
    }
#print(payload)
summary = query_endpoint(payload)
#print(summary)

[1m Input:[0m I have been very frustrated with the Att / Directv websites for several months or more!!! The only time the site lets me navigate to see my bill details is after I pay the bill and as long as I don't navigate away from that page. Otherwise, for rest  of the billing cycle I cannot see any bill details or history of bills paid. I called this week to return to paper billing  which is not ideal and very wasteful, but I cannot spend as much time as I do now online to navigate your website. I tried to order another remote because some of the number pads do not work. Your site redirects or says access denied. I'm beyond frustrated with your system software . 
            Summarize the review above:
[1m Output:[0m 
            The reviewer is frustrated with the website's navigation and billing options. They have tried to return to paper billing but are unable to do so due to the website's limitations. They also experienced issues with accessing remote controls.
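
The summary above starts with a newline and a run of spaces, and that whitespace is carried into the email prompt in section 3. A minimal sketch of normalizing the text first; `clean_generated_text` is a hypothetical helper name, and the sample string is copied from the output above:

```python
def clean_generated_text(text):
    """Collapse runs of whitespace (including newlines) into single spaces and trim both ends."""
    return " ".join(text.split())


raw = "\n            The reviewer is frustrated with the website's navigation and billing options."
print(clean_generated_text(raw))
```

Because this also removes paragraph breaks, it suits short summaries better than multi-paragraph outputs such as emails.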


## 3. Generate a personalized email

#### 3a. Generate email based on the summary

In [9]:
payload = {
    "inputs": f"""Compose an personal email showing empathy to the customer with the review: {summary}""",
    "parameters":{
        "max_new_tokens": 310,
        "no_repeat_ngram_size": 3
        }
}
email = query_endpoint(payload)

[1m Input:[0m Compose an personal email showing empathy to the customer with the review: 
            The reviewer is frustrated with the website's navigation and billing options. They have tried to return to paper billing but are unable to do so due to the website's limitations. They also experienced issues with accessing remote controls.
[1m Output:[0m 
Subject: Your feedback matters to us!

Dear valued customer,

Thank you for taking the time to share your feedback with us. We sincerely appreciate your patience and understanding. We understand that navigating our website and billing options can be frustrating, especially when encountering limitations. We want to assure you that we are continuously working to improve your experience.

If you have any further questions or concerns, please do not hesitate to contact our customer support team. We are here to assist you.

Thank you again for your support.

Best regards,
[Your Name]
[Your Company]


#### 3b. Alternative approach: generate the email directly from the review (rather than from the summary)

In [10]:
payload = {
    "inputs": f"""Compose an personal email showing empathy to the customer with the review: {review}""",
    "parameters":{
        "max_new_tokens": 310,
        "no_repeat_ngram_size": 3
        }
}
email = query_endpoint(payload)

[1m Input:[0m Compose an personal email showing empathy to the customer with the review: I have been very frustrated with the Att / Directv websites for several months or more!!! The only time the site lets me navigate to see my bill details is after I pay the bill and as long as I don't navigate away from that page. Otherwise, for rest  of the billing cycle I cannot see any bill details or history of bills paid. I called this week to return to paper billing  which is not ideal and very wasteful, but I cannot spend as much time as I do now online to navigate your website. I tried to order another remote because some of the number pads do not work. Your site redirects or says access denied. I'm beyond frustrated with your system software 
[1m Output:[0m / website. I'm a loyal customer for over 10 years and I'm not sure if I will continue to be a customer. I'm not sure if I will continue to be a customer.
Dear valued customer,

Thank you for reaching out to us and sharing your feedba
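
In the output above, the model first continues the review text and only then starts the email. One thing that may help (an assumption, not verified against this endpoint) is to close the review with terminal punctuation and separate it clearly from the instruction. A minimal sketch, in which `build_email_prompt` and the `Review:` / `Email:` labels are hypothetical choices:

```python
def build_email_prompt(review):
    """Build an email-generation prompt that delimits the review from the instruction."""
    text = " ".join(review.split())  # normalize whitespace and trim the ends
    if text and text[-1] not in ".!?":
        text += "."  # close the final sentence so the review reads as finished
    return (
        "Compose a personal email showing empathy to the customer who wrote the review below.\n\n"
        f'Review: """{text}"""\n\n'
        "Email:"
    )


print(build_email_prompt("I'm beyond frustrated with your system software "))
```

The returned string would be used as the `inputs` value of the payload in place of the inline f-string.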

## 4. Other examples

In [11]:
# Summarization

payload = {
    "inputs":"""I have been very frustrated with the Att / Directv websites for several months or more!!! The only time the site lets me 
            navigate to see my bill details is after I pay the bill and as long as I don't navigate away from that page. Otherwise, for rest 
            of the billing cycle I cannot see any bill details or history of bills paid. I called this week to return to paper billing 
            which is not ideal and very wasteful, but I cannot spend as much time as I do now online to navigate your website. I tried to 
            order another remote because some of the number pads do not work. Your site redirects or
            says access denied. I'm beyond frustrated with your system software !!!. 
            Summarize the review above:""",
    "parameters":{
        "max_new_tokens":200
        }
    }
query_endpoint(payload)

[1m Input:[0m I have been very frustrated with the Att / Directv websites for several months or more!!! The only time the site lets me 
            navigate to see my bill details is after I pay the bill and as long as I don't navigate away from that page. Otherwise, for rest 
            of the billing cycle I cannot see any bill details or history of bills paid. I called this week to return to paper billing 
            which is not ideal and very wasteful, but I cannot spend as much time as I do now online to navigate your website. I tried to 
            order another remote because some of the number pads do not work. Your site redirects or
            says access denied. I'm beyond frustrated with your system software !!!. 
            Summarize the review above:
[1m Output:[0m  I am frustrated with the website navigation and the 
            inability to access bill details.
I am sorry to hear that you are experiencing frustration with the website navigation and bill details

' I am frustrated with the website navigation and the \n            inability to access bill details.\nI am sorry to hear that you are experiencing frustration with the website navigation and bill details. I understand that navigating away from the page after paying the bill is causing issues. I would suggest reaching out to customer service to see if they can provide any solutions or workarounds. Additionally, I would suggest exploring other options for bill payment and viewing your bill details.'

In [12]:
payload = {
    "inputs": """Compose an personal email showing empathy to the customer with the review: I am frustrated with the website navigation and the 
            inability to access bill details.s""",
    "parameters":{
        "max_new_tokens": 310,
        "no_repeat_ngram_size": 3
        }
}
query_endpoint(payload)

[1m Input:[0m Compose an personal email showing empathy to the customer with the review: I am frustrated with the website navigation and the 
            inability to access bill details.s
[1m Output:[0m 
Dear valued customer,

I understand that navigating through a website can be frustrating, especially when it comes to accessing important details like your bill information. We want to make your experience as seamless as possible, and I assure you that we are working hard to improve the website's user interface.

In the meantime, I would like to personally reach out to you to offer my sincerest apologies for the inconvenience you have experienced. If you have any further questions or concerns, please do not hesitate to contact our customer support team.

Thank you for your patience and understanding.

Best regards,
[Your Name]
[Your Company]


"\nDear valued customer,\n\nI understand that navigating through a website can be frustrating, especially when it comes to accessing important details like your bill information. We want to make your experience as seamless as possible, and I assure you that we are working hard to improve the website's user interface.\n\nIn the meantime, I would like to personally reach out to you to offer my sincerest apologies for the inconvenience you have experienced. If you have any further questions or concerns, please do not hesitate to contact our customer support team.\n\nThank you for your patience and understanding.\n\nBest regards,\n[Your Name]\n[Your Company]"

In [17]:
payload = {
    "inputs": """Compose an personal email showing empathy to the customer whose NFL package failed over the weekend and 
    they couldn't watch the Broncos play\". favorite team Broncos""",
    "parameters":{
        "max_new_tokens": 310,
        "no_repeat_ngram_size": 3
        }
}
email = query_endpoint(payload)

[1m Input:[0m Compose an personal email showing empathy to the customer whose NFL package failed over the weekend and 
    they couldn't watch the Broncos play". favorite team Broncos
[1m Output:[0m 
Subject: We're Sorry Your Broncos Package Failed!

Dear [Customer],

We understand how disappointing it can be to not be able to watch your favorite team play. We're sorry to hear that your package failed over the weekend and we want to make it right.

We're offering a free upgrade to our NFL Sunday Ticket package for the remainder of the season. This includes access to all regular season and playoff games.

Please let us know if you'd like to take advantage of this offer. We'll be happy to help you get set up.

Thank you for being a loyal customer, and we look forward to serving you for many years to come.

Best regards,
[Your Name]
[Your Company]


In [14]:
# Sentiment-analysis
payload = {
    "inputs": """"I hate it when my phone battery dies."
                Sentiment: Negative
                ###
                Tweet: "My day has been :+1:"
                Sentiment: Positive
                ###
                Tweet: "This is the link to the article"
                Sentiment: Neutral
                ###
                Tweet: "This new music video was incredibile"
                Sentiment:""",
    "parameters": {
        "max_new_tokens":2
    }
}
query_endpoint(payload)

[1m Input:[0m "I hate it when my phone battery dies."
                Sentiment: Negative
                ###
                Tweet: "My day has been :+1:"
                Sentiment: Positive
                ###
                Tweet: "This is the link to the article"
                Sentiment: Neutral
                ###
                Tweet: "This new music video was incredibile"
                Sentiment:
[1m Output:[0m  Positive
                


' Positive\n                '
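
The sentiment prompt above follows a few-shot pattern: labeled examples separated by `###`, ending with an unlabeled tweet. As a minimal sketch, the prompt can be assembled from a list of examples and the label extracted from the padded output. The helper names `build_sentiment_prompt` and `parse_label` are hypothetical, and this version prefixes every example with `Tweet:`.

```python
def build_sentiment_prompt(examples, tweet):
    """Assemble a few-shot prompt from (tweet, sentiment) pairs plus the tweet to classify."""
    blocks = [f'Tweet: "{text}"\nSentiment: {label}' for text, label in examples]
    blocks.append(f'Tweet: "{tweet}"\nSentiment:')
    return "\n###\n".join(blocks)


def parse_label(generated_text):
    """Return the first whitespace-delimited token of the model output, or '' if there is none."""
    tokens = generated_text.split()
    return tokens[0] if tokens else ""


examples = [
    ("I hate it when my phone battery dies.", "Negative"),
    ("My day has been :+1:", "Positive"),
    ("This is the link to the article", "Neutral"),
]
prompt = build_sentiment_prompt(examples, "This new music video was incredible")
print(prompt)

# The raw return value shown above includes a leading space and trailing whitespace
print(parse_label(" Positive\n                "))
```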

In [15]:
# Summarization

payload = {
    "inputs":"""Starting today, the state-of-the-art Falcon 40B foundation model from Technology
    Innovation Institute (TII) is available on Amazon SageMaker JumpStart, SageMaker's machine learning (ML) hub
    that offers pre-trained models, built-in algorithms, and pre-built solution templates to help you quickly get
    started with ML. You can deploy and use this Falcon LLM with a few clicks in SageMaker Studio or
    programmatically through the SageMaker Python SDK.
    Falcon 40B is a 40-billion-parameter large language model (LLM) available under the Apache 2.0 license that
    ranked #1 in Hugging Face Open LLM leaderboard, which tracks, ranks, and evaluates LLMs across multiple
    benchmarks to identify top performing models. Since its release in May 2023, Falcon 40B has demonstrated
    exceptional performance without specialized fine-tuning. To make it easier for customers to access this
    state-of-the-art model, AWS has made Falcon 40B available to customers via Amazon SageMaker JumpStart.
    Now customers can quickly and easily deploy their own Falcon 40B model and customize it to fit their specific
    needs for applications such as translation, question answering, and summarizing information.
    Falcon 40B are generally available today through Amazon SageMaker JumpStart in US East (Ohio),
    US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Mumbai),
    Europe (London), Europe (Frankfurt), Europe (Ireland), and Canada (Central),
    with availability in additional AWS Regions coming soon. To learn how to use this new feature,
    please see SageMaker JumpStart documentation, the Introduction to SageMaker JumpStart –
    Text Generation with Falcon LLMs example notebook, and the blog Technology Innovation Institute trainsthe
    state-of-the-art Falcon LLM 40B foundation model on Amazon SageMaker. Summarize the article above:""",
    "parameters":{
        "max_new_tokens":200
        }
    }
query_endpoint(payload)

[1m Input:[0m Starting today, the state-of-the-art Falcon 40B foundation model from Technology
    Innovation Institute (TII) is available on Amazon SageMaker JumpStart, SageMaker's machine learning (ML) hub
    that offers pre-trained models, built-in algorithms, and pre-built solution templates to help you quickly get
    started with ML. You can deploy and use this Falcon LLM with a few clicks in SageMaker Studio or
    programmatically through the SageMaker Python SDK.
    Falcon 40B is a 40-billion-parameter large language model (LLM) available under the Apache 2.0 license that
    ranked #1 in Hugging Face Open LLM leaderboard, which tracks, ranks, and evaluates LLMs across multiple
    benchmarks to identify top performing models. Since its release in May 2023, Falcon 40B has demonstrated
    exceptional performance without specialized fine-tuning. To make it easier for customers to access this
    state-of-the-art model, AWS has made Falcon 40B available to customers via Amaz

'\n    Technology Innovation Institute (TII) has made the state-of-the-art Falcon 40B model available on Amazon SageMaker JumpStart,\n    which offers pre-trained models, built-in algorithms, and pre-built solution templates to help you quickly get\n    started with ML. You can deploy and use this Falcon LLM with a few clicks in SageMaker Studio or programmatically\n    through the SageMaker Python SDK.'

#### Supported inference parameters to tweak if desired

***
Some of the supported parameters while performing inference are the following:

* **max_length:** Model generates text until the output length (which includes the input context length) reaches `max_length`. If specified, it must be a positive integer.
* **max_new_tokens:** Model generates text until the output length (excluding the input context length) reaches `max_new_tokens`. If specified, it must be a positive integer.
* **num_beams:** Number of beams used in beam search. If specified, it must be an integer greater than or equal to `num_return_sequences`.
* **no_repeat_ngram_size:** Model ensures that a sequence of words of `no_repeat_ngram_size` is not repeated in the output sequence. If specified, it must be a positive integer greater than 1.
* **temperature:** Controls the randomness in the output. Higher temperature results in output sequence with low-probability words and lower temperature results in output sequence with high-probability words. If `temperature` -> 0, it results in greedy decoding. If specified, it must be a positive float.
* **early_stopping:** If True, text generation is finished when all beam hypotheses reach the end of sentence token. If specified, it must be boolean.
* **do_sample:** If True, sample the next word as per the likelihood. If specified, it must be boolean.
* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.
* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.
* **return_full_text:** If True, input text will be part of the output generated text. If specified, it must be boolean. The default value for it is False.
* **stop:** If specified, it must be a list of strings. Text generation stops if any one of the specified strings is generated.

We may specify any subset of the parameters mentioned above while invoking an endpoint. 
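
The constraints in the list above can be checked on the client before a request is sent. The following is a minimal sketch based only on that list; `validate_parameters` is a hypothetical helper, it does not replace whatever validation the endpoint performs, and two points are assumptions marked in the comments (the default for `num_return_sequences` and inclusive bounds for `top_p`).

```python
def validate_parameters(params):
    """Return a list of problems found in an inference `parameters` dict (empty if none).

    Hypothetical client-side check based only on the constraints listed above.
    """
    def is_int(value):
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)

    def is_number(value):
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    problems = []
    for name in ("max_length", "max_new_tokens", "top_k"):
        if name in params and not (is_int(params[name]) and params[name] > 0):
            problems.append(f"{name} must be a positive integer")
    if "no_repeat_ngram_size" in params:
        value = params["no_repeat_ngram_size"]
        if not (is_int(value) and value > 1):
            problems.append("no_repeat_ngram_size must be an integer greater than 1")
    if "num_beams" in params:
        value = params["num_beams"]
        # Assumption: num_return_sequences defaults to 1 when it is not given
        minimum = params.get("num_return_sequences", 1)
        if not (is_int(value) and is_int(minimum) and value >= minimum):
            problems.append("num_beams must be an integer >= num_return_sequences")
    if "temperature" in params:
        value = params["temperature"]
        if not (is_number(value) and value > 0):
            problems.append("temperature must be a positive float")
    if "top_p" in params:
        value = params["top_p"]
        # Assumption: the bounds are treated as inclusive here
        if not (is_number(value) and 0 <= value <= 1):
            problems.append("top_p must be a float between 0 and 1")
    for name in ("early_stopping", "do_sample", "return_full_text"):
        if name in params and not isinstance(params[name], bool):
            problems.append(f"{name} must be a boolean")
    if "stop" in params:
        value = params["stop"]
        if not (isinstance(value, list) and all(isinstance(s, str) for s in value)):
            problems.append("stop must be a list of strings")
    return problems


# Parameters from the quick test earlier in this notebook
quick_test_params = {
    "do_sample": True,
    "top_p": 0.9,
    "temperature": 0.8,
    "max_new_tokens": 1024,
    "stop": ["<|endoftext|>", "</s>"],
}
print(validate_parameters(quick_test_params))
print(validate_parameters({"max_new_tokens": 0, "top_p": 1.5}))
```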

For more parameters and information on the Hugging Face LLM Deep Learning Container (DLC), please see [this article](https://huggingface.co/blog/sagemaker-huggingface-llm#4-run-inference-and-chat-with-our-model).
***

The pre-trained model was not specifically trained to generate unanswerable questions. Despite the input prompt, it tends to generate questions that can be answered from the text. The fine-tuned model is generally better at this task, and the improvement is more prominent for larger models.

## 5. Clean up the endpoint

In [18]:
# Delete the SageMaker endpoint
predictor.delete_model()
predictor.delete_endpoint()
#instruction_tuned_predictor.delete_model()
#instruction_tuned_predictor.delete_endpoint()

Exception: One or more models cannot be deleted, please retry. 
Failed models: hf-llm-falcon-7b-instruct-bf16-2023-10-03-14-25-05-053
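
In the run above, the exception appears to come from `delete_model()`, which means the `delete_endpoint()` line after it in the same cell was not executed. A minimal sketch of a cleanup routine that attempts both calls and reports what failed; `cleanup` is a hypothetical helper, and `_FakePredictor` is a stand-in used only to show the control flow without a live endpoint. This does not address why the model deletion failed.

```python
def cleanup(predictor):
    """Attempt both delete calls and return a list of error messages (empty if all succeeded).

    Hypothetical helper; it relies only on the `delete_model()` and
    `delete_endpoint()` methods already used in the cell above.
    """
    errors = []
    for step in (predictor.delete_model, predictor.delete_endpoint):
        try:
            step()
        except Exception as exc:  # keep going so one failure does not skip the other call
            errors.append(f"{step.__name__}: {exc}")
    return errors


class _FakePredictor:
    """Stand-in that mimics the failure seen above; not a SageMaker class."""

    def __init__(self):
        self.endpoint_deleted = False

    def delete_model(self):
        raise Exception("One or more models cannot be deleted, please retry.")

    def delete_endpoint(self):
        self.endpoint_deleted = True


fake = _FakePredictor()
errors = cleanup(fake)
print(errors)
print(fake.endpoint_deleted)
```

With the real `predictor`, any entries returned in `errors` indicate resources that may still exist and should be checked in the SageMaker console.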