# Introduction to SageMaker JumpStart - Text Generation

---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

---

In this demo notebook, we demonstrate how to use the SageMaker Python SDK to deploy a SageMaker JumpStart text generation model and invoke the endpoint.

## Setup
First, upgrade to the latest SageMaker Python SDK so that all available models are deployable.

In [1]:
%pip install --upgrade --quiet sagemaker

Note: you may need to restart the kernel to use updated packages.


Select the model to deploy. This example hard-codes the model ID and version; you can substitute any text generation model available in SageMaker JumpStart.

In [4]:
model_id = "meta-textgeneration-llama-3-8b-instruct"
model_version = "2.2.1"

In [5]:
accept_eula = True

## Deploy model

Create a `JumpStartModel` object, which initializes default model configurations conditioned on the selected instance type. JumpStart already sets a default instance type, but you can deploy the model on other instance types by passing `instance_type` to the `JumpStartModel` class.

In [6]:
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id=model_id, model_version=model_version, instance_type="ml.g5.2xlarge")

You can now deploy the model using SageMaker JumpStart. If the selected model is gated, you must accept the end-user license agreement (EULA) before deployment by passing `accept_eula=True` to the `deploy` method. The deployment might take a few minutes.

In [7]:
predictor = model.deploy(accept_eula=accept_eula)

-----------------!

## Invoke the endpoint

This section demonstrates how to invoke the endpoint using example payloads that are retrieved programmatically from the `JumpStartModel` object. You can replace these example payloads with your own payloads.
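If you want to send your own prompt instead of a retrieved example, the payload can be assembled by hand. The following is a minimal sketch that mirrors the structure of the example payload printed in the output of this section; the helper name `build_llama3_payload` and its default parameter values are illustrative assumptions, not part of the SageMaker SDK.

```python
def build_llama3_payload(user_message, max_new_tokens=256, temperature=0.6, top_p=0.9):
    """Wrap a single user message in the Llama 3 instruct chat template.

    Hypothetical helper: the template and parameter names are copied from the
    example payload shown in this notebook's output.
    """
    prompt = (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "top_p": top_p,
            "temperature": temperature,
            # Stop generating once the model emits its end-of-turn token.
            "stop": "<|eot_id|>",
        },
    }


payload = build_llama3_payload("what is the recipe of mayonnaise?")
print(payload["inputs"])
```

The resulting dictionary can then be passed to `predictor.predict(payload)` in the same way as `payload.body` in the loop that follows.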

In [8]:
example_payloads = model.retrieve_all_examples()

In [9]:
for payload in example_payloads:
    response = predictor.predict(payload.body)
    response = response[0] if isinstance(response, list) else response
    print("Input:\n", payload.body, end="\n\n")
    print("Output:\n", response["generated_text"].strip(), end="\n\n\n")

Input:
 {'inputs': '<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nwhat is the recipe of mayonnaise?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n', 'parameters': {'max_new_tokens': 256, 'top_p': 0.9, 'temperature': 0.6, 'details': True, 'stop': '<|eot_id|>'}}

Output:
 The classic condiment! Mayonnaise is a thick, creamy sauce made from a combination of oil, egg yolks, acid (such as vinegar or lemon juice), and seasonings. Here's a simple recipe to make mayonnaise at home:

**Ingredients:**

* 2 egg yolks
* 1 tablespoon lemon juice or vinegar (such as apple cider vinegar or white wine vinegar)
* 1/2 teaspoon Dijon mustard (optional, but recommended for flavor)
* 1/2 cup (120 ml) neutral-tasting oil, such as canola, grapeseed, or sunflower oil
* Salt, to taste
* Water, as needed

**Instructions:**

1. **Start with room temperature ingredients**: Make sure your egg yolks, lemon juice, and oil are at room temperature. This will help the mixture emulsify smoothl

# BLEU

In [10]:
!pip install nltk sacrebleu
!sacrebleu --version

sacrebleu 2.4.2


In [11]:
import sacrebleu

# Reference text (string)
reference_text = "The patient is referred due to the rapid progression of the tumor despite initial treatment modalities. Further evaluation and management from a specialized neuro-oncology team are essential for this recurrent and aggressive tumor."

# Generated text (string)
candidate_text = "The patient is referred due to the rapid progression of the tumor despite initial treatment modalities. Further evaluation and management from a specialized neuro-oncology team are essential for this recurrent and aggressive tumor."

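# Compute the BLEU score.
# sacrebleu.corpus_bleu weights the 1-gram to 4-gram precisions equally,
# i.e. (0.25, 0.25, 0.25, 0.25), and does not take a weights argument.
# The tuple below illustrates a weighting that only considers 1-grams and
# 2-grams, but it is not passed to sacrebleu and has no effect on the score.
weights = (0.5, 0.5, 0, 0)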

bleu = sacrebleu.corpus_bleu([candidate_text], [[reference_text]])
print(f"BLEU score (): {bleu.score}")
print(bleu)

BLEU score (): 100.00000000000004
BLEU = 100.00 100.0/100.0/100.0/100.0 (BP = 1.000 ratio = 1.000 hyp_len = 35 ref_len = 35)
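
To see what n-gram weights such as `(0.5, 0.5, 0, 0)` would change, here is a small pure-Python sketch of a weighted sentence-level BLEU. It is an illustration under simplifying assumptions (whitespace tokenization, a single reference, no smoothing) and is not sacrebleu's implementation, so its numbers will generally differ from sacrebleu's on real text. The function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, weights=(0.25, 0.25, 0.25, 0.25)):
    """Unsmoothed, single-reference BLEU on whitespace tokens (0 to 100)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n, weight in enumerate(weights, start=1):
        if weight == 0:
            continue  # orders with zero weight do not contribute
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: each candidate n-gram counts at most as often
        # as it appears in the reference.
        matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precision_sum += weight * math.log(matched / total)
    # Brevity penalty for candidates shorter than the reference.
    brevity_penalty = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100.0 * brevity_penalty * math.exp(log_precision_sum)


cand_example = "the cat sat on the mat"
ref_example = "the cat is on the mat"
bigram_score = simple_bleu(cand_example, ref_example, weights=(0.5, 0.5, 0, 0))
default_score = simple_bleu(cand_example, ref_example)
print(round(bigram_score, 2), default_score)
```

In this toy pair the candidate shares no 4-gram with the reference, so the unsmoothed default weighting collapses to zero while the 1-gram/2-gram weighting does not. This is one reason short texts are often scored with lower n-gram orders or with smoothing.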


# ROUGE

In [12]:
!pip install rouge



In [13]:
from rouge import Rouge

# Compute the ROUGE scores
rouge = Rouge()
scores = rouge.get_scores(candidate_text, reference_text, avg=True)  # with a single candidate/reference pair, averaging does not change the values
print(f"ROUGE scores: {scores}")

ROUGE scores: {'rouge-1': {'r': 1.0, 'p': 1.0, 'f': 0.999999995}, 'rouge-2': {'r': 1.0, 'p': 1.0, 'f': 0.999999995}, 'rouge-l': {'r': 1.0, 'p': 1.0, 'f': 0.999999995}}
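
The F-scores above read 0.999999995 rather than 1.0 even though precision and recall are both 1.0. The sketch below shows one plausible explanation: a small epsilon added to the F-score denominator to avoid division by zero. This is an assumption about the `rouge` package's internals, and the function `rouge_1` is a simplified, hypothetical ROUGE-1 on whitespace tokens, not the package's own code.

```python
from collections import Counter


def rouge_1(candidate, reference, eps=1e-8):
    """Simplified ROUGE-1: unigram overlap on whitespace tokens."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    # Multiset intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    # eps keeps the denominator non-zero when there is no overlap at all.
    f_score = 2 * precision * recall / (precision + recall + eps)
    return {"r": recall, "p": precision, "f": f_score}


identical = rouge_1("the cat sat on the mat", "the cat sat on the mat")
partial = rouge_1("the cat sat", "the cat sat on the mat")
print(identical)
print(partial)
```

With identical inputs the sketch gives `p = r = 1.0` and `f = 2 / (2 + 1e-8)`, which rounds to the 0.999999995 seen in the output above.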


In [14]:
with open("letter.txt", "r", encoding="utf-8") as f:
    letter = f.read()
# print(letter)

response = predictor.predict({'inputs': letter,
                             'parameters': {'max_new_tokens': 128}})
response = response[0] if isinstance(response, list) else response

print("Output:\n", response["generated_text"].strip(), end="\n\n\n")

Output:
 Output:
{
"is_Papilledema": False,
"referral_content": "Kindly see him for a specialist evaluation to determine the best course of action. I believe a more intensive investigation, including possibly imaging and more advanced pulmonary function tests, might be necessary."
}




In [15]:
## Test every record in test.jsonl with Llama 3
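
The evaluation code that follows reads a JSON Lines file in which each line is an object with `instruction`, `whole_letter`, and `referral_content` fields, and joins the first two into a prompt separated by `###` markers. The sketch below reproduces that input handling in isolation; the sample record is invented for illustration and the helper names `read_jsonl` and `build_prompt` are hypothetical.

```python
import io
import json


def read_jsonl(handle):
    """Parse one JSON object per non-empty line of a file-like object."""
    return [json.loads(line) for line in handle if line.strip()]


def build_prompt(record):
    """Join instruction and letter with the same ### separators as the evaluation loop."""
    return f"{record['instruction']}\n\n###\n\n{record['whole_letter']}\n\n###"


# Invented sample record; real records come from the test file.
sample = {
    "instruction": "Extract the reason for referral as JSON.",
    "whole_letter": "Dear Dr. Smith, I am referring this patient for review.",
    "referral_content": "I am referring this patient for review.",
}

# An in-memory buffer stands in for the JSON Lines file on disk.
records = read_jsonl(io.StringIO(json.dumps(sample) + "\n"))
prompt = build_prompt(records[0])
print(prompt)
```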

In [22]:
import sacrebleu
from rouge import Rouge

import json
import pandas as pd

import re

def replace_output_prefix(input_str):
    # Strip a leading "Output:" prefix with a regular expression (case-insensitive)
    result = re.sub(r'(?i)^Output:\s*', '', input_str)
    return result


def extract_braced_content(input_str):
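    # Extract the first brace-delimited block (no nested braces) with a regular expression
    match = re.search(r'\{[^{}]*\}', input_str)
    if match:
        # Return the matched text, including the braces
        return match.group(0)
    else:
        # No match: return an empty string
        return ''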


def extract_referral_content(data_str):
    # Extract the value of referral_content with a regular expression
    match = re.search(r'"referral_content":\s*(null|".*?")', data_str)
    if match:
        content = match.group(1)
        if content == "null":
            return ""
        else:
            # Strip the surrounding quotes
            return content.strip('"')
    return ''


def evaluate_jsonl_with_llama3(predictor, path, csv_file_path):
    test_data_json = []
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            test_data_json.append(json.loads(line.strip()))
    rouge_score_list = []
    bleu_score_list = []

    rouge = Rouge()

    evaluate_list = []

    for single_test in test_data_json:
        instruction = single_test["instruction"]
        whole_letter = single_test["whole_letter"]
        referral_content = single_test["referral_content"]
        prompt = f"{instruction}\n\n###\n\n{whole_letter}\n\n###"
        response = predictor.predict({'inputs': prompt, 'parameters': {'max_new_tokens': 128}})
        # print(prompt)
        print("----------------------------------------------------------------")

        response = response[0] if isinstance(response, list) else response
        # print("Output:\n", response["generated_text"].strip(), end="\n\n\n")

        reference_text = referral_content
        try:
            tmp_str = replace_output_prefix(response["generated_text"].strip())
            # print(tmp_str)
            tmp_str = extract_braced_content(tmp_str)
            # print(tmp_str)
            tmp = extract_referral_content(tmp_str)
            candidate_text = tmp
        except Exception as err:
            # print(single_test["id"])
            # print(response["generated_text"].strip())
            # print()
            candidate_text = "extract failure"
        finally:

            evaluate_list.append(candidate_text)
            print("predict: " + candidate_text)
            print("real: " + reference_text)


            bleu = sacrebleu.corpus_bleu([candidate_text], [[reference_text]])
            bleu_score_list.append(bleu.score)
            print(bleu.score)
            single_test["bleu"] = bleu.score
            single_test["predict_referral_content"] = candidate_text
            print()
            # ROUGE scoring is currently disabled, so rouge_score_list stays empty.
            # scores = rouge.get_scores(candidate_text, reference_text)
            # rouge_score_list.append(scores)
            
#     with open(output_jsonl_path, mode='w', encoding='utf-8') as f:
#         for single_test in test_data_json:
#             f.write(json.dumps(single_test, ensure_ascii=False) + '\n')

#     print(f"predicted data has been saved to {output_path}.")
    
    # Build the rows for the CSV file
    csv_data = []

    for single_test in test_data_json:
        csv_data.append({
            # "id": single_test["id"],
            # "name": single_test["name"],
            "instruction": single_test["instruction"],
            "whole_letter": single_test["whole_letter"],
            "referral_content": single_test["referral_content"],
            "predict_referral_content": single_test["predict_referral_content"],
            "bleu": single_test["bleu"],
        })

    # Create the DataFrame
    df = pd.DataFrame(csv_data)

    # Save it as a CSV file
    df.to_csv(csv_file_path, index=False, encoding='utf-8')

    print(f"CSV file has been saved to {csv_file_path}")
    
    return evaluate_list, bleu_score_list, rouge_score_list
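
The three extraction helpers above use regular expressions rather than `json.loads`. One likely reason, visible in the earlier sample output, is that the model can emit Python-style literals such as `False`, which are not valid JSON. The sketch below repeats compact copies of the helpers so it runs on its own and applies them to an abridged version of that sample output.

```python
import json
import re


def replace_output_prefix(text):
    # Drop a leading "Output:" label, ignoring case.
    return re.sub(r'(?i)^Output:\s*', '', text)


def extract_braced_content(text):
    # First {...} block without nested braces, or '' if none is found.
    match = re.search(r'\{[^{}]*\}', text)
    return match.group(0) if match else ''


def extract_referral_content(text):
    # Value of "referral_content": a quoted string or null.
    match = re.search(r'"referral_content":\s*(null|".*?")', text)
    if not match:
        return ''
    content = match.group(1)
    return '' if content == 'null' else content.strip('"')


# Abridged copy of the model output shown earlier in this notebook.
raw = '''Output:
{
"is_Papilledema": False,
"referral_content": "Kindly see him for a specialist evaluation to determine the best course of action."
}'''

braced = extract_braced_content(replace_output_prefix(raw))

try:
    json.loads(braced)
    parsed_as_json = True
except json.JSONDecodeError:
    # The bare `False` literal is not valid JSON.
    parsed_as_json = False

referral = extract_referral_content(braced)
print(parsed_as_json)
print(referral)
```

Note that the non-greedy `".*?"` pattern stops at the first double quote, so a referral text that itself contains an escaped quote would be cut short.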

In [23]:
test_evaluate_list, test_bleu_score_list, test_rouge_score_list = evaluate_jsonl_with_llama3(predictor, "test_data_735/test.jsonl", "llama3_original_test_221.csv")

----------------------------------------------------------------
predict: The reason for this referral is to further evaluate and manage Ms. Jane Doe's progressively worsening cardiopulmonary symptoms. Given her clinical presentation, further investigation is essential to rule out possible cardiac etiologies, including congestive heart failure or valvular disease.
real: The reason for this referral is to further evaluate and manage Ms. Jane Doe's progressively worsening cardiopulmonary symptoms. Given her clinical presentation, further investigation is essential to rule out possible cardiac etiologies, including congestive heart failure or valvular disease.
100.00000000000004

----------------------------------------------------------------
predict: I am writing to refer Mr. Robert Davis, a 75-year-old male, for further evaluation and management. Your expertise in this matter would be greatly appreciated.
real: Mr. Davis has presented with significant neck pain and dropped head syndrom

In [24]:
def analyze_predict_data(bleu_score_list):
    # Count the scores at or above 100
    count_gt_100 = sum(1 for score in bleu_score_list if score >= 100)

    # Count the scores above 70
    count_gt_70 = sum(1 for score in bleu_score_list if score > 70)

    prob_gt_100 = count_gt_100 / len(bleu_score_list)
    prob_gt_70 = count_gt_70 / len(bleu_score_list)
    average_score = sum(bleu_score_list) / float(len(bleu_score_list))

    print(f"Scores at or above 100: {count_gt_100}, fraction of all samples: {prob_gt_100}")
    print(f"Scores above 70: {count_gt_70}, fraction of all samples: {prob_gt_70}")
    print(f"Average BLEU score: {average_score}")

In [25]:
analyze_predict_data(test_bleu_score_list)

Scores at or above 100: 52, fraction of all samples: 0.37142857142857144
Scores above 70: 55, fraction of all samples: 0.39285714285714285
Average BLEU score: 39.17140614899256


## Clean up the endpoint
Don't forget to clean up resources when finished to avoid unnecessary charges.

In [None]:
predictor.delete_predictor()

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.


![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|jumpstart-text-generation-inference.ipynb)

