## [model_tuner]flan_t5_xl_instruction_ml_p3_16xl

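In this notebook, you can adapt the pre-trained FLAN-T5-XL model to improve its performance on a specific task, using examples of the target task in a process called instruction fine-tuning. Instruction fine-tuning uses a set of labeled examples in the form of {prompt, response} pairs to further train the pre-trained model so that it predicts an appropriate response when given a prompt. This process modifies the weights of the model.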

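We fine-tune a pre-trained **FLAN T5 model**. Pre-trained FLAN T5 models can be used "as is" for many tasks, but fine-tuning can improve model performance on a particular task or language domain. As an example, we fine-tune the model on a task that was not used in pre-training. After fine-tuning, we deploy two inference endpoints, one with the pre-trained model and one with the fine-tuned model. We then run the same inference queries against both endpoints and compare the results.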

<img src="./figures/flan-t5.png"  width="700" height="370">

[Instruction fine-tuning for FLAN T5 XL with Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/instruction-fine-tuning-for-flan-t5-xl-with-amazon-sagemaker-jumpstart/)

#### In this notebook:
1. [Setting up](#1.-Setting-up)
1. [Fine-tuning a model](#2.-Fine-tuning-a-model)
1. [Deploying inference endpoints](#3.-Deploying-inference-endpoints)
1. [Running inference queries](#4.-Running-inference-queries)
1. [Cleaning up resources](#5.-Cleaning-up-resources)

### 1. Setting up

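We begin by installing and upgrading the required packages. Restart the kernel after running the cell below.
The following variables are used throughout the notebook. In particular, we select the FLAN T5 model size and the training and inference instance types. We also retrieve the execution role associated with the current notebook instance.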

In [17]:
import boto3
import sagemaker
import pprint
import time

# Get current region, role, and default bucket
aws_region = boto3.Session().region_name
aws_role = sagemaker.session.Session().get_caller_identity_arn()
output_bucket = sagemaker.Session().default_bucket()

# This will be useful for printing
newline, bold, unbold = "\n", "\033[1m", "\033[0m"

print(f"{bold}aws_region:{unbold} {aws_region}")
print(f"{bold}aws_role:{unbold} {aws_role}")
print(f"{bold}output_bucket:{unbold} {output_bucket}")

aws_region: us-west-2
aws_role: arn:aws:iam::322537213286:role/service-role/AmazonSageMaker-ExecutionRole-20230528T120509
output_bucket: sagemaker-us-west-2-322537213286


In [18]:
import IPython
from ipywidgets import Dropdown
from sagemaker.jumpstart.filters import And
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models

# Default model choice
model_id = "huggingface-text2text-flan-t5-xl"

# Identify FLAN T5 models that support fine-tuning
filter_value = And("task == text2text", "framework == huggingface", "training_supported == true")
model_list = [m for m in list_jumpstart_models(filter=filter_value) if "flan-t5" in m]

# Display the model IDs in a dropdown, for user to select
dropdown = Dropdown(
    value=model_id,
    options=model_list,
    description="FLAN T5 models available for fine-tuning:",
    style={"description_width": "initial"},
    layout={"width": "max-content"},
)
display(IPython.display.Markdown("### Select a pre-trained model from the dropdown below"))
display(dropdown)

### Select a pre-trained model from the dropdown below

A Jupyter Widget

In [4]:
from sagemaker.instance_types import retrieve_default

model_id, model_version = dropdown.value, "*"

# Instance types for training and inference
training_instance_type = retrieve_default(
    model_id=model_id, model_version=model_version, scope="training"
)
inference_instance_type = retrieve_default(
    model_id=model_id, model_version=model_version, scope="inference"
)

print(f"{bold}model_id:{unbold} {model_id}")
print(f"{bold}training_instance_type:{unbold} {training_instance_type}")
print(f"{bold}inference_instance_type:{unbold} {inference_instance_type}")

model_id: huggingface-text2text-flan-t5-xl
training_instance_type: ml.p3.16xlarge
inference_instance_type: ml.g5.2xlarge


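### 2. Fine-tuning a model

#### 2.1. Preparing training data
We use a subset of SQuAD2.0 for supervised fine-tuning. This dataset contains questions posed by human annotators on a set of Wikipedia articles. In addition to questions with answers, SQuAD2.0 contains about 50,000 unanswerable questions. Such questions are plausible but cannot be answered directly from the content of the articles. We use only the unanswerable questions for our task.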

*Citation: @article{rajpurkar2018know, title={Know what you don't know: Unanswerable questions for SQuAD},
author={Rajpurkar, Pranav and Jia, Robin and Liang, Percy}, journal={arXiv preprint arXiv:1806.03822}, year={2018} }*

License: [Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0)](https://creativecommons.org/licenses/by-sa/4.0/legalcode)

In [5]:
!rm -rf data

In [6]:
from sagemaker.s3 import S3Downloader

# We will use the train split of SQuAD2.0
original_data_file = "train-v2.0.json"

# The data was mirrored in the following bucket
original_data_location = f"s3://sagemaker-sample-files/datasets/text/squad2.0/{original_data_file}"
S3Downloader.download(original_data_location, "./data")

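A Text2Text generation model can be fine-tuned on any text data, provided that the data is in the expected format. The data must include a training portion and, optionally, a validation portion. The best model is selected according to the validation loss, which is computed at the end of each epoch. If no validation set is provided, an (adjustable) percentage of the training data is automatically split off and used for validation.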
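To make the automatic hold-out concrete, here is a minimal sketch of splitting off a validation portion from the training records. It is not the JumpStart training script's implementation: the function name `split_train_validation`, the 20% fraction, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def split_train_validation(records, validation_fraction=0.2, seed=0):
    """Shuffle records and hold out a fraction of them for validation.

    The fraction and seed are illustrative assumptions, not JumpStart defaults.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_validation = int(round(len(shuffled) * validation_fraction))
    # The first n_validation shuffled records are held out; the rest are used for training
    return shuffled[n_validation:], shuffled[:n_validation]


examples = [{"id": i} for i in range(10)]
train, validation = split_train_validation(examples, validation_fraction=0.2)
print(len(train), len(validation))
```

Providing your own validation file gives you control over which examples are held out, which matters when examples from the same article should not appear on both sides of the split.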

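The training data must be formatted as JSON lines (`.jsonl`), where each line is a dictionary representing a single data sample. All training data must be in a single folder, although it can be stored in multiple jsonl files. The `.jsonl` file extension is mandatory. The training folder can also contain a `template.json` file that describes the input and output formats.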

If no template file is given, the following default template is used:
```json
{
    "prompt": "{prompt}",
    "completion": "{completion}"
}
```
In this case, each entry in the JSON lines data must include `prompt` and `completion` fields.
In this demo, we use a custom template (see below).
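The relationship between a template and a JSON lines record can be sketched as follows. This is an illustration under stated assumptions, not the training script's actual code: `template_fields` and `render_example` are hypothetical helpers, placeholder substitution is assumed to behave like Python's `str.format`, and the sample question is made up.

```python
import json
from string import Formatter


def template_fields(template):
    """Collect the placeholder names (e.g. 'context') used in the template values."""
    fields = set()
    for value in template.values():
        fields.update(name for _, name, _, _ in Formatter().parse(value) if name)
    return fields


def render_example(template, record):
    """Fill the template with one record, failing early if a field is missing."""
    missing = template_fields(template) - set(record)
    if missing:
        raise KeyError(f"record is missing fields: {sorted(missing)}")
    return {key: value.format(**record) for key, value in template.items()}


template = {
    "prompt": "Ask a question which is related to the following text, "
    "but cannot be answered based on the text. Text: {context}",
    "completion": "{question}",
}

# One JSON lines entry; the question is a made-up illustration
line = json.dumps(
    {
        "context": "Adelaide is the capital city of South Australia.",
        "question": "What is the population of Adelaide?",
    }
)
example = render_example(template, json.loads(line))
print(example["prompt"])
print(example["completion"])
```

Checking every line this way before uploading can catch records that lack a field the template expects.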

In [7]:
import json

local_data_file = "./data/task-data.jsonl"  # any name with .jsonl extension

with open('./data/' + original_data_file) as f:
    data = json.load(f)

with open(local_data_file, "w") as f:
    for article in data["data"]:
        for paragraph in article["paragraphs"]:
            # iterate over questions for a given paragraph
            for qas in paragraph["qas"]:
                if qas["is_impossible"]:
                    # the question is relevant, but cannot be answered
                    example = {"context": paragraph["context"], "question": qas["question"]}
                    json.dump(example, f)
                    f.write("\n")

template = {
    "prompt": "Ask a question which is related to the following text, but cannot be answered based on the text. Text: {context}",
    "completion": "{question}",
}
with open("./data/template.json", "w") as f:
    json.dump(template, f)

In [8]:
from sagemaker.s3 import S3Uploader

train_data_location = f"s3://{output_bucket}/train_data"
S3Uploader.upload(local_data_file, train_data_location)
S3Uploader.upload("./data/template.json", train_data_location)
print(f"{bold}training data:{unbold} {train_data_location}")

training data: s3://sagemaker-us-west-2-322537213286/train_data


#### 2.2. Start training

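We are now ready to set up the training job. First, we retrieve the training container image, the pre-trained model artifact, and the training script.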

In [9]:
from sagemaker import image_uris, model_uris, script_uris

# Training instance will use this image
train_image_uri = image_uris.retrieve(
    region=aws_region,
    framework=None,  # automatically inferred from model_id
    model_id=model_id,
    model_version=model_version,
    image_scope="training",
    instance_type=training_instance_type,
)

# Pre-trained model
train_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="training"
)

# Script to execute on the training instance
train_script_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="training"
)

output_location = f"s3://{output_bucket}/demo-fine-tune-flan-t5/"

print(f"{bold}image uri:{unbold} {train_image_uri}")
print(f"{bold}model uri:{unbold} {train_model_uri}")
print(f"{bold}script uri:{unbold} {train_script_uri}")
print(f"{bold}output location:{unbold} {output_location}")

image uri: 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
model uri: s3://jumpstart-cache-prod-us-west-2/huggingface-training/train-huggingface-text2text-flan-t5-xl.tar.gz
script uri: s3://jumpstart-cache-prod-us-west-2/source-directory-tarballs/huggingface/transfer_learning/text2text/prepack/v1.0.3/sourcedir.tar.gz
output location: s3://sagemaker-us-west-2-322537213286/demo-fine-tune-flan-t5/


In [6]:
# !rm -rf ./flan_t5_xl_instruction_ml_p3_16xl/
# !mkdir ./flan_t5_xl_instruction_ml_p3_16xl/
# !aws s3 cp $train_script_uri ./flan_t5_xl_instruction_ml_p3_16xl/
# !tar -xvzf ./flan_t5_xl_instruction_ml_p3_16xl/sourcedir.tar.gz -C ./flan_t5_xl_instruction_ml_p3_16xl/
# !rm -rf ./flan_t5_xl_instruction_ml_p3_16xl/sourcedir.tar.gz

In [19]:
from sagemaker import hyperparameters

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(model_id=model_id, model_version=model_version)

# Override some of the default hyperparameters with custom values
hyperparameters["epochs"] = "3"
pprint.pprint(hyperparameters)

# Note that the maximum output length is set to 128 tokens by default.
# The targets in your data (i.e., ground truth responses) will be truncated to this size.
# You can override this behavior, e.g.,
# hyperparameters["max_output_length"] = "256"

{'adam_beta1': '0.9',
 'adam_beta2': '0.999',
 'adam_epsilon': '1e-08',
 'auto_find_batch_size': 'False',
 'batch_size': '64',
 'dataloader_drop_last': 'False',
 'dataloader_num_workers': '0',
 'early_stopping_patience': '3',
 'early_stopping_threshold': '0.0',
 'epochs': '3',
 'eval_accumulation_steps': 'None',
 'eval_steps': '500',
 'evalaution_strategy': 'epoch',
 'gradient_accumulation_steps': '1',
 'gradient_checkpointing': 'True',
 'label_smoothing_factor': '0',
 'learning_rate': '0.0001',
 'load_best_model_at_end': 'True',
 'logging_first_step': 'False',
 'logging_nan_inf_filter': 'True',
 'logging_steps': '500',
 'logging_strategy': 'steps',
 'lr_scheduler_type': 'constant_with_warmup',
 'max_eval_samples': '-1',
 'max_grad_norm': '1.0',
 'max_input_length': '-1',
 'max_output_length': '128',
 'max_steps': '-1',
 'max_train_samples': '-1',
 'pad_to_max_length': 'True',
 'preprocessing_num_workers': 'None',
 'save_steps': '500',
 'save_strategy': 'epoch',
 'save_total_limit': '2

We are now ready to launch the training job. Depending on the model size, the amount of data, and so on, it can take anywhere from 20 minutes to several hours to complete (for example, several hours for the xl model with 40,000 examples and 3 epochs).

In [20]:
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base

model_name = "-".join(model_id.split("-")[2:])  # get the most informative part of ID
training_job_name = name_from_base(f"js-demo-{model_name}-{hyperparameters['epochs']}")
print(f"{bold}job name:{unbold} {training_job_name}")

training_metric_definitions = [
    {"Name": "val_loss", "Regex": "'eval_loss': ([0-9\\.]+)"},
    {"Name": "train_loss", "Regex": "'loss': ([0-9\\.]+)"},
    {"Name": "epoch", "Regex": "'epoch': ([0-9\\.]+)"},
]

# Create SageMaker Estimator instance
sm_estimator = Estimator(
    role=aws_role,
    image_uri=train_image_uri,
    model_uri=train_model_uri,
    source_dir=train_script_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    volume_size=250,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=output_location,
    metric_definitions=training_metric_definitions,
)

# Launch a SageMaker training job over data located in the given S3 path
# Training jobs can take hours, so it is recommended to set wait=False
# and monitor the job status through the SageMaker console
sm_estimator.fit({"training": train_data_location}, job_name=training_job_name, wait=False)

INFO:sagemaker:Creating training-job with name: js-demo-flan-t5-xl-3-2023-05-29-06-20-37-686


job name: js-demo-flan-t5-xl-3-2023-05-29-06-20-37-686


Performance metrics such as the training and validation loss can be accessed through CloudWatch during training. You can also retrieve the most recent snapshot of the metrics, as the cells below show.
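The metric definitions passed to the estimator are regular expressions that are matched against the training log. The sketch below applies the same three patterns to two sample log lines to show which values each one captures. The helper `extract_metrics` and both log lines are hypothetical; real log lines may be formatted differently.

```python
import re

# Same patterns as the estimator's metric definitions
metric_definitions = [
    {"Name": "val_loss", "Regex": "'eval_loss': ([0-9\\.]+)"},
    {"Name": "train_loss", "Regex": "'loss': ([0-9\\.]+)"},
    {"Name": "epoch", "Regex": "'epoch': ([0-9\\.]+)"},
]


def extract_metrics(log_line, definitions):
    """Return {metric name: value} for every pattern that matches the line."""
    found = {}
    for definition in definitions:
        match = re.search(definition["Regex"], log_line)
        if match:
            found[definition["Name"]] = float(match.group(1))
    return found


# Hypothetical log lines, written in the style of a Python dict printout
eval_line = "{'eval_loss': 1.566096, 'epoch': 1.0}"
train_line = "{'loss': 1.4195, 'learning_rate': 0.0001, 'epoch': 1.55}"

print(extract_metrics(eval_line, metric_definitions))
print(extract_metrics(train_line, metric_definitions))
```

Because the `train_loss` pattern starts with a quote character, it does not match inside `'eval_loss'`, so the two losses are reported as separate metrics.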

In [None]:
sm_estimator.logs()

2023-05-29 06:27:14 Starting - Preparing the instances for training
2023-05-29 06:27:14 Downloading - Downloading input data
2023-05-29 06:27:14 Training - Training image download completed. Training in progress.
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2023-05-29 06:27:15,982 sagemaker-training-toolkit INFO     Imported framework sagemaker_pytorch_container.training
2023-05-29 06:27:16,056 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2023-05-29 06:27:16,059 sagemaker_pytorch_container.training INFO     Invoking user training script.
2023-05-29 06:27:17,680 sagemaker-training-toolkit INFO     Installing dependencies from requirements.txt:
/opt/conda/bin/python3.8 -m pip install -r requirements.txt
Processing ./lib/absl-py/absl_py-1.4.0-py3-none-any.whl
Processing ./lib/accelerate/accelerate-0.16.0-p

In [23]:
from sagemaker import TrainingJobAnalytics

# Wait for a couple of minutes for the job to start before running this cell
# This can be called while the job is still running
flag = True
while flag:
    try:
        df = TrainingJobAnalytics(training_job_name=training_job_name).dataframe()
        flag = False
    except Exception as e:
        print(e)
        time.sleep(10)
        flag = True

df.head(10)

| | timestamp | metric_name | value |
|---|---|---|---|
| 0 | 0.0 | val_loss | 1.566096 |
| 1 | 5760.0 | val_loss | 1.543338 |
| 2 | 11580.0 | val_loss | 1.619192 |
| 3 | 0.0 | train_loss | 1.7226 |
| 4 | 4500.0 | train_loss | 1.4195 |
| 5 | 9060.0 | train_loss | 1.1988 |
| 6 | 0.0 | epoch | 0.77 |
| 7 | 1380.0 | epoch | 1.0 |
| 8 | 4500.0 | epoch | 1.55 |
| 9 | 7140.0 | epoch | 2.0 |
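
Since the best model is selected by validation loss, it can help to find the evaluation with the lowest `val_loss` in the metrics output above. The sketch below does this on the three `val_loss` rows copied from that output; the helper `best_evaluation` is a hypothetical name, and the rows are hard-coded here rather than read from the dataframe.

```python
# (timestamp, metric_name, value) rows copied from the metrics output above
rows = [
    (0.0, "val_loss", 1.566096),
    (5760.0, "val_loss", 1.543338),
    (11580.0, "val_loss", 1.619192),
]


def best_evaluation(rows, metric="val_loss"):
    """Return the (timestamp, value) pair with the lowest value for the metric."""
    candidates = [(timestamp, value) for timestamp, name, value in rows if name == metric]
    return min(candidates, key=lambda pair: pair[1])


best_timestamp, best_loss = best_evaluation(rows)
print(f"lowest val_loss {best_loss} at timestamp {best_timestamp}")
```

In this run the second evaluation has the lowest validation loss while the training loss keeps falling, which would be consistent with the model starting to overfit in the last epoch.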


### 3. Deploying inference endpoints

Run the remaining steps of the notebook once the training job has completed successfully. Recall that the variable `training_job_name` holds the job name and `output_location` points to the S3 location of the fine-tuned model artifact.
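
Because the training job was launched with `wait=False`, one way to confirm completion before deploying is to poll the job status. The sketch below is a hypothetical helper, `wait_for_training_job`, that takes the describe call as an argument so it can be tried without AWS access. It assumes the call behaves like boto3's SageMaker `describe_training_job`, which returns a `TrainingJobStatus` field.

```python
import time

# Statuses after which the job no longer changes
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_training_job(describe_training_job, job_name, poll_seconds=60, sleep=time.sleep):
    """Poll the job status until it reaches a terminal state, then return it."""
    while True:
        response = describe_training_job(TrainingJobName=job_name)
        status = response["TrainingJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)  # wait before asking again


# Stand-in for the real API call, used only to demonstrate the polling loop
_fake_statuses = iter(["InProgress", "InProgress", "Completed"])


def fake_describe(TrainingJobName):
    return {"TrainingJobStatus": next(_fake_statuses)}


final_status = wait_for_training_job(fake_describe, "demo-job", sleep=lambda seconds: None)
print(final_status)
```

In the notebook, the first argument could be `boto3.client("sagemaker").describe_training_job` and the second `training_job_name`; deployment should proceed only if the returned status is `Completed`.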

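We create two inference endpoints, one for the original pre-trained model and one for the fine-tuned model. We then run the same requests against both endpoints and compare the results.

Deploying each endpoint can take a few minutes.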

In [15]:
from sagemaker import image_uris

# Retrieve the inference docker image URI. This is the base HuggingFace container image
deploy_image_uri = image_uris.retrieve(
    region=aws_region,
    framework=None,  # automatically inferred from model_id
    model_id=model_id,
    model_version=model_version,
    image_scope="inference",
    instance_type=inference_instance_type,
)
deploy_image_uri

'763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04'

In [16]:
from sagemaker import model_uris, script_uris
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.utils import name_from_base

# Retrieve the URI of the pre-trained model
pre_trained_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="inference"
)

pre_trained_name = name_from_base(f"jumpstart-demo-pre-trained-{model_id}")

# Create the SageMaker model instance of the pre-trained model
if ("small" in model_id) or ("base" in model_id):
    deploy_source_uri = script_uris.retrieve(
        model_id=model_id, model_version=model_version, script_scope="inference"
    )
    pre_trained_model = Model(
        image_uri=deploy_image_uri,
        source_dir=deploy_source_uri,
        entry_point="inference.py",
        model_data=pre_trained_model_uri,
        role=aws_role,
        predictor_cls=Predictor,
        name=pre_trained_name,
    )
else:
    # For those large models, we already repack the inference script and model
    # artifacts for you, so the `source_dir` argument to Model is not required.
    pre_trained_model = Model(
        image_uri=deploy_image_uri,
        model_data=pre_trained_model_uri,
        role=aws_role,
        predictor_cls=Predictor,
        name=pre_trained_name,
    )

print(f"{bold}image URI:{unbold}{newline} {deploy_image_uri}")
print(f"{bold}model URI:{unbold}{newline} {pre_trained_model_uri}")
print("Deploying an endpoint ...")

# Deploy the pre-trained model. When deploying through the Model class, the Predictor
# class must be passed so that inference can be run through the SageMaker API.
pre_trained_predictor = pre_trained_model.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    predictor_cls=Predictor,
    endpoint_name=pre_trained_name,
)
print(f"{newline}Deployed an endpoint {pre_trained_name}")

INFO:sagemaker:Creating model with name: jumpstart-demo-pre-trained-huggingface--2023-05-29-06-11-38-706


image URI:
 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
model URI:
 s3://jumpstart-cache-prod-us-west-2/huggingface-infer/prepack/v1.0.5/infer-prepack-huggingface-text2text-flan-t5-xl.tar.gz
Deploying an endpoint ...


INFO:sagemaker:Creating endpoint-config with name jumpstart-demo-pre-trained-huggingface--2023-05-29-06-11-38-706
INFO:sagemaker:Creating endpoint with name jumpstart-demo-pre-trained-huggingface--2023-05-29-06-11-38-706


----------!
Deployed an endpoint jumpstart-demo-pre-trained-huggingface--2023-05-29-06-11-38-706


In [None]:
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.utils import name_from_base

fine_tuned_name = name_from_base(f"jumpstart-demo-fine-tuned-{model_id}")
fine_tuned_model_uri = f"{output_location}{training_job_name}/output/model.tar.gz"

# Create the SageMaker model instance of the fine-tuned model
fine_tuned_model = Model(
    image_uri=deploy_image_uri,
    model_data=fine_tuned_model_uri,
    role=aws_role,
    predictor_cls=Predictor,
    name=fine_tuned_name,
)

print(f"{bold}image URI:{unbold}{newline} {deploy_image_uri}")
print(f"{bold}model URI:{unbold}{newline} {fine_tuned_model_uri}")
print("Deploying an endpoint ...")

# Deploy the fine-tuned model.
fine_tuned_predictor = fine_tuned_model.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    predictor_cls=Predictor,
    endpoint_name=fine_tuned_name,
)
print(f"{newline}Deployed an endpoint {fine_tuned_name}")

INFO:sagemaker:Creating model with name: jumpstart-demo-fine-tuned-huggingface-t-2023-05-29-11-50-05-886


image URI:
 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
model URI:
 s3://sagemaker-us-west-2-322537213286/demo-fine-tune-flan-t5/js-demo-flan-t5-xl-3-2023-05-29-06-20-37-686/output/model.tar.gz
Deploying an endpoint ...


INFO:sagemaker:Creating endpoint-config with name jumpstart-demo-fine-tuned-huggingface-t-2023-05-29-11-50-05-886
INFO:sagemaker:Creating endpoint with name jumpstart-demo-fine-tuned-huggingface-t-2023-05-29-11-50-05-886


----------!
Deployed an endpoint jumpstart-demo-fine-tuned-huggingface-t-2023-05-29-11-50-05-886


### 4. Running inference queries

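As the name suggests, a Text2Text model such as FLAN T5 takes text as input and generates text as output. The input text contains a description of the task. In this demo, the task is to generate a question given a piece of text. The question should be related to the text, but the text must not contain the answer. Such a task can arise when automating the collection of additional information or when identifying gaps in technical documentation.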

In [None]:
prompt = "Ask a question which is related to the following text, but cannot be answered based on the text. Text: {context}"

# Sources: Wikipedia, AWS Documentation
test_paragraphs = [
    """
Adelaide is the capital city of South Australia, the state's largest city and the fifth-most populous city in Australia. "Adelaide" may refer to either Greater Adelaide (including the Adelaide Hills) or the Adelaide city centre. The demonym Adelaidean is used to denote the city and the residents of Adelaide. The Traditional Owners of the Adelaide region are the Kaurna people. The area of the city centre and surrounding parklands is called Tarndanya in the Kaurna language.
Adelaide is situated on the Adelaide Plains north of the Fleurieu Peninsula, between the Gulf St Vincent in the west and the Mount Lofty Ranges in the east. Its metropolitan area extends 20 km (12 mi) from the coast to the foothills of the Mount Lofty Ranges, and stretches 96 km (60 mi) from Gawler in the north to Sellicks Beach in the south.
""",
    """
Amazon Elastic Block Store (Amazon EBS) provides block level storage volumes for use with EC2 instances. EBS volumes behave like raw, unformatted block devices. You can mount these volumes as devices on your instances. EBS volumes that are attached to an instance are exposed as storage volumes that persist independently from the life of the instance. You can create a file system on top of these volumes, or use them in any way you would use a block device (such as a hard drive). You can dynamically change the configuration of a volume attached to an instance.
We recommend Amazon EBS for data that must be quickly accessible and requires long-term persistence. EBS volumes are particularly well-suited for use as the primary storage for file systems, databases, or for any applications that require fine granular updates and access to raw, unformatted, block-level storage. Amazon EBS is well suited to both database-style applications that rely on random reads and writes, and to throughput-intensive applications that perform long, continuous reads and writes.
""",
    """
Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. 
You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. 
All of the Amazon Comprehend features accept UTF-8 text documents as the input. In addition, custom classification and custom entity recognition accept image files, PDF files, and Word files as input. 
Amazon Comprehend can examine and analyze documents in a variety of languages, depending on the specific feature. For more information, see Languages supported in Amazon Comprehend. Amazon Comprehend's Dominant language capability can examine documents and determine the dominant language for a far wider selection of languages.
""",
]

In [None]:
import boto3
import json

# Parameters of (output) text generation. A great introduction to generation
# parameters can be found at https://huggingface.co/blog/how-to-generate
parameters = {
    "max_length": 40,  # restrict the length of the generated text
    "num_return_sequences": 5,  # we will inspect several model outputs
    "num_beams": 10,  # use beam search
}


# Helper functions for running inference queries
def query_endpoint_with_json_payload(payload, endpoint_name):
    encoded_json = json.dumps(payload).encode("utf-8")
    client = boto3.client("runtime.sagemaker")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name, ContentType="application/json", Body=encoded_json
    )
    return response


def parse_response_multiple_texts(query_response):
    model_predictions = json.loads(query_response["Body"].read())
    generated_text = model_predictions["generated_texts"]
    return generated_text


def generate_questions(endpoint_name, text):
    expanded_prompt = prompt.replace("{context}", text)
    payload = {"text_inputs": expanded_prompt, **parameters}
    query_response = query_endpoint_with_json_payload(payload, endpoint_name=endpoint_name)
    generated_texts = parse_response_multiple_texts(query_response)
    for i, generated_text in enumerate(generated_texts):
        print(f"Response {i}: {generated_text}{newline}")

In [None]:
print(f"{bold}Prompt:{unbold} {repr(prompt)}")
for paragraph in test_paragraphs:
    print("-" * 80)
    print(paragraph)
    print("-" * 80)
    print(f"{bold}pre-trained{unbold}")
    generate_questions(pre_trained_name, paragraph)
    print(f"{bold}fine-tuned{unbold}")
    generate_questions(fine_tuned_name, paragraph)

Prompt: 'Ask a question which is related to the following text, but cannot be answered based on the text. Text: {context}'
--------------------------------------------------------------------------------

Adelaide is the capital city of South Australia, the state's largest city and the fifth-most populous city in Australia. "Adelaide" may refer to either Greater Adelaide (including the Adelaide Hills) or the Adelaide city centre. The demonym Adelaidean is used to denote the city and the residents of Adelaide. The Traditional Owners of the Adelaide region are the Kaurna people. The area of the city centre and surrounding parklands is called Tarndanya in the Kaurna language.
Adelaide is situated on the Adelaide Plains north of the Fleurieu Peninsula, between the Gulf St Vincent in the west and the Mount Lofty Ranges in the east. Its metropolitan area extends 20 km (12 mi) from the coast to the foothills of the Mount Lofty Ranges, and stretches 96 km (60 mi) from Gawler in the nor

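The pre-trained model was not specifically trained to generate unanswerable questions. Despite the input prompt, it tends to generate questions that can be answered from the text. The fine-tuned model generally performs this task better, and the improvement is more pronounced for larger models (for example, xl rather than base).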

### 5. Cleaning up resources

In [28]:
# Delete resources
pre_trained_predictor.delete_model()
pre_trained_predictor.delete_endpoint()
fine_tuned_predictor.delete_model()
fine_tuned_predictor.delete_endpoint()

INFO:sagemaker:Deleting model with name: jumpstart-demo-pre-trained-huggingface--2023-05-29-06-11-38-706
INFO:sagemaker:Deleting endpoint configuration with name: jumpstart-demo-pre-trained-huggingface--2023-05-29-06-11-38-706
INFO:sagemaker:Deleting endpoint with name: jumpstart-demo-pre-trained-huggingface--2023-05-29-06-11-38-706
INFO:sagemaker:Deleting model with name: jumpstart-demo-fine-tuned-huggingface-t-2023-05-29-11-50-05-886
INFO:sagemaker:Deleting endpoint configuration with name: jumpstart-demo-fine-tuned-huggingface-t-2023-05-29-11-50-05-886
INFO:sagemaker:Deleting endpoint with name: jumpstart-demo-fine-tuned-huggingface-t-2023-05-29-11-50-05-886
