# [model_consumer]flan_t5_xl_cot(chain-of-thought)_prompting

이 노트북은 [model_consumer]flan_t5_xl_in_context_learning_ml_p3_2xl.ipynb을 먼저 실행하시기 바랍니다.
References: 
- https://medium.com/nlplanet/two-minutes-nlp-making-large-language-models-reason-with-chain-of-thought-prompting-401fd3c964d0
- https://arxiv.org/pdf/2201.11903.pdf

In [112]:
import sagemaker
import boto3
sess = sagemaker.Session()
import json

sm_client = boto3.client("sagemaker")
smr_client = boto3.client("sagemaker-runtime")

[JumpStart pre-trained Models](https://sagemaker.readthedocs.io/en/stable/doc_utils/pretrainedmodels.html#)에서 SageMaker pre-trained models의 전체 목록을 확인할 수 있습니다.

In [113]:
%store -r
print(endpoint_name)

huggingface-text2text-flan-t5-xl-1686395213


### Supported Parameters

***
이 모델은 추론을 수행하는 동안 많은 매개 변수를 지원합니다. 여기에는 다음이 포함됩니다:

* **max_length:** 모델은 출력 길이(입력 컨텍스트 길이 포함)가 `max_length`에 도달할 때까지 텍스트를 생성합니다. 지정한 경우 양수여야 합니다.
* **num_return_sequences:** 반환되는 출력 시퀀스의 수입니다. 지정하면 양수여야 합니다.
* **num_beams:** greedy 탐색에 사용된 빔의 수입니다. 지정하는 경우 `num_return_sequences`보다 크거나 같은 정수여야 합니다.
* **no_repeat_ngram_size:** 모델은 출력 시퀀스에서 `no_repeat_ngram_size`의 단어 시퀀스가 반복되지 않도록 보장합니다. 지정하는 경우 1보다 큰 양의 정수여야 합니다.
* **temperature:** 출력의 randomness를 제어합니다. temperature가 높을수록 낮은 확률의 단어가 출력되고, temperature가 낮을수록 높은 확률의 단어가 출력됩니다. `temperature` -> 0이면 greedy 디코딩을 수행합니다. 지정하면 양수 실수여야 합니다.
* **early_stopping:** True이면 모든 beam 가설이 문장 토큰의 끝에 도달할 때 텍스트 생성을 종료합니다. 지정하면 boolean이어야 합니다.
* **do_sample:** True인 경우, 확률에 따라 다음 단어를 샘플링합니다. 지정하면 boolean이어야 합니다.
* **top_k:** 텍스트 생성의 각 단계에서 가장 가능성이 높은 `top_k` 단어만 샘플링합니다. 지정하면 양수여야 합니다.
* **top_p:** 텍스트 생성의 각 단계에서 누적 확률 `top_p`로 가능한 가장 작은 단어 집합에서 샘플링합니다. 지정한 경우 0과 1 사이의 실수여야 합니다.
* **seed:** 재현성을 위해 무작위 상태를 고정합니다. 지정하면 정수여야 합니다.
* **return_full_text:** True인 경우, 입력 텍스트가 출력 생성 텍스트의 일부가 됩니다. 지정하면 boolean이어야 합니다. 기본값은 False입니다.

endpoint를 호출하는 동안 위에서 언급한 매개변수의 하위 집합을 지정할 수 있습니다. 다음은 이러한 인수를 사용하여 endpoint를 호출하는 방법의 예입니다.
***

In [258]:
parameters = {
    "early_stopping": True,
    "max_length": 100,
    "num_return_sequences": 3,
    "temperature": .1,
    "num_beams": 3,
    "top_k": 0.7,
    "no_repeat_ngram_size": 2,
}
# parameters = {}

## Mathematical Reasoning 
### Zero-shot Prompting

In [259]:
context = """
QUESTION: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can have 3 tennis balls. How many tennis balls does he have now?
ANSWER: The answer is 11.
"""
query = """
QUESTION: John takes care of 10 dogs. Each dog takes 0.5 hours a day to walk and take care of their business. How many hours a week does he spend taking care of dogs?
ANSWER:
"""

## expected output:
# # [{'generated_text': 'The answer is 7 hours a day because 10 x.5 = 7 hours. He spends 7 hours taking care of dogs a week because 7 x 7 = 56 hours.'}]

In [260]:
prompt = f'{context}\n{query}'
parameters['text_inputs'] = prompt
payload = json.dumps(parameters).encode('utf-8')

In [261]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: He spends 10 x 0.5 = 5 hours a day taking care of the dogs. So he takes 5 * 7 = 35 hours per week. The answer is 35.


### With Chain of Thought Prompting

In [262]:
context = """
QUESTION: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
ANSWER: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.
"""
query = """
QUESTION: John takes care of 10 dogs. Each dog takes 0.5 hours a day to walk and take care of their business. How many hours a week does he spend taking care of dogs?
ANSWER:
"""

# # [{'generated_text': '"He spends 10 *.5 = 5 hours a day taking care of dogs. So he spends 5 * 7 = 35 hours / week taking care dogs. The answer is 35."'}]

In [263]:
prompt = f'{context}\n{query}'
parameters['text_inputs'] = prompt
payload = json.dumps(parameters).encode('utf-8')

In [264]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: He spends 0.5 hours x 10 dogs = 1 hour a day taking care of the dogs. So he spend 1 * 7 = 7 hours per week. The answer is 7.


## Advanced Mathematical Reasoning - with chain of thought prompting

In [265]:
context = """
QUESTION: Ducks need to eat 3.5 pounds of insects each week to survive. If there is a flock of ten ducks, how many pounds of insects do they need per day?
ANSWER: Ducks need 3.5 pounds of insects each week. If there is a flock of 10 ducks, then they need 3.5 x 10 = 35 pounds of insects each week. If they need 35 pounds of insects each week, then they need 35 / 7 = 5 pounds of insects each day. The answer is 5. 

QUESTION: It takes Matthew 3 minutes to dig a small hole for shrubs and 10 minutes to dig a large hole for trees. How many hours will it take him to dig 30 small holes and 15 large holes?
ANSWER: It takes Matthew 3 minutes to dig a small hole and 10 minutes to dig a large hole. So, it takes Matthew 3 x 30 = 90 minutes to dig 30 small holes. It takes Matthew 10 x 15 = 150 minutes to dig 15 large holes. So, it takes Matthew 90 + 150 = 240 minutes to dig 30 small holes and 15 large holes. 240 minutes is 4 hours. The answer is 4 hours. 
"""

query = """
QUESTION: I have 10 liters of orange drink that are two-thirds water and I wish to add it to 15 liters of pineapple drink that is three-fifths water. But as I pour it, I spill one liter of the orange drink. How much water is in the remaining 24 liters?
ANSWER:
"""

# # [{'generated_text': '"The orange drink is 10 x 2 / 3 = 8 liters of water. The pineapple drink is 15 x 3 / 5 = 12 liter of water in it. The total water in the orange and pineapple drinks is 8"'}]


In [266]:
prompt = f'{context}\n{query}'
parameters['text_inputs'] = prompt
payload = json.dumps(parameters).encode('utf-8')

In [267]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: I have 10 liters of orange drink that are two - thirds water, which is 2 / 3 * 10 = 4 litres. I wish to add it to 15 l of pineapple drink, a total of 10 + 15 = 25 ml. The total amount of water in the orange and pineapple drinks is 4 + 25 = 50 % water. So, there is 50 *.1 = 5 percent water left. Since I spilled


As you can see in the above example for complex mathematical reasoning the models might not give you the right predicted output. 
The correnct answer is: 

"The orange drink is 10liters, 1 liter was dropped, remaining drink has 9 * 2/3 = 6 liters of water. The pineapple drink is 15 x 3 / 5 = 9 liter of water in it. The total water in the orange and pineapple drinks is 15"

## Symbolic Reasoning
For symbolic reasoning, consider the tasks of last letter concatenation, reverse list, and coin flip shown in the next image.

### Zero shot prompting

### Last Letter Concatenation

In [268]:
context = ""
query = """Take the last letters of the words in "Elon Musk" and con-catenate them."""

# # [{"generated_text":"musk elon n"}]


In [269]:
prompt = f'{context}\n{query}'
parameters['text_inputs'] = prompt
payload = json.dumps(parameters).encode('utf-8')

In [270]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: elonmu


### With Chain of thought prompting

In [271]:
context = """QUESTION: Take the last letters of the words in "Elon Musk" and con-catenate them.
ANSWER: The last letter of "Elon" is "n". The last letter of "Musk" is "k'. Concatenating them is "nk". So the answer is nk.

QUESTION: Take the last letters of the words in "Mani Khanuja" and con-catenate them.
ANSWER: The last letter of "Mani" is "i". The last letter of "Khanuja" is "a". Concatenating them is "ia". So the answer is ia. 
"""
query = """
QUESTION: Take the last letters of the words in "John Doe" and con-catenate them.
ANSWER:
"""

# #[{"generated_text":"Doe. So the answer is Doe"}]


In [272]:
prompt = f'{context}\n{query}'
parameters['text_inputs'] = prompt
payload = json.dumps(parameters).encode('utf-8')

In [273]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: JohnDoe


### Reverse List

### Zero shot prompting

In [274]:
payload = """
QUESTION: Reverse the sequence "glasses, pen, alarm, license".
ANSWER: 
"""

# # [{"generated_text":"license, alarm, pen, glasses, reversed"}]


In [275]:
parameters['text_inputs'] = payload
payload = json.dumps(parameters).encode('utf-8')

In [276]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: license, alarm, pen, glasses


### With Chain of Thought prompting

In [277]:
payload = """QUESTION: Reverse the sequence "glasses, pen, alarm, license".
ANSWER: First is glasses. Second is pen. Third is alarm. Fourth is license. Now to reverse, change the order to: Fourth is license.
Third is alarm. Second is pen. First is glasses. So the answer is
"license, alarm, pen, glasses".

QUESTION: Reverse the sequence "telephone, clock, board, spectacles".
ANSWER: First is telephone. Second is clock. Third is board. Fourth is spectacles. Now to reverse, change the order to: Fourth is spectacles.
Third is board. Second is clock. First is telephone. So the answer is
"spectacles, board, clock, telephone".

QUESTION: Reverse the sequence "cup, plate, food, fruits".
ANSWER:

"""

# # [{"generated_text":"fruits, food, plate, cup\\" is correct."}]'


In [278]:
parameters['text_inputs'] = payload
payload = json.dumps(parameters).encode('utf-8')

In [279]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: fruits, food, plate, cup


### Coin Flip
### Zero shot prompting

In [280]:
payload = """
QUESTION:  A coin is heads up. John does not flip the coin. Shalonda does not flip the coin. Is the coin still heads up?
ANSWER: 

"""

## [{"generated_text":"yes.... and it will remain heads up."}]

In [281]:
parameters['text_inputs'] = payload
payload = json.dumps(parameters).encode('utf-8')

In [282]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: yes


In [283]:
payload = """
Q: A coin is heads up. Maybelle flips the coin. Shalonda does not flip the coin. Is the coin still heads up?
A: The coin was flipped by Maybelle. So the coin was flipped 1 time, which is an odd number. The coin started heads up, so after an odd number of flips, it will be tails up. So the answer
is no.

QUESTION:  A coin is heads up. John does not flip the coin. Shalonda does not flip the coin. Is the coin still heads up?
ANSWER:

"""

## '[{"generated_text":"The coin is still heads up because John and Shalonda did not flip the coin. So the answer is yes."}]'

In [284]:
parameters['text_inputs'] = payload
payload = json.dumps(parameters).encode('utf-8')

In [285]:
response = smr_client.invoke_endpoint(EndpointName=endpoint_name, 
                                  ContentType="application/json", 
                                  Body=payload)

model_predictions = json.loads(response['Body'].read())
generated_text = model_predictions['generated_texts'][0]
print(f'Response: {generated_text}')

Response: Shalonda does not flip the coin. So the answer is yes.


In [None]:
### Clean up the endpoint

In [None]:
# Delete the SageMaker endpoint
predictor.delete_model()
predictor.delete_endpoint()