# SageMaker JumpStart - invoke text generation endpoint

This notebook demonstrates how to attach a predictor to an existing endpoint by name and invoke the endpoint with example payloads.

In [7]:
from sagemaker.predictor import retrieve_default

Retrieve a predictor for your deployed endpoint by its name.

In [49]:
# endpoint_name = "gsm8k-encs1-meta-textgeneration-l-20240420-203637"
endpoint_name = "jumpstart-dft-meta-textgeneration-l-20240427-203329"
predictor = retrieve_default(endpoint_name)

Now query your endpoint with example payloads.

In [50]:
payload = {
    "inputs": "<s>[INST] what is the recipe of mayonnaise? explain in a few lines [/INST] ",
    "parameters": {
        "max_new_tokens": 256,
        "top_p": 0.9,
        "temperature": 0.6
    }
}
response = predictor.predict(payload)
print(response)

[{'generated_text': 'Mayonnaise is a thick, creamy condiment made from a mixture of egg yolks, oil, vinegar or lemon juice, and seasonings. Here is a basic recipe for homemade mayonnaise:\n\nIngredients:\n\n* 2 egg yolks\n* 1/2 cup (120 ml) neutral-tasting oil, such as canola or grapeseed\n* 1 tablespoon (15 ml) vinegar or lemon juice\n* Salt and pepper to taste\n\nInstructions:\n\n1. In a small bowl, whisk together the egg yolks and vinegar or lemon juice until smooth and slightly thickened.\n2. Slowly pour the oil into the egg yolk mixture while continuously whisking.\n3. Continue whisking until the mixture thickens and emulsifies, forming a smooth and creamy texture.\n4. Season with salt and pepper to taste.\n5. Cover and refrigerate for at least 30 minutes before serving.\n\nNote: You can also add other ingredients'}]


In [4]:
payload = {
    "inputs": "<s>[INST] I am going to Paris, what should I see? [/INST] Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:\n\n1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.\n2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.\n3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.\n\nThese are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.</s><s>[INST] What is so great about #1? [/INST] ",
    "parameters": {
        "max_new_tokens": 256,
        "top_p": 0.9,
        "temperature": 0.6
    }
}
response = predictor.predict(payload)
print(response)

[{'generated_text': '\n\n### What is so great about #1?\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so great about #1?**\n\n> **What is so great about #1?**\n\n**What is so'}]


In [52]:
payload = {
    "inputs": "<s>[INST] I am going to Paris, what should I see? [/INST] Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:\n\n1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.\n2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.\n3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.\n\nThese are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.</s><s>[INST] What is so great about #1? [/INST] ",
    "parameters": {
        "max_new_tokens": 256,
        "top_p": 0.9,
        "temperature": 0.6
    }
}
response = predictor.predict(payload)
print(response)

[{'generated_text': "The Eiffel Tower is considered one of the most iconic and recognizable landmarks in the world, and there are several reasons why it's so great:\n\n1. Unique Design: The Eiffel Tower's design is unlike any other structure in the world. It was created by Gustave Eiffel and his engineering company for the 1889 World's Fair in Paris, and its lattice-like framework was revolutionary for its time. The tower's unique design makes it a standout among other landmarks and a symbol of Paris and France.\n2. Historical Significance: The Eiffel Tower was built for the World's Fair, which was held in Paris to celebrate the 100th anniversary of the French Revolution. It was originally intended to be a temporary structure, but it became an instant icon of the city and a symbol of French culture and engineering.\n3. Breathtaking Views: The Eiffel Tower offers panoramic views of Paris from its observation decks, which are located on the first and second levels. Visitors can see many 
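The multi-turn payloads above are written by hand as one long string. As a minimal sketch, the same `<s>[INST] ... [/INST] ... </s>` layout can be assembled from a list of turns. The helper name `build_llama2_prompt` and its arguments are hypothetical and not part of the SageMaker SDK; the shortened assistant reply is only a placeholder.

```python
from typing import List, Optional, Tuple


def build_llama2_prompt(
    turns: List[Tuple[str, Optional[str]]],
    system_prompt: Optional[str] = None,
) -> str:
    """Assemble a prompt in the [INST] layout used by the payloads above.

    `turns` is a list of (user_message, assistant_reply) pairs. The last pair
    should use None as the reply so that the model generates it.
    """
    parts = []
    for index, (user, assistant) in enumerate(turns):
        # The system prompt, if any, is wrapped into the first user message.
        if index == 0 and system_prompt:
            user = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user}"
        segment = f"<s>[INST] {user} [/INST] "
        if assistant is not None:
            segment += f"{assistant}</s>"
        parts.append(segment)
    return "".join(parts)


prompt = build_llama2_prompt(
    [
        ("I am going to Paris, what should I see?", "Paris is known for the Eiffel Tower."),
        ("What is so great about #1?", None),
    ]
)
print(prompt)
```

The resulting string can be placed under `"inputs"` in the payload. Keeping each turn in a list makes it harder to drop a closing `</s>` or an `[/INST]` marker by accident.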

In [103]:
# "\\frac" is escaped on purpose: in a normal string literal "\f" is a form feed,
# so the LaTeX command would not reach the model intact.
payload = {
    "inputs": "<s>[INST] <<SYS>>\nYou are a diligent math student, logically solving given math problems\n<</SYS>>\n\nThe probability of rain tomorrow is $\\frac{3}{10}$. What is the probability that it will not rain tomorrow? Express your answer as a common fraction. [/INST] ",
    "parameters": {
        "max_new_tokens": 256,
        "top_p": 0.9,
        "temperature": 0.6
    }
}
response = predictor.predict(payload)
print(response)

[{'generated_text': 'Great, let\'s get started! The probability of rain tomorrow is given as $frac{3}{10}$. To find the probability that it will not rain tomorrow, we can use the concept of complementary probability.\n\nThe complementary probability of an event A is the probability of the event A\' (the complement of A), which is the probability of all outcomes that are not in A. In this case, the event "it will not rain tomorrow" is the complement of the event "it will rain tomorrow".\n\nSo, the probability that it will not rain tomorrow is:\n\nComplement of $frac{3}{10} = frac{7}{10}$\n\nTherefore, the probability that it will not rain tomorrow is $frac{7}{10}$.'}]
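The recorded response writes the fraction as `$frac{3}{10}$`, without the backslash. A likely cause is Python string escaping rather than the model: in a normal string literal, `\f` is read as a form feed character. The short sketch below compares the three ways of writing the LaTeX command. The variable names are illustrative only.

```python
plain = "$\frac{3}{10}$"       # "\f" is parsed as a form feed (U+000C)
raw = r"$\frac{3}{10}$"        # raw string keeps the backslash
escaped = "$\\frac{3}{10}$"    # explicit escape; same text as the raw string

print(repr(plain))
print(repr(raw))
print(raw == escaped)

# Prompts that also need real newlines can mix both kinds of literal.
question = "Solve:\n" + r"$\frac{3}{10} + \frac{1}{5}$"
print(question)
```

A raw string cannot be used for the whole prompt, because the `\n` sequences in the chat template must stay real newlines. Escaping only the LaTeX backslashes, or concatenating a raw fragment as shown, avoids that conflict.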


In [56]:

example = """
            \n\nExample 1
            \nSentence: Though the sun set with a brilliant display of colors, casting a warm glow over the serene beach, it was the bitter news I received earlier that clouded my emotions, making it impossible to truly appreciate nature's beauty.
            \nSentiment: Negative
                                    
            \n\nExample 2
            \nSentence: Even amidst the pressing challenges of the bustling city, the spontaneous act of kindness from a stranger, in the form of a returned lost wallet, renewed my faith in the inherent goodness of humanity.
            \nSentiment: Positive
                                    
            \n\nFollowing the same format above from the examples, What is the sentiment of this sentence: While the grandeur of the ancient castle, steeped in history and surrounded by verdant landscapes, was undeniably breathtaking, the knowledge that it was the site of numerous tragic events lent an undeniable heaviness to its majestic walls.
            """
payload = {
    "inputs": "<s>[INST] <<SYS>>\nYou are an excellent math student solving the given math problem\n<</SYS>>\n\nTo express 20 as a sum of different powers of 2, we would write $20 = 2^4 + 2^2$. The sum of the exponents of these powers is $4 + 2 = 6$. If 400 were expressed as a sum of at least two distinct powers of 2, what would be the least possible sum of the exponents of these powers? [/INST] ",
    "parameters": {
        "max_new_tokens": 512,
        "top_p": 0.9,
        "temperature": 0.6
    }
}

response = predictor.predict(payload)
print(response)

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (422) from 9deX2bJkth7Kltwjhtm8cdKIBAv0Im0xsOGd with message "Failed to deserialize the JSON body into the target type: inputs: invalid type: sequence, expected a string at line 1 column 11". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/jumpstart-dft-meta-textgeneration-l-20240427-203329 in account 471112774062 for more information.
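The 422 message says the endpoint received a sequence under `inputs` where it expected a string, which suggests that the payload sent in that run held a list rather than a single prompt. A client-side check can catch this before the request is made. The sketch below uses a hypothetical helper, `validate_payload`, which is not part of the SageMaker SDK.

```python
def validate_payload(payload: dict) -> dict:
    """Raise early if the payload shape would be rejected by the endpoint."""
    inputs = payload.get("inputs")
    if not isinstance(inputs, str):
        raise TypeError(
            f'"inputs" must be a single string, got {type(inputs).__name__}'
        )
    parameters = payload.get("parameters", {})
    if not isinstance(parameters, dict):
        raise TypeError('"parameters" must be a dict')
    return payload


bad = {"inputs": ["prompt one", "prompt two"], "parameters": {"max_new_tokens": 256}}
try:
    validate_payload(bad)
except TypeError as err:
    print(err)
```

If several prompts need to be scored, one option is to loop over them and send one payload per prompt, as the evaluation loops later in this notebook do.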

In [None]:
payload = {
    "inputs": "<s>[INST] <<SYS>>\nExplain your reasoning\n<</SYS>>\n\nForty percent of the students have elected to learn from home during the pandemic. The remaining students are divided into two equal groups, only one of which is physically in school on any day. What percent of students are present in school? [/INST] ",
    "parameters": {
        "max_new_tokens": 256,
        "top_p": 0.9,
        "temperature": 0.6
    }
}

import re


def remove_tags(text):
    # Strip the literal marker tags <se>, <de>, and <fe>; the text between them is kept.
    cleaned_text = re.sub(r"<(?:se|de|fe)>", "", text)

    # To also strip runs of angle brackets such as <<< or >>>, uncomment:
    # cleaned_text = re.sub(r"<+|>+", "", cleaned_text)

    return cleaned_text

response = predictor.predict(payload)
print(response)
print(remove_tags(response[0]['generated_text']))

# GSM8K 8-shot prompting

In [None]:
x, z, y = [], [], []
x.append("There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?")
z.append("There are 15 trees originally. Then there were 21 trees after some more were planted. So there must have been 21 - 15 = 6.")
y.append("6")

x.append("If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?")
z.append("There are originally 3 cars. 2 more cars arrive. 3 + 2 = 5.")
y.append("5")        

x.append("Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?")
z.append("Originally, Leah had 32 chocolates. Her sister had 42. So in total they had 32 + 42 = 74. After eating 35, they had 74 - 35 = 39.")
y.append("39")        

x.append("Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?")
z.append("Jason started with 20 lollipops. Then he had 12 after giving some to Denny. So he gave Denny 20 - 12 = 8.")
y.append("8")        

x.append("Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?")
z.append("Shawn started with 5 toys. If he got 2 toys each from his mom and dad, then that is 4 more toys. 5 + 4 = 9.")
y.append("9")        

x.append("There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?")
z.append("There were originally 9 computers. For each of 4 days, 5 more computers were added. So 5 * 4 = 20 computers were added. 9 + 20 is 29.")
y.append("29")        

x.append("Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?")
z.append("Michael started with 58 golf balls. After losing 23 on tuesday, he had 58 - 23 = 35. After losing 2 more, he had 35 - 2 = 33 golf balls.")
y.append("33")        

x.append("Olivia has $23. She bought five bagels for $3 each. How much money does she have left?")
z.append("Olivia had 23 dollars. 5 bagels for 3 dollars each will be 5 x 3 = 15 dollars. So she has 23 - 15 dollars left. 23 - 15 is 8.")
y.append("8")

In [126]:
import re
import json

payload = {
    "inputs": "<s>[INST] <<SYS>>\nExplain your reasoning\n<</SYS>>\n\nForty percent of the students have elected to learn from home during the pandemic. The remaining students are divided into two equal groups, only one of which is physically in school on any day. What percent of students are present in school? [/INST] ",
    "parameters": {
        "max_new_tokens": 512,
        "top_p": 0.9,
        "temperature": 0.6
    }
}

#  print(payload)

ground_truths = []
inference_answers = []

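def compute_accuracy(pred: list, gold: list):
    """Return the fraction of predictions that exactly match the gold answers."""
    if not pred:
        return 0.0  # avoid dividing by zero when nothing was evaluated
    acc = 0.0
    for p, g in zip(pred, gold):
        if p == g:
            acc += 1

    return acc / len(pred)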

def generate_complete_user_prompt(question):
    # Use real newlines ("\n"); "\\n" would put a literal backslash and "n" in the prompt.
    prompt_template = "<s>[INST] <<SYS>>\nExplain your reasoning\n<</SYS>>\n\n{} [/INST] ".format(question)

    return prompt_template

def format_few_shot_prompt(system_prompt, questions, answers, final_ans, actual_question):
    few_shot_prompt = ""
    # "inputs": "<s>[INST] <<SYS>>\nYou are a diligent math student, logically solving given math problems\n<</SYS>>\n\nIf you double a number and add 5 to the result, then that's 20 more than half of the original number. What's the original number? [/INST] Let x be the original number.\n2*x+5=20+x/2\n2*x-x/2=15\n4*x-x=30\n3*x=30\nx=<<10=10>>10\n#### 4</s><s>[INST] What is the units digit of the product of all the odd positive integers between 10 and 110? [/INST] "
    for i in range(len(questions)):
        if i == 0:
            few_shot_prompt += "\n\n" + questions[i] + "\n[/INST] \n" + answers[i] + "\n</s>"
        else:
            few_shot_prompt += "\n<s>\n[INST] \n" + questions[i] + "\n[/INST] \n" + answers[i] + "\n</s>"

    few_shot_prompt += "<s>\n[INST] \n" + actual_question + " \n[/INST] "
    few_shot_prompt = "<s>\n[INST] \n<<SYS>> \n" + system_prompt + " \n<</SYS>>" + few_shot_prompt
    # print(few_shot_prompt)
    return few_shot_prompt

def get_ground_truth_numerical_answer(answer):
    ans = answer.split('####')[-1]
    ans = ans.replace(',', '')  # handle numbers like 2,000
    try:
        ans = float(ans)
    except ValueError:
        ans = float("inf")
    return ans

def extract_answer_number(sentence: str) -> float:
    sentence = sentence.replace(',', '')
    pred = [s for s in re.findall(r'-?\d+\.?\d*', sentence)]
    if not pred:
        return float('inf')
    segment = sentence.split("The final answer is ")
    if len(segment) > 1:
        pred_answer = segment[1]
        pred_answer = [s for s in re.findall(r'-?\d+\.?\d*', pred_answer)]
        if len(pred_answer) > 0:
            pred_answer = pred_answer[0]
        else:
            pred_answer = float(pred[-1])
    else:
        # use the last number as the answer
        pred_answer = float(pred[-1])

    if isinstance(pred_answer, str):
        try:
            pred_answer = float(pred_answer)
        except ValueError as e:
            pred_answer = float('inf')
    return pred_answer

def perform_gsm8k_inference(input_filename, output_filename):

    # Uncomment and run this if you want to see how the text looks before and after encoding
    # with open(input_filename, 'r') as input_f:
    #     for line in input_f:
    #         # print(line)
    #         json_obj = json.loads(line)
    #         print(json_obj.get('question'))
    #         print(json_obj.get('answer'))
    #         print('######################')
    #         print(encode_numbers_in_text(json_obj.get('question')))
    #         print(encode_numbers_in_text(json_obj.get('answer')))
    #         break
    
    N = 10
    with open(input_filename, 'r') as input_f, open(output_filename, 'w') as output_f:
        for i, line in enumerate(input_f, 1):
            
                
            json_obj = json.loads(line)
            
            ground_truth_answer = json_obj['answer']
            ground_truths.append(get_ground_truth_numerical_answer(ground_truth_answer))
                                 
            # prompt = generate_complete_user_prompt(json_obj['question'])
            # print(prompt)
            # payload['inputs'] = prompt
            payload['inputs'] = format_few_shot_prompt("You are a diligent math student, logically solving the given math problems.", x, z, y, json_obj['question'])
            # print(payload['inputs'])
            response = predictor.predict(payload)
            # print(response)
            # print(extract_answer_number(response[0]['generated_text']))
            inference_answers.append(extract_answer_number(response[0]['generated_text']))

            # print(ground_truths)
            # print(inference_answers)
            # print("#####")

            llama_text = response[0]
            llama_text['prompt'] = payload['inputs']
            json.dump(llama_text, output_f)
            output_f.write('\n')

            if i % N == 0:
                print(f"Processed {i} lines...")

perform_gsm8k_inference('../test.jsonl', '../test_answers.jsonl')

# print(ground_truths)
# print(inference_answers)
# print("#####")

accuracy = compute_accuracy(inference_answers, ground_truths)
print(accuracy)
with open('accuracy_results', 'w') as output_f:
    output_f.write(str(accuracy))


# generate_complete_user_prompt("Forty percent of the students have elected to learn from home during the pandemic. The remaining students are divided into two equal groups, only one of which is physically in school on any day. What percent of students are present in school?")

# perform_gsm8k_inference('test.jsonl', 'data/gsm8k/processed/test_processed.jsonl')

# for i in range(1000):
#     response = predictor.predict(payload)
#     print(response)
#     print(i)
#     # print(extract_answer_number(response[0]['generated_text']))

Processed 10 lines...
Processed 20 lines...
Processed 30 lines...
Processed 40 lines...
Processed 50 lines...
Processed 60 lines...
Processed 70 lines...
Processed 80 lines...
Processed 90 lines...
Processed 100 lines...
Processed 110 lines...
Processed 120 lines...
Processed 130 lines...
Processed 140 lines...
Processed 150 lines...
Processed 160 lines...
Processed 170 lines...
Processed 180 lines...
Processed 190 lines...
Processed 200 lines...
Processed 210 lines...
Processed 220 lines...
Processed 230 lines...
Processed 240 lines...
Processed 250 lines...
Processed 260 lines...
Processed 270 lines...
Processed 280 lines...
Processed 290 lines...
Processed 300 lines...
Processed 310 lines...
Processed 320 lines...
Processed 330 lines...
Processed 340 lines...
Processed 350 lines...
Processed 360 lines...
Processed 370 lines...
Processed 380 lines...
Processed 390 lines...
Processed 400 lines...
Processed 410 lines...
Processed 420 lines...
Processed 430 lines...
Processed 440 lines...

In [119]:
print(inference_answers)

[18.0, 6.0, 75000.0, 180.0, 45.0, 5.08, 128.0, 16.0, 4.0]


In [121]:
print(inference_answers)

[26.0, 3.0, 40000.0, 180.0, 20.0, 120.0, 20.0, 2.0, 4.0]
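The lists printed above hold the numbers parsed from the generated text, which are then compared with the ground truth. As a worked example of that scoring step, the sketch below takes the last number in each response and computes accuracy with a small tolerance. The names `last_number` and `accuracy` and the sample sentences are made up for illustration; this is a simplified stand-in for `extract_answer_number` and `compute_accuracy`, not a replacement for them.

```python
import math
import re

NUMBER = re.compile(r"-?\d+\.?\d*")


def last_number(text: str) -> float:
    """Return the last number in `text`, or inf when there is none."""
    matches = NUMBER.findall(text.replace(",", ""))  # "2,000" -> "2000"
    return float(matches[-1]) if matches else float("inf")


def accuracy(predictions, references, tol=1e-6):
    """Fraction of predictions within `tol` of the reference; 0.0 if empty."""
    if not predictions:
        return 0.0
    hits = sum(
        math.isclose(p, g, abs_tol=tol) for p, g in zip(predictions, references)
    )
    return hits / len(predictions)


outputs = [
    "So she has 23 - 15 = 8 dollars left.",
    "The total is 2,000 apples.",
    "I am not sure.",
]
gold = [8.0, 2000.0, 5.0]
preds = [last_number(o) for o in outputs]
print(preds)
print(accuracy(preds, gold))
```

Taking the last number is a heuristic: a response that restates the question after giving its answer would be scored on the wrong value, so spot-checking a few parsed answers against the raw text is worthwhile.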


# GSM8K RAG Prompting

In [136]:
import json
import re


ground_truths = []
inference_answers = []

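def compute_accuracy(pred: list, gold: list):
    """Return the fraction of predictions that exactly match the gold answers."""
    if not pred:
        return 0.0  # avoid dividing by zero when nothing was evaluated
    acc = 0.0
    for p, g in zip(pred, gold):
        if p == g:
            acc += 1

    return acc / len(pred)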

def get_qa_pairs(input_str):
    # Using regex to split the string into individual JSON strings
    json_strings = re.findall(r'\{.*?\}', input_str)
    
    # Initializing lists to store questions and answers
    questions = []
    answers = []
    
    # Parsing each JSON string and extracting questions and answers
    for json_str in json_strings:
        try:
            json_obj = json.loads(json_str)
            questions.append(json_obj['question'])
            answers.append(json_obj['answer'])
        except json.JSONDecodeError:
            print("Error decoding JSON from:", json_str)
    return questions, answers


def format_few_shot_prompt(system_prompt, questions, answers, actual_question):
    few_shot_prompt = ""
    # "inputs": "<s>[INST] <<SYS>>\nYou are a diligent math student, logically solving given math problems\n<</SYS>>\n\nIf you double a number and add 5 to the result, then that's 20 more than half of the original number. What's the original number? [/INST] Let x be the original number.\n2*x+5=20+x/2\n2*x-x/2=15\n4*x-x=30\n3*x=30\nx=<<10=10>>10\n#### 4</s><s>[INST] What is the units digit of the product of all the odd positive integers between 10 and 110? [/INST] "
    for i in range(len(questions)):
        if i == 0:
            few_shot_prompt += "\n\n" + questions[i] + "\n[/INST] \n" + answers[i] + "\n</s>"
        else:
            few_shot_prompt += "\n<s>\n[INST] \n" + questions[i] + "\n[/INST] \n" + answers[i] + "\n</s>"
        # break

    few_shot_prompt += "<s>\n[INST] \n" + actual_question + " \n[/INST] "
    few_shot_prompt = "<s>\n[INST] \n<<SYS>> \n" + system_prompt + " \n<</SYS>>" + few_shot_prompt
    return few_shot_prompt


def get_ground_truth_numerical_answer(answer):
    ans = answer.split('####')[-1]
    ans = ans.replace(',', '')  # handle numbers like 2,000
    try:
        ans = float(ans)
    except ValueError:
        ans = float("inf")
    return ans


N = 10
with open("../gsm8k_rag.jsonl", 'r') as input_f, open("../test_answer_gsmNormal_rag.jsonl", 'w') as output_f:
    for i, line in enumerate(input_f, 1):

        json_obj = json.loads(line)
            
        ground_truth_answer = json_obj['answer']
        ground_truths.append(get_ground_truth_numerical_answer(ground_truth_answer))

        questions,answers = get_qa_pairs(json_obj['prompt'])
        input_prompt = format_few_shot_prompt("You are a diligent math student, logically solving the given math problems.", questions, answers, json_obj['question'])
        # print(input_prompt)

        payload = {
            "inputs": input_prompt,
            "parameters": {
                "max_new_tokens": 512,
                "top_p": 0.9,
                "temperature": 0.6
            }
        }

        response = predictor.predict(payload)
        # print(response)
        # break

        inference_answers.append(extract_answer_number(response[0]['generated_text']))

        # print(ground_truths)
        # print(inference_answers)
        # print("#####")

        llama_text = response[0]
        llama_text['prompt'] = payload['inputs']
        json.dump(llama_text, output_f)
        output_f.write('\n')

        if i % N == 0:
            print(f"Processed {i} lines...")


accuracy = compute_accuracy(inference_answers, ground_truths)
print(accuracy)
with open('accuracy_results_gsmNormal.txt', 'w') as output_f:
    output_f.write(str(accuracy))


Processed 10 lines...
Processed 20 lines...
Processed 30 lines...
Processed 40 lines...
Processed 50 lines...
Processed 60 lines...
Processed 70 lines...
Processed 80 lines...
Processed 90 lines...
Processed 100 lines...
Processed 110 lines...
Processed 120 lines...
Processed 130 lines...
Processed 140 lines...
Processed 150 lines...
Processed 160 lines...
Processed 170 lines...
Processed 180 lines...
Processed 190 lines...
Processed 200 lines...
Processed 210 lines...
Processed 220 lines...
Processed 230 lines...
Processed 240 lines...
Processed 250 lines...
Processed 260 lines...
Processed 270 lines...
Processed 280 lines...
Processed 290 lines...
Processed 300 lines...
Processed 310 lines...
Processed 320 lines...
Processed 330 lines...
Processed 340 lines...
Processed 350 lines...
Processed 360 lines...
Processed 370 lines...
Processed 380 lines...
Processed 390 lines...
Processed 400 lines...
Processed 410 lines...
Processed 420 lines...
Processed 430 lines...
Processed 440 lines...

# MATH prompting

In [102]:
import json
import re

def get_qa_pairs(input_str):
    # Using regex to split the string into individual JSON strings
    json_strings = re.findall(r'\{.*?\}', input_str)
    
    # Initializing lists to store questions and answers
    questions = []
    answers = []
    
    # Parsing each JSON string and extracting questions and answers
    for json_str in json_strings:
        try:
            json_obj = json.loads(json_str)
            questions.append(json_obj['question'])
            answers.append(json_obj['answer'])
        except json.JSONDecodeError:
            print("Error decoding JSON from:", json_str)
    return questions, answers

def format_few_shot_prompt(system_prompt, questions, answers, actual_question):
    few_shot_prompt = ""
    # "inputs": "<s>[INST] <<SYS>>\nYou are a diligent math student, logically solving given math problems\n<</SYS>>\n\nIf you double a number and add 5 to the result, then that's 20 more than half of the original number. What's the original number? [/INST] Let x be the original number.\n2*x+5=20+x/2\n2*x-x/2=15\n4*x-x=30\n3*x=30\nx=<<10=10>>10\n#### 4</s><s>[INST] What is the units digit of the product of all the odd positive integers between 10 and 110? [/INST] "
    for i in range(len(questions)):
        if i == 0:
            few_shot_prompt += "\n\n" + questions[i] + "\n[/INST] \n" + answers[i] + "\n</s>"
        else:
            few_shot_prompt += "\n<s>\n[INST] \n" + questions[i] + "\n[/INST] \n" + answers[i] + "\n</s>"
        # break

    few_shot_prompt += "<s>\n[INST] \n" + actual_question + " \n[/INST] "
    few_shot_prompt = "<s>\n[INST] \n<<SYS>> \n" + system_prompt + " \n<</SYS>>" + few_shot_prompt
    return few_shot_prompt
                
    

with open("../math_test_set/counting_and_probability/82.json", 'r') as input_f:
    json_obj = json.load(input_f)



questions,answers = get_qa_pairs(json_obj['prompt'])
# print(len(questions))
# print(len(answers))
input_prompt = format_few_shot_prompt("You are a diligent math student, logically solving the given math problems.", questions, answers, json_obj['problem'])
# print(input_prompt)

payload = {
    "inputs": input_prompt,
    "parameters": {
        "max_new_tokens": 512,
        "top_p": 0.9,
        "temperature": 0.6
    }
}
response = predictor.predict(payload)
print(response)

[{'generated_text': 'To find the probability that it will not rain tomorrow, we can subtract the probability of rain from 1:\n\nProbability of not raining = 1 - Probability of raining\n= 1 - (3/10)\n= 7/10\n\nSo, the probability that it will not rain tomorrow is 7/10.\n\nNote: Since the probability of rain is given as a fraction, we can simplify it by multiplying both the numerator and denominator by 10 to make the fraction easier to work with:\n\nProbability of raining = 3/10\nProbability of not raining = 7/10\n\nTherefore, the probability that it will not rain tomorrow is 7/10.'}]
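`get_qa_pairs` splits the retrieved examples with the non-greedy pattern `\{.*?\}`, which stops at the first closing brace. MATH problems often contain LaTeX such as `\frac{3}{10}`, so a match can end in the middle of a question and then fail to parse. One alternative, sketched below, scans with `json.JSONDecoder.raw_decode`, which parses a complete JSON value starting at a given index. The helper name `iter_json_objects` and the sample string are hypothetical, and the sketch assumes the `prompt` field holds JSON objects placed one after another.

```python
import json


def iter_json_objects(text: str):
    """Yield every top-level JSON object embedded in `text`."""
    decoder = json.JSONDecoder()
    position = 0
    while True:
        start = text.find("{", position)
        if start == -1:
            return
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            position = start + 1  # not valid JSON here; keep scanning
            continue
        yield obj
        position = end


sample = (
    '{"question": "What is $\\\\frac{3}{10}$ as a decimal?", "answer": "0.3"} '
    '{"question": "What is 2 + 2?", "answer": "4"}'
)
pairs = [(o["question"], o["answer"]) for o in iter_json_objects(sample)]
print(pairs)
```

Because the decoder tracks string boundaries, braces inside a question or answer do not end the object early.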


In [69]:
pwd

'/home/sagemaker-user/DemoNotebooks'

This model supports the following payload parameters. You may specify any subset of these parameters when invoking an endpoint.

* **do_sample:** If True, activates logits sampling. If specified, it must be boolean.
* **max_new_tokens:** Maximum number of generated tokens. If specified, it must be a positive integer.
* **repetition_penalty:** A penalty for repetitive generated text. 1.0 means no penalty.
* **return_full_text:** If True, the input text is included in the generated output. If specified, it must be boolean. Defaults to False.
* **stop:** If specified, it must be a list of strings. Text generation stops if any one of the specified strings is generated.
* **seed:** Random sampling seed.
* **temperature:** Controls the randomness of the output. A higher temperature makes low-probability words more likely to appear, and a lower temperature favors high-probability words. As `temperature` -> 0, sampling approaches greedy decoding. If specified, it must be a positive float.
* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.
* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.
* **truncate:** Truncate input tokens to the given size.
* **typical_p:** Typical decoding mass, according to [Typical Decoding for Natural Language Generation](https://arxiv.org/abs/2202.00666).
* **best_of:** Generate `best_of` sequences and return the one with the highest token logprobs.
* **watermark:** Whether to perform watermarking with [A Watermark for Large Language Models](https://arxiv.org/abs/2301.10226).
* **details:** Return generation details, to include output token logprobs and IDs.
* **decoder_input_details:** Return decoder input token logprobs and IDs.
* **top_n_tokens:** Return the N most likely tokens at each step.
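As an illustration of the list above, the sketch below builds a payload that sets several of these parameters and checks a few of the stated constraints before the request is sent. `build_payload` is a hypothetical helper, not an SDK function, and it treats the bounds on `top_p` as exclusive, which is an assumption.

```python
def build_payload(prompt: str, **parameters) -> dict:
    """Build a request body, checking a few of the constraints listed above."""
    if "max_new_tokens" in parameters:
        value = parameters["max_new_tokens"]
        # bool is a subclass of int, so it is rejected explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            raise ValueError("max_new_tokens must be a positive integer")
    if "temperature" in parameters and not parameters["temperature"] > 0:
        raise ValueError("temperature must be a positive float")
    # Assumption: the bounds on top_p are treated as exclusive here.
    if "top_p" in parameters and not 0 < parameters["top_p"] < 1:
        raise ValueError("top_p must be a float between 0 and 1")
    if "stop" in parameters:
        stop = parameters["stop"]
        if not isinstance(stop, list) or not all(isinstance(s, str) for s in stop):
            raise ValueError("stop must be a list of strings")
    return {"inputs": prompt, "parameters": parameters}


payload = build_payload(
    "<s>[INST] Name three prime numbers. [/INST] ",
    do_sample=True,
    max_new_tokens=64,
    temperature=0.6,
    top_p=0.9,
    stop=["</s>"],
    seed=42,
)
print(payload)
```

The returned dictionary has the same shape as the payloads used throughout this notebook and can be passed to `predictor.predict`.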