<div style="background-color: #FFDDDD; border-left: 5px solid red; padding: 10px; color: black;">
    <strong>Kernel: Python 3 (ipykernel)</strong>
</div>

## Prompt Engineering with LLMs on SageMaker Studio

---

## Overview

Prompt engineering is the practice of designing and refining the inputs given to language models so that they perform better across a wide range of tasks. It also helps us understand what these models do well and where they fall short.

Researchers use prompt engineering to improve model performance on tasks such as answering questions or solving math problems. Developers use it to build robust and effective ways of interacting with large language models and other tools.

Prompt engineering is not just about writing questions or commands for these models. It is a broader set of skills for working with them. We can use these skills to make language models safer and to extend them, for example by supplying knowledge about a specific subject.

In this lab, we learn how to:
1. Use SageMaker to set up and send prompts to a large language model, Llama 3.1.
2. Apply basic and advanced prompting techniques.

In [None]:
import os
import json
import boto3
import sagemaker
from sagemaker import serializers, deserializers
from ipywidgets import Dropdown
from typing import Dict, List
from IPython.display import display, HTML
from datetime import datetime
from langchain import PromptTemplate
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.llms import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler
from langchain.chains import LLMChain
from langchain_core.messages import SystemMessage
from langchain_core.prompts import (
    ChatPromptTemplate, 
    HumanMessagePromptTemplate
)
from langchain_experimental.utilities import PythonREPL
from langchain_community.document_loaders import WebBaseLoader
from langchain.chains.summarize import load_summarize_chain
from langchain_core.output_parsers import StrOutputParser

import sys
sys.path.append(os.path.dirname(os.getcwd()))
from utilities.helpers import (
    pretty_print_html, 
    set_meta_llama_params,
    print_dialog,
    format_messages,
    read_eula,
    ContentHandlerwithTracking
)

import mlflow
from mlflow import MlflowClient

#### Connect to a Hosted `Llama 3.1 8b Instruct` Model

In [None]:
endpoint_name = "meta-llama31-8b-instruct-endpoint" 
mlflow_experiment_name = f"Prompt-Tracker-{datetime.now().strftime('%y%m%d')}"
boto_region = boto3.Session().region_name

In [None]:
sess = sagemaker.session.Session(
    boto_session=boto3.Session(
        region_name=boto_region
    )
)
sm_client = boto3.client(
    "sagemaker", 
    region_name=boto_region
)

smr_client = boto3.client(
    "sagemaker-runtime", 
    region_name=boto_region
)

pretrained_predictor = sagemaker.Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sess,
    serializer=serializers.JSONSerializer(),
    deserializer=deserializers.JSONDeserializer(),
)

tracking_server_arn = sm_client.list_mlflow_tracking_servers(
)['TrackingServerSummaries'][0]['TrackingServerArn']
tracking_server_url = sm_client.create_presigned_mlflow_tracking_server_url(
    TrackingServerName=tracking_server_arn.split('/')[-1]
)
mlflow.set_tracking_uri(tracking_server_arn)
experiment_info = mlflow.set_experiment(mlflow_experiment_name)

content_handler = ContentHandlerwithTracking(mlflow_experiment_name)

llm=SagemakerEndpoint(
     endpoint_name=pretrained_predictor.endpoint_name, 
     region_name=sess.boto_region_name, 
     model_kwargs={
         "max_new_tokens": 700, 
         "top_p": 0.8, 
         "temperature": 0.1
     },
    content_handler=content_handler
 )

pretty_print_html(f"Using tracking server {tracking_server_arn}")

In [None]:
def run_llm_inference(llm, template, sanitize_output=False, repl=False, **kwargs):
    # Clean up kwargs inputs: trim whitespace and replace newlines with spaces
    # so that words on adjacent lines are not fused together
    for _k in kwargs:
        kwargs[_k] = kwargs[_k].strip().replace('\n', ' ')

    # Build the base LLM chain
    llm_chain = template | llm

    # Dynamically add output sanitization if requested
    if sanitize_output:
        llm_chain = llm_chain | StrOutputParser() | _sanitize_output

    if repl:
        llm_chain = llm_chain | repl
    
    # Get the response from the chain
    response = llm_chain.invoke(kwargs)
    
    # Optionally truncate long input values
    for _k in kwargs:
        if len(kwargs[_k]) > 200:
            kwargs[_k] = kwargs[_k][:200] + "..."
    
    # Format and return the full message
    prompt_text = template.format(**kwargs)
    if sanitize_output and repl:
        full_message = f"{prompt_text}\nAI: \n (executed output code) \n{response}"
    elif sanitize_output:
        full_message = f"{prompt_text}\nAI: \n ```\n{response}```"
    else:
        # Covers plain responses and REPL output without sanitization
        full_message = f"{prompt_text}\nAI: {response}"

    return full_message

The parameters below make up the inference payload for the `Llama` endpoint:

* **max_new_tokens:** Model generates text until the output length (excluding the input context length) reaches max_new_tokens. If specified, it must be a positive integer.

* **temperature:** Controls the randomness of the output. A higher temperature makes low-probability words more likely to be sampled, while a lower temperature favors high-probability words. As temperature approaches 0, sampling approaches greedy decoding. If specified, it must be a positive float.

* **top_p:** Top p, also known as nucleus sampling, is another hyperparameter that controls the randomness of language model output. It sets a probability threshold and keeps the smallest set of most likely tokens whose cumulative probability reaches that threshold. The model then randomly samples from this set of tokens to generate output. This can produce more coherent output than sampling from the entire vocabulary, while remaining more diverse than greedy decoding. For example, if you set top p to **0.9**, the model will only consider the most likely words that make up **90%** of the probability mass.
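
To make the interaction between `temperature` and `top_p` concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the endpoint's actual implementation: the function names (`apply_temperature`, `top_p_filter`, `sample_token`), the toy vocabulary, and the scores are all hypothetical.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Return indices of the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

def sample_token(logits, temperature=0.1, top_p=0.8, rng=random):
    """Sample one token index after temperature scaling and nucleus filtering."""
    probs = apply_temperature(logits, temperature)
    kept = top_p_filter(probs, top_p)
    weights = [probs[i] for i in kept]  # random.choices renormalizes the weights
    return rng.choices(kept, weights=weights, k=1)[0]

# Hypothetical next-token scores for the prompt "The sky is"
vocab = ["blue", "clear", "falling", "green"]
logits = [2.0, 1.0, 0.5, -1.0]
print(vocab[sample_token(logits)])
```

With a temperature as low as the `0.1` used in this lab, almost all probability mass lands on the top-scoring token, so the nucleus usually contains a single candidate and the output is close to greedy decoding.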

#### Start Tracking your prompts with `MLflow`

In [None]:
display(HTML(
    f'Access MLflow Tracking Server here: <a href="{tracking_server_url["AuthorizedUrl"]}" target="_blank">{tracking_server_arn.split("/")[-1]}</a>'))

## Prompt Engineering Basics

---


## Basic prompts

In this lab, we'll delve into prompt engineering examples that showcase the utility of well-designed prompts, setting the stage for the more complex topics explored in advanced modules.

Understanding key principles often becomes clearer when illustrated with real-world examples. In the sections that follow, we demonstrate a variety of tasks made possible through the strategic crafting of prompts.

In [None]:
basic_template = '''You are a helpful ai assistant. Keep your answers short and talkative only when required!

{text}
'''

basic_prompt_template = PromptTemplate.from_template(basic_template)

In [None]:
full_response = run_llm_inference(llm=llm, text="The sky is", template=basic_prompt_template)
pretty_print_html(full_response)

In [None]:
full_response = run_llm_inference(llm=llm, text="Translate this sentence to French: I am learning to speak French.", template=basic_prompt_template)
pretty_print_html(full_response)

### Text Summarization

One of the key activities in natural language generation involves text summarization, which comes in various forms and contexts. One of the most intriguing capabilities of language models is their skill in distilling lengthy articles or complex ideas into brief, easy-to-grasp summaries. For this exercise, we will delve into the basics of text summarization using tailored prompts.

Suppose you wish to familiarize yourself with the age-old tale of "The Tortoise and the Hare." You could start with a prompt like the following:

In [None]:
summ_template = '''You are a helpful ai assistant. Summarize the paragraph below in one sentence only:

{text}
'''

summ_prompt_template = PromptTemplate.from_template(summ_template)

In [None]:
full_response = run_llm_inference(llm=llm, text="""The hare was once boasting of his speed before the other animals. "I have never yet been beaten," said he, "when I put forth my full speed. I challenge any one here to race with me." The tortoise said quietly, "I accept your challenge." "That is a good joke," said the hare; "I could dance round you all the way." "Keep your boasting till you've beaten me," answered the tortoise. "Shall we race?" So a course was fixed and a start was made. The hare darted almost out of sight at once, but soon stopped and, to show his contempt for the tortoise, lay down to have a nap. The tortoise plodded on and plodded on, and when the hare awoke from his nap, he saw the tortoise just near the winning-post and could not run up in time to save the race.""", template=summ_prompt_template)
pretty_print_html(full_response)

In [None]:
loader = WebBaseLoader("https://blog.langchain.dev/langgraph/")
docs = loader.load()

In [None]:
full_response = run_llm_inference(llm=llm, text=docs[0].page_content, template=summ_prompt_template)
pretty_print_html(full_response)

### Question Answering

A highly effective way to elicit precise responses from the model is to refine the structure of the prompt. A well-designed prompt often combines instructions, context, and input and output indicators. None of these elements is required, but the more specific your instructions, the better the results tend to be. The example below shows the impact of a carefully structured prompt.
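
As a rough sketch of how these prompt elements fit together, the snippet below assembles an instruction, a context, a question, and an output indicator into a single string using plain Python. The helper name `build_prompt` and the `Answer:` indicator are assumptions made for illustration; the lab itself builds the prompt with LangChain templates.

```python
def build_prompt(instruction, context, question, output_indicator="Answer:"):
    """Join the prompt elements in a fixed order, one per line."""
    parts = [
        instruction.strip(),              # what the model should do
        f"Context: {context.strip()}",    # the information it may rely on
        f"Question: {question.strip()}",  # the actual input
        output_indicator,                 # cue for where the answer starts
    ]
    return "\n".join(parts)

prompt = build_prompt(
    instruction="Answer the question based on the context below. Keep the answer short.",
    context="In 1849, thousands of people rushed to California in search of gold.",
    question="What year did the events take place?",
)
print(prompt)
```

Keeping each element on its own labeled line makes it easy to vary one part, such as the instruction, while holding the others fixed when comparing responses.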

In [None]:
qna_prompt_template = ChatPromptTemplate.from_messages(
    [
        SystemMessage(
            content=(
                "Answer the following question based on the context below. Keep the answer short. Respond 'Unsure about answer' if not sure about the answer."
            )
        ),
        HumanMessagePromptTemplate.from_template("Context: {context_text}\nQuestion: {context_q}"),
    ]
)

In [None]:
context_text = "In 1849, thousands of people rushed to California in search of gold and riches. This was known as the California Gold Rush. Prospectors came from all over the world during this time period."
context_q = "What year did the events take place?"

full_response = run_llm_inference(llm=llm, context_text=context_text, context_q=context_q, template=qna_prompt_template)

pretty_print_html(full_response)

In [None]:
loader = WebBaseLoader("https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/features.html")
docs = loader.load()

context_q = "What is Amazon Q?"
full_response = run_llm_inference(llm=llm, context_text=docs[0].page_content, context_q=context_q, template=qna_prompt_template)
pretty_print_html(full_response)

### Text Classification

Up to this point, you've given straightforward directives to achieve specific outcomes. However, in your role as a prompt engineer, enhancing the quality of your instructions is imperative. It's not just about better commands; for more complex scenarios, mere instructions won't suffice. This is the juncture where contextual understanding and nuanced elements become crucial. Elements such as [input data] or illustrative [examples] can offer further guidance.

In [None]:
sentiment_template = '''Classify the text into negative or positive.

Text: {context_text}

Sentiment:'''

sentiment_prompt_template = PromptTemplate.from_template(sentiment_template)

In [None]:
context_text = "Apple stock is currently trading at 150 dollars per share. Given Apple's strong financial performance lately with increased iPhone sales and new product launches planned, I predict the stock price will increase to around 160 dollars per share over the next month."
full_response = run_llm_inference(llm=llm, context_text=context_text, template=sentiment_prompt_template)
pretty_print_html(full_response)

In [None]:
sentiment_template = '''Determine if this article is about "Technology", "Politics", or "Business".

Text: {context_text}

Category:'''

sentiment_prompt_template = PromptTemplate.from_template(sentiment_template)

In [None]:
context_text = "The article discussed how social media platforms like Facebook and Twitter are dealing with harmful content and political misinformation leading up to the next US presidential election."
full_response = run_llm_inference(llm=llm, context_text=context_text, template=sentiment_prompt_template)
pretty_print_html(full_response)

### Role Playing

In [None]:
role_prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an AI research assistant. Your tone is technical and scientific. Your name is {name}. Keep your answers less than 5 sentences."),
        ("human", "Hello, how are you doing and who are you?"),
        ("ai", "Greetings! I am an AI research assistant named {name}. I am doing well! How can I help you today?"),
        ("human", "{text}"),
    ]
)

In [None]:
text = "Can you tell me about the creation of volcanic mountains?"
full_response = run_llm_inference(llm=llm, name="Jarvis", text=text, template=role_prompt_template)
pretty_print_html(full_response)

### Code Generation

Code generation involves prompting a model to generate code without providing any examples, relying solely on the model's pre-training. We will test this by giving the model instructions to write code snippets without any demonstrations.

The model will attempt to produce valid code based on these descriptions using its implicit knowledge gained during training.

Evaluating zero-shot code generation will allow us to test the limits of the model's unaided coding skills for different languages. The results can reveal strengths, gaps, and opportunities to supplement with few-shot examples.

This section will provide insights into current capabilities and future work needed to move toward general-purpose AI that can code without extensive training.

#### Python code generation:

For Python, we can provide prompts like:
- Write a Python function that prints numbers from 1 to 10
- Generate Python code to open a file and read the contents

In [None]:
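def _sanitize_output(text: str) -> str:
    # Extract the body of the first fenced code block with a recognized language tag.
    # The longer tag is checked first so that a javascript fence is not cut after "java".
    for fence in ("```javascript", "```java", "```python"):
        if fence in text:
            # maxsplit=1 avoids an unpacking error when the fence appears more than once
            _, after = text.split(fence, 1)
            return after.split("```")[0]
    # No recognized fence: return the text unchanged instead of raising
    return text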
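
As an alternative to string splitting, the extraction step can also be sketched with a regular expression. This is a hedged illustration, not part of the lab's helpers: `extract_code_block` and the `FENCE` constant are hypothetical names, and the fence marker is built with `"`" * 3` only to keep the example readable.

```python
import re

FENCE = "`" * 3  # three backticks, the Markdown code fence marker

def extract_code_block(text, language=None):
    """Return the body of the first fenced block, or the text itself if none is found.

    If language is given, only a fence with exactly that tag matches.
    """
    tag = re.escape(language) if language else r"[\w+-]*"
    pattern = re.compile(FENCE + tag + r"[ \t]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(text)
    return match.group(1) if match else text

reply = "Here you go:\n" + FENCE + "python\nprint('hi')\n" + FENCE + "\nDone."
print(extract_code_block(reply))
```

Requiring a newline right after the tag means a request for `java` does not accidentally match a `javascript` fence, and falling back to the original text avoids an exception when the model replies without a fence.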

In [None]:
code_template = """Write some python code to solve the user's problem. 

Return only python code in Markdown format, e.g.:

```python
....
```"""
code_prompt_template = ChatPromptTemplate.from_messages([("system", code_template), ("human", "{code_text}")])

In [None]:
code_text = "Download a file from s3://xyz/file/sample.xyz from Amazon S3 to local disk path ./"

full_response = run_llm_inference(
    llm=llm, code_text=code_text, template=code_prompt_template, sanitize_output=_sanitize_output, repl=None
)

pretty_print_html(full_response)

In [None]:
# chain = prompt | llm | StrOutputParser() | _sanitize_output  
code_text = "Generate a Python program that prints the numbers from 1 to 10"
full_response = run_llm_inference(
    llm=llm, code_text=code_text, template=code_prompt_template, 
    sanitize_output=_sanitize_output, repl=PythonREPL().run
)
pretty_print_html(full_response)

#### JavaScript code generation:

For JavaScript:
- Write a JavaScript function that returns the maximum value in an array
- Generate JavaScript code to create a for loop that prints numbers 1 to 5

In [None]:
code_template = """Write some JavaScript code to solve the user's problem. 

Return only JavaScript code in Markdown format, e.g.:

```javascript
....
```"""
code_prompt_template = ChatPromptTemplate.from_messages([("system", code_template), ("human", "{code_text}")])

In [None]:
code_text = "Write a JavaScript function that returns the largest number in an array"

full_response = run_llm_inference(llm=llm, code_text=code_text, template=code_prompt_template, sanitize_output=_sanitize_output)

pretty_print_html(full_response)

#### SQL code generation:

For SQL:
- MySQL query to get the title and quantity for all books where the quantity is greater than 100
- Generate SQL code to join the "orders" and "products" tables

In [None]:
sql_template = """
{tables_text}

Create a MySQL query to get the title and quantity for all books where the quantity is greater than 100 and explain the query in no more than 100 words.
"""

sql_prompt_template = PromptTemplate.from_template(sql_template)

In [None]:
tables_text = """
Table books, columns = [BookId, Title, Author]
Table inventory, columns = [BookId, Quantity]
"""

full_response = run_llm_inference(
    llm=llm, tables_text=tables_text, template=sql_prompt_template, sanitize_output=None, repl=None
)

pretty_print_html(full_response)

### Reasoning

Today, one of the most formidable challenges for Large Language Models (LLMs) lies in the domain of reasoning. This area intrigues me significantly, given the intricate applications that could benefit from enhanced reasoning capabilities in LLMs.

While there have been strides in the model's mathematical functionalities, it's crucial to underscore that tasks involving reasoning are often stumbling blocks for existing LLMs. Specialized techniques in prompt engineering are imperative to navigate these challenges. While we'll delve into these advanced strategies in an upcoming guide, this lab will provide a primer by walking you through basic examples that demonstrate the model's capabilities in deductive reasoning and logical inferences.

In [None]:
reasoning_prompt_template = ChatPromptTemplate.from_messages(
    [
        SystemMessage(
            content=(
                "Given the facts, use logical reasoning to reach a conclusion."
            )
        ),
        HumanMessagePromptTemplate.from_template("Given Fact: {fact_text}\nExplain your reasoning: {explain_text}"),
    ]
)

In [None]:
fact_text = "All men are mortal. Socrates is a man."
explain_text = "Is Socrates mortal?"

full_response = run_llm_inference(llm=llm, fact_text=fact_text, explain_text=explain_text, template=reasoning_prompt_template)

pretty_print_html(full_response)

In [None]:
fact_text = "There are 5 people - Alan, Beth, Cindy, David and Erica. Alan is taller than Beth. Beth is shorter than Cindy. Cindy is taller than David. David is taller than Erica."
explain_text = "Who is the tallest?"

full_response = run_llm_inference(llm=llm, fact_text=fact_text, explain_text=explain_text, template=reasoning_prompt_template)

pretty_print_html(full_response)
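
When reviewing the model's answer to the height puzzle, it can help to check what the stated facts actually entail. The sketch below is a brute-force check written for illustration only. It assumes all five heights are distinct, and the name `possible_tallest` is hypothetical. It tries every ordering and keeps those consistent with the facts.

```python
from itertools import permutations

people = ["Alan", "Beth", "Cindy", "David", "Erica"]
# (taller, shorter) pairs taken from the facts in the prompt
facts = [("Alan", "Beth"), ("Cindy", "Beth"), ("Cindy", "David"), ("David", "Erica")]

def possible_tallest(people, facts):
    """Return everyone who is tallest in at least one ordering consistent with the facts."""
    candidates = set()
    for order in permutations(people):  # order[0] is the tallest
        rank = {name: i for i, name in enumerate(order)}
        if all(rank[taller] < rank[shorter] for taller, shorter in facts):
            candidates.add(order[0])
    return sorted(candidates)

print(possible_tallest(people, facts))  # ['Alan', 'Cindy']
```

Under these assumptions the facts never compare Alan with Cindy, so either could be the tallest. A response that names one of them with certainty is asserting more than the prompt supports, which makes this a useful probe of the model's reasoning.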

## Advanced Prompting Techniques
---

### Zero-shot

Modern large language models like Llama have been optimized to follow instructions and trained on enormous datasets. This enables them to perform certain tasks without any fine-tuning, known as zero-shot learning. Previously, we evaluated some zero-shot prompts. For example:

In [None]:
zs_template = """
Translate this sentence to {language}: I am learning to speak {language}.
"""

zs_prompt_template = PromptTemplate.from_template(zs_template)

In [None]:
language = "French"

full_response = run_llm_inference(
    llm=llm, language=language, template=zs_prompt_template
)

pretty_print_html(full_response)

Although Large Language Models (LLMs) exhibit impressive abilities in zero-shot scenarios, their performance can falter when tackling more intricate tasks within that context. To ameliorate this, the concept of few-shot prompting comes into play. This technique facilitates in-context learning by incorporating example-based guidance directly into the prompt, thereby enhancing the model's output accuracy. These examples act as a form of conditioning that influences the model's responses in subsequent instances.

For illustrative purposes, let's delve into a hands-on example of few-shot prompting. In this exercise, the objective is to accurately incorporate a novel term into a sentence.

### Few-shot prompts

Few-shot prompting enables in-context learning by supplying the model with demonstrations directly in the prompt. These examples condition the model's response to new inputs. As described by [Brown et al. (2020)](https://arxiv.org/abs/2005.14165), few-shot prompting can be applied to tasks like properly using novel words in sentences. With just a couple of demonstrations, the model can pick up a new concept or skill without any additional training.
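
To show what a few-shot prompt looks like once it is rendered, here is a minimal sketch in plain Python, independent of LangChain. The helper `build_few_shot_prompt`, its dictionary keys, and the `Statement:` / `Sample Usage:` labels are assumptions chosen for illustration.

```python
def build_few_shot_prompt(prefix, examples, new_statement, separator="\n\n"):
    """Render the prefix, each demonstration, and the open-ended query into one string."""
    shots = [
        f"Statement: {ex['statement']}\nSample Usage: {ex['usage']}"
        for ex in examples
    ]
    # The final block repeats the same layout but leaves the usage blank for the model
    query = f"Statement: {new_statement}\nSample Usage:"
    return separator.join([prefix, *shots, query])

demo_examples = [
    {
        "statement": "A 'blicket' is a tool used for farming",
        "usage": "The farmer used a blicket to dig holes and plant seeds.",
    },
    {
        "statement": "'Flooping' refers to a dance move where you spin around",
        "usage": "The kids were flooping in circles at the dance party.",
    },
]

rendered = build_few_shot_prompt(
    prefix="Read the latest statement and provide an example usage.",
    examples=demo_examples,
    new_statement="'Splonk' describes the sound something makes when it falls into water.",
)
print(rendered)
```

Because the final block mirrors the layout of the demonstrations, the model is nudged to continue the pattern and fill in the missing usage.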

In [None]:
examples = [
    {
        "statement": "A 'blicket' is a tool used for farming",
        "example_sentence_usage": "The farmer used a blicket to dig holes and plant seeds.",
    },
    {
        "statement": "'Flooping' refers to a dance move where you spin around",
        "example_sentence_usage": "The kids were flooping in circles at the dance party.",
    },
    {
        "statement": "A 'zindle' is a small, magical creature that loves to collect shiny objects",
        "example_sentence_usage": "The zindle scurried across the floor, gathering coins and jewelry in its tiny hands.",
    },
    {
        "statement": "'Quizzle' means to solve a puzzle in a very creative way",
        "example_sentence_usage": "After hours of thinking, she managed to quizzle her way out of the tricky math problem.",
    }
]

# "Splonk" describes the sound something makes when it falls into water. An example of a sentence that uses the word splonk is:
# The rock made a loud splonk as it dropped into the pond.

template_format = """
Statement: {statement}
Sample Usage: {example_sentence_usage}
"""

few_shot_prompt_template = PromptTemplate(
    input_variables=[
        "statement", 
        "example_sentence_usage"
    ], 
    template=template_format
)

In [None]:
prefix = "The following is a set of example usages of made-up terms. Read the latest statement and provide an example usage."
# and the suffix our user input and output indicator
suffix = """
Statement: {statement}
Sample Usage: """

# now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=few_shot_prompt_template,
    prefix=prefix,
    suffix=suffix,
    # the suffix references {statement}, so that is the variable supplied at call time
    input_variables=["statement"],
    example_separator="\n\n"
)

In [None]:
statement = "'Splonk' describes the sound something makes when it falls into water."

full_response = run_llm_inference(
    llm=llm, statement=statement, template=few_shot_prompt_template
)

pretty_print_html(full_response)

You'll notice that the LLM can grasp the task from just a few examples. A prompt with a single example is commonly known as 1-shot learning. For tasks that are more challenging, you can incrementally scale the number of examples or "shots" (such as 3-shot, 5-shot, or even 10-shot) to experiment with improving the model's performance. Below we leverage a 4-shot prompt to create a short story.

In [None]:
examples = [
    {
        "example": "A 'blonset' is a tool used for cutting metal."
    },
    {
        "example": "A 'fendle' is a vegetable that grows underground."
    },
    {
        "example": "'Vixting' means climbing a tree very quickly."
    },
    {
        "example": "'Zugging' refers to a sport played with a small ball."
    }
]

template_format = "Made-up term: {example}"

few_shot_prompt_template = PromptTemplate(
    input_variables=["example"], 
    template=template_format
)

In [None]:
prefix = "The following is a set of example definitions of made-up terms. Read them and respond to the user's request."

# and the suffix our user input and output indicator
suffix = """
Query: {query}
Answer: """

# now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=few_shot_prompt_template,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

In [None]:
query = "Create a short story that uses all 4 words:"

full_response = run_llm_inference(
    llm=llm, query=query, template=few_shot_prompt_template
)

pretty_print_html(full_response)

### Chain-of-Thought (CoT) Prompting

Chain-of-thought (CoT) prompting, proposed by [Wei et al. (2022)](https://arxiv.org/abs/2201.11903), is a technique that facilitates complex reasoning in language models by having them show intermediate steps. This method can be used with few-shot prompting, where just a few examples provide the context. The combination enables improved performance on challenging tasks that involve reasoning through multiple steps before generating a final response. By eliciting the reasoning process explicitly, CoT prompting aims to develop stronger logical thinking and rationality in language models when applying them to complex inferential problems with limited data.

### Zero-shot CoT

A new approach called zero-shot chain-of-thought prompting was recently proposed by [Kojima et al. (2022)](https://arxiv.org/abs/2205.11916). This technique involves adding the phrase "Let's think step by step" to prompts to encourage the model to show its reasoning. We can test this method on a simple problem to see how well the model explains its logical thinking process. By explicitly cueing the model to demonstrate step-by-step reasoning, zero-shot chain-of-thought prompting aims to improve transparency and understandability without requiring training on reasoning demonstrations. This emerging technique represents an interesting way to potentially enhance rationality in large language models.

In [None]:
zscot_template = """Let's think step-by-step.
If a standard deck of 52 playing cards has 4 suits (Hearts, Diamonds, Clubs and Spades) with 13 cards in each suit, how many total face cards (Jack, Queen, King) are there?
Please demonstrate the reasoning.
A:
"""

zscot_prompt_template = PromptTemplate.from_template(zscot_template)

In [None]:
full_response = run_llm_inference(llm=llm, template=zscot_prompt_template)

pretty_print_html(full_response)
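
To judge the model's step-by-step answer, it helps to have the expected result at hand. The following sketch is written for this purpose and is not part of the lab's utilities. It enumerates a standard deck and counts the face cards directly.

```python
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
ranks = ["Ace"] + [str(n) for n in range(2, 11)] + ["Jack", "Queen", "King"]

# Build the full deck as (rank, suit) pairs
deck = [(rank, suit) for suit in suits for rank in ranks]

# Face cards are the Jack, Queen, and King of every suit
face_cards = [card for card in deck if card[0] in {"Jack", "Queen", "King"}]

print(len(deck), len(face_cards))  # 52 12
```

The count matches the reasoning a good response should show: 3 face cards per suit times 4 suits gives 12.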

### Few-Shot CoT

Few-shot Chain of Thought (CoT) prompting is a technique that combines few-shot learning with intermediate reasoning steps. In few-shot learning, just a small number of examples or "shots" are provided to the model to demonstrate the desired behavior. Chain-of-thought prompting has the model show its step-by-step reasoning process explicitly.

In Few-shot CoT, we give the model a couple of examples that demonstrate both the target skill and the reasoning chain. This provides the model with the context needed to apply similar skills and reasoning processes to new situations.

In the example below we show the model an example of a multi-step math problem with reasoning steps:

In [None]:
examples = [
    {
        "step": "1) Originally there were 6 oranges"
    },
    {
        "step": "2) 4 oranges were peeled"
    },
    {
        "step": "3) So there must be 6 - 4 = 2 unpeeled oranges left"
    }
]

template_format = """
Step: {step}

"""

few_shot_prompt_template = PromptTemplate(
    input_variables=["step"], 
    template=template_format
)

In [None]:
prefix = """
Let's think through this.

If there were 6 oranges originally and 4 were peeled, how many unpeeled oranges are left?
"""

# and the suffix our user input and output indicator
suffix = """

Let's think step-by-step:

Query: {query}
Answer: """

# now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples,
    example_prompt=few_shot_prompt_template,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

In [None]:
query = "If David had 9 cakes and ate 4 of them, how many are left?"

full_response = run_llm_inference(
    llm=llm, query=query, template=few_shot_prompt_template
)

pretty_print_html(full_response)
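
For comparison, a few-shot CoT prompt is often laid out so that each demonstration pairs a question with its numbered reasoning steps and final answer. The sketch below renders that layout in plain Python. The helper `build_cot_prompt`, its dictionary keys, and the `Q:` / `A:` labels are assumptions for illustration and differ from the LangChain templates used above.

```python
def build_cot_prompt(worked_examples, question):
    """Render worked examples (question, numbered steps, answer) followed by a new question."""
    blocks = []
    for ex in worked_examples:
        steps = "\n".join(f"{i}) {step}" for i, step in enumerate(ex["steps"], start=1))
        blocks.append(f"Q: {ex['question']}\n{steps}\nA: {ex['answer']}")
    # The new question ends with the reasoning cue instead of an answer
    blocks.append(f"Q: {question}\nLet's think step-by-step:")
    return "\n\n".join(blocks)

worked_examples = [
    {
        "question": "If there were 6 oranges originally and 4 were peeled, how many unpeeled oranges are left?",
        "steps": [
            "Originally there were 6 oranges",
            "4 oranges were peeled",
            "So there must be 6 - 4 = 2 unpeeled oranges left",
        ],
        "answer": "2",
    }
]

cot_prompt = build_cot_prompt(
    worked_examples,
    "If David had 9 cakes and ate 4 of them, how many are left?",
)
print(cot_prompt)
```

Keeping the question, steps, and answer together in each demonstration makes the pattern the model should imitate explicit: reason through numbered steps, then state the result.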