# Prompt Engineering Quest to 300 Workshop
Welcome to the Prompt Engineering Quest to 300 Workshop. This workshop is designed as 'Quest to 300' where you will be learning prompt best practices by solving quest questions.  
The workshop starts by introducing various prompting concepts which should be used later to solve the quest questions.

## Workshop model providers
This workshop will use two model providers Anthropic and Mistral AI available on Amazon Bedrock.

__Anthropic__
Anthropic's Claude family of models – Haiku, Sonnet, and Opus – allow customers to choose the exact combination of intelligence, speed, and cost that suits their business needs. Claude 3 Opus, the company's most capable model, has set a market standard on benchmarks. All of the latest Claude models have vision capabilities that enable them to process and analyze image data, meeting a growing demand for multimodal AI systems that can handle diverse data formats. While the family offers impressive performance across the board, Claude 3 Haiku is one of the most affordable and fastest options on the market for its intelligence category.


__Mistral AI__
Mistral AI is a small creative team with high scientific standards. Mistral models are efficient, helpful and trustworthy through ground-breaking innovations.



In [None]:
!pip install --upgrade pip --quiet
!pip install -r requirements.txt --quiet

In [None]:
import boto3

# Default C3 model id: anthropic.claude-3-haiku-20240307-v1:0
# DEfault Mistral model id: mistral.mistral-7b-instruct-v0:2

from completions import get_c3_completion, get_mistral_completion

PROMPT = "Hello, Claude!"
messages = [
    {"role": "user", "content": PROMPT}
]
print(f'\nClaude:\n{get_c3_completion(messages)}')

PROMPT = "<s>[INST] Hello, Mistral [/INST]"
print(f'\nMistral:\n{get_mistral_completion(PROMPT)}')

In [None]:
!pip install --upgrade pip --quiet
!pip install -r requirements.txt --quiet

## Prompt structures
Most LLM prompts contain one or more conceptual prompt components that can help direct the model attention to the correct direction. These components help articulate the prompt designer intent. Prompts that lack certain components can result in long, inefficient conversations, with many query refinements.
The general prompt structure which can be used among most of the LLMs is called COSTAR, each LLM might have it's own refinment for this general structure.
### COSTAR prompt structure
<b>Context (C)</b>

Providing background information helps the LLM understand the specific scenario, ensuring relevance in its responses.

Example: I am a personal productivity developer. In the realm of personal development and productivity, there is a growing demand for systems that not only help individuals set goals but also convert those goals into actionable steps. Many struggle with the transition from aspirations to concrete actions, highlighting the need for an effective goal-to-system conversion process.

<b>Objective (O)</b>

Clearly defining the task directs the LLM’s focus to meet that specific goals.

Example: Your task is to guide me in creating a comprehensive system converter. This involves breaking down the process into distinct steps, including identifying the goal, employing the 5 Whys technique, learning core actions, setting intentions, and conducting periodic reviews. The aim is to provide a step-by-step guide for seamlessly transforming goals into actionable plans.

<b>Style (S)</b>

Specifying the desired writing style, such as emulating a famous personality or professional expert, guides the LLM to align its response with your needs

Example: Write in an informative and instructional style, resembling a guide on personal development. Ensure clarity and coherence in the presentation of each step, catering to an audience keen on enhancing their productivity and goal attainment skills.

<b>Tone (T)</b>

Setting the tone ensures the response resonates with the required sentiment, whether it be formal, humorous, or empathetic.

Example: Maintain a positive and motivational tone throughout, fostering a sense of empowerment and encouragement. It should feel like a friendly guide offering valuable insights.

<b>Audience (A)</b>

Identifying the intended audience tailors the LLM’s response to be appropriate and understandable for specific groups, such as experts or beginners.

Example: The target audience is individuals interested in personal development and productivity enhancement. Assume a readership that seeks practical advice and actionable steps to turn their goals into tangible outcomes.

<b>Response (R)</b>

Providing the response format, like a list or JSON, ensures the LLM outputs in the required structure for downstream tasks.

Example: Provide a structured list of steps for the goal-to-system conversion process. Each step should be clearly defined, and the overall format should be easy to follow for quick implementation.
### Claude 3 prompt basics
![alt text](/assets/c3-prompt-structure.png)  
__Figure 2__: Claude 3 recommended prompt structure  

__Use XML tags__  
Claude is particularly familiar with prompts that have [XML tags](https://docs.anthropic.com/claude/docs/use-xml-tags) as Claude was exposed to such prompts during training. By wrapping key parts of your prompt (such as instructions, examples, or input data) in XML tags, you can help Claude better understand the context and generate more accurate outputs. This is particularly important when working with [long context window](https://docs.anthropic.com/claude/docs/long-context-window-tips) (30K+ tokens).  

Using XML tags improve Claude accuracy, helps Claude understand the hierarchy and relationships within your prompt, and making it simpler to extract key information programmatically by referencing specific tag within the completion.  

__Messages API__  
The messages part of the API is an array of input messages. Claude 3 models are trained to operate on alternating user and assistant conversational turns. When creating a new Message, you specify the prior conversational turns with the messages parameter, and the model then generates the next Message in the conversation.

The Messages API for Claude 3 expects input in a specific [JSON format](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html). The main object is called "messages", which is an array of message objects.

Each message object in the "messages" array should have the following structure:

```json
{
  "role": "user|assistant",
  "content": "The actual text content of the message"
}
```

When role is set to "assistant", "content" represents Claude response.

So a simple example of the "messages" array with one message from a user could be:

```json
{
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ]
}
```
Claude 3 models are trained to operate on alternating user and assistant conversational turns.  
You can include multiple message objects in the "messages" array to represent a full conversation history. Claude will then generate a response based on the entire context provided.  

`user` and `assistant` messages __MUST alternate__, and messages __MUST start__ with a `user` turn. You can have multiple user & assistant pairs in a prompt (as if simulating a multi-turn conversation).

Here is an example of chat:  
```json
{
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    },
    {
      "role": "assistant",
      "content": "The capital of France is Paris."
    },
    {
      "role": "user",
      "content": "Recommend some places to visit in Paris."
    }
  ]
}
```
__Multimodal prompts__  
A multimodal prompt combines multiple modalities (images and text) in a single prompt. You specify the modalities in the content input field. The image should be sent to the model in base64 format. You can supply __up to 20 images__ to the model. __You can't put images in the assistant role__.  
The following example shows how you could ask Anthropic Claude to describe the content of a supplied image:  
```json
{
    "anthropic_version": "bedrock-2023-05-31", 
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": "iVBORw..."
                    }
                },
                {
                    "type": "text",
                    "text": "What's in these images?"
                }
            ]
        }
    ]
}
```
While Claude’s image understanding capabilities are cutting-edge, there are some [limitations](https://docs.anthropic.com/en/docs/vision#limitations) to be aware of.  

__Prefill Claude's response__  
When using Claude, you have the ability to guide its responses and control the output format by prefilling the `assistant` content and providing clear instructions. These techniques allow you to direct Claude's actions, specify the structure and style of the generated content, and even help Claude stay in character during role-play scenarios. By leveraging prefilling and output format control, you can significantly improve Claude's performance and obtain more accurate and tailored responses.  

Note that Claude will generate completions following the `assistant` content, so the prefill string is not part of the model output.  

Lets see an example:

In [None]:
prompt = """
    Please extract the name, size, price, and color from this product description and output it within a JSON object.
    <description>
        The SmartHome Mini is a compact smart home assistant available in black or white for only $49.99.
        At just 5 inches wide, it lets you control lights, thermostats, and other connected devices via voice 
        or app—no matter where you place it in your home.
        This affordable little hub brings convenient hands-free control to your smart devices.
    </description>
"""

messages=[
    {
        "role": "user",
        "content": prompt
    },
    {
        "role": "assistant",
        "content": "{"
    }
]
print(get_c3_completion(messages))


__What is a system prompts?__  
A [system prompts](https://docs.anthropic.com/en/docs/system-prompts) is an effective way to provide context, scope, examples, guardrails, or output format to the model before presenting it with a question or task. System prompt provide an additional layer of guidance and control over Claude's output.
System prompts try to ensure that the AI's output aligns with specific goals or tasks across various domains.  
The system prompt should be populated with any 'global' context or rules that apply to all requests during the conversation, while anything that is unique to a particular request should be in user role.  
While system prompts can increase Claude's robustness and resilience against unwanted behavior, they do not guarantee complete protection against jailbreaks or leaks.


In [None]:
system = "All your output must be pirate speech"
messages=[
    {
        "role": "user",
        "content": "Tell me a very short story"
    }
]
print(get_c3_completion(messages, system))


__Anthropic Claude 3 prompt resources__  
[Prompt Engineering with Anthropic's Claude v3 workshop](https://catalog.us-east-1.prod.workshops.aws/workshops/0644c9e9-5b82-45f2-8835-3b5aa30b1848/en-US)  
[Anthropic's Prompt engineering docs](https://docs.anthropic.com/claude/docs/prompt-engineering)  
[Anthropic's Prompt Engineering Interactive Tutorial - Bedrock Edition](https://github.com/aws-samples/prompt-engineering-with-anthropic-claude-v-3)


### Mistral recommended prompt structure
Mistral _pre-trained_ models provides a strong foundation with broad language understanding and performance across various tasks, the _instruction-tuned_ models are tailored for quick adaptation to specific tasks through fine-tuning based on instructions or prompts.

The template used to build a prompt for Mistral Instruct models is defined as follows:  
```
<s>[INST] Instruction [/INST]
```
Note that `<s>` and `</s>` are special __tokens__<sup>1</sup> for beginning of sequence (__BOS__) and end of sequence (__EOS__)<sup>2</sup>, while `[INST]` and `[/INST]` are regular strings.  
The `[INST]` and `[/INST]` 'tags' indicate to the instruction tuned model where the _INSTruction_ is.  

__This prompt format must be strictly respected__. Otherwise, the model will generate sub-optimal outputs, as we are not following the template used to build a prompt for the Instruct model.

__Mistral chat (multi-turn) prompt pattern__  
For effective chatbot use-case it's recommended to use the following chat template:  
> __\<s\>[INST]__ Instruction __[/INST]__ Model answer __\</s\>[INST]__ Follow-up instruction __[/INST]__

Pay attention that after the last `Follow-up instruction [/INST]` there is no closing `</s>`. This signal the model a completion is expected for the `Follow-up instruction`.  

__Note__: You might have noticed that whenever we have talked about `[INST]` and `[/INST]` we've referred to them as strings, not tokens. The simple reason for this, is that these are strings not tokens. :)  
The model tokenizer was defined when the base model was trained. This tokenizer include two tokens, among many others, which represents `<s>` and `</s>`, but there is no token to represent `[/INST]` or `[/INST]`. So for example, when Mistral 7B was trained, the engineers created or chose a tokenizer to use, and that was that, done. When the model was instruction fine tuned, there is no need (or ability) to change the tokenizer as it's ingrained into the model already. The `[INST]` and `[/INST]` stings arrived in the instruction fine tuning dataset.

[1] __Tokens__ - In order to the train a language model, the engineers first needed to convert the training text from words to numbers, because no matter what type of machine learning model we use, they all only work with numbers. For LLMs, this process is called tokenization. We train a tokenizer to convert words to numbers and output a lookup table so that once it's done, we can easily convert in both directions, words to numbers, and numbers to words. We then use this tokenizaton both during the training of the LLM, and when we are making generations with the LLM - called inference. During inference the prompt is tokenized to token ids (numbers), run through the LLM, which then generates token ids (numbers), which are de-tokenized back to words for our human squishy brains to read.  

[2] __Beginning of sequence tokens__: denote the start of a sequence. If the LLM sees a bos token id, then it knows it's not missing anything that came before, and it indicates a fresh start, free from any previous context that might influence the generated output.   
__End of sequence tokens__: denotes the end of a sequence. At infernece, If this token id is generated by the LLM, then it's an indication that the model is done, and the application that is managing the LLM can stop right there, there is nothing more to be said.

Though Mistral currently doesn't support the messages API, the following code shows how you can format the prompt in similar format:

In [None]:
from typing import Dict, List

def format_instructions(instructions: List[Dict[str, str]]) -> List[str]:
    """Format instructions where conversation roles must alternate user/assistant/user/assistant/..."""
    prompt: List[str] = []
    for user, answer in zip(instructions[::2], instructions[1::2]):
        prompt.extend(["<s>", "[INST] ", (user["content"]).strip(), " [/INST] ", (answer["content"]).strip(), "</s>"])
    prompt.extend(["<s>", "[INST] ", (instructions[-1]["content"]).strip(), " [/INST] "])
    return "".join(prompt)

def print_instructions(prompt: str, response: str) -> None:
    bold, unbold = '\033[1m', '\033[0m'
    print(f"{bold}> Input{unbold}\n{prompt}\n\n{bold}> Output{unbold}\n{response}\n")

In [None]:
# Chat example
instruction1= "Generate 4 very short sentences?"
messages = [
    {"role": "user", "content": instruction1}
]
prompt = format_instructions(messages)
response = get_mistral_completion(prompt)
print_instructions(prompt, response)

instruction2 = "Categorize each sentence for being positive/ negative/neutral"
messages = [
    {"role": "user", "content": instruction1},
    {"role": "assistant", "content": response},
    {"role": "user", "content": instruction2}
]
prompt = format_instructions(messages)
response = get_mistral_completion(prompt)
print_instructions(prompt, response)

__Delimiters__  
Use delimiters to specify the boundary between different sections of the text. Delimiters can be anything, for example: ```""",< >,:,###,<<< >>>```


In [None]:
# In below example, we use ### to indicate examples and <<<>>> to indicate customer inquiry.
inquiry = "I forgot my card security number"
prompt_template = f"""
        You are a bank customer service bot. Your task is to assess customer intent 
        and categorize customer inquiry after <<<>>> into one of the following predefined categories:
        
        card arrival
        change pin
        exchange rate
        country support 
        cancel transfer
        charge dispute
        
        If the text doesn't fit into any of the above categories, classify it as:
        customer service
        
        You will only respond with the predefined category. Do not include the word "Category". Do not provide explanations or notes. 
        
        ####
        Here are some examples:
        
        Inquiry: How do I know if I will get my card, or if it is lost? I am concerned about the delivery process and would like to ensure that I will receive my card as expected. Could you please provide information about the tracking process for my card, or confirm if there are any indicators to identify if the card has been lost during delivery?
        Category: card arrival
        Inquiry: I am planning an international trip to Paris and would like to inquire about the current exchange rates for Euros as well as any associated fees for foreign transactions.
        Category: exchange rate 
        Inquiry: What countries are getting support? I will be traveling and living abroad for an extended period of time, specifically in France and Germany, and would appreciate any information regarding compatibility and functionality in these regions.
        Category: country support
        Inquiry: Can I get help starting my computer? I am having difficulty starting my computer, and would appreciate your expertise in helping me troubleshoot the issue. 
        Category: customer service
        ###
    
        <<<
        Inquiry: {inquiry}
        >>>
        """
messages = [
  { "role": "user", "content": prompt_template }
]
prompt = format_instructions(messages)
response = get_mistral_completion(prompt)
print(response)

#### Mistral AI prompt resources
[How to Prompt Mistral AI models, and Why](https://community.aws/content/2dFNOnLVQRhyrOrMsloofnW0ckZ/how-to-prompt-mistral-ai-models-and-why)  
[Prompting Capabilities](https://docs.mistral.ai/guides/prompting_capabilities/)  
[A deep dive into Mistral 7B and Mixtral 8x7B, available on Amazon Bedrock](https://community.aws/content/2cZUf75V80QCs8dBAzeIANl0wzU/winds-of-change---deep-dive-into-mistral-ai-models)   

## Inference parameters for foundation models
Inference parameters to influence the response generated by the model. You set inference parameters in the body field of the Bedrock InvokeModel or InvokeModelWithResponseStream API. Each model may support different set of inference parameters, read [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html) for details specific to your model. Most of the models supports these parameters (though their naming might slightly differ). [Inference parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html) describe in more depth the various common parameters. Below is simplified explenation:

__max_tokens__ - The maximum  number of tokens to generate in the response. The model might stop generating tokens before reaching the value of max_tokens. Different models have different maximum values for this parameter. The number of output tokens is computed against the 'Max tokens' parameter on the Bedrock model card, which defines the maximal input + output tokens.

__temperature__ - How creative is the model completion. Controls the divercity of potential next tokens. When set to zero '0' the model oputput will silightly change, if at all, between same prompt invocations. Use cases:  
![](assets/temperature.png) />

__top_k__ - Refers to the number of most likely tokens to consider at each step of generating text. It helps control the diversity of generated text by restricting the choices to the top_k most probable tokens. However, top_k doesn't enforce the probability threshold for tokens, thus you can have a token with low probability as part of your K final tokens. For this we have top_p.  

__top_p__ - The percentage of most-likely tokens in a pool that the model considers for the next token. This ensures that the overall likelihood of the next token remains high.

__stop_sequences__ - Custom text sequences that stop the model from generating further tokens. If the model generates a stop sequence that you specify, it will stop generating after that sequence.



## Basic prompting concepts
### Zero-shot prompting
A zero-shot prompt is a prompt that does not include any examples of the desired response, and the output is based on the model In-Context Learning.  
Recommended uses:  
* Comparatively long inputs and outputs, where providing examples could be cost-prohibitive or response time-prohibitive
* Less precise output structure required, or structure can be reliably generated without providing examples.

Example:
> I thought it was pretty decent.  
What is the sentiment of the above text? Respond with either "Positive" or "Negative"

### Few-shot prompting
Model is given multiple examples to learn from, plus the instruction. This can help the model understand exactly what a “good” response looks like. Few-shot prompting can provide additional context and guidance and improve performance. To avoid overfitting or unexpected behaviors, ensure your examples are diverse and representative of the full range of desired outputs.  
Recommended uses:  
* Comparatively short inputs or outputs, (try to minimize excess token consumption)
* Precise output structure required, which could be challenging to achieve with zero-shot prompting

Example:  
> I liked it //Positive  
It was OK //Positive  
Couldn't believe how bad it was!//Negative  
Not my favorite //Negative  
This is meh.  
What is the sentiment of the above text?

### Chain of Thought Prompting (CoT)
Technique that breaks down complex tasks through intermediate reasoning steps. Encourages model to explain its reasoning process by decomposing the solution into a series of steps. You can combine it with few-shot prompting to get better results on more complex tasks that require reasoning before responding. Prompting for step-by-step reasoning will increase the output length, which can impact latency. Consider this tradeoff when deciding whether to use this technique. Also, there's no way to have the model think privately and only return the final answer. To make it easier to separate step-by-step output from its final response, consider using delimiters or XML tags.  

Benefits:  
* Improves performance on arithmetic, commonsense, and symbolic reasoning tasks
* CoT allows models to decompose multi-step problems into intermediate steps
* Provides an interpretable window into the behavior of the model. Can be used for explainability and debugging
* Reasoning can be used for tasks such as math word problems
* Including examples of chain of thought sequences into the exemplars of few-shot prompting can elicit reasoning ability of LLMs  

Example:
> [Rest of prompt] Before answering the question, please think about it step-by-step within \<thinking\>\</thinking\> tags. Then, provide your final answer within \<answer\>\</answer\> tags.

### Prompt chaining (Prompt Decomposition)
[Prompt chaining](https://www.promptingguide.ai/techniques/prompt_chaining) is the process of taking large prompts and breaking them down into a logical flow of smaller prompts linked together by an orchestration layer.  
Why do we need it?  

* Helps to boost the transparency of your LLM application, increases controllability, and reliability. This means that you can debug problems with model responses much more easily and analyze and improve performance in the different stages that need improvement.
* Large prompts are hard to maintain and scale
    * Each change could potentially impact the rest of the prompt
    * “Dead zones” ('unreachable code' depending on input, some parts will never be used.)
    * Longer prompts are harder for LLM’s to understand
* Potential latency improvements by moving to smaller prompt
* Cost optimization by moving to smaller prompt
* Can help to useEnable switching to smaller, more performant, and more cost effective models

Orchestration can be done in many ways:

* Direct in code (common)
* Traditional tools: Airflow, Step functions, etc.
* 3rd party libraries: LangChain, Llamaindex, etc.

![alt text](/assets/prompt-decomposition.png)


<u>__Example - Prompt Chaining for Document QA__</u>:  
One common use case of LLMs involves answering questions about a large text document. It helps if you design two different prompts where the first prompt is responsible for extracting relevant quotes to answer a question and a second prompt takes as input the quotes and original document to answer a given question. In other words, you will be creating two different prompts to perform the task of answering a question given in a document.

The first prompt below extracts the relevant quotes from the document given the question.
The second prompt then takes the relevant quotes extracted by prompt 1 and prepares a helpful response to the question given in the document and those extracted quotes.

<u>Prompt 1</u>:  
> You are a helpful assistant. Your task is to help answer a question given in a document. The first step is to extract quotes relevant to the question from the document, delimited by ####. Please output the list of quotes using \<quotes\>\</quotes\>. Respond with "No relevant quotes found!" if no relevant quotes were found.  
>
> ####{{document}}####

The quotes that were returned in the first prompt can now be used as input to the second prompt below.  
<u>Prompt 2</u>:  
> Given a set of relevant quotes (delimited by \<quotes\>\</quotes\>) extracted from a document and the original document (delimited by ####), please compose an answer to the question. Ensure that the answer is accurate, has a friendly tone, and sounds helpful.  
>
> ####{{document}}####  
> \<quotes\>  
> ...  
> \</quotes\>

#### Tool use, Agents - are these the same?
There seems to be a lot of confusion surrounding Tool use and Agents - whether they are the same thing. Agents and tool use (function calling) are distinct concepts. Agents employ an LLM to dynamically determine which tools to utilize and in what sequence, making them suitable for open-ended scenarios where the solution path is unknown. In contrast, tool use is suitable when you have pre-determined workflow, and you need the application to interact with other systmes (text-to-sql, code functions, calculator, etc) based on input from an LLM to perform a task, following a deterministic fashion. This approach offers better performance and control for well-defined use cases.

Agents operate using the ReAct paradigm, involving multiple calls to plan, act, and evaluate, can become chatty, leading to accumulation of tokens and cost, and increased latency. Tool use, on the other hand, typically implemented by using a semantic router, which uses a lightweight model, such as Claude 3 Haiku, to classify which tool(s) to use and with what inputs. On the client side, you should extract the tool name and input from the response, and call the appropriate tool. This modular design simplifies prompt engineering, isolates validations, and enables easy maintenance and scalability.  
By transitioning from Agents to a semantic router, latency can be significantly reduced, and prompt engineering becomes more straightforward, allowing for the execution of conditional logic based on true/false validations.

### Prompt Catalog
Curated collection of prompts designed to elicit certain behaviors, attributes, or capabilities from a generative AI system.  
Examples:  
* [Anthropic Claude Prompt Library](https://docs.anthropic.com/claude/prompt-library) - Collection of Claude optimized prompts for a breadth of tasks.  
* [LangSmith Prompt Hub](https://docs.smith.langchain.com/hub/quickstart) - Discover, share, and version control prompts for LangChain and LLMs in general


## Prompt Patterns
Prompt patterns, are similar to software design patterns, offer reusable strategies for crafting effective prompts to generate desired outputs from large language models (LLMs). As a component of prompt engineering, these patterns serve as a knowledge transfer method, providing solutions to common challenges encountered during output generation and interaction with LLMs.
### Persona pattern
Provide context and guidelines for the AI model when generating responses by specifying a fictional or role-based identity for the LLM to adopt when responding to prompts.  
__When to use__:  
When we want the LLM to adopt a specific point of view instead of generic response.  
__Prompt pattern__:  
_From now on, act as [persona]. Pay close attention to [details to focus on]. Provide outputs that [persona] would regarding the input._  
__Example__:  
> You are a history professor with expertise in ancient civilizations. You have a passion for archaeology and enjoy discussing historical mysteries. In your answer maintain a formal and knowledgeable tone suitable for academic discussions.

In [None]:
system = ""
no_persona = """
    what is schrodinger's cat paradox?
    Answer concisely
"""
messages=[
    {
        "role": "user",
        "content": no_persona
    }
]
print(f'no_persona:\n{get_c3_completion(messages, system)}\n')

system = "From now on answer as a quantum mechanics philosopher."
print(f'with_persona:\n{get_c3_completion(messages, system)}\n')

### Flipped Interaction pattern
In a typical interaction, the user asks questions to the LLM, and the LLM provides answers based on its training data and knowledge. However, in the Flipped Interaction Pattern, the LLM is encouraged to ask questions to the user to guide the process of solving a problem or achieving a specific objective.  
__When to use__:  
This pattern is helpful at times where it needs information you don't yet know or haven't thought of and would want to rely on LLM’s vast knowledge to help you.  
__Prompt pattern__:  
_From now on, I would like you to ask me questions to [do a specific task]. When you have enough information to [do the task], create [output you want]._  
__Example__:  
> I would like you to ask me questions to write a marketing campaign for a new product launch. When you have enough information list the steps needed for the successful launch of the product

In [None]:
no_flip = "<s>[INST] what is the best AWS service to handle streaming ingestion? [/INST]"
print(f'no_flip:\n{get_mistral_completion(no_flip)}\n')

with_flip = """
  <s>[INST] You are an AWS solutions architect with vast knowledge of AWS analytics services.
  From now on, I would like you to ask me questions so that you can suggest a streaming architecture to a customer.
  When you have enough information to suggest the architecture that best fits the customer's needs,
  create a concise summary explaining the architecture and its benefits.  [/INST]
"""
print(f'with_flip:\n{get_mistral_completion(with_flip)}')
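
The flipped interaction only pays off over several turns, and `get_mistral_completion` is called here with a single prompt string, so the conversation history has to be folded back into that string on every turn. The following is a minimal sketch, assuming the Mistral instruct convention of wrapping user turns in `[INST] ... [/INST]` and closing each assistant turn with `</s>`. `to_mistral_prompt` is a hypothetical helper written for illustration; it is not part of this workshop's `completions` module.

```python
def to_mistral_prompt(messages):
    """Fold a list of {"role", "content"} turns into one Mistral instruct prompt string."""
    prompt = "<s>"
    for message in messages:
        text = message["content"].strip()
        if message["role"] == "user":
            prompt += f"[INST] {text} [/INST]"
        elif message["role"] == "assistant":
            # Assistant turns are closed with the end-of-sequence token.
            prompt += f" {text}</s>"
        else:
            raise ValueError(f"Unsupported role: {message['role']}")
    return prompt


# Hypothetical conversation used only to show the resulting string.
history = [
    {"role": "user", "content": "Ask me questions to suggest a streaming architecture."},
    {"role": "assistant", "content": "What is your expected event volume?"},
    {"role": "user", "content": "About 10,000 events per second."},
]
print(to_mistral_prompt(history))
```

After each model reply, append it as an assistant turn, append your answer as a user turn, and rebuild the prompt before the next call.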


### Question Refinement Pattern
Ask the LLM to offer a better version of the question asked. This is especially useful when the person asking is not an expert in the field and would like to rely on the LLM's knowledge of it.  
__When to use__:  
This pattern works best when the initial question is broad or vague, or when we have less information or underlying thought behind the question than the LLM has from its training data.  
__Prompt pattern__:  
_From now on, when I ask a question, suggest a better version of the question to use that incorporates information specific to [use case] and ask me if I would like to use your question instead._  
__Example__:  
> Whenever I ask a question about implementing a feature, suggest a better version of the question that focuses on best practices and the specific programming language or framework I’m using, and ask me if I would like to use your question instead.

In [None]:
# Note how we place the refined question within XML tags to simplify its extraction from the completion text
system = """
    From now on, when I ask a question, suggest a better version of the question to use that 
    incorporates information specific to the question asked and ask me if I would like 
    to use your question instead. Enclose the new question within <refined> XML tags.
"""

prompt = """
    What caused the first world war?
"""

messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages, system))
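
Because the system prompt asks for the new question inside `<refined>` tags, the refined question can be pulled out of the completion with a small regular expression. This is a minimal sketch; `extract_refined` is a hypothetical helper, and the sample completion text is made up to illustrate the parsing step rather than taken from a real model response.

```python
import re


def extract_refined(completion):
    """Return the text inside the first <refined>...</refined> pair, or None if absent."""
    match = re.search(r"<refined>(.*?)</refined>", completion, flags=re.DOTALL)
    return match.group(1).strip() if match else None


# Hypothetical completion text, used only to illustrate the parsing step.
sample = (
    "A sharper question could be:\n"
    "<refined>\nWhat were the main political causes of World War I?\n</refined>\n"
    "Would you like to use it instead?"
)
print(extract_refined(sample))
```

Returning `None` when the tags are missing lets the caller fall back to the original question instead of failing.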


### Cognitive Verifier Pattern
Breaks down a complex question into smaller, manageable sub-questions, which improves the LLM’s reasoning.  
__When to use__:  
This pattern is most useful when you're asking the LLM to help you complete a task that requires a logical sequence. Organizing narratives and outlining longer-form content are good candidates for this pattern. Or, when the question being asked is very high level or the user does not have much knowledge about the question.  
__Prompt pattern__:  
_When I ask you a question, generate three additional questions that would help you give a more accurate answer. When I have answered the three questions, combine the answers to produce the final answer to my original question._  
__Example__:
> When I ask a question about climate change, break it down into three smaller questions that would help you provide a more accurate answer. Combine the answers to these sub-questions to give the final answer.

In [None]:
system = """
    When I ask you a question, generate three additional questions that would help you 
    give a more accurate answer. When I have answered the three questions, combine the 
    answers to produce the final answer to my original question. 
"""

prompt = """
    Who is considered the best soccer player ever?
"""

messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages, system))

### Recipe Pattern
The Recipe pattern is designed to create a clear, step-by-step guide for the language model to accomplish a specific task or achieve a stated goal. In this pattern, the language model is instructed to provide a sequence of steps based on the input data provided, breaking down the process into a series of actionable steps to reach the intended outcome.  
__When to use__:  
When we have limited knowledge of how to solve a problem and need an LLM to fill in the gaps and put together the solution.  
__Prompt pattern__:  
_I want you to act as [use case]. I want you to provide me a list of suggestions for [use case]. Your suggestions should be [specific, actionable …]. Do not provide [use case]._  
__Example__:
> I need to deploy a cloud application. Some of the steps require logging in to the cloud console, instantiating an instance, and installing dependencies. Please provide a complete sequence of steps. Please fill in any missing steps.

In [None]:
no_recipe = """
    <s>[INST]How can I calculate the net profit of my company?[/INST]
"""
print(f'no_recipe:\n{get_mistral_completion(no_recipe)}')

with_recipe = """
    <s>[INST] I am trying to calculate the net profit of my company. 
    I know that I need to take into account the total revenues, 
    cost of sales and taxation in USA. Provide a complete sequence of steps. 
    Fill in any missing steps. Identify any unnecessary steps.[/INST]
"""
print(f'\nwith_recipe:\n{get_mistral_completion(with_recipe)}')


### Template Pattern
Ask for a completion in a specific format.  
__When to use__:  
Use this when you want LLM to generate a response that aligns with a defined structure.  
__Prompt pattern__:  
_Please generate a response following the given template : [Template]_  
__Example__:  
> Please generate a response following the given template: Introduction — [Introductory sentence]. Main Points — [Key points to be covered]. Conclusion — [Concluding statement]

In [None]:
# This example includes the variables within the prompt itself to make a point.
# In a production use case you would probably fetch this data from a database.
prompt = """
    <s>[INST]I am going to provide a template for your output. Everything in all caps is a placeholder. 
    Any time that you generate text, try to fit it into one of the placeholders that I list. 
    Please preserve the formatting and overall template that I provide.
    Write an email for Marco's 38th birthday and gift 38 points: 
    Dear PERSON,
    
    Greetings from AnyHotel! 
    As a gift for your birthday we are happy to recognize NUMBER_OF_POINTS Points for your loyalty.
    
    Best wishes,
    AnyHotel Staff[/INST]
"""
print(get_mistral_completion(prompt))
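
The comment in the cell above notes that production code would fetch the guest data rather than hard-code it. One way this might look is sketched below: the fixed output template stays constant, and only the task line is built from a record. The `guest` dictionary and the `build_template_prompt` helper are hypothetical and stand in for a real database lookup.

```python
# Hypothetical record; in production this would come from a database lookup.
guest = {"name": "Marco", "age": 38, "points": 38}

# Fixed output template; the all-caps words are placeholders for the model to fill.
EMAIL_TEMPLATE = """Dear PERSON,

Greetings from AnyHotel!
As a gift for your birthday we are happy to recognize NUMBER_OF_POINTS Points for your loyalty.

Best wishes,
AnyHotel Staff"""


def build_template_prompt(record, template):
    """Combine the per-guest task line with the fixed output template."""
    task = (
        f"Write an email for {record['name']}'s birthday (turning {record['age']}) "
        f"and gift {record['points']} points:"
    )
    return (
        "<s>[INST]I am going to provide a template for your output. "
        "Everything in all caps is a placeholder. "
        "Please preserve the formatting and overall template that I provide.\n"
        f"{task}\n{template}[/INST]"
    )


prompt = build_template_prompt(guest, EMAIL_TEMPLATE)
print(prompt)
```

Keeping the template in one constant means every guest receives the same structure, and only the task line changes between calls.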

### Reflection Pattern
The Reflection Pattern is most effective when you're asking for something that will be challenging for you to judge.  
__When to use__:  
Understanding the rationale behind the output proves highly beneficial when a user desires to evaluate the legitimacy of the LLM's response and comprehend the process by which the LLM formulated a specific answer. Moreover, this approach enables users to refine and optimize their prompts, as they gain deeper insights into the LLM's method of generating outputs.  
__Prompt pattern__:  
_When you provide an answer, please explain the reasoning and assumptions behind your response. If possible, use specific examples or evidence to support your answer of why [prompt topic] is the best. Moreover, please address any potential ambiguities or limitations in your answer, in order to provide a more complete and accurate response._  
__Example__:  
> Tell me ideas for a trendy TikTok video. When you provide an answer, please explain the reasoning and assumptions behind your response. If possible, use specific examples or evidence to support your answer of why this is a great idea for a TikTok video. Moreover, please address any potential ambiguities or limitations in your answer, in order to provide a more complete and accurate response.

In [None]:
system = ""
prompt = """
    Is black the darkest color?
"""

messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(f'no_reflection:\n{get_c3_completion(messages, system)}')

system = """
    When you provide an answer, please explain the reasoning and assumptions behind your response. 
    If possible, use specific examples or evidence to support your answer. 
    Moreover, please address any potential ambiguities or limitations in your answer, 
    in order to provide a more complete and accurate response.
"""

print(f'\nwith_reflection:\n{get_c3_completion(messages, system)}')

### Ask for Input Pattern
The Ask for Input Pattern is about requesting specific content from the end user for further processing or action.  
__When to use__:  
Use this when you want the LLM to ask follow-up questions to complete a task.  
__Example__:  
> From now on, I am going to cut/paste email chains into our conversation. You will summarize what each person’s points are in the email chain. You will provide your summary as a series of sequential bullet points. At the end, list any open questions or action items directly addressed to me. My name is Jill Smith. Ask me for the first email chain.

In [None]:
# You will be prompted for input text so that the example can complete
system = ""
prompt = """
    From now on I will describe the location, lighting and watering availability, 
    and the level of care desired, and you will provide me with a suggestion on which plant to buy. 
    Now ask me for the first location.
"""

messages=[
    {
        "role": "user",
        "content": prompt
    }
]
response = get_c3_completion(messages, system)
print(response)
location = input("Enter your location: ")

messages=[
    {
        "role": "user",
        "content": prompt
    },
    {
        "role": "assistant",
        "content": response
    },
    {
        "role": "user",
        "content": location
    }
]
print(get_c3_completion(messages, system))
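
The cell above builds the second `messages` list by hand. For longer exchanges the same bookkeeping can be wrapped in a loop, sketched here under the assumption that the completion function is called as `complete(messages, system)`, the way `get_c3_completion` is called in this notebook. `run_conversation` and `fake_complete` are hypothetical names; the fake stands in for the model so that the sketch runs without Bedrock access.

```python
def run_conversation(complete, first_prompt, user_replies, system=""):
    """Send first_prompt, then feed each reply in turn, keeping the full history."""
    messages = [{"role": "user", "content": first_prompt}]
    for reply in user_replies:
        response = complete(messages, system)
        # Roles must alternate: record the model's turn, then the user's answer.
        messages.append({"role": "assistant", "content": response})
        messages.append({"role": "user", "content": reply})
    # The final call answers the last user reply.
    messages.append({"role": "assistant", "content": complete(messages, system)})
    return messages


def fake_complete(messages, system=""):
    """Stand-in for get_c3_completion so the sketch runs offline."""
    return f"(model reply to turn {len(messages)})"


for turn in run_conversation(fake_complete, "Ask me for the first location.", ["A shaded balcony"]):
    print(f"{turn['role']}: {turn['content']}")
```

In the notebook, `get_c3_completion` could be passed in place of `fake_complete`, with the replies collected through `input()`.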

### Fact Checklist Pattern
The Fact Checklist Pattern involves listing the crucial facts contained in the output, which serve as a foundation for understanding and verifying the information. If these facts are incorrect, they could drastically affect the overall validity of the information.  
__When to use__:  
This method is handy for verifying response validity. It generates a fact list from the response to validate it, rather than analyzing the entire response.  
__Prompt pattern__:  
_After you generate a [Task] summary, compile a list of the key facts. Insert this fact list at the [position] of the summary. Include the main points that would affect the overall understanding of the [Task]._  
__Example__:  
> After you generate a news article summary, compile a list of the key facts. Insert this fact list at the end of the summary. Include the main points that would affect the overall understanding of the news story.  








In [None]:
system = ""
prompt = """
    What was the share of CO2 emissions by wildfires in 2020 globally?
"""

messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(f'no_fact:\n{get_c3_completion(messages, system)}')

system = """
    From now on, when you generate an answer, create a set of facts that the answer depends on that 
    should be fact-checked and list this set of facts at the end of your output. 
"""

print(f'\nwith_fact:\n{get_c3_completion(messages, system)}')

## Prompt challenges

### Challenge #1
Given the prompt 'what is the capital of France?', modify it so that the model outputs only a single word: 'Paris.'  
Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Ask Claude to skip the preamble
    <details>
        <summary>Prompt</summary>
        What is the capital of France? Skip the preamble
    </details>
</details>

In [None]:
# Modify the prompt and run
prompt = "what is the capital of France?"
messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages))


### Challenge #2
Write a prompt to solve this word puzzle: Using only addition, add eight 8s to get the number 1,000  
Model to use: Mistral 7B-Instruct  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Be specific, and direct the model's attention in the appropriate direction: Word puzzle; Use the Persona pattern.
    <details>
        <summary>Prompt</summary>
        You are an expert in solving word puzzles. Using only addition, add eight 8s to get the number 1,000
    </details>
</details>

In [None]:
# Fill with your own prompt and run
prompt = ""
messages = [
    { "role": "user", "content": prompt }
]
prompt = format_instructions(messages)
response = get_mistral_completion(prompt)
print(response)
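
Model answers to this puzzle are easy to misjudge by eye, so a small checker can help when iterating on the prompt. The sketch below assumes the answer has already been reduced to a plain expression such as `8 + 8`; `check_eights_answer` is a hypothetical helper and does no parsing of free-form model text.

```python
def check_eights_answer(expression, target=1000, count=8):
    """Check that expression adds numbers built only from the digit 8, using exactly `count` 8s."""
    terms = [term.strip() for term in expression.split("+")]
    # Every term must be non-empty and consist solely of the digit 8.
    if not all(term and set(term) == {"8"} for term in terms):
        return False
    uses_right_count = sum(len(term) for term in terms) == count
    return uses_right_count and sum(int(term) for term in terms) == target


# A deliberately wrong candidate: it uses eight 8s, but the sum is 64.
print(check_eights_answer("8 + 8 + 8 + 8 + 8 + 8 + 8 + 8"))
```

The checker rejects answers that use other digits, other operators, or the wrong number of 8s, which are the usual ways a model's answer goes wrong here.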

### Challenge #3
Write a prompt to solve this question correctly: Bill has as many sisters as he has brothers. Are there more boys or girls in the family?  
Model to use: Mistral 7B-Instruct  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Try Chain of Thought
    <details>
        <summary>Prompt</summary>
        Bill has as many sisters as he has brothers. Are there more boys or girls in the family? Think carefully step by step
    </details>
</details>

In [None]:
# Fill with your own prompt and run
prompt = ""
messages = [
    { "role": "user", "content": prompt }
]
prompt = format_instructions(messages)
response = get_mistral_completion(prompt)
print(response)

### Challenge #4
Write a prompt that instructs the model to engage in a conversational interaction with you to determine the most suitable AWS service for capturing and processing large volumes of stream events in near real-time.  
Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Flipped Interaction pattern
    <details>
        <summary>Prompt</summary>
        I need your help in selecting the most appropriate AWS service for capturing and processing large volumes of stream events in near real-time. Please engage in a conversational interaction with me, asking questions to gather the necessary information about the use case and my requirements. Once you have enough details, recommend the optimal AWS service(s) and explain your reasoning.
    </details>
</details>

In [None]:
# Fill with your own prompt and run
prompt = ""
messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages))

### Challenge #5
Modify the prompt to correctly answer: Is 'x' in the equation below solved correctly?  
2x - 3 = 9  
2x = 6  
x = 3  
Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Ask Claude to review the solution carefully  
    <details>
        <summary>Prompt</summary>
        <p>Review the equation and the solution steps carefully. Is 'x' in the equation below solved correctly?</p>
        <p>2x - 3 = 9</p>
        <p>2x = 6</p>
        <p>x = 3</p>
    </details>
</details>

In [None]:
# Modify the prompt and run
prompt = """
  Is 'x' in the equation below solved correctly?
  2x - 3 = 9
  2x = 6
  x = 3 
"""
messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages))
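
To judge whether the model's verdict is right, it helps to have an independent check. Substituting the proposed value back into the original equation is enough; the sketch below does that with a hypothetical helper, `satisfies_equation`.

```python
def satisfies_equation(x):
    """Return True if x satisfies the original equation 2x - 3 = 9."""
    return 2 * x - 3 == 9


# Substitute the proposed solution from the challenge and compare
# the result with the model's verdict.
print(satisfies_equation(3))
```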

### Challenge #6
Write a prompt that calculates the result of: 1984135 * 9343116  
Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    As of now, LLMs have limitations in performing complex numerical computations accurately.
    LLMs are trained on vast amounts of text data, which allows them to understand and generate human-like text, but they are not specifically designed for performing complex mathematical operations with high precision. While LLMs can perform basic arithmetic operations, their performance tends to degrade as the complexity of the mathematical operations increases, especially when dealing with large numbers or operations involving many steps.
    <details>
        <summary>Solution</summary>
        Use tools instead.
    </details>
</details>

In [None]:
# Fill with your own prompt and run
prompt = ""
messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages))
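
The exact product is easy to obtain outside the model, which gives a reference value for judging whatever the model returns. A minimal sketch, relying on Python's arbitrary-precision integers; `matches_exact` is a hypothetical helper that assumes the model's answer has already been isolated as a number string.

```python
a, b = 1984135, 9343116
# Python integers have arbitrary precision, so this product is exact.
exact = a * b
print(f"{exact:,}")


def matches_exact(model_answer, expected=exact):
    """Compare a model's numeric answer (commas and spaces allowed) with the exact value."""
    digits = model_answer.replace(",", "").replace(" ", "")
    return digits.isdigit() and int(digits) == expected
```

Comparing digit by digit matters here: a model's answer is often right in the leading digits and wrong in the trailing ones.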

### Challenge #7
Write a tool-use prompt that assesses the question and decides which math function(s) to call, and with what arguments. Try your solution on: Please solve 1984135 * 9343116  
Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Review this section of the Claude workshop <a href='https://catalog.us-east-1.prod.workshops.aws/workshops/0644c9e9-5b82-45f2-8835-3b5aa30b1848/en-US/lessons/lab-10-2-tool-use'>here</a>.
    <details>
        <summary>Solution</summary>
        Open <a href='challenge-7-solution.py'>challenge-7-solution.py</a> to review the complete solution.
    </details>
</details>

In [None]:
# Function to multiply two numbers
def mul(a, b):
    return a * b
# Function to sum two numbers
def add(a, b):
    return a + b
# Function to subtract two numbers
def sub(a, b):
    return a - b

system = """
"""

stop_sequence = ["Human: "]

prompt = """
    Please solve 1984135 * 9343116
"""

messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages, system = system, stop_sequence = stop_sequence))
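
Once the model has decided which function to call, the calling side still has to parse that request and run the function. The sketch below shows only this dispatch step. It assumes a hypothetical JSON request format (`{"tool": ..., "args": [...]}`); the format your own tool-use prompt elicits may differ, so treat `run_tool_call` and the sample request as illustrations rather than the workshop's solution.

```python
import json


def mul(a, b):
    return a * b


def add(a, b):
    return a + b


def sub(a, b):
    return a - b


# Only functions listed here can be invoked by the model's request.
TOOLS = {"mul": mul, "add": add, "sub": sub}


def run_tool_call(raw_call):
    """Parse a JSON tool request and call the matching function."""
    call = json.loads(raw_call)
    name, args = call["tool"], call["args"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](*args)


# Hypothetical model output requesting a multiplication.
print(run_tool_call('{"tool": "mul", "args": [1984135, 9343116]}'))
```

Looking the name up in an explicit dictionary, rather than evaluating model output directly, keeps the model limited to the functions you chose to expose.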

### Challenge #8
Write a prompt that reads all the words in the image images/twelve-word-challenge.png <img src="images/twelve-word-challenge.png" alt="image" style="width:300px;height:auto;">  
Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    TBD
    <details>
        <summary>Prompt</summary>
        Review the image carefully. Notice that some words may be at skewed angles. Extract all words within the image and count them. Output a full numbered list of all words regardless of where they appear in the image
    </details>
</details>

In [None]:
# Fill with your own prompt and run
# Function that accepts an image file location and returns the base64 representation of the image
import base64
from PIL import Image
import io

def get_base64_image(image_path):
    """
    Converts an image file to its base64 representation.
    
    Args:
        image_path (str): The file path of the image.
        
    Returns:
        str: The base64 representation of the image.
    """
    with open(image_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
    return encoded_string.decode('utf-8')

prompt = ""
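# The image and the text prompt both belong inside the user message's content list
messages=[
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",  # matches the file's .png extension
                    "data": get_base64_image("images/twelve-word-challenge.png")
                }
            },
            {
                "type": "text",
                "text": prompt
            }
        ]
    }
]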
print(get_c3_completion(messages))
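
The `media_type` field has to agree with the actual image format, and hard-coding it is an easy source of mistakes. A minimal sketch using the standard library's `mimetypes` module to derive it from the file extension follows. `guess_media_type` is a hypothetical helper, and the set of supported types is an assumption to check against your model provider's documentation. Note that the guess is based on the extension only; the file contents are not inspected.

```python
import mimetypes

# Image types assumed to be accepted by the model; verify against the provider's documentation.
SUPPORTED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def guess_media_type(image_path):
    """Guess the MIME type from the file extension and check it is a supported image type."""
    media_type, _ = mimetypes.guess_type(image_path)
    if media_type not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(f"Unsupported or unknown image type for {image_path}: {media_type}")
    return media_type


print(guess_media_type("images/twelve-word-challenge.png"))
```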

### Challenge #9
The prompt `I am trying to market this product, help me think of an advertisement script on social media` results in a refusal from Haiku. Revise the prompt so that Haiku drafts a social media post marketing the product shown in images/mite-and-insect.png  
![Product](images/mite-and-insect.png)  

Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Give the model a role, and add more specific context for the task.
    <details>
        <summary>Solution Step 1</summary>
        system = You are a marketing assistant named Joe, working for AnyCompany. The company manufactures and sells products across industries including materials, agriculture, and manufacturing. Your goal is to author social media posts that market the company's products. Please respond to the user’s question within &lt;response&gt;&lt;/response&gt; tags.
    </details>
    <details>
        <summary>Solution Step 2</summary>
        Add prefilling:
        {
            "role": "assistant",
            "content": "[Joe from AnyCompany] &lt;response&gt;"
        }
    </details>
    <details>
        <summary>Note</summary>
        Try putting the system prompt text in the user prompt instead, and compare the model's responses. Which is better?
        One effective way to help Claude stay in character is by using system prompts. System prompts help set the tone, establish the character’s personality, and provide guidelines for the model to follow. By prefilling the response with [Joe from AnyCompany], you’re forcing Claude to acknowledge that it’s role-playing as that persona and to generate responses that logically follow what the persona would say. 
    </details>
</details>

In [None]:
system = ""
prompt = "I am trying to market this product, help me think of an advertisement script on social media"
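# The image and the text prompt both belong inside the user message's content list
messages=[
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",  # matches the file's .png extension
                    "data": get_base64_image("images/mite-and-insect.png")
                }
            },
            {
                "type": "text",
                "text": prompt
            }
        ]
    }
]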
print(get_c3_completion(messages, system))
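
Prefilling means ending the `messages` list with a partial assistant turn, so the model continues from that text instead of starting fresh. Since the model returns only the continuation, the prefill has to be re-attached before the `<response>` tags can be parsed. The sketch below illustrates both steps; `with_prefill` and `extract_response` are hypothetical helpers, and the sample continuation is made up.

```python
import re

PREFILL = "[Joe from AnyCompany] <response>"


def with_prefill(user_content, prefill=PREFILL):
    """Build a message list whose last turn is a partial assistant reply for the model to continue."""
    return [
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": prefill},
    ]


def extract_response(completion, prefill=PREFILL):
    """Re-attach the prefill to the model's continuation and pull out the tagged text."""
    full_text = prefill + completion
    # Accept a missing closing tag, in case generation stopped early.
    match = re.search(r"<response>(.*?)(?:</response>|$)", full_text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


# Hypothetical continuation, used only to show the parsing step.
print(extract_response("Meet our newest product!</response>"))
```

`user_content` can be a plain string or a list of content blocks, so the same helper can carry the image together with the text prompt.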

### Challenge #10
Invoke this prompt and verify the validity of the response:  
Is this statement correct: If two charged objects are placed in an isolated system and one object loses 5 Coulombs of charge, the other object gains exactly 5 Coulombs of charge  
Model to use: Claude 3 Haiku  
Try first before expanding the hint.  
<details>
    <summary>Hint</summary>
    Use Fact Checklist Pattern.
    <details>
        <summary>Solution</summary>
        system = After you generate your answer, compile a list of the key facts on which you base your answer.
    </details>
    <details>
        <summary>Consideration point</summary>
        Try the solution prompt with Mistral 7B Instruct and compare the results. Mistral answers incorrectly, and even setting an appropriate role doesn't fix its response. It seems Mistral wasn't trained on the relevant physics data. For such use cases, Mistral is probably not the model of choice.
    </details>
</details>


In [None]:
# Fill with your own prompt and run
system = ""
prompt = "Is this statement correct: If two charged objects are placed in an isolated system and one object loses 5 Coulombs of charge, the other object gains exactly 5 Coulombs of charge"
messages=[
    {
        "role": "user",
        "content": prompt
    }
]
print(get_c3_completion(messages, system))