<a href="https://colab.research.google.com/github/wandb/edu/blob/main/prompting/prompt_engineering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
<!--- @wandbcode{prompt-engineering-course} -->

# Prompt Engineering with Weights & Biases - [Anish Shah](https://www.linkedin.com/in/anish-shah/)
This is a companion notebook to the Weights & Biases [Prompt Engineering course](https://www.wandb.courses/courses/prompting).

In [1]:
%%capture
!pip install set-env-colab-kaggle-dotenv -q
!pip install weave -U -q
!pip install litellm -U -q

In [2]:
try:
    import google.colab
    !git clone https://github.com/wandb/edu.git
    %cd edu/prompting
except ImportError:
    pass

Cloning into 'edu'...
remote: Enumerating objects: 4193, done.
remote: Counting objects: 100% (1095/1095), done.
remote: Compressing objects: 100% (367/367), done.
remote: Total 4193 (delta 905), reused 753 (delta 725), pack-reused 3098 (from 1)
Receiving objects: 100% (4193/4193), 27.72 MiB | 20.90 MiB/s, done.
Resolving deltas: 100% (2300/2300), done.
/content/edu/prompting


The recommended way to supply your API keys is through secrets, especially if you are using Google Colab.

### How to add your key to Colab Secrets

Add your API key to the Colab Secrets manager to securely store it.

1. Open your Google Colab notebook and click on the 🔑 **Secrets** tab in the left panel.
   
   <img src="https://storage.googleapis.com/generativeai-downloads/images/secrets.jpg" alt="The Secrets tab is found on the left panel." width=50%>

2. Create a new secret using one of the key names listed below (for example, `OPENAI_API_KEY`).
3. Paste your API key into the `Value` input box for that secret.
4. Toggle the button on the left to allow notebook access to the secret.

### Or create a `.env` file

Create a `.env` file similar to the following:

```shell
ANTHROPIC_API_KEY="<your-anthropic-api-key>"
OPENAI_API_KEY="<your-openai-api-key>"
WANDB_API_KEY="<your-wandb-api-key>"
```
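
For readers curious what reading such a file involves, here is a minimal sketch that parses `KEY="value"` lines into a dictionary. The `parse_dotenv` function is hypothetical and exists only for illustration; the `set_env` helper imported in the next cell may resolve keys differently.

```python
def parse_dotenv(text: str) -> dict:
    """Parse KEY="value" lines; hypothetical helper for illustration only."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything without a KEY=value shape
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = (
    'OPENAI_API_KEY="<your-openai-api-key>"\n'
    "# a comment line\n"
    'WANDB_API_KEY="<your-wandb-api-key>"\n'
)
parsed = parse_dotenv(sample)
print(sorted(parsed))
```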

In [3]:
%%capture
from text_formatting import render
from set_env import set_env
set_env("ANTHROPIC_API_KEY")
set_env("WANDB_API_KEY")
set_env("OPENAI_API_KEY")

Welcome to the Prompt Engineering course with Weights & Biases, led by Anish Shah. This course is designed to explore the fascinating world of prompt engineering, a crucial aspect of interacting with and leveraging the capabilities of large language models (LLMs). Throughout this session, we'll dive into various techniques for crafting effective prompts that can significantly enhance the performance of LLMs across a wide range of tasks.

Whether you're new to AI and machine learning or looking to deepen your understanding of prompt engineering, this course will provide you with valuable insights and practical skills. By the end of this session, you'll be equipped to design and implement prompts that effectively communicate your intentions to LLMs, enabling more accurate and relevant responses.

This section of the notebook focuses on setting up the environment and installing the required libraries:

- It installs the [W&B Weave](https://wandb.github.io/weave/?utm_source=github&utm_medium=course&utm_campaign=prompting) library, which is used for tracking LLM operations.
- It installs `litellm`, which standardizes model interaction and makes it easy to swap model providers.
- Some helper functions are provided for rendering output and setting environment variables.

In [4]:
%%capture
import weave
import litellm
completion = litellm.completion

## Model and Prompt Configuration
The following cell defines the configuration variables used throughout the course:

In [10]:
# These variables store the names of different language models from Anthropic and OpenAI.
# The "SMART" models (`claude-3-opus` and `gpt-4-turbo`) are more capable but slower,
# while the "FAST" models (`claude-3-haiku` and `gpt-3.5-turbo`) are faster but less powerful.
ANTHROPIC_SMART_MODEL_NAME = "claude-3-opus-20240229"
ANTHROPIC_FAST_MODEL_NAME = "claude-3-haiku-20240307"
OPENAI_SMART_MODEL_NAME = "gpt-4-turbo-2024-04-09"
OPENAI_FAST_MODEL_NAME = "gpt-3.5-turbo"

# These variables point to two different markdown files containing prompt engineering guides.
# `AMAN_PROMPT_GUIDE` refers to Aman Chadha's guide, while `LILIAN_PROMPT_GUIDE` refers to Lilian Weng's guide.
AMAN_PROMPT_GUIDE = "aman_prompt_engineering.md"
LILIAN_PROMPT_GUIDE = "lilianweng_prompt_engineering.md"

# Here, the `MODEL_NAME` variable is set to use OpenAI's smart model (`gpt-4-turbo`),
# and the `PROMPT_GUIDE` variable selects Lilian Weng's prompt engineering guide.
MODEL_NAME = OPENAI_SMART_MODEL_NAME
PROMPT_GUIDE = LILIAN_PROMPT_GUIDE

These configuration variables allow course participants to easily switch between different models and prompt guides throughout the course by modifying the assigned values.
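
As one possible way to organize this switching, the sketch below maps a provider and a speed tier to the model names listed above. The `MODELS` table and the `pick_model` function are hypothetical and are not part of the course code.

```python
# Hypothetical lookup table built from the model names defined above
MODELS = {
    ("anthropic", "smart"): "claude-3-opus-20240229",
    ("anthropic", "fast"): "claude-3-haiku-20240307",
    ("openai", "smart"): "gpt-4-turbo-2024-04-09",
    ("openai", "fast"): "gpt-3.5-turbo",
}


def pick_model(provider: str, tier: str) -> str:
    """Return the model name for a provider ('anthropic'/'openai') and tier ('smart'/'fast')."""
    try:
        return MODELS[(provider.lower(), tier.lower())]
    except KeyError:
        raise ValueError(f"Unknown provider/tier: {provider!r}, {tier!r}") from None


print(pick_model("openai", "smart"))
```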

## Initializing Weave

The cell below initializes the [W&B Weave library](https://wandb.github.io/weave/?utm_source=github&utm_medium=course&utm_campaign=prompting). Weave is a toolkit for developing Generative AI applications, providing features like logging, debugging, evaluations, and organization of LLM workflows.

Initializing Weave at the start allows you to leverage its capabilities throughout your project, such as decorating Python functions with `@weave.op()` to enable automatic tracing and versioning.

By specifying the project name below, you are setting up a dedicated workspace for this course within Weave. This helps keep the course-related experiments, models, and data organized and separate from other projects.

Weave brings structure and best practices to the experimental nature of Generative AI development, making it easier to track, reproduce, and share your work. Initializing it early in the notebook ensures you can take full advantage of its features as you progress through the course.

To get started, you'll need to sign up for a [Weights & Biases account here](https://wandb.ai/site/?utm_source=github&utm_medium=course&utm_campaign=prompting). When you run `weave.init` below you'll be prompted for your W&B API key which you can find [here](https://wandb.ai/authorize?utm_source=github&utm_medium=course&utm_campaign=prompting). Copy & paste it into the input box below when prompted.

In [6]:
weave.init("beginner-llm-prompting-course")

Logged in as Weights & Biases user: imvenkata.
View Weave data at https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/weave


<weave.trace.weave_client.WeaveClient at 0x780fac48a230>

## Defining the get_completion function

This code defines a function called `get_completion` that is decorated with `@weave.op()`. The `@weave.op()` decorator is provided by Weave and enables automatic tracing and versioning of the function.

The `get_completion` function takes several parameters:
- `system_message`: The system message to provide context or instructions to the language model.
- `messages`: A list of messages representing the conversation history.
- `model`: The name of the language model to use. It has no default here; `prompt_llm` (defined later) falls back to `MODEL_NAME`.
- `max_tokens`: The maximum number of tokens to generate in the response (defaults to 4096).
- `temperature`: The sampling temperature for controlling the randomness of the generated text (defaults to 0).
- `**kwargs`: Any additional keyword arguments, passed through to `completion`.

Inside the function, it calls the `completion` function (from the `litellm` library) with the provided parameters to generate a completion from the language model. For models whose name contains `gpt`, the system message is prepended to `messages` as a `system` role entry; for other models it is passed separately as a `system` keyword argument. The `temperature` parameter defaults to 0, which is recommended for evaluations and RAG (Retrieval-Augmented Generation) systems because it makes results as repeatable as possible.

The response is then converted to a dictionary with `response.json()` and returned by the function.

By using the `@weave.op()` decorator, Weave automatically tracks and versions the inputs and outputs of the `get_completion` function, making it easier to reproduce and analyze the results later in the course.
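
To build intuition for what a tracing decorator does, here is a toy sketch that records each call's inputs and output in a list. It is a conceptual stand-in only: `toy_trace` and `CALL_LOG` are hypothetical names, and this is not how Weave is implemented.

```python
import functools

CALL_LOG = []  # Hypothetical in-memory log, used only for this illustration


def toy_trace(func):
    """Record the name, inputs, and output of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        CALL_LOG.append(
            {"name": func.__name__, "args": args, "kwargs": kwargs, "output": result}
        )
        return result
    return wrapper


@toy_trace
def shout(text: str) -> str:
    return text.upper()


shout("hello")
print(CALL_LOG[-1])
```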

In [11]:
@weave.op()
def get_completion(system_message: str, messages: list, model: str, max_tokens: int = 4096, temperature: float = 0, **kwargs) -> dict:
    """
    Generates a completion using the specified model, taking into account the system message, conversation history, and additional arguments.

    Parameters:
        system_message (str): A message providing context or instructions for the model.
        messages (list): A list of dictionaries representing the conversation history, where each dictionary has keys 'role' and 'content'.
        model (str): The identifier of the model to use for generating completions.
        max_tokens (int, optional): The maximum number of tokens to generate in the completion. Defaults to 4096.
        temperature (float, optional): The sampling temperature to control the randomness of the generated text. Defaults to 0.
        **kwargs: Additional keyword arguments passed through to the completion call (for example, 'response_format').

    Returns:
        dict: A dictionary representing the generated completion as JSON.
    """
    # Adjust messages format based on the model type
    if "gpt" in model.lower():
        formatted_messages = [{"role": "system", "content": system_message}] + messages
    else:
        kwargs["system"] = system_message  # For non-gpt models, use system_message directly in kwargs

    # Common arguments for the completion function
    completion_args = {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": formatted_messages if "gpt" in model.lower() else messages,
        **kwargs
    }

    # Generate and return the completion
    response = completion(**completion_args)
    return response.json()
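
To see the branching in `get_completion` without calling any API, the sketch below assembles the same arguments as a plain dictionary. `build_completion_args` is a hypothetical helper written for illustration; it mirrors the logic above but never contacts a model.

```python
def build_completion_args(system_message, messages, model,
                          max_tokens=4096, temperature=0, **kwargs):
    """Assemble completion arguments without calling any API (hypothetical helper)."""
    if "gpt" in model.lower():
        # OpenAI-style: the system message becomes the first chat message
        messages = [{"role": "system", "content": system_message}] + messages
    else:
        # Other providers: pass the system message as a separate keyword
        kwargs["system"] = system_message
    return {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": messages,
        **kwargs,
    }


user_turn = [{"role": "user", "content": "Hi"}]
gpt_args = build_completion_args("Be brief.", user_turn, "gpt-3.5-turbo")
claude_args = build_completion_args("Be brief.", user_turn, "claude-3-haiku-20240307")
print(gpt_args["messages"])
print(claude_args["system"])
```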


## Use Case: Building a Prompting Assistant

In this section, we explore the practical application of prompt engineering by building a bot that helps users understand prompting techniques and answers questions based on the provided information. This use case demonstrates the power of prompt engineering in creating helpful AI assistants that can make complex topics more accessible and engaging.

By leveraging the knowledge contained in a comprehensive guide on prompting techniques, we can develop a bot that provides accurate and relevant answers to user queries. Through careful crafting of system messages and prompt templates, we ensure that the bot's responses are not only informative but also easy to understand, even for beginners.

Throughout this use case, course participants will learn how to:

- Incorporate context to improve the relevance and accuracy of the model's responses
- Use system messages to guide the model's behavior and output style
- Standardize inputs and outputs for consistent and reusable prompting assistants
- Experiment with different configurations to optimize the bot's performance

By engaging with this use case, participants will gain hands-on experience in applying prompt engineering techniques to build a practical and helpful AI assistant. They will develop a deeper understanding of how to effectively communicate with language models and tailor their outputs to specific audiences and use cases.

### Step 1: Raw Prompting

We start by sending a question to the language model without any additional context, using a basic `prompt_llm` function. This demonstrates the model's limitations when lacking the necessary information to provide relevant answers.

In [12]:
@weave.op()
def prompt_llm(question: str, **kwargs) -> str:
    """
    Sends a question to the language model and returns its response.

    This function prepares a message with the user's question, handles additional
    arguments for the language model, and invokes the get_completion function to
    obtain a response. The response's content is then returned.

    Parameters:
        question (str): The question intended for the language model.
        **kwargs: A dictionary of additional keyword arguments. Expected keys include 'system_message', 'model', 'max_tokens', and 'temperature'.

    Returns:
        str: The language model's response to the question.
    """
    # Prepare the user's question for the language model
    messages = [{"role": "user", "content": question}]

    # Extract additional parameters, applying defaults if necessary
    system_message = kwargs.pop('system_message', "")
    model = kwargs.pop('model', MODEL_NAME)
    max_tokens = kwargs.pop('max_tokens', 4096)
    temperature = kwargs.pop('temperature', 0)

    # Compile arguments for the completion request
    completion_args = {
        "system_message": system_message,
        "messages": messages,
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature
    }
    completion_args.update(kwargs)  # Properly include any other additional arguments

    # Request a completion from the language model
    response = get_completion(**completion_args)

    # Extract and return the content of the model's response
    return response["choices"][0]["message"]["content"]
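
The last line of `prompt_llm` digs the answer text out of a nested dictionary. The sketch below shows that lookup on a hand-written mock response. The mock only mirrors the fields accessed above (real responses carry more metadata), and `extract_content` is a hypothetical helper that adds a check for an empty `choices` list.

```python
# Hand-written mock that mirrors only the fields accessed in prompt_llm
mock_response = {
    "choices": [
        {"message": {"role": "assistant", "content": "Zero-shot prompting means..."}}
    ]
}


def extract_content(response: dict) -> str:
    """Return the text of the first choice, or raise if there are no choices."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("Response contains no choices")
    return choices[0]["message"]["content"]


print(extract_content(mock_response))
```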


In [13]:
raw_prompt_response = prompt_llm(
    "Explain the latest prompting techniques and provide an example of each"
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/01921375-7157-7450-9d0a-ae0f353e2c82


In [14]:
render(raw_prompt_response)

As of my last update in 2023, prompting techniques in AI, particularly in the context of language
models like OpenAI's GPT series, have evolved significantly. These techniques are designed to
improve the interaction with AI models, enhancing their ability to understand and generate more
accurate, relevant, and contextually appropriate responses. Here are some of the latest prompting
techniques along with examples for each:  1. **Zero-Shot Learning**:    - **Description**: This
technique involves presenting a task to the model without any prior specific training on that task.
The model uses its pre-trained knowledge to generate a response.    - **Example**: Asking the model,
"What is the capital of France?" without having explicitly trained it on geographical facts.  2.
**Few-Shot Learning**:    - **Description**: This involves giving the model a few examples to
demonstrate the task before asking it to perform on a new example. This helps the model understand
the context or the type of 

The model's response to the raw prompt is inadequate because it lacks the necessary context to provide a meaningful answer. Without background information or specific details about prompting techniques, the model can only generate a generic, high-level response that fails to address the question effectively, and sometimes it provides no useful response at all.

This poor performance highlights the importance of providing relevant context when prompting language models. By supplying the model with additional information related to the topic at hand, we can guide it towards generating more accurate, detailed, and useful responses.

### Step 2: Prompting with Context

We can provide the necessary context to the language model by including it directly alongside the question. In this example, we use a comprehensive guide on prompting techniques written by [Lilian Weng](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/). This guide is particularly useful as it condenses many great papers and articles into a single-page resource, covering various prompting techniques.

To incorporate the context, we:

1. Load the markdown file containing the prompting guide using the `load_markdown_file` function.
2. Concatenate the loaded context with the question.
3. Pass the combined text to the `prompt_llm` function and store the result in the `context_prompt_response` variable.

By providing the model with relevant context, we expect to receive more accurate and informative answers to our questions about prompting techniques.

Note: As an alternative, you can also use a more extensive guide by [Aman Chadha](https://aman.ai/primers/ai/prompt-engineering/) for additional context and information on prompt engineering.

In [15]:
def load_markdown_file(file_path: str) -> str:
    """
    Reads and returns the content of a markdown file specified by its path.

    Parameters:
        file_path (str): The path to the markdown file to be read.

    Returns:
        str: The content of the markdown file as a string.
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        markdown_content = file.read()
    return markdown_content


context = load_markdown_file(PROMPT_GUIDE)

Note: Models such as Anthropic's Claude 3 family and `gpt-4-turbo` have large context windows, so in this situation we can simply place the whole document in the prompt. This can get quite expensive, however, so it typically makes more sense to split the document into smaller chunks or to use a RAG-based pipeline.
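
As a rough illustration of chunking, the sketch below greedily packs paragraphs into pieces of at most `max_chars` characters. The `chunk_by_paragraph` function is hypothetical and deliberately naive: it measures characters rather than tokens, and production pipelines usually split on tokens and add overlap between chunks.

```python
def chunk_by_paragraph(text: str, max_chars: int = 2000) -> list:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk
            current = candidate
        else:
            # Close the current chunk and start a new one with this paragraph
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


doc = "alpha " * 10 + "\n\n" + "beta " * 10 + "\n\n" + "gamma " * 10
chunks = chunk_by_paragraph(doc, max_chars=120)
print(len(chunks))
```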


In [16]:
context_prompt_response = prompt_llm(
    context + "\n\nExplain the latest prompting techniques and provide an example of each"
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/0192137e-1fec-7991-9d8e-beebb6d9bc19


In [17]:
render(context_prompt_response)

Prompt engineering has evolved significantly with the advent of large language models (LLMs),
offering various techniques to enhance the interaction and output quality of these models. Below,
I'll explain some of the latest prompting techniques and provide examples for each.  ### 1. **Zero-
Shot Prompting** Zero-shot prompting involves presenting a task to the model without any prior
examples or context, relying solely on the model's pre-trained knowledge.  **Example:** ``` Prompt:
"Translate the following sentence into French: 'Hello, how are you?'" Model Output: "Bonjour,
comment ça va?" ```  ### 2. **Few-Shot Prompting** Few-shot prompting provides the model with a few
examples to help it understand the task context and expected output format better.  **Example:** ```
Prompt: - English: "I am happy."   French: "Je suis heureux." - English: "She is running."   French:
"Elle court." - English: "They are watching a movie."   French: "Ils regardent un film." - English:
"Translate this s

This response is a significant improvement! We can see that by providing the model with relevant context, it can generate an answer that includes details about various prompting techniques covered in the course. The model effectively utilizes the information from the provided guide to address the question more comprehensively.

However, there is still room for improvement. The model tends to regurgitate the technical information from the guide without simplifying or explaining the concepts in an easily understandable manner. The response may be too complex or jargon-heavy for beginners or those new to the topic of prompt engineering.

To address this issue, we need to guide the model towards providing explanations that are more accessible and beginner-friendly. This is where the next step of conditioning the model's responses with a carefully crafted system prompt comes into play. By instructing the model to break down the technical details and present the information in a more digestible format, we can ensure that the responses are not only informative but also easy to understand for a broader audience.

### Step 3: Condition Responses with a System Prompt

To ensure that the bot explains the information in a way that is easy to understand, we can provide a system prompt that guides the model to present the content in a beginner-friendly manner. Here’s why this system prompt is effective:

1. **Objective Clarity**: It directly states the task — simplifying prompt engineering concepts with examples. This aligns with the principle of having a clear and specific objective, which helps the LLM focus on the exact task ([source](https://medium.com/the-modern-scientist/best-prompt-techniques-for-best-llm-responses-24d2ff4f6bca)).

2. **Tone Specification**: Setting a friendly and educational tone guides the LLM on the desired interaction style, making the information approachable and digestible.

3. **Context Awareness**: By acknowledging the user's basic AI knowledge, the prompt tailors the complexity of the content, ensuring it is suitable for beginners without being overly simplistic.

4. **Guidance on Style**: Instructing the use of analogies and simple examples helps in breaking down complex topics into understandable segments, which is crucial for teaching technical subjects effectively.

5. **Verification of Output**: Emphasizing clarity and relevance ensures that the responses are not only correct but also useful and directly applicable to the user’s needs.

6. **Highlighting Benefits**: Mentioning the benefits of simplifying technical concepts engages users by showing the value of what they are learning, enhancing their motivation and the educational impact.

### General Approach to Constructing Effective System Prompts

When constructing system prompts for LLMs, consider the following steps to ensure effectiveness and clarity:

- **Define the Objective**: Clearly state what you want the LLM to achieve. This should be specific and concise.
- **Set the Tone and Style**: Indicate how the response should feel or sound. This helps the LLM adjust its language and approach.
- **Provide Necessary Context**: Include any background information that will help the LLM understand the scope and depth of the response required.
- **Incorporate Guidance for Content**: Direct the LLM on how to structure its response or what elements to include, such as examples or analogies.
- **Specify Output Format**: If necessary, define how the response should be formatted. This is particularly important for tasks requiring a specific output structure.
- **Use Clear and Direct Language**: Avoid ambiguity by using straightforward and direct language. This reduces the chances of misinterpretation.
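
The checklist above can be turned into a small helper that assembles a system message from labeled sections. This is only a sketch: `build_system_message` is a hypothetical function, and the example sections are placeholders rather than the system prompt used in this course.

```python
def build_system_message(**sections: str) -> str:
    """Join labeled sections into one system message, one 'Label: text' line each."""
    lines = []
    for label, text in sections.items():
        if text and text.strip():
            # 'output_format' becomes 'Output Format'
            pretty_label = label.replace("_", " ").title()
            lines.append(f"{pretty_label}: {text.strip()}")
    return "\n".join(lines)


example_message = build_system_message(
    objective="Summarize the article in three bullet points.",
    tone="Neutral and concise.",
    output_format="Markdown bullet list.",
)
print(example_message)
```

Keeping each section on its own labeled line makes it easy to change one aspect, such as the tone, while leaving the rest of the prompt untouched.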

In [18]:
system_message = """
Objective: Simplify prompt engineering concepts for easy understanding. Provide clear examples for each technique.
Tone: Friendly and educational, suitable for beginners.
Context: Assume basic AI knowledge; avoid deep technical jargon.
Guidance: Use metaphors and simple examples to explain concepts. Keep explanations concise and applicable.
Verification: Ensure clarity and relevance in responses, with practical examples.
Benefits: Help users grasp prompt engineering basics, enhancing their AI interaction experience.
"""

In [19]:
system_and_context_prompt_response = prompt_llm(
    system_message=system_message,
    question=context + "\n\nExplain the latest prompting techniques and provide an example of each"
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/01921386-1021-7fc1-b147-46add5bc6004


In [20]:
render(system_and_context_prompt_response)

Prompt engineering is like giving your AI a recipe to follow when cooking up answers. It's about
crafting the right questions or instructions to get the most accurate and useful responses. Let's
break down some of the latest techniques in prompt engineering and provide examples for each:  ###
1. **Zero-Shot Learning** Imagine you've never baked a cake before, and someone asks you to make one
without any recipe or prior experience. That's zero-shot learning. You provide the AI with a task it
has never seen before and ask for a response based on its pre-existing knowledge.  **Example:** ```
Prompt: "Explain the theory of relativity." Response: [AI generates an explanation based on its
training] ```  ### 2. **Few-Shot Learning** This is like having a few recipes as guides before you
try to bake a new cake. You show the AI several examples of the task done correctly, and then ask it
to perform a similar task.  **Example:** ``` Prompt: 1. "Translate 'Hello, how are you?' into
French." Respo

Great! Now we're able to get a response that is easy to understand and provides a lot of context. The model has successfully broken down the technical concepts into beginner-friendly explanations, using simple language, analogies, and examples to convey the key points. This approach makes the information more accessible and engaging for those new to prompt engineering, fostering a deeper understanding of the subject matter.

The next step is to standardize the inputs and outputs in a way that allows us to ask different questions and pass different context in the future. By creating a consistent structure for our prompts and responses, we can streamline the development of LLM applications and make it easier to experiment with various configurations. This standardization will enable us to quickly iterate on our prompts, test different contexts, and fine-tune our models to achieve the best possible results.

### Step 4: System Prompts - Inputs

To make it easier to experiment with different parameters and retrieve our best models, we can wrap our prompting logic in a collection of modular functions decorated with `@weave.op()`. This allows us to track and version our operations, making it easier to reproduce and analyze our results. We can also define a standard input structure for our system prompts, ensuring consistency across different prompts and use cases.

In [25]:
prompt_template = "{context}\n{question}"

In [21]:
@weave.op()
def format_prompt(prompt_template: str, **kwargs):
    """
    Formats a prompt template with provided keyword arguments.

    This function takes a template string and a dictionary of keyword arguments,
    then formats the template string using these arguments.

    Parameters:
        prompt_template (str): The template string to be formatted.
        **kwargs (dict): Keyword arguments to format the template string with.

    Returns:
        str: The formatted prompt template.
    """
    return prompt_template.format(**kwargs)
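
Since `format_prompt` is a thin wrapper around Python's built-in `str.format`, its behavior can be checked in isolation. The sketch below fills a template of the same shape with placeholder strings and shows that a missing placeholder raises a `KeyError`.

```python
template = "{context}\n{question}"

# Every placeholder in the template needs a matching keyword argument
filled = template.format(
    context="Few-shot prompting supplies worked examples.",
    question="What is few-shot prompting?",
)
print(filled)

# str.format raises KeyError when a placeholder has no matching argument
try:
    template.format(context="only context")
    missing = None
except KeyError as err:
    missing = err.args[0]
print("Missing placeholder:", missing)
```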

In [22]:
# In this app we assume only a context and question are passed to the prompt template but that need not be true
@weave.op()
def llm_app(prompt_template: str, context: str, question: str, **kwargs):
    """
    Generates a response using a formatted prompt based on a template, context, and question.

    This function formats a given prompt template with the specified context and question, then
    generates a response using the prompt_llm function with additional keyword arguments.

    Parameters:
        prompt_template (str): The template string used to format the prompt.
        context (str): The context information to be included in the prompt.
        question (str): The specific question to be asked in the prompt.
        **kwargs (dict): Additional keyword arguments to be passed to the prompt_llm function.

    Returns:
        str: A string representing the generated response.
    """
    formatted_prompt = format_prompt(prompt_template=prompt_template, context=context, question=question)
    response = prompt_llm(
        question=formatted_prompt,
        **kwargs
    )
    return response


We have now defined `llm_app`, which acts as the core of our prompting assistant.

In [23]:
question = """
Explain the differences between chain of thought and Self-Consistency Sampling
prompting techniques? Please provide a clear explanation and a practical example
for each technique within a structured format.
"""

In [26]:
input_template_response = llm_app(
    system_message=system_message,
    prompt_template=prompt_template,
    context=context,
    question=question,
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/0192138a-ce8d-7b51-9a43-797397b3f298


In [27]:
render(input_template_response)

### Chain of Thought (CoT) Prompting  **Explanation:** Chain of Thought (CoT) prompting is a
technique used to guide large language models (LLMs) to solve complex reasoning tasks by explicitly
asking them to generate intermediate reasoning steps before arriving at a final answer. This method
encourages the model to "think aloud" by breaking down the problem into smaller, manageable parts,
which are sequentially addressed to build up to the solution.  **Example:** Imagine you're asking an
AI to solve the following math problem:  **Problem:** "Alice has 10 apples. She gives 3 to Bob and
then receives 5 more from Carol. How many apples does Alice have now?"  **CoT Prompt:** ``` Let's
think step by step. Alice starts with 10 apples. She gives 3 to Bob, so 10 - 3 = 7 apples. Then,
Carol gives her 5 more apples. So, 7 + 5 = 12 apples. Therefore, Alice has 12 apples now. ```  In
this example, the AI breaks down the problem into smaller steps, articulating each part of the
process, which helps

Now we can easily and consistently swap system messages, context, and questions to get the best results. This allows us to test various combinations of system messages, context, and questions to find the most effective prompts for our specific use case.
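
One way to run such combinations systematically is a small grid loop. The sketch below is an assumption about how this could be organized: `run_grid` is a hypothetical helper, and `fake_app` stands in for `llm_app` so the example runs without an API key.

```python
from itertools import product


def run_grid(app, system_messages, questions, context):
    """Call app once per (system message, question) pair and collect the results."""
    results = []
    for system_message, question in product(system_messages, questions):
        answer = app(system_message=system_message, context=context, question=question)
        results.append(
            {"system_message": system_message, "question": question, "answer": answer}
        )
    return results


def fake_app(system_message, context, question):
    # Stand-in for llm_app so the sketch runs offline
    return f"[{system_message}] {question}"


grid = run_grid(
    fake_app,
    system_messages=["terse", "friendly"],
    questions=["What is CoT?", "What is few-shot?"],
    context="...",
)
print(len(grid))
```

To drive the real `llm_app` instead, it could be wrapped with `functools.partial(llm_app, prompt_template=prompt_template)` and passed in place of `fake_app`.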

However, as we can see from the example, the current system prompt doesn't work as well with the newly asked question. This highlights the importance of tailoring the system prompt to the specific task at hand. Additionally, we may want to enforce more consistency in the format of our model's outputs. While using third-party packages like Instructor is beyond the scope of this course, we can achieve similar results by using proper tags in our prompt. By including specific tags or formatting instructions in the prompt, we can guide the model to respond in a way that is more consistent and easier to parse on our end.



### Step 5: System Prompts - Outputs

In this step, we focus on improving the consistency and structure of our model's outputs by modifying the prompt template. By including specific tags and formatting instructions in the prompt, we can guide the model to respond in a way that is easier to parse and process.

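In the system message, we've added a new field called `Format`. For GPT models, it instructs the model to respond with a JSON object containing a `condensed_answer` key and numbered explanation entries, each with a `detail` and an `example`. For other models, it instructs the model to respond within an `<answer></answer>` tag, with a `<condensed_answer>` tag and separate `<explanation>` tags that each contain `<detail>` and `<example>` tags. This structured format helps organize the information and makes it easier to extract and analyze the responses programmatically.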


In [28]:
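def update_prompt_with_output_indicator(system_message: str, prompt_template: str):
    """
    Appends output-format instructions to the system message and prompt template.

    GPT models are asked for JSON; all other models are asked for XML-style tags.
    """
    if "gpt" in MODEL_NAME.lower():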
        system_format_msg = """
    Format: Respond within a structured JSON object, using the keys provided in the prompt to organize your response.
    Provide a condensed answer under the 'condensed_answer' key, detailed explanations under 'explanation' keys,
    and examples under 'example' keys within each explanation.
    """
        prompt_format_msg = """
    You must respond in JSON format.
    Your response should follow this structure:
    {{
      "answer": {{
        "condensed_answer": "CONDENSED_ANSWER",
        "explanation_1": {{
          "detail": "EXPLANATION_1",
          "example": "EXAMPLE_1"
        }},
        "explanation_2": {{
          "detail": "EXPLANATION_2",
          "example": "EXAMPLE_2"
        }},
        ...
      }}
    }}
    """
    else:
        system_format_msg = """
    Format: Respond within an <answer></answer> tag, with as many <explanation></explanation> tags as needed,
    ensuring that the <detail></detail> and <example></example> tags are used within each <explanation></explanation> tag.
    Provide a condensed answer for the question in the <condensed_answer></condensed_answer> tag.
    """
        prompt_format_msg = """
    You must respond within <answer></answer> XML tags.
    Inside the <answer> tag, you must use the following format:
    <answer>
        <condensed_answer> CONDENSED_ANSWER </condensed_answer>
        <explanation>
          <detail> EXPLANATION </detail>
          <example> EXAMPLE </example>
        </explanation>
        <explanation>
          <detail> EXPLANATION </detail>
          <example> EXAMPLE </example>
        </explanation>
        ...
    </answer>
    """
    formatted_system_message = system_message + "\n" + system_format_msg
    formatted_prompt_template = prompt_template + "\n" + prompt_format_msg

    return formatted_system_message, formatted_prompt_template
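
The doubled braces (`{{` and `}}`) in the JSON skeleton above are presumably there because the prompt template is later filled in with `str.format`, which treats single braces as placeholders. That is an assumption about how `llm_app` builds the final prompt, since its definition is not shown in this section. A minimal sketch of the behavior, using a hypothetical template:

```python
# Hypothetical template: only `question` is a real placeholder.
template = 'Question: {question}\nRespond as: {{"answer": "..."}}'

# str.format collapses the doubled braces to single ones and fills the placeholder.
filled = template.format(question="What is few-shot prompting?")
print(filled)
# Question: What is few-shot prompting?
# Respond as: {"answer": "..."}

# Braces inside a substituted *value* are inserted untouched,
# so doubled braces in a value stay doubled.
value_with_braces = template.format(question='{{"k": 1}}')
print(value_with_braces)
# Question: {{"k": 1}}
# Respond as: {"answer": "..."}
```

If the template were filled in some other way (for example with plain string concatenation), the doubled braces would reach the model as written.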

In [29]:
formatted_system_message, formatted_prompt_template = update_prompt_with_output_indicator(system_message, prompt_template)

Similarly, we've updated the `prompt_template` to include the new formatting instructions, ensuring that the model generates responses that adhere to the specified structure. By enforcing a consistent output format, we can streamline the processing of the model's responses and facilitate further analysis and evaluation of the results.
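
As a rough illustration of the programmatic processing mentioned above, the sketch below parses either format into a plain dictionary. It assumes the response is available as a string; `parse_structured_response` and the sample strings are hypothetical and not part of the course code. The XML-style branch uses regular expressions because model output is not guaranteed to be well-formed XML.

```python
import json
import re


def parse_structured_response(text: str) -> dict:
    """Parse a response that follows either the JSON or the XML-style format."""
    stripped = text.strip()
    if stripped.startswith("{"):
        # JSON branch: the top-level "answer" object holds everything.
        return json.loads(stripped)["answer"]

    def first(tag: str, source: str) -> str:
        # Return the stripped content of the first <tag>...</tag> pair, or "".
        match = re.search(rf"<{tag}>(.*?)</{tag}>", source, re.DOTALL)
        return match.group(1).strip() if match else ""

    # XML-style branch: one numbered entry per <explanation> block.
    parsed = {"condensed_answer": first("condensed_answer", stripped)}
    blocks = re.findall(r"<explanation>(.*?)</explanation>", stripped, re.DOTALL)
    for index, block in enumerate(blocks, start=1):
        parsed[f"explanation_{index}"] = {
            "detail": first("detail", block),
            "example": first("example", block),
        }
    return parsed


# Hand-written sample responses (not real model output).
sample_xml = """
<answer>
  <condensed_answer> Short summary. </condensed_answer>
  <explanation>
    <detail> No examples are given. </detail>
    <example> Classify: Great film! </example>
  </explanation>
</answer>
"""
sample_json = (
    '{"answer": {"condensed_answer": "Short summary.", '
    '"explanation_1": {"detail": "No examples are given.", '
    '"example": "Classify: Great film!"}}}'
)

# Both formats reduce to the same dictionary.
print(parse_structured_response(sample_xml) == parse_structured_response(sample_json))
```

A real response may need extra handling, for instance when the model wraps its JSON in a Markdown code fence or nests an object under `example`, as some of the outputs in this notebook do.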

In [30]:
question = """
Explain the differences between zero-shot, few-shot, and chain of thought
prompting techniques. Please provide a clear explanation and a practical example
for each technique within a structured format.
"""

In [31]:
output_indicator_response = llm_app(
    system_message=formatted_system_message,
    prompt_template=formatted_prompt_template,
    context=context,
    question=question,
    # response_format={"type": "json_object"}  # Optional JSON mode: uncomment for GPT models. For Claude models, leave it commented out or set `litellm.drop_params = True`.
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/0192138c-e304-74f0-9d19-156c0b08058b


In [32]:
render(output_indicator_response)

{   "answer": {     "condensed_answer": "Zero-shot, few-shot, and chain-of-thought prompting are
techniques used to guide language models in generating responses. Zero-shot involves no prior
examples, relying solely on the task description. Few-shot uses a small set of examples to guide the
model. Chain-of-thought involves prompting the model to generate a step-by-step reasoning process
before arriving at the final answer.",     "explanation_1": {       "detail": "Zero-shot prompting
is when the model is given a task without any previous examples or context, relying solely on its
pre-trained knowledge to generate a response. This method tests the model's ability to understand
and respond based on its initial training alone.",       "example": {         "prompt": "Text: What
is the capital of France?\\nAnswer:",         "response": "The capital of France is Paris."       }
},     "explanation_2": {       "detail": "Few-shot prompting provides the model with a small number
of examples be

## Advanced Prompting Techniques

### Zero-shot Prompting

In [33]:
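def update_with_zero_shot_prompt(question: str):
    # Zero-shot: prepend an instruction only; no worked examples are added.
    zero_shot_instruction = "Without using any specific examples, "
    # Drop the question's leading newline so the instruction and question read as one sentence.
    return zero_shot_instruction + question.lstrip()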

In [35]:
zero_shot_question = update_with_zero_shot_prompt(question)

In [36]:
zero_shot_response = llm_app(
    system_message=formatted_system_message,
    prompt_template=formatted_prompt_template,
    context=context,
    question=zero_shot_question,
    # response_format={"type": "json_object"}  # Optional JSON mode: uncomment for GPT models. For Claude models, leave it commented out or set `litellm.drop_params = True`.
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/0192138f-4cd1-7f80-b381-da722b9d4da6


In [37]:
render(zero_shot_response)

{   "answer": {     "condensed_answer": "Zero-shot, few-shot, and chain-of-thought prompting are
techniques used to guide language models in generating responses. Zero-shot involves no prior
examples, relying solely on the prompt. Few-shot uses a small set of examples to guide the model.
Chain-of-thought involves prompting the model to generate a step-by-step reasoning process before
arriving at the final answer.",     "explanation_1": {       "detail": "Zero-shot prompting is when
the model is given a task without any previous examples or context, relying solely on its pre-
trained knowledge to generate a response.",       "example": "Prompt: 'What is the capital of
France?' The model directly answers 'Paris' without any prior examples."     },     "explanation_2":
{       "detail": "Few-shot prompting provides the model with a few examples of the task at hand,
helping it understand the context and desired output format better.",       "example": "Prompt:
'Translate the following sent

### Few-shot Prompting

In [38]:
def update_with_few_shot_prompt(question: str):
    """Prepend worked examples (JSON for GPT models, XML-style otherwise) to the question."""
    if "gpt" in MODEL_NAME:
        few_shot_examples = """
        Here are a few examples of prompting techniques in JSON format:
        {{
            "answer": {{
                "condensed_answer": "Different prompting techniques are used to guide language models in generating desired outputs.",
                "explanation_1": {{
                    "detail": "Translation prompts provide the model with a source language text and request the translation in a target language.",
                    "example": "Translate the following English text to French: 'Hello, how are you?'"
                }},
                "explanation_2": {{
                    "detail": "Sentiment classification prompts ask the model to determine the sentiment expressed in a given text.",
                    "example": "Classify the sentiment of the following text: 'The movie was terrible.'"
                }},
                "explanation_3": {{
                    "detail": "Factual question prompts require the model to provide an answer along with an explanation or reasoning.",
                    "example": "What is the capital of Germany? Explain your reasoning."
                }}
            }}
        }}
        """
    else:
        few_shot_examples = """
        Here are a few examples of prompting techniques in XML format:
        <answer>
            <condensed_answer>Different prompting techniques are used to guide language models in generating desired outputs.</condensed_answer>
            <explanation>
                <detail>Translation prompts provide the model with a source language text and request the translation in a target language.</detail>
                <example>Translate the following English text to Spanish: 'Good morning, how can I help you?'</example>
            </explanation>
            <explanation>
                <detail>Sentiment classification prompts ask the model to determine the sentiment expressed in a given text.</detail>
                <example>Classify the sentiment of the following text: 'I love this product!'</example>
            </explanation>
            <explanation>
                <detail>Factual question prompts require the model to provide an answer along with an explanation or reasoning.</detail>
                <example>What is the capital of Canada? Provide your thought process.</example>
            </explanation>
        </answer>
        """

    return few_shot_examples + "\n" + question
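
The few-shot examples above are written out by hand. As a hedged alternative, the sketch below builds the JSON variant from a list of `(detail, example)` pairs with `json.dumps`, which guarantees valid JSON and makes it easy to swap examples in and out. `build_json_few_shot_block` is a hypothetical helper, not part of the course code.

```python
import json


def build_json_few_shot_block(condensed_answer: str, examples: list[tuple[str, str]]) -> str:
    """Render (detail, example) pairs in the notebook's JSON answer layout."""
    answer = {"condensed_answer": condensed_answer}
    for index, (detail, example) in enumerate(examples, start=1):
        answer[f"explanation_{index}"] = {"detail": detail, "example": example}
    body = json.dumps({"answer": answer}, indent=4)
    return "Here are a few examples of prompting techniques in JSON format:\n" + body


few_shot_block = build_json_few_shot_block(
    "Different prompting techniques are used to guide language models in generating desired outputs.",
    [
        (
            "Translation prompts provide the model with a source language text and request the translation in a target language.",
            "Translate the following English text to French: 'Hello, how are you?'",
        ),
        (
            "Sentiment classification prompts ask the model to determine the sentiment expressed in a given text.",
            "Classify the sentiment of the following text: 'The movie was terrible.'",
        ),
    ],
)
print(few_shot_block)
```

Note that `json.dumps` emits single braces. That is what the model should see if the block is passed in as the `question` value, but the braces would need doubling if the block were instead pasted into a template that is later processed by `str.format`.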

In [39]:
few_shot_question = update_with_few_shot_prompt(question)

In [40]:
few_shot_response = llm_app(
    system_message=formatted_system_message,
    prompt_template=formatted_prompt_template,
    context=context,
    question=few_shot_question,
    # response_format={"type": "json_object"}  # Optional JSON mode: uncomment for GPT models. For Claude models, leave it commented out or set `litellm.drop_params = True`.
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/0192138f-cd3e-7a10-b534-cad92c86f8f5


In [41]:
render(few_shot_response)

{   "answer": {     "condensed_answer": "Zero-shot, few-shot, and chain of thought (CoT) are
prompting techniques that guide language models to generate desired outputs based on different
levels of context and instruction.",     "explanation_1": {       "detail": "Zero-shot prompting
involves presenting a task to the model without any prior examples or context. The model uses its
pre-trained knowledge to generate a response.",       "example": {         "prompt": "Translate the
sentence 'Hello, how are you?' to French.",         "model_response": "Bonjour, comment allez-vous?"
}     },     "explanation_2": {       "detail": "Few-shot prompting provides the model with a few
examples of the task before presenting the actual query. This helps the model understand the task
context and expected output format better.",       "example": {         "prompt": "Given the
examples: (1) 'The sky is blue.' - Sentiment: Positive, (2) 'It is raining again.' - Sentiment:
Negative. Classify the sentimen

### Chain of Thought

Note: We do not add the output indicators here, because enforcing a strict output format would crowd out the step-by-step reasoning that chain-of-thought prompting relies on. Instead, the prompt itself must explicitly ask for the desired thought process.

In [42]:
def update_with_chain_of_thought_prompt(system_message: str, prompt_template: str):
    # Cue appended to the end of the prompt so the model continues with its reasoning.
    chain_of_thought_instruction = "Let's explicitly think step by step. My thought process is:\n"
    # System-level instruction asking for the reasoning to be spelled out.
    chain_of_thought_system_format = "Format: You must explicitly define the thought process and knowledge from the context to come to your conclusion for the question."
    return system_message + "\n" + chain_of_thought_system_format, prompt_template + "\n" + chain_of_thought_instruction

In [43]:
chain_of_thought_system_message, chain_of_thought_prompt = update_with_chain_of_thought_prompt(system_message, prompt_template)
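
To see what the model actually receives, it can help to print the composed prompt. The sketch below repeats the function so that it runs on its own and uses hypothetical stand-ins (`demo_system_message`, `demo_prompt_template`) for the notebook's real `system_message` and `prompt_template`; it also assumes the template is filled with `str.format`.

```python
def update_with_chain_of_thought_prompt(system_message: str, prompt_template: str):
    chain_of_thought_instruction = "Let's explicitly think step by step. My thought process is:\n"
    chain_of_thought_system_format = "Format: You must explicitly define the thought process and knowledge from the context to come to your conclusion for the question."
    return system_message + "\n" + chain_of_thought_system_format, prompt_template + "\n" + chain_of_thought_instruction


# Hypothetical stand-ins for the notebook's real system message and template.
demo_system_message = "You are a helpful assistant."
demo_prompt_template = "Context: {context}\nQuestion: {question}"

demo_system, demo_template = update_with_chain_of_thought_prompt(
    demo_system_message, demo_prompt_template
)

# Fill the template the way the application is assumed to.
demo_filled = demo_template.format(
    context="(retrieved context)", question="What is few-shot prompting?"
)
print(demo_system)
print("---")
print(demo_filled)
```

The filled prompt ends with the cue "My thought process is:", so the model's reply naturally begins with its reasoning rather than with a final answer.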


In [44]:
chain_of_thought_response = llm_app(
    system_message=chain_of_thought_system_message,
    prompt_template=chain_of_thought_prompt,
    context=context,
    question=question,
    # response_format={"type": "json_object"}  # Leave commented out here: the chain-of-thought prompt asks for free-form reasoning, not JSON.
)

🍩 https://wandb.ai/imvenkata-ubs/beginner-llm-prompting-course/r/call/01921390-55de-76c2-89a8-d16fb632bf5f


In [45]:
render(chain_of_thought_response)

### Zero-Shot Prompting **Explanation:** Zero-shot prompting involves presenting a task to a
language model without providing any previous examples or context. The model uses its pre-trained
knowledge to generate a response based solely on the input prompt.  **Example:** Imagine you want to
know the sentiment of a movie review but you don't provide any prior examples of sentiment analysis.
You simply ask: ``` Text: "The movie was breathtaking and beautifully executed." Sentiment: ``` The
model must infer the sentiment based solely on its pre-trained understanding of the language used in
the review.  ### Few-Shot Prompting **Explanation:** Few-shot prompting gives the model a small
number of examples (demonstrations) before presenting the actual task. This helps the model
understand the specific task requirements and adjust its responses accordingly.  **Example:** You
want to perform sentiment analysis again, but this time you provide a few examples first: ``` Text:
"This film was a fan