<a href="https://colab.research.google.com/github/wandb/llm-workshop-fc2024/blob/main/part_1_prompting/prompt_engineering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
<!--- @wandbcode{llm-workshop-fc2024-prompting} -->

# Prompting Workshop with Weights and Biases - [Anish Shah](https://www.linkedin.com/in/anish-shah/)

In [None]:
%%capture
!pip install set-env-colab-kaggle-dotenv -q
!pip install weave -U -q
!pip install litellm -U -q

In [None]:
try:
    import google.colab
    !git clone https://github.com/wandb/llm-workshop-fc2024.git
    %cd llm-workshop-fc2024/part_1_prompting
except ImportError:  # Not running in Colab; skip the clone
    pass

Cloning into 'llm-workshop-fc2024'...
remote: Enumerating objects: 160, done.
remote: Counting objects: 100% (160/160), done.
remote: Compressing objects: 100% (116/116), done.
remote: Total 160 (delta 90), reused 97 (delta 37), pack-reused 0
Receiving objects: 100% (160/160), 727.72 KiB | 12.13 MiB/s, done.
Resolving deltas: 100% (90/90), done.
/content/llm-workshop-fc2024/part_1_prompting


In [None]:
%%capture
from text_formatting import render
from set_env import set_env
set_env("ANTHROPIC_API_KEY")
set_env("WANDB_API_KEY")
set_env("OPENAI_API_KEY")

Welcome to the Prompting Workshop with Weights and Biases, led by Anish Shah. This workshop is designed to explore the fascinating world of prompt engineering, a crucial aspect of interacting with and leveraging the capabilities of large language models (LLMs). Throughout this session, we'll dive into various techniques for crafting effective prompts that can significantly enhance the performance of LLMs across a wide range of tasks.

Whether you're new to AI and machine learning or looking to deepen your understanding of prompt engineering, this workshop will provide you with valuable insights and practical skills. By the end of this session, you'll be equipped to design and implement prompts that effectively communicate your intentions to LLMs, enabling more accurate and relevant responses.

This section of the notebook sets up the environment and installs the required libraries:

- [`weave`](https://wandb.github.io/weave/), which is used for tracking LLM operations
- `litellm`, which standardizes model interaction and makes it easy to swap model providers
- Helper functions (`render` and `set_env`) for rendering output and setting environment variables

In [None]:
%%capture
import weave
from litellm import completion

## Model and Prompt Configuration
The following cell defines the configuration variables used throughout the prompting workshop:

In [None]:
# These variables store the names of different language models from Anthropic and OpenAI.
# The "SMART" models (`claude-3-opus` and `gpt-4-turbo`) are more capable but slower,
# while the "FAST" models (`claude-3-haiku` and `gpt-3.5-turbo`) are faster but less powerful.
ANTHROPIC_SMART_MODEL_NAME = "claude-3-opus-20240229"
ANTHROPIC_FAST_MODEL_NAME = "claude-3-haiku-20240307"
OPENAI_SMART_MODEL_NAME = "gpt-4-turbo-2024-04-09"
OPENAI_FAST_MODEL_NAME = "gpt-3.5-turbo"

# These variables point to two different markdown files containing prompt engineering guides.
# `AMAN_PROMPT_GUIDE` refers to Aman Chadha's guide, while `LILIAN_PROMPT_GUIDE` refers to Lilian Weng's guide.
AMAN_PROMPT_GUIDE = "aman_prompt_engineering.md"
LILIAN_PROMPT_GUIDE = "lilianweng_prompt_engineering.md"

# Here, the `MODEL_NAME` variable is set to use Anthropic's smart model (`claude-3-opus`),
# and the `PROMPT_GUIDE` variable selects Lilian Weng's prompt engineering guide.
MODEL_NAME = ANTHROPIC_SMART_MODEL_NAME
PROMPT_GUIDE = LILIAN_PROMPT_GUIDE

These configuration variables allow workshop participants to easily switch between different models and prompt guides throughout the workshop by modifying the assigned values.
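
As a sketch of how this switching could be made less error-prone, the model names above can be kept in a small lookup table. The `MODELS` dictionary and the `pick_model` helper are hypothetical names introduced here for illustration; they are not part of the workshop code.

```python
# Hypothetical registry; the identifiers are the ones from the configuration cell.
MODELS = {
    ("anthropic", "smart"): "claude-3-opus-20240229",
    ("anthropic", "fast"): "claude-3-haiku-20240307",
    ("openai", "smart"): "gpt-4-turbo-2024-04-09",
    ("openai", "fast"): "gpt-3.5-turbo",
}


def pick_model(provider: str, tier: str) -> str:
    """Return the model identifier for a provider/tier pair (case-insensitive)."""
    key = (provider.lower(), tier.lower())
    if key not in MODELS:
        raise ValueError(f"Unknown provider/tier combination: {provider!r}/{tier!r}")
    return MODELS[key]


# Equivalent to MODEL_NAME = ANTHROPIC_SMART_MODEL_NAME in the cell above.
MODEL_NAME = pick_model("anthropic", "smart")
print(MODEL_NAME)
```

A typo in the provider or tier then fails immediately with a `ValueError` instead of surfacing later as a failed API call.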

## Initializing Weave

This line initializes the Weave library for the "prompting-workshop-test" project. Weave is a toolkit for developing Generative AI applications, providing features like logging, debugging, evaluations, and organization of LLM workflows.

Initializing Weave at the start allows you to leverage its capabilities throughout your project, such as decorating Python functions with `@weave.op()` to enable automatic tracing and versioning.

By specifying the project name "prompting-workshop-test", you are setting up a dedicated workspace for this workshop within Weave. This helps keep the workshop-related experiments, models, and data organized and separate from other projects.

Weave brings structure and best practices to the experimental nature of Generative AI development, making it easier to track, reproduce, and share your work. Initializing it early in the notebook ensures you can take full advantage of its features as you progress through the workshop.

In [None]:
weave.init("prompting-workshop-test")

Logged in as W&B user alsmail10.
View Weave data at https://wandb.ai/alsmail10/prompting-workshop-test/weave




## Defining the get_completion function

This code defines a function called `get_completion` that is decorated with `@weave.op()`. The `@weave.op()` decorator is provided by Weave and enables automatic tracing and versioning of the function.

The `get_completion` function takes several parameters:
- `system_message`: The system message to provide context or instructions to the language model.
- `messages`: A list of messages representing the conversation history.
- `model`: The name of the language model to use (required here; `prompt_llm` falls back to `MODEL_NAME` when no model is given).
- `max_tokens`: The maximum number of tokens to generate in the response (defaults to 4096).
- `temperature`: The sampling temperature for controlling the randomness of the generated text (defaults to 0).

Inside the function, it calls the `completion` function (from the `litellm` library) with the provided parameters to generate a completion from the language model. For GPT models, the system message is prepended to `messages` as a `system` role entry; for other models, it is passed as a separate `system` argument. The `temperature` parameter defaults to 0, which is recommended for evaluations and RAG (Retrieval-Augmented Generation) systems because it makes results more repeatable.

The generated response is then converted to JSON using `response.json()` and returned by the function.

By using the `@weave.op()` decorator, Weave automatically tracks and versions the inputs and outputs of the `get_completion` function, making it easier to reproduce and analyze the results later in the workshop.

In [None]:
@weave.op()
def get_completion(system_message: str, messages: list, model: str, max_tokens: int = 4096, temperature: float = 0, **kwargs) -> dict:
    """
    Generates a completion using the specified model, taking into account the system message, conversation history, and additional arguments.

    Parameters:
        system_message (str): A message providing context or instructions for the model.
        messages (list): A list of dictionaries representing the conversation history, where each dictionary has keys 'role' and 'content'.
        model (str): The identifier of the model to use for generating completions.
        max_tokens (int, optional): The maximum number of tokens to generate in the completion. Defaults to 4096.
        temperature (float, optional): The sampling temperature to control the randomness of the generated text. Defaults to 0.
        **kwargs: Additional keyword arguments. Extra arguments for `completion` are read from a nested 'kwargs' entry, if present.

    Returns:
        dict: A dictionary representing the generated completion as JSON.
    """
    # Adjust messages format based on the model type
    kwargs = kwargs.get("kwargs", {})
    if "gpt" in model.lower():
        formatted_messages = [{"role": "system", "content": system_message}] + messages
    else:
        kwargs["system"] = system_message  # For non-gpt models, use system_message directly in kwargs

    # Common arguments for the completion function
    completion_args = {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": formatted_messages if "gpt" in model.lower() else messages,
        **kwargs
    }

    # Generate and return the completion
    response = completion(**completion_args)
    return response.json()
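
To see the effect of the model-name branch without calling any API, the argument assembly can be reproduced in a standalone helper. `build_completion_args` is a hypothetical name used only for this sketch; it mirrors the branching in `get_completion` and assumes nothing about how `litellm` treats the resulting arguments.

```python
from typing import Any, Dict, List


def build_completion_args(
    system_message: str,
    messages: List[Dict[str, str]],
    model: str,
    max_tokens: int = 4096,
    temperature: float = 0.0,
) -> Dict[str, Any]:
    """Assemble completion arguments the same way get_completion branches on the model name."""
    args: Dict[str, Any] = {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if "gpt" in model.lower():
        # GPT models: the system message becomes the first chat message.
        args["messages"] = [{"role": "system", "content": system_message}] + list(messages)
    else:
        # Other models: the system message travels as a separate argument.
        args["messages"] = list(messages)
        args["system"] = system_message
    return args


user_turn = [{"role": "user", "content": "What is few-shot prompting?"}]
gpt_args = build_completion_args("Be concise.", user_turn, "gpt-3.5-turbo")
claude_args = build_completion_args("Be concise.", user_turn, "claude-3-haiku-20240307")

print(gpt_args["messages"][0]["role"])
print("system" in claude_args)
```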


## Use Case: Building a Prompting Assistant

In this section, we explore the practical application of prompt engineering by building a bot that helps users understand prompting techniques and answers questions based on the provided information. This use case demonstrates the power of prompt engineering in creating helpful AI assistants that can make complex topics more accessible and engaging.

By leveraging the knowledge contained in a comprehensive guide on prompting techniques, we can develop a bot that provides accurate and relevant answers to user queries. Through careful crafting of system messages and prompt templates, we ensure that the bot's responses are not only informative but also easy to understand, even for beginners.

Throughout this use case, workshop participants will learn how to:

- Incorporate context to improve the relevance and accuracy of the model's responses
- Use system messages to guide the model's behavior and output style
- Standardize inputs and outputs for consistent and reusable prompting assistants
- Experiment with different configurations to optimize the bot's performance

By engaging with this use case, participants will gain hands-on experience in applying prompt engineering techniques to build a practical and helpful AI assistant. They will develop a deeper understanding of how to effectively communicate with language models and tailor their outputs to specific audiences and use cases.

### Step 1: Raw Prompting

We start by sending a question to the language model without any additional context, using a basic `prompt_llm` function. This demonstrates the model's limitations when lacking the necessary information to provide relevant answers.

In [None]:
@weave.op()
def prompt_llm(question: str, **kwargs) -> str:
    """
    Sends a question to the language model and returns its response.

    This function prepares a message with the user's question, handles additional
    arguments for the language model, and invokes the get_completion function to
    obtain a response. The response's content is then returned.

    Parameters:
        question (str): The question intended for the language model.
        **kwargs: A dictionary of additional keyword arguments. Expected keys include 'system_message', 'model', 'max_tokens', and 'temperature'.

    Returns:
        str: The language model's response to the question.
    """
    # Prepare the user's question for the language model
    kwargs = kwargs.get("kwargs", {})
    messages = [{"role": "user", "content": question}]

    # Extract additional parameters, applying defaults if necessary
    system_message = kwargs.pop('system_message', "")
    model = kwargs.pop('model', MODEL_NAME)
    max_tokens = kwargs.pop('max_tokens', 4096)
    temperature = kwargs.pop('temperature', 0)

    # Compile arguments for the completion request
    completion_args = {
        "system_message": system_message,
        "messages": messages,
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature
    }
    completion_args.update(kwargs)  # Properly include any other additional arguments

    # Request a completion from the language model
    response = get_completion(**completion_args)

    # Extract and return the content of the model's response
    return response["choices"][0]["message"]["content"]


In [None]:
raw_prompt_response = prompt_llm(
    "Explain the latest prompting techniques and provide an example of each"
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/8e110eca-c54e-4a9c-8182-e5b02b2b597f


In [None]:
render(raw_prompt_response)

I do not actually have knowledge about the latest prompting techniques. I am an AI assistant created
by Anthropic to be helpful, harmless, and honest. I do not keep up with the latest developments in
AI prompting methods.

The model's response to the raw prompt is inadequate because the model lacks the context needed to give a meaningful answer. Without background information or specific details about prompting techniques, it can only generate a generic, high-level response that fails to address the question, and in many cases, as here, it declines to answer at all.

This poor performance highlights the importance of providing relevant context when prompting language models. By supplying the model with additional information related to the topic at hand, we can guide it towards generating more accurate, detailed, and useful responses.

### Step 2: Prompting with Context

We can provide the necessary context to the language model by including it directly alongside the question. In this example, we use a comprehensive guide on prompting techniques written by [Lilian Weng](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/). This guide is particularly useful as it condenses many great papers and articles into a single-page resource, covering various prompting techniques.

To incorporate the context, we:

1. Load the markdown file containing the prompting guide using the `load_markdown_file` function.
2. Concatenate the loaded context with the question.
3. Pass the combined context and question to the `prompt_llm` function and store the result in the `context_prompt_response` variable.

By providing the model with relevant context, we expect to receive more accurate and informative answers to our questions about prompting techniques.

Note: As an alternative, you can also use a more extensive guide by [Aman Chadha](https://aman.ai/primers/ai/prompt-engineering/) for additional context and information on prompt engineering.

In [None]:
def load_markdown_file(file_path: str) -> str:
    """
    Reads and returns the content of a markdown file specified by its path.

    Parameters:
        file_path (str): The path to the markdown file to be read.

    Returns:
        str: The content of the markdown file as a string.
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        markdown_content = file.read()
    return markdown_content
context = load_markdown_file(PROMPT_GUIDE)

Note: Anthropic's Claude models have a large context window, so in this situation we can place the whole document in the prompt. This can get quite expensive, however, so it typically makes more sense to split the document into smaller chunks or to use a RAG-based pipeline.
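
As a rough illustration of the chunking alternative, the sketch below splits a document into fixed-size, overlapping character windows. `chunk_text` and its default sizes are assumptions made for this example; a real pipeline would more likely split on tokens or on markdown headings.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 200) -> List[str]:
    """Split text into character windows of chunk_size that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


demo_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(demo_chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk could then be sent to the model separately, or only the chunks most relevant to the question could be included in the prompt.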


In [None]:
context_prompt_response = prompt_llm(
    context + "\n\nExplain the latest prompting techniques and provide an example of each"
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/1c6866c3-f466-4b7f-affc-c922d966771f


In [None]:
render(context_prompt_response)

Here are some of the latest prompting techniques along with examples of each:  1. Chain-of-Thought
(CoT) Prompting: Generates a sequence of short sentences describing step-by-step reasoning to arrive
at the final answer. Especially helpful for complex reasoning tasks.  Example: Question: Marty has
100 centimeters of ribbon that he must cut into 4 equal parts. Each of the cut parts must be divided
into 5 equal parts. How long will each final cut be?  Answer: Let's think step by step. - Marty has
100 cm of ribbon total - He needs to cut it into 4 equal parts. So 100 cm / 4 = 25 cm per part after
the first set of cuts.   - Then each 25 cm part needs to be cut into 5 equal pieces. - 25 cm / 5 = 5
cm Therefore, each final cut piece will be 5 cm long.  2. Zero-Shot CoT: Uses a natural language
prompt like "Let's think step by step" to encourage the model to generate reasoning chains before
providing the final answer, without including few-shot examples.  Example:  Question: Jack is a
soccer 

This response is a significant improvement! We can see that by providing the model with relevant context, it can generate an answer that includes details about various prompting techniques covered in the guide. The model effectively utilizes the information from the provided guide to address the question more comprehensively.

However, there is still room for improvement. The model tends to regurgitate the technical information from the guide without simplifying or explaining the concepts in an easily understandable manner. The response may be too complex or jargon-heavy for beginners or those new to the topic of prompt engineering.

To address this issue, we need to guide the model towards providing explanations that are more accessible and beginner-friendly. This is where the next step of conditioning the model's responses with a carefully crafted system prompt comes into play. By instructing the model to break down the technical details and present the information in a more digestible format, we can ensure that the responses are not only informative but also easy to understand for a broader audience.

### Step 3: Condition Responses with a System Prompt

To ensure that the bot explains the information in a way that is easy to understand, we can provide a system prompt that guides the model to present the content in a beginner-friendly manner. Here’s why this system prompt is effective:

1. **Objective Clarity**: It directly states the task: simplifying prompt engineering concepts with examples. This aligns with the principle of having a clear and specific objective, which helps the LLM focus on the exact task ([source](https://medium.com/the-modern-scientist/best-prompt-techniques-for-best-llm-responses-24d2ff4f6bca)).

2. **Tone Specification**: Setting a friendly and educational tone guides the LLM on the desired interaction style, making the information approachable and digestible.

3. **Context Awareness**: By acknowledging the user's basic AI knowledge, the prompt tailors the complexity of the content, ensuring it is suitable for beginners without being overly simplistic.

4. **Guidance on Style**: Instructing the use of analogies and simple examples helps in breaking down complex topics into understandable segments, which is crucial for teaching technical subjects effectively.

5. **Verification of Output**: Emphasizing clarity and relevance ensures that the responses are not only correct but also useful and directly applicable to the user’s needs.

6. **Highlighting Benefits**: Mentioning the benefits of simplifying technical concepts engages users by showing the value of what they are learning, enhancing their motivation and the educational impact.

### General Approach to Constructing Effective System Prompts

When constructing system prompts for LLMs, consider the following steps to ensure effectiveness and clarity:

- **Define the Objective**: Clearly state what you want the LLM to achieve. This should be specific and concise.
- **Set the Tone and Style**: Indicate how the response should feel or sound. This helps the LLM adjust its language and approach.
- **Provide Necessary Context**: Include any background information that will help the LLM understand the scope and depth of the response required.
- **Incorporate Guidance for Content**: Direct the LLM on how to structure its response or what elements to include, such as examples or analogies.
- **Specify Output Format**: If necessary, define how the response should be formatted. This is particularly important for tasks requiring a specific output structure.
- **Use Clear and Direct Language**: Avoid ambiguity by using straightforward and direct language. This reduces the chances of misinterpretation.
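
The checklist above can be turned into a small helper that assembles a system message from labeled sections. This is only a sketch: `build_system_message` is a hypothetical name, and the `Label: text` layout is simply the one used by the system message in this workshop.

```python
def build_system_message(**sections: str) -> str:
    """Join labeled sections into 'Label: text' lines, skipping empty sections."""
    lines = []
    for label, text in sections.items():
        if text and text.strip():
            # 'output_format' becomes 'Output Format'
            pretty_label = label.replace("_", " ").title()
            lines.append(f"{pretty_label}: {text.strip()}")
    return "\n".join(lines)


example_system_message = build_system_message(
    objective="Simplify prompt engineering concepts for easy understanding.",
    tone="Friendly and educational, suitable for beginners.",
    context="Assume basic AI knowledge; avoid deep technical jargon.",
    output_format="",  # left empty on purpose, so it is omitted
)
print(example_system_message)
```

Keeping each section as a separate argument makes it easy to vary one element (for example, the tone) while holding the others fixed.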

In [None]:
system_message = """
Objective: Simplify prompt engineering concepts for easy understanding. Provide clear examples for each technique.
Tone: Friendly and educational, suitable for beginners.
Context: Assume basic AI knowledge; avoid deep technical jargon.
Guidance: Use metaphors and simple examples to explain concepts. Keep explanations concise and applicable.
Verification: Ensure clarity and relevance in responses, with practical examples.
Benefits: Help users grasp prompt engineering basics, enhancing their AI interaction experience.
"""

In [None]:
system_and_context_prompt_response = prompt_llm(
    system_message=system_message,
    question=context + "\n\nExplain the latest prompting techniques and provide an example of each"
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/f491b5d0-a877-4f98-9a0a-6f14aa29fc82


In [None]:
render(system_and_context_prompt_response)

Here is a summary of the latest prompting techniques with examples for each:  1. Zero-Shot Prompting
Provide the task directly to the model without any examples.  Example: Text: i'll bet the video game
is a lot more fun than the film.   Sentiment:  2. Few-Shot Prompting  Provide a few examples
demonstrating the task before the actual input.  Example: Text: (lawrence bounces) all over the
stage, dancing, running, sweating, mopping his face and generally displaying the wacky talent that
brought him fame in the first place. Sentiment: positive  Text: despite all evidence to the
contrary, this clunker has somehow managed to pose as an actual feature movie, the kind that charges
full admission and gets hyped on tv and purports to amuse small children and ostensible adults.
Sentiment: negative  Text: i'll bet the video game is a lot more fun than the film. Sentiment:  3.
Instruction Prompting Provide detailed instructions describing the task requirements.  Example:
Please label the sentiment

Great! Now we're able to get a response that is easy to understand and provides a lot of context. The model has successfully broken down the technical concepts into beginner-friendly explanations, using simple language, analogies, and examples to convey the key points. This approach makes the information more accessible and engaging for those new to prompt engineering, fostering a deeper understanding of the subject matter.

The next step is to standardize the inputs and outputs in a way that allows us to ask different questions and pass different context in the future. By creating a consistent structure for our prompts and responses, we can streamline the development of LLM applications and make it easier to experiment with various configurations. This standardization will enable us to quickly iterate on our prompts, test different contexts, and fine-tune our models to achieve the best possible results.

### Step 4: System Prompts - Inputs

To make it easier to experiment with different parameters and retrieve our best models, we can wrap our prompting logic in a collection of modular functions decorated with `@weave.op()`. This allows us to track and version our operations, making it easier to reproduce and analyze our results. We can also define a standard input structure for our system prompts, ensuring consistency across different prompts and use cases.

In [None]:
prompt_template = "{context}\n{question}"

In [None]:
@weave.op()
def format_prompt(prompt_template: str, **kwargs):
    """
    Formats a prompt template with provided keyword arguments.

    This function takes a template string and a dictionary of keyword arguments,
    then formats the template string using these arguments.

    Parameters:
        prompt_template (str): The template string to be formatted.
        **kwargs (dict): Keyword arguments to format the template string with.

    Returns:
        str: The formatted prompt template.
    """
    kwargs = kwargs.get("kwargs", {})
    return prompt_template.format(**kwargs)

In [None]:
# In this app we assume only a context and question are passed to the prompt template but that need not be true
@weave.op()
def llm_app(prompt_template: str, context: str, question: str, **kwargs):
    """
    Generates a response using a formatted prompt based on a template, context, and question.

    This function formats a given prompt template with the specified context and question, then
    generates a response using the prompt_llm function with additional keyword arguments.

    Parameters:
        prompt_template (str): The template string used to format the prompt.
        context (str): The context information to be included in the prompt.
        question (str): The specific question to be asked in the prompt.
        **kwargs (dict): Additional keyword arguments to be passed to the prompt_llm function.

    Returns:
        str: A string representing the generated response.
    """
    kwargs = kwargs.get("kwargs", {})
    formatted_prompt = format_prompt(prompt_template=prompt_template, context=context, question=question)
    response = prompt_llm(
        question=formatted_prompt,
        **kwargs
    )
    return response


We have now defined `llm_app`, which acts as the core of our prompting assistant.

In [None]:
question = """
Explain the differences between zero-shot, few-shot, and chain of thought
prompting techniques? Please provide a clear explanation and a practical example
for each technique within a structured format.
"""

In [None]:
input_template_response = llm_app(
    system_message=system_message,
    prompt_template=prompt_template,
    context=context,
    question=question,
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/04e114a1-5487-4766-b4dd-34d45d52d116


In [None]:
render(input_template_response)

Here is a clear explanation of the differences between zero-shot, few-shot, and chain of thought
prompting techniques, with a practical example for each:  Zero-Shot Prompting - Explanation: The
task text is directly fed to the language model without any examples. The model tries to complete
the task based solely on its pre-existing knowledge and capabilities. - Example:  Text: The movie
was terrible. I hated every minute of it. Sentiment: <model_output>  Expected Output:  Sentiment:
negative  Few-Shot Prompting   - Explanation: A small set of examples (usually 1-5) demonstrating
the desired input/output format are provided to the model before the actual task. This helps prime
the model on what kind of output is expected. - Example: Text: Best movie ever! Loved the acting and
cinematography. Sentiment: positive  Text: The plot made no sense and the acting was atrocious.
Sentiment: negative  Text: While it had some funny moments, overall the movie was just average.
Sentiment: neutral  Te

Now we can easily and consistently swap system messages, context, and questions to get the best results. This allows us to test various combinations of system messages, context, and questions to find the most effective prompts for our specific use case.
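
One way to organize such comparisons is to loop over every combination of system message and question. The sketch below is illustrative only: `run_grid` and `fake_app` are hypothetical names, and `fake_app` is a stand-in that echoes its inputs so the example runs without an API key. Any callable that accepts `system_message` and `question` keyword arguments could be passed instead.

```python
from itertools import product
from typing import Callable, Dict, List


def run_grid(app: Callable[..., str], system_messages: Dict[str, str],
             questions: Dict[str, str], **fixed) -> List[dict]:
    """Call `app` once per (system message, question) pair and collect the outputs."""
    results = []
    for (sys_name, sys_msg), (q_name, q_text) in product(system_messages.items(), questions.items()):
        output = app(system_message=sys_msg, question=q_text, **fixed)
        results.append({"system": sys_name, "question": q_name, "output": output})
    return results


def fake_app(system_message: str, question: str, **_unused) -> str:
    """Stand-in for a real LLM call; returns a string built from its inputs."""
    return f"[{system_message}] {question}"


grid_results = run_grid(
    fake_app,
    system_messages={"beginner": "Explain simply.", "expert": "Be precise and technical."},
    questions={"zero_shot": "What is zero-shot prompting?", "cot": "What is chain-of-thought prompting?"},
    context="(guide text)",
)
for row in grid_results:
    print(row["system"], row["question"], "->", row["output"])
```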

However, as we can see from the example, the current system prompt doesn't work as well with the newly asked question. This highlights the importance of tailoring the system prompt to the specific task at hand. Additionally, we may want to enforce more consistency in the format of our model's outputs. While using third-party packages like Instructor is beyond the scope of this workshop, we can achieve similar results by using proper tags in our prompt. By including specific tags or formatting instructions in the prompt, we can guide the model to respond in a way that is more consistent and easier to parse on our end.



### Step 5: System Prompts - Outputs

In this step, we focus on improving the consistency and structure of our model's outputs by modifying the prompt template. By including specific tags and formatting instructions in the prompt, we can guide the model to respond in a way that is easier to parse and process.

In the system message, we've added a new `Format` field that instructs the model to respond within an `<answer></answer>` tag, with a `<condensed_answer>` tag and one `<explanation>` tag per concept, each containing separate `<detail>` and `<example>` tags. For GPT models, the same structure is requested as JSON keys instead. This structured format helps organize the information and makes it easier to extract and analyze the responses programmatically.
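
To illustrate what programmatic extraction could look like for the tag-based format, here is a minimal sketch built on regular expressions. `extract_tag` and `parse_answer` are hypothetical helpers, and the `sample` string is hand-written for this example rather than real model output. Regular expressions are used instead of an XML parser because the model may add prose before the `<answer>` tag.

```python
import re
from typing import Dict, List


def extract_tag(text: str, tag: str) -> List[str]:
    """Return the stripped contents of every <tag>...</tag> pair found in text."""
    pattern = re.compile(rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>", re.DOTALL)
    return [match.strip() for match in pattern.findall(text)]


def parse_answer(response: str) -> Dict[str, object]:
    """Pull the condensed answer and each explanation out of a tagged response."""
    answers = extract_tag(response, "answer")
    if not answers:
        raise ValueError("No complete <answer>...</answer> block found")
    body = answers[0]
    condensed = extract_tag(body, "condensed_answer")
    explanations = []
    for block in extract_tag(body, "explanation"):
        detail = extract_tag(block, "detail")
        example = extract_tag(block, "example")
        explanations.append({
            "detail": detail[0] if detail else None,
            "example": example[0] if example else None,
        })
    return {
        "condensed_answer": condensed[0] if condensed else None,
        "explanations": explanations,
    }


# Hand-written sample in the requested format (not real model output).
sample = """Here is a structured explanation:
<answer>
<condensed_answer> Zero-shot gives no examples; few-shot gives a few. </condensed_answer>
<explanation>
<detail> Zero-shot prompting states the task directly. </detail>
<example> Text: The food was terrible. Sentiment: </example>
</explanation>
<explanation>
<detail> Few-shot prompting shows input-output pairs first. </detail>
<example> Text: Great film! Sentiment: positive </example>
</explanation>
</answer>"""

parsed = parse_answer(sample)
print(parsed["condensed_answer"])
print(len(parsed["explanations"]))
```

A response that is cut off before the closing `</answer>` tag raises `ValueError` here, which is one way to detect truncated generations.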


In [None]:
def update_prompt_with_output_indicator(system_message: str, prompt_template: str):
    if "gpt" in MODEL_NAME:
        system_format_msg = """
    Format: Respond within a structured JSON object, using the keys provided in the prompt to organize your response.
    Provide a condensed answer under the 'condensed_answer' key, detailed explanations under 'explanation' keys,
    and examples under 'example' keys within each explanation.
    """
        prompt_format_msg = """
    You must respond in JSON format.
    Your response should follow this structure:
    {{
      "answer": {{
        "condensed_answer": "CONDENSED_ANSWER",
        "explanation_1": {{
          "detail": "EXPLANATION_1",
          "example": "EXAMPLE_1"
        }},
        "explanation_2": {{
          "detail": "EXPLANATION_2",
          "example": "EXAMPLE_2"
        }},
        ...
      }}
    }}
    """
    else:
        system_format_msg = """
    Format: Respond within an <answer></answer> tag, with as many <explanation></explanation> tags as needed,
    ensuring that the <detail></detail> and <example></example> tags are used within each <explanation></explanation> tag.
    Provide a condensed answer for the question in the <condensed_answer></condensed_answer> tag.
    """
        prompt_format_msg = """
    You must respond within <answer></answer> XML tags.
    Inside the <answer> tag, you must use the following format:
    <answer>
        <condensed_answer> CONDENSED_ANSWER </condensed_answer>
        <explanation>
          <detail> EXPLANATION </detail>
          <example> EXAMPLE </example>
        </explanation>
        <explanation>
          <detail> EXPLANATION </detail>
          <example> EXAMPLE </example>
        </explanation>
        ...
    </answer>
    """
    formatted_system_message = system_message + "\n" + system_format_msg
    formatted_prompt_template = prompt_template + "\n" + prompt_format_msg

    return formatted_system_message, formatted_prompt_template

In [None]:
formatted_system_message, formatted_prompt_template = update_prompt_with_output_indicator(system_message, prompt_template)

Similarly, we've updated the `prompt_template` to include the new formatting instructions, ensuring that the model generates responses that adhere to the specified structure. By enforcing a consistent output format, we can streamline the processing of the model's responses and facilitate further analysis and evaluation of the results.
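
One detail in `update_prompt_with_output_indicator` is worth spelling out: the JSON skeleton uses doubled braces (`{{` and `}}`) because the template is later filled in with `str.format`, which treats single braces as placeholders. The short sketch below shows the difference; the template strings are made up for the example.

```python
# Doubled braces survive formatting as literal braces.
escaped_template = 'Context: {context}\nQuestion: {question}\nRespond as JSON: {{"answer": "..."}}'
filled = escaped_template.format(context="C", question="Q")
print(filled)

# Single braces around JSON are parsed as a placeholder and fail.
unescaped_template = 'Question: {question}\nRespond as JSON: {"answer": "..."}'
try:
    unescaped_template.format(question="Q")
    format_failed = False
except KeyError:
    # str.format looked for a field named '"answer"' and did not find it.
    format_failed = True
print(format_failed)
```

The XML-style template needs no escaping, since angle brackets have no special meaning to `str.format`.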

In [None]:
question = """
Explain the differences between zero-shot, few-shot, and chain of thought
prompting techniques? Please provide a clear explanation and a practical example
for each technique within a structured format.
"""

In [None]:
output_indicator_response = llm_app(
    system_message=formatted_system_message,
    prompt_template=formatted_prompt_template,
    context=context,
    question=question,
    # response_format={"type": "json_object"}  # Uncomment for GPT models; keep commented out for Claude models unless `litellm.drop_params = True`
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/a7ad898a-1962-4a8f-a6dd-a7a008485406


In [None]:
render(output_indicator_response)

Here is a structured explanation of the differences between zero-shot, few-shot, and chain of
thought prompting techniques:  <answer> <condensed_answer> Zero-shot prompting provides the task to
the model without any examples. Few-shot prompting includes a small number of input-output examples
to demonstrate the task. Chain of thought prompting encourages the model to break down its reasoning
into a series of steps. </condensed_answer>  <explanation> <detail> Zero-shot prompting simply
provides the task or question to the language model without any examples of the desired output. The
model must infer what is being asked and how to respond appropriately based solely on the prompt.
</detail> <example> Zero-shot sentiment classification prompt: Text: The food was terrible and the
service was even worse.  Sentiment: </example> </explanation>  <explanation> <detail> Few-shot
prompting includes a small number (usually 1-5) examples of input-output pairs that demonstrate the
desired task. By s

## Advanced Prompting Techniques

### Zero-shot Prompting

In [None]:
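def update_with_zero_shot_prompt(question: str):
    """Prepend a zero-shot instruction to the question."""
    zero_shot_instruction = "Without using any specific examples, "
    # Strip the question's leading newline so the instruction and the question sit on one line.
    return zero_shot_instruction + question.lstrip()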

In [None]:
zero_shot_question = update_with_zero_shot_prompt(question)

In [None]:
zero_shot_response = llm_app(
    system_message=formatted_system_message,
    prompt_template=formatted_prompt_template,
    context=context,
    question=zero_shot_question,
    # response_format={"type": "json_object"}  # Keep this commented out for `Claude` models, or set `litellm.drop_params=True`
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/b1e3b221-860d-402f-870f-855593a30e1d


In [None]:
render(zero_shot_response)

Here is a comparison of zero-shot, few-shot, and chain-of-thought prompting techniques: <answer>
<condensed_answer> Zero-shot prompting provides the task to the model without any examples. Few-shot
prompting includes a small number of input-output examples to demonstrate the task. Chain-of-thought
prompting encourages the model to break down its reasoning into a series of steps.
</condensed_answer>  <explanation> <detail> Zero-shot prompting simply provides the task or question
to the language model without any examples of the desired output. The model must infer what is being
asked and how to respond appropriately based solely on the prompt. </detail> <example> Prompt:
Classify the sentiment of the following movie review as either positive or negative: "This was the
most boring, unoriginal film I've seen in years. I walked out of the theater feeling like I
completely wasted my time and money." </example> </explanation>  <explanation> <detail> Few-shot
prompting includes a small number

### Few-shot Prompting

In [None]:
def update_with_few_shot_prompt(question: str):
    """Prepend few-shot examples to the question: JSON for GPT models, XML otherwise."""
    if "gpt" in MODEL_NAME:
        few_shot_examples = """
        Here are a few examples of prompting techniques in JSON format:
        {{
            "answer": {{
                "condensed_answer": "Different prompting techniques are used to guide language models in generating desired outputs.",
                "explanation_1": {{
                    "detail": "Translation prompts provide the model with a source language text and request the translation in a target language.",
                    "example": "Translate the following English text to French: 'Hello, how are you?'"
                }},
                "explanation_2": {{
                    "detail": "Sentiment classification prompts ask the model to determine the sentiment expressed in a given text.",
                    "example": "Classify the sentiment of the following text: 'The movie was terrible.'"
                }},
                "explanation_3": {{
                    "detail": "Factual question prompts require the model to provide an answer along with an explanation or reasoning.",
                    "example": "What is the capital of Germany? Explain your reasoning."
                }}
            }}
        }}
        """
    else:
        few_shot_examples = """
        Here are a few examples of prompting techniques in XML format:
        <answer>
            <condensed_answer>Different prompting techniques are used to guide language models in generating desired outputs.</condensed_answer>
            <explanation>
                <detail>Translation prompts provide the model with a source language text and request the translation in a target language.</detail>
                <example>Translate the following English text to Spanish: 'Good morning, how can I help you?'</example>
            </explanation>
            <explanation>
                <detail>Sentiment classification prompts ask the model to determine the sentiment expressed in a given text.</detail>
                <example>Classify the sentiment of the following text: 'I love this product!'</example>
            </explanation>
            <explanation>
                <detail>Factual question prompts require the model to provide an answer along with an explanation or reasoning.</detail>
                <example>What is the capital of Canada? Provide your thought process.</example>
            </explanation>
        </answer>
        """

    return few_shot_examples + "\n" + question
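
One detail worth checking in the JSON branch is the doubled braces. Assuming the prompt is filled in with Python's `str.format` (an assumption here, since `llm_app` is defined earlier in the notebook), `{{` collapses to `{` only when it appears in the template string itself. Text passed in as a value, such as `question`, is inserted unchanged. The sketch uses a made-up `template` and `few_shot_value` to show the difference.

```python
# Hypothetical template; the workshop's real prompt_template is defined earlier.
template = 'Reply as JSON like {{"answer": "..."}}.\nQuestion: {question}'

# A few-shot example written with doubled braces, as in the JSON branch.
few_shot_value = '{{"answer": "Paris"}}\nWhat is the capital of France?'

# str.format unescapes braces in the template, but not in substituted values.
filled = template.format(question=few_shot_value)
print(filled)
```

If the few-shot text reaches the model through a substituted value, single braces would be enough. If it is concatenated into the template before formatting, the doubled braces are needed.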

In [None]:
few_shot_question = update_with_few_shot_prompt(question)

In [None]:
few_shot_response = llm_app(
    system_message=formatted_system_message,
    prompt_template=formatted_prompt_template,
    context=context,
    question=few_shot_question,
    # response_format={"type": "json_object"}  # Keep this commented out for `Claude` models, or set `litellm.drop_params=True`
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/156143df-6b31-4e83-a4ca-3a4e16c64e62


In [None]:
render(few_shot_response)

Here is a comparison of zero-shot, few-shot, and chain of thought prompting techniques in the
requested format:  <answer> <condensed_answer> Zero-shot uses no examples, few-shot uses a small
number of examples, and chain of thought breaks down the reasoning process into steps, to guide
language models in generating desired outputs. </condensed_answer>  <explanation> <detail> Zero-shot
prompting provides the language model with only a task description or instruction, without any
examples. The model must rely on its existing knowledge to generate a response. </detail> <example>
Classify the sentiment of the following movie review text: "The plot was full of holes and the
characters were unbelievable. I wanted to walk out of the theater." </example> </explanation>
<explanation> <detail> Few-shot prompting includes a small number of examples (typically 1-5) that
demonstrate the desired input/output format or task. This helps prime the model to generate better
responses. </detail> <example>

### Chain of Thought

Note: We do not use the output indicators here, because enforcing the output format would override the chain of thought. Instead, the prompt itself must explicitly ask for the desired thought process.
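
One way to make the desired thought process explicit is to spell out the reasoning steps in the prompt. This is a minimal sketch under that assumption: `build_reasoning_instruction`, `example_template`, and the listed steps are hypothetical and not part of the workshop code.

```python
def build_reasoning_instruction(steps: list[str]) -> str:
    """Turn a list of desired reasoning steps into a numbered chain-of-thought instruction.

    Hypothetical helper for illustration; it is not part of the workshop code.
    """
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        "Let's explicitly think step by step, following this order:\n"
        f"{numbered}\n"
        "My thought process is:\n"
    )


# Placeholder template; the workshop's real prompt_template is defined earlier.
example_template = "Context: {context}\nQuestion: {question}"

instruction = build_reasoning_instruction([
    "List the facts from the context that are relevant to the question.",
    "Explain how those facts relate to each other.",
    "State the conclusion that follows.",
])

# The instruction goes last so the model continues from "My thought process is:".
example_cot_template = example_template + "\n" + instruction
example_prompt = example_cot_template.format(
    context="Few-shot prompts include worked examples.",
    question="What does a few-shot prompt contain?",
)
print(example_prompt)
```

Placing the instruction at the end of the user prompt means the model's reply starts with its reasoning rather than with a formatted answer.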

In [None]:
def update_with_chain_of_thought_prompt(system_message: str, prompt_template: str):
    """Append chain-of-thought instructions to the system message and the prompt template."""
    chain_of_thought_instruction = "Let's explicitly think step by step. My thought process is:\n"
    chain_of_thought_system_format = "Format: You must explicitly define the thought process and knowledge from the context to come to your conclusion for the question."
    return (
        system_message + "\n" + chain_of_thought_system_format,
        prompt_template + "\n" + chain_of_thought_instruction,
    )

In [None]:
chain_of_thought_system_message, chain_of_thought_prompt = update_with_chain_of_thought_prompt(system_message, prompt_template)


In [None]:
chain_of_thought_response = llm_app(
    system_message=chain_of_thought_system_message,
    prompt_template=chain_of_thought_prompt,
    context=context,
    question=question,
    # response_format={"type": "json_object"}  # Keep this commented out for `Claude` models, or set `litellm.drop_params=True`
)

🍩 https://wandb.ai/alsmail10/prompting-workshop-test/r/call/c373fbe1-df08-4f01-954b-42d2718c10ec


In [None]:
render(chain_of_thought_response)

To explain the differences between zero-shot, few-shot, and chain of thought prompting techniques, I
will:  1. Define each technique clearly 2. Provide a practical example for each  3. Summarize the
key differences in a structured format  Zero-Shot Prompting: Definition: Providing only the task
input to the language model, with no examples of the desired output format. The model generates the
output based solely on the input prompt.  Example:  Input: "Classify the sentiment of the following
movie review: I thought the movie was entertaining and fun to watch." Output: "Positive"  Few-Shot
Prompting:  Definition: Providing a small number of input-output examples (usually 1-5)
demonstrating the desired task, before giving the actual input prompt. The model learns from the
examples to generate the output in the same format.  Example: Input:  "Classify the sentiment of the
following movie reviews:  Review: The acting was terrible and the plot made no sense.  Sentiment:
Negative  Review: The