# Prompting Workshop with Weights and Biases - [Anish Shah](https://www.linkedin.com/in/anish-shah/)

In [None]:
%%capture
!pip install set-env-colab-kaggle-dotenv -q
!pip install weave -U -q
!pip install litellm -q

In [None]:
try:
    import google.colab
    !git clone https://github.com/wandb/llm-workshop-fc2024.git
    %cd llm-workshop-fc2024/prompting
except ImportError:
    # Not running in Colab; assume the workshop files are already available locally.
    pass

In [None]:
%%capture
from text_formatting import render
from set_env import set_env
set_env("ANTHROPIC_API_KEY")
# set_env("WANDB_API_KEY")
set_env("OPENAI_API_KEY")

Welcome to the Prompting Workshop with Weights and Biases, led by Anish Shah. This workshop is designed to explore the fascinating world of prompt engineering, a crucial aspect of interacting with and leveraging the capabilities of large language models (LLMs). Throughout this session, we'll dive into various techniques for crafting effective prompts that can significantly enhance the performance of LLMs across a wide range of tasks.

Whether you're new to AI and machine learning or looking to deepen your understanding of prompt engineering, this workshop will provide you with valuable insights and practical skills. By the end of this session, you'll be equipped to design and implement prompts that effectively communicate your intentions to LLMs, enabling more accurate and relevant responses.

This section of the notebook focuses on setting up the environment and installing the required libraries:

- It installs [Weave](https://wandb.github.io/weave/), which is used for tracking LLM operations.
- It installs `litellm`, which standardizes model interaction and makes it easy to swap model providers.
- If running in Google Colab, it retrieves the API keys for Weights & Biases, OpenAI, and Anthropic from the Colab user data and sets them as environment variables.
- If not running in Colab, it loads the `.env` file using `load_dotenv()`, making the API keys available as environment variables. This is useful for managing secrets in a local development environment.
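
Before moving on, it can help to confirm that the keys actually reached the environment. The snippet below is a minimal sketch of such a check; `missing_keys` and `REQUIRED_KEYS` are hypothetical names introduced here and are not part of `set_env` or any workshop module.

```python
import os

# Key names used in this notebook. WANDB_API_KEY is left out because the
# corresponding set_env call above is commented out.
REQUIRED_KEYS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"]


def missing_keys(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_keys(REQUIRED_KEYS)
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All required API keys are set.")
```

Only the key names are printed, never their values, so the output is safe to leave in a shared notebook.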

In [None]:
%%capture
import weave
from litellm import completion

## Model and Prompt Configuration
The code snippets define important configuration variables for the prompting workshop:

In [None]:
# These variables store the names of different language models from Anthropic and OpenAI.
# The "SMART" models (`claude-3-opus` and `gpt-4-turbo`) are more capable but slower,
# while the "FAST" models (`claude-3-haiku` and `gpt-3.5-turbo`) are faster but less powerful.
ANTHROPIC_SMART_MODEL_NAME = "claude-3-opus-20240229"
ANTHROPIC_FAST_MODEL_NAME = "claude-3-haiku-20240307"
OPENAI_SMART_MODEL_NAME = "gpt-4-turbo-2024-04-09"
OPENAI_FAST_MODEL_NAME = "gpt-3.5-turbo"

# These variables point to two different markdown files containing prompt engineering guides. 
# `AMAN_PROMPT_GUIDE` refers to Aman Chadha's guide, while `LILIAN_PROMPT_GUIDE` refers to Lilian Weng's guide.
AMAN_PROMPT_GUIDE = "aman_prompt_engineering.md"
LILIAN_PROMPT_GUIDE = "lilianweng_prompt_engineering.md"

# Here, the `MODEL_NAME` variable is set to use Anthropic's fast model (`claude-3-haiku`), 
# and the `PROMPT_GUIDE` variable selects Lilian Weng's prompt engineering guide.
MODEL_NAME = ANTHROPIC_FAST_MODEL_NAME
PROMPT_GUIDE = LILIAN_PROMPT_GUIDE

These configuration variables allow workshop participants to easily switch between different models and prompt guides throughout the workshop by modifying the assigned values.
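
If you expect to switch models often, one option is to look the name up by provider and tier instead of editing the assignment by hand. This is a minimal sketch: the `MODELS` table and the `pick_model` helper are hypothetical, and the model names are copied from the configuration cell above.

```python
# Model names copied from the configuration cell; the lookup table itself
# is an illustration and not part of the workshop code.
MODELS = {
    ("anthropic", "smart"): "claude-3-opus-20240229",
    ("anthropic", "fast"): "claude-3-haiku-20240307",
    ("openai", "smart"): "gpt-4-turbo-2024-04-09",
    ("openai", "fast"): "gpt-3.5-turbo",
}


def pick_model(provider, tier):
    """Return the model name for a provider ('anthropic' or 'openai') and tier ('smart' or 'fast')."""
    key = (provider.lower(), tier.lower())
    if key not in MODELS:
        raise ValueError(f"Unknown provider/tier combination: {provider}/{tier}")
    return MODELS[key]


print(pick_model("anthropic", "fast"))
```

A lookup like this fails loudly on a typo, whereas a mistyped variable name might only surface later when the API call is made.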

## Initializing Weave

This line initializes the Weave library for the "prompting-workshop" project. Weave is a toolkit for developing Generative AI applications, providing features like logging, debugging, evaluations, and organization of LLM workflows. 

Initializing Weave at the start allows you to leverage its capabilities throughout your project, such as decorating Python functions with `@weave.op()` to enable automatic tracing and versioning.

By specifying the project name "prompting-workshop", you are setting up a dedicated workspace for this workshop within Weave. This helps keep the workshop-related experiments, models, and data organized and separate from other projects.

Weave brings structure and best practices to the experimental nature of Generative AI development, making it easier to track, reproduce, and share your work. Initializing it early in the notebook ensures you can take full advantage of its features as you progress through the workshop.

In [None]:
weave.init("prompting-workshop")

## Defining the get_completion function

This code defines a function called `get_completion` that is decorated with `@weave.op()`. The `@weave.op()` decorator is provided by Weave and enables automatic tracing and versioning of the function.

The `get_completion` function takes several parameters, all of them required:
- `system_message`: The system message to provide context or instructions to the language model.
- `messages`: A list of messages representing the conversation history.
- `model`: The name of the language model to use.
- `max_tokens`: The maximum number of tokens to generate in the response.
- `temperature`: The sampling temperature for controlling the randomness of the generated text.

Inside the function, it calls the `completion` function (from the `litellm` library) with the provided parameters to generate a completion from the language model. The defaults (`MODEL_NAME`, 4096 tokens, and a temperature of 0) are supplied by the `prompt_llm` wrapper defined in Step 1. A temperature of 0 is recommended for evaluations and RAG (Retrieval-Augmented Generation) systems because it makes results as repeatable as possible, although it does not strictly guarantee identical outputs.

The response is then converted with `response.json()` and returned by the function.

By using the `@weave.op()` decorator, Weave automatically tracks and versions the inputs and outputs of the `get_completion` function, making it easier to reproduce and analyze the results later in the workshop.
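
To make the idea of tracing concrete, here is a toy decorator that records each call's inputs and output in a list. It is only an illustration of the concept under simplifying assumptions: `traced` and `CALL_LOG` are hypothetical names, and this is not how Weave is implemented (Weave also versions code and stores traces in your project).

```python
import functools

# Hypothetical in-memory log, used here only to show what "tracking inputs
# and outputs" means.
CALL_LOG = []


def traced(func):
    """Record the arguments and return value of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        CALL_LOG.append(
            {"op": func.__name__, "args": args, "kwargs": kwargs, "output": result}
        )
        return result
    return wrapper


@traced
def shout(text, suffix="!"):
    return text.upper() + suffix


shout("hello", suffix="?")
print(CALL_LOG[-1])
```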

In [None]:
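@weave.op()
def get_completion(system_message, messages, model, max_tokens, temperature):
    # litellm follows the OpenAI chat format, so the system prompt is sent as a
    # leading "system" message rather than a provider-specific `system` argument.
    if system_message:
        messages = [{"role": "system", "content": system_message}] + list(messages)
    response = completion(
        model=model,
        max_tokens=max_tokens,
        temperature=temperature,
        messages=messages
    )
    # prompt_llm indexes into this result like a dictionary.
    return response.json()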

## Use Case: Building a Prompting Assistant

In this section, we explore the practical application of prompt engineering by building a bot that helps users understand prompting techniques and answers questions based on the provided information. This use case demonstrates the power of prompt engineering in creating helpful AI assistants that can make complex topics more accessible and engaging.

By leveraging the knowledge contained in a comprehensive guide on prompting techniques, we can develop a bot that provides accurate and relevant answers to user queries. Through careful crafting of system messages and prompt templates, we ensure that the bot's responses are not only informative but also easy to understand, even for beginners.

Throughout this use case, workshop participants will learn how to:

- Incorporate context to improve the relevance and accuracy of the model's responses
- Use system messages to guide the model's behavior and output style
- Standardize inputs and outputs for consistent and reusable prompting assistants
- Experiment with different configurations to optimize the bot's performance

By engaging with this use case, participants will gain hands-on experience in applying prompt engineering techniques to build a practical and helpful AI assistant. They will develop a deeper understanding of how to effectively communicate with language models and tailor their outputs to specific audiences and use cases.

### Step 1: Raw Prompting

We start by sending a question to the language model without any additional context, using a basic `prompt_llm` function. This demonstrates the model's limitations when lacking the necessary information to provide relevant answers.

In [None]:
@weave.op()
def prompt_llm(question, **kwargs):
    messages = [{"role": "user", "content": question}]
    # Extract expected arguments from kwargs with defaults if not provided
    system_message = kwargs.get('system_message', "")
    model = kwargs.get('model', MODEL_NAME)
    max_tokens = kwargs.get('max_tokens', 4096)
    temperature = kwargs.get('temperature', 0)

    # Call get_completion with explicit arguments
    response = get_completion(
        system_message=system_message,
        messages=messages,
        model=model,
        max_tokens=max_tokens,
        temperature=temperature
    )
    return response["choices"][0]["message"]["content"]
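
The last line of `prompt_llm` relies on the OpenAI-style response layout, in which the generated text sits under `choices[0].message.content`. The sketch below shows that lookup on a hand-written dictionary; `extract_content` is a hypothetical helper, and the example response contains only the fields used here.

```python
def extract_content(response):
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written example; real responses carry more fields, such as ids and
# token usage counts.
example_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Few-shot prompting shows the model a few examples.",
            }
        }
    ]
}

print(extract_content(example_response))
```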

In [None]:
raw_prompt_response = prompt_llm(
    "What are the latest prompting techniques?"
)

In [None]:
render(raw_prompt_response)

The model's response to the raw prompt is inadequate because it lacks the necessary context to provide a meaningful answer. Without any background information or specific details about prompting techniques, the model can only generate a generic, high-level response that fails to address the question effectively, and in many cases it provides no useful answer at all.

This poor performance highlights the importance of providing relevant context when prompting language models. By supplying the model with additional information related to the topic at hand, we can guide it towards generating more accurate, detailed, and useful responses.

### Step 2: Prompting with Context

We can provide the necessary context to the language model by including it directly alongside the question. In this example, we use a comprehensive guide on prompting techniques written by [Lilian Weng](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/). This guide is particularly useful as it condenses many great papers and articles into a single-page resource, covering various prompting techniques.

To incorporate the context, we:

1. Load the markdown file containing the prompting guide using the `load_markdown_file` function.
2. Concatenate the loaded context with the question to form a single prompt.
3. Pass the combined prompt to the `prompt_llm` function and store the result in the `context_prompt_response` variable.

By providing the model with relevant context, we expect to receive more accurate and informative answers to our questions about prompting techniques.

Note: As an alternative, you can also use a more extensive guide by [Aman Chadha](https://aman.ai/primers/ai/prompt-engineering/) for additional context and information on prompt engineering.

In [None]:
def load_markdown_file(file_path):
    # Read the whole guide into memory as a single string.
    with open(file_path, 'r', encoding='utf-8') as file:
        markdown_content = file.read()
    return markdown_content


context = load_markdown_file(PROMPT_GUIDE)

Note: Anthropic's models have a very large context window, so in this situation we can simply place the whole document in the prompt. This can get quite expensive, however, so it typically makes more sense to chunk the document into smaller pieces or to use a RAG-based pipeline.
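
As a rough illustration of the chunking alternative, the sketch below packs paragraphs greedily into chunks under a character budget. `chunk_text` and its `max_chars` parameter are hypothetical; a real pipeline would more likely budget by tokens and select chunks by relevance.

```python
def chunk_text(text, max_chars=2000):
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The next paragraph does not fit, so close the current chunk.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
for i, chunk in enumerate(chunk_text(sample, max_chars=40)):
    print(i, repr(chunk))
```

Splitting on blank lines keeps paragraphs intact, which tends to preserve more meaning per chunk than cutting at a fixed character offset.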


In [None]:
context_prompt_response = prompt_llm(
    context + "\n\nExplain the latest prompting techniques and provide an example of each"
)

In [None]:
render(context_prompt_response)

This response is a significant improvement! We can see that by providing the model with relevant context, it can generate an answer that includes details about the various prompting techniques covered in the guide. The model effectively utilizes the information from the provided guide to address the question more comprehensively.

However, there is still room for improvement. The model tends to regurgitate the technical information from the guide without simplifying or explaining the concepts in an easily understandable manner. The response may be too complex or jargon-heavy for beginners or those new to the topic of prompt engineering.

To address this issue, we need to guide the model towards providing explanations that are more accessible and beginner-friendly. This is where the next step of conditioning the model's responses with a carefully crafted system prompt comes into play. By instructing the model to break down the technical details and present the information in a more digestible format, we can ensure that the responses are not only informative but also easy to understand for a broader audience.

### Step 3: Condition Responses with a System Prompt

To ensure that the bot explains the information in a way that is easy to understand, we can provide a system prompt that guides the model to present the content in a beginner-friendly manner. The system prompt acts as a set of instructions for the model, outlining the desired tone, style, and level of complexity for the generated responses.

In this step, we create a detailed system message that includes the following guidelines:

- **Objective**: Analyze the technical markdown page on prompt engineering and simplify the concepts for easy understanding.
- **Personality and Tone**: Adopt a friendly and knowledgeable teacher's role, breaking down complex subjects into digestible pieces.
- **Contextual Information**: Assume the user has a basic understanding of AI but may not be familiar with advanced prompting concepts.
- **Creativity Constraints and Style Guidance**: Avoid jargon and technical terminology, using metaphors, analogies, and simple examples to convey points.
- **External Knowledge**: Draw upon general knowledge of AI, machine learning, and prompt engineering practices, but avoid diving into highly specialized research.
- **Rules and Guidelines**: Steer clear of overly complex explanations or examples that might confuse beginners, ensuring examples are realistic and directly applicable.
- **Output Verification Standards**: Ensure responses are clear, accurate, and directly responsive to the task, with examples that effectively demonstrate the discussed concepts.
- **Benefits of Task**: Highlight how simplifying these technical concepts will help demystify prompt engineering for those new to the subject, fostering a deeper understanding and appreciation of its importance in AI interactions.

By incorporating this system prompt, we condition the model to generate responses that are more accessible, engaging, and tailored to the needs of beginners in the field of prompt engineering.
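
One way to keep a sectioned system prompt like this maintainable is to assemble it from labeled parts. The following is a minimal sketch, assuming a hypothetical `build_system_message` helper and shortened section texts; the workshop itself writes the message as a single string.

```python
def build_system_message(sections):
    """Join labeled sections into one message, one 'Label: text' line each.

    Sections with empty text are skipped.
    """
    lines = [
        f"{label}: {text.strip()}"
        for label, text in sections.items()
        if text.strip()
    ]
    return "\n".join(lines)


# Shortened placeholder texts for illustration only.
sections = {
    "Objective": "Simplify the concepts on a technical page about prompt engineering.",
    "Personality and Tone": "Act as a friendly, knowledgeable teacher.",
    "Contextual Information": "Assume the user has a basic understanding of AI.",
}

print(build_system_message(sections))
```

Keeping the sections in a dictionary makes it straightforward to swap a single guideline, such as the tone, while leaving the rest unchanged.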

In [None]:
system_message = """
Objective: You will analyze a highly technical markdown page about prompt engineering. Your task is to simplify the concepts discussed on this page and explain them in an easy-to-understand manner. For each prompting technique mentioned, provide clear, concise examples that illuminate how the same task would be approached differently.
Personality and Tone: Adopt the role of a friendly and knowledgeable teacher who excels at breaking down complex subjects into easily digestible pieces. Your explanations should be patient and encouraging, aiming to enlighten without overwhelming. The tone should be casual yet informative, making technical content accessible to a broad audience, including beginners.
Contextual Information: Assume the user has a basic understanding of AI but may not be familiar with advanced concepts of prompt engineering. Wherever possible, relate technical details to everyday scenarios or familiar contexts to enhance comprehension.
Creativity Constraints and Style Guidance: Your explanations should avoid jargon and technical terminology without sacrificing accuracy. Use metaphors, analogies, and simple examples to convey your points. Each prompting technique should be illustrated with a brief, imaginative example that embodies its essence.
External Knowledge: Feel free to draw upon general knowledge of AI, machine learning, and prompt engineering practices. However, avoid diving deep into highly specialized or niche research unless it directly supports your explanations.
Rules and Guidelines: Steer clear of overly complex explanations or examples that might confuse someone new to the topic. Ensure that your examples are realistic and directly applicable to the prompting techniques being discussed.
Output Verification Standards: Your responses should be clear, accurate, and directly responsive to the task. Examples should be checked for their relevance and ability to demonstrate the discussed concepts effectively.
Benefits of Task: By simplifying these technical concepts, you will help demystify prompt engineering for those new to the subject, fostering a deeper understanding and appreciation of its importance in AI interactions. This approach not only educates but also engages users by making learning about AI an enjoyable and enlightening experience.
"""

In [None]:
system_and_context_prompt_response = prompt_llm(
    system_message=system_message,
    question=context + "\n\nExplain the latest prompting techniques and provide an example of each"
)

In [None]:
render(system_and_context_prompt_response)

Great! Now we're able to get a response that is easy to understand and provides a lot of context. The model has successfully broken down the technical concepts into beginner-friendly explanations, using simple language, analogies, and examples to convey the key points. This approach makes the information more accessible and engaging for those new to prompt engineering, fostering a deeper understanding of the subject matter.

The next step is to standardize the inputs and outputs in a way that allows us to ask different questions and pass different context in the future. By creating a consistent structure for our prompts and responses, we can streamline the development of LLM applications and make it easier to experiment with various configurations. This standardization will enable us to quickly iterate on our prompts, test different contexts, and fine-tune our models to achieve the best possible results.

### Step 4: System Prompts - Inputs

To make it easier to experiment with different parameters and retrieve our best models, we can wrap our prompting logic in a collection of modular functions decorated with `@weave.op()`. This allows us to track and version our operations, making it easier to reproduce and analyze our results. We can also define a standard input structure for our system prompts, ensuring consistency across different prompts and use cases.

In [None]:
# @weave.op()
# def format_prompt(prompt_template, **kwargs):
#     return prompt_template.format(**kwargs)

In [None]:
# Note we're assuming we're only using context and question in our prompt template
@weave.op()
def format_prompt(prompt_template, context, question):
    return prompt_template.format(context=context, question=question)

In [None]:
@weave.op()
def llm_app(prompt_template, context, question, **kwargs):
    formatted_prompt = format_prompt(prompt_template=prompt_template, context=context, question=question)
    response = prompt_llm(
        question=formatted_prompt,
        **kwargs
    )
    return response 

We have now defined `llm_app`, which acts as the core of our prompting assistant.
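
The composition can be tried offline by swapping the model call for a stub. The sketch below mirrors the structure of `format_prompt` and `llm_app`, but every name in it (`format_prompt_sketch`, `fake_llm`, `llm_app_sketch`) is hypothetical and no API is called.

```python
def format_prompt_sketch(prompt_template, context, question):
    """Fill the {context} and {question} placeholders in the template."""
    return prompt_template.format(context=context, question=question)


def fake_llm(prompt, **kwargs):
    """Stand-in for prompt_llm that only reports what it received."""
    model = kwargs.get("model", "default-model")
    return f"[{model}] received {len(prompt)} characters"


def llm_app_sketch(prompt_template, context, question, llm=fake_llm, **kwargs):
    """Format the prompt, then pass it and any extra settings to the model call."""
    formatted = format_prompt_sketch(prompt_template, context, question)
    return llm(formatted, **kwargs)


print(
    llm_app_sketch(
        "{context}\n{question}",
        context="Guide text.",
        question="What is few-shot prompting?",
        model="demo",
    )
)
```

One thing to keep in mind with `str.format`: literal braces in the template itself must be doubled (`{{` and `}}`), while braces inside the `context` or `question` values are passed through unchanged.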

In [None]:
prompt_template = "{context}\n{question}"

In [None]:
question = """
I am building a chatbot for analyzing workshop attendee satisfaction. 
What are some good examples of Few Shot prompts to put into another prompt?
"""

In [None]:
input_template_response = llm_app(
    system_message=system_message,
    prompt_template=prompt_template,
    context=context,
    question=question,
)

In [None]:
render(input_template_response)

Now we can easily and consistently swap system messages, context, and questions to get the best results. By using the `llm_app` function, we have a standardized way to experiment with different configurations and quickly iterate on our prompts. This allows us to test various combinations of system messages, context, and questions to find the most effective prompts for our specific use case.

However, as we can see from the example, the current system prompt doesn't work as well with the newly asked question. This highlights the importance of tailoring the system prompt to the specific task at hand. Additionally, we may want to enforce more consistency in the format of our model's outputs. While using third-party packages like Instructor is beyond the scope of this workshop, we can achieve similar results by using proper tags in our prompt. By including specific tags or formatting instructions in the prompt, we can guide the model to respond in a way that is more consistent and easier to parse on our end.



### Step 5: System Prompts - Outputs

In this step, we focus on improving the consistency and structure of our model's outputs by modifying the prompt template. By including specific tags and formatting instructions in the prompt, we can guide the model to respond in a way that is easier to parse and process.
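
To show what "easier to parse" can look like in practice, here is a minimal sketch that pulls `<explanation>` and `<example>` sections out of a tagged response with regular expressions. `extract_tag` and `parse_answer` are hypothetical helpers, and the sample text is hand-written rather than real model output.

```python
import re


def extract_tag(text, tag):
    """Return the stripped contents of the first <tag>...</tag> pair, or None."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def parse_answer(response_text):
    """Collect the explanation and example sections from a tagged response."""
    return {
        "explanation": extract_tag(response_text, "explanation"),
        "example": extract_tag(response_text, "example"),
    }


# Hand-written sample in the requested format, for illustration only.
sample = """
    <explanation> Few-shot prompts show the model a handful of worked examples. </explanation>
    <example> Review: "Loved the demos!" -> Sentiment: positive </example>
</answer>
"""

print(parse_answer(sample))
```

The parser looks for the inner tags directly, so it works whether or not the response repeats the opening `<answer>` tag. A missing section comes back as `None`, which gives the caller a simple signal to retry or fall back to the raw text.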

In [None]:
system_message = """
Objective: You will analyze a highly technical markdown page about prompt engineering. Your task is to simplify the concepts discussed on this page and explain them in an easy-to-understand manner. For each prompting technique mentioned, provide clear, concise examples that illuminate how the same task would be approached differently.
Personality and Tone: Adopt the role of a friendly and knowledgeable teacher who excels at breaking down complex subjects into easily digestible pieces. Your explanations should be patient and encouraging, aiming to enlighten without overwhelming. The tone should be casual yet informative, making technical content accessible to a broad audience, including beginners.
Contextual Information: Assume the user has a basic understanding of AI but may not be familiar with advanced concepts of prompt engineering. Wherever possible, relate technical details to everyday scenarios or familiar contexts to enhance comprehension.
Creativity Constraints and Style Guidance: Your explanations should avoid jargon and technical terminology without sacrificing accuracy. Use metaphors, analogies, and simple examples to convey your points. Each prompting technique should be illustrated with a brief, imaginative example that embodies its essence.
External Knowledge: Feel free to draw upon general knowledge of AI, machine learning, and prompt engineering practices. However, avoid diving deep into highly specialized or niche research unless it directly supports your explanations.
Rules and Guidelines: Steer clear of overly complex explanations or examples that might confuse someone new to the topic. Ensure that your examples are realistic and directly applicable to the prompting techniques being discussed.
Output Verification Standards: Your responses should be clear, accurate, and directly responsive to the task. Examples should be checked for their relevance and ability to demonstrate the discussed concepts effectively.
Benefits of Task: By simplifying these technical concepts, you will help demystify prompt engineering for those new to the subject, fostering a deeper understanding and appreciation of its importance in AI interactions. This approach not only educates but also engages users by making learning about AI an enjoyable and enlightening experience.
"""

In [None]:
prompt_template = """
<context>{context}</context>
<question>{question}</question>\n

You must respond within an <answer></answer> XML tag.
Inside of the <answer> tag, you must provide a format of
<answer>
    <explanation> EXPLANATION </explanation>
    <example> EXAMPLE </example>
</answer>

Fill in the rest of the tag given:
<answer>
"""

In [None]:
question = """
I am building a chatbot for analyzing workshop attendee satisfaction. 
What are some good examples of Few Shot prompts to put into another prompt?
"""

In [None]:
output_indicator_response = llm_app(
    system_message=system_message,
    prompt_template=prompt_template,
    context=context,
    question=question,
    model=ANTHROPIC_SMART_MODEL_NAME
)

In [None]:
render(output_indicator_response)

## Advanced Prompting Techniques

### Zero-shot Prompting

### Few-shot Prompting

### Chain of Thought

#### CoT + Few-shot

### Self-Consistency

#### Self-Consistency + CoT