## **Outline**

- Prompting Techniques: Zero-shot, Few-shot, Chain-of-Thought
- Hands-On Exercises: Crafting Complex Prompts
- Introduction to Prompt Optimization Strategies

<img src="./images/border.jpg" height="10" width="1500" align="center"/>

## **Prompting Techniques**

- Prompt engineering is the practice of designing prompts so that an LLM performs a task more accurately.
- This section moves from basic prompting to more advanced techniques suited to complex tasks.
- The focus is on improving LLM reliability and performance through advanced prompting.

- The most common prompting techniques are:
  - Zero-shot
  - Few-shot
  - Chain-of-Thought
  - Self-Consistency
  - Retrieval-Augmented Generation

<img src="./images/border.jpg" height="10" width="1500" align="center"/>

## **Zero-Shot Prompting**

- Large language models such as GPT-3 can perform many tasks "zero-shot" because they are trained on large amounts of data.
- They can understand and follow an instruction without being shown any examples of the task.
- The prompt below, a zero-shot example from the previous section, contains an instruction but no demonstrations.

```
Classify the text into neutral, negative or positive. 

Text: I think the vacation is okay.
Sentiment:
```

Output:
```
Neutral
```

- The prompt contains no classified examples, yet the model understands what "sentiment" means; this is the zero-shot capability at work.
- Instruction tuning has been shown to improve zero-shot learning (Wei et al., 2022).
- Instruction tuning means fine-tuning a model on datasets that are described through instructions.
- RLHF (reinforcement learning from human feedback) scales instruction tuning by aligning models with human preferences; ChatGPT is an example.
  - Instruction tuning and RLHF are discussed in upcoming sections.
- When zero-shot prompting is not enough, few-shot prompting with examples in the prompt can improve performance.
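The zero-shot prompt shown earlier is just an instruction plus the input, so it can be assembled from a template. The following is a minimal sketch; the helper name `build_zero_shot_prompt` and its parameters are hypothetical and chosen only for illustration.

```python
def build_zero_shot_prompt(text, labels=("neutral", "negative", "positive")):
    """Build a classification prompt that has an instruction but no demonstrations."""
    if len(labels) < 2:
        raise ValueError("need at least two labels")
    # Render the labels as "a, b or c" to match the wording of the example prompt.
    label_list = ", ".join(labels[:-1]) + " or " + labels[-1]
    return (
        f"Classify the text into {label_list}.\n\n"
        f"Text: {text}\n"
        "Sentiment:"
    )


prompt = build_zero_shot_prompt("I think the vacation is okay.")
print(prompt)
```

The returned string is what would be sent to the model; the model call itself is left out because it depends on the provider.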

<img src="./images/border.jpg" height="10" width="1500" align="center"/>

## **Few-Shot Prompting**

- Large language models show strong zero-shot ability but can still fall short on more complex tasks.
- Few-shot prompting enables **in-context** learning: demonstrations placed in the prompt steer the model toward better performance.
- The demonstrations condition the model for the final input, for which it must generate the response.
- Few-shot capabilities first appeared once models were scaled to a sufficient size (Kaplan et al., 2020).
- The example below, from Brown et al. (2020), asks the model to use a new word correctly in a sentence.
```
A "whatpu" is a small, furry animal native to Tanzania. An example of a sentence that uses the word whatpu is:
We were traveling in Africa and we saw these very cute whatpus.
 
To do a "farduddle" means to jump up and down really fast. An example of a sentence that uses the word farduddle is:

```

- The model can learn a task from a single demonstration; this is called 1-shot learning.
- For more difficult tasks, adding demonstrations (3-shot, 5-shot, 10-shot, etc.) may improve results.
- Experiment with the number of demonstrations to find what works best for the task.
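To experiment with the number of demonstrations, it helps to build the prompt from a list so that k can be varied. The sketch below assumes demonstrations are stored as (input, output) pairs; `build_few_shot_prompt` is a hypothetical helper, not part of any library.

```python
def build_few_shot_prompt(demonstrations, query, k=None):
    """Join the first k (input, output) demonstrations and append the unanswered query."""
    shots = demonstrations if k is None else demonstrations[:k]
    blocks = [f"{source}\n{target}" for source, target in shots]
    # Demonstrations and the final query are separated by blank lines.
    return "\n\n".join(blocks + [query])


demonstrations = [
    (
        'A "whatpu" is a small, furry animal native to Tanzania. '
        "An example of a sentence that uses the word whatpu is:",
        "We were traveling in Africa and we saw these very cute whatpus.",
    ),
]
query = (
    'To do a "farduddle" means to jump up and down really fast. '
    "An example of a sentence that uses the word farduddle is:"
)

one_shot = build_few_shot_prompt(demonstrations, query, k=1)
zero_shot = build_few_shot_prompt(demonstrations, query, k=0)  # no demonstrations
print(one_shot)
```

With `k=0` the function returns the bare query, so the same code covers the zero-shot baseline when comparing different values of k.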

<img src="./images/border.jpg" height="10" width="1500" align="center"/>

- Min et al. (2022) found that the label space and the distribution of the input text in the demonstrations both have a significant impact on performance.
- Whether each label is correct for its individual input matters less than the overall structure the demonstrations provide.
- The format also plays a key role: demonstrations with random labels are still much better than no labels at all.
- Drawing the random labels from the true label distribution, rather than a uniform distribution, helps further.

- The experiment below assigns labels to the inputs at random.
- The labels "Negative" and "Positive" are used regardless of whether they fit the input.
- This tests how much the label format, as opposed to label correctness, affects the model's response.

- **Example**
```
This is awesome! // Negative
This is bad! // Positive
Wow that movie was rad! // Positive
What a horrible show! //
```

- The model still produces the correct answer even though the labels were randomized.
- Keeping the format consistent contributes to this result.
- Newer GPT models are also becoming more robust to variations in format.
- The next example goes further and randomizes the format as well:

```
Positive This is awesome! 
This is bad! Negative
Wow that movie was rad!
Positive
What a horrible show! --
```

- The model still predicts the correct label even though the format is inconsistent.
- This suggests that newer GPT models can interpret and adapt to varied prompt structures.
- A more thorough analysis is needed to confirm that this holds for different and more complex tasks.
- Trying different prompt variations helps map out where the model is robust and where it is not.
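The random-label experiment described above can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: `random_label_demos` and its `match_true_distribution` flag are hypothetical names, and the `text // Label` layout simply mirrors the example prompt.

```python
import random


def random_label_demos(inputs, gold_labels, match_true_distribution=True, seed=0):
    """Pair each input with a randomly drawn label, formatted as 'text // Label'."""
    rng = random.Random(seed)
    if match_true_distribution:
        # Drawing from the gold labels themselves keeps their empirical frequencies.
        pool = list(gold_labels)
    else:
        # Drawing from the distinct labels gives every label equal probability.
        pool = sorted(set(gold_labels))
    return [f"{text} // {rng.choice(pool)}" for text in inputs]


inputs = ["This is awesome!", "This is bad!", "Wow that movie was rad!"]
gold = ["Positive", "Negative", "Positive"]

demos = random_label_demos(inputs, gold)
prompt = "\n".join(demos + ["What a horrible show! //"])
print(prompt)
```

A fixed seed keeps the prompt reproducible between runs, which makes it easier to compare model responses across prompt variations.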

## **Limitations of Few-shot Prompting**
- Few-shot prompting works well for many tasks but is still imperfect, especially on complex reasoning.
- The task below, discussed previously, illustrates the problem.
- It is first posed without any demonstrations:

```
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1. 

A: 
```
Output:
```
Yes, the odd numbers in this group add up to 107, which is an even number.
```

- The response is incorrect: 107 is not an even number, and it is not the sum of the odd numbers in the group either.
- Errors like this show the limits of basic prompting and the need for more advanced prompt engineering.
- Better prompt design can address some of these shortcomings on complex tasks.
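The arithmetic behind the task is easy to verify directly. The short check below is an illustration only; the function name `odd_sum_is_even` is made up for this sketch.

```python
def odd_sum_is_even(numbers):
    """Return the odd numbers, their sum, and whether that sum is even."""
    odd = [n for n in numbers if n % 2 == 1]
    total = sum(odd)
    return odd, total, total % 2 == 0


odd, total, is_even = odd_sum_is_even([15, 32, 5, 13, 82, 7, 1])
print(odd, total, is_even)  # [15, 5, 13, 7, 1] 41 False
```

The odd numbers sum to 41, so the statement in the prompt is false.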

## **Limitations of Few-shot Prompting**

- The same task with few-shot demonstrations added:
```
The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: The answer is False.

The odd numbers in this group add up to an even number: 17,  10, 19, 4, 8, 12, 24.
A: The answer is True.

The odd numbers in this group add up to an even number: 16,  11, 14, 4, 8, 13, 24.
A: The answer is True.

The odd numbers in this group add up to an even number: 17,  9, 10, 12, 13, 4, 2.
A: The answer is False.

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1. 
A: 
```

Output:

```
The answer is True.
```

## **Limitations of Few-shot Prompting**

- The few-shot answer is wrong as well (the statement is false), so demonstrations alone are not enough for this kind of reasoning problem.
- The task involves several reasoning steps, so it helps to break the problem down into steps.
- Chain-of-Thought (CoT) prompting addresses more complex arithmetic, commonsense, and symbolic reasoning tasks.
- Demonstrating the problem-solving process step by step helps the model generate more accurate and reliable responses.

- Examples in the prompt are useful for solving some tasks.
- When zero-shot and few-shot prompting are not sufficient, what the model has learned may not be enough to do well on the task.
- At that point, consider fine-tuning the model or trying more advanced prompting techniques.
- The next section covers Chain-of-Thought (CoT) prompting, an increasingly popular technique for complex problems.

## **Chain-of-Thought (CoT) Prompting**

- Chain-of-Thought (CoT) prompting enables complex reasoning by making the intermediate reasoning steps explicit.
- Combining CoT with few-shot prompting improves results on tasks that require reasoning before responding.
- The demonstrations show the model how to work through the problem before it commits to an answer.


<img src="./images/cot.webp" width="800" align="center"/>

Image Source: Wei et al. (2022)



```
The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.

The odd numbers in this group add up to an even number: 17,  10, 19, 4, 8, 12, 24.
A: Adding all the odd numbers (17, 19) gives 36. The answer is True.

The odd numbers in this group add up to an even number: 16,  11, 14, 4, 8, 13, 24.
A: Adding all the odd numbers (11, 13) gives 24. The answer is True.

The odd numbers in this group add up to an even number: 17,  9, 10, 12, 13, 4, 2.
A: Adding all the odd numbers (17, 9, 13) gives 39. The answer is False.

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1. 
A:
```

- Providing the reasoning steps in the demonstrations significantly improves model accuracy on this task.
- CoT prompting is effective even with very few demonstrations.
- A single worked example can be enough to guide the model to the correct conclusion:

```
The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1. 
A:
```

- This reasoning capability is attributed to the scale of the language model.
- The authors note that it is an emergent ability that appears only in sufficiently large models.
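The rationales in the CoT demonstrations above follow a fixed pattern (list the odd numbers, add them, state the verdict), so demonstrations for this task can be generated rather than written by hand. This is a minimal sketch; `cot_demo` is a hypothetical helper that only reproduces the wording used in the examples.

```python
def cot_demo(numbers):
    """Build one question/answer demonstration that spells out the reasoning steps."""
    odd = [n for n in numbers if n % 2 == 1]
    total = sum(odd)
    verdict = "True" if total % 2 == 0 else "False"
    question = (
        "The odd numbers in this group add up to an even number: "
        + ", ".join(map(str, numbers)) + "."
    )
    answer = (
        f"A: Adding all the odd numbers ({', '.join(map(str, odd))}) "
        f"gives {total}. The answer is {verdict}."
    )
    return question + "\n" + answer


demo = cot_demo([4, 8, 9, 15, 12, 2, 1])
print(demo)
```

Generating the rationale from the numbers also guards against arithmetic slips in hand-written demonstrations.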

## **Zero-shot CoT Prompting**

- Zero-shot Chain-of-Thought (CoT), proposed by Kojima et al. (2022), adds "Let's think step by step" to the original prompt.
- The aim is to elicit step-by-step reasoning without providing any demonstrations.
- The simple problem below tests how the model performs with and without the phrase.

<img src="./images/zero-cot.webp" width="800" align="center"/>

Image Source: Kojima et al. (2022)



``` 
I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples did I remain with?
```
 
- The model's answer to this prompt is incorrect; the correct answer is 10 (10 - 2 - 2 + 5 - 1). Now let's try the special prompt.

```
I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples did I remain with?

Let's think step by step.
```

- It is notable that such a simple prompt is this effective.
- It is especially useful when few examples are available to put in the prompt.
- It shows that step-by-step reasoning can be elicited with almost no prompt setup.
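In code, zero-shot CoT amounts to appending the trigger phrase to the question. The sketch below assumes the phrase goes on its own line after the question; `add_zero_shot_cot` is a hypothetical helper name. It also checks the expected answer to the apples problem.

```python
ZERO_SHOT_COT_TRIGGER = "Let's think step by step."


def add_zero_shot_cot(question, trigger=ZERO_SHOT_COT_TRIGGER):
    """Append the zero-shot CoT trigger phrase to a question."""
    return f"{question.rstrip()}\n\n{trigger}"


question = (
    "I went to the market and bought 10 apples. I gave 2 apples to the neighbor "
    "and 2 to the repairman. I then went and bought 5 more apples and ate 1. "
    "How many apples did I remain with?"
)
prompt = add_zero_shot_cot(question)
print(prompt)

# Expected answer, worked out step by step: bought 10, gave away 2 and 2, bought 5, ate 1.
apples = 10 - 2 - 2 + 5 - 1
print(apples)  # 10
```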

## **Self-Consistency**
- Self-consistency was proposed by Wang et al. (2022) to improve Chain-of-Thought (CoT) prompting.
- It replaces the naive greedy decoding used in CoT prompting with multiple, diverse reasoning paths.
- The reasoning paths are sampled from the model using few-shot CoT.
- The most consistent answer across the generations is selected.
- This significantly improves performance on arithmetic and commonsense reasoning tasks.
- The arithmetic question below is first asked directly:


```
When I was 6 my sister was half my age. Now I’m 70 how old is my sister?
```

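> The output is wrong! The sister is 3 years younger, so the correct answer is 67. The few-shot CoT prompt below is used to sample several reasoning paths instead.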

```
Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done,
there will be 21 trees. How many trees did the grove workers plant today?
A: We start with 15 trees. Later we have 21 trees. The difference must be the number of trees they planted.
So, they must have planted 21 - 15 = 6 trees. The answer is 6.

Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?
A: There are 3 cars in the parking lot already. 2 more arrive. Now there are 3 + 2 = 5 cars. The answer is 5.

Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?
A: Leah had 32 chocolates and Leah’s sister had 42. That means there were originally 32 + 42 = 74
chocolates. 35 have been eaten. So in total they still have 74 - 35 = 39 chocolates. The answer is 39.

Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops
did Jason give to Denny?
A: Jason had 20 lollipops. Since he only has 12 now, he must have given the rest to Denny. The number of
lollipops he has given to Denny must have been 20 - 12 = 8 lollipops. The answer is 8.

Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does
he have now?
A: He has 5 toys. He got 2 from mom, so after that he has 5 + 2 = 7 toys. Then he got 2 more from dad, so
in total he has 7 + 2 = 9 toys. The answer is 9.

Q: There were nine computers in the server room. Five more computers were installed each day, from
monday to thursday. How many computers are now in the server room?
A: There are 4 days from monday to thursday. 5 computers were added each day. That means in total 4 * 5 =
20 computers were added. There were 9 computers in the beginning, so now there are 9 + 20 = 29 computers.
The answer is 29.

Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many
golf balls did he have at the end of wednesday?
A: Michael initially had 58 balls. He lost 23 on Tuesday, so after that he has 58 - 23 = 35 balls. On
Wednesday he lost 2 more so now he has 35 - 2 = 33 balls. The answer is 33.

Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
A: She bought 5 bagels for $3 each. This means she spent $15. She has $8 left.

Q: When I was 6 my sister was half my age. Now I’m 70 how old is my sister?
A:
```
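Once several completions have been sampled for the prompt above, self-consistency reduces to extracting each final answer and taking a majority vote. The sketch below makes several assumptions: the completions are hand-written stand-ins rather than real model output, final answers follow the "The answer is N." pattern used in the exemplars, and the function names are hypothetical.

```python
import re
from collections import Counter

# Matches the final-answer pattern used in the exemplars, e.g. "The answer is 39."
ANSWER_PATTERN = re.compile(r"The answer is\s+\$?(-?\d+(?:\.\d+)?)")


def extract_answer(completion):
    """Return the last 'The answer is N' value in a completion, or None if absent."""
    matches = ANSWER_PATTERN.findall(completion)
    return matches[-1] if matches else None


def self_consistent_answer(completions):
    """Majority-vote over the extracted answers; also return the share of votes won."""
    answers = [a for a in map(extract_answer, completions) if a is not None]
    if not answers:
        return None, 0.0
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)


# Hand-written stand-ins for sampled completions (not real model output).
samples = [
    "When I was 6 my sister was 3. Now I am 70, so she is 70 - 3 = 67. The answer is 67.",
    "The sister is 3 years younger. 70 - 3 = 67. The answer is 67.",
    "My sister is half my age, so 70 / 2 = 35. The answer is 35.",
]
answer, agreement = self_consistent_answer(samples)
print(answer, agreement)
```

In practice the completions would come from sampling the model several times with a nonzero temperature; that call is omitted here because it depends on the provider. Completions without a recognizable final answer are ignored by the vote.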