##### Master Degree in Computer Science and Data Science for Economics

# Prompt Engineering

### Alfio Ferrara

For an introduction, see:
> Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., ... & Resnik, P. (2024). The prompt report: A systematic survey of prompting techniques. arXiv preprint arXiv:2406.06608.

## Options to interact with LLMs
- Cloud APIs (e.g., OpenAI, Anthropic)
    - They are typically easy to use but not always free. Data is processed online, in the provider's cloud.
- Hugging Face Transformers (or other libraries)
    - Multiple models are available, and the library gives complete control over the pipeline. However, everything is executed locally and a GPU is highly recommended.
- Runtime interfaces
    - There are multiple options, such as [llama.cpp](https://github.com/ggml-org/llama.cpp), [ollama](https://ollama.com/), and [vllm](https://github.com/vllm-project/vllm).
    - They support multiple models and both local and remote execution.

### A first example (using Hugging Face)

Sometimes an authentication token is required. See [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
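
The cells in this notebook read the keys from a JSON file at a fixed local path. As a minimal sketch of a slightly more portable variant, the helper below first checks an environment variable and only then falls back to the JSON file. The function name `load_api_key`, the `env_var` parameter, and the placeholder key are assumptions made for illustration; they are not part of `huggingface_hub` or of the original notebook.

```python
import json
import os
import tempfile


def load_api_key(service, keys_path, env_var=None):
    """Return the key from env_var if it is set, otherwise from the JSON file."""
    if env_var and os.environ.get(env_var):
        return os.environ[env_var]
    with open(keys_path, "r") as infile:
        return json.load(infile)[service]


# Demo with a throwaway file; the key value is a placeholder, not a real token.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "keys.json")
    with open(path, "w") as outfile:
        json.dump({"huggingface": "hf_placeholder"}, outfile)
    token = load_api_key("huggingface", path)

print(token)
```

The returned value can then be passed to `login(token=token)` exactly as in the cells that follow.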

In [1]:
import json
from huggingface_hub import login

In [2]:
with open('/Users/Flint/Data/apikeys/keys.json', 'r') as infile:
    token = json.load(infile)['huggingface']

login(token=token)

In [3]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

torch.set_default_device("mps")

model_id = "HuggingFaceH4/zephyr-7b-alpha"

print("Loading model... This may take a few minutes the first time.")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)  # MPS does not support float16

prompt = "<|system|>You are a helpful assistant.<|user|>What is prompt engineering?<|assistant|>"

inputs = tokenizer(prompt, return_tensors="pt")
inputs = {k: v.to("mps") for k, v in inputs.items()}  # Move all input tensors to MPS

print("Generating...")
outputs = model.generate(**inputs, max_new_tokens=200)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\nZephyr's response:\n")
print(response)


Loading model... This may take a few minutes the first time.


Loading checkpoint shards:   0%|          | 0/8 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Generating...

Zephyr's response:

<|system|>You are a helpful assistant.<|user|>What is prompt engineering?<|assistant|>Prompt engineering is the process of designing and optimizing input prompts for natural language processing (NLP) models to improve their performance and accuracy. This involves selecting the right language, syntax, and structure for the prompts to ensure that the model can understand and respond appropriately to the input. Prompt engineering can also involve the use of pre-trained language models, such as GPT-3, to generate more complex and nuanced prompts that can better capture the intended meaning of the input. Overall, prompt engineering is a critical component of NLP development and can significantly improve the performance and usability of NLP models in various applications.


## Using APIs

In [4]:
from openai import OpenAI

with open('/Users/Flint/Data/apikeys/keys.json', 'r') as infile:
    apikey = json.load(infile)['openai']

client = OpenAI(api_key=apikey)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is prompt engineering in simple terms?"}
]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=messages,
    max_tokens=300,
    temperature=0.7
)

print(response.choices[0].message.content)

Prompt engineering is the process of designing and creating prompts, which are cues or questions that guide someone's actions or responses. These prompts are often used in various contexts, such as in user interfaces, educational settings, or therapy sessions, to help individuals make decisions, learn new information, or complete tasks. The goal of prompt engineering is to design prompts in a way that is clear, effective, and tailored to the specific needs of the individual or situation.


## Instruction + context + output format

In [5]:
def askgpt(messages, temperature=0.7):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=temperature)
    return response.choices[0].message.content

**Generic and non-informative prompt**

In [6]:
messages = [
    {"role": "user", "content": "Write a recipe."}
]
answer = askgpt(messages)

print(answer)

Spaghetti Carbonara

Ingredients:
- 12 oz spaghetti
- 1/2 lb pancetta, diced
- 3 cloves garlic, minced
- 2 large eggs
- 1 cup grated Pecorino Romano cheese
- Salt and pepper to taste
- Chopped parsley for garnish

Instructions:
1. Cook the spaghetti in a large pot of salted boiling water according to package instructions until al dente. Reserve 1 cup of pasta water and drain the spaghetti.
2. In a large skillet, cook the pancetta over medium heat until crispy, about 5-7 minutes. Add the minced garlic and cook for another minute.
3. In a bowl, whisk together the eggs, grated cheese, and a pinch of salt and pepper.
4. Add the drained spaghetti to the skillet with the pancetta and garlic. Toss to combine.
5. Remove the skillet from heat and quickly stir in the egg and cheese mixture. The residual heat from the pasta will cook the eggs and create a creamy sauce. If the sauce is too thick, add a little pasta water to thin it out.
6. Serve the spaghetti carbonara in bowls, garnished with cho

**Instructions**

Let's add an instruction about the kind of language we want the model to use. It is still unclear what kind of recipe we want and how it should be formatted.

In [7]:
messages = [
    {"role": "system", "content": "You are a professional chef."},
    {"role": "user", "content": "Write a recipe."}
]

answer = askgpt(messages)
print(answer)

**Recipe: Lemon Herb Roasted Chicken**

Ingredients:
- 1 whole chicken (about 4-5 lbs)
- 2 lemons, sliced
- 4 cloves of garlic, minced
- 2 tablespoons of fresh rosemary, chopped
- 2 tablespoons of fresh thyme, chopped
- 2 tablespoons of fresh parsley, chopped
- 1/4 cup of olive oil
- Salt and pepper to taste

Instructions:
1. Preheat your oven to 400°F (200°C).

2. In a small bowl, mix together the minced garlic, chopped rosemary, thyme, parsley, olive oil, salt, and pepper.

3. Place the chicken in a roasting pan and pat it dry with paper towels. Season the inside of the chicken cavity with salt and pepper.

4. Rub the herb mixture all over the chicken, making sure to get under the skin as well.

5. Stuff the cavity of the chicken with the sliced lemons.

6. Tie the chicken legs together with kitchen twine and tuck the wings under the body.

7. Place the roasting pan in the preheated oven and roast the chicken for about 1 hour and 15 minutes, or until the internal temperature reaches 

**Instructions + Context**

Next, we add more detail about the kind of recipe we want to obtain.

In [8]:
messages = [
    {"role": "system", "content": "You are writing recipes for a magazine that publishes easy-to-do recipes with few ingredients."},
    {"role": "user", "content": "Write a Chinese recipe that an inexperienced teenager can follow with easy-to-find ingredients."}
]

answer = askgpt(messages)
print(answer)

**Easy Chicken Stir-Fry**

**Ingredients:**
- 1 lb boneless, skinless chicken breasts
- 2 tablespoons soy sauce
- 1 tablespoon cornstarch
- 1 tablespoon vegetable oil
- 1 bell pepper, sliced
- 1 onion, sliced
- 2 cloves garlic, minced
- 1 teaspoon grated ginger
- Cooked rice, for serving

**Instructions:**
1. Slice the chicken breasts into thin strips and place them in a bowl. Add the soy sauce and cornstarch, and mix well to coat the chicken. Let it marinate for 10-15 minutes.
2. Heat the vegetable oil in a large pan or wok over medium-high heat. Add the marinated chicken and cook until it is no longer pink, about 5-7 minutes.
3. Add the bell pepper, onion, garlic, and ginger to the pan. Stir-fry for another 3-4 minutes, or until the vegetables are tender-crisp.
4. Serve the chicken stir-fry over cooked rice and enjoy!

**Note:** Feel free to customize this recipe by adding other vegetables such as broccoli, carrots, or snow peas. You can also adjust the seasonings to suit your taste 

**Instructions + Context + output format**

Now we also try to control the format of the output.

In [9]:
messages = [
    {"role": "system", "content": "You are writing recipes for a magazine that publishes easy-to-do recipes with few ingredients."},
    {"role": "user", "content": """
    Write a Chinese recipe that an inexperienced teenager can follow with easy-to-find ingredients.
    Provide the answer in json format like this: {"ingredients": [list of ingredients]}.
    Do not add anything but the ingredients. No title, no description and no comments!
    """}
]

answer = askgpt(messages)
print(answer)

{"ingredients": ["1 pound boneless chicken thighs, cut into bite-sized pieces", "2 tablespoons soy sauce", "1 tablespoon cornstarch", "1 tablespoon vegetable oil", "1 bell pepper, sliced", "1 onion, sliced", "2 cloves garlic, minced", "1/4 cup soy sauce", "1/4 cup water", "1 tablespoon brown sugar", "Cooked rice, for serving"]}


**Note about formatting**

This kind of formatting still requires parsing the JSON structure, and nothing guarantees that the model returns valid JSON. Some runtimes, such as `llama.cpp`, can constrain the output with a `grammar`.

For more information, see the [llama.cpp grammars documentation](https://github.com/ggml-org/llama.cpp/blob/master/grammars/README.md).
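
Since the answer is plain text, it has to be parsed before the ingredients can be used. The sketch below shows one defensive way to do that with the standard `json` module. The helper name `parse_ingredients` and the fallback strategy (retrying on the outermost `{...}` span when the model wraps the JSON in extra text) are assumptions chosen for illustration, not a method prescribed by this notebook.

```python
import json


def parse_ingredients(answer):
    """Extract the "ingredients" list from a model answer, or return None."""
    # First try the whole answer, then the outermost {...} span, in case
    # the model added text before or after the JSON object.
    candidates = [answer]
    start, end = answer.find("{"), answer.rfind("}")
    if start != -1 and end > start:
        candidates.append(answer[start:end + 1])
    for text in candidates:
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and isinstance(data.get("ingredients"), list):
            return data["ingredients"]
    return None


clean = '{"ingredients": ["2 tablespoons soy sauce", "1 onion, sliced"]}'
noisy = "Sure! Here is the recipe:\n" + clean + "\nEnjoy!"

print(parse_ingredients(clean))
print(parse_ingredients(noisy))
print(parse_ingredients("no json here"))
```

Returning `None` instead of raising makes it easy to retry the request when the model does not follow the requested format.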

## Zero-shot, one-shot, few-shot

1. **Zero-shot prompting**

    You ask the model to perform a task without showing any examples. It relies only on the instruction and prior training.

2. **One-shot prompting**

    You provide a single example of the task you want done, to guide the model.

3. **Few-shot prompting**

    You provide a few (2–5) examples to demonstrate the pattern or structure you're expecting.
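
The three settings differ only in how many examples are placed in the prompt. A minimal sketch, assuming the examples are passed as earlier user/assistant turns in the chat format used throughout this notebook (the helper name `build_messages` and the sample recipes are illustrative):

```python
def build_messages(system, examples, query):
    """Build a chat message list with zero, one, or several worked examples.

    `examples` is a list of (input, output) pairs; an empty list gives a
    zero-shot prompt, one pair a one-shot prompt, more pairs a few-shot prompt.
    """
    messages = [{"role": "system", "content": system}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": query})
    return messages


examples = [
    ("Give me an Italian recipe.",
     "Title: Pasta with tomato and onions\n"
     "Ingredients: [pasta, olive oil, tomato sauce, onions]"),
    ("Give me another Italian recipe.",
     "Title: Bruschetta\n"
     "Ingredients: [bread, tomatoes, garlic, basil, olive oil]"),
]

zero_shot = build_messages("You are a professional chef.", [], "Give me an Italian recipe.")
one_shot = build_messages("You are a professional chef.", examples[:1], "Give me an Italian recipe.")
few_shot = build_messages("You are a professional chef.", examples, "Give me an Italian recipe.")

print(len(zero_shot), len(one_shot), len(few_shot))
```

An alternative, used in the one-shot cell of this notebook, is to write the example directly inside a single user message.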

#### Zero-shot

In [10]:
messages = [
    {"role": "system", "content": "You are a professional chef."},
    {"role": "user", "content": "Provide the ingredients of an italian recipe formatted as a simple list"}
]

answer = askgpt(messages)
print(answer)

Sure, here are the ingredients for a classic Italian pasta dish, Spaghetti Carbonara:

- 8 oz spaghetti
- 4 oz pancetta or guanciale, diced
- 2 cloves garlic, minced
- 2 large eggs
- 1 cup grated Pecorino Romano cheese
- Salt and black pepper
- Fresh parsley, chopped (for garnish)


#### One- and few-shot 

In [11]:
messages = [
    {"role": "system", "content": "You are a professional chef."},
    {"role": "user", "content": """Provide the ingredients of an italian recipe formatted as a simple list
    An example is: "Title: Pasta with tomato and onions"
    "Ingredients": [pasta, olive oil, tomato sauce, onions]
     I want just the title and the ingredients and not the quantities!
    """}
]

answer = askgpt(messages)
print(answer)

Title: Classic Spaghetti Carbonara

Ingredients: 
- Spaghetti
- Guanciale (cured pork jowl)
- Eggs
- Pecorino Romano cheese
- Black pepper


## More advanced methods

In advanced prompt engineering, the goal is not just to get an answer from the model, but to shape how the model thinks in order to achieve more consistent, 
accurate, and explainable outputs.

At this level, we move from “instructing” the model to designing reasoning patterns, i.e. ways of structuring prompts so that 
the model behaves predictably even in complex or ambiguous scenarios.
Effective prompts act like programs for reasoning: they define inputs, constraints, intermediate steps, and desired outputs.

Advanced techniques include:

- Chain-of-Thought prompting — guiding the model to reason step by step.
- Self-Consistency prompting — sampling multiple reasoning paths and comparing results.
- Reflexive or self-critique prompts — asking the model to evaluate or refine its own output.
- Tool-augmented prompting — designing prompts that invoke external processes (like search, code, or data transformation).
- Meta-prompts — prompts that generate or optimize other prompts.

Ultimately, advanced prompt engineering is about meta-control: understanding how to design structured interactions that produce not just creative answers, 
but reproducible reasoning.
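
As an illustration of the self-consistency idea listed above, the sketch below samples several answers and keeps the most frequent one. The function `self_consistent_answer` and the canned answers are hypothetical; in practice `sample_answer` would call a model with a temperature above zero and extract the final answer from each reasoning path.

```python
from collections import Counter


def self_consistent_answer(sample_answer, n_samples=5):
    """Sample n_samples answers and return the majority answer and its vote share."""
    answers = [sample_answer() for _ in range(n_samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples


# Stub standing in for repeated model calls (hypothetical final answers).
canned = iter(["40000", "60000", "40000", "40000", "80000"])
answer, share = self_consistent_answer(lambda: next(canned), n_samples=5)

print(answer, share)
```

The vote share gives a rough signal of how stable the answer is across reasoning paths.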

## Prompt chain-of-thought

In [12]:
messages = [
    {"role": "system", "content": "You are a cuisine expert."},
    {"role": "user", "content": """
    Provide a review of the pros and cons of Italian cuisine. Follow this scheme:
    1) Introduce Italian cuisine in a single sentence; 2) Summarize the main three pros; 3) Summarize the main three cons;
    4) Conclude with an educated suggestion
    """}
]

answer = askgpt(messages)
print(answer)

1) Italian cuisine is renowned for its simple yet flavorful dishes that highlight fresh, high-quality ingredients.

2) Pros:
   a) Fresh Ingredients: Italian cuisine emphasizes the use of fresh, seasonal ingredients, resulting in vibrant flavors and nutritious meals.
   b) Diverse Regional Flavors: Italy's diverse regions offer a wide range of culinary traditions, from rich pasta dishes in the North to fresh seafood along the coast.
   c) Versatility: Italian cuisine offers a variety of options for different dietary preferences, including vegetarian, gluten-free, and vegan dishes.

3) Cons:
   a) Heavy on Carbs: Traditional Italian dishes often contain a lot of carbohydrates, which may not be suitable for those following a low-carb diet.
   b) Time-consuming Preparation: Some Italian recipes require lengthy preparation and cooking times, making them less practical for busy weeknights.
   c) High in Calories: Many Italian dishes are rich in fats and calories, which may not align with he

### Prompt with constraints concerning style, length, format, etc.

In [13]:
messages = [
    {"role": "system", "content": "You are a cuisine expert."},
    {"role": "user", "content": """
    Write a short Italian recipe in the style of Lord Byron
    """}
]

answer = askgpt(messages)
print(answer)

Oh, fair Italy, land of culinary delight,
Where pasta and tomatoes dance through the night.
To create a dish that's truly divine,
Let us mix, let us stir, and let us combine.

Take some ripe tomatoes, plump and red,
And slice them with care, like poetry read.
In a pan, let them simmer, let them stew,
As their flavors intensify, let them imbue.

Next, add garlic, like a whispered verse,
And olive oil, a blessing, not a curse.
Let them sauté, let them mingle and blend,
As the aroma rises, like a message to send.

Now, take some pasta, al dente, just right,
And toss it with the sauce, a heavenly sight.
Garnish with basil, like a touch of green,
A masterpiece of taste, a sight to be seen.

Serve with a glass of wine, rich and deep,
A feast fit for poets, a promise to keep.
In this Italian dish, a story is told,
Of flavors and aromas, a tale of old.


In [20]:
def chat(content, system="useful assistant"):
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": content}
    ]
    answer = askgpt(messages)
    return answer.replace(". ", ".\n")

### Chain-of-Thought prompting

In [23]:
naive_prompt = """
Estimate how many cuisine chefs there are in Paris.
"""
cot_prompt = "\n ".join([
    'Estimate the population of Paris', 'Estimate the number of restaurants in Paris',
    'Take into account that typically each restaurant needs a chef or two',
    'Calculate the number of chefs needed',
    'At the end, give your final estimate and explain your reasoning briefly.'
])

print(f"Naive prompt: {chat(content=naive_prompt)}\n\n")
print(f"CoT prompt: {chat(content=cot_prompt)}")

Naive prompt: It is difficult to provide an exact number of cuisine chefs in Paris, as there are many restaurants, cafes, and culinary establishments in the city.
However, it is estimated that there are thousands of cuisine chefs working in various establishments across Paris.
The city is known for its vibrant food scene and culinary culture, so there is likely a large number of cuisine chefs practicing their craft in the city.


CoT prompt: To estimate the population of Paris, we can refer to the latest available data from official sources.
As of 2021, the population of Paris was estimated to be around 2.15 million people.

Next, let's estimate the number of restaurants in Paris.
According to various sources, there are approximately 40,000 restaurants in Paris.

Given that each restaurant typically needs at least one chef, and some larger restaurants may require two or more chefs, we can estimate that on average, there is at least one chef per restaurant.

Therefore, based on the esti

### Self-evaluation & Iterative Improvement

In [26]:
naive_prompt = """
Write a 200-word summary of the main features of Italian cuisine.
"""
answer = chat(content=naive_prompt)
print(f"Naive prompt: {answer}\n\n")

# Reuse the summary printed above, so that the critique targets the same text
self_prompt = f"This is a short summary of the features of Italian cuisine: {answer}. "
self_prompt += "Critique this summary as if you were a cuisine expert reviewing it for a publication. Identify at least two weaknesses. "
self_prompt += "Rewrite the summary incorporating your own feedback."

eh_answer = chat(content=self_prompt, system="you are a cuisine expert")
print(f"Self-evaluation prompt: {eh_answer}\n\n")



Naive prompt: Italian cuisine is characterized by its simplicity, fresh ingredients, and rich flavors.
Pasta is a staple in Italian cooking, with a variety of shapes and sizes available, along with a wide range of sauces such as marinara, alfredo, and pesto.
Risotto, a creamy rice dish cooked with broth and often flavored with ingredients like mushrooms or seafood, is another popular Italian dish.

Italian cuisine also features a diverse selection of cheeses, including mozzarella, parmesan, and gorgonzola, used in dishes like pizza, lasagna, and salads.
Olive oil is a key ingredient in Italian cooking, used for sautéing, dressing salads, and drizzling over dishes for added flavor.

Italian desserts are known for their decadence and variety, with classics like tiramisu, cannoli, and gelato delighting taste buds around the world.
Italian coffee culture is also prominent, with espresso being a popular choice for a quick caffeine fix.

Overall, Italian cuisine emphasizes the importance of 
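
The critique-and-rewrite step can be repeated more than once. The sketch below wraps the pattern in a small loop; the function `critique_and_rewrite`, its `rounds` parameter, and the stub `fake_ask` are assumptions for illustration. Any callable that maps a prompt string to an answer string, such as the `chat` helper defined earlier, could be passed as `ask`.

```python
def critique_and_rewrite(ask, task, rounds=1):
    """Draft an answer, then ask for a critique and a rewrite `rounds` times."""
    draft = ask(task)
    for _ in range(rounds):
        critique_prompt = (
            f"Here is a draft answer to the task '{task.strip()}':\n{draft}\n"
            "Identify at least two weaknesses, then rewrite the draft "
            "incorporating your own feedback."
        )
        draft = ask(critique_prompt)
    return draft


# Stub standing in for a model call, so the control flow can be inspected.
calls = []


def fake_ask(prompt):
    calls.append(prompt)
    return f"draft {len(calls)}"


result = critique_and_rewrite(fake_ask, "Summarize Italian cuisine.", rounds=2)
print(result)
print(len(calls))
```

Each round costs one extra model call, so a small number of rounds is usually a reasonable starting point.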