##### Master Degree in Computer Science and Data Science for Economics

# Prompt Engineering

### Alfio Ferrara

For an introduction see
> Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., ... & Resnik, P. (2024). The prompt report: A systematic survey of prompting techniques. arXiv preprint arXiv:2406.06608, 5.

## Options to interact with LLMs
-  Cloud APIs (e.g., OpenAI, Anthropic)
    - They are tipically easy-to-use but not always free. Data are processed online in cloud.
- Hugging Face Transformers (or other libraries)
    - Multiple models are available, and provides a complete control over the pipeline. Howevere everything is executed locally and a GPU is highly recommended.
- Runtime interfaces
    - There are multiple options, such as [llama.cpp](https://github.com/ggml-org/llama.cpp), [ollama](https://ollama.com/), [vllm](https://github.com/vllm-project/vllm)
    - They support multiple models and both local and remote execution

### A first example (using huggingface)

Sometimes a authentication token is required. See [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)

In [11]:
import json
from huggingface_hub import login

In [12]:
with open('/Users/Flint/Data/apikeys/keys.json', 'r') as infile:
    token = json.load(infile)['huggingface']

login(token=token)

In [15]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

torch.set_default_device("mps")

model_id = "HuggingFaceH4/zephyr-7b-alpha"

print("Loading model... This may take a few minutes the first time.")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)  # MPS does not support float16

prompt = "<|system|>You are a helpful assistant.<|user|>What is prompt engineering?<|assistant|>"

inputs = tokenizer(prompt, return_tensors="pt")
inputs = {k: v.to("mps") for k, v in inputs.items()}  # Sposta tutto su MPS

print("Generating...")
outputs = model.generate(**inputs, max_new_tokens=200)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\nZephyr's response:\n")
print(response)


Loading model... This may take a few minutes the first time.


Loading checkpoint shards:   0%|          | 0/8 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Generating...

Zephyr's response:

<|system|>You are a helpful assistant.<|user|>What is prompt engineering?<|assistant|>Prompt engineering is the process of designing and optimizing input prompts for natural language processing (NLP) models to improve their performance and accuracy. This involves selecting the right language, syntax, and structure for the prompts to ensure that the model can understand and respond appropriately to the input. Prompt engineering can also involve the use of pre-trained language models, such as GPT-3, to generate more complex and nuanced prompts that can better capture the intended meaning of the input. Overall, prompt engineering is a critical component of NLP development and can significantly improve the performance and usability of NLP models in various applications.


## Using APIs

In [19]:
from openai import OpenAI

with open('/Users/Flint/Data/apikeys/keys.json', 'r') as infile:
    apikey = json.load(infile)['openai']

client = OpenAI(api_key=apikey)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is prompt engineering in simple terms?"}
]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=messages,
    max_tokens=300,
    temperature=0.7
)

print(response.choices[0].message.content)

Prompt engineering is the process of designing and crafting effective prompts or cues to elicit specific responses or actions from individuals. It involves careful consideration of the wording, timing, and delivery of prompts to influence behavior or decision-making in a desired way.


## Instruction + context + output format

In [20]:
def askgpt(messages, temperature=0.7):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=temperature)
    return response.choices[0].message.content

**Generic and non-informative prompt**

In [21]:
messages = [
    {"role": "user", "content": "Write a recipe."}
]
answer = askgpt(messages)

print(answer)

Recipe: Spaghetti Aglio e Olio

Ingredients:
- 1 pound spaghetti
- 1/2 cup olive oil
- 6 cloves garlic, thinly sliced
- 1 teaspoon red pepper flakes
- 1/2 cup fresh parsley, chopped
- Salt and pepper to taste
- Grated Parmesan cheese (optional)

Instructions:
1. Cook the spaghetti according to package instructions until al dente. Drain and set aside, reserving 1 cup of pasta water.
2. In a large skillet, heat the olive oil over medium heat. Add the garlic and red pepper flakes and cook until the garlic is golden brown, about 2-3 minutes.
3. Add the cooked spaghetti to the skillet and toss to coat in the garlic oil. If the pasta seems dry, add some of the reserved pasta water to help loosen it up.
4. Stir in the chopped parsley and season with salt and pepper to taste.
5. Serve the spaghetti hot, garnished with grated Parmesan cheese if desired.

Enjoy your delicious and simple Spaghetti Aglio e Olio!


**Instructions**

Let's add some more instructions on the kind of language we want to be used. Still unclear what kind of recipe we want to get and how formatted.

In [22]:
messages = [
    {"role": "system", "content": "You are a professional chef."},
    {"role": "user", "content": "Write a recipe."}
]

answer = askgpt(messages)
print(answer)

Sure! Here's a recipe for a delicious and creamy Mushroom Risotto:

Ingredients:
- 1 cup Arborio rice
- 4 cups chicken or vegetable broth
- 2 tablespoons olive oil
- 1 small onion, finely chopped
- 2 cloves garlic, minced
- 8 oz mushrooms (such as cremini or shiitake), sliced
- 1/2 cup white wine
- 1/2 cup grated Parmesan cheese
- 2 tablespoons butter
- Salt and pepper to taste
- Fresh parsley, chopped for garnish

Instructions:
1. In a saucepan, bring the chicken or vegetable broth to a simmer over low heat. Keep it warm on the stove while you prepare the risotto.

2. In a large skillet or saucepan, heat the olive oil over medium heat. Add the chopped onion and garlic, and sauté until softened, about 3-4 minutes.

3. Add the sliced mushrooms to the skillet and cook until they are browned and tender, about 5-6 minutes.

4. Stir in the Arborio rice and cook for 1-2 minutes, until the rice is lightly toasted.

5. Pour in the white wine and cook until it is absorbed by the rice, stirring 

**Instructions + Context**

We want to add someting more about the kind of recipe we want to obtain

In [24]:
messages = [
    {"role": "system", "content": "You are writing recipes for a magazine that publishes easy-to-do recipes with few ingredients."},
    {"role": "user", "content": "Write a chinese recipe that an non experienced teenager can follow with easy-to-find ingredients."}
]

answer = askgpt(messages)
print(answer)

**Simple Stir-Fried Sweet and Sour Chicken**

Ingredients:
- 1 lb boneless, skinless chicken breast, cut into bite-sized pieces
- 1 red bell pepper, sliced
- 1 green bell pepper, sliced
- 1 small onion, sliced
- 1/2 cup pineapple chunks (fresh or canned)
- 1/4 cup ketchup
- 2 tablespoons soy sauce
- 2 tablespoons rice vinegar
- 2 tablespoons brown sugar
- 1 tablespoon cornstarch
- Salt and pepper to taste
- 2 tablespoons vegetable oil

Instructions:
1. In a small bowl, mix together the ketchup, soy sauce, rice vinegar, brown sugar, and cornstarch to make the sauce. Set aside.
2. Season the chicken pieces with salt and pepper.
3. Heat the vegetable oil in a large skillet or wok over medium-high heat.
4. Add the chicken to the skillet and cook until browned and cooked through, about 5-7 minutes.
5. Add the sliced bell peppers and onion to the skillet and cook for another 2-3 minutes, until vegetables are slightly tender.
6. Add the pineapple chunks and the sauce to the skillet. Stir well

**Instructions + Context + output format**

We try to control the output

In [25]:
messages = [
    {"role": "system", "content": "You are writing recipes for a magazine that publishes easy-to-do recipes with few ingredients."},
    {"role": "user", "content": """
    Write a chinese recipe that an non experienced teenager can follow with easy-to-find ingredients.
    Provide the answer in json format like this: {"ingredients": [list of ingredients]}.
    Do not add anything but the ingredients. No title, no description and no comments!
    """}
]

answer = askgpt(messages)
print(answer)

```json
{"ingredients": ["1 lb chicken breast, cut into bite-sized pieces", "2 tablespoons soy sauce", "2 tablespoons oyster sauce", "1 tablespoon cornstarch", "1 tablespoon vegetable oil", "1 bell pepper, sliced", "1 onion, sliced", "1 teaspoon garlic powder", "Cooked rice, for serving"]}
```


**Note about formatting**

This kind of formatting still requires parsing the json structure. With some libraries, like `llama.cpp` we can define the output using a `grammar`.

See more information on [GGML](https://github.com/ggml-org/llama.cpp/blob/master/grammars/README.md) website.

## Zero-shot, one-show, few-shot

1. **Zero-shot prompting**

    You ask the model to perform a task without showing any examples. It relies only on the instruction and prior training.

2. **One-shot prompting**

    You provide a single example of the task you want done, to guide the model.

3. **Few-shot prompting**

    You provide a few (2–5) examples to demonstrate the pattern or structure you're expecting.

#### Zero-shot

In [26]:
messages = [
    {"role": "system", "content": "You are a professional chef."},
    {"role": "user", "content": "Provide the ingredients of an italian recipe formatted as a simple list"}
]

answer = askgpt(messages)
print(answer)

Sure! Here are the ingredients for a classic Spaghetti Carbonara recipe:

- 8 ounces spaghetti
- 4 ounces pancetta or guanciale, diced
- 2 cloves garlic, minced
- 2 large eggs
- 1 cup grated Pecorino Romano cheese
- Salt and black pepper
- Fresh parsley, chopped (for garnish)


#### One- and few-shot 

In [27]:
messages = [
    {"role": "system", "content": "You are a professional chef."},
    {"role": "user", "content": """Provide the ingredients of an italian recipe formatted as a simple list
    An example is: "Title: Pasta with tomato and onions"
    "Ingredients": [pasta, olive oil, tomato sauce, onions]
    """}
]

answer = askgpt(messages)
print(answer)

Title: Classic Spaghetti Carbonara

Ingredients: 
- Spaghetti
- Pancetta or guanciale (Italian cured meat)
- Eggs
- Pecorino Romano cheese
- Black pepper


## Prompt chain-of-thought

In [28]:
messages = [
    {"role": "system", "content": "You are a cousine espert."},
    {"role": "user", "content": """
    Provide a review on the pro and cons of the Italian cousine. Follow this scheme:
    1) Introduce the Italian cousine in a single sentence; 2) Summarize the main three pros; 3) summarize the main three cons;
    4) Conclude with an educated suggestion
    """}
]

answer = askgpt(messages)
print(answer)

1) Italian cuisine is renowned for its simplicity, fresh ingredients, and rich flavors, reflecting the diverse regional influences across the country.

2) Pros:
- Fresh and high-quality ingredients: Italian cuisine emphasizes using fresh, seasonal produce, quality meats, and cheeses, resulting in flavorful dishes.
- Diverse regional dishes: Each region in Italy has its own unique culinary traditions, offering a wide variety of dishes to explore.
- Simple and flavorful recipes: Italian dishes often have few ingredients but are packed with flavor due to the emphasis on quality ingredients and traditional cooking methods.

3) Cons:
- Heavy on carbs and fats: Some traditional Italian dishes can be high in carbohydrates and fats, which may not be suitable for those following a low-carb or low-fat diet.
- Limited vegetarian and vegan options: While Italian cuisine offers delicious vegetarian dishes, vegan options can be limited in traditional restaurants.
- Time-consuming preparation: Some I

### Prompt with constraints concerning style, lenght, format, etc.

In [29]:
messages = [
    {"role": "system", "content": "You are a cousine espert."},
    {"role": "user", "content": """
    Write a short Italian recipe in the style of Lord Byron
    """}
]

answer = askgpt(messages)
print(answer)

Ah, to create a dish divine,
In the Italian way, I opine,
Take pasta al dente, cooked just right,
Tossed in sauce, a pure delight.

Basil, garlic, olive oil so fair,
In a pan, they dance with flair,
Tomatoes ripe, their juices flow,
A symphony of flavors, aglow.

Garnish with Parmesan, a touch so fine,
A dish fit for a poet's line,
Serve with wine, a vintage grand,
In Italy's embrace, we stand.
