# Prompting a Large Language Model


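This notebook shows how to run an open-source model from Hugging Face and compares four prompting strategies (zero-shot, few-shot, chain-of-thought, and tree-of-thought) on a course-selection task.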

In [1]:
%pip install -r requirements.txt

Collecting jedi>=0.16 (from ipython>=4.0.0->ipywidgets->-r requirements.txt (line 4))
  Downloading jedi-0.19.2-py2.py3-none-any.whl.metadata (22 kB)
Downloading jedi-0.19.2-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi
Successfully installed jedi-0.19.2


In [1]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

We use [Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) in this notebook because of its small size (about 3.8B parameters).
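
The loading cell that follows pins `device_map` to a single device. A minimal sketch of choosing the device at runtime instead is shown here. `pick_device` is a hypothetical helper, not part of the notebook, and it assumes that `torch.cuda.is_available()` is the right check for the machine at hand.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is visible to PyTorch, else "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Loading on: {device}")
```

The resulting string could then be passed as `device_map=device` when calling `from_pretrained`.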

In [2]:
torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3.5-mini-instruct",
    device_map="cuda",  # change to "cpu" if no GPU is available
    torch_dtype="auto",
    trust_remote_code=False,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


`torch_dtype` is deprecated! Use `dtype` instead!



An example multi-turn prompt, with one earlier user/assistant exchange included as conversation history:

In [None]:
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving the equation 2x + 3 = 7?"},
]

In [None]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "do_sample": False,
}

Device set to use cpu


In [None]:
output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

 To solve the equation 2x + 3 = 7, follow these steps:

Step 1: Isolate the term with the variable (2x) by subtracting 3 from both sides of the equation.
2x + 3 - 3 = 7 - 3
2x = 4

Step 2: Solve for x by dividing both sides of the equation by the coefficient of x, which is 2.
2x / 2 = 4 / 2
x = 2

So, the solution to the equation 2x + 3 = 7 is x = 2.


1. Zero-shot Prompting

In [3]:
messages_zero = [
    {
        "role": "system",
        "content": "You are a helpful academic advisor."
    },
    {
        "role": "user",
        "content": """
Task:
You are an academic advisor.

A student must choose courses for the next semester with the following constraints:

Courses available:
- Math (3 credits, Mon 9–11)
- Physics (3 credits, Mon 10–12)
- Programming (4 credits, Tue 9–12)
- Statistics (3 credits, Wed 9–11)
- Economics (3 credits, Thu 10–12)

Constraints:
1. The student must take at least 9 credits.
2. No time conflicts are allowed.
3. Programming must be taken if possible.
4. The student prefers quantitative courses (Math, Physics, Statistics).

Output requirements (STRICT):
1. Selected courses (list)
2. Total credits
3. A brief justification (max 3 bullet points)

Do NOT show reasoning steps.
Do NOT include extra explanations.
"""
    }
]
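
Because the task has hard constraints, a model's answer can be checked mechanically. The following is a minimal sketch of such a checker, not part of the original notebook. The names `COURSES`, `overlaps`, and `check` are hypothetical, the course data is copied from the prompt above, and constraint 3 is read as "Programming is always required", on the assumption that it clashes with no other course. The soft preference in constraint 4 is not checked.

```python
from itertools import combinations

# Course data copied from the task above: name -> (credits, day, start hour, end hour)
COURSES = {
    "Math": (3, "Mon", 9, 11),
    "Physics": (3, "Mon", 10, 12),
    "Programming": (4, "Tue", 9, 12),
    "Statistics": (3, "Wed", 9, 11),
    "Economics": (3, "Thu", 10, 12),
}
MIN_CREDITS = 9


def overlaps(a: str, b: str) -> bool:
    """True when two courses meet on the same day with intersecting hours."""
    _, day_a, start_a, end_a = COURSES[a]
    _, day_b, start_b, end_b = COURSES[b]
    return day_a == day_b and start_a < end_b and start_b < end_a


def check(selection: list[str]) -> tuple[int, list[str]]:
    """Return (total credits, list of violated hard constraints)."""
    total = sum(COURSES[name][0] for name in selection)
    problems = [
        f"time conflict: {a} / {b}"
        for a, b in combinations(selection, 2)
        if overlaps(a, b)
    ]
    if total < MIN_CREDITS:
        problems.append(f"only {total} credits, need {MIN_CREDITS}")
    # Assumption: Programming conflicts with nothing, so it is always "possible".
    if "Programming" not in selection:
        problems.append("Programming not selected")
    return total, problems


print(check(["Programming", "Math", "Statistics"]))
# -> (10, [])
print(check(["Programming", "Math", "Physics", "Statistics"]))
# -> (13, ['time conflict: Math / Physics'])
```

Math and Physics both meet on Monday and overlap from 10 to 11, so any selection containing both violates constraint 2.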

2. Few-shot Prompting

In [4]:
messages_fewshot = [
    {
        "role": "system",
        "content": "You are a helpful academic advisor."
    },
    {
        "role": "user",
        "content": """
Example:
Courses available:
- History (3 credits, Mon 9–11)
- Biology (3 credits, Tue 9–11)
- Chemistry (3 credits, Wed 9–11)

Constraints:
- At least 6 credits
- No time conflicts

Output:
Selected courses:
- History
- Biology

Total credits: 6

Justification:
- No time conflicts
- Meets minimum credit requirement
"""
    },
    {
        "role": "user",
        "content": """
Now complete the following task:

Task:
You are an academic advisor.

A student must choose courses for the next semester with the following constraints:

Courses available:
- Math (3 credits, Mon 9–11)
- Physics (3 credits, Mon 10–12)
- Programming (4 credits, Tue 9–12)
- Statistics (3 credits, Wed 9–11)
- Economics (3 credits, Thu 10–12)

Constraints:
1. The student must take at least 9 credits.
2. No time conflicts are allowed.
3. Programming must be taken if possible.
4. The student prefers quantitative courses (Math, Physics, Statistics).

Output requirements (STRICT):
1. Selected courses (list)
2. Total credits
3. A brief justification (max 3 bullet points)

Do NOT show reasoning steps.
Do NOT include extra explanations.
"""
    }
]

3. Chain-of-thought Prompting

In [5]:
messages_cot = [
    {
        "role": "system",
        "content": "You are a helpful academic advisor."
    },
    {
        "role": "user",
        "content": """
Before answering, think step by step about course conflicts, credit requirements, and student preferences.
Do NOT reveal your reasoning.
Only provide the final answer.

Task:
You are an academic advisor.

A student must choose courses for the next semester with the following constraints:

Courses available:
- Math (3 credits, Mon 9–11)
- Physics (3 credits, Mon 10–12)
- Programming (4 credits, Tue 9–12)
- Statistics (3 credits, Wed 9–11)
- Economics (3 credits, Thu 10–12)

Constraints:
1. The student must take at least 9 credits.
2. No time conflicts are allowed.
3. Programming must be taken if possible.
4. The student prefers quantitative courses (Math, Physics, Statistics).

Output requirements (STRICT):
1. Selected courses (list)
2. Total credits
3. A brief justification (max 3 bullet points)

Do NOT show reasoning steps.
Do NOT include extra explanations.
"""
    }
]

4. Tree-of-thought Prompting

In [None]:
messages_tot = [
    {
        "role": "system",
        "content": "You are a helpful academic advisor."
    },
    {
        "role": "user",
        "content": """
Internally consider multiple possible valid course combinations,
compare them, and select the best one according to the constraints and preferences.
Do NOT show this comparison.
Only provide the final answer.

Task:
You are an academic advisor.

A student must choose courses for the next semester with the following constraints:

Courses available:
- Math (3 credits, Mon 9–11)
- Economics (3 credits, Mon 10–12)
- Programming (4 credits, Tue 9–12)
- Statistics (3 credits, Wed 9–11)
- Physics (3 credits, Thu 10–12)

Constraints:
1. The student must take at least 9 credits.
2. No time conflicts are allowed.
3. Programming must be taken if possible.
4. The student prefers quantitative courses (Math, Physics, Statistics).

Output requirements (STRICT):
1. Selected courses (list)
2. Total credits
3. A brief justification (max 3 bullet points)

Do NOT show reasoning steps.
Do NOT include extra explanations.
"""
    }
]
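
The prompts above share one task description and differ mainly in the instruction placed in front of it. A sketch of assembling them from a single shared string follows, which may help keep the variants from drifting apart when the task is edited. `SYSTEM`, `TASK`, and `build_messages` are illustrative names, and `TASK` is a placeholder rather than the full task text.

```python
SYSTEM = "You are a helpful academic advisor."

# Placeholder for the full task description used in the prompts above.
TASK = (
    "Task:\n"
    "You are an academic advisor.\n"
    "(course list, constraints, and output requirements go here)"
)


def build_messages(
    task: str,
    instruction: str = "",
    examples: tuple[str, ...] = (),
) -> list[dict]:
    """Assemble a chat prompt: system turn, optional worked examples, then the task."""
    messages = [{"role": "system", "content": SYSTEM}]
    # Each worked example is sent as its own user turn, mirroring the few-shot prompt above.
    for example in examples:
        messages.append({"role": "user", "content": example})
    # The strategy-specific instruction, if any, goes in front of the shared task.
    content = f"{instruction}\n\n{task}" if instruction else task
    messages.append({"role": "user", "content": content})
    return messages


variants = {
    "ZERO": build_messages(TASK),
    "COT": build_messages(
        TASK,
        "Before answering, think step by step about course conflicts, "
        "credit requirements, and student preferences.",
    ),
}
print(variants["COT"][-1]["content"])
```

With this layout, a change to the course list is made once in `TASK` and reaches every variant.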

In [9]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

Device set to use cuda


In [10]:
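def run_prompt(messages):
    """Render `messages` with the model's chat template and return the generated reply."""
    # Turn the list of role/content dicts into a single prompt string,
    # ending with the assistant header so the model starts its answer.
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    # Greedy decoding (do_sample=False) keeps the runs deterministic and comparable.
    output = pipe(
        prompt,
        max_new_tokens=400,
        do_sample=False,
        return_full_text=False  # return only the newly generated text
    )
    return output[0]["generated_text"]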
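
To make the role of `apply_chat_template` in `run_prompt` more concrete, here is a hand-written approximation of the string it builds. `render_phi3_style` is a hypothetical function, and the tag strings (`<|system|>`, `<|user|>`, `<|assistant|>`, `<|end|>`) are assumed from the chat format described on the Phi-3 model cards. The authoritative template is whatever ships with the tokenizer and may differ in detail.

```python
def render_phi3_style(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Hand-written approximation of a Phi-3-style chat template (not the real one)."""
    parts = [f"<|{m['role']}|>\n{m['content']}<|end|>\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)


demo = [
    {"role": "system", "content": "You are a helpful academic advisor."},
    {"role": "user", "content": "Which courses should I take?"},
]
print(render_phi3_style(demo))
```

Printing the real `prompt` inside `run_prompt` is the reliable way to see what the model actually receives.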

In [11]:
all_messages = {
    "ZERO": messages_zero,
    "FEWSHOT": messages_fewshot,
    "COT": messages_cot,
    "TOT": messages_tot,
}

results = {}

for name, msgs in all_messages.items():
    print(f"\n===== {name} =====")
    results[name] = run_prompt(msgs)
    print(results[name])

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.



===== ZERO =====
 Selected courses:
- Programming (4 credits)
- Math (3 credits)
- Physics (3 credits)
- Statistics (3 credits)

Total credits: 13

Justification:
- Programming course is included to meet the preference for quantitative courses and the constraint of taking it if possible.
- Math, Physics, and Statistics courses are selected to fulfill the preference for quantitative subjects and to reach the minimum of 9 credits.
- The schedule does not conflict as each course meets on different days.

===== FEWSHOT =====
 Selected courses:
- Math
- Physics
- Programming

Total credits: 10

Justification:
- Programming course included as per preference.
- Combined Math and Physics to meet quantitative course preference.
- Total credits exceed minimum requirement without time conflicts.

===== COT =====
 Selected courses: Programming, Math, Physics, Statistics
Total credits: 10

Justification:
- Programming course is included to meet the preference for quantitative courses and the const