# Introduction to Prompt Engineering with Large Language Models

In this notebook, you'll learn how to interact with large language models (LLMs) through prompts.

We will explore:

- Basic prompts
- Improving prompts with context
- Few-shot examples
- Using prompt templates

<a href="https://colab.research.google.com/github/cbadenes/semantic-report-search/blob/main/data/analysis/40_prompting_basics.ipynb" target="_parent">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>


In [2]:
from huggingface_hub import login
import getpass

token = getpass.getpass("🔑 Enter your Hugging Face token: ")
login(token)




### Parameters for `pipeline("text-generation", ...)`

- **model**: A preloaded model instance (e.g., one loaded with `AutoModelForCausalLM.from_pretrained`).
- **tokenizer**: The tokenizer that matches the model (e.g., one loaded with `AutoTokenizer.from_pretrained`).

#### Generation Parameters:

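- **max_length**:
  - Caps the total length of the input plus the generated output, in tokens.
  - Use this when you need a hard limit on the whole sequence.
  - ⚠️ Ignored when `max_new_tokens` is also set; the library then logs a warning and `max_new_tokens` takes precedence.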

- **max_new_tokens**:
  - Limits the number of *new* tokens the model should generate.
  - Use this instead of `max_length` when input varies in length.
  - Recommended for most prompting scenarios.

- **truncation**:
  - If `True`, cuts input to fit within model limits.
  - Useful when feeding long text as prompt.

- **do_sample**:
  - If `True`, the model samples from the probability distribution (randomness).
  - If `False`, it picks the most likely next token (greedy decoding).
  - Recommended: `True` + `temperature` tuning for creativity.

- **temperature**:
  - Controls the randomness of sampling.
  - Lower = more deterministic (e.g., 0.3), higher = more creative (e.g., 0.8).
  - Only has an effect when `do_sample=True`; greedy decoding ignores it.

- **return_full_text**:
  - If `True`, the result includes both the prompt and generated text.
  - If `False`, it returns only the model's output.
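
To make the effect of `temperature` concrete, here is a minimal sketch of temperature-scaled softmax in plain Python. The function name `softmax_with_temperature` and the three scores in `toy_logits` are made up for illustration; they are not taken from the model or from the Transformers library.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (illustration only).
toy_logits = [2.0, 1.0, 0.5]

for t in (0.3, 0.8):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature almost all of the probability mass lands on the highest-scoring token, so sampling behaves close to greedy decoding. At the higher temperature the alternatives keep a noticeable share, which is where the extra variety comes from.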


In [18]:
!pip install -q transformers accelerate

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id, token=token)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", token=token)

#generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512,          # Total max length of input + output (ignored when max_new_tokens is set)
    truncation=True,         # Truncate input if it exceeds model's max length
    do_sample=True,          # Enable sampling (randomness); set False for greedy decoding
    return_full_text=False,  # If True, returns input + output; if False, only generated part
    temperature=0.3,         # Controls randomness; lower = more deterministic
    max_new_tokens=100       # Default cap on newly generated tokens; can be overridden per call
)



Device set to use cuda:0
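
The pipeline above sets both `max_length` and `max_new_tokens`, which is why the later cells print a "Both `max_new_tokens` ... and `max_length` ... seem to have been set" warning. The sketch below models the arithmetic behind that precedence rule. `generation_budget` is a hypothetical helper, not a Transformers function, and the prompt length of 60 tokens is an invented number.

```python
def generation_budget(prompt_tokens, max_length=None, max_new_tokens=None):
    """Return (new_tokens_allowed, total_length_cap) for one generation call.

    Simplified model of the precedence rule described in this notebook;
    it does not call the library.
    """
    if max_new_tokens is not None:
        # max_new_tokens wins: the cap is counted from the end of the prompt.
        return max_new_tokens, prompt_tokens + max_new_tokens
    if max_length is not None:
        # Only max_length is set: the prompt uses up part of the shared budget.
        return max(max_length - prompt_tokens, 0), max_length
    raise ValueError("set max_length or max_new_tokens")

# Hypothetical prompt of 60 tokens.
print(generation_budget(60, max_length=512, max_new_tokens=100))  # (100, 160)
print(generation_budget(60, max_length=512))                      # (452, 512)
```

With `max_new_tokens` the output budget stays the same no matter how long the prompt is, which is the reason it is usually the more convenient setting when prompts vary in length.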


### Basic Prompt

In [19]:
prompt = (
    "<|system|>\nYou are a helpful assistant.\n"
    "<|user|>\nSummarize the purpose of a report that analyzes the distribution flows in feeder markets.\n"
    "<|assistant|>\n"
)
response = generator(prompt, max_new_tokens=200)[0]['generated_text']
print(response)


Both `max_new_tokens` (=200) and `max_length`(=512) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


A report analyzing the distribution flows in feeder markets is intended to provide insights into the distribution channels and the distribution flows within them. The purpose of such a report is to identify the key players, their distribution channels, and the distribution flows within them. The report may also provide insights into the competitive landscape, market trends, and industry dynamics. The report may also offer recommendations for improving distribution efficiency and reducing costs.
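
Every prompt in this notebook repeats the same `<|system|>`, `<|user|>`, and `<|assistant|>` tags by hand. A small helper can assemble that layout once. This is a sketch that mirrors the strings written in the cells here; `build_prompt` is a hypothetical name, and the layout is copied from this notebook rather than from the model's official chat template.

```python
def build_prompt(user_message, system_message="You are a helpful assistant."):
    """Assemble a prompt with the same role tags the notebook writes by hand."""
    return (
        f"<|system|>\n{system_message}\n"
        f"<|user|>\n{user_message}\n"
        "<|assistant|>\n"
    )

basic_prompt = build_prompt(
    "Summarize the purpose of a report that analyzes the distribution flows in feeder markets."
)
print(basic_prompt)
```

The resulting string can be passed to `generator(...)` exactly like the hand-written `prompt`. Transformers tokenizers also expose `apply_chat_template`, which renders a list of role/content messages with the template shipped alongside the model; its output may differ from the hand-written layout (for example in end-of-sequence markers), so it is worth printing and comparing before switching.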


### Improving Prompt with Context

In [20]:
prompt = (
    "<|system|>\nYou are a helpful assistant that classifies business reports based on their description.\n"
    "<|user|>\n"
    "Here's an example of a report description and its purpose:\n"
    "Description: 'This report provides details about average daily rate (ADR), occupancy, and revenue trends in top-performing hotels.'\n"
    "Purpose: 'To analyze revenue performance metrics in hospitality properties.'\n"
    "Now analyze this one:\n"
    "Description: 'Shows contribution of each channel to the market flow across different regions.'\n"
    "Purpose:"
    "\n<|assistant|>\n"
)

response = generator(prompt, max_new_tokens=300)[0]['generated_text']
print(response)


Both `max_new_tokens` (=300) and `max_length`(=512) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


This report provides insights into the contribution of each channel to the market flow across different regions. The report aims to analyze the effectiveness of different marketing strategies and identify areas of improvement for the hospitality industry. The report also highlights the impact of different marketing channels on sales and revenue. The report's purpose is to help hoteliers and marketers understand the overall impact of their marketing strategies and make informed decisions for future campaigns.


### Few-Shot Prompting

In [21]:

prompt = (
    "<|system|>\nYou are a helpful assistant that classifies business reports based on their description.\n"
    "<|user|>\n"
    "Below are examples of report descriptions and their business categories:\n"
    "Description: 'Tracks performance metrics for feeder markets and hotel destinations.'\nCategory: 'Market Performance'\n"
    "Description: 'Analyzes average revenue per room and booking channels.'\nCategory: 'Revenue Management'\n"
    "Description: 'Evaluates staffing and service efficiency across hotel departments.'\nCategory: 'Operations'\n"
    "Now define the category of this one:\n"
    "Description: 'Breakdown of regional demand for group bookings.'\nCategory:"
    "\n<|assistant|>\n"
)
response = generator(prompt, max_new_tokens=50)[0]['generated_text']
print(response)


Both `max_new_tokens` (=50) and `max_length`(=512) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


The report category for this description is 'Market Performance.'
The report category for the second description is 'Revenue Management.'
The report category for the third description is 'Operations.'
The report category for the fourth description is 'Mark
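
The few-shot block can also be generated from a list of (description, label) pairs, which makes it easier to add or swap examples. The output above also shows the model answering and then continuing with further lines, so a small post-processing step that keeps only the first line is often useful. Both helpers below, `build_few_shot_block` and `first_line`, are hypothetical names for this sketch, and the wording follows the prompt in the cell above.

```python
def build_few_shot_block(examples, query, label_name="Category"):
    """Format (description, label) pairs plus the new query as the user message."""
    lines = ["Below are examples of report descriptions and their business categories:"]
    for description, label in examples:
        lines.append(f"Description: '{description}'")
        lines.append(f"{label_name}: '{label}'")
    lines.append("Now define the category of this one:")
    lines.append(f"Description: '{query}'")
    lines.append(f"{label_name}:")
    return "\n".join(lines)

def first_line(response):
    """Keep only the first non-empty line of a model response."""
    for line in response.splitlines():
        if line.strip():
            return line.strip()
    return ""

examples = [
    ("Tracks performance metrics for feeder markets and hotel destinations.", "Market Performance"),
    ("Analyzes average revenue per room and booking channels.", "Revenue Management"),
    ("Evaluates staffing and service efficiency across hotel departments.", "Operations"),
]
user_block = build_few_shot_block(examples, "Breakdown of regional demand for group bookings.")
print(user_block)
```

Wrapping `user_block` in the `<|system|>` / `<|user|>` / `<|assistant|>` tags reproduces the structure of the prompt used above, and `first_line(response)` trims an answer such as the one printed above down to its first sentence.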


### CRISPE Prompt Template (Guided Practice)

In [22]:
prompt = (
    "<|system|>\nYou are a helpful assistant trained to classify hotel business reports based on their descriptions.\n"
    "<|user|>\n"
    "Context: You're working with a set of hotel business reports.\n"
    "Role: You are a data analyst assistant.\n"
    "Instructions: Classify the report into one of these categories: ['Market', 'Revenue', 'Operations', 'HR'].\n"
    "Steps:\n"
    "- Read the description\n"
    "- Identify the key business focus\n"
    "- Map to a category\n\n"
    "Parameters:\n"
    "- Use only one label\n"
    "- Be concise\n\n"
    "Example:\n"
    "Description: 'Shows ADR and occupancy over time per destination'\n"
    "Category: Revenue\n\n"
    "Now classify:\n"
    "Description: 'Summarizes guest complaints and staff resolution time'\n"
    "Category:"
    "\n<|assistant|>\n"
)

response = generator(prompt, max_new_tokens=300)[0]['generated_text']
print(response)


Both `max_new_tokens` (=300) and `max_length`(=512) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Based on the description, the report is classified as Revenue. The key focus is on guest complaints and staff resolution time.
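
The six fields used above (Context, Role, Instructions, Steps, Parameters, Example) lend themselves to a reusable template function. The sketch below rebuilds the user part of the prompt from arguments and adds a simple check that maps a free-text answer back onto the allowed labels. `crispe_prompt` and `pick_label` are hypothetical helpers written for this illustration.

```python
import re

LABELS = ["Market", "Revenue", "Operations", "HR"]

def crispe_prompt(context, role, instructions, steps, parameters, example, description):
    """Fill the Context/Role/Instructions/Steps/Parameters/Example layout used above."""
    step_lines = "\n".join(f"- {s}" for s in steps)
    param_lines = "\n".join(f"- {p}" for p in parameters)
    return (
        f"Context: {context}\n"
        f"Role: {role}\n"
        f"Instructions: {instructions}\n"
        f"Steps:\n{step_lines}\n\n"
        f"Parameters:\n{param_lines}\n\n"
        f"Example:\n{example}\n\n"
        "Now classify:\n"
        f"Description: '{description}'\n"
        "Category:"
    )

def pick_label(response, labels):
    """Return the allowed label mentioned earliest in the response, or None."""
    best = None
    for label in labels:
        match = re.search(rf"\b{re.escape(label)}\b", response, flags=re.IGNORECASE)
        if match and (best is None or match.start() < best[0]):
            best = (match.start(), label)
    return best[1] if best else None

user_block = crispe_prompt(
    context="You're working with a set of hotel business reports.",
    role="You are a data analyst assistant.",
    instructions=f"Classify the report into one of these categories: {LABELS}.",
    steps=["Read the description", "Identify the key business focus", "Map to a category"],
    parameters=["Use only one label", "Be concise"],
    example="Description: 'Shows ADR and occupancy over time per destination'\nCategory: Revenue",
    description="Summarizes guest complaints and staff resolution time",
)
print(user_block)
```

Applied to the response printed above, `pick_label(response, LABELS)` would return `"Revenue"`. A check like this only recovers which label the model named; it says nothing about whether that label is the right one, so the result still needs to be reviewed against the description.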
