# **🧠 LLM Prompt Engineering Techniques with Open-Source Models on Kaggle**

> **Goal:** Learn and apply powerful prompt engineering techniques to improve LLM outputs **without fine-tuning**, using **open-source models available on Kaggle**.

---

## **🚀 Why This Notebook Matters**

* 🔥 Prompt Engineering is one of the **most in-demand GenAI skills**
* ⚡ Improve results **instantly** without training or GPUs
* 🤗 Uses **open-source LLMs** runnable on Kaggle
* 📦 Reusable prompt templates for real-world tasks

---

## **📚 What You Will Learn**

* Zero-shot vs Few-shot prompting
* Chain-of-Thought reasoning
* Role-based prompting
* Output formatting (JSON, tables)
* Prompt templates you can reuse anywhere

---

## **🛠️ Setup & Imports**

In [1]:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
import warnings
warnings.filterwarnings('ignore')

---

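## **🤖 Load an Open-Source LLM (Phi-2)**

> We start with **Microsoft Phi-2**: a lightweight 2.7B-parameter model that runs on Kaggle. It is a base model without instruction tuning, so it tends to continue a prompt like a document rather than answer it directly.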

In [2]:
model_name = "microsoft/phi-2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    dtype=torch.float16,  # `torch_dtype` is deprecated in recent transformers; older releases still expect it
    device_map="auto"     # place weights on a GPU if one is available, else CPU
)

# Text-generation pipeline; every call produces at most 200 new tokens.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=200
)

Device set to use cpu


---

## **🧪 Helper Function for Clean Outputs**

In [3]:
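def generate(prompt):
    """Run `pipe` on `prompt` and return only the newly generated text."""
    # The pipeline's `generated_text` echoes the prompt, so slice it off.
    response = pipe(prompt)[0]["generated_text"]
    return response[len(prompt):].strip()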
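
The outputs in the following sections show Phi-2 running past the answer into new "Exercise" or "Q:" blocks. A minimal sketch of a stricter helper follows. `make_generate` and its `stop_markers` are hypothetical names chosen for illustration, not part of `transformers`; the wrapper accepts any callable shaped like the text-generation pipeline, so it can be tried with a stub before a model is loaded.

```python
def make_generate(pipe, stop_markers=("\nExercise", "\nQ:")):
    """Wrap a text-generation pipeline so outputs are trimmed at stop markers.

    `make_generate` and `stop_markers` are illustrative names, not part of
    transformers; `pipe` is any callable returning
    [{"generated_text": "<prompt + completion>"}].
    """
    def generate(prompt):
        text = pipe(prompt)[0]["generated_text"]
        # Drop the echoed prompt only if it is really there.
        if text.startswith(prompt):
            text = text[len(prompt):]
        # Cut at the earliest stop marker, if any marker occurs.
        cut_points = [text.find(m) for m in stop_markers if m in text]
        if cut_points:
            text = text[:min(cut_points)]
        return text.strip()
    return generate


# Stub standing in for the real pipeline, for demonstration only.
def fake_pipe(prompt):
    return [{"generated_text": prompt + " Overfitting is memorizing noise.\nExercise 3:\n..."}]


generate_trimmed = make_generate(fake_pipe)
print(generate_trimmed("Explain overfitting."))
```

If a marker appears at the very start of the completion, the helper returns an empty string, which is a hint that the prompt itself needs restructuring.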

---

# **🔹 Technique 1: Zero-Shot Prompting**

> **Zero-shot** = No examples, just instructions.

### **❌ Basic Prompt**

In [4]:
prompt = "Explain overfitting in machine learning."
print(generate(prompt))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Exercise 3:
What are some methods to prevent overfitting?

Exercise 4:
What is regularization?

Exercise 5:
How does cross-validation help in preventing overfitting?

Answer 1:
Overfitting occurs when a machine learning model learns the training data too well and fails to generalize to new, unseen data. It's like a cat that has been trained to recognize only a specific type of mouse. While the cat may be excellent at catching that particular mouse, it will struggle when faced with other mice.

Answer 2:
Overfitting can lead to incorrect predictions or classifications when the model encounters new data. Just like a cat that has become too focused on one type of mouse, the overfit model may struggle to recognize other patterns and make accurate predictions.

Answer 3:
Some methods to prevent overfitting include regularization, which adds a penalty term to the model's objective function, and cross


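### **⚠️ Problem**

* The model continues the prompt like a textbook page ("Exercise 3", "Answer 1") instead of answering directly
* No control over length or structure
* The response is cut off mid-sentence at `max_new_tokens=200`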
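
Because Phi-2 is a base model, wrapping the instruction in a question-answer frame can give it a clearer slot to fill. The sketch below assumes the `Instruct: ... Output:` format described on the Phi-2 model card; `zero_shot_prompt` is a hypothetical helper name and the constraint wording is illustrative.

```python
def zero_shot_prompt(instruction, constraints=()):
    """Build a zero-shot prompt in an 'Instruct: ... Output:' frame.

    `zero_shot_prompt` is a hypothetical helper; the frame follows the QA
    format described on the Phi-2 model card (an assumption worth checking
    against the card when switching models).
    """
    lines = [f"Instruct: {instruction.strip()}"]
    # Constraints are appended to the instruction, still with no examples.
    for c in constraints:
        lines.append(f"- {c}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = zero_shot_prompt(
    "Explain overfitting in machine learning.",
    constraints=["Use at most 3 sentences", "Give one concrete example"],
)
print(prompt)
```

The prompt still contains no examples, so it remains zero-shot; only the framing and the explicit constraints change.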

---

# **🔹 Technique 2: Few-Shot Prompting**

> Few-shot prompts **guide the model using examples**.

### **✅ Few-Shot Prompt**

In [5]:
prompt = """
Q: What is overfitting?
A: Overfitting occurs when a model memorizes training data instead of learning patterns.

Q: What is underfitting?
A: Underfitting happens when a model is too simple to capture patterns.

Q: Explain overfitting in machine learning.
A: Answer in 3 sentences only.
"""

print(generate(prompt))

Q: Explain underfitting in machine learning.
A: Answer in 3 sentences only.

Q: How can we avoid overfitting and underfitting?
A: We can avoid overfitting and underfitting by using a regularization technique.

Q: What is a regularization technique?
A: Answer in 3 sentences only.

Q: How does regularization help in machine learning?
A: Answer in 3 sentences only.

Section 2: Machine Learning in Daily Life

Now that we understand the basics of machine learning, let's look at some examples of how it is used in our daily lives.

Example 1: Home Security System

John and Mary live in a house with a security system. The security system uses machine learning to detect intruders. The system captures images of people entering the house and uses pattern recognition to identify known intruders. If the system detects a new person, it will send an alert to John and Mary


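### **✅ Improvement**

* The examples set the expected layout: the model keeps producing `Q:` / `A:` pairs
* In the run above, the final `A:` slot already holds the text "Answer in 3 sentences only.", so Phi-2 treats the question as answered and invents new questions instead
* Put instructions before the examples and leave the final `A:` empty so the model has a slot to complete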
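
A minimal sketch of a few-shot prompt builder that keeps the instruction out of the answer slot and ends on an empty `A:`. The helper name `few_shot_prompt` and the instruction wording are assumptions for illustration; the two examples are the ones used in the cell above.

```python
def few_shot_prompt(examples, question, instruction=None):
    """Build a Q/A few-shot prompt that ends with an empty 'A:' slot.

    `few_shot_prompt` is a hypothetical helper; `examples` is a list of
    (question, answer) pairs.
    """
    parts = []
    if instruction:
        # Instructions go before the examples, not inside the answer slot.
        parts.append(instruction.strip())
    for q, a in examples:
        parts.append(f"Q: {q}\nA: {a}")
    # Leave the last answer empty so the model has something to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


examples = [
    ("What is overfitting?",
     "Overfitting occurs when a model memorizes training data instead of learning patterns."),
    ("What is underfitting?",
     "Underfitting happens when a model is too simple to capture patterns."),
]
prompt = few_shot_prompt(
    examples,
    "Explain overfitting in machine learning.",
    instruction="Answer each question in 3 sentences or fewer.",
)
print(prompt)
```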

---

# **🔹 Technique 3: Chain-of-Thought Prompting**

> Ask the model to **think step by step**.

### **❌ Without CoT**

In [6]:
prompt = "If a dataset has 100 samples and 80 are used for training, how many for testing?"
print(generate(prompt))

*Hint:* The testing consists of the remaining samples.

```python
# Solution
total_samples = 100
training_samples = 80
test_samples = total_samples - training_samples
print(test_samples)  # Outputs: 20
```

4. **Exercise 4:**
You have a dataset with 5 features. How many features will your model have if one feature is discarded?

    *Hint:* The number of features left after discarding one feature.

```python
# Solution
total_features = 5
discarded_feature = 1
remaining_features = total_features - discarded_feature
print(remaining_features)  # Outputs: 4
```

5. **Exercise 5:**
You're analyzing a time series dataset with 60 samples. If you want to split the data into


### **✅ With Chain-of-Thought**

In [7]:
prompt = """
If a dataset has 100 samples and 80 are used for training, how many for testing?

Let's think step by step.
"""

print(generate(prompt))

To determine what is irrelevant to the number of samples used for testing, we need to consider the following lists and judge their relevance one by one: 
(1) The size of the dataset 
(2) The size of the training set 
(3) The size of the testing set 
(4) The type of machine learning algorithm used 
(5) The type of data in the dataset 
(6) The number of features in the dataset 
(7) The number of classes in the dataset 
(8) The distribution of the data 
(9) The complexity of the model 
(10) The accuracy of the model 
(11) The size of the testing set in relation to the training set 
(12) The randomness of the sampling process 
(13) The type of metrics used to evaluate the model 
(14) The size of the training set in relation to the whole dataset


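### **🚀 Result**

* The step-by-step cue is meant to improve accuracy on math and logic questions
* In the runs above it did not: the plain prompt reached 100 - 80 = 20, while the CoT prompt produced an unrelated checklist and no answer
* Check CoT per model and per task instead of assuming a gain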
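
One way to check whether a step-by-step prompt actually produced the right number is to request a fixed final line and parse it. The sketch below is an assumption-laden illustration: `cot_prompt` and `extract_answer` are hypothetical helpers, and `sample_completion` is written by hand to exercise the parser, not produced by Phi-2.

```python
import re


def cot_prompt(question):
    """Append a step-by-step cue and ask for a final 'Answer:' line."""
    return (
        f"Question: {question.strip()}\n"
        "Let's think step by step, then give the result on a final line "
        "starting with 'Answer:'.\n"
    )


def extract_answer(completion):
    """Return the number after the last 'Answer:' marker, or None."""
    matches = re.findall(r"Answer:\s*(-?\d+(?:\.\d+)?)", completion)
    return float(matches[-1]) if matches else None


# Hypothetical completion, written by hand to exercise the parser.
sample_completion = (
    "The dataset has 100 samples.\n"
    "80 are used for training, so 100 - 80 remain.\n"
    "Answer: 20"
)
expected = 100 - 80  # the arithmetic the question asks for
print(extract_answer(sample_completion) == expected)
```

A completion with no `Answer:` line yields `None`, which makes runs like the one above easy to flag automatically.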

---

# **🔹 Technique 4: Role-Based Prompting**

> Assign a **role** to control tone, depth, and expertise.

In [8]:
prompt = """
You are a senior data scientist mentoring a beginner.
Explain overfitting in simple terms with a real-life example.
"""

print(generate(prompt))

Answer:
Overfitting is like memorizing a friend's entire speech word by word, rather than understanding the main ideas. It happens when a model becomes too focused on the specific data it was trained on, losing its ability to generalize. Just like in a speech, if you memorize every single word, you may stumble and forget the essence of the message.

Exercise 5:
Describe the role of a data scientist in an organization.
Answer:
A data scientist is like a detective in an organization. They collect, analyze, and interpret vast amounts of data to uncover patterns, insights, and trends. They use their expertise to solve complex problems, make informed decisions, and drive innovation within the organization. Just like a detective, a data scientist gathers clues and evidence to understand the story hidden within the numbers.

I hope this abnormal reasoning exercise has shed some light on the fascinating world of education, data science, and data visualization. Remember, just like a


### **🎯 Why it Works**

* Improves clarity
* Controls explanation level
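
To compare roles on the same task, the role sentence can be factored out. A small sketch, assuming a hypothetical `role_prompt` helper; the second role/audience pair is invented for illustration.

```python
def role_prompt(role, audience, task):
    """Compose a role-based prompt; all three arguments are free text."""
    return (
        f"You are {role} mentoring {audience}.\n"
        f"{task.strip()}\n"
        "Answer:"
    )


# Same task, two roles: only the framing sentence changes.
task = "Explain overfitting in simple terms with a real-life example."
for role, audience in [
    ("a senior data scientist", "a beginner"),
    ("a university lecturer", "graduate students"),
]:
    print(role_prompt(role, audience, task))
    print("---")
```

Holding the task fixed while varying only the role makes it easier to see how much of a change in tone or depth comes from the role itself.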

---

# **🔹 Technique 5: Output Formatting Control**

> Ask the model to respond in **JSON / tables / structured output**. A prompt requests a format; it does not guarantee one, so parse and validate the result.

### **✅ JSON Output**

In [9]:
prompt = """
Explain overfitting in JSON format with keys:
- definition
- cause
- solution
"""

print(generate(prompt))

Solution:

{
  "definition": "Overfitting occurs when a machine learning model is too complex and fits the training data too closely, resulting in poor performance on new, unseen data.",
  "cause": "Overfitting can happen when the model has too many features, or when the training data is not representative of the true data distribution.",
  "solution": "One way to prevent overfitting is to use regularization techniques, such as adding a penalty term to the loss function that discourages large weights. Another way is to use cross-validation to evaluate the model on multiple subsets of the data and choose the best parameters."
}

Exercise 5:

Write a Python function that takes a dataset as input and returns the accuracy of a machine learning model trained on that data. The function should split the dataset into training and testing sets, train the model on the training set, and evaluate its accuracy on the testing set.
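
The response above wraps the JSON in extra text ("Solution:" before it, a new exercise after it), so passing the whole string to `json.loads` would fail. A minimal sketch of pulling out the first valid object with the standard library; `extract_json` is a hypothetical helper, and `raw` is an abbreviated stand-in for the model output, not a new result.

```python
import json


def extract_json(text, required_keys=()):
    """Return the first JSON object in `text` that has `required_keys`, or None.

    Scans for '{' and lets the standard-library decoder find where the
    object ends, so text before and after the object is ignored.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not a valid object at this brace; try the next one.
            start = text.find("{", start + 1)
            continue
        if isinstance(obj, dict) and all(k in obj for k in required_keys):
            return obj
        start = text.find("{", start + 1)
    return None


# Abbreviated stand-in for a model response with text around the JSON.
raw = (
    'Solution:\n\n{"definition": "fits training data too closely", '
    '"cause": "too many features", "solution": "regularization"}\n\n'
    "Exercise 5:\n..."
)
parsed = extract_json(raw, required_keys=("definition", "cause", "solution"))
print(sorted(parsed))
```

Returning `None` instead of raising keeps the caller in control: it can retry the prompt, log the failure, or fall back to plain text.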


### **✅ Table Output**

In [10]:
prompt = """
Create a table comparing overfitting and underfitting with columns:
Model Behavior | Cause | Solution
"""

print(generate(prompt))

--- | --- | ---
Overfitting | Too complex model with excessive parameters | Reduce complexity by using a simpler model
Underfitting | Too simple model with insufficient parameters | Increase complexity by using a more complex model

Exercise 2: 
What is the difference between a linear and a nonlinear regression line?

Solution:
A linear regression line is a straight line that best fits a linear relationship between the dependent and independent variables. The equation for a linear regression line is y = mx + b, where m is the slope of the line and b is the y-intercept. A nonlinear regression line, on the other hand, is a curve that best fits a nonlinear relationship between the dependent and independent variables. The equation for a nonlinear regression line is y = a + bx + cx^2 + dx^3 +..., where a, b, c, d,... are the coefficients that determine the shape of the curve.

Follow
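
The table rows above are followed by unrelated text, so downstream code needs to keep only the lines that match the requested columns. A sketch under stated assumptions: `parse_pipe_table` is a hypothetical helper, and `raw` is a shortened stand-in for the response, not a new model output.

```python
def parse_pipe_table(text, columns):
    """Parse 'a | b | c' rows into dicts keyed by `columns`.

    Separator rows such as '--- | --- | ---' and lines with a different
    number of cells are skipped.
    """
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != len(columns):
            continue
        if all(set(c) <= set("-: ") for c in cells):
            continue  # markdown separator row (or an all-empty row)
        rows.append(dict(zip(columns, cells)))
    return rows


# Abbreviated stand-in for a model response: table rows, then stray text.
raw = (
    "--- | --- | ---\n"
    "Overfitting | Too complex model | Use a simpler model\n"
    "Underfitting | Too simple model | Use a more complex model\n"
    "\n"
    "Exercise 2:\n"
)
columns = ["Model Behavior", "Cause", "Solution"]
for row in parse_pipe_table(raw, columns):
    print(row["Model Behavior"], "->", row["Solution"])
```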


---

# **🔹 Technique 6: Prompt Templates (Reusable)**

> Templates save time and ensure consistent results.

In [11]:
PROMPT_TEMPLATE = """
You are an expert {role}.

Task: {task}

Constraints:
- Be concise
- Use bullet points
- Avoid jargon
"""

prompt = PROMPT_TEMPLATE.format(
    role="machine learning instructor",
    task="Explain bias-variance tradeoff"
)

print(generate(prompt))

- Explain in layman's terms
- Give an example

Question: Why is it important to balance bias and variance in machine learning algorithms?



Bias: is the error introduced by approximating a real-world problem with a simplified model.

Variance: is the error introduced by the model's sensitivity to changes in the training data.

If the bias is too high, the model will be too simplistic and will not be able to capture the complexity of the real-world problem.

If the variance is too high, the model will be too sensitive to changes in the training data, and will not be able to generalize well to new data.

If the model is too simplistic, it will not be able to capture the complexity of the real-world problem, and its predictions will be inaccurate.

If the model is too sensitive to changes in the training data, it will not be able to generalize well to new
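
Once several templates exist, a forgotten placeholder is an easy mistake to make. A minimal sketch of a template library that reports missing fields before formatting; `template_fields`, `render`, and the `PROMPTS` entries are hypothetical names and wording chosen for illustration.

```python
from string import Formatter


def template_fields(template):
    """Return the set of {placeholders} used in a format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render(template, **values):
    """Fill a template, failing early with the names of missing fields."""
    missing = template_fields(template) - set(values)
    if missing:
        raise KeyError(f"missing template fields: {sorted(missing)}")
    return template.format(**values)


# A small prompt library; names and wording are illustrative.
PROMPTS = {
    "explain": "You are an expert {role}.\n\nTask: {task}\n\nAnswer:",
    "compare": "You are an expert {role}.\n\nCompare {a} and {b} in a table.\n",
}

print(render(PROMPTS["explain"],
             role="machine learning instructor",
             task="Explain bias-variance tradeoff"))
```

Ending the template with an explicit `Answer:` line gives a base model a slot to complete, instead of leaving it free to extend the constraint list as in the run above.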



---

# **📊 Before vs After Summary**

| Technique         | Output Quality (author's subjective rating) |
| ----------------- | ------------------------------------------- |
| Zero-shot         | ⭐⭐                                          |
| Few-shot          | ⭐⭐⭐⭐                                        |
| Chain-of-Thought  | ⭐⭐⭐⭐⭐                                       |
| Role Prompting    | ⭐⭐⭐⭐                                        |
| Structured Output | ⭐⭐⭐⭐⭐                                       |

---

## **🧠 Key Takeaways**

* Prompt engineering can be a **fast alternative to fine-tuning** for many tasks
* Small wording changes → **large differences in output**
* The techniques carry over to **most modern LLMs**, though results vary by model

---

## **🚀 What to Try Next**

* Test with **Gemma** or **Mistral**
* Combine multiple techniques
* Build prompt libraries

---

### **⭐ If you found this helpful, consider upvoting & saving!**

Happy Prompting 🚀
