## Lab Assignment - 2

#### Implement all Prompt Engineering approaches: Interview Approach, CoT, ToT. Compare and contrast them and analyse their applications. Implement zero-shot and few-shot prompting and compare their results.


---

# About the Model Used

## Model Name:
gpt-oss-120b

## Parameter Size:
120 Billion Parameters (120B)

## Hosted Via:
Clarifai (OpenAI-Compatible API format)

## Running Inference via Clarifai
You will need an API key, which you can get from this [website](https://clarifai.com/openai/chat-completion/models/gpt-oss-120b?tab=runWithApi).







In [None]:
import os
from openai import OpenAI
from google.colab import userdata

# Get Clarifai API Key from Colab secrets
CLARIFI_API = userdata.get('CLARIFI_API')

# Initialize client
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=CLARIFI_API,
)

# Model URL
MODEL = "https://clarifai.com/openai/chat-completion/models/gpt-oss-120b/versions/f1d2ad8c01c74705868f5c8ae4a1ff7c"

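def call_llm(messages, temperature=0.7):
    """Send a list of chat messages to the model and return the reply text."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=temperature,
        stream=False,
    )
    # Non-streaming responses carry the full reply in the first choice.
    return response.choices[0].message.content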
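
The setup cell above imports `google.colab`, which is only available inside Colab. A minimal sketch of a fallback, assuming the key is also exported as an environment variable with the same name (`get_api_key` is a hypothetical helper, not part of the original notebook):

```python
import os


def get_api_key(name="CLARIFI_API"):
    """Return the API key from Colab secrets, or from the environment otherwise."""
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata
    except ImportError:
        # Assumption: outside Colab the key is set as an environment variable.
        key = os.environ.get(name)
        if not key:
            raise RuntimeError(f"Set the {name} environment variable first.")
        return key
    return userdata.get(name)
```

With this helper, `api_key=get_api_key()` could replace the direct `userdata.get(...)` call when the notebook is run locally.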


# Interview Approach

The Interview Approach gathers information step-by-step before generating a final response.

Instead of directly answering, the model asks clarifying questions to better understand context.



In [None]:
interview_messages = [
    {"role": "system", "content": "You are a tech assistant."},
    {"role": "user", "content": "Before suggesting a laptop for me, ask 3 questions to understand my needs."}
]

print("Interview Approach Output")
print(call_llm(interview_messages))

Interview Approach Output
Sure! To recommend the best laptop for you, could you let me know:

1. What will you primarily use the laptop for (e.g., gaming, video editing, software development, everyday office tasks, student work, etc.)?  
2. Do you have any preferences or requirements for size, weight, or battery life (e.g., need something ultra‑portable, or a larger screen for multitasking)?  
3. What’s your budget range and any specific features you’d like (e.g., touch screen, dedicated graphics, high‑resolution display, specific ports, etc.)?


Observation:
The model asks clarifying questions instead of directly providing a solution.
This demonstrates controlled conversational prompting.
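
The cell above stops after the model asks its questions. A minimal sketch of how the interview could continue to a final recommendation, assuming the user's answers are supplied as one string (`run_interview` and its `llm` parameter are hypothetical; `llm` can be any callable with the same shape as `call_llm`):

```python
def run_interview(llm, answers, system="You are a tech assistant."):
    """Ask clarifying questions first, then request a recommendation.

    llm     -- callable taking a list of chat messages and returning text
    answers -- the user's reply to the clarifying questions
    """
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": "Before suggesting a laptop for me, "
                                    "ask 3 questions to understand my needs."},
    ]
    questions = llm(messages)  # turn 1: the model asks its questions
    # Keep the questions in the history so the model sees its own turn.
    messages.append({"role": "assistant", "content": questions})
    messages.append({"role": "user",
                     "content": answers + "\nNow recommend a laptop."})
    recommendation = llm(messages)  # turn 2: the final answer
    return questions, recommendation
```

In the notebook this could be called as `run_interview(call_llm, "...your answers...")`; the key design point is that the model's questions are appended as an `assistant` message before the answers are sent.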


# Chain of Thought (CoT)

Chain of Thought prompting asks the model to reason step-by-step before giving a final answer.

Useful for:
- Mathematical reasoning
- Logical problems
- Multi-step tasks

In [None]:
cot_messages = [
    {"role": "system", "content": "Solve step by step using plain text only. Do not use LaTeX formatting."},
    {"role": "user", "content": "What is 25 × 12? Let's think step by step."}
]

print("Chain of Thought Output")
print(call_llm(cot_messages))


Chain of Thought Output
To multiply 25 by 12, break the problem into simpler parts.

1. **Separate the multiplier (12) into tens and ones**  
   - 12 = 10 + 2  

2. **Multiply 25 by each part**  
   - 25 × 10 = 250 (just add a zero to 25)  
   - 25 × 2 = 50 (because 25 + 25 = 50)  

3. **Add the two results together**  
   - 250 + 50 = 300  

So, 25 × 12 = **300**.


Observation:
The model explains intermediate reasoning steps before giving the final answer.
This makes the reasoning easier to inspect and often improves accuracy on multi-step problems.
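
Because a CoT reply mixes reasoning with the result, it can help to pull out the final answer and check it. A minimal sketch, assuming the answer is a non-negative integer and is the last number in the text (`extract_final_number` is a hypothetical helper; the sample string is taken from the output above):

```python
import re


def extract_final_number(cot_text):
    """Return the last integer that appears in a chain-of-thought answer."""
    # Assumption: the final answer is the last number in the text.
    numbers = re.findall(r"\d+", cot_text.replace(",", ""))
    if not numbers:
        raise ValueError("no number found in the model output")
    return int(numbers[-1])


sample = "3. Add the two results together: 250 + 50 = 300\n\nSo, 25 × 12 = **300**."
answer = extract_final_number(sample)
print(answer, answer == 25 * 12)  # prints: 300 True
```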


# Tree of Thought (ToT)

Tree of Thought prompting explores multiple reasoning paths before selecting the best solution.

Unlike CoT (linear reasoning), ToT generates multiple solution branches and evaluates them.


In [None]:
tot_messages = [
    {"role": "system", "content": "You are a student mentor."},
    {"role": "user", "content": """
A student wants to improve exam marks.

Generate 3 possible strategies.
Briefly compare them.
Select the best one.
"""}
]

print("Tree of Thought Output")
print(call_llm(tot_messages))


Tree of Thought Output
**Three Strategies for Boosting Exam Marks**

| # | Strategy | Core Idea | Main Benefits | Potential Drawbacks |
|---|----------|-----------|---------------|----------------------|
| 1 | **Active‑Recall + Spaced‑Repetition System** (e.g., flashcards, Anki) | Instead of rereading notes, you repeatedly test yourself on the material, and the intervals between reviews grow as you demonstrate mastery. | • Proven to increase long‑term retention <br>• Makes study time highly efficient <br>• Gives immediate feedback on what you truly know | • Requires initial set‑up time to create good cards <br>• May feel “hard” at first, discouraging some learners |
| 2 | **Structured Study Schedule + Regular Practice Exams** | Build a weekly calendar that blocks dedicated “focus blocks” for each subject, and embed timed practice tests every 1‑2 weeks. | • Encourages consistent effort and reduces cramming <br>• Practice exams reveal knowledge gaps and improve test‑taking stamina <br>• 

Observation:
The model generates multiple reasoning branches,
evaluates them, and selects the best one, all within a single response.

This demonstrates structured multi-path reasoning.
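
The prompt above approximates ToT inside one response. A minimal sketch of a one-level version that uses separate calls to generate, score, and select branches, assuming the model replies to the rating prompt with a bare number (`tree_of_thought`, its prompts, and the 1 to 10 scale are illustrative choices, not a standard API):

```python
def tree_of_thought(llm, problem, n_branches=3):
    """One-level ToT sketch: propose branches, score each, keep the best.

    llm -- callable taking a list of chat messages and returning text
    """
    branches = []
    for i in range(n_branches):
        # Step 1: generate one candidate strategy per call.
        prompt = (f"{problem}\nPropose strategy #{i + 1}. "
                  "Make it different from common first ideas.")
        branches.append(llm([{"role": "user", "content": prompt}]))

    scored = []
    for branch in branches:
        # Step 2: evaluate each branch independently of the others.
        prompt = (f"Problem: {problem}\nStrategy: {branch}\n"
                  "Rate this strategy from 1 to 10. Reply with the number only.")
        reply = llm([{"role": "user", "content": prompt}])
        try:
            score = float(reply.strip())
        except ValueError:
            score = 0.0  # assumption: unparseable ratings count as worst
        scored.append((score, branch))

    # Step 3: select the highest-scoring branch.
    return max(scored, key=lambda pair: pair[0])[1]
```

In the notebook this could be called as `tree_of_thought(call_llm, "A student wants to improve exam marks.")`. A fuller ToT would expand the best branches further and repeat the scoring at each level.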


# Comparison: Interview vs CoT vs ToT

| Technique | Reasoning Type | Best For |
|------------|---------------|----------|
| Interview Approach | Iterative questioning | Requirement clarification |
| Chain of Thought | Linear step-by-step reasoning | Math & logic |
| Tree of Thought | Multi-path reasoning | Strategic decisions |

Conclusion:
Prompt structure significantly influences reasoning behavior without changing the underlying model.


# Zero-shot vs Few-shot Prompting


## Zero-Shot Prompting

**Zero-shot = no examples given.**
You just tell the model **what to do**, and it relies purely on its pre-training.

### Idea

> “You already know this stuff. Just do it.”

### Example

```text
Classify the sentiment of this text as Positive, Negative, or Neutral:
"I love how fast this app is."
```


In [None]:
zero_shot_messages = [
    {
        "role": "system",
        "content": (
            "You are a restaurant review classifier. "
            "For each review, output sentiment and primary issue. "
            "Format: <Sentiment> - <Issue>. Output only this."
        ),
    },
    {
        "role": "user",
        "content": (
            "I never leave this place disappointed. All the dishes are good and well served. "
            "The staff is attentive and suggests good menu options."
        ),
    },
]

zero_shot_output = call_llm(zero_shot_messages)
print("ZERO-SHOT OUTPUT:", zero_shot_output)


ZERO-SHOT OUTPUT: Positive - Food


## Few-Shot Prompting

**Few-shot = give a few examples** before asking the actual question.

### Idea

> “Here’s the pattern I want. Follow it.”

### Example

```text
Classify the sentiment of the text.

Text: "This product is terrible."
Sentiment: Negative

Text: "Amazing experience, will buy again!"
Sentiment: Positive

Text: "The app is okay, nothing special."
Sentiment:
```

The model infers the **structure + logic** from the examples in the prompt; its weights are not updated.

### When to use

* Task is **nuanced or custom**
* You need **consistent formatting**
* Domain-specific logic (finance, medical, logs, code, etc.)

### Limitations

* Longer prompts → higher cost
* Too many examples can confuse or bias the model
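
With a chat API, few-shot examples are usually passed as alternating `user` and `assistant` messages. A minimal sketch of a builder for that layout (`build_few_shot_messages` is a hypothetical helper; the example texts are the ones from the prompt above):

```python
def build_few_shot_messages(system_prompt, examples, query):
    """Turn (input, label) pairs into alternating user/assistant messages."""
    messages = [{"role": "system", "content": system_prompt}]
    for text, label in examples:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    # The real query goes last, with no assistant reply after it.
    messages.append({"role": "user", "content": query})
    return messages


examples = [
    ("This product is terrible.", "Negative"),
    ("Amazing experience, will buy again!", "Positive"),
]
messages = build_few_shot_messages(
    "Classify the sentiment of the text as Positive, Negative, or Neutral.",
    examples,
    "The app is okay, nothing special.",
)
print(len(messages))  # 1 system + 2 per example + 1 query = 6
```

Passing an empty `examples` list yields the zero-shot form of the same prompt, which makes the two setups easy to compare.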


In [None]:
few_shot_messages = [
    {
        "role": "system",
        "content": (
            "You are a restaurant review classifier. "
            "For each review, output sentiment and primary issue. "
            # Trailing space keeps this sentence from running into the next one.
            "Format: <Sentiment> - <Issue>. Output only this. "
            "None can new be the output, Give whatever you think is best below are some examples"
        ),
    },

    {
        "role": "user",
        "content": (
            "Overall okay. Biriyani was not as good. The rest of the items were good. "
            "Kadai paneer was too high a price for the quantity served. Ambiance is nice."
        ),
    },
    {"role": "assistant", "content": "Positive - High Price"},

    {
        "role": "user",
        "content": (
            "There's no menu card. Staff has to say the menu which makes customers uncomfortable. "
            "Overall taste wise average and overhyped and overpriced."
        ),
    },
    {"role": "assistant", "content": "Negative - Menu Card Issue"},

    {
        "role": "user",
        "content": (
            "Food tastes good but the service is very slow and waiting time is high."
        ),
    },
    {"role": "assistant", "content": "Negative - Slow Service"},

    {
        "role": "user",
        "content": (
            "The ambiance is beautiful and the staff is polite, but the food quality is inconsistent."
        ),
    },
    {"role": "assistant", "content": "Positive - Food Consistency"},
    {
        "role": "user",
        "content": (
            "I never leave this place disappointed. All the dishes are good and well served. "
            "The staff is attentive and suggests good menu options."
        ),
    },
]

output = call_llm(few_shot_messages)
print("MODEL OUTPUT:", output)

MODEL OUTPUT: Positive - No Issue
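
To compare the two recorded outputs field by field, the `<Sentiment> - <Issue>` format can be split apart. A minimal sketch, assuming the model followed the requested format exactly (`parse_label` is a hypothetical helper; the two strings are the outputs recorded above):

```python
def parse_label(output):
    """Split a '<Sentiment> - <Issue>' string into its two fields."""
    sentiment, sep, issue = output.partition(" - ")
    if not sep:
        raise ValueError(f"unexpected format: {output!r}")
    return sentiment.strip(), issue.strip()


zero_shot = parse_label("Positive - Food")
few_shot = parse_label("Positive - No Issue")

print("same sentiment:", zero_shot[0] == few_shot[0])  # same sentiment: True
print("same issue:", zero_shot[1] == few_shot[1])      # same issue: False
```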



# Comparison & Analysis

ANALYSIS
========

1. Output Consistency

    Zero-shot:
    - Relies purely on the system instruction.
    - Labels may vary slightly between runs.
    - Chooses the issue label on its own: it returned "Positive - Food"
      even though the review reports no problem.

    Few-shot:
    - Strongly aligned with the provided examples.
    - Follows the demonstrated "<Sentiment> - <Issue>" label style.
    - More controlled and predictable: it returned "Positive - No Issue".

2. Creativity vs Control

    Zero-shot:
    - More creative and flexible.
    - Might take unexpected angles.

    Few-shot:
    - Less creative, more on-rails.
    - Follows the demonstrated pattern closely.

3. Cost & Prompt Length

    Zero-shot:
    - Shorter prompt → cheaper and faster.

    Few-shot:
    - Longer prompt → higher token usage.
    - Worth it when precision matters.

4. When to Use Which

    Use Zero-shot when:
    - Task is simple or exploratory.
    - You want quick iteration.

    Use Few-shot when:
    - You need strict tone, format, or logic.
    - You're building production agents (sales, support, decisioning).

CONCLUSION

  - Zero-shot is "tell the model what to do".
  - Few-shot is "show the model how to do it".
  
    In real systems, a hybrid (clear rules + 1–3 examples) often works best.
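
The cost point in the analysis can be made concrete by measuring prompt size. A minimal sketch, assuming character count is an acceptable rough proxy for token count (`prompt_chars` is a hypothetical helper; actual billing depends on the model's tokenizer):

```python
def prompt_chars(messages):
    """Total characters across all message contents (a rough size proxy)."""
    return sum(len(m["content"]) for m in messages)


instruction = {"role": "system", "content": "Classify the sentiment of the text."}
query = {"role": "user", "content": "The app is okay, nothing special."}

zero_shot = [instruction, query]
few_shot = [
    instruction,
    {"role": "user", "content": "This product is terrible."},
    {"role": "assistant", "content": "Negative"},
    {"role": "user", "content": "Amazing experience, will buy again!"},
    {"role": "assistant", "content": "Positive"},
    query,
]

# The few-shot prompt is larger by exactly the size of its examples.
print(prompt_chars(zero_shot), prompt_chars(few_shot))
```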