# Lab 1: Exploring LLM Behavior with Configuration and Persona Experiments

## Introduction

In this lab, you’ll **experiment with generation parameters** and **system personas** to understand how they influence the behavior of a large language model (LLM). You'll run controlled experiments to see how the model's tone, creativity, and coherence shift with different configurations.

What You'll Learn:

* How to **tune generation parameters** like `temperature` and `top_p` to control randomness and diversity
* How to **design system personas** to shape the model’s writing style and point of view
* How to run **side-by-side comparisons** of model outputs across multiple settings
* How to apply this knowledge to **creative writing, technical answers, or branding consistency**
* How to structure simple experiments for **LLM behavior profiling**

This lab will give you practical insight into **how LLMs behave** under different settings, and how to steer that behavior to fit your goals.

---

## 1. Setup
You can use:
* OpenAI (GPT-4, GPT-3.5)
* Anthropic (Claude)
* Google (Gemini)
* Local (via Ollama or LM Studio: Mistral, LLaMA, etc.)


In [44]:
from openai import OpenAI
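google_api_key = ''  # add your Gemini API key here
# The OpenAI client is pointed at Gemini's OpenAI-compatible endpoint via base_url
client = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")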
model_name = "gemini-2.0-flash"

## 2. Temperature and max_tokens

This experiment investigates how the `temperature` parameter influences the creativity and randomness of the model's responses. The `max_tokens` parameter is set to 100 to limit the length of the responses, but you can change it as you like.

* **Temperature 0.1:** Produces more focused and less speculative output.
* **Temperature 0.5:** Generates moderately varied and imaginative descriptions.
* **Temperature 0.9:** Results in highly creative and diverse responses.
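
To see why these settings behave differently, here is a minimal sketch of the usual temperature mechanism: the model's raw scores (logits) are divided by the temperature before the softmax. The logits below are made-up illustrative values, not real model output, and `softmax_with_temperature` is a hypothetical helper name; hosted APIs may differ in implementation details.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens (illustrative only)
toy_logits = [2.0, 1.0, 0.1]

for t in (0.1, 0.5, 0.9):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

At a low temperature almost all probability mass lands on the top-scoring token, while higher temperatures spread it across the alternatives, which is why the sampled text becomes more varied.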

In [None]:
# Prompt to test
prompt = "Describe what the inside of a superintelligent AI's dream might look like."

# Temperatures to test
temperatures = [0.1, 0.5, 0.9]

# Loop through temperatures and print results
for temp in temperatures:
    print(f"\n Temperature: {temp}")
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
        temperature=temp,
        max_tokens=100,
        n=1,
    )
    print(response.choices[0].message.content.strip())

## 3. Top P
This section explores the effect of the `top_p` parameter (nucleus sampling), which controls the diversity of the output by sampling only from the smallest set of most probable tokens whose cumulative probability reaches the `top_p` value. The temperature is fixed at 0.9 for this experiment.

* **top_p 0.3:** Leads to more constrained and less varied responses.
* **top_p 0.7:** Offers a balance between focus and diversity.
* **top_p 1.0:** Generates highly diverse and imaginative descriptions.
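
The filtering step behind `top_p` can be sketched as follows. This is an illustrative version of nucleus filtering under simple assumptions, not the provider's actual implementation: `nucleus_filter` is a hypothetical helper, and the token probabilities are invented for the example.

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches top_p, then renormalize the kept probabilities."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Hypothetical next-token distribution (illustrative values only)
toy_probs = {"stars": 0.5, "circuits": 0.25, "oceans": 0.125, "mirrors": 0.125}

for top_p in (0.3, 0.7, 1.0):
    print(f"top_p={top_p}:", list(nucleus_filter(toy_probs, top_p)))
```

With this toy distribution, a small `top_p` leaves only the single most likely token, while `top_p=1.0` keeps every candidate, which matches the constrained-to-diverse progression described above.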

In [None]:
# Prompt to test
prompt = "Describe what the inside of a superintelligent AI's dream might look like."

# Fixed temperature and the top_p values to test
temperature = 0.9
top_ps = [0.3, 0.7, 1.0]

# Loop through the top_p values at the fixed temperature
for top_p in top_ps:
    print(f"\n=== Temperature: {temperature} | Top-p: {top_p} ===")
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        top_p=top_p,
        max_tokens=150,
        n=1,
    )
    print(response.choices[0].message.content.strip())


## 4. Persona Experiment
Using the prompt “Explain how a black hole forms.”, we apply different system-level personas to see how the tone and depth vary.

In [47]:
personas = {
    "expert": "You are an expert who gives detailed, precise, and factual answers.",
    "creative": "You are a creative storyteller who thinks outside the box.",
    "minimal": "You give short and concise answers.",
    "friendly_teacher": "You are a friendly teacher who explains things simply and clearly."
}

prompt = "Explain how a black hole forms."


In [48]:
def query_persona(prompt, persona, max_tokens=150):
    response = client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "system", "content": persona},  # <-- Persona here
            {"role": "user", "content": prompt}
        ],
        n=1,
        max_tokens=max_tokens
    )
    return response.choices[0].message.content.strip()


In [None]:
for name, persona_prompt in personas.items():
    print(f"\n--- Persona: {name} ---")
    output = query_persona(prompt, persona_prompt)
    print(output)

**Observations:**
* **Expert:** Dense with facts and jargon
* **Creative:** Uses metaphors and vivid imagery
* **Minimal:** One-liner response
* **Friendly Teacher:** Accessible explanations with analogies
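
One rough way to check these observations is to compare simple length statistics across personas. The sketch below is only a suggestion: `summarize_outputs` is a hypothetical helper, the sentence count is a crude punctuation-based approximation, and the strings are placeholders standing in for real model responses.

```python
import re

def summarize_outputs(outputs):
    """Return {name: (word_count, sentence_count)} for each output string."""
    summary = {}
    for name, text in outputs.items():
        words = text.split()
        # Split on runs of sentence-ending punctuation as a rough sentence count
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        summary[name] = (len(words), len(sentences))
    return summary

# Placeholder strings, NOT real model output; replace with collected responses
placeholder_outputs = {
    "minimal": "Placeholder short answer.",
    "expert": "Placeholder first sentence. Placeholder second sentence with more words.",
}

for name, (n_words, n_sentences) in summarize_outputs(placeholder_outputs).items():
    print(f"{name:>10}: {n_words} words, {n_sentences} sentences")
```

In practice you would build the dictionary from the persona loop, for example `{name: query_persona(prompt, p) for name, p in personas.items()}`, and pass it to the helper.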

## 5. Persona × Temperature Combined:

Let’s combine a well-defined persona with varying temperature values to observe how the model’s style, focus, and creativity shift across the spectrum.

In [None]:
from openai import OpenAI

# Reuses `client` and `model_name` from the Setup cell
# Blog-writer persona prompt
persona = (
    "You are a professional blog post writer who creates clear, engaging, and well-structured content. "
    "You write in a friendly and approachable tone, use vivid examples, and explain complex ideas simply. "
    "Your writing is informative yet accessible, with smooth transitions and a natural flow. "
    "Make sure to captivate the reader’s interest from the start and provide insightful perspectives on the topic."
)


# Prompt to test
prompt = "Talk about AI"

# Temperatures to try
temperatures = [0.1, 0.5, 0.9]

def query_model(prompt, persona, temperature=0.7, max_tokens=150):
    response = client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": prompt}
        ],
        temperature=temperature,
        max_tokens=max_tokens,
        n=1
    )
    return response.choices[0].message.content.strip()

# Run the experiment for different temperatures
for temp in temperatures:
    print(f"\n--- Temperature: {temp} ---")
    output = query_model(prompt, persona, temperature=temp)
    print(output)


## Conclusion:

* **Temperature** controls the randomness and risk-taking of the model.
* **Top-p** adjusts how much of the probability distribution is considered.
* **System messages (personas)** strongly influence tone, voice, and format.

## Next Steps:
* Try prompts in other domains (e.g., math, philosophy, humor)
* Use multiple personas per session and different configurations
* Evaluate coherence with LLM-as-judge or heuristics
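
As a starting point for the heuristic route, here is a small sketch of a distinct-n score: the fraction of n-grams in a response that are unique. It measures repetition only, so treat it as a rough proxy for degenerate output and not as a full coherence measure. The function name and whitespace tokenization are assumptions made for illustration.

```python
def distinct_n(text, n=2):
    """Fraction of unique n-grams among all n-grams in the text.

    Values near 1.0 mean little repetition; values near 0.0 mean the text
    repeats itself heavily. Returns 0.0 when the text is shorter than n words.
    """
    tokens = text.lower().split()  # naive whitespace tokenization
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

print(distinct_n("the cat sat on the mat"))
print(distinct_n("a b a b a b"))
```

You could compute this score for each temperature or persona setting and compare the values alongside a manual read of the outputs.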