# Instruction engineering
This notebook covers instruction engineering, the part of prompt engineering concerned with writing clear, structured, and effective instructions for language models. The quality of an instruction directly influences the relevance, accuracy, and usability of the model's output.

As language models become more advanced, providing well-structured, concise, and specific instructions becomes increasingly important. Well-crafted prompts guide the model to generate responses that are not only relevant but also coherent and aligned with the task's goals. In this notebook, we will explore techniques for designing effective instructions, balancing the level of specificity and generality, and refining prompts iteratively for optimal results.

In [1]:
import os
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

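# Set up OpenAI API key, failing early with a clear message if it is missing
# (assigning None to os.environ would raise a TypeError)
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file.")
os.environ["OPENAI_API_KEY"] = api_key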

## Initialize the language model
We instantiate a lightweight GPT model from OpenAI using LangChain.

In [2]:
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini-2024-07-18")

## Crafting clear instructions
The clarity of an instruction significantly affects the quality of a model's output. Let’s compare two prompts, one vague and one detailed, to observe how the model handles each. The comparison shows how the precision and scope of an instruction affect the informativeness and relevance of the response.

In [3]:
# A vague instruction – lacks specifics
vague_instruction = "Tell me about climate change concisely."
# A clear instruction – explicitly defines scope and focus
clear_instruction = "Provide a concise summary of the primary causes and effects of climate change, focusing on scientific consensus from the past five years."

# Invoke model with both prompts
print("Vague Instruction Output:")
print(llm.invoke(vague_instruction).content)

print("\nClear Instruction Output:")
print(llm.invoke(clear_instruction).content)

Vague Instruction Output:
Climate change refers to long-term alterations in temperature, precipitation, and other atmospheric conditions on Earth, primarily driven by human activities such as burning fossil fuels, deforestation, and industrial processes. This results in increased greenhouse gas emissions, leading to global warming, rising sea levels, extreme weather events, and disruptions to ecosystems and biodiversity. Mitigating climate change involves reducing emissions, transitioning to renewable energy sources, and enhancing sustainability practices.

Clear Instruction Output:
**Primary Causes of Climate Change:**

1. **Greenhouse Gas Emissions:** The burning of fossil fuels (coal, oil, and natural gas) for energy and transportation is the largest source of greenhouse gases (GHGs), notably carbon dioxide (CO2) and methane (CH4).
2. **Deforestation:** Land-use changes, particularly deforestation for agriculture and urban development, reduce carbon sequestration and increase CO2 le

This comparison shows how instruction clarity affects model output. The clear version sets explicit boundaries: scope (causes and effects), focus (scientific consensus), and timeframe (the past five years). The output reflects this specificity: it is organized around causes and effects rather than a general overview.
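The three boundaries named above (scope, focus, timeframe) can be made explicit in a small helper. This is only a sketch: `build_instruction` and its parameters are hypothetical names introduced for illustration, not part of LangChain or of the cells above, and no model call is made.

```python
def build_instruction(topic, scope=None, focus=None, timeframe=None):
    """Compose an instruction from optional scope, focus, and timeframe parts.

    Hypothetical helper: with no optional parts it yields the vague prompt,
    with all of them it yields a tightly bounded one.
    """
    if scope:
        sentence = f"Provide a concise summary of {scope} of {topic}"
    else:
        sentence = f"Tell me about {topic}"

    qualifiers = []
    if focus:
        qualifiers.append(f"focusing on {focus}")
    if timeframe:
        qualifiers.append(f"from {timeframe}")
    if qualifiers:
        sentence += ", " + " ".join(qualifiers)
    return sentence + "."


vague = build_instruction("climate change")
clear = build_instruction(
    "climate change",
    scope="the primary causes and effects",
    focus="scientific consensus",
    timeframe="the past five years",
)
print(vague)
print(clear)
```

Either string could then be passed to `llm.invoke(...)` as in the cell above. Keeping each boundary as a separate argument makes it easy to drop one and see how the output changes.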


## Effective instruction structures
Now we test how an instruction's format and structure affect how a model organizes its output. We compare two common styles:
- Bullet-style prompts, which suggest structured, point-wise answers.
- Narrative-style prompts, which encourage natural language flow.



In [4]:
# Instruction using bullet-point format
bullet_structure = """
Explain the process of photosynthesis concisely:
- Define photosynthesis
- List the main components involved
- Describe the steps in order
- Mention its importance for life on Earth
"""

# Instruction using narrative, conversational tone
narrative_structure = """
Imagine you're a botanist explaining photosynthesis to a curious student.
Start with a simple definition, then walk through the process step-by-step,
highlighting the key components involved. Conclude by emphasizing why
photosynthesis is crucial for life on Earth. Write it concisely.
"""

# Invoke model with both prompts
print("Bullet Structure Output:")
print(llm.invoke(bullet_structure).content)

print("\nNarrative Structure Output:")
print(llm.invoke(narrative_structure).content)

Bullet Structure Output:
### Definition of Photosynthesis
Photosynthesis is the biochemical process by which green plants, algae, and some bacteria convert light energy, usually from the sun, into chemical energy stored in glucose, using carbon dioxide and water.

### Main Components Involved
1. **Chlorophyll** - The green pigment in plants that captures light energy.
2. **Light Energy** - Typically from sunlight, which drives the process.
3. **Water (H₂O)** - Absorbed by roots from the soil.
4. **Carbon Dioxide (CO₂)** - Taken in from the atmosphere through stomata.
5. **Glucose (C₆H₁₂O₆)** - The sugar produced as a result of photosynthesis.
6. **Oxygen (O₂)** - A byproduct released into the atmosphere.

### Steps in Order
1. **Light Absorption**: Chlorophyll absorbs sunlight, energizing electrons.
2. **Water Splitting (Photolysis)**: Light energy splits water molecules into oxygen, protons, and electrons.
3. **Oxygen Release**: The oxygen produced is released as a byproduct.
4. **Ene

This shows how different instruction formats can tailor the tone and flow of generated content. Bullet points promote structured, compact outputs; narratives encourage storytelling and natural phrasing. Choosing one depends on our downstream use case (e.g., structured responses vs. conversational agents).
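To compare the two styles fairly, it can help to generate both prompts from the same list of required points, so that only the format differs. The sketch below assumes two hypothetical helpers, `as_bullet_prompt` and `as_narrative_prompt`; the narrative wording is one possible phrasing, not the exact prompt used above.

```python
def as_bullet_prompt(task, points):
    """Render the task followed by one capitalized bullet per required point."""
    bullets = "\n".join(f"- {p[0].upper()}{p[1:]}" for p in points)
    return f"{task}:\n{bullets}"


def as_narrative_prompt(persona, topic, audience, points):
    """Render the same points as a single role-play paragraph."""
    steps = ", then ".join(points)
    return (
        f"Imagine you're a {persona} explaining {topic} to {audience}. "
        f"In flowing prose, {steps}."
    )


# Shared content requirements; only the presentation changes below.
points = [
    "define photosynthesis",
    "list the main components involved",
    "describe the steps in order",
    "mention its importance for life on Earth",
]

bullet = as_bullet_prompt("Explain the process of photosynthesis concisely", points)
narrative = as_narrative_prompt("botanist", "photosynthesis", "a curious student", points)
print(bullet)
print()
print(narrative)
```

Because both prompts carry identical requirements, differences in the responses can be attributed to format rather than to content.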

## Balancing specificity and generality in instructions
Prompts can be highly specific or more general. Specificity tends to reduce ambiguity and improve accuracy, while generality can be useful for creative or exploratory tasks. Next, we explore how the level of specificity changes the model's output. We will prompt the model to summarize a time-travel film in two ways: one specific, one general.

In [5]:
# Specific: references exact film, characters, plot points
specific_instruction = """
Describe the plot of the 1985 film 'Back to the Future', focusing on:
1. The main character's name and his friendship with Dr. Brown
2. The time machine and how it works
3. The specific year the main character travels to and why it's significant
4. The main conflict involving his parents' past
5. How the protagonist resolves the issues and returns to his time
Limit your response to 150 words.
"""

# General: open-ended, abstract task
general_instruction = """
Describe the plot of a popular time travel movie from the 1980s. Include:
1. The main characters and their relationships
2. The method of time travel
3. The time period visited and its significance
4. The main conflict or challenge faced
5. How the story is resolved
Keep your response around 150 words.
"""

# Invoke model with both prompts
print("Specific Instruction Output:")
print(llm.invoke(specific_instruction).content)

print("\nGeneral Instruction Output:")
print(llm.invoke(general_instruction).content)

Specific Instruction Output:
In the 1985 film "Back to the Future," the main character, Marty McFly, is a teenager who shares a close friendship with eccentric inventor Dr. Emmett Brown. Dr. Brown creates a time machine out of a DeLorean car, powered by plutonium and requiring a speed of 88 miles per hour to travel through time. Marty accidentally travels back to 1955, a pivotal year where he inadvertently disrupts his parents' meeting, threatening his own existence. The main conflict arises as Marty must ensure his parents fall in love to secure his future. With Dr. Brown's guidance, Marty orchestrates a series of events to bring his parents together during the Enchantment Under the Sea dance. Successfully resolving the issues, Marty returns to 1985, where he finds his life has improved due to the positive changes in his parents’ relationship.

General Instruction Output:
In the 1985 classic "Back to the Future," teenager Marty McFly, played by Michael J. Fox, is accidentally sent bac

Specific instructions produce precise, grounded responses. General ones provide creative flexibility, but can introduce ambiguity or generic details. Our choice should align with how constrained or creative our application needs to be.
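The two prompts also differ in how firmly they constrain length: "Limit your response to 150 words" versus "around 150 words". One way to check how well a response respects either wording is a simple word count. The sketch below is an assumption-laden illustration: `word_count` and `meets_length` are hypothetical helpers, the 10% tolerance for "around" is an arbitrary choice, and the sample text is a stand-in rather than real model output.

```python
def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def meets_length(text, limit, tolerance=0.0):
    """Return True if the text has at most limit * (1 + tolerance) words."""
    return word_count(text) <= limit * (1 + tolerance)


# Stand-in for a model response of 160 words (not real output).
sample = " ".join(["word"] * 160)

strict_ok = meets_length(sample, 150)                   # "Limit your response to 150 words"
lenient_ok = meets_length(sample, 150, tolerance=0.10)  # "around 150 words", assumed +10%
print(strict_ok, lenient_ok)
```

With the 160-word stand-in, the strict check fails and the lenient one passes, which mirrors the difference between a hard limit and a soft target.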

## Iterative refinement
No prompt is perfect on the first try. Often, refining our prompt incrementally leads to much better results. Prompt refinement is a feedback-driven process. We start with a base instruction, evaluate the model's response, and then enhance the instruction to improve relevance, detail, or formatting.

Here, we will show how an initial, vague prompt can be transformed into a much more useful and complete one through iteration.


In [6]:
# First draft: simple instruction
initial_instruction = "Explain how to make a peanut butter and jelly sandwich."

# Get model output
print("Initial Instruction Output:")
initial_output = llm.invoke(initial_instruction).content
print(initial_output)

# Improved version: more detailed, clear, and safety-conscious
refined_instruction = """
Explain how to make a peanut butter and jelly sandwich, with the following improvements:
1. Specify the type of bread, peanut butter, and jelly to use
2. Include a step about washing hands before starting
3. Mention how to deal with potential allergies
4. Add a tip for storing the sandwich if not eaten immediately
Present the instructions in a numbered list format.
"""

print("\nRefined Instruction Output:")
refined_output = llm.invoke(refined_instruction).content
print(refined_output)

Initial Instruction Output:
Making a peanut butter and jelly sandwich is a simple and classic process. Here’s how to do it step by step:

### Ingredients:
- 2 slices of bread (white, whole wheat, or your choice)
- Peanut butter (smooth or chunky, depending on your preference)
- Jelly or jam (flavor of your choice, such as grape, strawberry, or raspberry)

### Tools:
- A butter knife or spreader
- A spoon (optional for jelly)
- A plate (optional for serving)

### Instructions:

1. **Gather Your Ingredients and Tools**: Make sure you have all your ingredients and tools ready at your workspace.

2. **Spread Peanut Butter**:
   - Take one slice of bread and place it on the plate.
   - Use the butter knife to scoop out a generous amount of peanut butter.
   - Spread the peanut butter evenly over the slice of bread, covering it from edge to edge.

3. **Spread Jelly**:
   - Take the second slice of bread and place it on the plate.
   - If you prefer, you can use a clean butter knife or a spoo

This illustrates the power of prompt iteration. Small refinements like adding safety steps, specifying formats (lists), or asking for real-world advice (e.g., storage tips) can greatly improve the usefulness and clarity of generated outputs.
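The evaluate-then-enhance loop described in this section can be sketched without calling a model at all. In the example below, `CHECKS` and `refine` are hypothetical names, the keyword tests are deliberately crude stand-ins for a real evaluation, and `draft_output` is an invented response used only to exercise the logic.

```python
import re

# Hypothetical checks: each requirement maps to a crude test on the model output.
CHECKS = {
    "Include a step about washing hands before starting":
        lambda out: "wash" in out.lower(),
    "Mention how to deal with potential allergies":
        lambda out: "allerg" in out.lower(),
    "Present the instructions in a numbered list format":
        lambda out: re.search(r"^\s*1\.", out, re.MULTILINE) is not None,
}


def refine(base_instruction, output):
    """Append every requirement that the given output fails to meet."""
    missing = [req for req, passed in CHECKS.items() if not passed(output)]
    if not missing:
        return base_instruction  # nothing to improve
    numbered = "\n".join(f"{i}. {req}" for i, req in enumerate(missing, start=1))
    return f"{base_instruction.rstrip('.')}, with the following improvements:\n{numbered}"


base = "Explain how to make a peanut butter and jelly sandwich."
# Invented draft response: numbered, but silent on hygiene and allergies.
draft_output = (
    "1. Spread peanut butter on one slice.\n"
    "2. Spread jelly on the other slice.\n"
    "3. Press the slices together."
)
refined = refine(base, draft_output)
print(refined)
```

In practice the refined instruction would be sent back to the model and the loop repeated until the checks pass or the changes stop helping.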

## Practical application
Now we combine everything we have learned (clarity, format, specificity, structure, and iteration) into a prompt for a more complex and realistic task: a well-scoped instruction for writing a brief personal finance lesson plan.

In [7]:
final_instruction = """
Task: Create a brief lesson plan for teaching basic personal finance to high school students.

Instructions:
1. Start with a concise introduction explaining the importance of personal finance.
2. List 3-5 key topics to cover (e.g., budgeting, saving, understanding credit).
3. For each topic:
   a) Provide a brief explanation suitable for teenagers.
   b) Suggest one practical activity or exercise to reinforce the concept.
4. Conclude with a summary and a suggestion for further learning resources.

Format your response as a structured outline. Aim for clarity and engagement,
balancing specific examples with general principles that can apply to various
financial situations. Keep the entire lesson plan to approximately 300 words.
"""

print("Final Instruction Output:")
print(llm.invoke(final_instruction).content)

Final Instruction Output:
### Lesson Plan: Introduction to Basic Personal Finance for High School Students

#### Introduction
Understanding personal finance is crucial for making informed decisions about money. It empowers you to manage your expenses, save for future goals, and build a secure financial future. Learning these skills now will help you avoid common financial pitfalls later in life.

#### Key Topics

1. **Budgeting**
   - **Explanation**: Budgeting is the process of creating a plan to spend your money. It helps you allocate funds for essentials, savings, and discretionary spending.
   - **Activity**: Have students create a simple budget using a template that includes categories like income, expenses, and savings goals. They can personalize it based on hypothetical or real income sources.

2. **Saving**
   - **Explanation**: Saving involves setting aside money for future needs or emergencies. It's important to save regularly, even small amounts, to build a financial cushion

This instruction blends clarity, structure, and balance. It's detailed enough to guide the model, yet general enough for creative flexibility. The formatting and length constraints help control output quality and consistency.