# Prompt formatting and structure

In this notebook, we will explore different ways to structure prompts. When interacting with LLMs, the way we phrase and structure prompts plays a critical role in shaping the model's responses. Prompt design is not only about content — it's about form, context, and clarity. We will investigate how changes in format (e.g., Q&A, dialogue, structured instructions) and layout elements (e.g., lists, headings) influence the model's output.

In [1]:
import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate  # imported from langchain_core, which langchain_openai depends on
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

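# Check that the OpenAI API key is available (load_dotenv() has already exported it
# if it was defined in .env). Failing here gives a clearer error than a later API call.
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; define it in the environment or in a .env file.")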

### Initialize the language model
We instantiate a lightweight GPT model from OpenAI using LangChain.

In [2]:
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini-2024-07-18")

## Exploring different prompt formats
We will now look at how three ways of phrasing a prompt (question format, dialogue, and instruction) affect the model's interpretation and response. Photosynthesis serves as the consistent topic throughout.

### 1. Question and answer (Q&A) format
In this format, we treat the prompt like a direct question. This often leads to concise and focused answers.

In [3]:
# Prompt using a classic Q&A format
qa_prompt = """Q: What is photosynthesis?
A:"""

# Get the model's response
response = llm.invoke(qa_prompt).content
print(response)

Photosynthesis is a biological process used by plants, algae, and some bacteria to convert light energy, usually from the sun, into chemical energy stored in glucose (a type of sugar). This process primarily occurs in the chloroplasts of plant cells and involves two main stages: the light-dependent reactions and the light-independent reactions (Calvin cycle).

1. **Light-dependent reactions**: These occur in the thylakoid membranes of the chloroplasts and require sunlight. When chlorophyll and other pigments absorb light, they energize electrons, which then move through a series of proteins (the electron transport chain). This process generates ATP (adenosine triphosphate) and NADPH (nicotinamide adenine dinucleotide phosphate), which are energy carriers. Additionally, water (H₂O) is split during this process, releasing oxygen (O₂) as a byproduct.

2. **Light-independent reactions (Calvin cycle)**: These occur in the stroma of the chloroplasts and do not directly require light. Instead

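Q&A prompts cue the model to treat the task as a factual question and to answer it directly. Direct does not necessarily mean short, though: the response above still runs to several paragraphs, so state an explicit length limit in the prompt if brevity matters.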
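The Q&A layout can also be generated programmatically. The sketch below is a minimal illustration and not part of the original notebook: `build_qa_prompt` is a hypothetical helper that formats a question, plus optional worked examples, into the same `Q:`/`A:` layout used above.

```python
# Hypothetical helper (not from the notebook): builds a Q:/A: style prompt.
def build_qa_prompt(question, examples=()):
    """Return a Q&A prompt; `examples` is an optional sequence of (question, answer) pairs."""
    lines = []
    for example_question, example_answer in examples:
        lines.append(f"Q: {example_question}")
        lines.append(f"A: {example_answer}")
    lines.append(f"Q: {question}")
    lines.append("A:")  # open answer slot that the model is expected to complete
    return "\n".join(lines)


qa_prompt = build_qa_prompt("What is photosynthesis?")
print(qa_prompt)
```

The resulting string can be passed to `llm.invoke(...)` in the same way as the hand-written `qa_prompt`.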

### 2. Dialogue format
Dialogue-style prompts add context and human intent. By simulating a conversation, we guide the tone and complexity level (e.g., teacher-to-student).

In [4]:
# Prompt simulating a conversation between a student and teacher
dialogue_prompt = """Student: Can you explain photosynthesis to me?
Teacher: Certainly! Photosynthesis is...
Student: What does a plant need for photosynthesis?
Teacher:"""

# Get the model's response
response = llm.invoke(dialogue_prompt).content
print(response)

Plants need several key components for photosynthesis to occur:

1. **Sunlight**: This is the primary energy source for photosynthesis. Plants capture light energy using a pigment called chlorophyll, which is found in their leaves.

2. **Carbon Dioxide (CO2)**: Plants absorb carbon dioxide from the air through small openings in their leaves called stomata.

3. **Water (H2O)**: Plants take up water from the soil through their roots. Water is also essential for the photosynthesis process.

4. **Chlorophyll**: While not a requirement in the same way as sunlight, CO2, and water, chlorophyll is crucial because it allows plants to capture light energy.

During photosynthesis, plants convert these inputs into glucose (a type of sugar) and oxygen. The overall chemical equation for photosynthesis can be simplified as:

\[ 6CO_2 + 6H_2O + \text{light energy} \rightarrow C_6H_{12}O_6 + 6O_2 \]

This means that six molecules of carbon dioxide and six molecules of water, using light energy, are tra

Dialogue formatting adds a narrative element, making the output more natural and explanatory. The model adapts its tone and word choice to match the characters involved.
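When the conversation history grows, writing the dialogue by hand becomes error-prone. As a rough sketch (the helper name `build_dialogue_prompt` and its parameters are assumptions for illustration), the turns can be kept as data and rendered into the `Speaker: text` layout, ending with an open cue for the speaker the model should continue as.

```python
# Hypothetical helper (not from the notebook): renders dialogue turns as a prompt.
def build_dialogue_prompt(turns, next_speaker):
    """Render (speaker, text) pairs line by line and end with an open cue for `next_speaker`."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append(f"{next_speaker}:")  # the model continues from this cue
    return "\n".join(lines)


turns = [
    ("Student", "Can you explain photosynthesis to me?"),
    ("Teacher", "Certainly! Photosynthesis is..."),
    ("Student", "What does a plant need for photosynthesis?"),
]
dialogue_prompt = build_dialogue_prompt(turns, next_speaker="Teacher")
print(dialogue_prompt)
```

With these turns the helper reproduces the `dialogue_prompt` string defined in the cell above, so swapping the speakers (for example, expert and beginner) only requires changing the data.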

### 3. Instruction format
Instruction-based prompts are declarative and task-oriented. This is a versatile format ideal for educational, technical, or analytical tasks.

In [5]:
# Prompt with a clear instructional tone
instruction_prompt = """Provide a brief explanation of photosynthesis, including its main components and importance."""

# Get the model's response
response = llm.invoke(instruction_prompt).content
print(response)

Photosynthesis is the biochemical process by which green plants, algae, and certain bacteria convert light energy, usually from the sun, into chemical energy stored in glucose. This process primarily occurs in the chloroplasts of plant cells, where chlorophyll, the green pigment, captures light energy.

### Main Components of Photosynthesis:
1. **Light**: Sunlight provides the energy required for photosynthesis.
2. **Chlorophyll**: This pigment absorbs light, primarily in the blue and red wavelengths.
3. **Water (H₂O)**: Absorbed by plant roots, water is split into oxygen and hydrogen during the light-dependent reactions.
4. **Carbon Dioxide (CO₂)**: Taken from the atmosphere through stomata, CO₂ is used in the Calvin cycle to produce glucose.
5. **Glucose (C₆H₁₂O₆)**: The end product of photosynthesis, it serves as an energy source for plants and other organisms.

### Importance of Photosynthesis:
- **Oxygen Production**: Photosynthesis releases oxygen as a byproduct, which is essenti

Instructional prompts signal the model to deliver structured or goal-directed responses. They work well when the output needs to meet specific content expectations.
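Because instruction prompts state their content expectations explicitly, they are easy to parameterize. The following is a minimal sketch using plain `str.format`; the names `INSTRUCTION_TEMPLATE` and `build_instruction_prompt`, and the `length` parameter, are assumptions introduced for illustration. The `PromptTemplate` class imported at the top of the notebook offers a comparable templating mechanism; plain string formatting is used here only to keep the sketch dependency-free.

```python
# Hypothetical template (not from the notebook): the instruction from the cell above,
# with the topic and the requested length turned into parameters.
INSTRUCTION_TEMPLATE = (
    "Provide a {length} explanation of {topic}, "
    "including its main components and importance."
)


def build_instruction_prompt(topic, length="brief"):
    """Fill the instruction template for a given topic and length hint."""
    return INSTRUCTION_TEMPLATE.format(topic=topic, length=length)


instruction_prompt = build_instruction_prompt("photosynthesis")
print(instruction_prompt)
```

With the default `length`, the output matches the instruction used in the cell above.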


## Impact of structural elements in prompts
Beyond wording, how we structure a prompt visually—using headings, bullets, or lists—can influence the clarity and organization of the output.

### 1. Using headings
We ask the model to organize content under predefined headings. This adds readability and encourages clear segmentation of ideas.

In [6]:
# Prompt with section headers to shape structure
headings_prompt = """Explain photosynthesis using the following structure:

# Definition

# Process

# Importance
"""

# Get the structured response
response = llm.invoke(headings_prompt).content
print(response)

# Definition
Photosynthesis is a biochemical process by which green plants, algae, and some bacteria convert light energy, usually from the sun, into chemical energy stored in glucose. This process primarily occurs in the chloroplasts of plant cells, utilizing chlorophyll, the green pigment responsible for capturing light energy.

# Process
Photosynthesis can be divided into two main stages: the light-dependent reactions and the light-independent reactions (Calvin cycle).

1. **Light-dependent Reactions**: These reactions occur in the thylakoid membranes of the chloroplasts and require sunlight. When chlorophyll absorbs light energy, it excites electrons, which then travel through a series of proteins in the electron transport chain. This process generates ATP (adenosine triphosphate) and NADPH (nicotinamide adenine dinucleotide phosphate), two energy carriers. Additionally, water molecules are split (photolysis), releasing oxygen as a byproduct.

2. **Light-independent Reactions (Calv

The model uses the headings to break its answer into parts. This approach is useful for documentation, teaching, and structured analysis.
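If downstream code relies on those sections, it can be worth checking that the response actually contains them. The sketch below assumes the model echoes each heading on its own line starting with `#`, as in the output above; `missing_headings` and `sample_response` are hypothetical names, and the sample text is made up for the demonstration.

```python
# Hypothetical check (not from the notebook): which requested headings are absent?
def missing_headings(response, expected_headings):
    """Return the expected headings that do not appear as '#'-prefixed lines in `response`."""
    found = {
        line.lstrip("#").strip()
        for line in response.splitlines()
        if line.startswith("#")
    }
    return [heading for heading in expected_headings if heading not in found]


# Made-up sample standing in for a (truncated) model response.
sample_response = (
    "# Definition\n"
    "Photosynthesis converts light into chemical energy.\n\n"
    "# Process\n"
    "It has two main stages."
)
print(missing_headings(sample_response, ["Definition", "Process", "Importance"]))
# prints ['Importance']
```

A non-empty result is a signal to retry the request or to raise the output length limit.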

### 2. Using bullet points
Using bullet points encourages the model to summarize or list facts concisely, which is effective for note-taking or extracting key points.

In [7]:
# Prompt with bullet points to organize information
bullet_points_prompt = """List the key components needed for photosynthesis:

•
•
•
"""

# Get a bulleted response
response = llm.invoke(bullet_points_prompt).content
print(response)

The key components needed for photosynthesis are:

1. **Light Energy** (usually from the sun)
2. **Water (H₂O)**
3. **Carbon Dioxide (CO₂)**
4. **Chlorophyll** (the pigment in plants that captures light energy)

These components work together in plants, algae, and some bacteria to convert light energy into chemical energy, producing glucose and oxygen as byproducts.


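Bullets prompt the model to itemize content, favoring clarity and brevity. Note that the model kept the list structure but answered with four numbered items rather than three bullets, so placeholder markers act as a hint, not a strict template. This structure works well for summarizing complex ideas or creating outlines.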
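The response above came back as a numbered list even though the prompt used `•` placeholders, so code that consumes the items should not depend on one specific marker. Here is a small sketch under that assumption; `extract_list_items` and `ITEM_PATTERN` are hypothetical names, and the pattern only covers `•`, `-`, `*`, and `1.`/`1)` style markers.

```python
import re

# Hypothetical pattern (not from the notebook): a list marker, whitespace, then the item text.
ITEM_PATTERN = re.compile(r"^\s*(?:[•\-*]|\d+[.)])\s+(.*\S)\s*$")


def extract_list_items(response):
    """Return the text of every bulleted or numbered line in `response`."""
    items = []
    for line in response.splitlines():
        match = ITEM_PATTERN.match(line)
        if match:
            items.append(match.group(1))
    return items


# Made-up sample mixing numbered and bulleted lines.
sample_response = (
    "The key components are:\n\n"
    "1. **Light Energy** (usually from the sun)\n"
    "2. **Water (H₂O)**\n"
    "• Chlorophyll\n\n"
    "These components work together."
)
print(extract_list_items(sample_response))
```

Lines that merely start with bold text such as `**Note**` are not treated as items, because the marker must be followed by whitespace.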

### 3. Using numbered lists
With numbered lists, we guide the model to follow a logical, step-by-step order. This is useful for sequences, workflows, or processes.

In [8]:
# Prompt using a numbered list for ordered steps
numbered_list_prompt = """Describe the steps of photosynthesis in order:

1.
2.
3.
4.
"""

# Generate a stepwise explanation
response = llm.invoke(numbered_list_prompt).content
print(response)

Photosynthesis occurs in two main stages: the light-dependent reactions and the light-independent reactions (Calvin cycle). Here are the steps in order:

1. **Light Absorption**: Chlorophyll and other pigments in the chloroplasts absorb sunlight, primarily in the blue and red wavelengths. This energy excites electrons and initiates the process.

2. **Water Splitting (Photolysis)**: The absorbed light energy is used to split water molecules (H₂O) into oxygen (O₂), protons (H⁺), and electrons. This reaction occurs in the thylakoid membranes of the chloroplasts.

3. **Electron Transport Chain**: The excited electrons from chlorophyll are passed along a series of proteins in the thylakoid membrane, known as the electron transport chain. As the electrons move through the chain, their energy is used to pump protons into the thylakoid lumen, creating a proton gradient.

4. **ATP and NADPH Formation**: The proton gradient drives ATP synthase to produce ATP from ADP and inorganic phosphate (Pi)

Numbering signals sequence and hierarchy. It encourages the model to respond with a temporal or logical progression, which is useful in how-to guides and process documentation.
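The number of empty slots is itself a parameter of the prompt. As an illustrative sketch (both function names are assumptions, not part of the notebook), the slots can be generated for any step count, and the response can be checked for an unbroken 1, 2, 3, ... sequence.

```python
import re


# Hypothetical helper (not from the notebook): instruction followed by empty numbered slots.
def build_numbered_prompt(instruction, n_steps):
    slots = "\n".join(f"{i}." for i in range(1, n_steps + 1))
    return f"{instruction}\n\n{slots}\n"


# Hypothetical check: do the numbered lines in the response run 1, 2, 3, ... without gaps?
def steps_are_sequential(response):
    numbers = [int(n) for n in re.findall(r"^[ \t]*(\d+)\.", response, flags=re.MULTILINE)]
    return bool(numbers) and numbers == list(range(1, len(numbers) + 1))


numbered_list_prompt = build_numbered_prompt(
    "Describe the steps of photosynthesis in order:", n_steps=4
)
print(numbered_list_prompt)
```

With `n_steps=4` the helper reproduces the prompt from the cell above; `steps_are_sequential(response)` can then flag responses that skip or repeat a step number.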


## Comparing prompt effectiveness
Let’s now look at how different prompt structures influence the quality and completeness of responses when answering the same question. We will use three styles:
- Plain instruction
- Structured instruction
- Q&A with enumerated list

In [9]:
# Define a list of differently structured prompts
# Continuation lines inside the triple-quoted strings start at column 0 so that
# no stray indentation ends up in the prompt text sent to the model.
comparison_prompts = [
    "Explain the importance of photosynthesis for life on Earth.",
    """Explain the importance of photosynthesis for life on Earth. Structure your answer as follows:
1. Oxygen production
2. Food chain support
3. Carbon dioxide absorption""",
    """Q: Why is photosynthesis important for life on Earth?
A: Photosynthesis is crucial for life on Earth because:
1.
2.
3.""",
]

# Iterate over each format and generate a response
for i, prompt in enumerate(comparison_prompts, 1):
    print(f"Prompt {i}:")
    response = llm.invoke(prompt).content
    print(response)

Prompt 1:
Photosynthesis is a crucial biological process that plays a fundamental role in sustaining life on Earth. Here are several key reasons highlighting its importance:

1. **Oxygen Production**: Photosynthesis is responsible for producing oxygen, a vital component of the Earth's atmosphere. During the process, plants, algae, and some bacteria convert carbon dioxide and water into glucose and oxygen using sunlight. This oxygen is essential for the survival of most living organisms that rely on aerobic respiration to produce energy.

2. **Carbon Dioxide Reduction**: Photosynthesis helps regulate atmospheric carbon dioxide levels, a greenhouse gas that contributes to climate change. By absorbing CO2 during the process, photosynthetic organisms mitigate the impacts of excess greenhouse gases and help maintain a balanced ecosystem.

3. **Base of the Food Chain**: Photosynthetic organisms, primarily plants and phytoplankton, form the foundation of the food chain. They convert solar ene

This section highlights how the same topic can yield different styles, levels of detail, and logical flow depending on prompt formatting. It is a practical way to evaluate which format suits a specific use case (e.g., teaching vs. summarization vs. detailed reasoning).
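Reading long responses side by side is tedious, so a few crude structural metrics can help with a first comparison. The sketch below is only a rough heuristic under stated assumptions: `summarize_response` is a hypothetical helper, and word, list-item, and heading counts say nothing about factual quality.

```python
import re

# Hypothetical patterns (not from the notebook): list markers and markdown headings.
LIST_ITEM = re.compile(r"^[ \t]*(?:[•\-*]|\d+[.)])[ \t]+\S", re.MULTILINE)
HEADING = re.compile(r"^#+[ \t]+\S", re.MULTILINE)


def summarize_response(text):
    """Return simple structural counts for one model response."""
    return {
        "words": len(text.split()),
        "list_items": len(LIST_ITEM.findall(text)),
        "headings": len(HEADING.findall(text)),
    }


# Made-up sample standing in for a model response.
sample_response = (
    "# Importance\n"
    "Photosynthesis matters because:\n"
    "1. Oxygen production\n"
    "2. Food chain support\n"
    "- Carbon dioxide absorption"
)
print(summarize_response(sample_response))
# prints {'words': 16, 'list_items': 3, 'headings': 1}
```

Inside the comparison loop, `summarize_response(response)` could be printed next to each response to see at a glance which format produced the most structured output.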


Prompt formatting is more than phrasing: it is design thinking applied to language. As we build more complex or domain-specific applications, being intentional about how we structure prompts can make responses noticeably more consistent and easier to use.