# Prompt Engineering: A Structured Approach

## Overview
This Jupyter notebook provides a structured approach to prompt engineering using Ollama, focusing on generating clear, neutral, and inclusive content. 

## Prerequisites

- Python 3.8 or higher
- Jupyter Notebook/Lab environment
- Ollama installed on your system
- Basic understanding of Python and API interactions

## Setting Up Ollama

Install Ollama by following the instructions at [Ollama's official website](https://ollama.com/),
then download the Llama 3.3 model.

**Pull Llama 3.3 using bash**:
```bash
ollama pull llama3.3
```
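
Before running the notebook, it can help to confirm from Python that the CLI is on the `PATH` and that the model has been pulled. The sketch below is one way to do that. The helper names (`parse_model_names`, `name_matches`, `model_is_pulled`) are hypothetical, and the parsing assumes `ollama list` prints a header row followed by rows whose first column is the model name.

```python
import shutil
import subprocess
from typing import List


def parse_model_names(list_output: str) -> List[str]:
    """Return the first column of each non-header row of `ollama list` output."""
    names = []
    for line in list_output.splitlines()[1:]:  # skip the header row (assumed layout)
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


def name_matches(model: str, listed_name: str) -> bool:
    """Compare a requested model with a listed name such as 'llama3.3:latest'."""
    if ":" in model:
        # A tag was requested explicitly, so require an exact match.
        return listed_name == model
    # No tag requested: accept any tag of the same base name.
    return listed_name.split(":", 1)[0] == model


def model_is_pulled(model: str, cli: str = "ollama") -> bool:
    """True if the CLI is on PATH and `model` appears in its model list."""
    if shutil.which(cli) is None:
        return False
    result = subprocess.run([cli, "list"], capture_output=True, text=True)
    if result.returncode != 0:
        return False
    return any(name_matches(model, name) for name in parse_model_names(result.stdout))
```

Calling `model_is_pulled("llama3.3")` returns `False` rather than raising when Ollama is missing, which makes it convenient as a guard at the top of a notebook.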

## Python Environment Setup
Create a new virtual environment and install the required packages in bash:
```bash
python -m venv ollama-env
source ollama-env/bin/activate  # On Windows: ollama-env\Scripts\activate
pip install -r requirements.txt
```

In [None]:
!pip install -r requirements.txt

In [None]:
!pip install ollama

## Code Implementation

In [None]:
"""
Ollama Prompt Engineering Module

This module provides utilities for interacting with Ollama locally
and implementing structured prompt engineering approaches.
"""

import subprocess
import json
from typing import Dict, Any, List, Optional, Union
import logging
from datetime import datetime
import re
from pathlib import Path

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

class LocalPromptEngineer:
    """A class for managing prompt engineering with local Ollama installation."""
    
    def __init__(self, model: str = "llama2"):
        """
        Initialize the LocalPromptEngineer class.
        
        Args:
            model (str): Name of the Ollama model to use
        """
        self.model = model
        self.history: List[Dict[str, Any]] = []
        
        # Verify Ollama installation and model
        try:
            # Check if Ollama is installed
            subprocess.run(['ollama', '--version'], check=True, capture_output=True)
            
            # Check if model exists
            result = subprocess.run(['ollama', 'list'], check=True, capture_output=True, text=True)
            available_models = result.stdout.lower()
            
            if self.model.lower() not in available_models:
                raise RuntimeError(
                    f"Model '{self.model}' not found. Please run 'ollama pull {self.model}' first.\n"
                    f"Available models: {available_models}"
                )
                
        except subprocess.CalledProcessError as e:
            raise RuntimeError(f"Error checking Ollama: {e.stderr}") from e
        except FileNotFoundError as e:
            raise RuntimeError("Ollama command not found. Please install Ollama first.") from e
    
    def create_structured_prompt(
        self,
        task: str,
        context: str = "",
        constraints: Optional[List[str]] = None,
        examples: Optional[List[Dict[str, str]]] = None
    ) -> str:
        """
        Create a structured prompt following best practices.
        
        Args:
            task (str): Main task description
            context (str): Additional context for the task
            constraints (List[str]): List of constraints to apply
            examples (List[Dict[str, str]]): List of example input/output pairs
            
        Returns:
            str: Formatted prompt
        """
        constraints = constraints or []
        examples = examples or []
        
        prompt_parts = [
            "# Task",
            task,
            "\n# Context" if context else "",
            context if context else "",
            "\n# Constraints:" if constraints else "",
            "\n".join(f"- {c}" for c in constraints) if constraints else "",
            "\n# Examples:" if examples else "",
        ]
        
        for example in examples:
            prompt_parts.extend([
                "\nInput:",
                example.get("input", ""),
                "\nOutput:",
                example.get("output", ""),
            ])
            
        return "\n".join(filter(None, prompt_parts))
    
    def generate_response(
        self,
        prompt: str,
        temperature: float = 0.7
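    ) -> Dict[str, Optional[str]]:
        """
        Generate a response using local Ollama installation.
        
        Args:
            prompt (str): The structured prompt
            temperature (float): Intended sampling temperature (0.0 - 1.0).
                It is recorded in the history only; `ollama run` is invoked
                without it, so it does not affect generation.
            
        Returns:
            Dict[str, Optional[str]]: "response" text and "error" (None if no error)
        """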
        try:
            # First test if the model can be reached
            test_cmd = ['ollama', 'list']
            subprocess.run(test_cmd, check=True, capture_output=True, text=True)
            
            # Pass the prompt as a single argument. No shell is involved,
            # so it must not be escaped (escaping would alter the prompt text).
            cmd = [
                'ollama',
                'run',
                self.model,
                prompt
            ]
            
            # Run the command without shell
            result = subprocess.run(
                cmd,
                shell=False,  # Don't use shell to avoid escaping issues
                capture_output=True,
                text=True
            )
            
            if result.returncode != 0:
                logging.error(f"Command failed with error: {result.stderr}")
                return {"response": "", "error": result.stderr}
            
            response = {
                "response": result.stdout.strip(),
                "error": result.stderr.strip() if result.stderr else None
            }
            
            # Log the interaction
            self.history.append({
                "timestamp": datetime.now().isoformat(),
                "prompt": prompt,
                "response": response,
                "model": self.model,
                "temperature": temperature
            })
            
            return response
            
        except subprocess.CalledProcessError as e:
            logging.error(f"Error generating response: {e}")
            logging.error(f"Command output: {e.output}")
            return {"response": "", "error": str(e)}
        except Exception as e:
            logging.error(f"Unexpected error: {e}")
            return {"response": "", "error": str(e)}
    
    def _evaluate_general_criterion(self, text: str, criterion: str) -> float:
        """
        Evaluate text for general criteria like clarity, technical accuracy, etc.
        
        Args:
            text (str): Text to evaluate
            criterion (str): Criterion to evaluate
            
        Returns:
            float: Score for the criterion (0-1)
        """
        criterion_patterns = {
            'clarity': {
                'positive': [
                    r'\b(clearly|specifically|precisely)\b',
                    r'\bfor example\b',
                    r'\bin other words\b',
                ],
                'negative': [
                    r'\b(maybe|perhaps|possibly)\b',
                    r'\b(complicated|complex)\b(?!.*\b(explained|simplified)\b)',
                ]
            },
            'technical_accuracy': {
                'positive': [
                    r'\b(algorithm|function|method)\b(?=.*\b(implementation|example|usage)\b)',
                    r'\b(input|output)\b(?=.*\b(example|parameter|argument)\b)',
                ],
                'negative': [
                    r'\b(magic|somehow|mysteriously)\b',
                    r'\b(always|never)\b(?=.*\b(works|fails)\b)',
                ]
            },
            'engagement': {
                'positive': [
                    r'\b(imagine|consider|think about)\b',
                    r'\b(real-world|practical|everyday)\b',
                    r'\blet\'s\b',
                ],
                'negative': [
                    r'\b(boring|tedious|difficult)\b',
                    r'\b(simply|just|easily)\b(?=.*\b(do|understand)\b)',
                ]
            }
        }
        
        if criterion not in criterion_patterns:
            return 0.5  # Default score for unknown criteria
            
        patterns = criterion_patterns[criterion]
        score = 0.5  # Start with neutral score
        
        # Check positive patterns
        for pattern in patterns['positive']:
            matches = len(re.findall(pattern, text, re.IGNORECASE))
            score += matches * 0.1  # Increment score for each positive match
            
        # Check negative patterns
        for pattern in patterns['negative']:
            matches = len(re.findall(pattern, text, re.IGNORECASE))
            score -= matches * 0.1  # Decrement score for each negative match
            
        # Normalize score to 0-1 range
        return max(0.0, min(1.0, score))

    def _evaluate_bias(self, text: str) -> float:
        """
        Evaluate text for various types of bias.
        
        Args:
            text (str): Text to evaluate
            
        Returns:
            float: Bias score (0-1, where 0 is least biased)
        """
        bias_indicators = {
            'gender_bias': {
                'patterns': [
                    r'\b(he|his|him|gentleman|man|men)\b(?!.*\b(she|her|hers|lady|woman|women)\b)',
                    r'\b(she|her|hers|lady|woman|women)\b(?!.*\b(he|his|him|gentleman|man|men)\b)',
                    r'\b(businessman|businesswoman|chairman|chairwoman|spokesman|spokeswoman)\b'
                ],
                'weight': 0.3
            },
            'racial_bias': {
                'patterns': [
                    r'\b(normal|standard|regular|typical|default)(?=\s+(person|people|individual|community))\b',
                    r'\b(ethnic|minority|diverse)(?=\s+only\b)',
                ],
                'weight': 0.3
            },
            'age_bias': {
                'patterns': [
                    r'\b(young|old|elderly|senior)(?=\s+people\b)',
                    r'\b(millennials|boomers|gen\s+[xyz])\b\s+(?=\b(are|always|never|typically)\b)',
                ],
                'weight': 0.2
            },
            'socioeconomic_bias': {
                'patterns': [
                    r'\b(poor|rich|wealthy|low-income|high-income)(?=\s+people\b)',
                    r'\b(educated|uneducated|privileged|underprivileged)\b',
                ],
                'weight': 0.2
            }
        }
        
        bias_score = 0.0
        
        for bias_type, config in bias_indicators.items():
            type_score = 0
            for pattern in config['patterns']:
                matches = len(re.findall(pattern, text, re.IGNORECASE))
                if matches > 0:
                    type_score += matches * 0.1
            
            bias_score += type_score * config['weight']
        
        return min(1.0, bias_score)

    def evaluate_response(
        self,
        response: Dict[str, str],
        criteria: List[str]
    ) -> Dict[str, float]:
        """
        Evaluate the quality of a response based on given criteria.
        
        Args:
            response (Dict[str, str]): Response from the model
            criteria (List[str]): List of evaluation criteria
            
        Returns:
            Dict[str, float]: Evaluation scores
        """
        text = response.get("response", "")
        scores = {}
        
        for criterion in criteria:
            if criterion == "bias":
                scores[criterion] = self._evaluate_bias(text)
            else:
                scores[criterion] = self._evaluate_general_criterion(text, criterion)
        
        return scores

    def save_history(self, filepath: Union[str, Path]) -> None:
        """
        Save interaction history to a JSON file.
        
        Args:
            filepath (Union[str, Path]): Path to save the history file
        """
        filepath = Path(filepath)
        with filepath.open('w', encoding='utf-8') as f:
            json.dump(self.history, f, indent=2)
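
In the class above, `temperature` is stored in the history but `ollama run` is called without it, so the value has no effect on generation. One way to apply it is to call Ollama's local REST API instead of the CLI. The following is a minimal sketch, not part of the class: it assumes the server is listening on the default `http://localhost:11434`, and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

# Default local endpoint (assumption: Ollama is running with default settings).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model: str, prompt: str, temperature: float = 0.7) -> Dict[str, Any]:
    """Build the JSON body for a non-streaming generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON object instead of a stream
        "options": {"temperature": temperature},
    }


def generate_via_http(
    model: str,
    prompt: str,
    temperature: float = 0.7,
    url: str = OLLAMA_URL,
    timeout: float = 120.0,
) -> str:
    """POST the prompt to the local server and return the generated text."""
    data = json.dumps(build_generate_payload(model, prompt, temperature)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    return body.get("response", "")
```

With the server running, `generate_via_http("llama3.3", prompt, temperature=0.2)` would return the model's text. The payload builder is kept separate so it can be inspected or tested without a running server.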

In [None]:
# Initialize the prompt engineer
engineer = LocalPromptEngineer(model="llama3.3")

# Define a prompt structure for an educational task
task = "Create an explanation of binary search algorithms"
context = "The explanation should be suitable for beginners and use inclusive language"
constraints = [
    "Use gender-neutral language",
    "Avoid technical jargon without explanation",
    "Include real-world examples",
    "Consider diverse learning styles"
]
examples = [
    {
        "input": "What is a loop?",
        "output": "A loop is like a repeated action, similar to how you might check each drawer in a desk until you find what you're looking for."
    }
]

# Create the structured prompt
prompt = engineer.create_structured_prompt(
    task=task,
    context=context,
    constraints=constraints,
    examples=examples
)

# Generate a response (model is already specified during initialization)
response = engineer.generate_response(
    prompt=prompt,    # The structured prompt we created
    temperature=0.7   # Recorded in the history; not passed to `ollama run`
)

# Evaluate the response
evaluation_criteria = [
    "clarity",
    "bias",
    "technical_accuracy",
    "engagement"
]
scores = engineer.evaluate_response(response, evaluation_criteria)

# Save the interaction history
engineer.save_history("prompt_engineering_history.json")
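
The saved history can be read back later to review past interactions. The sketch below assumes the file was written by `save_history` and therefore holds a list of entries with the keys `timestamp`, `prompt`, `response`, `model`, and `temperature`. The helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Union


def load_history(filepath: Union[str, Path]) -> List[Dict[str, Any]]:
    """Load the list of interaction records written by save_history."""
    with Path(filepath).open(encoding="utf-8") as f:
        return json.load(f)


def summarize_entry(entry: Dict[str, Any], width: int = 60) -> str:
    """Return a one-line summary: timestamp, model, and a response preview."""
    text = entry.get("response", {}).get("response", "")
    preview = text if len(text) <= width else text[:width] + "..."
    return f"{entry.get('timestamp', '?')} [{entry.get('model', '?')}] {preview}"
```

For example, `for entry in load_history("prompt_engineering_history.json"): print(summarize_entry(entry))` prints one line per recorded interaction.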

## Best Practices for Prompt Engineering
1. Clarity and Structure

- Use clear, specific instructions
- Break down complex tasks into smaller components
- Provide context and constraints explicitly

2. Inclusivity and Neutrality

- Use gender-neutral language
- Consider diverse perspectives and experiences
- Avoid cultural assumptions
- Use accessible examples

3. Technical Considerations

- Specify output format requirements
- Include error handling expectations
- Define success criteria
- Consider edge cases
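
As an illustration of the technical considerations above, a prompt can request JSON output, and the reply can then be checked against a simple success criterion. This is a minimal sketch under stated assumptions: the function name and the idea of "required keys" are hypothetical, and the extraction simply takes the text between the first `{` and the last `}`.

```python
import json
from typing import Any, Dict, Iterable, Optional


def parse_json_response(text: str, required_keys: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Extract a JSON object from a model reply, or return None if it is unusable."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None  # edge case: no JSON object in the reply
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None  # edge case: malformed JSON
    if not isinstance(data, dict):
        return None
    if any(key not in data for key in required_keys):
        return None  # success criterion: every required key is present
    return data
```

Returning `None` for every failure mode keeps the calling code simple: a caller can retry with a stricter prompt whenever the result is `None`.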

4. Response Evaluation

- Define clear evaluation metrics
- Check for bias in responses
- Validate technical accuracy
- Ensure accessibility of explanations
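
To make the evaluation metrics actionable, the scores returned by `evaluate_response` can be compared against thresholds. In the sketch below, the function name and both threshold values are assumptions chosen for illustration. It relies on the scoring convention used in this notebook: for `bias`, lower is better, and for the other criteria, higher is better with 0.5 as the neutral starting point.

```python
from typing import Dict, List


def flag_scores(
    scores: Dict[str, float],
    min_quality: float = 0.5,  # assumed threshold for clarity, engagement, etc.
    max_bias: float = 0.1,     # assumed threshold for the bias score
) -> List[str]:
    """Return the criteria whose scores fall outside the accepted range."""
    flagged = []
    for criterion, score in scores.items():
        if criterion == "bias":
            if score > max_bias:
                flagged.append(criterion)
        elif score < min_quality:
            flagged.append(criterion)
    return flagged
```

A non-empty result from `flag_scores(scores)` is a signal to revise the prompt and generate again.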

## Common Pitfalls to Avoid

1. Ambiguous instructions
2. Implicit assumptions
3. Lack of context
4. Overly complex prompts
5. Insufficient constraints
6. Missing evaluation criteria
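
Some of these pitfalls can be caught mechanically before a prompt is sent. The sketch below checks for the section headers that `create_structured_prompt` emits and for excessive length. The function name and the word limit are assumptions, and the heuristic cannot detect ambiguity or implicit assumptions.

```python
from typing import List


def lint_prompt(prompt: str, max_words: int = 400) -> List[str]:
    """Return warnings for pitfalls that can be detected from the prompt text."""
    warnings = []
    if "# Context" not in prompt:
        warnings.append("Lack of context: no '# Context' section")
    if "# Constraints" not in prompt:
        warnings.append("Insufficient constraints: no '# Constraints' section")
    if len(prompt.split()) > max_words:
        warnings.append(f"Overly complex prompt: more than {max_words} words")
    return warnings
```

An empty list means none of the checked pitfalls were found, not that the prompt is free of problems.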

## Next Steps

- Experiment with different prompt structures
- Test with various models
- Gather feedback from diverse users
- Iterate based on evaluation results
- Document successful patterns
- Build a prompt template library
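
A prompt template library can start as a small dictionary of named templates. The sketch below uses `string.Template` from the standard library. The template names, their wording, and the `render_prompt` helper are hypothetical examples, not an established set.

```python
from string import Template
from typing import Dict

# Hypothetical starter templates; extend with patterns that have worked well.
TEMPLATES: Dict[str, Template] = {
    "explain": Template("# Task\nExplain $topic\n\n# Context\nAudience: $audience"),
    "summarize": Template("# Task\nSummarize the following text\n\n# Context\n$text"),
}


def render_prompt(name: str, **fields: str) -> str:
    """Fill in the named template; raises KeyError for unknown names or missing fields."""
    if name not in TEMPLATES:
        raise KeyError(f"Unknown template: {name!r}")
    return TEMPLATES[name].substitute(**fields)
```

For example, `render_prompt("explain", topic="binary search", audience="beginners")` produces a prompt with the same `# Task` and `# Context` headers used elsewhere in this notebook.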