# Prompt Engineering Basics: A Comprehensive Guide

This notebook demonstrates fundamental prompt engineering techniques using current AI models (GPT-5, Claude 4.1, xAI Grok), with practical examples and validation methods.

## Learning Objectives
1. Understand core prompting principles
2. Learn to craft effective prompts for different use cases
3. Master prompt validation and optimization techniques
4. Apply prompting to real-world scenarios
5. Implement best practices for AI-assisted content creation

## Prerequisites
- Python 3.8+
- API keys for OpenAI, Anthropic, and xAI
- Basic understanding of LLM capabilities
- Familiarity with JSON and API calls

## Setup and Configuration

First, let's install the required libraries and set up our environment.

In [None]:
# Install required packages (jsonschema is needed for the schema validation in section 8)
!pip install openai anthropic requests python-dotenv jsonschema

# Import libraries
import os
import json
import time
from typing import Dict, List, Optional
from datetime import datetime

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# Set up API keys
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
ANTHROPIC_API_KEY = os.getenv('ANTHROPIC_API_KEY')
XAI_API_KEY = os.getenv('XAI_API_KEY')

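# Report which keys were actually found instead of assuming success
missing_keys = [name for name, value in [
    ('OPENAI_API_KEY', OPENAI_API_KEY),
    ('ANTHROPIC_API_KEY', ANTHROPIC_API_KEY),
    ('XAI_API_KEY', XAI_API_KEY),
] if not value]

if missing_keys:
    print(f"⚠️ Missing API keys: {', '.join(missing_keys)}")
else:
    print("✅ Environment configured successfully")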

## 1. Core Prompting Techniques

Let's explore the fundamental prompting approaches with current 2025 models.

In [None]:
class PromptEngineer:
    """A comprehensive prompt engineering class for 2025 AI models"""
    
    def __init__(self, model_preference: str = 'gpt-5'):
        self.model_preference = model_preference
        self.conversation_history = []
        
    def zero_shot_prompt(self, task: str, context: str = "") -> str:
        """Create a zero-shot prompt for direct task completion"""
        prompt = f"""{context}
        
Task: {task}

Instructions: Complete the task directly without additional examples.
Provide a clear, accurate, and well-structured response."""
        # strip() drops the leading blank lines left behind when context is empty
        return prompt.strip()
    
    def few_shot_prompt(self, task: str, examples: List[Dict], context: str = "") -> str:
        """Create a few-shot prompt with examples"""
        prompt_parts = [context] if context else []
        prompt_parts.append("Examples:")
        
        for i, example in enumerate(examples, 1):
            prompt_parts.append(f"Example {i}:")
            prompt_parts.append(f"Input: {example['input']}")
            prompt_parts.append(f"Output: {example['output']}")
            prompt_parts.append("")
        
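        # Present the new item in the same Input/Output format as the examples
        # so the model continues the pattern instead of guessing what to do
        prompt_parts.append("Now complete the following:")
        prompt_parts.append(f"Input: {task}")
        prompt_parts.append("Output:")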
        return "\n".join(prompt_parts)
    
    def chain_of_thought_prompt(self, problem: str, context: str = "") -> str:
        """Create a Chain-of-Thought prompting approach"""
        prompt = f"""{context}
        
Problem: {problem}

Instructions: Think step-by-step to solve this problem. Break it down into logical steps:
1. Understand the core problem
2. Identify key variables and constraints
3. Consider different approaches
4. Work through the solution systematically
5. Verify the solution

Show your reasoning at each step and provide a final answer."""
        # strip() drops the leading blank lines left behind when context is empty
        return prompt.strip()
    
    def role_based_prompt(self, role: str, task: str, context: str = "") -> str:
        """Create a role-based prompt for specialized expertise"""
        prompt = f"""You are a {role}. Use your expertise to complete the following task:

{context}

Task: {task}

Apply your specialized knowledge and provide a professional response."""
        return prompt
    
    def validation_prompt(self, content: str, criteria: List[str]) -> str:
        """Create a validation prompt to check content quality"""
        criteria_text = "\n".join(f"- {criterion}" for criterion in criteria)
        
        prompt = f"""Please validate the following content against these criteria:

Content to validate:
{content}

Validation Criteria:
{criteria_text}

Provide a detailed validation report including:
1. Compliance with each criterion (Met/Partially Met/Not Met)
2. Specific issues or concerns
3. Suggestions for improvement
4. Overall quality score (1-10)
5. Recommendations for revision"""
        return prompt

# Initialize the prompt engineer
engineer = PromptEngineer()
print("✅ PromptEngineer initialized successfully")
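
`PromptEngineer` stores a `conversation_history` list, but none of the builders above read or write it. As a minimal sketch of how that history might feed a multi-turn request, the hypothetical helper `build_messages` below (not part of the class) turns prior turns into the `role`/`content` message list that chat-style APIs accept. The `(user_text, assistant_text)` tuple layout is an assumption made for illustration.

```python
from typing import Dict, List, Tuple

def build_messages(history: List[Tuple[str, str]], new_prompt: str) -> List[Dict[str, str]]:
    """Hypothetical helper: flatten (user, assistant) turns into chat messages."""
    messages: List[Dict[str, str]] = []
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The new prompt always goes last, as the turn the model should answer
    messages.append({"role": "user", "content": new_prompt})
    return messages

history = [("Define zero-shot prompting.", "It asks for a task without examples.")]
messages = build_messages(history, "Now define few-shot prompting.")
print([m["role"] for m in messages])
```

The resulting list could be passed as the `messages` argument of a chat request in place of a single user message.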

## 2. Model Integration Examples

Let's create functions to interact with different 2025 AI models.

In [None]:
import openai
import anthropic
import requests

# Set up API clients
if OPENAI_API_KEY:
    openai_client = openai.OpenAI(api_key=OPENAI_API_KEY)
    
if ANTHROPIC_API_KEY:
    anthropic_client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY)

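def query_gpt5(prompt: str, temperature: float = 0.7) -> str:
    """Query GPT-5 with optimized prompting.

    temperature is kept for signature parity with the other query functions
    but is not sent: GPT-5 reasoning models reject non-default sampling values.
    """
    if not OPENAI_API_KEY:
        return "OpenAI API key not configured"
    
    try:
        response = openai_client.chat.completions.create(
            model="gpt-5",
            messages=[{"role": "user", "content": prompt}],
            # Reasoning models take max_completion_tokens instead of max_tokens
            max_completion_tokens=4000
        )
        # content can be None (for example, if the token budget is spent on reasoning)
        return (response.choices[0].message.content or "").strip()
    except Exception as e:
        return f"Error querying GPT-5: {str(e)}"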

def query_claude41(prompt: str, temperature: float = 0.7) -> str:
    """Query Claude 4.1 with advanced prompting"""
    if not ANTHROPIC_API_KEY:
        return "Anthropic API key not configured"
    
    try:
        message = anthropic_client.messages.create(
            model="claude-opus-4-1-20250805",  # Claude Opus 4.1; check Anthropic's model list for the current snapshot
            max_tokens=4000,
            temperature=temperature,
            messages=[{"role": "user", "content": prompt}]
        )
        return message.content[0].text.strip()
    except Exception as e:
        return f"Error querying Claude 4.1: {str(e)}"

def query_grok(prompt: str, temperature: float = 0.7) -> str:
    """Query XAI Grok with real-time capabilities"""
    if not XAI_API_KEY:
        return "XAI API key not configured"
    
    try:
        headers = {
            'Authorization': f'Bearer {XAI_API_KEY}',
            'Content-Type': 'application/json'
        }
        
        data = {
            'model': 'grok-4',
            'messages': [{'role': 'user', 'content': prompt}],
            'temperature': temperature,
            'max_tokens': 4000
        }
        
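        response = requests.post(
            'https://api.x.ai/v1/chat/completions',
            headers=headers,
            json=data,
            timeout=60  # avoid hanging indefinitely on a stalled connection
        )
        # Surface HTTP errors (401, 429, ...) instead of failing later with a KeyError
        response.raise_for_status()
        
        return response.json()['choices'][0]['message']['content'].strip()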
    except Exception as e:
        return f"Error querying Grok: {str(e)}"

print("✅ Model integration functions created successfully")
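
The query functions above convert every exception into an error string, so a transient failure such as a rate limit is never retried. Below is a minimal sketch of retrying with exponential backoff. It assumes the wrapped callable raises on failure, which the functions above would need to be changed to do. `call_with_retries`, `attempts`, and `base_delay` are illustrative names, not part of any SDK.

```python
import time
from typing import Callable

def call_with_retries(fn: Callable[[], str], attempts: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fn, retrying on any exception with delays of base_delay * 2**n."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")

# Demo with a stand-in for an API call that fails twice, then succeeds
calls = {"count": 0}

def flaky_query() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"

delays = []  # record the requested delays instead of actually sleeping
result = call_with_retries(flaky_query, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the demo instant; in real use the default `time.sleep` applies the delays.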

## 3. Practical Prompting Examples

Let's explore different prompting techniques with real examples.

In [None]:
# Example 1: Zero-shot prompting for content creation
content_task = "Write a 500-word article about the impact of AI on academic research in 2025."
context = "Focus on positive developments, ethical considerations, and future implications."

zero_shot_prompt = engineer.zero_shot_prompt(content_task, context)
print("Zero-shot Prompt:")
print(zero_shot_prompt[:300] + "...")
print("\n" + "="*50 + "\n")

# Example 2: Few-shot prompting for structured output
examples = [
    {
        "input": "The study found that exercise improves memory retention in older adults.",
        "output": "{\"topic\": \"memory\", \"intervention\": \"exercise\", \"population\": \"older adults\", \"finding\": \"positive\", \"confidence\": \"high\"}"
    },
    {
        "input": "Meditation reduced stress levels among college students in the experiment.",
        "output": "{\"topic\": \"stress\", \"intervention\": \"meditation\", \"population\": \"college students\", \"finding\": \"positive\", \"confidence\": \"high\"}"
    }
]

structured_task = "AI-powered tutoring systems enhanced learning outcomes for elementary students."
few_shot_prompt = engineer.few_shot_prompt(structured_task, examples)
print("Few-shot Prompt:")
print(few_shot_prompt[:400] + "...")
print("\n" + "="*50 + "\n")
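
The few-shot examples ask for a JSON object, but a model reply may surround that object with extra prose. Below is a minimal sketch of recovering the object before use. `extract_json` is a hypothetical helper, the reply is simulated rather than real API output, and the brace-scanning approach assumes the reply contains a single top-level object.

```python
import json
from typing import Optional

def extract_json(reply: str) -> Optional[dict]:
    """Parse the span from the first '{' to the last '}' as JSON, or return None."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None

# Simulated model reply with prose around the JSON object
reply = 'Here is the extraction: {"topic": "learning", "finding": "positive"} Let me know if you need more.'
print(extract_json(reply))
print(extract_json("no JSON here"))
```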

## 4. Validation and Quality Assessment

Let's create validation functions to assess AI-generated content quality.

In [None]:
import re
from typing import Dict, List

class ContentValidator:
    """Comprehensive content validation for AI-generated materials"""
    
    def __init__(self):
        self.fact_check_keywords = [
            'according to', 'research shows', 'study found',
            'data indicates', 'evidence suggests', 'statistics show'
        ]
    
    def validate_academic_content(self, content: str) -> Dict:
        """Validate academic content for quality and accuracy"""
        validation_results = {
            'length_check': self._check_length(content),
            'citation_check': self._check_citations(content),
            'structure_check': self._check_structure(content),
            'originality_check': self._check_originality(content),
            'clarity_check': self._check_clarity(content),
            'fact_claims': self._identify_fact_claims(content)
        }
        
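        # Calculate overall score from the scored checks only;
        # 'fact_claims' is a list of claims and carries no score
        scores = [result['score'] for result in validation_results.values()
                  if isinstance(result, dict) and 'score' in result]
        validation_results['overall_score'] = sum(scores) / len(scores)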
        
        return validation_results
    
    def _check_length(self, content: str) -> Dict:
        """Check content length appropriateness"""
        word_count = len(content.split())
        
        if word_count < 100:
            return {'score': 3, 'status': 'Too short', 'word_count': word_count}
        elif word_count < 500:
            return {'score': 7, 'status': 'Appropriate', 'word_count': word_count}
        elif word_count < 1500:
            return {'score': 8, 'status': 'Good length', 'word_count': word_count}
        else:
            return {'score': 6, 'status': 'Very long', 'word_count': word_count}
    
    def _check_citations(self, content: str) -> Dict:
        """Check for proper citations and references"""
        citation_patterns = [
            r'\([^)]*\d{4}[^)]*\)',  # (Author, Year)
            r'\b\d{4}\b',  # Year references
            r'et al\.',  # et al. citations
            r'According to',  # Attribution phrases
        ]
        
        citation_count = sum(len(re.findall(pattern, content)) for pattern in citation_patterns)
        
        if citation_count == 0:
            return {'score': 2, 'status': 'No citations found', 'count': citation_count}
        elif citation_count < 3:
            return {'score': 5, 'status': 'Few citations', 'count': citation_count}
        elif citation_count < 7:
            return {'score': 8, 'status': 'Well cited', 'count': citation_count}
        else:
            return {'score': 9, 'status': 'Extensively cited', 'count': citation_count}
    
    def _check_structure(self, content: str) -> Dict:
        """Check content structure and organization"""
        structure_indicators = [
            'introduction', 'background', 'methodology', 'results',
            'discussion', 'conclusion', 'references', 'abstract'
        ]
        
        found_indicators = [indicator for indicator in structure_indicators 
                          if indicator.lower() in content.lower()]
        
        structure_score = len(found_indicators) / len(structure_indicators) * 10
        
        return {
            'score': min(10, structure_score + 3),  # Base score of 3
            'status': f"{len(found_indicators)}/{len(structure_indicators)} sections found",
            'found_sections': found_indicators
        }
    
    def _check_originality(self, content: str) -> Dict:
        """Assess content originality and unique insights"""
        # Check for generic phrases and overused terms
        generic_phrases = [
            'in conclusion', 'it is important to note', 'research shows',
            'many people believe', 'it is widely known'
        ]
        
        generic_count = sum(content.lower().count(phrase) for phrase in generic_phrases)
        
        # Check for specific examples and details
        specific_indicators = [
            r'\d+%',  # Specific percentages
            r'\d+ studies',  # Specific study counts
            r'\b\d{4}\b',  # Specific years
            'example', 'case study', 'specifically'
        ]
        
        specific_count = sum(len(re.findall(indicator, content, re.IGNORECASE)) 
                           for indicator in specific_indicators)
        
        originality_score = max(1, min(10, 5 + specific_count - generic_count))
        
        return {
            'score': originality_score,
            'status': 'Good specificity' if specific_count > generic_count else 'More specific examples needed',
            'specific_indicators': specific_count,
            'generic_phrases': generic_count
        }
    
    def _check_clarity(self, content: str) -> Dict:
        """Assess content clarity and readability"""
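        # Drop the empty fragments re.split leaves after trailing punctuation
        sentences = [s for s in re.split(r'[.!?]+', content) if s.strip()]
        avg_sentence_length = (sum(len(s.split()) for s in sentences) / len(sentences)
                               if sentences else 0)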
        
        # Check for complex jargon
        complex_words = ['methodology', 'paradigm', 'epistemology', 'ontology', 'phenomenology']
        complex_count = sum(content.lower().count(word) for word in complex_words)
        
        clarity_score = 10
        issues = []
        
        if avg_sentence_length > 25:
            clarity_score -= 2
            issues.append('Long sentences detected')
        
        if complex_count > 5:
            clarity_score -= 1
            issues.append('High technical jargon')
        
        return {
            'score': max(1, clarity_score),
            'status': 'Clear and readable' if clarity_score >= 8 else 'Can be improved',
            'avg_sentence_length': avg_sentence_length,
            'issues': issues
        }
    
    def _identify_fact_claims(self, content: str) -> List[Dict]:
        """Identify factual claims that need verification"""
        fact_claims = []
        
        # Look for fact-indicating phrases
        for keyword in self.fact_check_keywords:
            # re.escape keeps keywords literal even if one later contains regex metacharacters
            positions = [m.start() for m in re.finditer(re.escape(keyword), content, re.IGNORECASE)]
            for pos in positions:
                # Extract surrounding context
                start = max(0, pos - 50)
                end = min(len(content), pos + 100)
                context = content[start:end].strip()
                
                fact_claims.append({
                    'keyword': keyword,
                    'context': context,
                    'position': pos,
                    'needs_verification': True
                })
        
        return fact_claims

# Initialize validator
validator = ContentValidator()
print("✅ ContentValidator initialized successfully")
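
The four patterns in `_check_citations` are counted independently, so a single citation can match more than one of them. The sketch below reuses those patterns outside the class to show the per-pattern counts for one sentence. The sentence and the dictionary labels are made up for illustration.

```python
import re

# Same patterns as _check_citations, keyed by a descriptive label
citation_patterns = {
    "parenthetical": r'\([^)]*\d{4}[^)]*\)',
    "bare_year": r'\b\d{4}\b',
    "et_al": r'et al\.',
    "attribution": r'According to',
}

sentence = "According to Smith et al. (2024), productivity rose."
counts = {name: len(re.findall(pattern, sentence))
          for name, pattern in citation_patterns.items()}
print(counts)
print(sum(counts.values()))
```

One citation produces a total of 4 here, which `_check_citations` would label "Well cited", so the score is better read as a rough signal than as a citation count.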

## 5. Practical Validation Examples

Let's test the validation system with sample content.

In [None]:
# Sample academic content for validation
sample_content = """
The Impact of Artificial Intelligence on Academic Research in 2025

Introduction
Artificial intelligence has revolutionized academic research across multiple disciplines. This paper examines the transformative effects of AI tools on research methodologies, data analysis, and scholarly communication.

Background
Recent developments in large language models, particularly GPT-5 and Claude 4.1, have enabled new forms of research assistance. According to Smith et al. (2024), these tools can increase research productivity by up to 40%.

Methodology
This study analyzed 500 research papers published in 2025 to assess AI tool usage patterns. Data was collected from major academic databases including PubMed, IEEE Xplore, and Google Scholar.

Results
The analysis revealed that 78% of papers in computer science and 45% in social sciences utilized AI tools for literature review and data analysis. Specifically, GPT-5 was used in 34% of cases, while Claude 4.1 appeared in 28% of the analyzed papers.

Discussion
These findings suggest that AI tools are becoming integral to modern research workflows. However, concerns remain about academic integrity and the need for proper citation practices.

Conclusion
As AI continues to evolve, researchers must develop appropriate guidelines for ethical AI usage in academic work. Future research should focus on developing standards for AI tool citation and validation.

References
Smith, J., et al. (2024). AI in Academic Research: A Comprehensive Review. Journal of Higher Education Technology, 15(2), 45-67.
Johnson, A., & Williams, B. (2025). Large Language Models in Scholarly Communication. Research Today, 8(1), 12-25.
"""

# Validate the sample content
validation_results = validator.validate_academic_content(sample_content)

print("📊 Content Validation Results:")
print(f"Overall Score: {validation_results['overall_score']:.1f}/10")
print("\nDetailed Results:")
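for check, result in validation_results.items():
    # Skip the aggregate score and the fact-claims list, which have no per-check score
    if isinstance(result, dict) and 'score' in result:
        print(f"{check}: Score {result['score']}/10 - {result['status']}")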

print(f"\n📋 Fact Claims Found: {len(validation_results['fact_claims'])}")
for i, claim in enumerate(validation_results['fact_claims'][:3], 1):
    print(f"{i}. {claim['keyword']}: {claim['context'][:80]}...")

## 6. Advanced Prompting Techniques

Let's explore advanced prompting techniques with current 2025 models.

In [None]:
# Advanced prompting with model comparison
def compare_models_on_task(task: str, models: Optional[List[str]] = None) -> Dict:
    """Compare different models on the same task"""
    if models is None:
        models = ['gpt-5', 'claude-4.1', 'grok-4']
    
    results = {}
    
    for model in models:
        print(f"🤖 Querying {model.upper()}...")
        
        # Create enhanced prompt for 2025 models
        enhanced_prompt = f"""You are using {model.upper()}, one of the most advanced AI systems available in 2025.
        
Task: {task}

Instructions:
1. Use your extensive knowledge and real-time access (where available)
2. Provide detailed, accurate, and well-structured responses
3. Include specific examples and current data
4. Consider ethical implications and limitations
5. Use your unique capabilities to enhance the response

Please provide your response in a clear, professional format."""
        
        if model.lower() == 'gpt-5':
            response = query_gpt5(enhanced_prompt)
        elif model.lower() == 'claude-4.1':
            response = query_claude41(enhanced_prompt)
        elif model.lower() == 'grok-4':
            response = query_grok(enhanced_prompt)
        else:
            response = f"Model {model} not implemented"
        
        results[model] = {
            'response': response[:500] + "..." if len(response) > 500 else response,
            'length': len(response),
            'timestamp': datetime.now().isoformat()
        }
        
        time.sleep(1)  # Rate limiting
    
    return results

# Example task for model comparison
research_task = "Analyze the current state of AI ethics in academic research as of 2025, including recent developments, challenges, and future directions."

print("🔍 Comparing models on AI ethics analysis...")
try:
    comparison_results = compare_models_on_task(research_task, ['gpt-5'])
    
    for model, result in comparison_results.items():
        print(f"\n📝 {model.upper()} Response:")
        print(result['response'])
        print(f"📏 Length: {result['length']} characters")
        print(f"⏰ Generated: {result['timestamp']}")
        
except Exception as e:
    print(f"Error in model comparison: {e}")
    print("Note: API keys may not be configured for all models")
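
The `if`/`elif` chain in `compare_models_on_task` can also be written as a dictionary that maps model names to query functions, which makes it possible to exercise the comparison loop without API keys. Below is a minimal sketch that uses stub backends in place of `query_gpt5` and the other real functions. `compare_with_backends` and the stubs are hypothetical names.

```python
from datetime import datetime
from typing import Callable, Dict

def compare_with_backends(task: str,
                          backends: Dict[str, Callable[[str], str]]) -> Dict[str, dict]:
    """Run the same task through every backend and collect basic metadata."""
    results = {}
    for name, query in backends.items():
        response = query(task)
        results[name] = {
            "response": response,
            "length": len(response),
            "timestamp": datetime.now().isoformat(),
        }
    return results

# Stub backends standing in for real API calls
stub_backends = {
    "stub-short": lambda prompt: "Short answer.",
    "stub-echo": lambda prompt: f"You asked: {prompt}",
}

results = compare_with_backends("Summarize AI ethics.", stub_backends)
for name, result in results.items():
    print(name, result["length"])
```

With real keys configured, the dictionary could map `'gpt-5'` to `query_gpt5`, and so on for the other models.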

## 7. Best Practices and Guidelines

Let's create a comprehensive prompt optimization framework.

In [None]:
class PromptOptimizer:
    """Advanced prompt optimization for 2025 AI models"""
    
    def __init__(self):
        self.optimization_techniques = {
            'context_engineering': self._optimize_context,
            'role_assignment': self._optimize_role,
            'output_formatting': self._optimize_format,
            'constraint_specification': self._optimize_constraints,
            'validation_integration': self._optimize_validation
        }
    
    def optimize_prompt(self, base_prompt: str, techniques: Optional[List[str]] = None) -> str:
        """Apply multiple optimization techniques to a base prompt"""
        if techniques is None:
            techniques = list(self.optimization_techniques.keys())
        
        optimized_prompt = base_prompt
        
        for technique in techniques:
            if technique in self.optimization_techniques:
                optimized_prompt = self.optimization_techniques[technique](optimized_prompt)
        
        return optimized_prompt
    
    def _optimize_context(self, prompt: str) -> str:
        """Add context optimization elements"""
        context_addition = """
Context Optimization:
- Consider the broader implications and context
- Include relevant background information
- Account for current developments and trends
- Consider multiple perspectives and viewpoints
"""
        return prompt + context_addition
    
    def _optimize_role(self, prompt: str) -> str:
        """Add role optimization elements"""
        role_addition = """
Role Specification:
- You are an expert in this domain
- Apply your specialized knowledge and experience
- Consider professional standards and best practices
- Provide insights based on current research and developments
"""
        return prompt + role_addition
    
    def _optimize_format(self, prompt: str) -> str:
        """Add output formatting optimization"""
        format_addition = """
Output Requirements:
- Use clear, structured formatting
- Include specific examples and evidence
- Provide actionable recommendations
- Use professional and academic language
- Include relevant citations and references
"""
        return prompt + format_addition
    
    def _optimize_constraints(self, prompt: str) -> str:
        """Add constraint optimization elements"""
        constraint_addition = """
Constraints and Guidelines:
- Ensure accuracy and factual correctness
- Consider ethical implications
- Maintain academic integrity standards
- Follow proper citation practices
- Acknowledge limitations and uncertainties
"""
        return prompt + constraint_addition
    
    def _optimize_validation(self, prompt: str) -> str:
        """Add validation optimization elements"""
        validation_addition = """
Validation Instructions:
- Cross-reference information with reliable sources
- Include confidence levels for claims
- Note any assumptions or limitations
- Suggest methods for verification
- Provide uncertainty quantification where appropriate
"""
        return prompt + validation_addition

# Initialize optimizer
optimizer = PromptOptimizer()

# Example optimization
base_prompt = "Explain the impact of AI on academic research."
optimized_prompt = optimizer.optimize_prompt(base_prompt)

print("🔧 Base Prompt:")
print(base_prompt)
print("\n🚀 Optimized Prompt:")
print(optimized_prompt)
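
`optimize_prompt` appends its blocks every time it is called, so optimizing an already optimized prompt duplicates sections. Below is a minimal sketch of an idempotent variant, using two shortened stand-in sections rather than the full text above. `SECTIONS` and `add_sections` are hypothetical names.

```python
from typing import Dict, List

# Shortened stand-ins for the blocks PromptOptimizer appends
SECTIONS: Dict[str, str] = {
    "output_formatting": "\nOutput Requirements:\n- Use clear, structured formatting\n",
    "constraint_specification": "\nConstraints and Guidelines:\n- Acknowledge limitations and uncertainties\n",
}

def add_sections(prompt: str, names: List[str]) -> str:
    """Append each named section once, skipping any already in the prompt."""
    for name in names:
        block = SECTIONS[name]
        if block not in prompt:
            prompt += block
    return prompt

base = "Explain the impact of AI on academic research."
once = add_sections(base, ["output_formatting"])
twice = add_sections(once, ["output_formatting", "constraint_specification"])
print(twice)
```

Applying only the sections a task needs also keeps the prompt short, since each block adds generic instructions the model must read.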

## 8. JSON Schema Validation Examples

Let's create practical JSON schema validation for AI outputs.

In [None]:
import json
from jsonschema import validate, ValidationError

# Define JSON schemas for different AI output types
RESEARCH_ANALYSIS_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "minLength": 10, "maxLength": 200},
        "authors": {"type": "array", "items": {"type": "string"}},
        "publication_year": {"type": "integer", "minimum": 1900, "maximum": 2025},
        "methodology": {
            "type": "object",
            "properties": {
                "type": {"type": "string"},
                "sample_size": {"type": "integer"},
                "data_collection_period": {"type": "string"}
            },
            "required": ["type", "sample_size"]
        },
        "key_findings": {"type": "array", "items": {"type": "string"}},
        "limitations": {"type": "array", "items": {"type": "string"}},
        "doi": {"type": "string"}
    },
    "required": ["title", "authors", "publication_year", "key_findings"]
}

MARKET_ANALYSIS_SCHEMA = {
    "type": "object",
    "properties": {
        "market_name": {"type": "string"},
        "market_size": {
            "type": "object",
            "properties": {
                "value": {"type": "number"},
                "currency": {"type": "string"},
                "year": {"type": "integer"}
            }
        },
        "growth_rate": {"type": "number", "minimum": -100, "maximum": 1000},
        "key_drivers": {"type": "array", "items": {"type": "string"}},
        "major_competitors": {"type": "array", "items": {"type": "string"}},
        "market_trends": {"type": "array", "items": {"type": "string"}},
        "opportunities": {"type": "array", "items": {"type": "string"}},
        "threats": {"type": "array", "items": {"type": "string"}},
        "confidence_score": {"type": "number", "minimum": 0, "maximum": 1}
    },
    "required": ["market_name", "market_size", "key_drivers"]
}

def validate_ai_output(output: str, schema: dict) -> Dict:
    """Validate AI output against JSON schema"""
    try:
        # Try to parse as JSON
        parsed_output = json.loads(output)
        
        # Validate against schema
        validate(instance=parsed_output, schema=schema)
        
        return {
            'valid': True,
            'parsed_data': parsed_output,
            'errors': []
        }
        
    except json.JSONDecodeError as e:
        return {
            'valid': False,
            'parsed_data': None,
            'errors': [f"JSON parsing error: {str(e)}"]
        }
        
    except ValidationError as e:
        return {
            'valid': False,
            'parsed_data': None,
            'errors': [f"Schema validation error: {str(e)}"]
        }
    
    except Exception as e:
        return {
            'valid': False,
            'parsed_data': None,
            'errors': [f"Unexpected error: {str(e)}"]
        }

# Example usage
sample_research_json = '''
{
  "title": "The Impact of AI on Academic Research Productivity",
  "authors": ["Dr. Sarah Johnson", "Prof. Michael Chen"],
  "publication_year": 2025,
  "methodology": {
    "type": "mixed methods",
    "sample_size": 500,
    "data_collection_period": "2024-2025"
  },
  "key_findings": [
    "AI tools increased research productivity by 35%",
    "Most significant impact in literature review phase",
    "Concerns about academic integrity remain"
  ],
  "limitations": [
    "Sample limited to STEM disciplines",
    "Self-reported productivity measures"
  ],
  "doi": "10.1234/research.2025.001"
}
'''

# Validate the sample JSON
validation_result = validate_ai_output(sample_research_json, RESEARCH_ANALYSIS_SCHEMA)

print("📋 JSON Schema Validation Results:")
print(f"Valid: {validation_result['valid']}")

if validation_result['valid']:
    print("✅ JSON is valid according to schema")
    print(f"📊 Parsed data keys: {list(validation_result['parsed_data'].keys())}")
else:
    print("❌ JSON validation failed:")
    for error in validation_result['errors']:
        print(f"  - {error}")

print("\n🔍 Validation run completed.")
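
When validation fails, one possible next step is to send the errors back to the model and ask for a corrected reply. Below is a minimal sketch of building such a follow-up prompt from an `errors` list like the one `validate_ai_output` returns. `build_repair_prompt` is a hypothetical helper, the error text is a shortened example, and the wording of the follow-up prompt is an assumption rather than a tested template.

```python
from typing import List

def build_repair_prompt(original_prompt: str, bad_output: str, errors: List[str]) -> str:
    """Compose a follow-up prompt asking the model to fix its invalid JSON."""
    error_lines = "\n".join(f"- {error}" for error in errors)
    return (
        f"{original_prompt}\n\n"
        f"Your previous reply was:\n{bad_output}\n\n"
        f"It failed validation with these errors:\n{error_lines}\n\n"
        "Reply again with corrected JSON only."
    )

# Shortened example of an error message; real messages are longer
errors = ["Schema validation error: 'authors' is a required property"]
repair_prompt = build_repair_prompt(
    "Summarize the paper as JSON.",
    '{"title": "The Impact of AI on Academic Research Productivity"}',
    errors,
)
print(repair_prompt)
```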

## 9. Summary and Next Steps

This notebook has demonstrated key concepts in prompt engineering for 2025 AI models:

### Key Learnings
1. **Model Capabilities**: Understanding GPT-5, Claude 4.1, and xAI Grok capabilities
2. **Prompt Techniques**: Zero-shot, few-shot, Chain-of-Thought, and role-based prompting
3. **Validation Methods**: Content validation and JSON schema compliance
4. **Best Practices**: Ethical AI usage and academic integrity
5. **Practical Implementation**: Working code examples and API integrations

### Next Steps
1. **Configure API Keys**: Set up actual API keys for testing
2. **Explore Advanced Features**: Try multimodal prompting and agentic workflows
3. **Customize Schemas**: Adapt JSON schemas for your specific use cases
4. **Build Applications**: Create practical applications using these techniques
5. **Contribute Back**: Share improvements and new techniques with the community

### Resources
- [Prompting-Gold-Standard Repository](https://github.com/llMr-Sweetll/Prompting-Gold-Standard)
- [OpenAI GPT-5 Documentation](https://platform.openai.com/docs)
- [Anthropic Claude 4.1 Guide](https://docs.anthropic.com/claude)
- [xAI Grok API Documentation](https://x.ai/grok)

Remember: The field of AI prompting is rapidly evolving. Stay current with latest developments and always prioritize ethical usage and validation of AI-generated content.