diff --git a/01-course/module-01-foundations/README.md b/01-course/module-01-foundations/README.md
index 2c422a6..afd9dc6 100644
--- a/01-course/module-01-foundations/README.md
+++ b/01-course/module-01-foundations/README.md
@@ -5,10 +5,10 @@
This foundational module introduces you to prompt engineering concepts and gets your development environment configured for hands-on learning.
### Learning Objectives
-By completing this module, you will:
+By completing this module, you will be able to:
- ✅ Set up a working development environment with AI assistant access
- ✅ Identify and apply the four core elements of effective prompts
-- ✅ Write basic prompts for code improvement and documentation
+- ✅ Write basic prompts for reviewing code
- ✅ Iterate and refine prompts based on output quality
### Getting Started
diff --git a/01-course/module-01-foundations/module1.ipynb b/01-course/module-01-foundations/module1.ipynb
index 31c4120..fe3c925 100644
--- a/01-course/module-01-foundations/module1.ipynb
+++ b/01-course/module-01-foundations/module1.ipynb
@@ -166,6 +166,7 @@
"
Installation commands run locally and install packages to your Python environment
\n",
"
You don't copy/paste - just click the run button in each cell
\n",
"
Output appears below each cell after you run it
\n",
+ "
Long outputs are truncated: If you see \"Output is truncated. View as a scrollable element\" - click that link to see the full response in a scrollable view
\n",
"\n",
"\n"
]
@@ -205,7 +206,7 @@
"def install_requirements():\n",
" try:\n",
" # Install from requirements.txt\n",
- " subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-r\", \"requirements.txt\"])\n",
+ " subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-q\", \"-r\", \"requirements.txt\"])\n",
" print(\"✅ SUCCESS! All dependencies installed successfully.\")\n",
" print(\"📦 Installed: openai, anthropic, python-dotenv, requests\")\n",
" except subprocess.CalledProcessError as e:\n",
@@ -235,7 +236,14 @@
"\n",
"Choose your preferred option:\n",
"\n",
- "- **Option A: GitHub Copilot API (local proxy)**: Recommended if you don't have OpenAI or CircuIT API access. Follow [GitHub-Copilot-2-API/README.md](../../GitHub-Copilot-2-API/README.md) to authenticate and start the local server, then run the `GitHub Copilot (local proxy)` setup cells below.\n",
+ "- **Option A: GitHub Copilot API (local proxy)**: Recommended if you don't have OpenAI or CircuIT API access.\n",
+ " - Supports both **Claude** and **OpenAI** models\n",
+ " - No API keys needed - uses your GitHub Copilot subscription\n",
+ " - Follow [GitHub-Copilot-2-API/README.md](../../GitHub-Copilot-2-API/README.md) to authenticate and start the local server\n",
+ " - Run the setup cell below and **edit your preferred provider** (`\"openai\"` or `\"claude\"`) by setting the `PROVIDER` variable\n",
+ " - Available models:\n",
+ " - **OpenAI**: gpt-4o, gpt-4, gpt-3.5-turbo, o3-mini, o4-mini\n",
+ " - **Claude**: claude-3.5-sonnet, claude-3.7-sonnet, claude-sonnet-4\n",
"\n",
"- **Option B: OpenAI API**: If you have OpenAI API access, you can use the `OpenAI` connection cells provided later in this notebook.\n",
"\n",
@@ -309,40 +317,144 @@
"metadata": {},
"outputs": [],
"source": [
- "# GitHub Copilot API setup (local proxy)\n",
+ "# Option A: GitHub Copilot API setup (Recommended)\n",
"import openai\n",
+ "import anthropic\n",
"import os\n",
"\n",
- "# Configure for local GitHub Copilot proxy\n",
- "client = openai.OpenAI(\n",
+ "# ============================================\n",
+ "# 🎯 CHOOSE YOUR AI MODEL PROVIDER\n",
+ "# ============================================\n",
+ "# Set your preference: \"openai\" or \"claude\"\n",
+ "PROVIDER = \"claude\" # Change to \"claude\" to use Claude models\n",
+ "\n",
+ "# ============================================\n",
+ "# 📋 Available Models by Provider\n",
+ "# ============================================\n",
+ "# OpenAI Models (via GitHub Copilot):\n",
+ "# - gpt-4o (recommended, supports vision)\n",
+ "# - gpt-4\n",
+ "# - gpt-3.5-turbo\n",
+ "# - o3-mini, o4-mini\n",
+ "#\n",
+ "# Claude Models (via GitHub Copilot):\n",
+ "# - claude-3.5-sonnet (recommended, supports vision)\n",
+ "# - claude-3.7-sonnet (supports vision)\n",
+ "# - claude-sonnet-4 (supports vision)\n",
+ "# ============================================\n",
+ "\n",
+ "# Configure clients for both providers\n",
+ "openai_client = openai.OpenAI(\n",
" base_url=\"http://localhost:7711/v1\",\n",
- " api_key=\"dummy-key\" # The local proxy doesn't need a real key\n",
+ " api_key=\"dummy-key\"\n",
")\n",
"\n",
- "def get_chat_completion(messages, model=\"gpt-4\", temperature=0.7):\n",
- " \"\"\"\n",
- " Get a chat completion from the AI model.\n",
- " \n",
- " Args:\n",
- " messages: List of message dictionaries with 'role' and 'content'\n",
- " model: Model name (default: gpt-4)\n",
- " temperature: Creativity level 0-1 (default: 0.7)\n",
- " \n",
- " Returns:\n",
- " String response from the AI model\n",
- " \"\"\"\n",
+ "claude_client = anthropic.Anthropic(\n",
+ " api_key=\"dummy-key\",\n",
+ " base_url=\"http://localhost:7711\"\n",
+ ")\n",
+ "\n",
+ "# Set default models for each provider\n",
+ "OPENAI_DEFAULT_MODEL = \"gpt-4o\"\n",
+ "CLAUDE_DEFAULT_MODEL = \"claude-3.5-sonnet\"\n",
+ "\n",
+ "\n",
+ "def _extract_text_from_blocks(blocks):\n",
+ " \"\"\"Extract text content from response blocks returned by the API.\"\"\"\n",
+ " parts = []\n",
+ " for block in blocks:\n",
+ " text_val = getattr(block, \"text\", None)\n",
+ " if isinstance(text_val, str):\n",
+ " parts.append(text_val)\n",
+ " elif isinstance(block, dict):\n",
+ " t = block.get(\"text\")\n",
+ " if isinstance(t, str):\n",
+ " parts.append(t)\n",
+ " return \"\\n\".join(parts)\n",
+ "\n",
+ "\n",
+ "def get_openai_completion(messages, model=None, temperature=0.0):\n",
+ " \"\"\"Get completion from OpenAI models via GitHub Copilot.\"\"\"\n",
+ " if model is None:\n",
+ " model = OPENAI_DEFAULT_MODEL\n",
" try:\n",
- " response = client.chat.completions.create(\n",
+ " response = openai_client.chat.completions.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=temperature\n",
" )\n",
" return response.choices[0].message.content\n",
" except Exception as e:\n",
- " return f\"❌ Error: {e}\\\\n\\\\n💡 Make sure the GitHub Copilot local proxy is running on port 7711\"\n",
+ " return f\"❌ Error: {e}\\n💡 Make sure GitHub Copilot proxy is running on port 7711\"\n",
+ "\n",
+ "\n",
+ "def get_claude_completion(messages, model=None, temperature=0.0):\n",
+ " \"\"\"Get completion from Claude models via GitHub Copilot.\"\"\"\n",
+ " if model is None:\n",
+ " model = CLAUDE_DEFAULT_MODEL\n",
+ " try:\n",
+ " response = claude_client.messages.create(\n",
+ " model=model,\n",
+ " max_tokens=8192,\n",
+ " messages=messages,\n",
+ " temperature=temperature\n",
+ " )\n",
+ " return _extract_text_from_blocks(getattr(response, \"content\", []))\n",
+ " except Exception as e:\n",
+ " return f\"❌ Error: {e}\\n💡 Make sure GitHub Copilot proxy is running on port 7711\"\n",
"\n",
- "print(\"✅ GitHub Copilot API configured successfully!\")\n",
- "print(\"🔗 Connected to: http://localhost:7711\")\n"
+ "\n",
+ "def get_chat_completion(messages, model=None, temperature=0.7):\n",
+ " \"\"\"\n",
+ " Generic function to get chat completion from any provider.\n",
+ " Routes to the appropriate provider-specific function based on PROVIDER setting.\n",
+ " \"\"\"\n",
+ " if PROVIDER.lower() == \"claude\":\n",
+ " return get_claude_completion(messages, model, temperature)\n",
+ " else: # Default to OpenAI\n",
+ " return get_openai_completion(messages, model, temperature)\n",
+ "\n",
+ "\n",
+ "def get_default_model():\n",
+ " \"\"\"Get the default model for the current provider.\"\"\"\n",
+ " if PROVIDER.lower() == \"claude\":\n",
+ " return CLAUDE_DEFAULT_MODEL\n",
+ " else:\n",
+ " return OPENAI_DEFAULT_MODEL\n",
+ "\n",
+ "\n",
+ "# ============================================\n",
+ "# 🧪 TEST CONNECTION\n",
+ "# ============================================\n",
+ "print(\"🔄 Testing connection to GitHub Copilot proxy...\")\n",
+ "test_result = get_chat_completion([\n",
+ " {\"role\": \"user\", \"content\": \"test\"}\n",
+ "])\n",
+ "\n",
+ "if test_result and \"Error\" in test_result:\n",
+ " print(\"\\n\" + \"=\"*60)\n",
+ " print(\"❌ CONNECTION FAILED!\")\n",
+ " print(\"=\"*60)\n",
+ " print(f\"Provider: {PROVIDER.upper()}\")\n",
+ " print(f\"Expected endpoint: http://localhost:7711\")\n",
+ " print(\"\\n⚠️ The GitHub Copilot proxy is NOT running!\")\n",
+ " print(\"\\n📋 To fix this:\")\n",
+ " print(\" 1. Open a new terminal\")\n",
+ " print(\" 2. Navigate to your copilot-api directory\")\n",
+ " print(\" 3. Run: uv run copilot2api start\")\n",
+ " print(\" 4. Wait for the server to start (you should see 'Server initialized')\")\n",
+ " print(\" 5. Come back and rerun this cell\")\n",
+ " print(\"\\n💡 Need setup help? See: GitHub-Copilot-2-API/README.md\")\n",
+ " print(\"=\"*70)\n",
+ "else:\n",
+ " print(\"\\n\" + \"=\"*60)\n",
+ " print(\"✅ CONNECTION SUCCESSFUL!\")\n",
+ " print(\"=\"*60)\n",
+ " print(f\"🤖 Provider: {PROVIDER.upper()}\")\n",
+ " print(f\"📦 Default Model: {get_default_model()}\")\n",
+ " print(f\"🔗 Endpoint: http://localhost:7711\")\n",
+ " print(f\"\\n💡 To switch providers, change PROVIDER to '{'claude' if PROVIDER.lower() == 'openai' else 'openai'}' and rerun this cell\")\n",
+ " print(\"=\"*70)\n"
]
},
{
@@ -490,9 +602,9 @@
"print(response)\n",
"\n",
"if response and \"Connection successful\" in response:\n",
- " print(\"\\\\n🎉 Perfect! Your AI connection is working!\")\n",
+ " print(\"\\n🎉 Perfect! Your AI connection is working!\")\n",
"else:\n",
- " print(\"\\\\n⚠️ Connection test complete, but response format may vary.\")\n",
+ " print(\"\\n⚠️ Connection test complete, but response format may vary.\")\n",
" print(\"This is normal - let's continue with the tutorial!\")\n"
]
},
diff --git a/01-course/module-02-fundamentals/README.md b/01-course/module-02-fundamentals/README.md
index d36af1b..a4b5a9c 100644
--- a/01-course/module-02-fundamentals/README.md
+++ b/01-course/module-02-fundamentals/README.md
@@ -4,59 +4,50 @@
This module covers the essential prompt engineering techniques that form the foundation of effective AI assistant interaction for software development.
-### What You'll Learn
-- Clear instruction writing and specification techniques
-- Role prompting and persona adoption for specialized expertise
-- Using delimiters and structured inputs for complex tasks
-- Step-by-step reasoning and few-shot learning patterns
-- Providing reference text to reduce hallucinations
-
-### Module Contents
-- **[module2.ipynb](./module2.ipynb)** - Complete module 2 tutorial notebook
+### Learning Objectives
+By completing this module, you will be able to:
-### Core Techniques Covered
+- ✅ Apply eight core prompt engineering techniques to real coding scenarios
+- ✅ Write clear instructions with specific constraints and requirements
+- ✅ Use role prompting to transform AI into specialized domain experts
+- ✅ Organize complex inputs using XML delimiters and structured formatting
+- ✅ Teach AI your preferred styles using few-shot examples
+- ✅ Implement chain-of-thought reasoning for systematic problem-solving
+- ✅ Ground AI responses in reference texts with proper citations
+- ✅ Break complex tasks into sequential workflows using prompt chaining
+- ✅ Create evaluation rubrics and self-critique loops with LLM-as-Judge
+- ✅ Separate reasoning from clean final outputs using inner monologue
-#### 1. Clear Instructions & Specifications
-- Writing precise, unambiguous prompts
-- Specifying constraints, formats, and requirements
-- Handling edge cases and error conditions
+### Getting Started
-#### 2. Role Prompting & Personas
-- Adopting specialized engineering roles (security, performance, QA)
-- Leveraging domain expertise through persona prompting
-- Combining multiple perspectives for comprehensive analysis
+**First time here?** If you haven't set up your development environment yet, follow the [Quick Setup guide](../../README.md#-quick-setup) in the main README first.
-#### 3. Delimiters & Structured Inputs
-- Organizing complex multi-file inputs using headers and XML-like tags
-- Separating requirements, context, and code cleanly
-- Structuring outputs for consistency and parsability
+**Ready to start?**
+1. **Open the tutorial notebook**: Click on [module2.ipynb](./module2.ipynb) to start the interactive tutorial
+2. **Install dependencies**: Run the "Install Required Dependencies" cell in the notebook
+3. **Follow the notebook**: Work through each cell sequentially - the notebook will guide you through setup and exercises
+4. **Complete exercises**: Practice the hands-on activities as you go
-#### 4. Step-by-Step Reasoning
-- Guiding systematic analysis through explicit steps
-- Building chains of reasoning for complex problems
-- Creating reproducible analytical workflows
+### Module Contents
+- **[module2.ipynb](./module2.ipynb)** - Complete module 2 tutorial notebook
-#### 5. Few-Shot Learning & Examples
-- Providing high-quality examples to establish patterns
-- Teaching consistent formatting and style
-- Demonstrating edge case handling
+### Time Required
+Approximately 90-120 minutes (1.5-2 hours)
-### Learning Objectives
-By completing this module, you will:
-- ✅ Master the six core prompt engineering techniques
-- ✅ Be able to transform vague requests into specific, actionable prompts
-- ✅ Know how to structure complex multi-file refactoring tasks
-- ✅ Understand how to guide AI assistants through systematic analysis
-- ✅ Have practical experience with each technique applied to code
+**Time Breakdown:**
+- Setup and introduction: ~10 minutes
+- 8 core tactics with examples: ~70 minutes
+- Hands-on practice activities: ~20-30 minutes
+- Progress tracking: ~5 minutes
-### Time Required
-Approximately 30 minutes
+💡 **Tip:** You can complete this module in one session or break it into multiple shorter sessions. Each tactic is self-contained, making it easy to pause and resume.
### Prerequisites
-- Completion of [Module 1: Foundations](../module-01-foundations/)
-- Working development environment with AI assistant access
+- Python 3.8+ installed
+- IDE with notebook support (VS Code or Cursor recommended)
+- API access to GitHub Copilot, CircuIT, or OpenAI
### Next Steps
After completing this module:
-1. Practice with the integrated exercises in this module
-2. Continue to [Module 3: Applications](../module-03-applications/)
+1. Practice with the integrated exercises in this module
+2. Continue to [Module 3: Application in Software Engineering](../module-03-applications/)
diff --git a/01-course/module-02-fundamentals/module2.ipynb b/01-course/module-02-fundamentals/module2.ipynb
index b047b6d..a9b6158 100644
--- a/01-course/module-02-fundamentals/module2.ipynb
+++ b/01-course/module-02-fundamentals/module2.ipynb
@@ -6,95 +6,49 @@
"source": [
"# Module 2 - Core Prompting Techniques\n",
"\n",
- "## What You'll Learn\n",
- "\n",
- "In this hands-on module, you'll master the fundamental prompting techniques that professional developers use daily. You'll learn to craft prompts that leverage role-playing, structured inputs, examples, and step-by-step reasoning to get consistently excellent results from AI assistants.\n",
- "\n",
- "**What you'll accomplish:**\n",
- "- ✅ Master role prompting and personas for specialized expertise\n",
- "- ✅ Use delimiters and structured inputs for complex scenarios\n",
- "- ✅ Apply few-shot examples to establish consistent output styles\n",
- "- ✅ Implement chain-of-thought reasoning for complex problems\n",
- "- ✅ Build advanced prompts that reference external documentation\n",
- "- ✅ Create production-ready prompts for software engineering tasks\n",
- "\n",
- "## Prerequisites\n",
- "\n",
- "### Required Knowledge\n",
- "- Completion of Module 1 (Foundation Setup) or equivalent experience\n",
- "- Basic understanding of prompt structure (instructions, context, input, output format)\n",
- "- Familiarity with Python and software development concepts\n",
- "\n",
- "### Required Setup\n",
- "- [ ] Python 3.8+ installed on your system\n",
- "- [ ] IDE with notebook support (VS Code, Cursor, or Jupyter)\n",
- "- [ ] API access to either:\n",
- " - GitHub Copilot (preferred for this tutorial)\n",
- " - CircuIT APIs, or\n",
- " - OpenAI API key\n",
- "\n",
- "### Time Required\n",
- "- Approximately 45 minutes total\n",
- "- Can be completed in 2 sessions of 20-25 minutes each\n",
- "\n",
- "## Tutorial Structure\n",
- "\n",
- "### Part 1: Role Prompting and Personas (15 min)\n",
- "- Learn to assign specific expertise roles to AI assistants\n",
- "- Practice with software engineering personas\n",
- "- See immediate improvements in response quality\n",
- "\n",
- "### Part 2: Structured Inputs and Delimiters (15 min)\n",
- "- Master the use of delimiters for complex inputs\n",
- "- Organize multi-file code scenarios\n",
- "- Handle mixed content types effectively\n",
- "\n",
- "### Part 3: Examples and Chain-of-Thought (15 min)\n",
- "- Use few-shot examples to establish consistent styles\n",
- "- Implement step-by-step reasoning for complex tasks\n",
- "- Build systematic approaches to code analysis\n",
+ "| **Aspect** | **Details** |\n",
+ "|-------------|-------------|\n",
+ "| **Goal** | Craft prompts that leverage role-playing, structured inputs, examples, and step-by-step reasoning to get consistently excellent results from AI assistants |\n",
+ "| **Time** | ~90-120 minutes (1.5-2 hours) |\n",
+ "| **Prerequisites** | Python 3.8+, IDE with notebook support, API access (GitHub Copilot, CircuIT, or OpenAI) |\n",
+ "| **Setup Required** | Clone the repository and follow [Quick Setup](../README.md) before running this notebook |\n",
"\n",
"---\n",
"\n",
"## 🚀 Ready to Start?\n",
"\n",
- "**Important:** This module requires fresh setup. Even if you completed Module 1, please run the setup cells below to ensure everything works correctly.\n"
+ "
\n",
+ "⚠️ Important:
\n",
+ "This module requires fresh setup. Even if you completed Module 1, run the setup cells below to ensure everything works correctly. \n",
+ "
\n",
+ "\n",
+ "---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "---\n",
+ "## 🔧 Setup: Environment Configuration\n",
"\n",
- "# Fresh Environment Setup\n",
+ "### Step 1: Install Required Dependencies\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
"\n",
- "Even if you completed Module 1, please run these setup cells to ensure your environment is ready for Module 2.\n",
+ "Let's start by installing the packages we need for this tutorial.\n",
"\n",
- "## Step 0.1: Install Dependencies\n"
+ "Run the cell below. You should see a success message when installation completes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "❌ Installation failed: Command '['/Users/snekarma/Development/SplunkDev/prompteng-devs/.venv/bin/python', '-m', 'pip', 'install', '-r', './requirements.txt']' returned non-zero exit status 1.\n",
- "💡 Try running: pip install openai anthropic python-dotenv requests\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "/Users/snekarma/Development/SplunkDev/prompteng-devs/.venv/bin/python: No module named pip\n"
- ]
- }
- ],
+ "outputs": [],
"source": [
"# Install required packages for Module 2\n",
"import subprocess\n",
@@ -103,7 +57,7 @@
"def install_requirements():\n",
" try:\n",
" # Install from requirements.txt\n",
- " subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-r\", \"requirements.txt\"])\n",
+ " subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-q\", \"-r\", \"requirements.txt\"])\n",
" print(\"✅ SUCCESS! Module 2 dependencies installed successfully.\")\n",
" print(\"📦 Ready for: openai, anthropic, python-dotenv, requests\")\n",
" except subprocess.CalledProcessError as e:\n",
@@ -117,9 +71,27 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Step 0.2: Configure API Connection\n",
+ "### Step 2: Connect to AI Model\n",
+ "\n",
+ "
\n",
+ "💡 Note:
\n",
+ "The code below runs on your local machine and connects to AI services over the internet.\n",
+ "
\n",
"\n",
- "Choose your preferred API option and run the corresponding cell:\n"
+ "Choose your preferred option:\n",
+ "\n",
+ "- **Option A: GitHub Copilot API (local proxy)** ⭐ **Recommended**: \n",
+ " - Supports both **Claude** and **OpenAI** models\n",
+ " - No API keys needed - uses your GitHub Copilot subscription\n",
+ " - Follow [GitHub-Copilot-2-API/README.md](../../GitHub-Copilot-2-API/README.md) to authenticate and start the local server\n",
+ " - Run the setup cell below and **edit your preferred provider** (`\"openai\"` or `\"claude\"`) by setting the `PROVIDER` variable\n",
+ " - Available models:\n",
+ " - **OpenAI**: gpt-4o, gpt-4, gpt-3.5-turbo, o3-mini, o4-mini\n",
+ " - **Claude**: claude-3.5-sonnet, claude-3.7-sonnet, claude-sonnet-4\n",
+ "\n",
+ "- **Option B: OpenAI API**: If you have OpenAI API access, uncomment and run the **Option B** cell below.\n",
+ "\n",
+ "- **Option C: CircuIT APIs (Azure OpenAI)**: If you have CircuIT API access, uncomment and run the **Option C** cell below.\n"
]
},
{
@@ -130,37 +102,279 @@
"source": [
"# Option A: GitHub Copilot API setup (Recommended)\n",
"import openai\n",
+ "import anthropic\n",
"import os\n",
"\n",
- "# Configure for local GitHub Copilot proxy\n",
- "client = openai.OpenAI(\n",
+ "# ============================================\n",
+ "# 🎯 CHOOSE YOUR AI MODEL PROVIDER\n",
+ "# ============================================\n",
+ "# Set your preference: \"openai\" or \"claude\"\n",
+ "PROVIDER = \"claude\" # Change to \"claude\" to use Claude models\n",
+ "\n",
+ "# ============================================\n",
+ "# 📋 Available Models by Provider\n",
+ "# ============================================\n",
+ "# OpenAI Models (via GitHub Copilot):\n",
+ "# - gpt-4o (recommended, supports vision)\n",
+ "# - gpt-4\n",
+ "# - gpt-3.5-turbo\n",
+ "# - o3-mini, o4-mini\n",
+ "#\n",
+ "# Claude Models (via GitHub Copilot):\n",
+ "# - claude-3.5-sonnet (recommended, supports vision)\n",
+ "# - claude-3.7-sonnet (supports vision)\n",
+ "# - claude-sonnet-4 (supports vision)\n",
+ "# ============================================\n",
+ "\n",
+ "# Configure clients for both providers\n",
+ "openai_client = openai.OpenAI(\n",
" base_url=\"http://localhost:7711/v1\",\n",
" api_key=\"dummy-key\"\n",
")\n",
"\n",
- "def get_chat_completion(messages, model=\"gpt-4\", temperature=0.7):\n",
- " \"\"\"Get a chat completion from the AI model.\"\"\"\n",
+ "claude_client = anthropic.Anthropic(\n",
+ " api_key=\"dummy-key\",\n",
+ " base_url=\"http://localhost:7711\"\n",
+ ")\n",
+ "\n",
+ "# Set default models for each provider\n",
+ "OPENAI_DEFAULT_MODEL = \"gpt-4o\"\n",
+ "CLAUDE_DEFAULT_MODEL = \"claude-3.5-sonnet\"\n",
+ "\n",
+ "\n",
+ "def _extract_text_from_blocks(blocks):\n",
+ " \"\"\"Extract text content from response blocks returned by the API.\"\"\"\n",
+ " parts = []\n",
+ " for block in blocks:\n",
+ " text_val = getattr(block, \"text\", None)\n",
+ " if isinstance(text_val, str):\n",
+ " parts.append(text_val)\n",
+ " elif isinstance(block, dict):\n",
+ " t = block.get(\"text\")\n",
+ " if isinstance(t, str):\n",
+ " parts.append(t)\n",
+ " return \"\\n\".join(parts)\n",
+ "\n",
+ "\n",
+ "def get_openai_completion(messages, model=None, temperature=0.0):\n",
+ " \"\"\"Get completion from OpenAI models via GitHub Copilot.\"\"\"\n",
+ " if model is None:\n",
+ " model = OPENAI_DEFAULT_MODEL\n",
" try:\n",
- " response = client.chat.completions.create(\n",
+ " response = openai_client.chat.completions.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=temperature\n",
" )\n",
" return response.choices[0].message.content\n",
" except Exception as e:\n",
- " return f\"❌ Error: {e}\\\\n💡 Make sure GitHub Copilot proxy is running on port 7711\"\n",
+ " return f\"❌ Error: {e}\\n💡 Make sure GitHub Copilot proxy is running on port 7711\"\n",
+ "\n",
+ "\n",
+ "def get_claude_completion(messages, model=None, temperature=0.0):\n",
+ " \"\"\"Get completion from Claude models via GitHub Copilot.\"\"\"\n",
+ " if model is None:\n",
+ " model = CLAUDE_DEFAULT_MODEL\n",
+ " try:\n",
+ " response = claude_client.messages.create(\n",
+ " model=model,\n",
+ " max_tokens=8192,\n",
+ " messages=messages,\n",
+ " temperature=temperature\n",
+ " )\n",
+ " return _extract_text_from_blocks(getattr(response, \"content\", []))\n",
+ " except Exception as e:\n",
+ " return f\"❌ Error: {e}\\n💡 Make sure GitHub Copilot proxy is running on port 7711\"\n",
+ "\n",
+ "\n",
+ "def get_chat_completion(messages, model=None, temperature=0.7):\n",
+ " \"\"\"\n",
+ " Generic function to get chat completion from any provider.\n",
+ " Routes to the appropriate provider-specific function based on PROVIDER setting.\n",
+ " \"\"\"\n",
+ " if PROVIDER.lower() == \"claude\":\n",
+ " return get_claude_completion(messages, model, temperature)\n",
+ " else: # Default to OpenAI\n",
+ " return get_openai_completion(messages, model, temperature)\n",
+ "\n",
+ "\n",
+ "def get_default_model():\n",
+ " \"\"\"Get the default model for the current provider.\"\"\"\n",
+ " if PROVIDER.lower() == \"claude\":\n",
+ " return CLAUDE_DEFAULT_MODEL\n",
+ " else:\n",
+ " return OPENAI_DEFAULT_MODEL\n",
+ "\n",
+ "\n",
+ "# ============================================\n",
+ "# 🧪 TEST CONNECTION\n",
+ "# ============================================\n",
+ "print(\"🔄 Testing connection to GitHub Copilot proxy...\")\n",
+ "test_result = get_chat_completion([\n",
+ " {\"role\": \"user\", \"content\": \"test\"}\n",
+ "])\n",
+ "\n",
+ "if test_result and \"Error\" in test_result:\n",
+ " print(\"\\n\" + \"=\"*60)\n",
+ " print(\"❌ CONNECTION FAILED!\")\n",
+ " print(\"=\"*60)\n",
+ " print(f\"Provider: {PROVIDER.upper()}\")\n",
+ " print(f\"Expected endpoint: http://localhost:7711\")\n",
+ " print(\"\\n⚠️ The GitHub Copilot proxy is NOT running!\")\n",
+ " print(\"\\n📋 To fix this:\")\n",
+ " print(\" 1. Open a new terminal\")\n",
+ " print(\" 2. Navigate to your copilot-api directory\")\n",
+ " print(\" 3. Run: uv run copilot2api start\")\n",
+ " print(\" 4. Wait for the server to start (you should see 'Server initialized')\")\n",
+ " print(\" 5. Come back and rerun this cell\")\n",
+ " print(\"\\n💡 Need setup help? See: GitHub-Copilot-2-API/README.md\")\n",
+ " print(\"=\"*70)\n",
+ "else:\n",
+ " print(\"\\n\" + \"=\"*60)\n",
+ " print(\"✅ CONNECTION SUCCESSFUL!\")\n",
+ " print(\"=\"*60)\n",
+ " print(f\"🤖 Provider: {PROVIDER.upper()}\")\n",
+ " print(f\"📦 Default Model: {get_default_model()}\")\n",
+ " print(f\"🔗 Endpoint: http://localhost:7711\")\n",
+ " print(f\"\\n💡 To switch providers, change PROVIDER to '{'claude' if PROVIDER.lower() == 'openai' else 'openai'}' and rerun this cell\")\n",
+ " print(\"=\"*70)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Option B: Direct OpenAI API\n",
+ "\n",
+ "**Setup:** Add your API key to `.env` file, then uncomment and run:\n",
+ "\n",
+ "> 💡 **Note:** This option requires a paid OpenAI API account. If you're using GitHub Copilot, stick with Option A above.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# # Option B: Direct OpenAI API setup\n",
+ "# import openai\n",
+ "# import os\n",
+ "# from dotenv import load_dotenv\n",
+ "\n",
+ "# load_dotenv()\n",
+ "\n",
+ "# client = openai.OpenAI(\n",
+ "# api_key=os.getenv(\"OPENAI_API_KEY\") # Set this in your .env file\n",
+ "# )\n",
+ "\n",
+ "# def get_chat_completion(messages, model=\"gpt-4o\", temperature=0.7):\n",
+ "# \"\"\"Get a chat completion from OpenAI.\"\"\"\n",
+ "# try:\n",
+ "# response = client.chat.completions.create(\n",
+ "# model=model,\n",
+ "# messages=messages,\n",
+ "# temperature=temperature\n",
+ "# )\n",
+ "# return response.choices[0].message.content\n",
+ "# except Exception as e:\n",
+ "# return f\"❌ Error: {e}\"\n",
+ "\n",
+ "# print(\"✅ OpenAI API configured successfully!\")\n",
+ "# print(\"🤖 Using OpenAI's official API\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Option C: CircuIT APIs (Azure OpenAI)\n",
+ "\n",
+ "**Setup:** Configure environment variables (`CISCO_CLIENT_ID`, `CISCO_CLIENT_SECRET`, `CISCO_OPENAI_APP_KEY`) in `.env` file.\n",
+ "\n",
+ "Get values from: https://ai-chat.cisco.com/bridgeit-platform/api/home\n",
+ "\n",
+ "Then uncomment and run:\n",
"\n",
- "print(\"✅ GitHub Copilot API configured for Module 2!\")\n",
- "print(\"🔗 Connected to: http://localhost:7711\")\n"
+ "> 💡 **Note:** This option is for Cisco employees with CircuIT API access.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# # Option C: CircuIT APIs (Azure OpenAI) setup\n",
+ "# import openai\n",
+ "# import traceback\n",
+ "# import requests\n",
+ "# import base64\n",
+ "# import os\n",
+ "# from dotenv import load_dotenv\n",
+ "# from openai import AzureOpenAI\n",
+ "\n",
+ "# # Load environment variables\n",
+ "# load_dotenv()\n",
+ "\n",
+ "# # Open AI version to use\n",
+ "# openai.api_type = \"azure\"\n",
+ "# openai.api_version = \"2024-12-01-preview\"\n",
+ "\n",
+ "# # Get API_KEY wrapped in token - using environment variables\n",
+ "# client_id = os.getenv(\"CISCO_CLIENT_ID\")\n",
+ "# client_secret = os.getenv(\"CISCO_CLIENT_SECRET\")\n",
+ "\n",
+ "# url = \"https://id.cisco.com/oauth2/default/v1/token\"\n",
+ "\n",
+ "# payload = \"grant_type=client_credentials\"\n",
+ "# value = base64.b64encode(f\"{client_id}:{client_secret}\".encode(\"utf-8\")).decode(\"utf-8\")\n",
+ "# headers = {\n",
+ "# \"Accept\": \"*/*\",\n",
+ "# \"Content-Type\": \"application/x-www-form-urlencoded\",\n",
+ "# \"Authorization\": f\"Basic {value}\",\n",
+ "# }\n",
+ "\n",
+ "# token_response = requests.request(\"POST\", url, headers=headers, data=payload)\n",
+ "# print(token_response.text)\n",
+ "# token_data = token_response.json()\n",
+ "\n",
+ "# client = AzureOpenAI(\n",
+ "# azure_endpoint=\"https://chat-ai.cisco.com\",\n",
+ "# api_key=token_data.get(\"access_token\"),\n",
+ "# api_version=\"2024-12-01-preview\",\n",
+ "# )\n",
+ "\n",
+ "# app_key = os.getenv(\"CISCO_OPENAI_APP_KEY\")\n",
+ "\n",
+ "# def get_chat_completion(messages, model=\"gpt-4o\", temperature=0.7):\n",
+ "# \"\"\"Get a chat completion from CircuIT APIs.\"\"\"\n",
+ "# try:\n",
+ "# response = client.chat.completions.create(\n",
+ "# model=model,\n",
+ "# messages=messages,\n",
+ "# temperature=temperature,\n",
+ "# user=f'{{\"appkey\": \"{app_key}\"}}',\n",
+ "# )\n",
+ "# return response.choices[0].message.content\n",
+ "# except Exception as e:\n",
+ "# return f\"❌ Error: {e}\"\n",
+ "\n",
+ "# print(\"✅ CircuIT APIs configured successfully!\")\n",
+ "# print(\"🤖 Using Azure OpenAI via CircuIT\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Step 0.3: Verify Setup\n",
+ "### Step 3: Test Connection\n",
"\n",
- "Let's test that everything is working before we begin:\n"
+ "Let's test that everything is working before we begin:\n",
+ "\n",
+ "
\n",
+ "💡 Tip: If you see long AI responses and the output shows \"Output is truncated. View as a scrollable element\" - click that link to see the full response in a scrollable view!\n",
+ "
\n"
]
},
{
@@ -186,9 +400,82 @@
"print(response)\n",
"\n",
"if response and (\"verified\" in response.lower() or \"ready\" in response.lower()):\n",
- " print(\"\\\\n🎉 Perfect! Module 2 environment is ready!\")\n",
+ " print(\"\\n🎉 Perfect! Module 2 environment is ready!\")\n",
"else:\n",
- " print(\"\\\\n⚠️ Setup test complete. Let's continue with the tutorial!\")\n"
+ " print(\"\\n⚠️ Setup test complete. Let's continue with the tutorial!\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "## 🎯 Core Prompt Engineering Techniques\n",
+ "\n",
+ "### Introduction: The Art of Prompt Engineering\n",
+ "\n",
+ "#### 🚀 Ready to Transform Your AI Interactions?\n",
+ "\n",
+ "You've successfully set up your environment and tested the connection. Now comes the exciting part - **learning the tactical secrets** that separate amateur prompt writers from AI power users.\n",
+ "\n",
+ "Think of what you've accomplished so far as **laying the foundation** of a house. Now we're about to build the **architectural masterpiece** that will revolutionize how you work with AI assistants.\n",
+ "\n",
+ "\n",
+    "#### 👨‍🏫 What You're About to Master\n",
+ "\n",
+ "In the next sections, you'll discover **eight core tactics** that professional developers use to get consistently excellent results from AI:\n",
+ "\n",
+    "- 🎭 **Role Prompting**: Transform AI into specialized experts\n",
+    "- ⚖️ **LLM-as-Judge**: Use AI to evaluate and improve outputs\n",
+    "- 🤫 **Inner Monologue**: Hide reasoning, show only final results\n",
+    "\n",
+    "> 💡 **Pro Tip:** This module covers 8 powerful tactics over 90-120 minutes. Take short breaks between tactics to reflect on how you can apply each technique to your day-to-day work. Make notes as you progress: jot down specific use cases from your projects where each tactic could be valuable. This active reflection will help you retain the techniques and integrate them into your workflow faster!\n",
+ "\n",
+ "---"
]
},
{
@@ -197,28 +484,51 @@
"source": [
"---\n",
"\n",
- "# Part 1: Role Prompting and Personas\n",
+ "### 🎬 Tactic 0: Write Clear Instructions\n",
+ "\n",
+ "**Foundation Principle** - Before diving into advanced tactics, master the art of clear, specific instructions."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Core Principle:** When interacting with AI models, think of them as brilliant but very new employees who need explicit instructions. The more precisely you explain what you want—including context, specific requirements, and sequential steps—the better the AI's response will be.\n",
"\n",
- "In this section, you'll learn to assign specific roles and expertise to AI assistants, dramatically improving the quality and relevance of their responses.\n",
+ "**The Golden Rule:** Show your prompt to a colleague with minimal context on the task. If they're confused, the AI will likely be too.\n",
"\n",
- "## Learning Outcomes for Part 1\n",
+ "**Software Engineering Application:** This tactic becomes crucial when asking for code refactoring, where you need to specify coding standards, performance requirements, and constraints to get production-ready results.\n",
"\n",
- "By the end of this section, you will:\n",
- "- [ ] Understand how personas improve AI responses\n",
- "- [ ] Write effective role prompts for software engineering tasks\n",
- "- [ ] See immediate improvements in code review quality\n",
- "- [ ] Know when and how to use different engineering personas\n"
+ "*Reference: [Claude Documentation - Be Clear and Direct](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/be-clear-and-direct)*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Step 1.1: Your First Role Prompt\n",
+ "#### Example: Vague vs. Specific Instructions\n",
+ "\n",
+ "**Why This Works:** Specific instructions eliminate ambiguity and guide the model toward your exact requirements.\n",
+ "\n",
+ "Let's compare a generic approach with a specific one:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Vague request - typical beginner mistake\n",
+ "messages = [\n",
+ " {\"role\": \"user\", \"content\": \"Help me choose a programming language for my project\"}\n",
+ "]\n",
"\n",
- "Let's start with a simple example to see the power of role prompting. We'll compare a generic request with a role-specific one.\n",
+ "response = get_chat_completion(messages)\n",
"\n",
- "**First, let's try a generic approach:**\n"
+ "print(\"VAGUE REQUEST RESULT:\")\n",
+ "print(response)\n",
+ "print(\"\\n\" + \"=\"*50 + \"\\n\")"
]
},
{
@@ -227,25 +537,26 @@
"metadata": {},
"outputs": [],
"source": [
- "# Generic approach - no specific role\n",
- "generic_messages = [\n",
+ "# Specific request - much better results\n",
+ "messages = [\n",
" {\n",
" \"role\": \"user\",\n",
- " \"content\": \"Look at this function and tell me what you think: def calc(x, y): return x + y if x > 0 and y > 0 else 0\"\n",
+ " \"content\": \"I need to choose a programming language for building a real-time chat application that will handle 10,000 concurrent users, needs to integrate with a PostgreSQL database, and must be deployable on AWS. The team has 3 years of experience with web development. Provide the top 3 language recommendations with pros and cons for each.\",\n",
" }\n",
"]\n",
"\n",
- "generic_response = get_chat_completion(generic_messages)\n",
- "print(\"🔍 GENERIC RESPONSE:\")\n",
- "print(generic_response)\n",
- "print(\"\\\\n\" + \"=\"*50 + \"\\\\n\")\n"
+ "response = get_chat_completion(messages)\n",
+ "\n",
+ "print(\"SPECIFIC REQUEST RESULT:\")\n",
+ "print(response)\n",
+ "print(\"\\n\" + \"=\"*50 + \"\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "**Now, let's try the same request with a specific role:**\n"
+    "Another way to achieve specificity is to use the `system prompt`. This is particularly useful when you want to keep the user request clean while providing detailed instructions about response format and constraints."
]
},
{
@@ -254,44 +565,88 @@
"metadata": {},
"outputs": [],
"source": [
- "# Role-specific approach - code reviewer persona\n",
- "role_messages = [\n",
+ "messages = [\n",
" {\n",
" \"role\": \"system\",\n",
- " \"content\": \"\"\"You are a senior code reviewer.\n",
- "\n",
- " Analyze the provided code and give exactly 3 specific feedback points: \n",
- " 1. about code structure\n",
- " 2. about naming conventions\n",
- " 3. about potential improvements\n",
- " \n",
- " Format each point as a bullet with the category in brackets.\"\"\"\n",
+ " \"content\": \"You are a senior technical architect. Provide concise, actionable recommendations in bullet format. Focus only on the most critical factors for the decision. No lengthy explanations.\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
- " \"content\": \"def calc(x, y): return x + y if x > 0 and y > 0 else 0\"\n",
- " }\n",
+ " \"content\": \"Help me choose between microservices and monolithic architecture for a startup with 5 developers building a fintech application\",\n",
+ " },\n",
"]\n",
"\n",
- "role_response = get_chat_completion(role_messages)\n",
- "print(\"🎯 ROLE-SPECIFIC RESPONSE:\")\n",
- "print(role_response)\n"
+ "response = get_chat_completion(messages)\n",
+ "\n",
+ "print(\"SYSTEM PROMPT RESULT:\")\n",
+ "print(response)\n",
+ "print(\"\\n\" + \"=\"*50 + \"\\n\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "### 🎭 Tactic 1: Role Prompting\n",
+ "\n",
+ "**Transform AI into specialized domain experts**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "🎉 **Amazing difference!** Notice how the role-specific response is more structured, actionable, and focused.\n",
+    "**Why This Works:** Role prompting using the `system` parameter is the most powerful way to transform any LLM from a general assistant into your virtual domain expert. The right role enhances accuracy in complex scenarios, tailors the communication tone, and improves focus by keeping the LLM within the bounds of your task's specific requirements.\n",
+ "\n",
+ "*Reference: [Claude Documentation - System Prompts](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/system-prompts)*\n",
+ "\n",
+ "**Generic Example:**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Instead of asking for a generic response, adopt a specific persona\n",
+ "messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a code reviewer. Analyze the provided code and give exactly 3 specific feedback points: 1 about code structure, 1 about naming conventions, and 1 about potential improvements. Format each point as a bullet with the category in brackets.\",\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": \"def calc(x, y): return x + y if x > 0 and y > 0 else 0\",\n",
+ " },\n",
+ "]\n",
+ "response = get_chat_completion(messages)\n",
"\n",
- "💡 **What made the difference?**\n",
- "- **Specific expertise role** (\"senior code reviewer\")\n",
- "- **Clear output requirements** (exactly 3 points with specific categories)\n",
- "- **Structured format** (bullets with category labels)\n",
+ "print(\"CODE REVIEWER PERSONA RESULT:\")\n",
+ "print(response)\n",
+ "print(\"\\n\" + \"=\"*50 + \"\\n\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Example: Software Engineering Personas\n",
"\n",
- "## Step 1.2: Software Engineering Personas\n",
+ "In coding scenarios, this tactic transforms into:\n",
"\n",
- "Let's practice with different software engineering roles to see how each provides specialized expertise:\n"
+ "- **Specific refactoring requirements** (e.g., \"Extract this into separate classes following SOLID principles\")\n",
+ "- **Detailed code review criteria** (e.g., \"Focus on security vulnerabilities and performance bottlenecks\")\n",
+ "- **Precise testing specifications** (e.g., \"Generate unit tests with 90% coverage including edge cases\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+    "The cells below show how different engineering personas provide specialized expertise for code reviews."
]
},
{
@@ -320,7 +675,7 @@
"security_response = get_chat_completion(security_messages)\n",
"print(\"🔒 SECURITY ENGINEER ANALYSIS:\")\n",
"print(security_response)\n",
- "print(\"\\\\n\" + \"=\"*50 + \"\\\\n\")\n"
+ "print(\"\\n\" + \"=\"*50 + \"\\n\")\n"
]
},
{
@@ -350,14 +705,15 @@
"\n",
"performance_response = get_chat_completion(performance_messages)\n",
"print(\"⚡ PERFORMANCE ENGINEER ANALYSIS:\")\n",
- "print(performance_response)\n"
+ "print(performance_response)\n",
+ "print(\"\\n\" + \"=\"*50 + \"\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Checkpoint: Compare the Responses\n",
+ "### Checkpoint: Compare the Responses\n",
"\n",
"Notice how each engineering persona focused on their area of expertise:\n",
"\n",
@@ -366,9 +722,9 @@
"\n",
"✅ **Success!** You've seen how role prompting provides specialized, expert-level analysis.\n",
"\n",
- "## Step 1.3: Practice - Create Your Own Persona\n",
+ "#### Practice - Create Your Own Persona\n",
"\n",
- "Now it's your turn! Create a \"QA Engineer\" persona to analyze test coverage:\n"
+    "Now it's your turn! Create a \"QA Engineer\" persona to analyze test coverage by editing the `system prompt` below:\n"
]
},
{
@@ -377,13 +733,12 @@
"metadata": {},
"outputs": [],
"source": [
- "# Your turn: Create a QA Engineer persona\n",
- "# Fill in the system message to create a QA Engineer role\n",
- "\n",
+ "# TODO: Fill in the system message to create a QA Engineer role\n",
+ "# Hint: Focus on test cases, edge cases, and error scenarios\n",
"qa_messages = [\n",
" {\n",
" \"role\": \"system\",\n",
- " \"content\": \"You are a QA engineer. Analyze the provided function and identify test cases needed, including edge cases and error scenarios. Provide specific test recommendations.\"\n",
+ " \"content\": \"\"\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
@@ -400,37 +755,36 @@
"\n",
"qa_response = get_chat_completion(qa_messages)\n",
"print(\"🧪 QA ENGINEER ANALYSIS:\")\n",
- "print(qa_response)\n"
+ "print(qa_response)\n",
+ "print(\"\\n\" + \"=\"*50 + \"\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "🎉 **Excellent!** You've created your own engineering persona and seen how it provides specialized test analysis.\n",
- "\n",
"---\n",
"\n",
- "# Part 2: Structured Inputs and Delimiters\n",
+ "### 📋 Tactic 2: Structured Inputs\n",
+ "\n",
+ "**Organize complex scenarios with XML delimiters**\n",
"\n",
- "Now you'll learn to organize complex inputs using delimiters, making your prompts crystal clear even with multiple files, requirements, and data types.\n",
+ "**Core Principle:** When your prompts involve multiple components like context, instructions, and examples, delimiters (especially XML tags) can be a game-changer. They help AI models parse your prompts more accurately, leading to higher-quality outputs.\n",
"\n",
- "## Learning Outcomes for Part 2\n",
+ "**Why This Works:**\n",
+    "- **Clarity:** Clearly separate the different parts of your prompt and keep it well structured\n",
+ "- **Accuracy:** Reduce errors caused by AI models misinterpreting parts of your prompt \n",
+ "- **Flexibility:** Easily find, add, remove, or modify parts of your prompt without rewriting everything\n",
+ "- **Parseability:** Having the AI use delimiters in its output makes it easier to extract specific parts of its response\n",
"\n",
- "By the end of this section, you will:\n",
- "- [ ] Use delimiters to organize complex, multi-part inputs\n",
- "- [ ] Handle multi-file code scenarios effectively\n",
- "- [ ] Separate different types of content (code, requirements, documentation)\n",
- "- [ ] Build prompts that scale to real-world complexity\n"
+ "**Software Engineering Application Preview:** Essential for multi-file refactoring, separating code from requirements, and organizing complex code review scenarios."
]
},
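The **Parseability** benefit above is easy to make concrete: if you tell the model to wrap its output sections in known tags, a few lines of code can pull out exactly the part you need. Here is a minimal sketch; the `extract_tag` helper and the sample response text are illustrative assumptions, not part of the notebook's helper code.

```python
import re

def extract_tag(response: str, name: str) -> str:
    """Return the text between <name>...</name> in a response, or '' if absent."""
    match = re.search(rf"<{name}>(.*?)</{name}>", response, re.DOTALL)
    return match.group(1).strip() if match else ""

# Hypothetical model output produced under a "use these tags" instruction
response = """
<analysis>The function lacks input validation.</analysis>
<recommendation>Add type checks before the arithmetic.</recommendation>
"""

print(extract_tag(response, "recommendation"))
# prints: Add type checks before the arithmetic.
```

The same pattern works for any tag scheme you ask the model to follow in its output.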
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Step 2.1: Basic Delimiters\n",
- "\n",
- "Let's start with a simple example showing how delimiters clarify different sections of your prompt:\n"
+    "Let's start with a simple example that uses `###` delimiters to clarify the different sections of your prompt:"
]
},
{
@@ -466,16 +820,45 @@
"\n",
"delimiter_response = get_chat_completion(delimiter_messages)\n",
"print(\"🔧 REFACTORED CODE:\")\n",
- "print(delimiter_response)\n"
+ "print(delimiter_response)\n",
+ "print(\"\\n\" + \"=\"*70 + \"\\n\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Multi-File Scenarios with XML Delimiters\n",
+ "\n",
+ "One of the most powerful techniques for complex software development tasks is using XML tags and delimiters to structure your prompts. This approach dramatically improves AI accuracy and reduces misinterpretation.\n",
+ "\n",
+ "**Key Benefits:**\n",
+ "- **Clarity**: Clearly separate different parts of your prompt (instructions, context, examples)\n",
+ "- **Accuracy**: Reduce errors caused by AI misinterpreting parts of your prompt\n",
+ "- **Flexibility**: Easily modify specific sections without rewriting everything\n",
+ "- **Parseability**: Structure AI outputs for easier post-processing\n",
+ "\n",
+ "**Best Practices:**\n",
+    "- Use tags like `<instructions>`, `<context>`, and `<example>` to clearly separate different parts\n",
+ "- Be consistent with tag names throughout your prompts\n",
+    "- Nest tags hierarchically: `<outer><inner></inner></outer>` for structured content\n",
+ "- Choose meaningful tag names that describe their content\n",
+ "\n",
+ "**Reference**: Learn more about XML tagging best practices in the [Claude Documentation on XML Tags](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags)."
]
},
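To make the best practices above concrete, here is a small sketch of how a structured multi-file prompt might be assembled programmatically. The `tag` helper, the file names, and the tag names are illustrative assumptions, not part of the notebook's code.

```python
def tag(name: str, body: str) -> str:
    """Wrap body text in an XML-style delimiter pair."""
    return f"<{name}>\n{body}\n</{name}>"

# Invented file contents for illustration
files = {
    "auth.py": "def login(user, pw): ...",
    "db.py": "def connect(): ...",
}

# One consistently named tag per file, plus a final <task> section
sections = [
    tag(f"file_{i}", f"# {name}\n{code}")
    for i, (name, code) in enumerate(files.items(), 1)
]
sections.append(tag("task", "Review these files for coupling issues."))
prompt = "\n\n".join(sections)

print(prompt)
```

A string built this way can be dropped straight into the `content` field of a user message, and individual sections can be swapped out without rewriting the rest of the prompt.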
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Step 2.2: Multi-File Scenarios with XML Delimiters\n",
+ "In coding scenarios, delimiters become essential for:\n",
+ "\n",
+    "- **Multi-file refactoring** - Separate different files being modified: `<file1>`, `<file2>`\n",
+    "- **Code vs. requirements** - Distinguish between `<code>` and `<requirements>`\n",
+    "- **Test scenarios** - Organize `<test_input>`, `<expected_output>`, `<actual_output>`\n",
+    "- **Pull request reviews** - Structure `<original_code>`, `<proposed_changes>`, `<review_comments>`\n",
"\n",
- "For complex projects with multiple files, XML-style delimiters work even better:\n"
+    "The cell below demonstrates multi-file refactoring using XML delimiters to organize complex codebases."
]
},
{
@@ -531,7 +914,8 @@
"\n",
"multifile_response = get_chat_completion(multifile_messages)\n",
"print(\"🏗️ ARCHITECTURAL ANALYSIS:\")\n",
- "print(multifile_response)\n"
+ "print(multifile_response)\n",
+ "print(\"\\n\" + \"=\"*70 + \"\\n\")"
]
},
{
@@ -540,26 +924,33 @@
"source": [
"---\n",
"\n",
- "# Part 3: Examples and Chain-of-Thought\n",
+ "### 📚 Tactic 3: Few-Shot Examples\n",
"\n",
- "In this final section, you'll master two powerful techniques: few-shot examples to establish consistent styles, and chain-of-thought reasoning for complex problem solving.\n",
+ "**Teach AI your preferred styles and standards**\n",
"\n",
- "## Learning Outcomes for Part 3\n",
+ "**Core Principle:** Examples are your secret weapon for getting AI models to generate exactly what you need. By providing a few well-crafted examples in your prompt, you can dramatically improve the accuracy, consistency, and quality of outputs. This technique, known as few-shot or multishot prompting, is particularly effective for tasks that require structured outputs or adherence to specific formats.\n",
"\n",
- "By the end of this section, you will:\n",
- "- [ ] Use few-shot examples to teach AI your preferred response style\n",
- "- [ ] Implement step-by-step reasoning for complex tasks\n",
- "- [ ] Build systematic approaches to code analysis\n",
- "- [ ] Create production-ready prompts that scale\n"
+ "**Why This Works:**\n",
+ "- **Accuracy:** Examples reduce misinterpretation of instructions\n",
+ "- **Consistency:** Examples enforce uniform structure and style across outputs\n",
+ "- **Performance:** Well-chosen examples boost AI's ability to handle complex tasks\n",
+ "\n",
+ "**Crafting Effective Examples:**\n",
+ "- **Relevant:** Your examples should mirror your actual use case\n",
+ "- **Diverse:** Cover edge cases and vary enough to avoid unintended patterns\n",
+    "- **Clear:** Wrap examples in `<example>` tags (if multiple, nest within `<examples>` tags)\n",
+ "- **Quantity:** Include 3-5 diverse examples for best results (more examples = better performance)\n",
+ "\n",
+ "**Software Engineering Application Preview:** Essential for establishing coding styles, documentation formats, test case patterns, and consistent API response structures across your development workflow.\n",
+ "\n",
+ "*Reference: [Claude Documentation - Multishot Prompting](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/multishot-prompting)*"
]
},
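As a quick sketch of the crafting guidance above, the snippet below assembles a few-shot message list with `<example>` tags nested inside `<examples>`. The Q/A pairs are invented for illustration; the resulting `messages` list has the same shape accepted by the `get_chat_completion` helper defined earlier in this notebook.

```python
# Invented few-shot pairs that teach the model a fixed answer format
examples = [
    ("What is a mutex?", "Definition: A lock that serializes access.\nUse case: Shared counters."),
    ("What is a semaphore?", "Definition: A counter-based signal.\nUse case: Bounded resource pools."),
]

# Wrap each pair in <example> tags, then nest all of them in <examples>
example_block = "\n".join(f"<example>\nQ: {q}\nA: {a}\n</example>" for q, a in examples)

messages = [
    {
        "role": "system",
        "content": f"Answer in the style of these examples:\n<examples>\n{example_block}\n</examples>",
    },
    {"role": "user", "content": "What is a deadlock?"},
]

print(messages[0]["content"])
```

From here, `get_chat_completion(messages)` would return an answer following the Definition/Use case format the examples establish.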
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Step 3.1: Few-Shot Examples for Consistent Style\n",
- "\n",
- "Let's teach the AI to explain technical concepts in a specific, consistent style:\n"
+ "Let's teach the AI to explain technical concepts in a specific, consistent style:"
]
},
{
@@ -586,77 +977,66 @@
"\n",
"few_shot_response = get_chat_completion(few_shot_messages)\n",
"print(\"📚 CONSISTENT STYLE RESPONSE:\")\n",
- "print(few_shot_response)\n"
+ "print(few_shot_response)\n",
+ "print(\"\\n\" + \"=\"*70 + \"\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "🎯 **Perfect!** Notice how the AI learned the exact format and style from the examples and applied it consistently.\n",
- "\n",
- "## Step 3.2: Chain-of-Thought Reasoning\n",
- "\n",
- "Now let's implement step-by-step reasoning for complex code analysis tasks:\n"
+ "🎯 **Perfect!** Notice how the AI learned the exact format and style from the examples and applied it consistently.\n"
]
},
{
- "cell_type": "code",
- "execution_count": null,
+ "cell_type": "markdown",
"metadata": {},
- "outputs": [],
"source": [
- "# Chain-of-thought for systematic code analysis\n",
- "system_message = \"\"\"Use the following step-by-step instructions to analyze code:\n",
+ "---\n",
"\n",
- "Step 1 - Count the number of functions in the code snippet with a prefix that says 'Function Count: '\n",
- "Step 2 - List each function name with its line number with a prefix that says 'Function List: '\n",
- "Step 3 - Identify any functions that are longer than 10 lines with a prefix that says 'Long Functions: '\n",
- "Step 4 - Provide an overall assessment with a prefix that says 'Assessment: '\"\"\"\n",
+    "### ⛓️‍💥 Tactic 4: Chain-of-Thought Reasoning\n",
"\n",
- "user_message = \"\"\"\n",
- "def calculate_tax(income, deductions):\n",
- " taxable_income = income - deductions\n",
- " if taxable_income <= 0:\n",
- " return 0\n",
- " elif taxable_income <= 50000:\n",
- " return taxable_income * 0.1\n",
- " else:\n",
- " return 50000 * 0.1 + (taxable_income - 50000) * 0.2\n",
+ "**Guide systematic step-by-step reasoning**\n",
"\n",
- "def format_currency(amount):\n",
- " return f\"${amount:,.2f}\"\n",
+ "**Core Principle:** When faced with complex tasks like research, analysis, or problem-solving, giving AI models space to think can dramatically improve performance. This technique, known as chain of thought (CoT) prompting, encourages the AI to break down problems step-by-step, leading to more accurate and nuanced outputs.\n",
"\n",
- "def generate_report(name, income, deductions):\n",
- " tax = calculate_tax(income, deductions)\n",
- " net_income = income - tax\n",
- " \n",
- " print(f\"Tax Report for {name}\")\n",
- " print(f\"Gross Income: {format_currency(income)}\")\n",
- " print(f\"Deductions: {format_currency(deductions)}\")\n",
- " print(f\"Tax Owed: {format_currency(tax)}\")\n",
- " print(f\"Net Income: {format_currency(net_income)}\")\n",
- "\"\"\"\n",
+ "**Why This Works:**\n",
+ "- **Accuracy:** Stepping through problems reduces errors, especially in math, logic, analysis, or generally complex tasks\n",
+ "- **Coherence:** Structured thinking leads to more cohesive, well-organized responses\n",
+ "- **Debugging:** Seeing the AI's thought process helps you pinpoint where prompts may be unclear\n",
"\n",
- "chain_messages = [\n",
- " {\"role\": \"system\", \"content\": system_message},\n",
- " {\"role\": \"user\", \"content\": user_message}\n",
- "]\n",
+ "**When to Use CoT:**\n",
+ "- Use for tasks that a human would need to think through\n",
+ "- Examples: complex math, multi-step analysis, writing complex documents, decisions with many factors\n",
+ "- **Note:** Increased output length may impact latency, so use judiciously\n",
"\n",
- "chain_response = get_chat_completion(chain_messages)\n",
- "print(\"🔗 CHAIN-OF-THOUGHT ANALYSIS:\")\n",
- "print(chain_response)\n"
+ "**How to Implement CoT (from least to most complex):**\n",
+ "\n",
+ "1. **Basic prompt:** Include \"Think step-by-step\" in your prompt\n",
+ "2. **Guided prompt:** Outline specific steps for the AI to follow in its thinking process\n",
+    "3. **Structured prompt:** Use XML tags like `<thinking>` and `<answer>` to separate reasoning from the final answer\n",
+ "\n",
+ "**Important:** Always have the AI output its thinking. Without outputting its thought process, no thinking occurs!\n",
+ "\n",
+ "**Software Engineering Application Preview:** Critical for test generation, code reviews, debugging workflows, architecture decisions, and security analysis where methodical analysis prevents missed issues.\n",
+ "\n",
+ "*Reference: [Claude Documentation - Chain of Thought](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/chain-of-thought)*\n",
+ "\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "🚀 **Excellent!** The AI followed each step methodically, providing structured, comprehensive analysis.\n",
+ "#### Tactic: Give Models Time to Work Before Judging\n",
"\n",
- "## Step 3.3: Practice - Combine All Techniques\n",
+ "**Critical Tactic:** When asking AI to evaluate solutions, code, or designs, instruct it to solve the problem independently *before* judging the provided solution. This prevents premature agreement and ensures thorough analysis.\n",
"\n",
- "Now let's put everything together in a real-world scenario that combines role prompting, delimiters, and chain-of-thought:\n"
+ "**Why This Matters:** AI models can sometimes be too agreeable or overlook subtle issues when they jump straight to evaluation. By forcing them to work through the problem first, they develop genuine understanding and can provide more accurate assessments.\n",
+ "\n",
+ "**The Principle:** *\"Don't decide if the solution is correct until you have worked through the problem yourself.\"*\n",
+ "\n",
+ "Let's see this with a code review scenario:\n"
]
},
{
@@ -665,93 +1045,1381 @@
"metadata": {},
"outputs": [],
"source": [
- "# Comprehensive example combining all techniques\n",
- "comprehensive_messages = [\n",
- " {\n",
- " \"role\": \"system\",\n",
- " \"content\": \"\"\"You are a senior software engineer conducting a comprehensive code review.\n",
+ "# Example: Forcing AI to think before judging\n",
+ "problem = \"\"\"\n",
+ "Write a function that checks if a string is a palindrome.\n",
+ "The function should ignore spaces, punctuation, and case.\n",
+ "\"\"\"\n",
"\n",
- "Follow this systematic process:\n",
- "Step 1 - Security Analysis: Identify potential security vulnerabilities\n",
- "Step 2 - Performance Review: Analyze efficiency and optimization opportunities \n",
- "Step 3 - Code Quality: Evaluate readability, maintainability, and best practices\n",
- "Step 4 - Recommendations: Provide specific, prioritized improvement suggestions\n",
+ "student_solution = \"\"\"\n",
+ "def is_palindrome(s):\n",
+ " cleaned = ''.join(c.lower() for c in s if c.isalnum())\n",
+ " return cleaned == cleaned[::-1]\n",
+ "\"\"\"\n",
"\n",
- "Format each step clearly with the step name as a header.\"\"\"\n",
- " },\n",
+ "# BAD: Asking AI to judge immediately (may agree too quickly)\n",
+ "print(\"=\" * 70)\n",
+ "print(\"BAD APPROACH: Immediate Judgment\")\n",
+ "print(\"=\" * 70)\n",
+ "\n",
+ "bad_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a code reviewer.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Problem: {problem}\n",
+ "\n",
+ "Student's solution:\n",
+ "{student_solution}\n",
+ "\n",
+ "Is this solution correct?\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "bad_response = get_chat_completion(bad_messages)\n",
+ "print(bad_response)\n",
+ "\n",
+ "# GOOD: Force AI to solve it first, then compare\n",
+ "print(\"=\" * 70)\n",
+ "print(\"GOOD APPROACH: Work Through It First\")\n",
+ "print(\"=\" * 70)\n",
+ "\n",
+ "good_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a code reviewer with a methodical approach.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Problem: {problem}\n",
+ "\n",
+ "Student's solution:\n",
+ "{student_solution}\n",
+ "\n",
+ "Before evaluating the student's solution, follow these steps:\n",
+    "1. In <my_solution> tags, write your own implementation of the palindrome checker\n",
+    "2. In <test_cases> tags, create comprehensive test cases including edge cases\n",
+    "3. In <comparison> tags, compare the student's solution to yours and test both\n",
+    "4. In <judgment> tags, provide your final judgment with specific reasoning\n",
+ "\n",
+ "Important: Don't judge the student's solution until you've solved the problem yourself.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "good_response = get_chat_completion(good_messages)\n",
+ "print(good_response)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**📌 Key Takeaway: Give Models Time to Think**\n",
+ "\n",
+ "Notice the difference:\n",
+ "- **Bad approach:** The AI might agree with the student too quickly without thorough analysis\n",
+ "- **Good approach:** By forcing the AI to solve the problem first, it:\n",
+ " - Develops its own understanding of the requirements\n",
+ " - Creates comprehensive test cases independently\n",
+ " - Can objectively compare two solutions\n",
+ " - Catches subtle bugs or edge cases it might have missed\n",
+ "\n",
+ "**Real-World Applications:**\n",
+ "- **Code Review:** Make AI implement a solution before reviewing pull requests\n",
+ "- **Bug Analysis:** Have AI reproduce the bug before suggesting fixes\n",
+ "- **Architecture Review:** Force AI to design its own solution before critiquing proposals\n",
+ "- **Test Review:** Make AI write tests before evaluating test coverage\n",
+ "\n",
+ "**The Golden Rule:** *\"Don't let the AI judge until it has worked through the problem itself.\"*\n",
+ "\n",
+ "---\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+    "#### Systematic Code Analysis using Chain of Thought\n",
+ "\n",
+ "Now let's implement step-by-step reasoning for complex code analysis tasks:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Chain-of-thought for systematic code analysis\n",
+ "system_message = \"\"\"Use the following step-by-step instructions to analyze code:\n",
+ "\n",
+ "Step 1 - Count the number of functions in the code snippet with a prefix that says 'Function Count: '\n",
+ "Step 2 - List each function name with its line number with a prefix that says 'Function List: '\n",
+ "Step 3 - Identify any functions that are longer than 10 lines with a prefix that says 'Long Functions: '\n",
+ "Step 4 - Provide an overall assessment with a prefix that says 'Assessment: '\"\"\"\n",
+ "\n",
+ "user_message = \"\"\"\n",
+ "def calculate_tax(income, deductions):\n",
+ " taxable_income = income - deductions\n",
+ " if taxable_income <= 0:\n",
+ " return 0\n",
+ " elif taxable_income <= 50000:\n",
+ " return taxable_income * 0.1\n",
+ " else:\n",
+ " return 50000 * 0.1 + (taxable_income - 50000) * 0.2\n",
+ "\n",
+ "def format_currency(amount):\n",
+ " return f\"${amount:,.2f}\"\n",
+ "\n",
+ "def generate_report(name, income, deductions):\n",
+ " tax = calculate_tax(income, deductions)\n",
+ " net_income = income - tax\n",
+ " \n",
+ " print(f\"Tax Report for {name}\")\n",
+ " print(f\"Gross Income: {format_currency(income)}\")\n",
+ " print(f\"Deductions: {format_currency(deductions)}\")\n",
+ " print(f\"Tax Owed: {format_currency(tax)}\")\n",
+ " print(f\"Net Income: {format_currency(net_income)}\")\n",
+ "\"\"\"\n",
+ "\n",
+ "chain_messages = [\n",
+ " {\"role\": \"system\", \"content\": system_message},\n",
+ " {\"role\": \"user\", \"content\": user_message}\n",
+ "]\n",
+ "\n",
+ "chain_response = get_chat_completion(chain_messages)\n",
+ "print(\"🔗 CHAIN-OF-THOUGHT ANALYSIS:\")\n",
+ "print(chain_response)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "🚀 **Excellent!** The AI followed each step methodically, providing structured, comprehensive analysis.\n",
+ "\n",
+ "#### Practice Exercise: Combine All Techniques\n",
+ "\n",
+ "Now let's put everything together in a real-world scenario that combines role prompting, delimiters, and chain-of-thought:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Comprehensive example combining all techniques\n",
+ "comprehensive_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"\"\"You are a senior software engineer conducting a comprehensive code review.\n",
+ "\n",
+ "Follow this systematic process:\n",
+ "Step 1 - Security Analysis: Identify potential security vulnerabilities\n",
+ "Step 2 - Performance Review: Analyze efficiency and optimization opportunities \n",
+ "Step 3 - Code Quality: Evaluate readability, maintainability, and best practices\n",
+ "Step 4 - Recommendations: Provide specific, prioritized improvement suggestions\n",
+ "\n",
+ "Format each step clearly with the step name as a header.\"\"\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": \"\"\"\n",
+ "<code_to_review>\n",
+ "from flask import Flask, request, jsonify\n",
+ "import sqlite3\n",
+ "\n",
+ "app = Flask(__name__)\n",
+ "\n",
+ "@app.route('/user/<int:user_id>')\n",
+ "def get_user(user_id):\n",
+ "    conn = sqlite3.connect('users.db')\n",
+ "    cursor = conn.cursor()\n",
+ "    cursor.execute(f\"SELECT * FROM users WHERE id = {user_id}\")\n",
+ "    user = cursor.fetchone()\n",
+ "    conn.close()\n",
+ "\n",
+ "    if user:\n",
+ "        return jsonify({\n",
+ "            \"id\": user[0],\n",
+ "            \"name\": user[1],\n",
+ "            \"email\": user[2]\n",
+ "        })\n",
+ "    else:\n",
+ "        return jsonify({\"error\": \"User not found\"}), 404\n",
+ "</code_to_review>\n",
+ "\n",
+ "<context>\n",
+ "This is a user lookup endpoint for a web application that serves user profiles.\n",
+ "The application handles 1000+ requests per minute during peak hours.\n",
+ "</context>\n",
+ "\n",
+ "Perform a comprehensive code review following the systematic process.\n",
+ "\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "comprehensive_response = get_chat_completion(comprehensive_messages)\n",
+ "print(\"🔍 COMPREHENSIVE CODE REVIEW:\")\n",
+ "print(comprehensive_response)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "### 📖 Tactic 5: Reference Citations\n",
+ "\n",
+ "**Ground responses in actual documentation to reduce hallucinations**\n",
+ "\n",
+ "**Core Principle:** When working with long documents or multiple reference materials, asking AI models to quote relevant parts of the documents first before carrying out tasks helps them cut through the \"noise\" and focus on pertinent information. This technique is especially powerful when working with extended context windows.\n",
+ "\n",
+ "**Why This Works:**\n",
+ "- The AI identifies and focuses on relevant information before generating responses\n",
+ "- Citations make outputs verifiable and trustworthy\n",
+ "- Reduces hallucination by grounding responses in actual source material\n",
+ "- Makes it easy to trace conclusions back to specific code or documentation sections\n",
+ "\n",
+ "**Best Practices for Long Context:**\n",
+ "- **Put longform data at the top:** Place long documents (~20K+ tokens) near the top of your prompt, above queries and instructions (can improve response quality by up to 30%)\n",
+ "- **Structure with XML tags:** Use `<document>`, `<document_contents>`, and `<source>` tags to organize multiple documents\n",
+ "- **Request quotes first:** Ask the AI to extract relevant quotes in `<quotes>` tags before generating the final response\n",
+ "\n",
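+ "The pattern can be sketched as a small helper that assembles the prompt in this order (a minimal sketch; the tag names follow the convention from the Claude long-context guidance):\n",
+ "\n",
+ "```python\n",
+ "def quote_first_prompt(filename, contents, query):\n",
+ "    # Documents at the top, query at the bottom, quotes requested first.\n",
+ "    return (\n",
+ "        \"<documents>\\n<document>\\n\"\n",
+ "        f\"<source>{filename}</source>\\n\"\n",
+ "        f\"<document_contents>\\n{contents}\\n</document_contents>\\n\"\n",
+ "        \"</document>\\n</documents>\\n\\n\"\n",
+ "        f\"{query}\\n\\n\"\n",
+ "        \"First, extract the relevant quotes in <quotes> tags, \"\n",
+ "        \"then answer the question citing those quotes.\"\n",
+ "    )\n",
+ "```\n",
+ "\n",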
+ "**Software Engineering Application Preview:** Critical for code review with large codebases, documentation generation from source files, security audit reports, and analyzing API documentation.\n",
+ "\n",
+ "*Reference: [Claude Documentation - Long Context Tips](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/long-context-tips)*\n",
+ "\n",
+ "---\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Example 1: Code Review with Multiple Files\n",
+ "\n",
+ "Let's demonstrate how to structure multiple code files and ask the AI to extract relevant quotes before providing analysis:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Example: Multi-file code review with quote extraction\n",
+ "auth_service = \"\"\"\n",
+ "class AuthService:\n",
+ "    def __init__(self, db_connection):\n",
+ "        self.db = db_connection\n",
+ "\n",
+ "    def authenticate_user(self, username, password):\n",
+ "        # TODO: Add password hashing\n",
+ "        query = f\"SELECT * FROM users WHERE username='{username}' AND password='{password}'\"\n",
+ "        result = self.db.execute(query)\n",
+ "        return result.fetchone() is not None\n",
+ "\n",
+ "    def create_session(self, user_id):\n",
+ "        session_id = str(uuid.uuid4())\n",
+ "        # Session expires in 24 hours\n",
+ "        expiry = datetime.now() + timedelta(hours=24)\n",
+ "        self.db.execute(f\"INSERT INTO sessions VALUES ('{session_id}', {user_id}, '{expiry}')\")\n",
+ "        return session_id\n",
+ "\"\"\"\n",
+ "\n",
+ "user_controller = \"\"\"\n",
+ "from flask import Flask, request, jsonify\n",
+ "from auth_service import AuthService\n",
+ "\n",
+ "app = Flask(__name__)\n",
+ "auth = AuthService(db_connection)\n",
+ "\n",
+ "@app.route('/login', methods=['POST'])\n",
+ "def login():\n",
+ "    username = request.json.get('username')\n",
+ "    password = request.json.get('password')\n",
+ "\n",
+ "    if auth.authenticate_user(username, password):\n",
+ "        user_id = get_user_id(username)\n",
+ "        session_id = auth.create_session(user_id)\n",
+ "        return jsonify({'session_id': session_id, 'status': 'success'})\n",
+ "    else:\n",
+ "        return jsonify({'status': 'failed'}), 401\n",
+ "\"\"\"\n",
+ "\n",
+ "# Structure the prompt with documents at the top, query at the bottom\n",
+ "messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a senior security engineer reviewing code for vulnerabilities.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"\n",
+ "<documents>\n",
+ "<document>\n",
+ "<source>auth_service.py</source>\n",
+ "<document_contents>\n",
+ "{auth_service}\n",
+ "</document_contents>\n",
+ "</document>\n",
+ "<document>\n",
+ "<source>user_controller.py</source>\n",
+ "<document_contents>\n",
+ "{user_controller}\n",
+ "</document_contents>\n",
+ "</document>\n",
+ "</documents>\n",
+ "\n",
+ "Review the authentication code above for security vulnerabilities. \n",
+ "\n",
+ "First, extract relevant code quotes that demonstrate security issues and place them in <quotes> tags with the source file indicated.\n",
+ "\n",
+ "Then, provide your security analysis in <analysis> tags, explaining each vulnerability and its severity.\n",
+ "\n",
+ "Finally, provide specific remediation recommendations in <recommendations> tags.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "response = get_chat_completion(messages)\n",
+ "print(\"🔒 SECURITY REVIEW WITH CITATIONS:\")\n",
+ "print(response)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Example 2: API Documentation Analysis\n",
+ "\n",
+ "Now let's analyze API documentation to extract specific information with citations:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Example: Analyzing API documentation with quote grounding\n",
+ "api_docs = \"\"\"\n",
+ "# Payment API Documentation\n",
+ "\n",
+ "## Authentication\n",
+ "All API requests require an API key passed in the `X-API-Key` header.\n",
+ "Rate limit: 1000 requests per hour per API key.\n",
+ "\n",
+ "## Create Payment\n",
+ "POST /api/v2/payments\n",
+ "\n",
+ "Creates a new payment transaction.\n",
+ "\n",
+ "**Request Body:**\n",
+ "- amount (required, decimal): Payment amount in USD\n",
+ "- currency (optional, string): Currency code, defaults to \"USD\"\n",
+ "- customer_id (required, string): Customer identifier\n",
+ "- payment_method (required, string): One of: \"card\", \"bank\", \"wallet\"\n",
+ "- metadata (optional, object): Additional key-value pairs\n",
+ "\n",
+ "**Rate Limit:** 100 requests per minute\n",
+ "\n",
+ "**Response:**\n",
+ "{\n",
+ "  \"payment_id\": \"pay_abc123\",\n",
+ "  \"status\": \"pending\",\n",
+ "  \"amount\": 99.99,\n",
+ "  \"created_at\": \"2024-01-15T10:30:00Z\"\n",
+ "}\n",
+ "\n",
+ "## Retrieve Payment\n",
+ "GET /api/v2/payments/{payment_id}\n",
+ "\n",
+ "Retrieves details of a specific payment.\n",
+ "\n",
+ "**Security Note:** Only returns payments belonging to the authenticated API key's account.\n",
+ "\n",
+ "**Response Codes:**\n",
+ "- 200: Success\n",
+ "- 404: Payment not found\n",
+ "- 401: Invalid API key\n",
+ "\"\"\"\n",
+ "\n",
+ "integration_question = \"\"\"\n",
+ "I need to integrate payment processing into my e-commerce checkout flow.\n",
+ "The checkout needs to:\n",
+ "1. Create a payment when user clicks \"Pay Now\"\n",
+ "2. Handle USD and EUR currencies\n",
+ "3. Store order metadata with the payment\n",
+ "4. Check payment status after creation\n",
+ "\n",
+ "What do I need to know from the API documentation?\n",
+ "\"\"\"\n",
+ "\n",
+ "messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a technical integration specialist helping developers implement APIs.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"\n",
+ "<documents>\n",
+ "<document>\n",
+ "<source>payment_api_docs.md</source>\n",
+ "<document_contents>\n",
+ "{api_docs}\n",
+ "</document_contents>\n",
+ "</document>\n",
+ "</documents>\n",
+ "\n",
+ "<question>\n",
+ "{integration_question}\n",
+ "</question>\n",
+ "\n",
+ "First, find and quote the relevant sections from the API documentation that address the integration requirements. Place these quotes in <quotes> tags with the section name indicated.\n",
+ "\n",
+ "Then, provide a step-by-step integration guide in <integration_guide> tags that references the quoted documentation.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "response = get_chat_completion(messages)\n",
+ "print(\"📚 API INTEGRATION GUIDE WITH CITATIONS:\")\n",
+ "print(response)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Key Takeaways: Reference Citations\n",
+ "\n",
+ "**Best Practices Demonstrated:**\n",
+ "1. **Document Structure:** Used `<documents>` and `<document>` tags with `<source>` and `<document_contents>` metadata\n",
+ "2. **Documents First:** Placed all reference materials at the top of the prompt, before the query\n",
+ "3. **Quote Extraction:** Asked AI to extract relevant quotes first, then perform analysis\n",
+ "4. **Structured Output:** Used XML tags like `<quotes>`, `<analysis>`, and `<recommendations>` to organize responses\n",
+ "\n",
+ "---\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "### 🔗 Tactic 6: Prompt Chaining\n",
+ "\n",
+ "**Break complex tasks into sequential workflows**\n",
+ "\n",
+ "**Core Principle:** When working with complex tasks, AI models can sometimes drop the ball if you try to handle everything in a single prompt. Prompt chaining breaks down complex tasks into smaller, manageable subtasks, where each subtask gets the AI's full attention.\n",
+ "\n",
+ "**Why Chain Prompts:**\n",
+ "- **Accuracy:** Each subtask gets full attention, reducing errors\n",
+ "- **Clarity:** Simpler subtasks mean clearer instructions and outputs\n",
+ "- **Traceability:** Easily pinpoint and fix issues in your prompt chain\n",
+ "- **Focus:** Each link in the chain gets the AI's complete concentration\n",
+ "\n",
+ "**When to Chain Prompts:**\n",
+ "Use prompt chaining for multi-step tasks like:\n",
+ "- Research synthesis and document analysis\n",
+ "- Iterative content creation\n",
+ "- Multiple transformations or citations\n",
+ "- Code generation → Review → Refactoring workflows\n",
+ "\n",
+ "**How to Chain Prompts:**\n",
+ "1. **Identify subtasks:** Break your task into distinct, sequential steps\n",
+ "2. **Structure with XML:** Use XML tags to pass outputs between prompts\n",
+ "3. **Single-task goal:** Each subtask should have one clear objective\n",
+ "4. **Iterate:** Refine subtasks based on performance\n",
+ "\n",
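+ "A minimal two-link sketch of this flow, reusing the notebook's `get_chat_completion` helper (here `snippet` is assumed to hold the code under review):\n",
+ "\n",
+ "```python\n",
+ "# Link 1: analyze the code.\n",
+ "analysis = get_chat_completion([\n",
+ "    {\"role\": \"user\", \"content\": f\"List every bug in:\\n<code>{snippet}</code>\"}\n",
+ "])\n",
+ "\n",
+ "# Link 2: fix the code, passing Link 1's output forward inside XML tags.\n",
+ "fixed = get_chat_completion([\n",
+ "    {\"role\": \"user\", \"content\": f\"<code>{snippet}</code>\\n\"\n",
+ "     f\"<analysis>{analysis}</analysis>\\n\"\n",
+ "     \"Rewrite the code, fixing every issue in the analysis.\"}\n",
+ "])\n",
+ "```\n",
+ "\n",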
+ "**Common Software Development Workflows:**\n",
+ "- **Code Review Pipeline:** Extract code → Analyze issues → Propose fixes → Generate tests\n",
+ "- **Documentation Generation:** Analyze code → Extract docstrings → Format → Review\n",
+ "- **Refactoring Workflow:** Identify patterns → Suggest improvements → Generate refactored code → Validate\n",
+ "- **Testing Pipeline:** Analyze function → Generate test cases → Create assertions → Review coverage\n",
+ "- **Debugging Chain:** Reproduce issue → Analyze root cause → Suggest fixes → Verify solution\n",
+ "\n",
+ "**Debugging Tip:** If the AI misses a step or performs poorly, isolate that step in its own prompt. This lets you fine-tune problematic steps without redoing the entire task.\n",
+ "\n",
+ "**Software Engineering Application Preview:** Essential for complex code reviews, multi-stage refactoring, comprehensive test generation, and architectural analysis where breaking down the task ensures nothing is missed.\n",
+ "\n",
+ "*Reference: [Claude Documentation - Chain Complex Prompts](https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/chain-prompts)*\n",
+ "\n",
+ "---\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Example 1: Code Review with Prompt Chaining\n",
+ "\n",
+ "Let's demonstrate a 3-step prompt chain for comprehensive code review:\n",
+ "1. **Step 1:** Analyze code for issues\n",
+ "2. **Step 2:** Review the analysis for completeness\n",
+ "3. **Step 3:** Generate final recommendations with fixes\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Prompt Chain Example: Code Review Pipeline\n",
+ "code_to_review = \"\"\"\n",
+ "def process_user_data(user_input):\n",
+ "    # Process user registration data\n",
+ "    data = eval(user_input)  # Parse input\n",
+ "\n",
+ "    username = data['username']\n",
+ "    email = data['email']\n",
+ "    password = data['password']\n",
+ "\n",
+ "    # Save to database\n",
+ "    query = f\"INSERT INTO users (username, email, password) VALUES ('{username}', '{email}', '{password}')\"\n",
+ "    db.execute(query)\n",
+ "\n",
+ "    # Send welcome email\n",
+ "    send_email(email, f\"Welcome {username}!\")\n",
+ "\n",
+ "    return {\"status\": \"success\", \"user\": username}\n",
+ "\"\"\"\n",
+ "\n",
+ "# STEP 1: Analyze code for issues\n",
+ "print(\"=\" * 60)\n",
+ "print(\"STEP 1: Initial Code Analysis\")\n",
+ "print(\"=\" * 60)\n",
+ "\n",
+ "step1_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a senior code reviewer specializing in security and best practices.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Analyze this Python function for issues:\n",
+ "\n",
+ "<code>\n",
+ "{code_to_review}\n",
+ "</code>\n",
+ "\n",
+ "Identify all security vulnerabilities, code quality issues, and potential bugs.\n",
+ "Provide your analysis in <analysis> tags with specific line references.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "analysis = get_chat_completion(step1_messages)\n",
+ "print(analysis)\n",
+ "print(\"\\n\")\n",
+ "\n",
+ "# STEP 2: Review the analysis for completeness\n",
+ "print(\"=\" * 60)\n",
+ "print(\"STEP 2: Review Analysis for Completeness\")\n",
+ "print(\"=\" * 60)\n",
+ "\n",
+ "step2_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a principal engineer reviewing a code analysis. Check for completeness and accuracy.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Here is a code analysis from a code reviewer:\n",
+ "\n",
+ "<code>\n",
+ "{code_to_review}\n",
+ "</code>\n",
+ "\n",
+ "<analysis>\n",
+ "{analysis}\n",
+ "</analysis>\n",
+ "\n",
+ "Review this analysis and:\n",
+ "1. Verify all issues are correctly identified\n",
+ "2. Check if any critical issues were missed\n",
+ "3. Rate the severity of each issue (Critical/High/Medium/Low)\n",
+ "\n",
+ "Provide feedback in <feedback> tags.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "review = get_chat_completion(step2_messages)\n",
+ "print(review)\n",
+ "print(\"\\n\")\n",
+ "\n",
+ "# STEP 3: Generate final recommendations with code fixes\n",
+ "print(\"=\" * 60)\n",
+ "print(\"STEP 3: Final Recommendations and Code Fixes\")\n",
+ "print(\"=\" * 60)\n",
+ "\n",
+ "step3_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a senior developer providing actionable solutions.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Based on the code analysis and review, provide final recommendations:\n",
+ "\n",
+ "<code>\n",
+ "{code_to_review}\n",
+ "</code>\n",
+ "\n",
+ "<analysis>\n",
+ "{analysis}\n",
+ "</analysis>\n",
+ "\n",
+ "<review>\n",
+ "{review}\n",
+ "</review>\n",
+ "\n",
+ "Provide:\n",
+ "1. A prioritized list of fixes in <fixes> tags\n",
+ "2. The complete refactored code in <refactored_code> tags\n",
+ "3. Brief explanation of key changes in <changes> tags\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "final_recommendations = get_chat_completion(step3_messages)\n",
+ "print(final_recommendations)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Example 2: Test Generation with Prompt Chaining\n",
+ "\n",
+ "Now let's create a chain for comprehensive test generation:\n",
+ "1. **Step 1:** Analyze function to identify test scenarios\n",
+ "2. **Step 2:** Generate test cases based on scenarios \n",
+ "3. **Step 3:** Review and enhance test coverage\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Prompt Chain Example: Test Generation Pipeline\n",
+ "function_to_test = \"\"\"\n",
+ "def calculate_discount(price, discount_percent, customer_tier='standard'):\n",
+ "    \\\"\\\"\\\"\n",
+ "    Calculate final price after applying discount.\n",
+ "\n",
+ "    Args:\n",
+ "        price: Original price (must be positive)\n",
+ "        discount_percent: Discount percentage (0-100)\n",
+ "        customer_tier: Customer tier ('standard', 'premium', 'vip')\n",
+ "\n",
+ "    Returns:\n",
+ "        Final price after discount and tier bonus\n",
+ "    \\\"\\\"\\\"\n",
+ "    if price < 0:\n",
+ "        raise ValueError(\"Price cannot be negative\")\n",
+ "\n",
+ "    if discount_percent < 0 or discount_percent > 100:\n",
+ "        raise ValueError(\"Discount must be between 0 and 100\")\n",
+ "\n",
+ "    # Apply base discount\n",
+ "    discounted_price = price * (1 - discount_percent / 100)\n",
+ "\n",
+ "    # Apply tier bonus\n",
+ "    tier_bonuses = {'standard': 0, 'premium': 5, 'vip': 10}\n",
+ "    if customer_tier not in tier_bonuses:\n",
+ "        raise ValueError(f\"Invalid tier: {customer_tier}\")\n",
+ "\n",
+ "    tier_bonus = tier_bonuses[customer_tier]\n",
+ "    final_price = discounted_price * (1 - tier_bonus / 100)\n",
+ "\n",
+ "    return round(final_price, 2)\n",
+ "\"\"\"\n",
+ "\n",
+ "# STEP 1: Analyze function and identify test scenarios\n",
+ "print(\"=\" * 60)\n",
+ "print(\"STEP 1: Analyze Function and Identify Test Scenarios\")\n",
+ "print(\"=\" * 60)\n",
+ "\n",
+ "step1_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a QA engineer analyzing code for test coverage.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Analyze this function and identify all test scenarios needed:\n",
+ "\n",
+ "<function>\n",
+ "{function_to_test}\n",
+ "</function>\n",
+ "\n",
+ "Identify and categorize test scenarios:\n",
+ "1. Happy path scenarios\n",
+ "2. Edge cases\n",
+ "3. Error cases\n",
+ "4. Boundary conditions\n",
+ "\n",
+ "Provide your analysis in <test_scenarios> tags.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "test_scenarios = get_chat_completion(step1_messages)\n",
+ "print(test_scenarios)\n",
+ "print(\"\\n\")\n",
+ "\n",
+ "# STEP 2: Generate test cases based on scenarios\n",
+ "print(\"=\" * 60)\n",
+ "print(\"STEP 2: Generate Test Cases\")\n",
+ "print(\"=\" * 60)\n",
+ "\n",
+ "step2_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a test automation engineer. Write pytest test cases.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Based on these test scenarios, generate pytest test cases:\n",
+ "\n",
+ "<function>\n",
+ "{function_to_test}\n",
+ "</function>\n",
+ "\n",
+ "<test_scenarios>\n",
+ "{test_scenarios}\n",
+ "</test_scenarios>\n",
+ "\n",
+ "Generate complete, executable pytest test cases in <test_code> tags.\n",
+ "Include assertions, test data, and descriptive test names.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "test_code = get_chat_completion(step2_messages)\n",
+ "print(test_code)\n",
+ "print(\"\\n\")\n",
+ "\n",
+ "# STEP 3: Review and enhance test coverage\n",
+ "print(\"=\" * 60)\n",
+ "print(\"STEP 3: Review Test Coverage and Suggest Enhancements\")\n",
+ "print(\"=\" * 60)\n",
+ "\n",
+ "step3_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a principal QA engineer reviewing test coverage.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Review this test suite for completeness:\n",
+ "\n",
+ "<function>\n",
+ "{function_to_test}\n",
+ "</function>\n",
+ "\n",
+ "<test_scenarios>\n",
+ "{test_scenarios}\n",
+ "</test_scenarios>\n",
+ "\n",
+ "<test_code>\n",
+ "{test_code}\n",
+ "</test_code>\n",
+ "\n",
+ "Evaluate:\n",
+ "1. Are all scenarios covered?\n",
+ "2. Are there any missing edge cases?\n",
+ "3. Is the test data comprehensive?\n",
+ "4. Estimate coverage percentage\n",
+ "\n",
+ "Provide:\n",
+ "- Coverage assessment in <coverage_assessment> tags\n",
+ "- Any additional test cases needed in <additional_tests> tags\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "coverage_review = get_chat_completion(step3_messages)\n",
+ "print(coverage_review)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Key Takeaways: Prompt Chaining\n",
+ "\n",
+ "**What We Demonstrated:**\n",
+ "\n",
+ "**Example 1: Code Review Chain**\n",
+ "- **Step 1:** Initial analysis identifies security vulnerabilities and code quality issues\n",
+ "- **Step 2:** Principal engineer validates the analysis and adds severity ratings\n",
+ "- **Step 3:** Generates actionable fixes and refactored code\n",
+ "\n",
+ "**Example 2: Test Generation Chain**\n",
+ "- **Step 1:** Analyzes function to identify all necessary test scenarios\n",
+ "- **Step 2:** Generates complete pytest test cases with proper structure\n",
+ "- **Step 3:** Reviews coverage and suggests additional tests for completeness\n",
+ "\n",
+ "**Why Chaining Works Better Than Single Prompts:**\n",
+ "- **Focused attention:** Each step handles one specific task without distraction\n",
+ "- **Quality control:** Later steps can review and enhance earlier outputs\n",
+ "- **Iterative refinement:** Each link improves the overall result\n",
+ "- **Easier debugging:** Problems can be isolated to specific steps\n",
+ "\n",
+ "**Best Practices Demonstrated:**\n",
+ "1. **Pass context forward:** Each step receives relevant outputs from previous steps\n",
+ "2. **Use XML tags:** Structured tags (`<code>`, `<analysis>`, `<review>`) organize data flow\n",
+ "3. **Clear objectives:** Each step has one specific, measurable goal\n",
+ "4. **Role specialization:** Different expert personas for different steps\n",
+ "\n",
+ "**Real-World Applications:**\n",
+ "- **Multi-stage refactoring:** Analyze → Plan → Refactor → Validate → Document\n",
+ "- **Comprehensive security audits:** Scan → Analyze → Prioritize → Generate fixes → Verify\n",
+ "- **API development:** Design schema → Generate code → Create tests → Write docs → Review\n",
+ "- **Database migrations:** Analyze schema → Generate migration → Create rollback → Test → Deploy\n",
+ "- **CI/CD pipeline generation:** Analyze project → Design workflow → Generate config → Add tests → Optimize\n",
+ "\n",
+ "**Pro Tip:** You can also create **self-correction chains** where the AI reviews its own work! Just pass the output back with a review prompt to catch errors and refine results.\n",
+ "\n",
+ "---\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "### ⚖️ Tactic 7: LLM-as-Judge\n",
+ "\n",
+ "**Create evaluation rubrics and self-critique loops**\n",
+ "\n",
+ "**Core Principle:** One of the most powerful patterns in prompt engineering is using an AI model as a judge or critic to evaluate and improve outputs. This creates a self-improvement loop where the AI reviews, critiques, and refines work—either its own outputs or those from other sources.\n",
+ "\n",
+ "**Why Use LLM-as-Judge:**\n",
+ "- **Quality assurance:** Catch errors, inconsistencies, and areas for improvement\n",
+ "- **Objective evaluation:** Get unbiased assessment based on specific criteria\n",
+ "- **Iterative refinement:** Continuously improve outputs through multiple review cycles\n",
+ "- **Scalable review:** Automate code reviews, documentation checks, and quality audits\n",
+ "\n",
+ "**When to Use LLM-as-Judge:**\n",
+ "- Code review and quality assessment\n",
+ "- Evaluating multiple solution approaches\n",
+ "- Grading or scoring responses against rubrics\n",
+ "- Providing constructive feedback on technical writing\n",
+ "- Testing and validation of AI-generated content\n",
+ "- Comparing different implementations\n",
+ "\n",
+ "**How to Implement:**\n",
+ "1. **Define clear criteria:** Specify what makes a good/bad output\n",
+ "2. **Provide rubrics:** Give the judge specific evaluation dimensions\n",
+ "3. **Request structured feedback:** Ask for scores, ratings, or categorized feedback\n",
+ "4. **Include examples:** Show what excellent vs. poor outputs look like\n",
+ "5. **Iterate:** Use feedback to improve and re-evaluate\n",
+ "\n",
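+ "A weighted-rubric judge can be sketched with the notebook's `get_chat_completion` helper (the criteria, weights, and the `code` variable here are illustrative assumptions):\n",
+ "\n",
+ "```python\n",
+ "# The rubric lives in the system message so every submission is judged the same way.\n",
+ "judge_system = (\n",
+ "    \"You are an impartial code judge. Score the submission 0-10 on \"\n",
+ "    \"correctness (50%), readability (30%), and performance (20%). \"\n",
+ "    \"Return each score, the weighted total, and one concrete improvement.\"\n",
+ ")\n",
+ "\n",
+ "verdict = get_chat_completion([\n",
+ "    {\"role\": \"system\", \"content\": judge_system},\n",
+ "    {\"role\": \"user\", \"content\": f\"<submission>{code}</submission>\"},\n",
+ "])\n",
+ "```\n",
+ "\n",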
+ "**Software Engineering Application Preview:** Essential for automated code reviews, architecture decision validation, test coverage assessment, documentation quality checks, and comparing multiple implementation approaches.\n",
+ "\n",
+ "*Reference: This technique combines elements from evaluation frameworks and self-critique patterns used in production AI systems.*\n",
+ "\n",
+ "---\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Example 1: Code Quality Judge\n",
+ "\n",
+ "Let's use AI as a judge to evaluate and compare two different implementations:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Example: LLM as Judge - Comparing Two Implementations\n",
+ "implementation_a = \"\"\"\n",
+ "def find_duplicates(items):\n",
+ "    duplicates = []\n",
+ "    for i in range(len(items)):\n",
+ "        for j in range(i + 1, len(items)):\n",
+ "            if items[i] == items[j] and items[i] not in duplicates:\n",
+ "                duplicates.append(items[i])\n",
+ "    return duplicates\n",
+ "\"\"\"\n",
+ "\n",
+ "implementation_b = \"\"\"\n",
+ "def find_duplicates(items):\n",
+ "    from collections import Counter\n",
+ "    counts = Counter(items)\n",
+ "    return [item for item, count in counts.items() if count > 1]\n",
+ "\"\"\"\n",
+ "\n",
+ "print(\"=\" * 70)\n",
+ "print(\"LLM AS JUDGE: Comparing Implementations\")\n",
+ "print(\"=\" * 70)\n",
+ "\n",
+ "judge_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"\"\"You are a senior software engineer acting as an impartial code judge.\n",
+ " \n",
+ "Evaluate code based on these criteria:\n",
+ "1. Time Complexity (weight: 30%)\n",
+ "2. Space Complexity (weight: 20%)\n",
+ "3. Readability (weight: 25%)\n",
+ "4. Maintainability (weight: 15%)\n",
+ "5. Edge Case Handling (weight: 10%)\n",
+ "\n",
+ "Provide:\n",
+ "- Scores (0-10) for each criterion\n",
+ "- Overall weighted score\n",
+ "- Pros and cons for each implementation\n",
+ "- Final recommendation\"\"\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Compare these two implementations of a function that finds duplicate items in a list:\n",
+ "\n",
+ "<implementation_a>\n",
+ "{implementation_a}\n",
+ "</implementation_a>\n",
+ "\n",
+ "<implementation_b>\n",
+ "{implementation_b}\n",
+ "</implementation_b>\n",
+ "\n",
+ "Evaluate both implementations using the criteria provided. Structure your response with:\n",
+ "1. <evaluation_a> tags for Implementation A analysis\n",
+ "2. <evaluation_b> tags for Implementation B analysis\n",
+ "3. <comparison> tags for side-by-side comparison\n",
+ "4. <recommendation> tags for final verdict\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "judge_response = get_chat_completion(judge_messages)\n",
+ "print(judge_response)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Example 2: Self-Critique and Improvement Loop\n",
+ "\n",
+ "Now let's create an improvement loop where AI generates code, critiques it, and then improves it:\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Example: Self-Critique and Improvement Loop\n",
+ "requirement = \"Create a function that validates and sanitizes user input for a SQL query\"\n",
+ "\n",
+ "# STEP 1: Generate initial solution\n",
+ "print(\"=\" * 70)\n",
+ "print(\"STEP 1: Generate Initial Solution\")\n",
+ "print(\"=\" * 70)\n",
+ "\n",
+ "generate_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a Python developer. Generate code solutions.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"{requirement}\n",
+ "\n",
+ "Provide your implementation in <code> tags.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "initial_code = get_chat_completion(generate_messages)\n",
+ "print(initial_code)\n",
+ "print(\"\\n\")\n",
+ "\n",
+ "# STEP 2: Critique the solution\n",
+ "print(\"=\" * 70)\n",
+ "print(\"STEP 2: Critique the Solution\")\n",
+ "print(\"=\" * 70)\n",
+ "\n",
+ "critique_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"\"\"You are a security-focused code reviewer. \n",
+ " \n",
+ "Evaluate code for:\n",
+ "- Security vulnerabilities\n",
+ "- Best practices\n",
+ "- Error handling\n",
+ "- Edge cases\n",
+ "- Code quality\n",
+ "\n",
+ "Provide brutally honest feedback with specific issues and severity levels.\"\"\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Requirement: {requirement}\n",
+ "\n",
+ "Initial implementation:\n",
+ "{initial_code}\n",
+ "\n",
+ "Critique this implementation. Identify all issues, rate severity (Critical/High/Medium/Low), and suggest specific improvements.\n",
+ "\n",
+ "Structure your response:\n",
+ "<critique>Your detailed critique</critique>\n",
+ "<issues>List of specific issues with severity</issues>\n",
+ "<improvements>Actionable improvement suggestions</improvements>\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "critique = get_chat_completion(critique_messages)\n",
+ "print(critique)\n",
+ "print(\"\\n\")\n",
+ "\n",
+ "# STEP 3: Improve based on critique\n",
+ "print(\"=\" * 70)\n",
+ "print(\"STEP 3: Improved Implementation\")\n",
+ "print(\"=\" * 70)\n",
+ "\n",
+ "improve_messages = [\n",
+ " {\n",
+ " \"role\": \"system\",\n",
+ " \"content\": \"You are a senior Python developer who learns from feedback.\"\n",
+ " },\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": f\"\"\"Requirement: {requirement}\n",
+ "\n",
+ "Original implementation:\n",
+ "{initial_code}\n",
+ "\n",
+ "Critique received:\n",
+ "{critique}\n",
+ "\n",
+ "Create an improved implementation that addresses ALL the issues raised in the critique.\n",
+ "Provide the improved code in <improved_code> tags and explain key changes in <changes> tags.\"\"\"\n",
+ " }\n",
+ "]\n",
+ "\n",
+ "improved_code = get_chat_completion(improve_messages)\n",
+ "print(improved_code)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Key Takeaways: LLM-as-Judge\n",
+ "\n",
+ "**What We Demonstrated:**\n",
+ "\n",
+ "**Example 1: Code Quality Judge**\n",
+ "- Defined clear evaluation criteria with weights\n",
+ "- Provided structured rubrics for assessment\n",
+ "- Got objective comparison of two implementations\n",
+ "- Received scored evaluation with pros/cons and recommendation\n",
+ "\n",
+ "**Example 2: Self-Critique and Improvement Loop**\n",
+ "- **Step 1:** Generated initial code solution\n",
+ "- **Step 2:** Used AI as brutal critic to identify issues\n",
+ "- **Step 3:** Improved code based on critique feedback\n",
+ "- Created a self-improvement cycle\n",
+ "\n",
+ "**Benefits of LLM-as-Judge:**\n",
+ "\n",
+ "1. **Objective Evaluation:**\n",
+ " - Unbiased assessment based on defined criteria\n",
+ " - Consistent scoring across multiple evaluations\n",
+ " - Reduces human bias in code reviews\n",
+ "\n",
+ "2. **Continuous Improvement:**\n",
+ " - Iterative refinement through critique loops\n",
+ " - Learn from mistakes and feedback\n",
+ " - Progressive quality enhancement\n",
+ "\n",
+ "3. **Scalable Reviews:**\n",
+ " - Automate repetitive evaluation tasks\n",
+ " - Handle multiple implementations simultaneously\n",
+ " - Save senior engineers' time for complex decisions\n",
+ "\n",
+ "4. **Structured Feedback:**\n",
+ " - Clear, actionable improvement suggestions\n",
+ " - Severity ratings for prioritization\n",
+ " - Specific examples and recommendations\n",
+ "\n",
+ "**Real-World Applications:**\n",
+ "\n",
+ "- **Automated Code Reviews:** Evaluate PRs against coding standards before human review\n",
+ "- **Architecture Decisions:** Compare multiple design approaches objectively\n",
+ "- **Test Quality Assessment:** Evaluate test coverage and edge case handling\n",
+ "- **Documentation Quality:** Grade documentation completeness and clarity\n",
+ "- **API Design Review:** Compare REST vs GraphQL implementations\n",
+ "- **Performance Optimization:** Evaluate before/after optimization attempts\n",
+ "- **Security Audits:** Systematic vulnerability assessment with severity ratings\n",
+ "\n",
+ "**Implementation Patterns:**\n",
+ "\n",
+ "```python\n",
+ "# Pattern 1: Single evaluation\n",
+ "judge_prompt = \"\"\"\n",
+ "Evaluate [OUTPUT] based on:\n",
+ "1. Criterion A (weight: X%)\n",
+ "2. Criterion B (weight: Y%)\n",
+ "\n",
+ "Provide scores and recommendation.\n",
+ "\"\"\"\n",
+ "\n",
+ "# Pattern 2: Comparative evaluation\n",
+ "judge_prompt = \"\"\"\n",
+ "Compare [OPTION_A] and [OPTION_B] against:\n",
+ "- Criteria 1\n",
+ "- Criteria 2\n",
+ "- Criteria 3\n",
+ "\n",
+ "Recommend the better option with justification.\n",
+ "\"\"\"\n",
+ "\n",
+ "# Pattern 3: Self-improvement loop\n",
+ "1. Generate solution\n",
+ "2. Critique solution (AI as judge)\n",
+ "3. Improve based on critique\n",
+ "4. (Optional) Re-evaluate improvement\n",
+ "```\n",
+ "\n",
+ "**Pro Tips:**\n",
+ "- **Define clear rubrics:** Specific criteria produce better judgments\n",
+ "- **Use weighted scoring:** Prioritize what matters most\n",
+ "- **Request examples:** Ask for specific code snippets in feedback\n",
+ "- **Iterate multiple times:** Don't stop at first critique\n",
+ "- **Combine with other tactics:** Use with prompt chaining for multi-stage reviews\n",
+ "\n",
+ "---\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "### 🤫 Tactic 8: Inner Monologue\n",
+ "\n",
+ "**Separate reasoning from clean final outputs**\n",
+ "\n",
+ "**Core Principle:** The Inner Monologue technique guides AI models to articulate their thought process internally before delivering a final response, effectively \"hiding\" the reasoning steps from the end user. This is particularly useful when you want the benefits of chain-of-thought reasoning without exposing the intermediate thinking to users.\n",
+ "\n",
+ "**Why Use Inner Monologue:**\n",
+ "- **Cleaner output:** Users see only the final answer, not the reasoning steps\n",
+ "- **Better reasoning:** The AI still benefits from step-by-step thinking internally\n",
+ "- **Professional presentation:** Provides concise, polished responses without verbose explanations\n",
+ "- **Flexible control:** You decide what to show and what to keep internal\n",
+ "\n",
+ "**When to Use Inner Monologue:**\n",
+ "- Customer-facing applications where clean responses are important\n",
+ "- API responses that need to be concise\n",
+ "- Documentation generation where only conclusions matter\n",
+ "- Code generation where you want the code, not the thought process\n",
+ "- Production systems where token efficiency is critical\n",
+ "\n",
+ "**How to Implement:**\n",
+ "1. **Instruct internal thinking:** Tell the AI to think through the problem internally\n",
+ "2. **Separate reasoning from output:** Use tags like `` for internal reasoning and `