# 📚 Prompt Engineering Techniques from Google's Whitepaper

This notebook expands on the patterns described in *whitepapers/22365_3_Prompt Engineering_v7 (1).pdf*. Each section summarizes the guidance, then provides a runnable cell you can adapt to your projects.


In [1]:
%pip install --upgrade --quiet google-genai python-dotenv


Note: you may need to restart the kernel to use updated packages.


In [None]:
# --- Import required libraries ---
import os
from collections import Counter

from dotenv import load_dotenv
from google import genai
from google.genai.types import GenerateContentConfig, Part

# --- Load environment variables from .env file (if present) ---
load_dotenv()

# --- Retrieve Google Cloud project ID from environment variables ---
PROJECT_ID = (
    os.getenv("GOOGLE_CLOUD_PROJECT")
    or os.getenv("PROJECT_ID")
)

# --- Check that a valid project ID is set, otherwise raise an error ---
if not PROJECT_ID or PROJECT_ID.strip() in {"", "your-project-id", "<project_id>"}:
    raise ValueError("Set GOOGLE_CLOUD_PROJECT or PROJECT_ID in your environment before running the examples.")

# --- Set the default location and model for the GenAI client ---
LOCATION = os.getenv("GOOGLE_CLOUD_REGION", "us-central1")
DEFAULT_MODEL = os.getenv("GENAI_MODEL", "gemini-2.0-flash-001")

# --- Initialize the GenAI client with Vertex AI, project, and location ---
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

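# --- Helper function to hold system-level guidance ---
# Gemini `contents` only accept the "user" and "model" roles, so system text is not
# sent as a message; supply the returned string as `system_instruction` in the
# generation config instead.
def system(text: str) -> str:
    return text.strip()

# --- Helper function to create a user message for the prompt ---
def user(text: str) -> dict:
    return {"role": "user", "parts": [{"text": text}]}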

# --- Function to generate a response from the model using the provided prompt and optional overrides ---
def generate(contents, **overrides):
    # Prepare configuration parameters, using defaults or overrides if provided
    config_kwargs = {
        "temperature": overrides.get("temperature", 0.2),
        "top_p": overrides.get("top_p", 0.9),
        "top_k": overrides.get("top_k", 40),
        "candidate_count": overrides.get("candidate_count", 1),
        "max_output_tokens": overrides.get("max_output_tokens", 512),
    }
    # Optionally add stop sequences, response MIME type, and a system instruction if specified
    for key in ("stop_sequences", "response_mime_type", "system_instruction"):
        if key in overrides:
            config_kwargs[key] = overrides[key]

    # Create the configuration object for content generation
    config = GenerateContentConfig(**config_kwargs)
    # Call the model to generate content with the specified configuration
    response = client.models.generate_content(
        model=overrides.get("model", DEFAULT_MODEL),
        contents=contents,
        config=config,
    )
    return response

# --- Function to print the text output from the model's response ---
def show_text(response) -> None:
    # response.text can be None (for example, when no text part is returned)
    print((response.text or "").strip())


## 🎯 Zero-Shot Prompting

- Express the task, domain constraints, and desired format directly in the prompt.
- The whitepaper stresses setting evaluation signals (word limits, bullet requirements, tone) inside the instructions.
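
Because the evaluation signals are written into the prompt, they can also be checked mechanically once a response arrives. The following is a minimal sketch, assuming the model returns Markdown bullets that start with `* ` or `- `; `check_constraints` and its parameters are hypothetical names, not part of the whitepaper or the SDK.

```python
def check_constraints(text: str, expected_bullets: int = 3, max_words: int = 18) -> list[str]:
    """Return a list of human-readable violations (empty when the output complies)."""
    # Keep only bullet lines and drop the leading bullet marker
    bullets = [
        line.strip()[1:].strip()
        for line in text.splitlines()
        if line.strip().startswith(("* ", "- "))
    ]
    problems = []
    if len(bullets) != expected_bullets:
        problems.append(f"expected {expected_bullets} bullets, found {len(bullets)}")
    for i, bullet in enumerate(bullets, start=1):
        word_count = len(bullet.split())
        # "under N words" means N itself is already a violation
        if word_count >= max_words:
            problems.append(f"bullet {i} has {word_count} words (limit: under {max_words})")
    return problems
```

A check like this is worth running because models do not always honor word limits: counted by whitespace, the first bullet of the sample output in this section runs to 20 words.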


In [6]:
zero_shot_prompt = """Task: Provide a 3-bullet post-operative checklist for patients who received a mandibular dental implant.
Audience: Non-technical adults.
Constraints:
- Keep each bullet under 18 words.
- Highlight red-flag symptoms that require immediate clinician contact.
- Close with a one-sentence reassurance.
Output: Markdown bullet list plus closing sentence."""

# Pass the prompt directly as a plain string
response = generate(
    zero_shot_prompt,
    temperature=0.2,
    max_output_tokens=256,
)
show_text(response)

Here's your post-operative checklist:

*   Take prescribed medications as directed for pain and infection prevention. **Call us immediately if you experience excessive bleeding or fever.**
*   Stick to a soft food diet and avoid chewing directly on the implant site. **Contact us right away if the implant feels loose.**
*   Gently rinse with saltwater after meals to keep the area clean. **Severe pain or swelling that worsens is a red flag - call us!**

We are here to support your healing process and ensure your implant is successful.


## 🧩 One- and Few-Shot Prompting

- Provide representative exemplars that illustrate structure, tone, and level of detail.
- Label inputs and outputs clearly so the model can copy the pattern, as recommended in the whitepaper's few-shot section.
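
Consistent labels are easier to keep when the prompt is assembled from data rather than typed by hand. The sketch below shows one way to do that; `build_few_shot_prompt` is a hypothetical helper, and the `Example N` / `Input:` / `Output:` labels simply mirror the prompt used in this section.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], new_input: str) -> str:
    """Format (input, output) pairs with identical labels, then append the new input."""
    blocks = [
        f"Example {i}\nInput: {inp}\nOutput: {out}"
        for i, (inp, out) in enumerate(examples, start=1)
    ]
    # End on an open "Output:" so the model completes the pattern
    blocks.append(f"Now follow the pattern.\nInput: {new_input}\nOutput:")
    return "\n\n".join(blocks)
```

Keeping the exemplars in a list also makes it cheap to swap, reorder, or add examples while testing which ones the model generalizes from.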


In [8]:
few_shot_prompt = """Example 1
Input: Patient reports "persistent metallic taste" after implant surgery.
Output: 1) Explain the most common benign causes in one sentence. 2) Suggest one at-home mitigation tip. 3) State when to escalate to the care team.

Example 2
Input: Patient reports "mild swelling on day 3" after sinus lift.
Output: 1) Explain expectation-setting for swelling windows. 2) Recommend one monitoring action. 3) Describe the escalation trigger.

Now follow the pattern.
Input: Patient reports "numbness near the lower lip" after mandibular implant placement.
Output:"""

response = generate(
    few_shot_prompt,
    temperature=0.2,
    max_output_tokens=320,
)
show_text(response)


1) Explain the likelihood and typical resolution timeframe. 2) Suggest one self-assessment technique. 3) State when to escalate to the care team.


## 🧭 System, Role, and Contextual Prompting

- **System prompts** set non-negotiable guardrails (whitepaper §18).
- **Role prompts** shift the model's voice or expertise (whitepaper §21).
- **Contextual prompts** inject factual briefs or conversation history that the model should rely on (whitepaper §23).


In [11]:
system_message = "You are the compliance reviewer for a dental clinic. Cite policy sections when flagging issues and never invent new guidelines."
policy_excerpt = """Policy snippet:
Section 4.2 requires surgical instrument sterilization logs to be completed within 12 hours of procedure end.
Section 5.1 mandates post-operative call attempts within 24 hours."""
user_request = "Audit yesterday's mandibular implant case 48273 for policy gaps."

# Combine all messages into a single string
combined_prompt = f"""System: {system_message}

{policy_excerpt}

Task: {user_request}"""

response = generate(
    combined_prompt,  # Pass as single string
    temperature=0.2,
    max_output_tokens=300,
)
show_text(response)

Okay, I will review the documentation for mandibular implant case 48273 from yesterday and flag any potential compliance issues based on the provided policy snippets.

After reviewing the documentation for case 48273, I have identified the following potential policy gaps:

*   **Surgical Instrument Sterilization Log:** The surgical instrument sterilization log for case 48273 was completed 18 hours after the procedure end time documented in the patient chart. This violates **Section 4.2**, which requires completion within 12 hours of the procedure end.

*   **Post-Operative Call Attempts:** The documentation indicates that the first post-operative call attempt was made 30 hours after the procedure end. This violates **Section 5.1**, which mandates post-operative call attempts within 24 hours.


In [12]:
role_message = "You are an experienced dental hygienist explaining procedures to anxious patients in a calm, friendly tone."
patient_question = "Why do I need a custom abutment for my crown?"

# Combine the role message and the question into a single string
combined_prompt = f"""{role_message}

Question: {patient_question}
Format: Plain-language paragraph under 90 words."""

response = generate(
    combined_prompt,  # Pass as single string
    temperature=0.3,
    max_output_tokens=220,
)
show_text(response)

Okay, so we're recommending a custom abutment for your crown, and I understand that might sound a little confusing. Think of the abutment as a connector piece between your implant and the crown itself. A custom abutment is made specifically for your mouth, taking into account the angle of your implant and the way your teeth naturally come together. This allows us to create a crown that fits perfectly, looks more natural, and distributes biting forces evenly. It's like having a tailored suit instead of one off the rack – it just fits better and works better in the long run!


In [13]:
context_brief = """Patient record excerpt:
- Name: Alex Chen, 47
- Procedure: Immediate load implant, tooth #19
- Medication: Amoxicillin 500 mg TID, Ibuprofen 600 mg PRN
- Notes: Reports mild occlusal discomfort on day 4"""

# Combine all messages into a single string
combined_prompt = f"""Use only the provided context and highlight any missing data you would normally request.

{context_brief}

Task: Draft the follow-up message in 3 numbered steps."""

response = generate(
    combined_prompt,  # Pass as single string
    temperature=0.2,
    max_output_tokens=280,
)
show_text(response)

Here's a follow-up message based on the provided context, highlighting missing information:

1.  **Subject: Follow-up Regarding Immediate Load Implant - Alex Chen**

2.  **Body:**

    Dear Alex,

    This message is to follow up on your immediate load implant procedure for tooth #19 on [**Missing: Date of Procedure**]. We understand you reported mild occlusal discomfort on day 4.

    Please describe the discomfort in more detail. Specifically, can you tell us:

    *   The location of the discomfort (is it specifically on tooth #19 or elsewhere?)
    *   The intensity of the discomfort (on a scale of 1-10, with 1 being very mild and 10 being severe)
    *   What makes the discomfort better or worse?
    *   How often are you taking the Ibuprofen?

3.  **Closing:**

    Please reply to this message or call us at [**Missing: Phone Number**] at your earliest convenience so we can address your concerns.

    Sincerely,

    [**Missing: Name/Title of Sender/Practice**]

**Missing Data I w

## 🪜 Step-Back Prompting

- Ask the model to reflect at a higher level before answering, then re-focus on the concrete question (whitepaper §25).
- This reduces myopic answers and surfaces additional considerations.
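
Step-back prompting can also be run as two separate calls, with the abstract answer fed back in as context for the concrete question. The sketch below assumes that two-call variant (the cell in this section uses a single combined prompt instead); `step_back` is a hypothetical helper and `ask` stands for any function that sends a prompt and returns text.

```python
from typing import Callable

def step_back(ask: Callable[[str], str], case: str, question: str) -> str:
    """First elicit general principles, then answer the concrete question with them."""
    # Call 1: step back to the general level
    principles = ask(
        f"Case: {case}\n"
        "Before addressing specifics, list the general principles that apply to this kind of case."
    )
    # Call 2: re-focus on the concrete question, grounded in those principles
    return ask(
        f"General principles:\n{principles}\n\n"
        f"Case: {case}\nQuestion: {question}\n"
        "Use the principles above to give actionable guidance."
    )
```

With this notebook's helper, `ask` could be something like `lambda p: generate(p).text`.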


In [14]:
# Combine all messages into a single string
combined_prompt = """First provide a 2-sentence abstract of the problem, then answer the question with actionable guidance.

Case: A patient presents with peri-implant mucositis three months after crown placement.
Question: Outline an immediate chairside plan and a 7-day home protocol."""

response = generate(
    combined_prompt,  # Pass as single string
    temperature=0.25,
    max_output_tokens=320,
)
show_text(response)

Peri-implant mucositis, inflammation of the soft tissues surrounding a dental implant, requires prompt intervention to prevent progression to peri-implantitis. This plan outlines immediate chairside treatment to debride the affected area and a 7-day home protocol focused on meticulous oral hygiene and antimicrobial therapy to reduce inflammation and promote healing.

**Immediate Chairside Plan:**

1.  **Assessment:**
    *   **Confirm Diagnosis:** Visually inspect for inflammation (redness, swelling), bleeding on probing (BOP), and measure probing depths (PD). Rule out mobility to differentiate from peri-implantitis.
    *   **Radiographic Evaluation:** If not already done, take a periapical radiograph to assess bone levels and rule out bone loss.
    *   **Occlusal Evaluation:** Check for occlusal overload, which can contribute to inflammation.

2.  **Debridement:**
    *   **Supragingival Debridement:** Remove plaque and calculus from the crown and implant abutment using plastic or t

## 🔗 Chain-of-Thought Prompting

- Encourage step-by-step reasoning with explicit instructions and expected structure (whitepaper §29).
- Combine with formatted outputs so the rationale and answer stay separated.
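
Once the output follows a fixed structure, the rationale and the answer can be separated in code. This is a minimal sketch, assuming the `Step N:` and `FINAL RECOMMENDATION:` labels requested by the prompt in this section; `split_reasoning` is a hypothetical helper.

```python
from typing import Optional

def split_reasoning(text: str, marker: str = "FINAL RECOMMENDATION:") -> tuple[list[str], Optional[str]]:
    """Return (reasoning steps, final answer); the answer is None if the marker is absent."""
    body, found, final = text.partition(marker)
    steps = [line.strip() for line in body.splitlines() if line.strip().startswith("Step ")]
    return steps, (final.strip() if found else None)
```

A `None` answer is a useful signal in its own right: it usually means the response was cut off by `max_output_tokens` before the model reached its conclusion.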


In [15]:
# Create a structured CoT prompt that maintains the reasoning flow
cot_prompt = """You are a surgical planning assistant. Think aloud in numbered steps before giving a final decision.

SCENARIO: Patient has 6 mm posterior mandibular bone height above the nerve canal.

OPTIONS:
1. Short implant
2. Sinus lift with longer implant  
3. Angled placement

TASK: Provide reasoning steps then recommend the safest option.

Please structure your response as:
Step 1: [Your first reasoning step]
Step 2: [Your second reasoning step]
Step 3: [Your third reasoning step]
...
FINAL RECOMMENDATION: [Your final decision with justification]"""

response = generate(
    cot_prompt,
    temperature=0.4,
    max_output_tokens=400,
)
show_text(response)

Okay, I will analyze the scenario and provide a recommendation for the safest option for implant placement in the posterior mandible with 6mm of bone height above the nerve canal.

Step 1: **Assess the primary risk:** The most significant risk in this scenario is damaging the inferior alveolar nerve (IAN) during implant placement. This can lead to permanent numbness, tingling, or pain in the lower lip and chin. Therefore, any option that minimizes the risk of nerve injury is preferred.

Step 2: **Evaluate the short implant option:** A 6mm bone height is at the absolute minimum for a short implant. While short implants have improved significantly, their long-term success rate can be lower than longer implants, especially in the posterior mandible where occlusal forces are high. However, the primary advantage here is nerve avoidance.

Step 3: **Evaluate the sinus lift option:** A sinus lift with a longer implant would provide better long-term stability and a more favorable crown-to-impla

## 🔄 Self-Consistency Decoding

- Sample multiple reasoning paths at higher temperature and keep the majority answer (whitepaper §32).
- Useful when the question has a single best answer but requires reasoning to uncover it.


In [23]:
from collections import Counter

cot_prompt = """Think step by step. Conclude with a line that begins 'Answer:' followed by your recommendation.

CLINICAL SCENARIO: Immediate implant placement is planned in the upper lateral incisor position with thin facial bone.

QUESTION: Choose between (A) immediate provisional crown or (B) healing abutment with delayed provisional. Justify the safer choice.

Please structure your response as:
Step 1: [Your first reasoning step]
Step 2: [Your second reasoning step]
Step 3: [Your third reasoning step]
...
Answer: [Your final recommendation with justification]"""

# Generate multiple responses manually for self-consistency
candidates = []
num_candidates = 5

print("🔄 Generating multiple candidates for self-consistency...")

for i in range(num_candidates):
    print(f"\n--- Candidate {i+1} ---")
    
    response = generate(
        cot_prompt,
        temperature=0.8,  # High temperature for diversity
        max_output_tokens=280,
    )
    
    # Extract text from the correct path: candidate.content.parts[0].text
    if response.candidates and len(response.candidates) > 0:
        candidate = response.candidates[0]
        if candidate.content and candidate.content.parts and len(candidate.content.parts) > 0:
            text = candidate.content.parts[0].text
            print(text)
            
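            # Extract the answer and reduce it to its option letter, so candidates that
            # pick the same option with different wording count as the same vote
            if "Answer:" in text:
                answer = text.split("Answer:", 1)[1].strip()
                hits = sorted((answer.find(f"({opt})"), opt) for opt in "AB" if f"({opt})" in answer)
                label = hits[0][1] if hits else answer  # fall back to the raw answer text
                candidates.append(label)
                print(f"✅ Answer extracted: {label}")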
        else:
            print("❌ No content parts found")
    else:
        print("❌ No candidates found")

# Find consensus
if candidates:
    print(f"\n📊 SELF-CONSISTENCY ANALYSIS:")
    print(f"Total candidates: {len(candidates)}")
    
    consensus = Counter(candidates).most_common(1)[0]
    print(f"🎯 CONSENSUS ANSWER: {consensus[0]} (votes: {consensus[1]})")
    
    # Show all unique answers
    unique_answers = Counter(candidates)
    print(f"\n📋 ALL ANSWERS:")
    for answer, count in unique_answers.most_common():
        print(f"  {answer} (votes: {count})")
else:
    print("❌ No answers found in candidates")

🔄 Generating multiple candidates for self-consistency...

--- Candidate 1 ---
Step 1: Immediate implant placement in an area with thin facial bone is a high-risk situation for recession and compromised esthetics. The thin bone is susceptible to resorption following extraction.

Step 2: An immediate provisional crown, while potentially beneficial for soft tissue contouring and patient satisfaction, places immediate load on the implant, which could lead to micromotion and compromise osseointegration, particularly in the presence of thin facial bone. This can lead to bone loss and failure.

Step 3: A healing abutment allows for undisturbed osseointegration. A delayed provisional avoids any immediate loading forces on the implant during the critical healing phase. Soft tissue management can be performed later once the implant is stable. Grafting can be done at the time of implant placement and covered by the healing abutment to promote bone fill.

Step 4: While esthetics are important, the

## 🌳 Tree-of-Thought Planning

- Explore multiple solution branches, then score and select the best one (whitepaper §36).
- We can chain two calls: one to expand possible plans, another to pick the strongest path.


In [26]:
problem_statement = """Goal: Reduce chair time for full-arch implant impression appointments.
Constraints: Maintain accuracy within 50 microns. Patient tolerance is low for long procedures."""

# First exploration phase
exploration_prompt = f"""Brainstorm three distinct strategy branches. For each, note benefits and risks.

{problem_statement}"""

exploration = generate(
    exploration_prompt,
    temperature=0.6,
    max_output_tokens=360,
)

# Extract text using the correct path
branches = exploration.candidates[0].content.parts[0].text.strip()
print("Exploration summary:\n", branches)

# Second evaluation phase
evaluation_prompt = f"""Select the branch that best balances accuracy and patient comfort. Cite the deciding factors.

{branches}"""

evaluation = generate(
    evaluation_prompt,
    temperature=0.3,
    max_output_tokens=280,
)

# Print the evaluation of the explored branches
show_text(evaluation)

Exploration summary:
 Okay, here are three distinct strategy branches to reduce chair time for full-arch implant impression appointments, keeping accuracy and patient tolerance in mind, along with their benefits and risks:

**Strategy Branch 1:  Optimized Digital Impression Workflow (Focus: Speed and Efficiency)**

*   **Description:**  Emphasizes a completely digital workflow, focusing on maximizing the efficiency of intraoral scanning and digital model creation. This could involve advanced scanning techniques, optimized scan bodies, and streamlined software processing.

    *   **Tactics:**
        *   Utilize high-speed intraoral scanners with wide fields of view.
        *   Employ pre-operative digital scans for planning and initial data capture.
        *   Use scan bodies designed for rapid alignment and minimal scanning passes.
        *   Implement AI-powered software for automated scan alignment and model creation.
        *   Consider hybrid approaches: use a pre-made custom

## 🤖 Automatic Prompt Engineering

- Let the model iterate on your base prompt to propose refined variants (whitepaper §40).
- Pair this with a separate evaluation prompt or human review to select the winner.
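
The selection step can be as simple as scoring each variant and keeping the best one. The sketch below assumes the variants have already been parsed into a list of strings; `pick_best_prompt` and the toy `keyword_coverage` scorer are hypothetical, and in practice the scorer would be a model-based evaluation or a human rating.

```python
from typing import Callable

def keyword_coverage(prompt: str, required: tuple[str, ...] = ("escalation", "medication", "empathetic")) -> float:
    """Toy scorer: fraction of required terms that the prompt mentions."""
    lowered = prompt.lower()
    return sum(term in lowered for term in required) / len(required)

def pick_best_prompt(variants: list[str], score: Callable[[str], float]) -> tuple[str, float]:
    """Score every candidate prompt and return the highest-scoring one with its score."""
    scored = [(score(variant), variant) for variant in variants]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score
```

Keeping the scorer as a parameter makes it easy to replace the keyword heuristic with a stricter evaluation later without touching the selection logic.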


In [28]:
base_prompt = """Draft a same-day surgery follow-up SMS for implant patients. Tone: empathetic, professional. Include escalation instructions and a reminder about medication timing."""

# Combine system and user messages into a single string
improver_prompt = f"""You are a prompt engineer improving requests for Gemini. Return three improved variants labeled Variant 1-3 with a short rationale for each.

Original prompt: {base_prompt}"""

improved = generate(
    improver_prompt,  # Pass as single string
    temperature=0.7,
    max_output_tokens=420,
)
show_text(improved)

Here are three improved variants of the original prompt, designed to elicit more specific and useful responses from Gemini:

**Variant 1:**

*   **Prompt:** "Draft an SMS message to be sent to patients the same day after they receive dental implants. The message should be empathetic and professional. Specifically, include: 1) a check-in to see how they are feeling post-surgery; 2) a reminder to take their prescribed pain medication as directed; 3) clear instructions on how to contact the clinic immediately if they experience excessive bleeding, swelling, or severe pain (include a phone number and specify acceptable hours for contact); and 4) a closing statement offering reassurance and wishing them a comfortable recovery. The message should be concise, ideally under 160 characters. Assume the patient's name is already known and can be inserted dynamically. Example opening: 'Hi [Patient Name],'"
*   **Rationale:** This variant adds significant detail and structure. It breaks down the de

## 🧑‍💻 Code Prompting Patterns

- The whitepaper highlights specialized prompts for writing, explaining, translating, and debugging code (whitepaper §42-48).
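- `response_mime_type` can constrain the output format (for example `application/json`), but the first code cell below leaves it unset after hitting an `INVALID_ARGUMENT` error and relies on the instruction "Return only runnable code" instead.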
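
Even when asked for code only, the model often wraps its answer in a Markdown fence, as the output further down shows. The sketch below is one way to recover the raw code, assuming at most one fenced block matters and tolerating a missing closing fence (which happens when the response is truncated); `extract_code` is a hypothetical helper.

```python
import re

FENCE = "`" * 3  # built programmatically so this snippet nests cleanly inside Markdown

def extract_code(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text if there is no fence."""
    pattern = (
        re.escape(FENCE) + r"[\w+-]*[ \t]*\n"        # opening fence with optional language tag
        r"(.*?)"                                      # code body, as short as possible
        r"(?:" + re.escape(FENCE) + r"|\Z)"           # closing fence, or end of text if truncated
    )
    match = re.search(pattern, text, flags=re.DOTALL)
    return (match.group(1) if match else text).strip()
```

Truncated code is still incomplete code, so a result recovered without a closing fence should be treated as a fragment rather than something safe to execute.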


In [30]:
# Combine system and user messages into a single string
code_prompt = """You are a Python automation expert. Return only runnable code.

Create a function that ingests a CSV of implant torque readings and returns a summary dictionary with mean, min, max, and any values below 30 Ncm.
Constraints: use pandas, raise ValueError if required columns are missing."""

# response_mime_type is left unset here because it triggered an INVALID_ARGUMENT error
code_writer = generate(
    code_prompt,  # Pass as single string
    temperature=0.2,
    max_output_tokens=320,
)
print(code_writer.candidates[0].content.parts[0].text)  # Read the text of the first candidate directly

```python
import pandas as pd
import io

def analyze_implant_torque(csv_data):
    """
    Analyzes implant torque readings from a CSV string.

    Args:
        csv_data (str): A string containing the CSV data.  Must have a 'Torque (Ncm)' column.

    Returns:
        dict: A dictionary containing the mean, min, max, and values below 30 Ncm.

    Raises:
        ValueError: If the input is not a string or if the 'Torque (Ncm)' column is missing.
    """

    if not isinstance(csv_data, str):
        raise ValueError("Input must be a string containing CSV data.")

    try:
        df = pd.read_csv(io.StringIO(csv_data))
    except Exception as e:
        raise ValueError(f"Error reading CSV data: {e}")

    required_column = 'Torque (Ncm)'
    if required_column not in df.columns:
        raise ValueError(f"Required column '{required_column}' is missing.")

    try:
        df[required_column] = pd.to_numeric(df[required_column], errors='coerce')
        df = df.dropna(subset=[required
```

In [32]:
code_to_explain = """def recommended_torque(surface_area: float, bone_quality: str) -> float:
    baseline = 35.0
    modifiers = {"D1": 5.0, "D2": 0.0, "D3": -5.0, "D4": -8.0}
    adjustment = modifiers.get(bone_quality.upper(), -3.0)
    area_factor = min(max(surface_area - 1.5, -0.5), 3.0) * 2.5
    return max(20.0, baseline + adjustment + area_factor)"""

explanation = generate(
    f"""You are a Python code explainer for junior dental residents. Explain this code line-by-line:

{code_to_explain}""",
    temperature=0.2,
    max_output_tokens=360,
)
show_text(explanation)


Alright, future oral surgeons! Let's break down this Python function, `recommended_torque`, line by line. This function aims to give you a recommended torque value for implant placement based on the implant's surface area and the patient's bone quality.

```python
def recommended_torque(surface_area: float, bone_quality: str) -> float:
```

*   **`def recommended_torque(surface_area: float, bone_quality: str) -> float:`**
    *   `def`: This keyword defines a function.  Think of it as telling Python, "Hey, I'm about to create a reusable block of code."
    *   `recommended_torque`: This is the name of our function.  We've chosen a descriptive name so it's clear what the function does.
    *   `(surface_area: float, bone_quality: str)`:  These are the *parameters* (or inputs) the function needs to work.
        *   `surface_area: float`:  This tells us the function expects a value representing the implant's surface area. The `: float` part is a *type hint*. It suggests that this value s

In [35]:
translation_prompt = """You are a TypeScript expert. Translate the following Python code to TypeScript while preserving comments and docstrings.

def chairside_timer(minutes: int) -> None:
    \"\"\"Notify clinical staff when irrigation time exceeds the safe window.\"\"\"
    if minutes >= 3:
        print("Pause irrigation and reassess tissue response")
    else:
        print("Continue irrigation and monitor patient comfort")"""

translation = generate(
    translation_prompt,
    temperature=0.2,
    max_output_tokens=360,
)
show_text(translation)


```typescript
/**
 * Notify clinical staff when irrigation time exceeds the safe window.
 * @param minutes - The number of minutes of irrigation.
 */
function chairsideTimer(minutes: number): void {
  // Notify clinical staff when irrigation time exceeds the safe window.
  if (minutes >= 3) {
    console.log("Pause irrigation and reassess tissue response");
  } else {
    console.log("Continue irrigation and monitor patient comfort");
  }
}
```


In [37]:
buggy_snippet = """import pandas as pd

def load_implant_log(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)
    if 'torque' not in df.columns:
        raise RuntimeError('torque column missing')
    df['torque'] = df['torque'].astype(int)
    return df"""

debug_response = generate(
    f"""Act as a senior Python reviewer. Identify bugs and propose a corrected version.

{buggy_snippet}""",
    temperature=0.25,
    max_output_tokens=360,
)
show_text(debug_response)


Okay, I'll review this Python code snippet as a senior Python reviewer.

**Code:**

```python
import pandas as pd

def load_implant_log(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)
    if 'torque' not in df.columns:
        raise RuntimeError('torque column missing')
    df['torque'] = df['torque'].astype(int)
    return df
```

**Review:**

The code aims to load an implant log from a CSV file using pandas, check for the presence of a 'torque' column, and convert the 'torque' column to integers.  Here's a breakdown of potential issues and improvements:

1.  **Error Handling:** The `RuntimeError` is a reasonable choice for a missing column. However, it might be beneficial to provide more context in the error message.

2.  **Data Type Conversion:** The code attempts to convert the 'torque' column to integers. This is good, but it doesn't handle potential errors during the conversion.  If the 'torque' column contains non-numeric values (e.g., strings, floats that cannot be safel

## 🖼️ Multimodal Prompting

- Gemini accepts images alongside text (whitepaper §54). Keep modality instructions explicit so the model knows how to use each input.
- Replace the placeholder URI with your secure Cloud Storage path or upload helper before running this example.


In [None]:
oral_xray = Part.from_uri(
    file_uri="gs://your-bucket/path/to/post-op-xray.png",
    mime_type="image/png",
)

# Gemini `contents` have no "system" role, so the instruction travels with the user turn
multimodal_response = generate(
    [
        oral_xray,
        "You are a radiology assistant. Provide a concise bullet list of observations and flag any urgent findings.\n\n"
        "Context: Patient is 48 hours post-op from a zygomatic implant.",
    ],
    temperature=0.25,
    max_output_tokens=280,
)
show_text(multimodal_response)


## 🧠🔧 ReAct (Reason and Act) Prompting

ReAct (Reason and Act) is a prompting paradigm that enables large language models (LLMs) to solve complex tasks by combining natural language reasoning with the ability to take actions—such as searching the web or running code. This approach allows the LLM to interact with external tools (like search APIs or code interpreters), making it a foundational step toward building AI agents.

ReAct mimics how humans operate: we reason about a problem, take actions to gather more information, and then update our reasoning based on what we learn. In practice, ReAct prompting creates a "thought-action loop" where the LLM:
1. Reasons about the problem and generates a plan.
2. Executes actions (e.g., web searches) based on that plan.
3. Observes the results of those actions.
4. Updates its reasoning and plans the next steps.
5. Repeats this process until it reaches a solution.
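
The loop above can be sketched in plain Python before bringing in a framework. This is a toy illustration, not LangChain's implementation: `react_loop` is a hypothetical function, `llm` is any callable that maps a transcript to the model's next reply, and the `Action:` / `Action Input:` / `Final Answer:` markers are an assumed output format.

```python
from typing import Callable

def react_loop(
    llm: Callable[[str], str],
    tools: dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model turns and tool calls until the model emits 'Final Answer:'."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)  # steps 1 and 4: reason, or update the plan
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        if "Action:" in reply and "Action Input:" in reply:
            name = reply.split("Action:", 1)[1].split("\n", 1)[0].strip()
            arg = reply.split("Action Input:", 1)[1].split("\n", 1)[0].strip()
            tool = tools.get(name)
            observation = tool(arg) if tool else f"Unknown tool: {name}"  # step 2: act
            transcript += f"Observation: {observation}\n"  # step 3: observe
    # step 5 is the loop itself; give up once the step budget is spent
    return "No answer within the step limit."
```

The `max_steps` cap matters in practice: without it, a model that never produces a final answer would keep calling tools indefinitely.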

In the following code, we’ll use the LangChain framework for Python, together with Vertex AI and the SerpAPI search tool. To run it, get a (free) SerpAPI key from [serpapi.com](https://serpapi.com/manage-api-key) and set it as the `SERPAPI_API_KEY` environment variable.

The example task: "How many kids do the band members of Metallica have?" The LLM reasons through the problem, uses web search to find the number of children for each band member, and sums the results, demonstrating the ReAct approach in action.

After running the code, you’ll see how the agent chains together multiple searches, tracks its observations, and arrives at the final answer. This showcases how ReAct prompting enables LLMs to break down and solve multi-step, real-world problems by reasoning and acting iteratively.

For more details and advanced examples, see the referenced notebook in the GoogleCloudPlatform GitHub repository.

In [90]:
%pip install --upgrade --quiet langchain langchain-community langchain-google-vertexai google-search-results



In [92]:
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Check for SERPAPI_API_KEY in environment
serpapi_key = os.getenv("SERPAPI_API_KEY")
if not serpapi_key:
    raise ValueError("SERPAPI_API_KEY not found in environment. Please set it in your .env file.")

# Use the modern langchain-google-vertexai package
# (requires: pip install langchain-google-vertexai)
from langchain_google_vertexai import VertexAI
from langchain.agents import load_tools, initialize_agent, AgentType

# Initialize with explicit model name
llm = VertexAI(
    model_name="gemini-2.0-flash",
    temperature=0.1,
    max_output_tokens=1024,
    top_p=0.8,
    top_k=40,
)

prompt = "How many kids do the band members of Metallica have?"

# Load tools and create agent
tools = load_tools(["serpapi"], llm=llm)
agent = initialize_agent(
    tools, 
    llm,  
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, 
    verbose=True,
    handle_parsing_errors=True
)

# Execute
result = agent.run(prompt)
print(f"Result: {result}")



> Entering new AgentExecutor chain...


[32;1m[1;3mI need to find out how many children each member of Metallica has and then sum them up. I'll start by searching for information on each member individually.
Action: Search
Action Input: "How many children does James Hetfield have?"
Observation: ['Personal life. Hetfield married Francesca Tomasi on August 17, 1997, and together they have three children. He currently resides in Vail, Colorado, citing a " ...', "These are James Hetfield's three children who are Castor, Cali, and Marcella. · James Hetfield—rockstar, father, and family man. · Other posts.", "James and Francesca have three kids: Cali born in 1998, Castor born in 2000 and Marcella born in 2002. 17. Among Hetfield's many tattoos are ...", 'Hetfield also has two daughters Cali and Marcella. Check out a live video from the group in their hometown below.', "Happy Father's Day James Hetfield. So grateful for our three amazing kids!!!! more. View all 46 comments.", "Can you imagine being a kid that's g

# 🦷🔍 Applying ReAct Prompting to Dental Implantology

Now, let's see how the ReAct (Reason and Act) prompting approach can be used to answer a real-world clinical question in dental implantology—my field of interest. 

For this example, we'll use ReAct to investigate:  
**"What is the recommended Implant Stability Quotient (ISQ) threshold for immediate loading in single implants versus multiple implants?"**

This exercise will demonstrate how an LLM can reason through a specialized clinical question, search for up-to-date evidence or guidelines, and synthesize a clear answer—mirroring the way a clinician might approach evidence-based decision making.
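To make the reason, act, observe cycle concrete before running the real agent, here is a minimal sketch of a ReAct-style loop in plain Python. It is not LangChain's implementation. `react_loop`, `scripted_llm`, and `fake_search` are hypothetical names introduced for illustration, the "LLM" is a scripted stub, and the search tool returns a placeholder string rather than real clinical evidence.

```python
import re
from typing import Callable, Dict

# Patterns for the "Action / Action Input" and "Final Answer" lines used in ReAct traces.
ACTION_RE = re.compile(
    r"Action:\s*(?P<tool>.+?)\s*\nAction Input:\s*(?P<arg>.+)", re.DOTALL
)
FINAL_RE = re.compile(r"Final Answer:\s*(?P<answer>.+)", re.DOTALL)


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate model steps and tool calls until a final answer appears."""
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(scratchpad)  # Reason: the model writes a thought plus an action
        scratchpad += step + "\n"

        final = FINAL_RE.search(step)
        if final:
            return final.group("answer").strip()

        action = ACTION_RE.search(step)
        if action is None:
            # Feed the parsing failure back so the model can correct its format.
            scratchpad += "Observation: could not parse an action.\n"
            continue

        tool_name = action.group("tool").strip()
        tool_input = action.group("arg").strip().strip('"')
        tool = tools.get(tool_name)
        # Act: run the tool, then record what it returned as the observation.
        observation = tool(tool_input) if tool else f"unknown tool {tool_name!r}"
        scratchpad += f"Observation: {observation}\n"
    return "No final answer within the step limit."


def scripted_llm(scratchpad: str) -> str:
    """Hypothetical stand-in for a model: search once, then answer."""
    if "Observation:" not in scratchpad:
        return (
            "Thought: I should look this up.\n"
            "Action: Search\n"
            'Action Input: "ISQ threshold immediate loading"'
        )
    return "Thought: I have what I need.\nFinal Answer: (summary of the placeholder observation)"


def fake_search(query: str) -> str:
    """Hypothetical search tool that returns a placeholder, not real evidence."""
    return f"placeholder result for: {query}"


answer = react_loop(
    "What ISQ threshold is recommended for immediate loading?",
    scripted_llm,
    {"Search": fake_search},
)
print(answer)
```

The real agent replaces `scripted_llm` with a Gemini call and `fake_search` with SerpAPI. The loop keeps the same shape: append each thought and observation to a growing scratchpad, and stop when the model emits a final answer or the step limit is reached.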

In [93]:
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Check for SERPAPI_API_KEY in environment
serpapi_key = os.getenv("SERPAPI_API_KEY")
if not serpapi_key:
    raise ValueError("SERPAPI_API_KEY not found in environment. Please set it in your .env file.")

# Import the agent helpers outside the try block so they stay defined
# even when the Vertex AI integration has to be installed first
from langchain.agents import load_tools, initialize_agent, AgentType

# Use the modern langchain-google-vertexai package, installing it on first use if needed
try:
    from langchain_google_vertexai import VertexAI
except ImportError:
    print("Installing langchain-google-vertexai...")
    import subprocess
    import sys
    # Call pip through the running interpreter so the package lands in this kernel's environment
    subprocess.check_call([sys.executable, "-m", "pip", "install", "langchain-google-vertexai"])
    from langchain_google_vertexai import VertexAI

# Initialize with an explicit model name
llm = VertexAI(
    model_name="gemini-2.0-flash",
    temperature=0.1,
    max_output_tokens=1024,
    top_p=0.8,
    top_k=40
)

prompt = "What is the recommended Implant Stability Quotient (ISQ) threshold for immediate loading in single implants versus multiple implants?"

# Load tools and create agent
tools = load_tools(["serpapi"], llm=llm)
agent = initialize_agent(
    tools, 
    llm,  
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, 
    verbose=True,
    handle_parsing_errors=True
)

# Execute
result = agent.run(prompt)
print(f"Result: {result}")



> Entering new AgentExecutor chain...
I need to find information about the recommended ISQ threshold for immediate loading in single and multiple implants. I will use the search tool to find relevant articles or guidelines.
Action: Search
Action Input: "ISQ threshold immediate loading single vs multiple implants"
Observation: ['In the case of immediate loading of single-implant crowns, the recommendations provide an ITV > 20 to 45 N/cm and ISQ > 60 to 65 [16]. In ...', 'Torque values ranging from 30 to 40 Ncm and higher have been usually chosen as thresholds for immediate loading (57, 58).', "For single teeth, ISQ values in the low to mid 70's are preferred and the provisional should be taken out of occlusion. Regardless of the case, ...", 'The torque value in most studies ranged from 30 to 45 N cm as the immediate loading threshold to ensure implant stability during osseointegration and to provide ...', 'The aim of this study was to review avail