# 01 - Prompt Chaining: LPP Classification Pipeline

## What is Prompt Chaining?

Prompt chaining breaks a complex task into a sequence of fixed steps. The output of each step feeds into the next.
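
A minimal sketch of the idea in plain Python, with no model calls: each step is a function that receives the previous step's output, and every intermediate result is kept. The names `run_chain`, `normalise` and `tag_request` are hypothetical and used only for illustration.

```python
from typing import Callable, List


def run_chain(steps: List[Callable[[str], str]], initial_input: str) -> List[str]:
    """Run each step on the previous step's output and keep every intermediate result."""
    outputs = []
    current = initial_input
    for step in steps:
        current = step(current)
        outputs.append(current)
    return outputs


# Stand-in steps; in a real chain each one would wrap an LLM call.
def normalise(text: str) -> str:
    return text.strip().lower()


def tag_request(text: str) -> str:
    return f"{text} [advice requested]" if "advise" in text else text


results = run_chain([normalise, tag_request], "  Please ADVISE on the contract  ")
print(results)
```

Keeping the list of intermediate outputs, rather than only the last one, is what makes the chain inspectable step by step.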

## Why use it for privilege review?

It mirrors how a lawyer thinks through privilege:

1. Is this a communication?
2. Who are the parties?
3. Is a lawyer involved?
4. Is legal advice being sought or given?
5. Was it made in confidence?
6. Has privilege been waived?

Each step is auditable - you can see exactly where the reasoning happened.

## Australian Law Reference

- Evidence Act 1995 (Cth) ss 118-119
- Dominant purpose test: *Esso Australia Resources Ltd v Commissioner of Taxation* (1999)


## Step 2: Import libraries and set up the client

In [None]:
from openai import OpenAI
from IPython.display import display, Markdown
import json

client = OpenAI()
MODEL = "gpt-4.1-nano"

**What this does:**
- `from openai import OpenAI` - loads the OpenAI client class
- `from IPython.display import display, Markdown` - for formatted output
- `import json` - we'll use this later for structured data
- `client = OpenAI()` - creates the API client, which reads your key from the `OPENAI_API_KEY` environment variable (if the key lives in a `.env` file, it must be loaded into the environment first)
- `MODEL = "gpt-4.1-nano"` - sets which model to use (cheap and fast for testing)
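
By default the OpenAI client looks for the key in the `OPENAI_API_KEY` environment variable, so it can help to check for it before running the chain. The sketch below uses only the standard library; the helper name `api_key_is_configured` is an assumption made for this example.

```python
import os


def api_key_is_configured(env=None) -> bool:
    """Return True if a non-empty OPENAI_API_KEY is present in the given mapping."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not api_key_is_configured():
    print("OPENAI_API_KEY is not set; the client cannot authenticate until it is.")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.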

## Step 3: Create a synthetic test email

A synthetic privileged email for testing our classification pipeline.



In [None]:
test_email = {
    "id": "DOC001",
    "from": "sarah.chen@acmecorp.com.au",
    "to": "michael.wong@wongpartners.com.au",
    "cc": [],
    "date": "2024-03-15",
    "subject": "Urgent - Contract dispute advice needed",
    "body": """
Hi Michael,

As discussed on the phone, we've received a letter from BuildRight Pty Ltd claiming we breached the construction contract for the Melbourne warehouse project.

They're claiming $2.3 million in damages and threatening to commence proceedings in the Victorian Supreme Court.

Can you please review the attached contract and advise on:
1. Whether we have a valid defence
2. Our potential liability exposure
3. Whether we should attempt to negotiate a settlement

This is confidential and I need your legal advice urgently.

Regards,
Sarah Chen
General Counsel
ACME Corporation Pty Ltd
""",
    "attachments": ["construction_contract.pdf"]
}

print(f"Test email created: {test_email['subject']}")

**What this does:**
- Creates a synthetic email from a client (ACME's General Counsel) to their lawyer
- Seeks legal advice on a contract dispute
- This should be classified as PRIVILEGED

## Step 4: Chain Step 1 - Identify the Parties

The first step in our privilege analysis - who sent this and who received it?

**What this does:**
- Sends the email to the LLM
- Asks it to extract sender, recipients, and their roles
- Returns structured information for the next step in the chain

In [None]:
def step1_identify_parties(email):
    """Extract sender, recipients and their roles"""
    
    messages = [
        {"role": "system", "content": "You are a legal document analyst. Extract party information concisely."},
        {"role": "user", "content": f"""
Analyse this email and identify:
1. Sender name and role
2. Recipient name and role  
3. Any CC'd parties

Email:
From: {email['from']}
To: {email['to']}
CC: {email['cc']}
Subject: {email['subject']}
Body: {email['body']}

Respond in this format:
SENDER: [name] - [role]
RECIPIENT: [name] - [role]
CC: [names and roles, or "None"]
"""}
    ]
    
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )
    
    return response.choices[0].message.content

# Run Step 1
parties_result = step1_identify_parties(test_email)
display(Markdown(f"### Step 1 Result: Parties Identified\n\n{parties_result}"))
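
Because the next step consumes this text as-is, it may be worth checking that the reply follows the requested `LABEL: value` format before passing it on. The following is a sketch under that assumption; `missing_labels` is a hypothetical helper and the sample reply is made up, not real model output.

```python
def missing_labels(result_text, required_labels):
    """Return the required labels that have no 'LABEL: value' line in the reply."""
    present = set()
    for line in result_text.splitlines():
        label, sep, _ = line.partition(":")
        if sep:
            present.add(label.strip().upper())
    return [label for label in required_labels if label.upper() not in present]


# Made-up reply in the format Step 1 asks for (not real model output).
sample_reply = """SENDER: Sarah Chen - General Counsel
RECIPIENT: Michael Wong - Lawyer
CC: None"""

print(missing_labels(sample_reply, ["SENDER", "RECIPIENT", "CC"]))
print(missing_labels("SENDER: Sarah Chen", ["SENDER", "RECIPIENT", "CC"]))
```

A non-empty result is a signal to re-run the step or flag the document rather than feed an incomplete answer into the rest of the chain.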

## Step 5: Chain Step 2 - Is a Lawyer Involved?

For privilege to apply, the communication must involve a lawyer.

**What this does:**
- Takes the parties identified in Step 1
- Determines if any party is a lawyer or legal professional
- This is essential for privilege - no lawyer, no privilege

In [None]:
def step2_lawyer_involved(email, parties_result):
    """Determine if a lawyer is involved in the communication"""
    
    messages = [
        {"role": "system", "content": "You are an Australian legal privilege expert."},
        {"role": "user", "content": f"""
Based on the parties identified, determine if a lawyer is involved.

Parties:
{parties_result}

Email domain context:
- Sender domain: {email['from'].split('@')[1]}
- Recipient domain: {email['to'].split('@')[1]}

Consider:
1. Is the sender or recipient a lawyer?
2. Look for indicators: law firm domains, titles like "Partner", "Solicitor", "Counsel"
3. In-house counsel also qualifies

Respond in this format:
LAWYER_INVOLVED: [Yes/No]
LAWYER_NAME: [name or "N/A"]
LAWYER_ROLE: [role - e.g., "External solicitor", "In-house counsel", or "N/A"]
REASONING: [brief explanation]
"""}
    ]
    
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )
    
    return response.choices[0].message.content

# Run Step 2 - using output from Step 1
lawyer_result = step2_lawyer_involved(test_email, parties_result)
display(Markdown(f"### Step 2 Result: Lawyer Analysis\n\n{lawyer_result}"))
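
Prompt chains often place a programmatic gate between steps so that later steps only run when an earlier condition is met. The sketch below shows one possible gate on the `LAWYER_INVOLVED` line; the function name and the decision to treat a missing line as "not passed" are assumptions for illustration, not part of the pipeline above.

```python
def lawyer_gate_passed(lawyer_result: str) -> bool:
    """Return True only if the reply has a 'LAWYER_INVOLVED: Yes' line (case-insensitive)."""
    for line in lawyer_result.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().upper() == "LAWYER_INVOLVED":
            return value.strip().lower().startswith("yes")
    # No LAWYER_INVOLVED line at all: treat as not passed.
    return False


# Made-up replies in the format Step 2 asks for.
print(lawyer_gate_passed("LAWYER_INVOLVED: Yes\nLAWYER_NAME: Michael Wong"))
print(lawyer_gate_passed("LAWYER_INVOLVED: No\nLAWYER_NAME: N/A"))
```

When the gate fails, routing the document to human review is likely safer than recording it as not privileged, since a failure can also mean the reply was malformed.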

## Step 6: Chain Step 3 - Is Legal Advice Being Sought or Given?

The dominant purpose test asks whether the communication was made for the dominant purpose of seeking or giving legal advice.

**What this does:**
- Analyses the content of the email
- Determines if legal advice is being requested or provided
- Looks for advice-seeking language vs operational/business discussion
- This is the core of the *Esso v Commissioner of Taxation* test

In [None]:
def step3_legal_advice(email, lawyer_result):
    """Determine if legal advice is being sought or given"""
    
    messages = [
        {"role": "system", "content": """You are an Australian legal privilege expert.
Apply the dominant purpose test from Esso Australia Resources Ltd v Commissioner of Taxation (1999)."""},
        {"role": "user", "content": f"""
Analyse whether legal advice is being sought or given in this communication.

Previous analysis - Lawyer involvement:
{lawyer_result}

Email content:
Subject: {email['subject']}
Body: {email['body']}

Consider:
1. Is the sender requesting legal advice?
2. Is the sender providing legal advice?
3. Is the DOMINANT PURPOSE legal advice, or is it operational/business?
4. Look for phrases like "please advise", "legal opinion", "our exposure"

Respond in this format:
LEGAL_ADVICE: [Yes/No]
DIRECTION: [Seeking/Giving/Neither]
DOMINANT_PURPOSE: [Legal advice/Business operational/Mixed]
KEY_PHRASES: [quote relevant phrases from the email]
REASONING: [brief explanation applying the dominant purpose test]
"""}
    ]
    
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )
    
    return response.choices[0].message.content

# Run Step 3 - using output from Step 2
advice_result = step3_legal_advice(test_email, lawyer_result)
display(Markdown(f"### Step 3 Result: Legal Advice Analysis\n\n{advice_result}"))
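
Since Step 3 asks the model to quote phrases from the email, a simple cross-check is to confirm that each quoted phrase really appears in the body. This is a sketch that assumes the model wraps phrases in straight double quotes; `unverified_quotes` is a hypothetical helper and the sample strings are made up.

```python
import re


def unverified_quotes(key_phrases: str, body: str) -> list:
    """Return quoted phrases from KEY_PHRASES that do not appear verbatim in the body."""
    quotes = re.findall(r'"([^"]+)"', key_phrases)
    # Collapse whitespace and ignore case so line breaks in the email don't matter.
    normalised_body = " ".join(body.lower().split())
    return [q for q in quotes if " ".join(q.lower().split()) not in normalised_body]


body = "Can you please review the attached contract and advise on our potential liability exposure?"
key_phrases = '"please review the attached contract", "legal opinion"'

print(unverified_quotes(key_phrases, body))  # "legal opinion" is not in the body
```

Any phrase returned here was not found in the email and is a reasonable trigger for escalation.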

## Step 7: Chain Step 4 - Was It Made in Confidence?

Privilege only attaches to confidential communications.

**What this does:**
- Checks if the communication was intended to be confidential
- Looks for confidentiality markers and indicators
- Checks if third parties were included who might break confidence

In [None]:
def step4_confidentiality(email, parties_result):
    """Determine if the communication was made in confidence"""
    
    messages = [
        {"role": "system", "content": "You are an Australian legal privilege expert."},
        {"role": "user", "content": f"""
Analyse whether this communication was made in confidence.

Parties:
{parties_result}

Email:
Subject: {email['subject']}
CC: {email['cc']}
Body: {email['body']}

Consider:
1. Are there confidentiality statements in the email?
2. Were any third parties CC'd who are not the client or lawyer?
3. Was it sent to a broad distribution list?
4. Any "without prejudice" or "privileged and confidential" markers?

Respond in this format:
CONFIDENTIAL: [Yes/No/Uncertain]
CONFIDENTIALITY_MARKERS: [list any found, or "None"]
THIRD_PARTIES: [list any non-lawyer/non-client recipients, or "None"]
WAIVER_RISK: [None/Low/Medium/High]
REASONING: [brief explanation]
"""}
    ]
    
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )
    
    return response.choices[0].message.content

# Run Step 4 - using output from Step 1
confidentiality_result = step4_confidentiality(test_email, parties_result)
display(Markdown(f"### Step 4 Result: Confidentiality Analysis\n\n{confidentiality_result}"))
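
The third-party check can also be approximated without a model by comparing CC domains against the domains already known to belong to the client and the lawyer. The sketch below assumes those known domains are supplied by the reviewer; `external_recipients` and the `agency.example` address are hypothetical.

```python
def external_recipients(cc_addresses, known_domains):
    """Return CC addresses whose domain is not among the known client/lawyer domains."""
    known = {domain.lower() for domain in known_domains}
    flagged = []
    for address in cc_addresses:
        _, sep, domain = address.rpartition("@")
        # Addresses without an "@" are flagged too, since they cannot be verified.
        if not sep or domain.lower() not in known:
            flagged.append(address)
    return flagged


known = {"acmecorp.com.au", "wongpartners.com.au"}
print(external_recipients([], known))
print(external_recipients(["pr@agency.example", "sarah.chen@acmecorp.com.au"], known))
```

A domain match is only a rough indicator, so a flagged address is a prompt for review rather than a finding that confidence was lost.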

## Step 8: Chain Step 5 - Final Privilege Determination

Combine all analysis steps into a final classification.

**What this does:**
- Takes the outputs from all previous chain steps
- Applies the complete Australian LPP test
- Makes a final determination: PRIVILEGED, NOT PRIVILEGED, or UNCERTAIN
- Provides a confidence score and reasoning for human-in-the-loop (HITL) review

In [None]:
def step5_final_determination(email, parties_result, lawyer_result, advice_result, confidentiality_result):
    """Make final privilege determination based on all chain steps"""
    
    messages = [
        {"role": "system", "content": """You are a senior Australian litigation lawyer making a privilege determination.
Apply Evidence Act 1995 (Cth) ss 118-119 and the dominant purpose test from Esso v Commissioner of Taxation (1999)."""},
        {"role": "user", "content": f"""
Make a final privilege determination based on this analysis chain.

STEP 1 - Parties:
{parties_result}

STEP 2 - Lawyer Involvement:
{lawyer_result}

STEP 3 - Legal Advice Analysis:
{advice_result}

STEP 4 - Confidentiality:
{confidentiality_result}

Document: {email['subject']}
Attachments: {email['attachments']}

Apply the test:
1. Is there a lawyer involved? (Required)
2. Is the dominant purpose seeking/giving legal advice? (Required - Esso test)
3. Was it made in confidence? (Required)
4. Has privilege been waived? (Check for third party disclosure)

Respond in this format:
CLASSIFICATION: [PRIVILEGED/NOT_PRIVILEGED/UNCERTAIN]
CONFIDENCE_SCORE: [0-100]
LEGAL_BASIS: [cite relevant statute or case]
DOMINANT_PURPOSE_MET: [Yes/No]
CONFIDENTIALITY_MET: [Yes/No]
WAIVER: [Yes/No]
ATTACHMENTS_NOTE: [Are attachments also covered by privilege?]
REASONING: [2-3 sentence summary for senior lawyer review]
ESCALATION_REQUIRED: [Yes/No]
ESCALATION_REASON: [if Yes, explain why human review needed]
"""}
    ]
    
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )
    
    return response.choices[0].message.content

# Run Step 5 - Final determination using all previous outputs
final_result = step5_final_determination(
    test_email, 
    parties_result, 
    lawyer_result, 
    advice_result, 
    confidentiality_result
)
display(Markdown(f"### Step 5 Result: Final Privilege Determination\n\n{final_result}"))
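
The model's own `ESCALATION_REQUIRED` answer can be backed up with a fixed rule applied in code. The sketch below escalates anything that is not a clear classification or that falls below a confidence threshold; the function name and the threshold of 80 are illustrative assumptions, not values taken from the pipeline.

```python
def needs_human_review(classification: str, confidence: int, threshold: int = 80) -> bool:
    """Escalate unclear or low-confidence determinations (threshold is illustrative)."""
    if classification.strip().upper() not in {"PRIVILEGED", "NOT_PRIVILEGED"}:
        return True  # UNCERTAIN or an unexpected label
    return confidence < threshold


print(needs_human_review("PRIVILEGED", 95))
print(needs_human_review("PRIVILEGED", 60))
print(needs_human_review("UNCERTAIN", 99))
```

A rule like this keeps the escalation decision deterministic even when the model's self-reported confidence varies between runs.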

## Step 9: Complete Chain Summary

Display the full analysis pipeline showing how each step fed into the next.

**What this does:**
- Shows the complete chain of reasoning
- Demonstrates the audit trail for legal defensibility
- Each step is traceable and explainable

In [None]:
summary = f"""
# Prompt Chaining: Complete Privilege Analysis

## Document: {test_email['subject']}

---

### Chain Step 1: Parties
{parties_result}

---

### Chain Step 2: Lawyer Involvement
{lawyer_result}

---

### Chain Step 3: Legal Advice
{advice_result}

---

### Chain Step 4: Confidentiality
{confidentiality_result}

---

### Chain Step 5: Final Determination
{final_result}

---

## Chain Flow
```
Email → Parties → Lawyer? → Legal Advice? → Confidential? → Final Determination
```
"""

display(Markdown(summary))
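
For an audit trail that outlives the notebook session, the raw output of each step can be stored as JSON alongside the rendered summary. This is a sketch only; `build_audit_record` and its field names are assumptions, and the sample step outputs are made up.

```python
import json


def build_audit_record(doc_id, step_outputs):
    """Return a JSON string recording each chain step's raw output in order."""
    record = {
        "doc_id": doc_id,
        "steps": [
            {"step": number, "name": name, "output": output}
            for number, (name, output) in enumerate(step_outputs, start=1)
        ],
    }
    return json.dumps(record, indent=2)


# Made-up step outputs standing in for the real chain results.
audit_json = build_audit_record(
    "DOC001",
    [
        ("parties", "SENDER: Sarah Chen - General Counsel"),
        ("lawyer", "LAWYER_INVOLVED: Yes"),
    ],
)
print(audit_json)
```

In the notebook, the list would hold `parties_result`, `lawyer_result`, `advice_result`, `confidentiality_result` and `final_result` in that order.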

## Step 10: Export to CSV for Senior Lawyer Review

Create a CSV output for human-in-the-loop (HITL) review.

**What this does:**
- Extracts key fields from each chain step
- Creates a structured CSV row for this document
- Includes blank columns for senior lawyer review and sign-off
- This is the deliverable for legal defensibility

In [None]:
import pandas as pd
from datetime import datetime

# Parse results into structured data
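def parse_result(result_text, field):
    """Extract a field value from the LLM output.

    Tolerates leading whitespace, markdown bold markers and case differences.
    """
    prefix = field.upper() + ':'
    for line in result_text.splitlines():
        # Drop indentation and any '*' characters (e.g. **CLASSIFICATION:**)
        cleaned = line.strip().replace('*', '')
        if cleaned.upper().startswith(prefix):
            return cleaned[len(prefix):].strip()
    return "Not found"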

# Build the CSV row
csv_row = {
    "doc_id": test_email['id'],
    "filename": f"{test_email['id']}.eml",
    "doc_type": "Email",
    "date": test_email['date'],
    "subject": test_email['subject'],
    "sender": test_email['from'],
    "recipients": test_email['to'],
    "has_attachments": "Yes" if test_email['attachments'] else "No",
    "attachment_names": ", ".join(test_email['attachments']),
    "classification": parse_result(final_result, "CLASSIFICATION"),
    "confidence_score": parse_result(final_result, "CONFIDENCE_SCORE"),
    "dominant_purpose_met": parse_result(final_result, "DOMINANT_PURPOSE_MET"),
    "confidentiality_met": parse_result(final_result, "CONFIDENTIALITY_MET"),
    "waiver": parse_result(final_result, "WAIVER"),
    "legal_basis": parse_result(final_result, "LEGAL_BASIS"),
    "reasoning": parse_result(final_result, "REASONING"),
    "escalation_required": parse_result(final_result, "ESCALATION_REQUIRED"),
    "escalation_reason": parse_result(final_result, "ESCALATION_REASON"),
    # Blank columns for senior lawyer HITL review
    "reviewer_notes": "",
    "reviewer_decision": "",
    "reviewed_by": "",
    "review_date": ""
}

# Create DataFrame and export
df = pd.DataFrame([csv_row])
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
csv_filename = f"privilege_review_chaining_{timestamp}.csv"

# Display preview
display(Markdown("### CSV Preview for HITL Review"))
display(df[['doc_id', 'subject', 'classification', 'confidence_score', 'escalation_required']])

# Save to file
df.to_csv(csv_filename, index=False)
display(Markdown(f"**Exported:** `{csv_filename}`"))
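
The `confidence_score` column holds whatever text the model returned, which may be `85`, `85%` or `85/100`. If reviewers want to sort or filter on it, the value can be normalised to an integer first. The helper below is a sketch with an assumed name, `confidence_as_int`, and it returns `None` for anything it does not recognise.

```python
import re


def confidence_as_int(raw_value: str):
    """Return an integer 0-100 parsed from forms like '85', '85%' or '85/100', else None."""
    match = re.fullmatch(r"\s*(\d{1,3})\s*(?:%|/\s*100)?\s*", raw_value)
    if not match:
        return None
    value = int(match.group(1))
    return value if 0 <= value <= 100 else None


for raw in ["85", "85%", "85/100", "Not found", "0.85"]:
    print(repr(raw), "->", confidence_as_int(raw))
```

Returning `None` instead of guessing keeps unparseable scores visible in the CSV as gaps for the reviewer.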

## Conclusion: Prompt Chaining for LPP Classification

### What We Built

A 5-step sequential pipeline for Australian Legal Professional Privilege classification:

| Step | Analysis | Output |
|------|----------|--------|
| 1 | Identify parties | Sender, recipient, roles |
| 2 | Lawyer involved? | Yes/No + reasoning |
| 3 | Legal advice sought/given? | Dominant purpose test |
| 4 | Confidential? | Waiver risk assessment |
| 5 | Final determination | Classification + confidence |

### Why Prompt Chaining Works for Privilege

- **Auditable:** Each step is logged and traceable
- **Mirrors legal reasoning:** Follows how a lawyer analyses privilege
- **Defensible:** Can explain exactly why a document was classified
- **HITL ready:** CSV export for senior lawyer review and sign-off

### Limitations

**Fixed steps may not suit all document types**
- This pipeline assumes a standard email format
- Board minutes, file notes, or multi-party correspondence may need different analysis steps
- Documents with complex attachment hierarchies (email with attachment that itself contains attachments) don't fit neatly into this linear flow

**Complex email chains may need dynamic analysis**
- A 50-message email thread with multiple parties joining and leaving requires adaptive analysis
- The Orchestrator-Worker pattern (Notebook 04) breaks down such documents dynamically rather than forcing them through fixed steps
- Some documents need more steps, some need fewer - chaining can't adjust

**Single model - no consensus checking**
- We're trusting one model's judgment entirely
- A hallucination or misinterpretation has no safety net
- The Parallelization pattern (Notebook 03) runs multiple models and flags disagreements
- For high-stakes privilege decisions, consensus from 2-3 models is more defensible
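
To make the consensus idea concrete, here is a small sketch of majority voting over classifications from several models. It is not the implementation used in Notebook 03, which is not shown here; the function name `consensus` and the rule that ties become `UNCERTAIN` are assumptions for illustration.

```python
from collections import Counter


def consensus(classifications):
    """Return (label, unanimous) for a non-empty list of classification labels."""
    counts = Counter(label.strip().upper() for label in classifications)
    ranked = counts.most_common()
    top_label, top_count = ranked[0]
    # A tie for first place means there is no majority.
    tied = len(ranked) > 1 and ranked[1][1] == top_count
    label = "UNCERTAIN" if tied else top_label
    unanimous = len(ranked) == 1
    return label, unanimous


print(consensus(["PRIVILEGED", "PRIVILEGED", "NOT_PRIVILEGED"]))
print(consensus(["PRIVILEGED", "NOT_PRIVILEGED"]))
```

Any result where `unanimous` is `False` is a natural candidate for escalation to a senior lawyer.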

### Next Notebook

`02_routing.ipynb` - Route documents to specialist classifiers based on document type.