# **README**



---



## About the notebook

This notebook demonstrates a core prompt engineering technique called **Prompt Chaining**. It shows how the response to one prompt can be chained into the next by using the stateful conversation capability of the OpenAI Responses API, so that an AI model can solve a complex, multi-step problem without a single bulky prompt, an overflowing context window, or latency overhead.<br>

*Prerequisites:* To run this notebook:

- Use [Google Colab](https://colab.research.google.com/) to get started instantly, for *free*!

- Download and open this notebook in Google Colab.

- Get your access key to the [OpenAI](https://platform.openai.com/account/api-keys) API.<br>

- Set up this access key under *Secrets* in your Google Colab runtime environment.<br>
How to configure API access keys in Colab:

  - Go to the "🔑" icon in the left sidebar (Secrets).
  - Click "Add new secret".
  - For the name, use 'openai_api_key'.
  - For the value, paste your OpenAI API key.
  - Make sure "Notebook access" is enabled for this secret.

- You are all set! Have fun!
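
For reference, the secret configured above can be read in code roughly as follows. This is a minimal sketch: the helper name `load_openai_key` and the `OPENAI_API_KEY` environment-variable fallback are assumptions added for illustration, so that the same code can also run outside Colab.

```python
import os


def load_openai_key(secret_name="openai_api_key", env_var="OPENAI_API_KEY"):
    """Return the API key from Colab secrets, falling back to an environment variable."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(secret_name)
        except Exception:
            # Raised when the secret is missing or notebook access is disabled.
            key = None
        if key:
            return key

    # Assumed fallback for local runs; not part of the original notebook.
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"No API key found: add the Colab secret '{secret_name}' "
            f"or set the {env_var} environment variable."
        )
    return key
```

The explicit error makes a missing or inaccessible secret visible right away instead of surfacing later as an authentication failure.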

---


## Topic: Prompt Chaining - Transform Your AI Assistant into a Strategic Problem-Solving Partner
#### Overview
Transform generic AI responses into comprehensive, multi-dimensional solutions that address technical design, business impact, and implementation strategy simultaneously.<br>

#### The Problem
Single prompts typically produce surface-level, generic solutions when complex problems require deep analysis across multiple domains and stakeholder perspectives.<br>

#### Example Use Case
Challenge: "Our Features system lacks real-time enrollment tracking, making customer debugging impossible."<br>

This isn't just a technical issue; it's a multi-dimensional challenge requiring:
<br>

- System architecture understanding
- User experience design
- Dashboard requirements
- Resource planning
- Risk assessment

#### Solution: Prompt Chaining Technique
Prompt chaining creates intelligent conversation flows where each AI response builds on previous context, delivering actionable solutions that address real-world complexities.

#### Implementation Framework
*Link 1:* Problem Decomposition & System Analysis
- Identify core issues and dependencies
- Map current system limitations
- Define success criteria

*Link 2:* Technical Architecture & Data Requirements
- Design system components
- Define data flow and storage needs
- Identify integration points

*Link 3:* Dashboard Design & Implementation Roadmap
- Create user interface specifications
- Plan development phases
- Define deployment strategy

*Link 4:* Resource Estimation & Risk Mitigation
- Calculate effort and timeline
- Identify potential blockers
- Plan contingency strategies
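
The four links above can also be written down as data before any model is called. The sketch below is only one possible representation: the `Link` class, the `build_prompt` helper, and the prompt wording are hypothetical, and they are not the prompts this notebook sends later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One step of the chain: a focus area plus the points it should cover."""
    title: str
    goals: tuple


# The four links of the framework, copied from the outline above.
CHAIN = (
    Link("Problem Decomposition & System Analysis",
         ("Identify core issues and dependencies",
          "Map current system limitations",
          "Define success criteria")),
    Link("Technical Architecture & Data Requirements",
         ("Design system components",
          "Define data flow and storage needs",
          "Identify integration points")),
    Link("Dashboard Design & Implementation Roadmap",
         ("Create user interface specifications",
          "Plan development phases",
          "Define deployment strategy")),
    Link("Resource Estimation & Risk Mitigation",
         ("Calculate effort and timeline",
          "Identify potential blockers",
          "Plan contingency strategies")),
)


def build_prompt(link, problem=None):
    """Render one link as a prompt; only the first link restates the problem."""
    goals = "\n".join(f"- {goal}" for goal in link.goals)
    if problem is not None:
        opening = f"Problem: {problem}\nFocus on: {link.title}."
    else:
        opening = f"Building on your previous answer, focus on: {link.title}."
    return f"{opening}\nCover the following:\n{goals}"


problem = ("Our Features system lacks real-time enrollment tracking, "
           "making customer debugging impossible")
prompts = [
    build_prompt(link, problem if index == 0 else None)
    for index, link in enumerate(CHAIN)
]
print(prompts[0])
```

Keeping the plan as data makes it easy to review the order and focus of each link before spending any API calls.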
<br>
What do you think the response would be?<br>

---

In [1]:
# Hands-on exercise using the OpenAI API and GPT-5
# This hands-on exercise provides a prompt chaining example

### About this exercise:
<br>

|Category|Description|
|:--|:--|
|Task |AI tool as a strategic advisor on problem solving|
|Difficulty Level|Intermediate|
|Skills|Python|

<br>

In [2]:
# Import display and Markdown from IPython for formatted rendering of the generated response
from IPython.display import Markdown, display

In [3]:
def printmd(string):
    display(Markdown(string))

Access setup - The `userdata` module gives secure access to user-defined secrets stored in the Colab environment, such as API keys.

In [4]:
from google.colab import userdata

Import - The `OpenAI` client lets us interact with the OpenAI API and make requests to models like GPT-5.

In [5]:
# Make use of the stateful nature of the OpenAI Responses API to build on earlier responses and debug easily.
# Passing the previous response's ID to the next request carries the conversation forward without resending the earlier messages in each call.
from openai import OpenAI

In [6]:
client = OpenAI(api_key=userdata.get('openai_api_key'))

**Link 1:** Prepare a basic prompt input and generate a response.

In [7]:
input_problem = 'Our Features system lacks real-time enrollment tracking, making customer debugging impossible'

In [8]:
response = client.responses.create(
    model="gpt-5",
    input=f"Build a presentation of total 5 slides, maximum 3 bullet points on each slide to solve problem {input_problem}. think step-by-step"
)

In [9]:
about_this_print='Following is how the generic model responds to a basic prompt:'
printmd('<div style="background-color: lightblue; padding: 10px;">%s</div><br>' % about_this_print)

printmd('<div style="background-color: lightblue; padding: 10px;">%s</div>' % response.output_text)

<div style="background-color: lightblue; padding: 10px;">Following is how the generic model responds to a basic prompt:</div><br>

<div style="background-color: lightblue; padding: 10px;">Slide 1 — Problem & Goal
- Today: no real-time enrollment tracking → blind to customer state, slow debugging
- Impact: long MTTR, escalations, erosion of trust and revenue risk
- Goal: real-time, per-customer enrollment timeline enabling fast, self-serve debugging

Slide 2 — Key Requirements
- Freshness ≤1 minute; ordered per user; idempotent with replay/backfill
- Searchable by customer, feature, device; correlate with requests/releases
- Secure by design: minimize PII, RBAC, retention/redaction policies

Slide 3 — Proposed Architecture
- Instrument enrollment lifecycle events with correlation IDs (server + client SDKs)
- Stream (Kafka/Kinesis/PubSub) → process (dedupe/order/PII scrub) → store (ClickHouse/Druid/Elastic) + hot cache (Redis)
- Debug UI/API: per-customer timeline, env diff, search; anomaly alerts

Slide 4 — Implementation Plan (Step-by-Step)
- Phase 1: define event schema/IDs; update SDKs; create topics; ship server-side events; baseline dashboards
- Phase 2: build processors (idempotency, ordering, PII); backfill history; indexed queries + freshness/completeness metrics
- Phase 3: ship Debug UI/APIs; alerts/runbooks/RBAC; staged rollout by tenant; set SLOs and on-call

Slide 5 — Success Metrics & Risks
- SLOs: p95 freshness ≤60s; 99.9% completeness; search p95 ≤2s
- Outcomes: MTTR -50%, support tickets -30%, on-call time -25%
- Risks/mitigations: scale/ordering (idempotent keys, partitioning), privacy (schema review/redaction), SDK adoption (flags, gradual rollout)</div>

**Link 2:** Start chaining with a revised instruction for a leadership presentation style.

In [10]:
response2 = client.responses.create(
    model="gpt-5",
    previous_response_id=response.id,
    input=[{"role": "user", "content": "make it fluent for product and business leadership teams"}]
)
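
For comparison, without `previous_response_id` the earlier turns would have to be carried along explicitly in every request. The sketch below shows that manual approach under stated assumptions: `follow_up_with_history` and `StubResponses` are hypothetical names, and the stub stands in for the real client so the example runs offline.

```python
from types import SimpleNamespace


def follow_up_with_history(client, model, history, instruction):
    """Send a follow-up turn by resending the whole conversation explicitly.

    `history` is a list of {"role", "content"} messages and is extended in place.
    """
    history.append({"role": "user", "content": instruction})
    # A copy is sent so later appends do not alter what was submitted.
    response = client.responses.create(model=model, input=list(history))
    history.append({"role": "assistant", "content": response.output_text})
    return response


class StubResponses:
    """Hypothetical offline stand-in for `client.responses`; it records each call."""

    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        return SimpleNamespace(id=f"resp_{len(self.calls)}", output_text="stub reply")


stub_client = SimpleNamespace(responses=StubResponses())

# With this approach the caller has to keep the first turn and its answer.
history = [
    {"role": "user", "content": "Build a presentation of total 5 slides"},
    {"role": "assistant", "content": "Slide 1: Problem & Goal"},
]

follow_up_with_history(
    stub_client,
    "gpt-5",
    history,
    "make it fluent for product and business leadership teams",
)

# Number of messages that had to be resent in this single request.
print(len(stub_client.responses.calls[0]["input"]))
```

With the response ID, the cell above sends only the new instruction, and the earlier turns are looked up on the API side.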

In [11]:
about_this_print='Following is how the generic model responds to a chained prompt using earlier response id:'
printmd('<div style="background-color: lightblue; padding: 10px;">%s</div><br>' % about_this_print)

printmd('<div style="background-color: lightblue; padding: 10px;">%s</div>' % response2.output_text)

<div style="background-color: lightblue; padding: 10px;">Following is how the generic model responds to a chained prompt using earlier response id:</div><br>

<div style="background-color: lightblue; padding: 10px;">Slide 1 — Why this matters
- Today we can’t see a customer’s real-time enrollment state, so issues linger and escalate
- Business impact: longer MTTR, more support load, stalled deals, and trust risk with enterprise customers
- Goal: a live, per-customer enrollment timeline that enables fast, self-serve resolution

Slide 2 — What “good” looks like
- Fresh, reliable view (updates in under 1 minute), searchable by customer, feature, device
- Simple tools for Support, CS, Sales, and Eng: one place to answer “what happened and when?”
- Enterprise-ready: minimal PII, access controls, and clear retention policies

Slide 3 — Solution at a glance
- Capture key moments in the enrollment journey (attempt, decision, success/fail) with a shared ID
- Stream events into a real-time store that orders, deduplicates, and readies data for search
- Deliver a Debug Console and API: customer timeline, compare environments, alerts on anomalies

Slide 4 — Step-by-step rollout
- Phase 1 (Foundations): finalize event schema; instrument core paths; basic dashboards (2–3 weeks)
- Phase 2 (Build): real-time pipeline and searchable store; privacy/quality guards; backfill recent history (4–6 weeks)
- Phase 3 (Launch): Debug Console + RBAC; SLOs and runbooks; staged rollout to top accounts (3–4 weeks)

Slide 5 — Outcomes, ROI, and risks
- Target outcomes: MTTR down 50%, support tickets down 30%, on-call time down 25%
- Investment: small cross-functional squad (PM, BE, Data, FE) + modest infra; payback via support savings and churn prevention
- Key risks and mitigations: data scale/ordering (idempotent keys, partitions), privacy (schema review/redaction), adoption (training, champions, gradual rollout)</div>

**Link 3:** Continue chaining with a revised instruction for an engineering presentation style.

In [12]:
response3 = client.responses.create(
    model="gpt-5",
    previous_response_id=response2.id,
    input=[{"role": "user", "content": "make it technical for engineering team to estimate the effort"}]
)

In [13]:
about_this_print='Following is how the generic model responds to a chained prompt using earlier response id and revised instruction:'
printmd('<div style="background-color: lightblue; padding: 10px;">%s</div><br>' % about_this_print)

printmd('<div style="background-color: lightblue; padding: 10px;">%s</div>' % response3.output_text)

<div style="background-color: lightblue; padding: 10px;">Following is how the generic model responds to a chained prompt using earlier response id and revised instruction:</div><br>

<div style="background-color: lightblue; padding: 10px;">Slide 1 — Scope, schema, scale
- Event types/fields: enrollment_attempt/decision/success/failure/state_change; keys: tenant_id, user_id_hash, device_id, feature_id, env, build/version, request_id, correlation_id, event_ts/ingest_ts, source, reason/error; no raw PII
- Coverage: server evaluator + gateways; SDKs (Web/iOS/Android/Node) with retry/offline queue; clock skew tolerance ±10m
- Scale assumptions: peak 5k RPS (~400M/day), ~400B/event (~160GB/day); hot retention 30d, cold archive 180d

Slide 2 — Ingestion and processing
- Transport: Kafka (12–24 partitions, RF=3, acks=all, idempotent producers, linger/batch tuned) or Kinesis (≥10 shards); DLQ for parse/schema errors
- Processor: dedupe by idempotency_key (tenant_id+request_id), order per (tenant_id,user/device_id) with 10m window; late/dup handling, PII scrubbing/redaction
- Storage: ClickHouse (MergeTree; partition by toDate(event_ts); order by (tenant_id, user/device_id, event_ts)) with secondary indexes; Redis cache for hot timelines (TTL ~15m)

Slide 3 — APIs and contracts
- Endpoints: GET /timelines (tenant_id, user|device, feature, from/to), GET /diff (envA vs envB), GET /search, POST /annotations; p95 ≤2s; 10k events cap/query
- Semantics: eventual consistency with freshness SLO ≤60s; pagination, time-bucketed queries; materialized aggregates (last_state, event_counts)
- Security: mTLS/JWT, RBAC roles (Support/Eng/Admin), per-tenant rate limits, audit logs; request/response size limits

Slide 4 — UI, observability, data quality
- Debug Console: ordered timeline with filters (tenant/feature/device/env), event detail panes, correlation IDs linking to logs/traces/deploys; env-compare view
- Observability: dashboards for lag, consumer liveness, DLQ rate, dedupe hit rate; SLOs freshness ≤60s, completeness ≥99.9%; alerting and runbooks
- Data quality: schema registry with compatibility checks, canary validators, replay/backfill tool (topic→store), synthetic tenants and fixtures for tests

Slide 5 — Plan, estimates, risks
- Team and schedule: 1 BE, 1 Data, 1 FE, 0.5 SRE; Phase 1 (2–3 wks) foundations; Phase 2 (4–6 wks) pipeline/APIs/SDKs; Phase 3 (3–4 wks) UI/RBAC/rollout
- Dependencies/deliverables: Kafka/Kinesis + schema registry, ClickHouse cluster + Redis, SDK releases, load/capacity tests, alerts/runbooks, staged rollout to top tenants
- Risks/mitigations: hot partitions (hash user/device + more partitions), privacy (PII redaction/review), SDK adoption (feature flags, gradual rollout), cost (tiered retention, compaction)</div>

**Link 4:** Continue chaining with a revised instruction for reasoning and impactful talking points.

In [14]:
response4 = client.responses.create(
    model='o4-mini',
    input=[{"role": "user", "content": "explain the effort"}],
    previous_response_id=response3.id,
    reasoning={"effort": "high"},
    max_output_tokens=10000,
    stream=False
)
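
The four cells above repeat one pattern: send an instruction, keep the response ID, and pass it to the next call, optionally with a different model. A minimal sketch of that pattern as a loop follows. The `run_chain` helper, the step dictionary layout, and the offline `StubResponses` class are assumptions for illustration; only `model`, `input`, `previous_response_id`, and `reasoning` are taken from the calls in this notebook.

```python
from types import SimpleNamespace


def run_chain(client, steps):
    """Run `steps` in order, linking each request to the previous response.

    Each step is a dict with "model" and "prompt" keys, plus optional "extra"
    keyword arguments (for example reasoning settings) passed to the API call.
    """
    responses = []
    previous_id = None
    for step in steps:
        kwargs = {"model": step["model"], "input": step["prompt"], **step.get("extra", {})}
        if previous_id is not None:
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        responses.append(response)
        previous_id = response.id
    return responses


class StubResponses:
    """Hypothetical offline stand-in for `client.responses`; it records each call."""

    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        number = len(self.calls)
        return SimpleNamespace(id=f"resp_{number}", output_text=f"stub output {number}")


stub_client = SimpleNamespace(responses=StubResponses())
input_problem = ("Our Features system lacks real-time enrollment tracking, "
                 "making customer debugging impossible")
steps = [
    {"model": "gpt-5",
     "prompt": f"Build a presentation of total 5 slides, maximum 3 bullet points "
               f"on each slide to solve problem {input_problem}. think step-by-step"},
    {"model": "gpt-5", "prompt": "make it fluent for product and business leadership teams"},
    {"model": "gpt-5", "prompt": "make it technical for engineering team to estimate the effort"},
    {"model": "o4-mini", "prompt": "explain the effort",
     "extra": {"reasoning": {"effort": "high"}, "max_output_tokens": 10000}},
]

results = run_chain(stub_client, steps)
for call in stub_client.responses.calls:
    print(call["model"], call.get("previous_response_id"))
```

To run it against the real API, pass the `client` created earlier instead of `stub_client`.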

In [15]:
about_this_print='Following is how the reasoning-optimized model explains the thought process:'
printmd('<div style="background-color: #a8ee90; padding: 10px;">%s</div><br>' % about_this_print)

printmd('<div style="background-color: #a8ee90; padding: 10px;">%s</div>' % response4.output_text)

<div style="background-color: #a8ee90; padding: 10px;">Following is how the reasoning-optimized model explains the thought process:</div><br>

<div style="background-color: #a8ee90; padding: 10px;">Here’s a high-level breakdown of the engineering effort to deliver the real-time enrollment tracking system. All estimates assume a cross-functional team:  
 • Backend (BE) engineers  
 • Data engineers  
 • Frontend (FE) engineer  
 • SRE/infra engineer  

1. Phase 1 – Foundations (2–3 weeks) ~250 h  
 • Define event schema, idempotency keys, PII policy  
    – BE: 40 h; Data: 10 h  
 • Instrument server paths & SDKs (web, iOS, Android, Node)  
    – BE: 30 h; FE: 10 h  
 • Provision Kafka/Kinesis topics, schema registry  
    – SRE: 40 h  
 • Build basic producer pipelines & sanity dashboards  
    – BE: 40 h; Data: 20 h  
 • QA & integration tests  
    – BE/FE: 20 h  

2. Phase 2 – Real-time Pipeline & APIs (4–6 weeks) ~600 h  
 • Stream processors: dedupe, ordering window, PII scrub  
    – BE: 120 h; Data: 80 h  
 • Storage: ClickHouse schema, MergeTree tuning, Redis cache  
    – SRE: 40 h; Data: 40 h  
 • Build query APIs: /timelines, /search, /diff, POST annotations  
    – BE: 100 h  
 • SDK enhancements (retry/offline queue, correlation IDs)  
    – BE: 40 h; FE: 40 h  
 • End-to-end tests, backfill tooling, canary validations  
    – Data: 40 h; BE: 20 h; SRE: 20 h  

3. Phase 3 – Debug UI, Security, Rollout (3–4 weeks) ~350 h  
 • Debug Console UI: timeline view, filters, env-compare  
    – FE: 120 h; BE: 40 h for endpoints  
 • RBAC, mTLS/JWT, rate limits, audit-logging  
    – BE: 40 h; SRE: 20 h  
 • Observability: consumer lag, DLQ, SLO dashboards, alerts/runbooks  
    – SRE: 40 h  
 • Documentation, training, staged tenant rollout  
    – BE/FE/Data/SRE: 50 h  
 • Buffer for bug-fixes & iteration  
    – All: 40 h  

Total estimated effort ~1,200 engineering hours (~150 days).  
• If staffed as 2 BE, 0.5 Data, 0.5 FE, 0.5 SRE FTE → ~3 calendar months  
• Key dependencies: infra capacity (Kafka, ClickHouse, Redis), SDK release cadence, schema-registry setup, tenant coordination for rollout.  

This breakdown will help the team size stories, plan sprints, and track progress against milestones.</div>
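
Numbers in a generated estimate are worth re-adding before they are reused. The sketch below sums the per-task hours quoted in the answer above and compares them with the approximate phase totals in its headings. The hours are copied from that output; the 8-hour working day used for the conversion is an assumption.

```python
# Per-task hours copied from the model's answer above, grouped by phase.
itemized_hours = {
    "Phase 1": [40, 10, 30, 10, 40, 40, 20, 20],
    "Phase 2": [120, 80, 40, 40, 100, 40, 40, 40, 20, 20],
    "Phase 3": [120, 40, 40, 20, 40, 50, 40],
}

# Approximate totals stated in the phase headings of the same answer.
stated_hours = {"Phase 1": 250, "Phase 2": 600, "Phase 3": 350}

for phase, hours in itemized_hours.items():
    print(f"{phase}: itemized {sum(hours)} h, stated ~{stated_hours[phase]} h")

itemized_total = sum(sum(hours) for hours in itemized_hours.values())
stated_total = sum(stated_hours.values())
print(f"Itemized total: {itemized_total} h, stated total: ~{stated_total} h")

# Assumption: one engineering day is 8 hours.
print(f"Stated total in 8-hour days: {stated_total / 8:.0f}")
```

The itemized tasks add up to 1,100 hours against the roughly 1,200 hours in the headline, so the phase totals include some rounding or unlisted work that a team would want to confirm before planning sprints.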

---

## Congratulations on running this fun exercise!


**Key Advantages**
- Context Preservation & Knowledge Building
  - Each prompt builds on previous responses
  - Creates a cumulative knowledge base
  - Maintains understanding of specific constraints and objectives

- Multi-Dimensional Problem Solving
  - Addresses different stakeholder needs simultaneously
  - Technical depth for engineers
  - Business impact for executives
  - Resource planning for project managers
  - Risk assessment for leadership

**Expected Outcomes**

- Faster Decision Making - Comprehensive analysis reduces back-and-forth
- Better Stakeholder Alignment - Addresses all perspectives in a cohesive framework
- Higher Implementation Success - Solutions account for real-world complexities
- Strategic AI Partnership - AI evolves from tool to strategic advisor<br>

You just proved it yourself!<br>


**Fun Fact:** Did you notice that we could choose different models to chain the prompts?<br>



**Extra Credit:**
- Identify a Complex Problem - Multi-faceted challenges work best
- Define Stakeholder Needs - What does each audience require?
- Design Chain Structure - Plan 3-5 logical progression links
- Execute Sequential Prompts - Build context with each interaction
- Synthesize Results - Combine outputs into a comprehensive solution


**Pro Tip:** Best Practices

- Keep each link focused on a specific domain
- Explicitly reference previous context in subsequent prompts
- Design for different stakeholder audiences
- Plan 3-5 links maximum for optimal results
- Test chain logic before execution<br>
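
One lightweight way to test chain logic before execution is to check the planned prompts before any API call is made. The sketch below is an assumption-laden example: `validate_chain` is a hypothetical helper, and its checks (3 to 5 links, no empty or repeated prompts) cover only part of the list above.

```python
def validate_chain(prompts, min_links=3, max_links=5):
    """Return a list of problems found in a planned chain; empty means none found."""
    problems = []
    if not min_links <= len(prompts) <= max_links:
        problems.append(f"expected {min_links}-{max_links} links, got {len(prompts)}")

    seen = set()
    for index, prompt in enumerate(prompts, start=1):
        text = prompt.strip().lower()
        if not text:
            problems.append(f"link {index} is empty")
        elif text in seen:
            problems.append(f"link {index} repeats an earlier prompt")
        seen.add(text)
    return problems


# The follow-up instructions used in this notebook, after the initial prompt.
planned = [
    "Build a presentation of total 5 slides to solve the problem",
    "make it fluent for product and business leadership teams",
    "make it technical for engineering team to estimate the effort",
    "explain the effort",
]
print(validate_chain(planned))
```

An empty result only means the structure looks reasonable; whether each link stays focused on one domain still needs a human read.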


If you are an AI enthusiast, don't stop here: you could start building your first basic RAG application based on prompt chaining and a few more techniques we will cover soon. Happy Prompt Chaining!<br>