# 📓 The GenAI Revolution Cookbook

**Title:** How to Build Reliable LangChain LLM Workflows in 15 Minutes Flat

**Description:** Get hands-on with LangChain: install, configure models, build prompt-driven chains, and parse structured outputs. Launch reliable Python LLM workflows fast.

---

*This Jupyter notebook contains executable code examples. Run the cells below to try out the code yourself!*



You want a fast path to a reliable LLM workflow. In 15 minutes, you will build a production-ready LLM workflow in Python using LangChain that outputs strict JSON for downstream systems. Setup friction and unclear abstractions stall delivery. Brittle parsing breaks downstream systems. This guide shows you how to build a chain that turns customer emails into strict JSON dictionaries ready for routing. You will route tickets, initiate refunds, and update analytics without manual intervention. You need Python 3.9+, an OpenAI account with API access, and basic familiarity with pip and environment variables.

## Why Use LangChain for This Problem

LangChain gives you composable building blocks for prompts, models, and parsers. You avoid boilerplate for message formatting, retry logic, and output parsing. The LCEL syntax makes chains readable and testable. You can swap models, adjust prompts, and add validation without rewriting integration code.

You could call the OpenAI SDK directly. You would get full control, but you would also write your own message formatting, error handling, and parsing logic. LlamaIndex is great for retrieval-augmented generation. It adds complexity if you only need prompt-to-structured-output workflows. LangChain strikes a balance. It handles the plumbing while keeping your code explicit and maintainable.

## Core Concepts for This Use Case

* Runnables and chains. A runnable is any component you can invoke with input and get output. Chains are runnables composed with the pipe operator. You will chain a prompt template, an LLM, and a parser into a single callable unit.
* Prompt templates. Templates let you parameterize system and user messages. You inject variables like persona and user input at runtime. This keeps prompts version-controlled and testable.
* Structured output parsers. Parsers enforce a schema on the model output. You define fields and types. The parser validates the JSON. If the model returns invalid JSON, you can catch the exception and retry.
* Messages. LangChain uses message objects to represent conversation turns. This makes multi-turn context explicit and portable across models.
* LCEL. LangChain Expression Language connects components with the pipe operator. You read chains top to bottom. You test them in small steps.
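
To make the pipe operator less abstract, here is a library-free sketch of how `|` composition can work. The `Step` class and the three stand-in steps are hypothetical and only mimic the prompt, model, parser shape. LangChain's real runnables add batching, streaming, and configuration on top of this idea.

```python
import json
from typing import Any, Callable


class Step:
    """A toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # Composing two steps yields a new step that runs them left to right.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for a prompt template, a model, and a parser.
render_prompt = Step(lambda d: f"Extract the intent from: {d['email']}")
fake_model = Step(lambda prompt: '{"intent": "refund"}')  # canned reply, no API call
parse_json = Step(json.loads)

toy_chain = render_prompt | fake_model | parse_json
toy_result = toy_chain.invoke({"email": "Please refund order 42."})
print(toy_result)
```

Because each step only needs `invoke`, you can test any step alone or swap the canned model for a real one without touching the rest of the chain.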

## Setup

Install the required packages. Run this command in your terminal or notebook.

In [None]:
!pip install -U langchain langchain-community langchain-openai openai python-dotenv

# Or, from a terminal:
python -m pip install -U \
    langchain langchain-core langchain-openai \
    pydantic python-dotenv tiktoken

Load your API keys securely. Never hardcode keys in your code. Use a .env file locally, or Colab secrets if you are in Colab.

In [None]:
import os
from google.colab import userdata
from google.colab.userdata import SecretNotFoundError

# Only the OpenAI key is needed for this guide
keys = ["OPENAI_API_KEY"]
missing = []
for k in keys:
    value = None
    try:
        value = userdata.get(k)
    except SecretNotFoundError:
        pass

    os.environ[k] = value if value is not None else ""

    if not os.environ[k]:
        missing.append(k)

if missing:
    raise EnvironmentError(f"Missing keys: {', '.join(missing)}. Add them in Colab → Settings → Secrets.")

print("All keys loaded.")

import os
from dotenv import load_dotenv

# Load .env if present
load_dotenv()

# Option A. Standard environment variables
required = ["OPENAI_API_KEY"]
missing = [k for k in required if not os.getenv(k)]
if missing:
    raise ValueError(f"Missing environment variables: {missing}")

# Optional. If you use Colab, uncomment and set your secrets there
# from google.colab import userdata
# os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY") or os.getenv("OPENAI_API_KEY", "")

print("API key loaded.")

Verify imports and environment. This block ensures your OpenAI key is present and imports the core LangChain components.

In [None]:
import os

assert os.getenv("OPENAI_API_KEY"), "Set OPENAI_API_KEY in your Colab secrets"

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

import os
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set."

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

print("Imports successful.")

Instantiate the model. Use temperature 0 for more repeatable output. It reduces run-to-run variation but does not guarantee identical responses. Set max_tokens to cap response length and cost.

In [None]:
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0,
    max_tokens=300,
)

MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4o-mini")
TEMPERATURE = float(os.getenv("TEMPERATURE", "0"))
MAX_TOKENS = int(os.getenv("MAX_TOKENS", "256"))

llm = ChatOpenAI(
    model=MODEL_NAME,
    temperature=TEMPERATURE,
    max_tokens=MAX_TOKENS,
)
print(f"Model ready. model={MODEL_NAME}, temperature={TEMPERATURE}, max_tokens={MAX_TOKENS}")

Test the model with a single prompt. This verifies connectivity and credentials.

In [None]:
msg = llm.invoke("Summarize why consistent JSON outputs help downstream systems.")
print(type(msg))
print(msg.content)

res = llm.invoke("Reply with the word: pong")
print(res.content)  # expected: "pong"

Inspect the response metadata. This shows token usage and finish reason.

In [None]:
print("Response metadata:", getattr(msg, "response_metadata", {}))
print("Usage metadata:", getattr(msg, "usage_metadata", {}))

print(res.response_metadata)
# Example keys: token_usage, finish_reason, model_name, system_fingerprint
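
If you want to log usage per call, a small helper can flatten that metadata. This is a sketch: the `summarize_usage` name is hypothetical, and the inner keys (`prompt_tokens`, `completion_tokens`, `total_tokens`) are assumed to follow the OpenAI usage format, so check them against what your model actually returns.

```python
from typing import Any, Dict


def summarize_usage(response_metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten token usage and finish reason, tolerating missing keys."""
    usage = response_metadata.get("token_usage") or {}
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
        "finish_reason": response_metadata.get("finish_reason"),
    }


# Hand-written sample shaped like the metadata printed above (not real output).
sample_metadata = {
    "token_usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
    "finish_reason": "stop",
}
usage_summary = summarize_usage(sample_metadata)
print(usage_summary)
```

A `finish_reason` of `length` usually means the reply hit the token limit, which is a common cause of truncated JSON.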

## Using the Tool in Practice

Build a multi-turn conversation. Use explicit message roles to control context.

In [None]:
messages = [
    SystemMessage(content="You are a concise assistant that extracts key facts."),
    HumanMessage(content="I purchased earbuds last week. The left bud is dead."),
    AIMessage(content="Noted. A device failure on the left earbud."),
    HumanMessage(content="What information would you need to process a warranty claim?")
]

reply = llm.invoke(messages)
print(reply.content)

history = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Who won the 2018 FIFA World Cup?"),
    AIMessage(content="France won the 2018 FIFA World Cup."),
    HumanMessage(content="Who was the runner-up?")
]
followup = llm.invoke(history)
print(followup.content)

Create a prompt template with variables. Inject persona and user input at runtime.

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "{persona}"),
        ("human", "{user_input}")
    ]
)

rendered = prompt.invoke({
    "persona": "You are a helpful customer support assistant.",
    "user_input": "Customer reports a faulty left earbud after 7 days. Next step?"
})
print(rendered.to_messages())

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "{persona}\nFollow the format instructions.\n{format_instructions}"),
        ("human", "{input}")
    ]
)

# Quick smoke test with a simple string parser
simple_chain = prompt | llm | StrOutputParser()
print(simple_chain.invoke({
    "persona": "You are concise.",
    "format_instructions": "Answer in one short sentence.",
    "input": "What is LCEL?"
}))

Parameterizing prompts and enforcing structured outputs are crucial for reliability. For more strategies on building robust LLM features and reducing errors in production, see the guide on prompt engineering with LLM APIs for reliable outputs at /article/prompt-engineering-with-llm-apis-how-to-get-reliable-outputs-4.

Compose a chain with LCEL. Connect the prompt and model into a reusable workflow.

In [None]:
chain = prompt | llm

resp = chain.invoke({
    "persona": "You are a helpful customer support assistant.",
    "user_input": "The customer wants a refund for defective earbuds. What should we do?"
})
print(resp.content)

base_chain = prompt | llm
resp = base_chain.invoke({
    "persona": "You give calm, factual answers.",
    "format_instructions": "No preamble.",
    "input": "Explain what a LangChain runnable is."
})
print(resp.content)

Define response schemas for structured fields. Guide the model to output specific fields.

In [None]:
schemas = [
    ResponseSchema(
        name="type",
        description="One of complaint, inquiry, feedback."
    ),
    ResponseSchema(
        name="product",
        description="Product or service mentioned, string."
    ),
    ResponseSchema(
        name="action",
        description="Recommended action like refund, replace, clarify, route_to_support."
    ),
]

schemas = [
    ResponseSchema(
        name="intent",
        description="One of: complaint, inquiry, refund, feedback"
    ),
    ResponseSchema(
        name="customer_name",
        description="Full name of the sender if present. Else null"
    ),
    ResponseSchema(
        name="email",
        description="Sender email if present. Else null"
    ),
    ResponseSchema(
        name="order_id",
        description="Order or ticket identifier if present. Else null"
    ),
    ResponseSchema(
        name="priority",
        description="One of: low, medium, high"
    ),
    ResponseSchema(
        name="action",
        description="One of: route, refund, escalate, ignore"
    ),
    ResponseSchema(
        name="summary",
        description="One sentence summary of the email"
    ),
]
parser = StructuredOutputParser.from_response_schemas(schemas)

Add a StructuredOutputParser and generate format instructions. The instructions tell the model which JSON shape to return, and the parser raises an error when the reply does not match.

In [None]:
parser = StructuredOutputParser.from_response_schemas(schemas)
format_instructions = parser.get_format_instructions()
print(format_instructions[:200], "...")
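
The format instructions ask the model to wrap its JSON in a markdown code fence. As a rough, library-free illustration of what a parser then has to do, the sketch below pulls the JSON out of such a reply and checks for required keys. `parse_fenced_json` is a hypothetical helper, not LangChain's implementation.

```python
import json
import re
from typing import Any, Dict, Iterable

FENCE = "`" * 3  # built indirectly so this example can sit inside a code fence


def parse_fenced_json(text: str, required_keys: Iterable[str]) -> Dict[str, Any]:
    """Extract a JSON object from a fenced reply and check its keys."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    payload = match.group(1) if match else text  # fall back to the raw text
    data = json.loads(payload)  # raises json.JSONDecodeError on invalid JSON
    missing = [k for k in required_keys if k not in data]
    if missing:
        raise ValueError(f"Missing keys: {missing}")
    return data


reply = f'{FENCE}json\n{{"type": "complaint", "product": "earbuds", "action": "refund"}}\n{FENCE}'
parsed_reply = parse_fenced_json(reply, ["type", "product", "action"])
print(parsed_reply)
```

Both failure modes surface as exceptions (invalid JSON or a missing key), which is what makes a retry step possible later on.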

format_instructions = parser.get_format_instructions()
print("Format instructions ready.\n")
print(format_instructions[:300], "...\n")  # preview

Build the full prompt-to-model-to-parser chain. Extract structured fields from customer emails.

In [None]:
extraction_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You extract structured fields from customer emails. "
            "Return JSON that strictly follows these rules. {format_instructions}"
        ),
        ("human", "Email:\n{email}")
    ]
)

extraction_chain = extraction_prompt | llm | parser

email_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a data extraction assistant. Produce only valid JSON that matches the schema."),
        ("human", "Follow these format instructions:\n{format_instructions}\n\nEmail:\n{email_text}")
    ]
)
extract_chain = email_prompt | llm | parser

Run customer emails through the extraction chain. Print parsed Python dictionaries.

In [None]:
emails = [
    "Hi, my left earbud stopped working after a week. I want a refund please.",
    "Hello, can you tell me if the Model X earbuds support wireless charging?",
    "Just wanted to say the new firmware fixed my microphone issue. Thanks."
]

for e in emails:
    result = extraction_chain.invoke({
        "email": e,
        "format_instructions": format_instructions
    })
    print(result)

email_text = """Subject: Refund request
Hi team,
I was charged twice for order #A123-44 on May 2. Please refund the duplicate charge.
Thanks, Sam Lee, sam.lee@example.com
"""
result = extract_chain.invoke({
    "format_instructions": format_instructions,
    "email_text": email_text
})
print(result)  # dict with your fields

Validate schema coverage and values. Make sure the model output matches expectations.

In [None]:
def validate_result(d):
    assert isinstance(d, dict)
    assert d["type"] in {"complaint", "inquiry", "feedback"}
    assert isinstance(d["product"], str)
    assert isinstance(d["action"], str)

for e in emails:
    d = extraction_chain.invoke({"email": e, "format_instructions": format_instructions})
    validate_result(d)

allowed_intents = {"complaint", "inquiry", "refund", "feedback"}
allowed_priority = {"low", "medium", "high"}
allowed_action = {"route", "refund", "escalate", "ignore"}

expected_keys = {"intent", "customer_name", "email", "order_id", "priority", "action", "summary"}
assert set(result.keys()) == expected_keys, f"Unexpected keys. got={set(result.keys())}"

assert result["intent"] in allowed_intents, f"Bad intent. {result['intent']}"
assert result["priority"] in allowed_priority, f"Bad priority. {result['priority']}"
assert result["action"] in allowed_action, f"Bad action. {result['action']}"

print("Validation passed.")
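
Bare `assert` statements stop at the first failure and are stripped when Python runs with `-O`. One alternative, sketched here, is a validator that collects every problem and returns them as a list. The `validate_extraction` name and the rule table are illustrative. The allowed values mirror the intent, priority, and action fields used in this section.

```python
from typing import Any, Dict, List, Set

# Allowed values per enumerated field, mirroring the response schemas.
ALLOWED: Dict[str, Set[str]] = {
    "intent": {"complaint", "inquiry", "refund", "feedback"},
    "priority": {"low", "medium", "high"},
    "action": {"route", "refund", "escalate", "ignore"},
}
EXPECTED_KEYS = {"intent", "customer_name", "email", "order_id", "priority", "action", "summary"}


def validate_extraction(data: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    missing = EXPECTED_KEYS - set(data)
    extra = set(data) - EXPECTED_KEYS
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if extra:
        errors.append(f"unexpected keys: {sorted(extra)}")
    for field, allowed in ALLOWED.items():
        if field in data:
            value = data[field]
            if not isinstance(value, str) or value not in allowed:
                errors.append(f"bad {field}: {value!r}")
    return errors


good = {
    "intent": "refund", "customer_name": "Sam Lee", "email": "sam.lee@example.com",
    "order_id": "A123-44", "priority": "high", "action": "refund",
    "summary": "Customer was charged twice.",
}
bad = {**good, "priority": "urgent"}
del bad["summary"]

print(validate_extraction(good))
print(validate_extraction(bad))
```

Returning the full list makes it easier to log failures or feed them back into a corrective prompt.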

## Run and Evaluate

Inspect usage and latency without parsing. This helps you evaluate cost and performance.

In [None]:
from time import perf_counter

raw_chain = extraction_prompt | llm

start = perf_counter()
raw_msg = raw_chain.invoke({"email": emails[0], "format_instructions": format_instructions})
elapsed = perf_counter() - start

print("Latency seconds:", round(elapsed, 3))
print("Usage:", getattr(raw_msg, "usage_metadata", {}))

parsed = parser.invoke(raw_msg)
print(parsed)

import time

t0 = time.time()
msg = (email_prompt | llm).invoke({
    "format_instructions": format_instructions,
    "email_text": email_text
})
latency_ms = int((time.time() - t0) * 1000)
usage = msg.response_metadata.get("token_usage", {})
print(f"Latency: {latency_ms} ms. Usage: {usage}")

Build a reusable function for production calls. Include validation and usage tracking.

In [None]:
def extract_email_fields(email: str) -> dict:
    raw = raw_chain.invoke({"email": email, "format_instructions": format_instructions})
    usage = getattr(raw, "usage_metadata", {})
    parsed = parser.invoke(raw)
    validate_result(parsed)
    return {"data": parsed, "usage": usage}

print(extract_email_fields("The Model X case will not charge. Need a replacement."))

import time
from typing import Any, Dict, Tuple

def process_email(
    email_text: str,
    persona: str = "You are a data extraction assistant.",
    llm_model: ChatOpenAI = llm,
    response_parser: StructuredOutputParser = parser,
    prompt_template: ChatPromptTemplate = email_prompt,
    retries: int = 1,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Returns (parsed_data, telemetry)."""
    fmt = response_parser.get_format_instructions()
    # Stop at the model so the raw message and its usage metadata stay available
    model_chain = prompt_template | llm_model

    last_error = None
    for attempt in range(retries + 1):
        t0 = time.time()
        try:
            # persona is unused here because email_prompt has no {persona} slot.
            # Pass a prompt_template with that variable if you need it.
            msg = model_chain.invoke({
                "format_instructions": fmt,
                "email_text": email_text,
            })
            latency_ms = int((time.time() - t0) * 1000)
            # Parse the same response instead of calling the model a second time
            parsed = response_parser.invoke(msg)
            usage = msg.response_metadata.get("token_usage", {})
            telemetry = {
                "latency_ms": latency_ms,
                "usage": usage,
                "attempts": attempt + 1,
                "model": llm_model.model_name,
            }
            return parsed, telemetry
        except Exception as e:
            last_error = e
            if attempt < retries:
                continue

    raise RuntimeError(f"Failed to parse after {retries + 1} attempts. Last error: {last_error}")

Add a persona layer for domain-specific language. Tailor tone and vocabulary.

In [None]:
persona = "You are a support triage assistant for consumer audio devices."
persona_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", persona + " Return strict JSON. {format_instructions}"),
        ("human", "Email:\n{email}")
    ]
)
persona_chain = persona_prompt | llm | parser

print(persona_chain.invoke({"email": emails[1], "format_instructions": format_instructions}))

persona_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "{persona} Produce only valid JSON that matches the schema."),
        ("human", "Follow these format instructions:\n{format_instructions}\n\nEmail:\n{email_text}")
    ]
)

def process_email_with_persona(email_text: str, persona: str) -> Dict:
    chain = persona_prompt | llm | parser
    return chain.invoke({
        "persona": persona,
        "format_instructions": parser.get_format_instructions(),
        "email_text": email_text
    })

print(process_email_with_persona(email_text, "You specialize in ecommerce support. Prefer precise field extraction."))

Run the workflow with your own examples. Include edge cases to test robustness.

In [None]:
my_emails = [
    "Order 1234. Model X earbuds arrived scratched. I want a refund.",
    "Do the Model Y earbuds pair with two phones at once?",
    "Love the sound on Model Z. Battery could be better, just feedback."
]

for e in my_emails:
    print(extraction_chain.invoke({"email": e, "format_instructions": format_instructions}))

examples = [
    "Subject: Where is my order?\nI ordered on April 2. Order 9981. No delivery yet. - Maria",
    "Please cancel order #ZX-77. Wrong size. Email me at alex@domain.com",
    "Thanks for the quick support. Everything is resolved now. - Ray",
    "I was overcharged. Twice on the same card. Order A-55. Need a refund asap."
]
for e in examples:
    data, meta = process_email(e, retries=1)
    print(data, meta)

Track basic telemetry for each invocation. Return latency, usage, and parsed data.

In [None]:
def timed_invoke(email):
    import time
    t0 = time.perf_counter()
    raw = raw_chain.invoke({"email": email, "format_instructions": format_instructions})
    dt = time.perf_counter() - t0
    usage = getattr(raw, "usage_metadata", {})
    return dt, usage, parser.invoke(raw)

for e in emails:
    dt, usage, data = timed_invoke(e)
    print({"latency_s": round(dt, 3), "usage": usage, "data": data})

sample, telemetry = process_email(email_text, retries=1)
print("Parsed:", sample)
print("Telemetry:", telemetry)

Handle invalid JSON gracefully with retries. Use a corrective prompt if parsing fails.

In [None]:
from langchain_core.exceptions import OutputParserException

def safe_extract(email, max_retries=1):
    for attempt in range(max_retries + 1):
        try:
            return extraction_chain.invoke({"email": email, "format_instructions": format_instructions})
        except OutputParserException:
            corrective = ChatPromptTemplate.from_messages(
                [
                    ("system", "Return valid JSON only. Do not include commentary. {format_instructions}"),
                    ("human", "Email:\n{email}")
                ]
            )
            retry_chain = corrective | llm | parser
            if attempt < max_retries:
                result = retry_chain.invoke({"email": email, "format_instructions": format_instructions})
                return result
            raise

print(safe_extract("Refund me please. Model X left earbud broke in a week."))
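
The retry pattern is easier to reason about when the model call is injected, because you can then exercise it without network access. The sketch below is library-free: `call_model` is any function from prompt string to reply string, and `flaky_model` is a hypothetical stub that fails once before returning valid JSON. On failure, the parse error is fed back into the next prompt.

```python
import json
from typing import Any, Callable, Dict, List


def extract_with_retries(
    call_model: Callable[[str], str],
    prompt: str,
    max_retries: int = 2,
) -> Dict[str, Any]:
    """Call the model, retrying with the parse error appended to the prompt."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    current_prompt = prompt
    for attempt in range(max_retries + 1):
        reply = call_model(current_prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as err:
            if attempt == max_retries:
                raise  # out of retries: surface the last parse error
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON ({err.msg}). "
                "Return only a JSON object."
            )
    raise RuntimeError("retry loop exited unexpectedly")


calls: List[str] = []


def flaky_model(prompt: str) -> str:
    """Stub model: prose on the first call, valid JSON afterwards."""
    calls.append(prompt)
    return "Sure! Here you go." if len(calls) == 1 else '{"action": "refund"}'


retry_result = extract_with_retries(flaky_model, "Email: refund me please")
print(retry_result, "after", len(calls), "calls")
```

Keeping the retry budget small matters in practice, since every retry is another paid model call.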

from langchain_core.exceptions import OutputParserException

def robust_process_email(email_text: str, max_retries: int = 2) -> Dict:
    fmt = parser.get_format_instructions()

    corrective_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", "You are a strict JSON generator. Produce only valid JSON. No prose."),
            ("human", "If the prior output was invalid, correct it.\nSchema:\n{format_instructions}\n\nEmail:\n{email_text}")
        ]
    )

    base = email_prompt | llm | parser
    fixer = corrective_prompt | llm | parser

    chain = base
    for attempt in range(max_retries + 1):
        try:
            return chain.invoke({"format_instructions": fmt, "email_text": email_text})
        except OutputParserException:
            if attempt == max_retries:
                raise
            # Switch to the corrective prompt for the remaining attempts
            chain = fixer

print(robust_process_email("Refund my order #123. I was billed twice. joe@ex.com"))

Expand the schema when business needs change. Add an urgency field based on sentiment.

In [None]:
schemas_extended = schemas + [
    ResponseSchema(name="urgency", description="low, medium, high based on sentiment and urgency cues.")
]
parser_ext = StructuredOutputParser.from_response_schemas(schemas_extended)
fmt_ext = parser_ext.get_format_instructions()

prompt_ext = ChatPromptTemplate.from_messages(
    [
        ("system", "Extract fields and urgency. Return strict JSON. {format_instructions}"),
        ("human", "Email:\n{email}")
    ]
)
chain_ext = prompt_ext | llm | parser_ext

print(chain_ext.invoke({"email": emails[0], "format_instructions": fmt_ext}))

expanded_schemas = schemas + [
    ResponseSchema(
        name="urgency",
        description="One of: low, medium, high. Based on sentiment and explicit urgency."
    ),
]
expanded_parser = StructuredOutputParser.from_response_schemas(expanded_schemas)
expanded_format_instructions = expanded_parser.get_format_instructions()

expanded_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You extract fields and determine urgency. Output valid JSON only."),
        ("human", "Format instructions:\n{format_instructions}\n\nEmail:\n{email_text}")
    ]
)
expanded_chain = expanded_prompt | llm | expanded_parser

print(expanded_chain.invoke({
    "format_instructions": expanded_format_instructions,
    "email_text": "This is urgent. Order #A33 is missing and the event is tonight."
}))

Add minimal unit tests for the extraction chain. Validate field values and types.

In [None]:
def test_extraction():
    sample = "Left earbud on Model X stopped working. Please replace."
    d = extraction_chain.invoke({"email": sample, "format_instructions": format_instructions})
    assert d["type"] in {"complaint", "inquiry", "feedback"}
    assert isinstance(d["product"], str)
    assert d["action"] in {"refund", "replace", "clarify", "route_to_support"}

test_extraction()

# Save as tests/test_extraction.py
import re

def test_refund_detection():
    email = "Refund requested for order #A1. Double charged. user@mail.com"
    data = extract_chain.invoke({"format_instructions": format_instructions, "email_text": email})
    assert data["intent"] in {"refund", "complaint"}
    assert data["action"] in {"refund", "escalate", "route"}
    # Loose shape check: something@something.tld
    assert re.match(r".+@.+\..+", data["email"] or "") is not None

def test_missing_fields_are_null():
    email = "Thanks for the help. Everything works."
    data = extract_chain.invoke({"format_instructions": format_instructions, "email_text": email})
    assert data["order_id"] in (None, "", "null", "NULL")

Use prompt fixtures for reproducibility. Keep test prompts stable across changes.

In [None]:
fixtures = [
    {
        "email": "Model X case not charging. Need a replacement.",
        "expect_type": {"complaint"},
    },
    {
        "email": "Do Model Y earbuds support USB C charging?",
        "expect_type": {"inquiry"},
    },
]

for fx in fixtures:
    d = extraction_chain.invoke({"email": fx["email"], "format_instructions": format_instructions})
    assert d["type"] in fx["expect_type"]

# tests/fixtures.py
EXTRACTION_FORMAT = parser.get_format_instructions()
SIMPLE_EMAIL = "Order 42 arrived damaged. Need a replacement. jane@example.com"

Batch processing with simple loops. Process multiple emails in one pass.

In [None]:
results = [extraction_chain.invoke({"email": e, "format_instructions": format_instructions}) for e in emails]
print(results)
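
A plain loop stops at the first exception. When you process many emails, you may prefer to capture failures per item and keep going. Here is a sketch using the standard library's thread pool. `extract` stands for any single-email callable (for example a wrapper around a chain's `invoke`), and `fake_extract` is a hypothetical stub used so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def run_batch(
    extract: Callable[[str], Dict[str, Any]],
    emails: List[str],
    max_workers: int = 3,
) -> List[Dict[str, Any]]:
    """Process emails concurrently; results keep the input order."""

    def safe(email: str) -> Dict[str, Any]:
        try:
            return {"ok": True, "data": extract(email)}
        except Exception as err:  # record the failure instead of aborting the batch
            return {"ok": False, "error": str(err), "email": email}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, emails))  # map() yields results in input order


def fake_extract(email: str) -> Dict[str, Any]:
    """Stub extractor: fails on empty input, otherwise returns a fixed shape."""
    if not email.strip():
        raise ValueError("empty email")
    return {"type": "inquiry", "length": len(email)}


batch_results = run_batch(fake_extract, ["Where is order 88?", "", "Cancel order 77."])
for item in batch_results:
    print(item)
```

Threads suit this workload because each call mostly waits on network I/O. Keep `max_workers` low enough to stay within your provider's rate limits.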

emails = [
    "Please cancel order 77. Wrong color.",
    "Where is order 88? No update in 10 days.",
    "I was charged twice for order 99."
]
inputs = [{"format_instructions": format_instructions, "email_text": e} for e in emails]
# Limit parallel requests with max_concurrency in the run config
batched = extract_chain.batch(inputs, config={"max_concurrency": 3})
for d in batched:
    print(d)

Swap models with one-line changes. Experiment with different models.

In [None]:
llm_alt = ChatOpenAI(model="gpt-4o", temperature=0, max_tokens=300)
extraction_chain_alt = extraction_prompt | llm_alt | parser

print(extraction_chain_alt.invoke({"email": emails[2], "format_instructions": format_instructions}))

# Switch to a larger model if you need better accuracy
llm = ChatOpenAI(model="gpt-4o", temperature=0, max_tokens=MAX_TOKENS)
# Or try a cost-efficient mini model
# llm = ChatOpenAI(model="gpt-4o-mini", temperature=0, max_tokens=MAX_TOKENS)

# Rebuild the chain with the new model
extract_chain = email_prompt | llm | parser

Version prompts and parsers in code. Enable traceability and rollback.

In [None]:
PROMPT_VERSION = "v1.2"
SCHEMA_VERSION = "v1.1"

print({"prompt_version": PROMPT_VERSION, "schema_version": SCHEMA_VERSION})
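
Hand-maintained version strings can drift from the prompt text they describe. One option, sketched here with hypothetical names, is to derive a short fingerprint from the prompt and schema content and log it next to each result.

```python
import hashlib
import json
from typing import Dict, List


def fingerprint(prompt_text: str, schema_fields: List[str]) -> str:
    """Return a short, stable hash of the prompt text and schema field names."""
    payload = json.dumps(
        {"prompt": prompt_text, "fields": sorted(schema_fields)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def build_meta(prompt_text: str, schema_fields: List[str], model: str) -> Dict[str, str]:
    """Metadata to attach to every extraction result."""
    return {"model": model, "prompt_fingerprint": fingerprint(prompt_text, schema_fields)}


fields = ["type", "product", "action"]
meta_a = build_meta("Extract fields from the email.", fields, "gpt-4o-mini")
meta_b = build_meta("Extract fields from the email!", fields, "gpt-4o-mini")
print(meta_a)
print(meta_a["prompt_fingerprint"] != meta_b["prompt_fingerprint"])
```

Any edit to the prompt text changes the fingerprint, so a logged result can be traced back to the prompt that produced it even if someone forgets to bump the version string.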

PROMPT_VERSION = "email_extractor_v1"
SCHEMA_VERSION = "schema_v1"

def log_meta():
    return {
        "prompt_version": PROMPT_VERSION,
        "schema_version": SCHEMA_VERSION,
        "model": llm.model_name
    }

print("Meta:", log_meta())

Lightweight configuration with environment variables. Change model and temperature without code edits.

In [None]:
MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4o-mini")
TEMP = float(os.getenv("TEMP", "0"))
llm_cfg = ChatOpenAI(model=MODEL_NAME, temperature=TEMP, max_tokens=300)
cfg_chain = extraction_prompt | llm_cfg | parser

# .env
# MODEL_NAME=gpt-4o-mini
# TEMPERATURE=0
# MAX_TOKENS=256

# In code, you already read these values earlier:
# MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4o-mini")
# TEMPERATURE = float(os.getenv("TEMPERATURE", "0"))

Sample test cases to try now. Cover complaints, inquiries, and feedback.

In [None]:
cases = [
    "I love the sound on Model X, but the right bud randomly disconnects. Can you replace it?",
    "Do Model Y earbuds work with iOS 17? If yes, how to pair?",
    "Great update, pairing is faster now. Just a note for your team."
]

for c in cases:
    print(cfg_chain.invoke({"email": c, "format_instructions": format_instructions}))

samples = [
    "My package arrived damaged. Order #12345. Please replace it. - Kim",
    "What is your return window for electronics? - Pat",
    "Double charge on my card for order X-77. Need a refund now. rita@mail.com",
    "Just wanted to say thank you for the fast support."
]
for s in samples:
    data = extract_chain.invoke({"format_instructions": format_instructions, "email_text": s})
    print(data)
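
Once the output is a plain dictionary, routing is ordinary Python. The sketch below dispatches on the `action` field. The handler functions are hypothetical placeholders for your ticketing, billing, and analytics integrations, and unknown actions fall back to human review.

```python
from typing import Any, Callable, Dict


def route_ticket(data: Dict[str, Any]) -> str:
    return f"routed to support queue (priority={data.get('priority', 'low')})"


def start_refund(data: Dict[str, Any]) -> str:
    return f"refund started for order {data.get('order_id')}"


def escalate(data: Dict[str, Any]) -> str:
    return "escalated to a human agent"


def ignore(data: Dict[str, Any]) -> str:
    return "no action taken"


# Keys match the allowed values of the `action` field in the schema.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "route": route_ticket,
    "refund": start_refund,
    "escalate": escalate,
    "ignore": ignore,
}


def dispatch(data: Dict[str, Any]) -> str:
    """Pick a handler by action; anything unrecognized goes to a human."""
    handler = HANDLERS.get(data.get("action"), escalate)
    return handler(data)


print(dispatch({"action": "refund", "order_id": "X-77", "priority": "high"}))
print(dispatch({"action": "delete_everything"}))
```

Defaulting to escalation means an unexpected model output never triggers an automated side effect.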

## Conclusion

You built a production-ready LLM workflow that turns customer emails into strict JSON dictionaries. You used LangChain prompt templates, structured output parsers, and LCEL chains to keep the code readable and testable. You validated outputs, tracked token usage, and handled parsing errors with retries. This approach stays maintainable because prompts, schemas, and models are decoupled. You can version prompts, swap models, and extend schemas without rewriting integration logic.

If your workflow requires grounding responses in external knowledge, consider integrating retrieval-augmented generation. The ultimate guide to vector store retrieval for RAG systems at /article/rag-101-build-an-index-run-semantic-search-and-use-langchain-to-automate-it explains how to implement semantic search and chunking to minimize hallucinations and boost reliability.

If you want to go beyond prompt-driven chains and customize your LLM behavior at a deeper level, consider learning how to fine-tune models yourself. The step-by-step guide to fine-tuning large language models at /article/fine-tuning-large-language-models-a-step-by-step-guide-2025-5 walks you through dataset preparation, training, and evaluation using Hugging Face tools.