## Demo: Mastering LangChain Expression Language (LCEL)
This series of demos is designed to accompany the Generative AI Engineering workshop. We will move beyond single LLM calls to explore Reasoning Dataflows.

Key Objectives:
* Runnables: Understanding the atomic unit of LCEL and its core methods (invoke, stream, batch).
* The Pipe Operator (|): Visualizing the dataflow from prompt to parser.
* RAG Integration: Connecting models to external knowledge streams.
* Parallelism & Control: Implementing ensemble-style intelligence and cost-aware routing.
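
The three calling conventions listed for Runnables can be tried without an API key. The following is a toy sketch in plain Python, not LangChain's implementation: `ToyRunnable` is a hypothetical class whose `invoke`, `batch`, and `stream` methods only mimic the shape of the real interface.

```python
from typing import Callable, Iterator, List


class ToyRunnable:
    """Hypothetical stand-in that mimics the shape of the Runnable interface."""

    def __init__(self, func: Callable[[str], str]):
        self.func = func

    def invoke(self, value: str) -> str:
        # One input in, one output out.
        return self.func(value)

    def batch(self, values: List[str]) -> List[str]:
        # Many inputs in, outputs returned in the same order.
        return [self.invoke(v) for v in values]

    def stream(self, value: str) -> Iterator[str]:
        # Yield the output piece by piece (here: word by word).
        for word in self.invoke(value).split():
            yield word


shout = ToyRunnable(lambda text: text.upper())

single = shout.invoke("hello lcel")
many = shout.batch(["a", "b"])
chunks = list(shout.stream("hello lcel"))
print(single)
print(many)
print(chunks)
```

In this sketch `batch` simply loops and `stream` splits on whitespace; a real model would typically stream tokens as they are generated.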

In [1]:
# Install necessary libraries
# !pip install langchain langchain-openai

import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

# Initialize the model as a transform function.
# ChatOpenAI reads the API key from the OPENAI_API_KEY environment variable.
model = ChatOpenAI(model="gpt-4o-mini")
print("Setup Complete. Model Initialized.")

Setup Complete. Model Initialized.


### The Basic Pipe Flow
**The Pipe Flow Concept:**
In LCEL, we treat reasoning as flowing data. Instead of nested function calls, we use the pipe operator (|) to create a clear, traceable pipeline.

In [3]:
# 1. Define the Prompt (Computational Object)
prompt = ChatPromptTemplate.from_template(
    "Tell me a brief professional joke about {topic}."
)

# 2. Define the Output Parser (Intelligence Filter)
parser = StrOutputParser()

# 3. Construct the Chain using Pipe Flow
# Flow: Input -> Prompt -> Model -> Parser
chain = prompt | model | parser

# Execute the pipeline
response = chain.invoke({"topic": "AI Engineering"})
print(f"Chain Output: {response}")


Chain Output: Why did the AI engineer break up with their algorithm?

Because they didn’t find it statistically significant anymore!
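
To see why `prompt | model | parser` reads left to right, it can help to build a tiny pipe by hand. This is a minimal sketch under stated assumptions: `Step` is a hypothetical class, and `toy_prompt`, `toy_model`, and `toy_parser` are stand-ins that never call an LLM. It illustrates the idea of composing with `|`, not how LangChain implements it.

```python
class Step:
    """Hypothetical minimal step supporting `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for prompt, model, and parser; none of them calls a real LLM.
toy_prompt = Step(lambda d: f"Tell me a joke about {d['topic']}.")
toy_model = Step(lambda text: {"content": f"[model reply to: {text}]"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_model | toy_parser
piped = toy_chain.invoke({"topic": "pipes"})

# The equivalent nested call reads inside-out instead of left-to-right.
nested = toy_parser.invoke(toy_model.invoke(toy_prompt.invoke({"topic": "pipes"})))
print(piped)
```

Both forms produce the same string; the piped form keeps the order of the steps visible in the order they are written.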


### Parallel Reasoning Pipelines
**Parallelism and Ensemble Intelligence:**
Real-world problems often require more than one line of reasoning. LCEL lets us run multiple reasoning paths at the same time on the same input; for example, one path generates an answer while a separate path critiques the reasoning or verifies facts.

In [4]:
# Define two different reasoning paths
joke_chain = ChatPromptTemplate.from_template("Joke about {topic}") | model | parser
fact_chain = (
    ChatPromptTemplate.from_template("One real fact about {topic}") | model | parser
)

# Combine them into a Parallel Pipeline
map_chain = RunnableParallel(joke=joke_chain, fact=fact_chain)

# Execute both simultaneously
results = map_chain.invoke({"topic": "OpenAI"})
print(f"Parallel Result 1 (Joke): {results['joke']}")
print(f"Parallel Result 2 (Fact): {results['fact']}")

Parallel Result 1 (Joke): Why did OpenAI bring a ladder to the bar?

Because it wanted to reach new heights in AI conversation!
Parallel Result 2 (Fact): OpenAI was founded in December 2015 with the mission to ensure that artificial general intelligence (AGI) benefits all of humanity.
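
The observable behavior of the parallel pipeline above (the same input goes to every branch, and the results come back as a dict keyed by branch name) can be sketched with the standard library. This is an illustration only: `run_parallel` and `toy_branches` are hypothetical names, the branches are plain functions rather than LLM chains, and the internals of `RunnableParallel` may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(branches, value):
    """Send the same input to every branch and collect results by name."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {name: pool.submit(func, value) for name, func in branches.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for joke_chain and fact_chain; no LLM is called.
toy_branches = {
    "joke": lambda d: f"joke about {d['topic']}",
    "fact": lambda d: f"fact about {d['topic']}",
}

toy_results = run_parallel(toy_branches, {"topic": "OpenAI"})
print(toy_results)
```

Because LLM calls are mostly spent waiting on the network, running the branches concurrently means the total latency is closer to that of the slowest branch than to the sum of all branches.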


### Cost-Aware Routing
**Cost-Aware Execution:**
Not every task requires the most powerful model. With LCEL, we can use conditional logic to route simple tasks to smaller models (like GPT-4o-mini) and complex reasoning to larger ones, which keeps the cost of the system under control.

RunnableBranch implements this routing. It takes a list of (condition, runnable) pairs plus a default runnable, evaluates the conditions in order, and runs the branch of the first condition that returns True; if none match, it runs the default.

In [6]:
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnableBranch

# 1. Define your "Economy" and "Premium" models
economy_model = ChatOpenAI(model="gpt-4o-mini")  # Fast and cheap
premium_model = ChatOpenAI(model="gpt-4o")  # High-reasoning and expensive

# 2. Logic to extract the question and route
# We use economy_model for simple paths to minimize price-per-token
model_path_economy = (lambda x: x["question"]) | economy_model
model_path_premium = (lambda x: x["question"]) | premium_model

# 3. Cost-Aware Router
branch = RunnableBranch(
    # If the question is shorter than 50 characters, treat it as simple and use the cheap model
    (lambda x: len(x["question"]) < 50, model_path_economy),
    # Only use the expensive model for long, complex tasks
    model_path_premium,
)
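
The first-match-wins routing rule can be checked offline before spending tokens. The sketch below is a plain-Python imitation of the branch semantics, with assumptions labeled: `make_branch`, `toy_economy`, and `toy_premium` are hypothetical names, and the two handlers only tag the question instead of calling a model.

```python
def make_branch(*pairs, default):
    """Return a router that runs the first handler whose condition is true."""

    def route(value):
        for condition, handler in pairs:
            if condition(value):
                return handler(value)
        # No condition matched: fall back to the default handler.
        return default(value)

    return route


# Hypothetical stand-ins for the economy and premium model paths.
def toy_economy(x):
    return f"economy: {x['question']}"


def toy_premium(x):
    return f"premium: {x['question']}"


toy_branch = make_branch(
    (lambda x: len(x["question"]) < 50, toy_economy),
    default=toy_premium,
)

short_answer = toy_branch({"question": "What is LCEL?"})
long_answer = toy_branch(
    {"question": "Compare three retrieval strategies and justify a choice for legal text."}
)
print(short_answer)
print(long_answer)
```

Question length is only a rough proxy for difficulty; a short question can still be hard, so a production router would likely need a better signal than a character count.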