# LangChain Tutorial: Key Concepts Explained

This notebook is a step-by-step tutorial on the core concepts of LangChain. We'll cover LCEL (LangChain Expression Language), Runnables, components such as prompts and models, and how the pipe operator works through Python's operator overloading.

We'll use Azure OpenAI's GPT-4o-mini model via the provided setup code. Make sure to run the setup cell first and have your API key stored in Google Colab's userdata.

**Prerequisites:**
- Install required packages: `!pip install langchain langchain-openai langchain-community langchain-core langchain-huggingface docarray hnswlib sentence-transformers`
- Store your Azure OpenAI key in Colab secrets as 'eduhkkey'.

In [None]:
!pip install langchain langchain-openai langchain-community langchain-core langchain-huggingface docarray hnswlib sentence-transformers -q

## Setup: Azure OpenAI Model

This cell configures the `AzureChatOpenAI` model with the provided endpoint. Note that the endpoint URL must include the `?Hello=` query parameter.

In [None]:
import os
from langchain_openai import AzureChatOpenAI
from langchain_core.tools import tool
import requests
from datetime import datetime
import json
from google.colab import userdata

# Load the Azure OpenAI API key from Colab secrets instead of hard-coding it in the notebook
os.environ["AZURE_OPENAI_API_KEY"] = userdata.get('eduhkkey')

# Set up the Azure OpenAI model (using gpt-4o-mini as per docs)
llm = AzureChatOpenAI(
    azure_endpoint="https://aai02.eduhk.hk/openai/deployments/gpt-4o-mini/chat/completions?Hello=",
    api_version="2024-02-15-preview",  # Use a recent version
    deployment_name="gpt-4o-mini",
    temperature=0,  # Low temperature for consistent tool calling
    streaming=False,  # Non-streaming for simplicity
)

# Inspect the resolved configuration (the base URL lookup uses private client attributes and may break across versions)
print(f"Base URL: {llm.client._client._base_url}")
print(f"API Version: {llm.openai_api_version}")
print(f"Deployment: {llm.deployment_name}")
# Confirm the key is loaded without printing the secret itself
print("API key loaded:", bool(os.environ.get("AZURE_OPENAI_API_KEY")))

## Section 1: Introduction to LCEL and Runnable Protocol

LCEL (LangChain Expression Language) is a declarative way to compose chains of components in LangChain. It uses the Runnable Protocol, which defines standardized methods (invoke, stream, batch) that all components must implement.

Key idea: components such as prompts and models are Runnables, so they can be chained together.
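
To make the protocol concrete, here is a minimal sketch in plain Python. `ToyRunnable` is a hypothetical stand-in written for this tutorial, not LangChain's actual `Runnable` class; it only illustrates what "standardized methods" means.

```python
from typing import Callable, Iterator, List


class ToyRunnable:
    """Hypothetical stand-in for a Runnable: wraps a one-argument function."""

    def __init__(self, func: Callable):
        self.func = func

    def invoke(self, value):
        # Single input, single output.
        return self.func(value)

    def batch(self, values: List) -> List:
        # Many inputs; results come back in input order.
        return [self.invoke(v) for v in values]

    def stream(self, value) -> Iterator[str]:
        # Yield the output piece by piece (here: word by word).
        for word in str(self.invoke(value)).split():
            yield word


shout = ToyRunnable(lambda text: text.upper() + "!")

print(shout.invoke("hello world"))        # HELLO WORLD!
print(shout.batch(["cats", "dogs"]))      # ['CATS!', 'DOGS!']
print(list(shout.stream("hello world")))  # ['HELLO', 'WORLD!']
```

Because every component exposes the same three methods, any of them can be swapped or chained without the caller changing how it runs them.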

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Simple prompt template (Runnable)
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")

# Chain with pipe syntax (LCEL)
chain = prompt | llm | StrOutputParser()

# Invoke (sync, single)
result = chain.invoke({"topic": "cats"})
print(result)

## Section 2: Runnable Interface and Methods

Every Runnable supports:
- invoke/ainvoke: Sync/async single input.
- batch/abatch: Process multiple inputs.
- stream/astream: Incremental output.
- input_schema/output_schema: Properties (not methods) that describe the expected input and output types.
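
`batch` is more than a convenience loop: to our understanding, the default implementation runs the individual `invoke` calls on a thread pool. The sketch below shows that idea in plain Python. The `batch` helper, `slow_upper`, and the `max_concurrency` parameter name are illustrative assumptions, not LangChain code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_upper(text: str) -> str:
    # Stand-in for a slow call such as a model request.
    time.sleep(0.05)
    return text.upper()


def batch(func, inputs, max_concurrency=4):
    # pool.map preserves input order even if calls finish out of order.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(func, inputs))


results = batch(slow_upper, ["cats", "dogs", "birds"])
print(results)  # ['CATS', 'DOGS', 'BIRDS']
```

The three calls overlap in time, so the batch finishes in roughly the time of one call rather than three.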

In [None]:
# Batch example
inputs = [{"topic": "cats"}, {"topic": "dogs"}]
results = chain.batch(inputs)
print(results)

# Stream example
for chunk in chain.stream({"topic": "birds"}):
    print(chunk, end="")

## Section 3: Advanced Features - Fallbacks, Parallelism, Logging

- Fallbacks: Add backup components with `.with_fallbacks()`.
- Parallelism: Use `RunnableParallel` (alias `RunnableMap`) to run steps concurrently.
- Logging: Tracing is available through LangSmith (setup required).
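
Before using the LangChain versions, it can help to see the two ideas in plain Python. This is a sketch under stated assumptions: `with_fallbacks`, `run_parallel`, `flaky_model`, and `backup_model` are hypothetical helpers written for this tutorial and do not reflect LangChain's internals.

```python
from concurrent.futures import ThreadPoolExecutor


def with_fallbacks(primary, fallbacks):
    """Return a function that tries `primary`, then each fallback in order."""
    def run(value):
        last_error = None
        for candidate in [primary, *fallbacks]:
            try:
                return candidate(value)
            except Exception as exc:  # a real chain would usually narrow this
                last_error = exc
        raise last_error
    return run


def run_parallel(steps, value):
    """Run every step on the same input and collect the results by key."""
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(func, value) for key, func in steps.items()}
        return {key: future.result() for key, future in futures.items()}


def flaky_model(topic):
    # Simulates a primary model that is currently down.
    raise TimeoutError("primary model unavailable")


def backup_model(topic):
    return f"backup answer about {topic}"


safe_model = with_fallbacks(flaky_model, [backup_model])
print(safe_model("space"))  # backup answer about space

combined = run_parallel({"upper": str.upper, "length": len}, "space")
print(combined)  # {'upper': 'SPACE', 'length': 5}
```

The parallel helper returns a dictionary keyed by step name, which is the same shape `RunnableParallel` produces.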

In [None]:
from langchain_core.runnables import RunnableParallel

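# Parallel chain: both branches receive the same input and run concurrently
fact_prompt = ChatPromptTemplate.from_template("Tell me an interesting fact about {topic}")
parallel_chain = RunnableParallel(
    joke=chain,  # Reuse the joke chain from Section 1
    fact=fact_prompt | llm | StrOutputParser(),
)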

result = parallel_chain.invoke({"topic": "space"})
print(result)

# Fallback example (using a secondary model if primary fails)
fallback_llm = AzureChatOpenAI(  # Another instance as fallback
    azure_endpoint="https://aai02.eduhk.hk/openai/deployments/gpt-4o-mini/chat/completions?Hello=",
    api_version="2024-02-15-preview",
    deployment_name="gpt-4o-mini",
    temperature=0.5,
    max_retries=0,  # No retries on the fallback; for an immediate hand-off, also set max_retries=0 on the primary model
)
chain_with_fallback = llm.with_fallbacks([fallback_llm])
result = (prompt | chain_with_fallback | StrOutputParser()).invoke({"topic": "fallback test"})
print(result)

## Section 4: Pipe Syntax vs. RunnableSequence

The pipe (`|`) is shorthand for building a `RunnableSequence`; it works through operator overloading (`__or__`). The verbose alternative is to construct the `RunnableSequence` explicitly.

In [None]:
from langchain_core.runnables import RunnableSequence

# Pipe syntax
pipe_chain = prompt | llm

# Verbose equivalent
sequence_chain = RunnableSequence(prompt, llm)

# Both work the same
print(pipe_chain.invoke({"topic": "verbose"}))
print(sequence_chain.invoke({"topic": "verbose"}))

## Section 5: Understanding the Pipe Operator (|) and Operator Overloading

In Python, `|` is natively the bitwise OR operator for integers and the union operator for sets (and, since Python 3.9, for dicts). Libraries such as LangChain use **operator overloading**, a fully legitimate language feature, to repurpose it as a "pipe" for composing objects, mimicking the Linux/Unix shell pipe (`|`) that chains commands.

### How It Works in Python
- **Native Behavior**: Without overloading, `a | b` does bitwise OR if `a` and `b` are ints (e.g., `5 | 3` is 7), unions sets (e.g., `{1, 2} | {2, 3}` is `{1, 2, 3}`), and, in Python 3.9+, merges dicts (e.g., `{"a":1} | {"b":2}` is `{"a":1, "b":2}`).
- **Overloading Trick**: Python classes can define special methods (dunder methods) to customize operators. For `|`, it's `__or__` (and optionally `__ror__` for reverse). If you implement this in a class, `obj1 | obj2` calls `obj1.__or__(obj2)`, letting you define custom behavior like chaining.
- **In LangChain's LCEL**: The `Runnable` class overloads `__or__` to create a `RunnableSequence`. So `prompt | model` returns a new object that pipes the output of `prompt` into `model`.

In [None]:
# Let's demonstrate operator overloading with a simple example
class SimplePipe:
    def __init__(self, func):
        self.func = func
    
    def __or__(self, other):
        # This is what happens when you use | operator
        def chained(x):
            return other.func(self.func(x))
        return SimplePipe(chained)
    
    def invoke(self, x):
        return self.func(x)

# Create simple pipe components
add_one = SimplePipe(lambda x: x + 1)
double = SimplePipe(lambda x: x * 2)

# Chain them with | operator (this calls add_one.__or__(double))
# Named simple_chain so it does not overwrite the LangChain `chain` from Section 1
simple_chain = add_one | double
result = simple_chain.invoke(5)  # (5 + 1) * 2 = 12
print(f"Result: {result}")  # Output: Result: 12

# Show what happens under the hood
print(f"Type of chain: {type(simple_chain)}")
print(f"Chain is a SimplePipe: {isinstance(simple_chain, SimplePipe)}")
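
The `__ror__` hook mentioned earlier handles the case where the object on the *left* of `|` is not one of your classes, such as a plain function. The sketch below extends the idea with a hypothetical `Step` class (not LangChain code) that wraps bare callables on either side.

```python
class Step:
    """Hypothetical pipe component that also accepts plain callables."""

    def __init__(self, func):
        self.func = func

    @staticmethod
    def _coerce(other):
        # Wrap bare callables so both operands end up as Step objects.
        return other if isinstance(other, Step) else Step(other)

    def __or__(self, other):
        # Step | something: run self first, then `other`.
        right = self._coerce(other)
        return Step(lambda x: right.func(self.func(x)))

    def __ror__(self, other):
        # something | Step: Python falls back to this when the left operand
        # (e.g. a plain function) does not know how to handle `|`.
        left = self._coerce(other)
        return Step(lambda x: self.func(left.func(x)))

    def invoke(self, x):
        return self.func(x)


double_step = Step(lambda x: x * 2)

left_plain = (lambda x: x + 1) | double_step   # uses Step.__ror__
right_plain = double_step | (lambda x: x + 1)  # uses Step.__or__

print(left_plain.invoke(5))   # (5 + 1) * 2 = 12
print(right_plain.invoke(5))  # (5 * 2) + 1 = 11
```

Supporting both directions is what lets a pipeline start with an ordinary function or dictionary instead of a wrapped object.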

### Why It Feels Like a Pipe

- **Inspired by shells**: In Linux, `cmd1 | cmd2` sends output from cmd1 to cmd2 as input. LCEL does the same for data flow (e.g., prompt output → model input).
- **Pros**: Makes code concise and intuitive, especially for pipelines.
- **Cons**: Can confuse beginners if they're expecting bitwise OR, but context (like importing LangChain) makes it clear.

Operator overloading has been part of Python since its early versions and is a stable language feature. Let's compare the native behavior of `|` with LangChain's overloaded version:

In [None]:
# Demonstrate native Python operators vs LangChain overloading

# Native bitwise OR
print("Native bitwise OR:")
print(f"5 | 3 = {5 | 3}")  # Bitwise OR: 7

# Native set union (any Python 3) and dict union (Python 3.9+)
print("\nNative set/dict union:")
set1 = {1, 2, 3}
set2 = {3, 4, 5}
print(f"{set1} | {set2} = {set1 | set2}")

dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
print(f"{dict1} | {dict2} = {dict1 | dict2}")

# LangChain overloaded behavior
print("\nLangChain overloaded | operator:")
simple_prompt = ChatPromptTemplate.from_template("Say hello to {name}")
chained = simple_prompt | llm
print(f"Type of result: {type(chained)}")
print(f"Result: {chained.invoke({'name': 'Alice'})}")

### Key Takeaways

1. **Not a hack**: Operator overloading is a core Python feature, like NumPy using `+` for array addition.
2. **Intuitive design**: `prompt | model | parser` reads left-to-right like Unix pipes.
3. **Under the hood**: `prompt | model` calls `prompt.__or__(model)` which returns a `RunnableSequence`.
4. **Flexible**: You can implement this pattern in your own classes for domain-specific pipelines.

This approach makes LangChain chains both powerful and readable!

## Section 6: Data Flow in Chains

In a chain, the output of one component becomes the input to the next: the prompt produces a `ChatPromptValue`, the model turns it into an `AIMessage`, and the parser extracts a plain string.

In [None]:
# Inspect flow
prompt_output = prompt.invoke({"topic": "flow"})
print("Prompt Output:", prompt_output)

model_output = llm.invoke(prompt_output)
print("Model Output:", model_output)

parser = StrOutputParser()
final_output = parser.invoke(model_output)
print("Final Output:", final_output)

## Section 7: Switching Execution Modes

- Sync → Async: Use `ainvoke`/`astream`/`abatch`.
- Single → Batch: Pass a list to `batch`/`abatch`.
- Non-stream → Streaming: Use `stream`/`astream`.
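
The async variants map onto standard `asyncio` patterns. The following sketch uses only the standard library; `fake_model` and `fake_stream` are hypothetical stand-ins for a model call, so no API key is needed to run it.

```python
import asyncio


async def fake_model(topic: str) -> str:
    # Stand-in for a network call to a model.
    await asyncio.sleep(0.01)
    return f"joke about {topic}"


async def fake_stream(topic: str):
    # Async generator: yields chunks as they become available.
    for word in f"joke about {topic}".split():
        await asyncio.sleep(0.01)
        yield word


async def main():
    single = await fake_model("async")  # comparable to ainvoke
    many = await asyncio.gather(        # comparable to abatch
        *(fake_model(t) for t in ["cats", "dogs"])
    )
    chunks = [c async for c in fake_stream("birds")]  # comparable to astream
    return single, many, chunks


# In a script, asyncio.run starts the event loop.
# In a notebook cell, use `single, many, chunks = await main()` instead.
single, many, chunks = asyncio.run(main())
print(single)
print(many)
print(chunks)
```

A notebook already runs an event loop, which is why the cells in this section use top-level `await` rather than `asyncio.run`.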

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Redefine the LangChain chain (same as Section 1) in case `chain` was overwritten in Section 5
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm | StrOutputParser()

# Top-level `await` works in notebook cells (Colab/Jupyter); a plain script needs asyncio.run(...)

# Async invoke
async def async_invoke():
    return await chain.ainvoke({"topic": "async"})

result = await async_invoke()
print(result)

# Async stream (the "@" separator makes the chunk boundaries visible)
async def async_stream():
    async for chunk in chain.astream({"topic": "stream"}):
        print(chunk, end="@")

await async_stream()

## Section 8: RAG Example with VectorStore and Retriever

Build a Retrieval-Augmented Generation (RAG) chain: embed the documents, retrieve the ones relevant to the question, and add them to the prompt as context.
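
The three steps can be sketched without any libraries. This is a toy illustration only: word counts stand in for real embeddings, and `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helpers, not LangChain APIs.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts. Real embeddings are dense vectors.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = ["I am a superman", "This apple is great"]
index = [(doc, embed(doc)) for doc in docs]  # 1. embed the documents


def retrieve(question: str, k: int = 1):
    # 2. rank documents by similarity to the question
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question: str) -> str:
    # 3. augment the prompt with the retrieved context
    context = "\n".join(retrieve(question))
    return f"Answer based only on this context:\n{context}\nQuestion: {question}"


print(retrieve("Who am I?"))  # ['I am a superman']
print(build_prompt("Who am I?"))
```

The vector store and retriever in the next cell play the roles of `index` and `retrieve`, with a real embedding model in place of word counts.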

In [None]:
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_core.runnables import RunnableMap

# Embeddings (use HuggingFace - free, local, no API key required)
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={'device': 'cpu'},
    encode_kwargs={'normalize_embeddings': True}
)

vectorstore = DocArrayInMemorySearch.from_texts(
    ["I am a superman", "This apple is great"],
    embedding=embeddings
)
retriever = vectorstore.as_retriever()

template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

chain = RunnableMap({
    # retriever.invoke replaces the deprecated get_relevant_documents
    "context": lambda x: retriever.invoke(x["question"]),
    "question": lambda x: x["question"]
}) | prompt | llm | StrOutputParser()

result = chain.invoke({"question": "Who am I?"})
print(result)

## Section 9: Handling Custom Endpoints Without Native Binding

If the endpoint does not support native tool/function calling, fall back to prompt engineering: instruct the model to output a JSON tool call, parse it, and invoke the tool manually.
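
The parse-and-dispatch half of this approach can be sketched with the standard library alone. The reply string below is made up for illustration, and `weather_lookup`, `parse_tool_call`, and `dispatch` are hypothetical helpers rather than LangChain functions.

```python
import json


def weather_lookup(airport_code: str) -> str:
    # Hypothetical tool with canned data.
    return f"Weather for {airport_code}: Sunny"


TOOLS = {"weather_lookup": weather_lookup}


def parse_tool_call(reply: str) -> dict:
    """Extract the JSON object from a reply, tolerating extra text around it."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError(f"No JSON object in reply: {reply!r}")
    return json.loads(reply[start:end + 1])


def dispatch(call: dict) -> str:
    # Look the tool up by name and pass the arguments as keyword arguments.
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name!r}")
    return TOOLS[name](**call.get("arguments", {}))


# A reply such as a model might produce (made up for this example).
reply = 'Sure!\n{"name": "weather_lookup", "arguments": {"airport_code": "SFO"}}'
call = parse_tool_call(reply)
print(dispatch(call))  # Weather for SFO: Sunny
```

Models do not always return clean JSON, so validating the tool name and failing loudly on malformed replies keeps errors easy to diagnose.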

In [None]:
from langchain_core.tools import tool, render_text_description
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.runnables import RunnableLambda

@tool
def weather_search(airport_code: str) -> str:
    """Search for weather given an airport code."""
    return f"Weather for {airport_code}: Sunny, 75°F"

tools = [weather_search]
rendered_tools = render_text_description(tools)

system_prompt = f"""You are an assistant with access to tools.
{rendered_tools}

If a tool is relevant, reply with only a JSON object containing 'name' (the tool name) and 'arguments' (a dictionary of the tool's inputs)."""

prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human", "{input}")
])

parser = JsonOutputParser()
chain = prompt | llm | parser

def invoke_tool(tool_call: dict):
    tool_name = tool_call.get("name")
    tool_args = tool_call.get("arguments", {})
    for t in tools:
        if t.name == tool_name:
            return t.invoke(tool_args)
    raise ValueError(f"Tool not found: {tool_name!r}")

full_chain = chain | RunnableLambda(invoke_tool)
result = full_chain.invoke({"input": "What's the weather at SFO?"})
print(result)