# 03 – Memory & LCEL Basics

**Learning Goals:**
- Understand conversational memory in LangChain
- Compare memory types: Buffer vs Summary
- Master LCEL (LangChain Expression Language) composition
- Build streaming, retry, and fallback patterns

**What we'll cover:**
1. **Memory 101** - Buffer and Summary memory patterns
2. **Memory in Chains** - Inject memory into conversational flows
3. **LCEL Basics** - Compose runnables with `|` operator
4. **Advanced LCEL** - Streaming, retry, fallbacks
5. **Tool Use (Optional)** - Simple tool integration

**Note:** This notebook focuses on fundamentals, not RAG. No ChromaDB or retrieval here.


In [1]:
#  Global Config & Services (using centralized modules)
import sys
import json
from pathlib import Path
from datetime import datetime
from dotenv import load_dotenv

# Add parent directory to path and change to project root
import os

# Get the notebook's current directory and find project root
notebook_dir = Path.cwd()
if notebook_dir.name == "notebooks":
    project_root = notebook_dir.parent
else:
    project_root = notebook_dir

# Change to project root and add to path
os.chdir(project_root)
sys.path.insert(0, str(project_root))

print(f" Working directory: {os.getcwd()}")

from src.services.llm_services import (
    load_config,
    get_llm,
    validate_api_keys,
    print_config_summary
)

# Load environment variables
load_dotenv()

# Load configuration from config.yaml (now we're in project root)
config = load_config("src/config/config.yaml")

# Validate API keys
validate_api_keys(config, verbose=True)

# Print summary
print_config_summary(config)
print(f"  Note: Temperature is {config['temperature']} (good for conversational demos)")


 Working directory: d:\Courses\_Zuu Crew\AI Engineer Essentials\Programming\Week 03
 Config loaded:
  LLM: groq / openai/gpt-oss-120b
  Embeddings: sbert / sentence-transformers/all-MiniLM-L6-v2
  Temperature: 0.2
  Artifacts: ./artifacts
  Note: Temperature is 0.2 (good for conversational demos)


In [2]:
# Initialize LLM using factory from llm_services
llm = get_llm(config)
print(f" LLM initialized: {config['llm_provider']} / {config['llm_model']}")

# Verify API key with test completion
print("\n Testing API connection...")
try:
    test_response = llm.invoke("Say 'API working!' if you can read this.")
    test_msg = test_response.content if hasattr(test_response, 'content') else str(test_response)
    print(f" API key verified: {test_msg[:50]}")
except Exception as e:
    print(f" API key test failed: {e}")
    print("  Please check your .env file and API key configuration.")


 LLM initialized: groq / openai/gpt-oss-120b

 Testing API connection...
 API key verified: API working!


---

## Section A: Memory 101

LangChain provides memory primitives to maintain conversational context across turns.

### 1. ConversationBufferMemory

Stores **full chat history** in memory. Simple but can grow large.


In [6]:
from langchain_classic.memory import ConversationBufferMemory
from langchain_classic.chains import ConversationChain

buffer_memory = ConversationBufferMemory(return_messages=True)

# buffer_memory.save_context(
#     {"input": "Hello!, My name is Sahas."},
#     {"output": "Hello Sahas! Nice to meet you."}
# )

# buffer_memory.save_context(
#     {"input": "What is my name?"},
#     {"output": "Your name is Sahas."}
# )

# print(buffer_memory.load_memory_variables({}))  # {'history': ...}

conversation = ConversationChain(llm=llm,
                                memory=buffer_memory,
                                verbose=False)



In [10]:
while True:
    user_input = input("\nYou: ")
    if user_input.lower() in ["exit", "quit"]:
        print("Exiting chat.")
        break

    response = conversation.predict(input=user_input)

    print(f"Human: {user_input}")
    print(f"Bot: {response}\n")

    
    

Human: Hello
Bot: Hey there, Sahas! 👋 Great to hear from you again. How’s everything going on your side of the world?

Since you mentioned you’re from Sri Lanka, I’ve been thinking about a few more cool things you might enjoy:

- **Cultural gems:** Have you ever visited the ancient city of **Anuradhapura**? The massive stone stupas (like Ruwanwelisaya) and the sacred Bo tree are truly awe‑inspiring.
- **Food adventure:** If you haven’t tried **pol sambol** yet, it’s a fiery coconut relish that pairs perfectly with rice and curry—or even just a fresh roti.
- **Nature escape:** The **Horton Plains** plateau offers the famous “World’s End” cliff—an 870‑meter drop with breathtaking sunrise views.
- **Music & dance:** The rhythmic beats of **baila** and the graceful movements of **kandyan dance** are such vibrant expressions of Sri Lankan heritage.

Anything exciting happening today? A project you’re working on, a favorite spot you love to visit, or maybe 

In [11]:
buffer_memory.load_memory_variables({})

{'history': [HumanMessage(content='Hello', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Hello! 👋 It’s great to meet you. I’m an AI language model created by OpenAI—think of me as a very well‑read, endlessly curious conversational partner. A few quick facts about me:\n\n- **Model family:** GPT‑4‑Turbo (the latest, most efficient version of the GPT‑4 series).  \n- **Training cut‑off:** I was trained on data up to June\u202f2024, so I’m familiar with events, scientific advances, pop culture, and internet trends up to that point.  \n- **Current date:** It’s January\u202f30,\u202f2026 where I’m “living” in the digital realm.  \n- **Personality:** I aim to be friendly, talkative, and detail‑oriented. If I ever don’t know something, I’ll let you know honestly.\n\nI’m here to chat, answer questions, brainstorm ideas, help with writing, explain concepts, or just share a fun fact—whatever you need. How can I assist you today?', addition

### 2. ConversationSummaryMemory

Instead of storing full history, **summarizes** past conversation using an LLM. Reduces token usage but may lose details.


In [14]:
from langchain_classic.memory import ConversationSummaryMemory

# Create summary memory (requires LLM)
summary_memory = ConversationSummaryMemory(llm=llm, return_messages=True)

# Simulate same conversation
summary_memory.save_context(
    {"input": "Hi, my name is Alice."},
    {"output": "Hello Alice! Nice to meet you."}
)
summary_memory.save_context(
    {"input": "What's my name?"},
    {"output": "Your name is Alice."}
)
summary_memory.save_context(
    {"input": "What's the capital of France?"},
    {"output": "The capital of France is Paris."}
)

# View summarized history
print(" Summary Memory:")
print(summary_memory.load_memory_variables({}))
print(f"\n Summary is more compact than full buffer")




 Summary Memory:
{'history': [SystemMessage(content='The human introduces themselves as Alice, and the AI greets Alice in response. The human then asks for their name, and the AI confirms that the human’s name is Alice. The human then asks for the capital of France, and the AI replies that it is Paris.', additional_kwargs={}, response_metadata={})]}

 Summary is more compact than full buffer


### Trade-offs: Buffer vs Summary

| Memory Type | Pros | Cons |
|-------------|------|------|
| **Buffer** | Full detail, no LLM calls | Grows unbounded, context limits |
| **Summary** | Compact, scalable | LLM calls needed, possible drift |

**When to use:**
- **Buffer**: Short conversations, need exact history
- **Summary**: Long conversations, want cost efficiency
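
The trade-off in the table can be made concrete with a small plain-Python sketch. It does not use LangChain: `BufferStore`, `SummaryStore`, and `fake_summarize` are hypothetical stand-ins, and the "summary" is only a truncation so the example runs without an LLM. A real summary memory would call the model on every save, which is where the extra cost and the possible drift come from.

```python
# Plain-Python illustration of buffer vs summary growth (not LangChain code).
from typing import List


def fake_summarize(text: str, max_chars: int = 80) -> str:
    """Stand-in for an LLM summarizer: keeps only the last max_chars characters."""
    return text[-max_chars:]


class BufferStore:
    """Keeps every turn verbatim, like a buffer memory."""

    def __init__(self) -> None:
        self.turns: List[str] = []

    def save(self, user: str, ai: str) -> None:
        self.turns.append(f"Human: {user}\nAI: {ai}")

    def load(self) -> str:
        return "\n".join(self.turns)


class SummaryStore:
    """Keeps one rolling summary that is rewritten after each turn."""

    def __init__(self, max_chars: int = 80) -> None:
        self.summary = ""
        self.max_chars = max_chars

    def save(self, user: str, ai: str) -> None:
        # A real summary memory would call the LLM here (one extra call per turn).
        self.summary = fake_summarize(
            f"{self.summary} Human: {user} AI: {ai}", self.max_chars
        )

    def load(self) -> str:
        return self.summary


buffer_store, summary_store = BufferStore(), SummaryStore()
for turn in range(1, 21):
    buffer_store.save(f"question {turn}", f"answer {turn}")
    summary_store.save(f"question {turn}", f"answer {turn}")

buffer_size = len(buffer_store.load())
summary_size = len(summary_store.load())
print(f"buffer: {buffer_size} chars, summary: {summary_size} chars")
```

After 20 turns the buffer still contains the first exchange word for word, while the rolling summary stays within its size cap and has lost the early turns.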


---

## Section B: Memory in Chains

Let's inject memory into a simple conversational chain.
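
Conceptually, a chain with memory repeats the same loop on every turn: load the stored history, render it into the prompt, call the model, then save the new exchange. The plain-Python sketch below mimics that loop. `echo_llm` and `ToyConversation` are hypothetical stand-ins (no LangChain, no API call), and the prompt layout is an assumption, not the template `ConversationChain` actually uses.

```python
from typing import Callable, List


def echo_llm(prompt: str) -> str:
    """Stand-in model: reports how many earlier human turns appear in the prompt."""
    seen = prompt.count("Human:") - 1  # subtract the current turn
    return f"I can see {seen} earlier turn(s)."


class ToyConversation:
    """Load history -> build prompt -> call model -> save the exchange."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.history: List[str] = []

    def predict(self, user_input: str) -> str:
        history_text = "\n".join(self.history)                # 1. load memory
        prompt = f"{history_text}\nHuman: {user_input}\nAI:"   # 2. inject it into the prompt
        reply = self.llm(prompt)                               # 3. call the model
        self.history.append(f"Human: {user_input}\nAI: {reply}")  # 4. save the new turn
        return reply

    def clear(self) -> None:
        """Forget everything, as memory.clear() does for a real memory object."""
        self.history.clear()


chat = ToyConversation(echo_llm)
first = chat.predict("Hi, I'm Bob.")
second = chat.predict("What's my name?")
chat.clear()
after_clear = chat.predict("What's my name?")
print(first, second, after_clear, sep="\n")
```

The model itself is stateless in this sketch; it only "remembers" what the wrapper puts back into the prompt, which is why clearing the history makes the third call look like a first turn again.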


In [16]:
from langchain_classic.chains import ConversationChain
from langchain_classic.memory import ConversationBufferMemory

# Create a conversational chain with memory
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=False,  # Set to True to see internal prompts
)

# Multi-turn conversation
print("  Conversational Chain with Memory\n")

response1 = conversation.predict(input="Hi, I'm Bob and I love Python programming.")
print(f"User: Hi, I'm Bob and I love Python programming.")
print(f"AI: {response1}\n")

response2 = conversation.predict(input="What's my name?")
print(f"User: What's my name?")
print(f"AI: {response2}\n")

response3 = conversation.predict(input="What do I love?")
print(f"User: What do I love?")
print(f"AI: {response3}\n")

# View memory
print(" Stored Memory:")
print(memory.load_memory_variables({}))


  Conversational Chain with Memory

User: Hi, I'm Bob and I love Python programming.
AI: Hey Bob! Great to meet you 👋—I’m always excited to chat with fellow Python enthusiasts.

A little about me: I’ve been trained on a massive corpus of Python code, tutorials, and community discussions, so I can dive into everything from the basics of list comprehensions to the intricacies of async I/O and metaprogramming. I’ve even “read” the source code of popular libraries like **NumPy**, **Pandas**, **FastAPI**, and **TensorFlow**, so I can help you troubleshoot or explore new features.

Since you love Python, I’m curious—what kind of projects are you working on or dreaming about? Some fun directions people often go in include:

| Area | Typical Libraries / Tools | Cool Project Ideas |
|------|---------------------------|--------------------|
| **Data Science / Machine Learning** | `pandas`, `scikit‑learn`, `tensorflow`, `pytorch` | Build a model that predicts your favorite 

### Resetting Memory

Between sessions, clear memory to start fresh.


In [17]:
# Clear memory
memory.clear()

response4 = conversation.predict(input="What's my name?")
print(f"After clearing memory:")
print(f"User: What's my name?")
print(f"AI: {response4}")
print(f"\n Memory reset - AI no longer remembers Bob")


After clearing memory:
User: What's my name?
AI: I’m afraid I don’t actually know your name—my system doesn’t have any personal data about you unless you share it with me. If you’d like, feel free to tell me what you’d like to be called, and I’ll gladly use it from now on!

 Memory reset - AI no longer remembers Bob


---

## Section C: LCEL (LangChain Expression Language) Basics

LCEL is a declarative way to compose LangChain components using the `|` operator.

### Core Concepts

1. **Runnable**: Base interface for all LCEL components
2. **Pipe (`|`)**: Chain runnables together
3. **RunnablePassthrough**: Pass data through unchanged
4. **RunnableParallel** (also exported as `RunnableMap`): Apply multiple operations in parallel to the same input
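
To see why a chain written with `|` reads left to right, it can help to build a toy version of the pipe. The sketch below is plain Python, not LangChain's implementation: the `Step` class and the three stand-in steps are hypothetical, and real runnables additionally support batching, streaming, and async execution.

```python
from typing import Any, Callable


class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for a prompt template, a model, and an output parser.
format_prompt = Step(lambda inputs: f"Q: {inputs['question']}")
fake_model = Step(lambda prompt: {"content": prompt.upper()})
parse_output = Step(lambda message: message["content"])

toy_chain = format_prompt | fake_model | parse_output
toy_result = toy_chain.invoke({"question": "what is lcel?"})
print(toy_result)
```

Each `|` returns another `Step`, so a composed chain exposes the same `invoke` method as its parts. That uniform interface is what makes the pieces interchangeable.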

### Simple LCEL Chain

Let's build: `ChatPromptTemplate | LLM | StrOutputParser`


In [19]:
from langchain_classic.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Define prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Answer concisely."),
    ("human", "{question}")
])

# Build LCEL chain
chain = prompt | llm | StrOutputParser()

# Invoke
response = chain.invoke({"question": "What is eczema and how is it treated?"})
print(" Simple LCEL Chain:")
print(f"Question: What is eczema and how is it treated?")
print(f"Answer: {response}")


 Simple LCEL Chain:
Question: What is eczema and how is it treated?
Answer: **Eczema (atopic dermatitis)** is a chronic, inflammatory skin condition characterized by dry, itchy patches that may become red, scaly, cracked, or weepy. It often runs in families with a history of allergies, asthma, or hay fever.

### Key Features
| Aspect | Details |
|--------|---------|
| **Typical sites** | Flexural areas (inside elbows/knees), neck, face, hands, feet |
| **Symptoms** | Intense itching, redness, swelling, crusting, thickened skin (lichenification) |
| **Triggers** | Irritants (soaps, detergents), allergens (dust mites, pet dander), stress, temperature extremes, sweat, certain fabrics |

### Treatment Overview
| Goal | Options |
|------|----------|
| **Restore skin barrier** | • **Moisturizers** (thick ointments or creams) applied 2–3 × daily; **emollient‑rich** products containing ceramides, petrolatum, or hyaluronic acid.<br>• **Bathing strategy** – lukewarm water, short b

### RunnablePassthrough & RunnableMap

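`RunnablePassthrough` forwards its input unchanged. A dict of runnables is coerced to a `RunnableParallel` (also exported as `RunnableMap`), which runs every entry on the same input and returns a dict of the results.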
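
One detail is easy to miss: every branch of a parallel map receives the same, complete input. The plain-Python sketch below imitates that behavior with a hypothetical `run_parallel` helper (a thread pool stands in for LangChain's scheduling) and contrasts a passthrough branch with one that selects a single field using `operator.itemgetter`.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import itemgetter
from typing import Any, Callable, Dict


def passthrough(value: Any) -> Any:
    """Return the input unchanged."""
    return value


def run_parallel(branches: Dict[str, Callable[[Any], Any]], value: Any) -> Dict[str, Any]:
    """Call every branch with the same input and collect the results by key."""
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(func, value) for key, func in branches.items()}
        return {key: future.result() for key, future in futures.items()}


inputs = {"context": "ML learns from data.", "question": "What is ML?"}

# Passthrough forwards the entire dict to each key.
whole = run_parallel({"context": passthrough, "question": passthrough}, inputs)

# itemgetter selects just the field each key needs.
picked = run_parallel(
    {"context": itemgetter("context"), "question": itemgetter("question")}, inputs
)

print(whole["context"])   # the full input dict
print(picked["context"])  # only the context string
```

When the input is a dict with several fields, selecting the field per branch keeps each prompt placeholder limited to the text meant for it.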


In [20]:
from operator import itemgetter

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel

# 1. Prompt template with two placeholders: {context} and {question}
context_prompt = ChatPromptTemplate.from_template(
    """Use the context to answer the question.

Context: {context}
Question: {question}

Answer:"""
)

# 2. Build an LCEL chain using the | (pipe) operator.
#    Every branch of RunnableParallel receives the whole input dict, so
#    itemgetter picks out the field each placeholder needs.
#    (RunnablePassthrough() here would forward the entire dict to both keys.)
chain_with_context = (
    RunnableParallel({
        "context": itemgetter("context"),    # Select the context field
        "question": itemgetter("question"),  # Select the question field
    })
    | context_prompt     # Format the prompt
    | llm                # Generate answer
    | StrOutputParser()  # Extract string from response
)

# 3. Test the chain
result = chain_with_context.invoke({
    "context": "Machine learning is a field of AI that learns from data.",
    "question": "What is ML?"
})
result



'Machine learning (ML) is a branch of artificial intelligence that enables computers to learn patterns and make decisions directly from data.'

---

## Section D: Advanced LCEL Patterns

### 1. Streaming

Stream tokens as they're generated for better UX.
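
The idea behind streaming can be sketched without an LLM: the producer yields small chunks as soon as they are ready, and the consumer displays each one immediately instead of waiting for the full string. `fake_token_stream` below is a hypothetical generator standing in for a model, and its word-sized chunks are arbitrary (real token boundaries often fall inside words).

```python
import time
from typing import Iterator, List


def fake_token_stream(text: str, delay: float = 0.0) -> Iterator[str]:
    """Yield the text word by word, the way a model yields tokens."""
    for word in text.split(" "):
        time.sleep(delay)  # simulated generation latency per chunk
        yield word + " "


answer = "RAG retrieves documents and feeds them to the model"

received: List[str] = []
for chunk in fake_token_stream(answer):
    print(chunk, end="", flush=True)  # show each chunk as it arrives
    received.append(chunk)
print()

# Joining the chunks reproduces what a non-streaming call would return.
full_text = "".join(received).strip()
```

Raising `delay` (for example to `0.1`) makes the incremental display visible in a terminal.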


In [28]:
print("Without Streaming response:")
print(chain.invoke({"question": "Explain RAG in one sentence."}))

print("Streaming response:")
print("Answer: ", end="")

# end="." prints a dot after every chunk so the chunk boundaries are visible;
# use end="" to display the answer as normal text.
for chunk in chain.stream({"question": "Explain RAG in one sentence."}):
    print(chunk, end=".", flush=True)

Without Streaming response:
Retrieval‑Augmented Generation (RAG) is a technique that combines a language model with a searchable external knowledge base, retrieving relevant documents at inference time and feeding them into the model to produce more accurate, up‑to‑date, and grounded responses.
Streaming response:
Answer: ..............................................Retr.ieval.‑.Aug.mented. Generation. (.R.AG.). is. a. technique. that. combines. a. language. model. with. a. searchable. external. knowledge. base.,. retrieving. relevant. documents. at. inference. time. to. inform. and. improve. the. model.’s. generated. responses.....

### 2. Retry with Fallback

Use `.with_retry()` for automatic retries and `.with_fallbacks()` for fallback models.
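
The two patterns differ in what they repeat: a retry calls the same component again, while a fallback moves on to a different component once the primary has failed. The plain-Python sketch below illustrates that control flow with hypothetical helpers (`with_retry`, `with_fallbacks`, and three fake models). It is not LangChain's implementation, and it omits the backoff delay and error filtering a production retry would normally include.

```python
from typing import Callable, List

Model = Callable[[str], str]
calls = {"flaky": 0, "broken": 0}


def flaky_model(prompt: str) -> str:
    """Fails twice, then succeeds (simulates a transient error)."""
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("temporary failure")
    return f"flaky answered: {prompt}"


def broken_model(prompt: str) -> str:
    """Always fails (simulates a provider outage)."""
    calls["broken"] += 1
    raise ConnectionError("provider down")


def backup_model(prompt: str) -> str:
    return f"backup answered: {prompt}"


def with_retry(func: Model, stop_after_attempt: int) -> Model:
    """Call the same function again after a failure, up to stop_after_attempt times."""
    def wrapper(prompt: str) -> str:
        last_error: Exception = RuntimeError("stop_after_attempt must be at least 1")
        for _ in range(stop_after_attempt):
            try:
                return func(prompt)
            except Exception as error:
                last_error = error
        raise last_error  # out of attempts: surface the last error
    return wrapper


def with_fallbacks(func: Model, fallbacks: List[Model]) -> Model:
    """Try the primary function first, then each fallback in order."""
    def wrapper(prompt: str) -> str:
        last_error: Exception = RuntimeError("no candidates")
        for candidate in [func, *fallbacks]:
            try:
                return candidate(prompt)
            except Exception as error:
                last_error = error
        raise last_error
    return wrapper


retry_result = with_retry(flaky_model, stop_after_attempt=3)("ping")
fallback_result = with_fallbacks(
    with_retry(broken_model, stop_after_attempt=3), [backup_model]
)("ping")
print(retry_result)
print(fallback_result)
```

Wrapping the retried primary inside the fallback, as in the last call, means the backup is only consulted after the primary has used up all of its attempts.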


In [29]:
# Add retry logic (max 3 attempts)
chain_with_retry = chain.with_retry(stop_after_attempt=3)

print(" Chain with retry enabled")
print("If LLM fails, will retry up to 3 times")

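# Fallback example (requires a second LLM)
# Uncomment if you have multiple providers configured (needs the langchain-openai package)
# from langchain_openai import ChatOpenAI
# fallback_llm = ChatOpenAI(model="gpt-3.5-turbo")
# fallback_chain = chain.first | fallback_llm | StrOutputParser()  # chain.first is the prompt step of `chain`
# chain_with_fallback = chain.with_fallbacks([fallback_chain])
# print(" Fallback chain: tries primary LLM, falls back to gpt-3.5-turbo")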

print(" Retry pattern configured")


 Chain with retry enabled
If LLM fails, will retry up to 3 times
 Retry pattern configured


---

## Section E: Minimal Tool Use (Optional)

LCEL can invoke tools. Here's a simple example with a current time tool.


In [None]:
from langchain.tools import tool
from langchain_core.runnables import RunnableLambda

### START CODE HERE ### (≈ 25-30 lines)
# YOUR CODE HERE
# HINTS:
# Part 1: Create a tool
# 1. Use the @tool decorator
# 2. Define function: def get_current_time() -> str:
# 3. Add docstring: """Returns the current time in ISO format."""
# 4. Import datetime: from datetime import datetime
# 5. Return: datetime.now().isoformat()
#
# Part 2: Test the tool
# 6. Print " Tool Demo:"
# 7. Invoke the tool: get_current_time.invoke({})
# 8. Print the result with label "Current time:"
#
# Part 3: Build a tool chain
# 9. Create a ChatPromptTemplate with:
#    - Placeholders: {question} and {tool_result}
#    - Instructions mentioning the tool
#    - Store in: tool_prompt
#
# 10. Build an LCEL chain:
#     tool_chain = (
#         RunnableParallel({
#             "question": RunnablePassthrough(),
#             "tool_result": RunnableLambda(lambda x: get_current_time.invoke({}))
#         })
#         | tool_prompt
#         | llm
#         | StrOutputParser()
#     )
#
# 11. Test with: tool_chain.invoke("What time is it?")  # RunnablePassthrough forwards this string as {question}
# 12. Print the answer with label " Tool Chain Result:"

raise NotImplementedError("Complete the tool integration exercise")
### END CODE HERE ###


🔧 Tool Demo:
Current time: 2026-01-11T08:33:26.464479

🔗 Tool Chain Result:
Answer: The time is 08:33 on January 11, 2026.


---

## Save Manifest


In [30]:
manifests_dir = Path(config["artifacts_root"]) / "manifests"
manifests_dir.mkdir(parents=True, exist_ok=True)

manifest = {
    "notebook": "12_memory_lcel_basics",
    "topics": [
        "ConversationBufferMemory",
        "ConversationSummaryMemory",
        "ConversationChain with memory",
        "LCEL composition",
        "Streaming",
        "Retry patterns",
        "Tool use"
    ],
    "llm_provider": config["llm_provider"],
    "llm_model": config["llm_model"],
    "created_at": datetime.now().isoformat(),
}

manifest_path = manifests_dir / "memory_lcel.json"
with open(manifest_path, "w") as f:
    json.dump(manifest, f, indent=2)

print(f" Manifest saved: {manifest_path}")


 Manifest saved: artifacts\manifests\memory_lcel.json


---

## Summary

**What we learned:**

### Memory
-  **Buffer Memory**: Stores full history (simple but grows)
-  **Summary Memory**: LLM-summarized history (compact but may drift)
-  **Memory in Chains**: Inject context into conversational flows
-  **Reset/Clear**: Start fresh between sessions

### LCEL
-  **Composition**: Use `|` to chain runnables
-  **RunnablePassthrough**: Pass data unchanged
-  **RunnableParallel**: Run operations in parallel
-  **Streaming**: Token-by-token generation
-  **Retry**: Automatic retries on failure
-  **Fallbacks**: Switch to backup LLM
-  **Tools**: Integrate external functions

**Key Patterns:**
```python
# Simple chain
chain = prompt | llm | parser

# With context
chain = RunnableParallel({...}) | prompt | llm | parser

# With retry
chain = chain.with_retry(stop_after_attempt=3)

# With streaming
for chunk in chain.stream(input):
    print(chunk)
```

**Artifacts:**
- `./artifacts/manifests/memory_lcel.json`
