# LangChain Fundamentals: Building Your First AI Chains

Welcome to LangChain! This notebook will teach you the fundamentals of building AI applications using LangChain Expression Language (LCEL) and local LLMs.

## 🎯 Learning Objectives

By the end of this notebook, you'll understand:
- **LangChain Core Concepts**: Prompts, Models, Output Parsers
- **LCEL Syntax**: Building chains with the pipe operator
- **Local LLM Integration**: Working with Ollama models
- **Chain Composition**: Creating complex multi-step workflows
- **Practical Applications**: Real-world AI chain examples

## 📚 What is LangChain?

LangChain is a framework for developing applications powered by language models. It provides:

- **🔗 Chain Composition**: Connect multiple AI components
- **🧠 Model Abstraction**: Work with different LLMs uniformly
- **📝 Prompt Management**: Structured prompt templates
- **🔄 Memory & State**: Maintain context across interactions
- **🔌 Integrations**: Connect to databases, APIs, and tools

## ⚡ LCEL (LangChain Expression Language)

LCEL is LangChain's declarative way to compose chains. The pipe operator (`|`) passes the output of each component to the next one as its input:

```python
# Basic pattern
chain = prompt | model | output_parser

# More complex
chain = input_formatter | prompt | model | output_parser | post_processor
```
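
To build intuition for what the pipe operator does, here is a toy sketch in plain Python. The `Step` class and the three fake components are hypothetical and are not LangChain's implementation; they only illustrate that `a | b` builds a new object whose `invoke` feeds the output of `a` into `b`.

```python
class Step:
    """Hypothetical stand-in for an LCEL component (not a LangChain class)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b returns a new Step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Fake components standing in for a prompt, a model, and an output parser
toy_prompt = Step(lambda inputs: f"Explain {inputs['topic']} in simple terms.")
toy_model = Step(lambda text: {"content": f"[model reply to: {text}]"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_model | toy_parser
toy_result = toy_chain.invoke({"topic": "overfitting"})
print(toy_result)
```

Because `|` is evaluated left to right, the data flows in the same order the chain is written.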

## 🛠️ Environment Setup

Let's start by setting up our LangChain environment and verifying everything works.

In [None]:
# Install required packages (run this if needed)
# !pip install langchain langchain-community langchain-core

import subprocess
import sys
import os

def setup_langchain_environment():
    """Set up and verify LangChain environment"""
    print("🔧 LANGCHAIN ENVIRONMENT SETUP")
    print("=" * 35)
    
    # Check Python version
    python_version = sys.version_info
    print(f"🐍 Python: {python_version.major}.{python_version.minor}.{python_version.micro}")
    
    # Check required packages
    required_packages = {
        'langchain': 'Core LangChain framework',
        'langchain_community': 'Community integrations',
        'langchain_core': 'Core abstractions'
    }
    
    print("\n📦 Checking LangChain packages...")
    for package, description in required_packages.items():
        try:
            module = __import__(package)
            version = getattr(module, '__version__', 'unknown')
            print(f"   ✅ {package} v{version} - {description}")
        except ImportError:
            print(f"   ❌ {package} - MISSING ({description})")
            print(f"      Install with: pip install {package.replace('_', '-')}")
    
    # Check Ollama
    print("\n🧠 Checking Ollama...")
    try:
        result = subprocess.run(['ollama', '--version'], 
                              capture_output=True, text=True, timeout=5)
        if result.returncode == 0:
            # `ollama --version` only confirms the CLI is installed, not that the server is running
            print("   ✅ Ollama CLI installed")
            
            # Check for available models
            models_result = subprocess.run(['ollama', 'list'], 
                                         capture_output=True, text=True, timeout=10)
            if models_result.returncode == 0:
                models = models_result.stdout.strip().split('\n')[1:]  # Skip header
                if models and models[0]:
                    print(f"   📋 Available models: {len(models)}")
                    for model in models[:3]:  # Show first 3
                        model_name = model.split()[0] if model.strip() else 'Unknown'
                        print(f"      • {model_name}")
                else:
                    print("   ⚠️  No models downloaded")
                    print("      Download with: ollama pull llama3.2")
        else:
            print("   ❌ Ollama command failed")
    except (subprocess.TimeoutExpired, FileNotFoundError):
        print("   ❌ Ollama not found")
        print("      Install from: https://ollama.com")
        print("      Required for local LLM exercises")
    
    print("\n🎯 Environment setup complete!")
    return True

setup_langchain_environment()

## 1. LangChain Core Components

Let's explore the three fundamental building blocks of LangChain applications.

In [None]:
# Import core LangChain components
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
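# Note: newer LangChain releases provide Ollama integrations in the separate
# langchain-ollama package; the community imports below may emit deprecation warnings
from langchain_community.chat_models import ChatOllama
from langchain_community.llms import Ollama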

print("📚 Core LangChain Components")
print("=" * 30)

print("\n🔧 1. PROMPTS - Structure your inputs")
print("   • ChatPromptTemplate: For chat-based models")
print("   • PromptTemplate: For completion models")
print("   • Custom templates with variables")

print("\n🧠 2. MODELS - The AI brain")
print("   • ChatOllama: Local chat models")
print("   • Ollama: Local completion models")
print("   • Consistent interface across providers")

print("\n📤 3. OUTPUT PARSERS - Format responses")
print("   • StrOutputParser: Plain text output")
print("   • JsonOutputParser: Structured JSON")
print("   • Custom parsers for specific formats")

print("\n✅ Core components imported successfully!")

### 1.1 Creating Your First Prompt Template

Prompt templates help you create reusable, structured prompts with variables.
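
The variables in a template are ordinary `{name}` placeholders in the style of Python's `str.format`. As a rough sketch in plain Python (the helper `template_variables` is hypothetical, not a LangChain function), the placeholder names can be read off a template with the standard library's `string.Formatter`:

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the sorted, de-duplicated placeholder names in a format string."""
    names = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(names)


demo_template = "You are a helpful {role}. Explain {concept} to a {role} audience."
demo_variables = template_variables(demo_template)
print(demo_variables)

# Filling the template is then an ordinary str.format call
print(demo_template.format(role="data scientist", concept="overfitting"))
```

A repeated placeholder such as `{role}` counts as one variable, which is why a template needs each value only once.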

In [None]:
print("📝 Creating Prompt Templates")
print("=" * 28)

# 1. Simple prompt template
simple_prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in simple terms."
)

print("🔹 Simple Prompt Template:")
print(f"   Template: {simple_prompt.template}")
print(f"   Variables: {simple_prompt.input_variables}")

# Test the template
formatted_prompt = simple_prompt.format(topic="machine learning")
print(f"   Example: {formatted_prompt}")

# 2. Chat prompt template (more sophisticated)
chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful {role} with expertise in {domain}."),
    ("human", "Please explain {concept} and provide a practical example."),
])

print("\n💬 Chat Prompt Template:")
print(f"   Messages: {len(chat_prompt.messages)}")
print(f"   Variables: {chat_prompt.input_variables}")

# Test the chat template
formatted_chat = chat_prompt.format_messages(
    role="data scientist",
    domain="machine learning",
    concept="overfitting"
)

print("   Example messages:")
for msg in formatted_chat:
    print(f"     {msg.type}: {msg.content}")

# 3. Advanced template with multiple variables and formatting
analysis_prompt = ChatPromptTemplate.from_template(
    """You are an expert data analyst. 
    
Dataset: {dataset_name}
Size: {num_rows} rows, {num_cols} columns
Task: {analysis_task}

Please provide a {analysis_type} analysis and {num_insights} key insights.
Format your response as:
1. Overview
2. Key Findings
3. Recommendations"""
)

print("\n📊 Advanced Analysis Template:")
print(f"   Variables: {analysis_prompt.input_variables}")

# Test advanced template
test_analysis = analysis_prompt.format(
    dataset_name="Customer Churn Data",
    num_rows=10000,
    num_cols=15,
    analysis_task="predict customer churn",
    analysis_type="statistical",
    num_insights=3
)

print(f"   Example (first 200 chars): {test_analysis[:200]}...")

print("\n✅ Prompt templates created successfully!")

### 1.2 Setting Up Local LLM Models

Let's connect to Ollama and test different local models.
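
Before creating model objects, it can help to confirm that the Ollama server itself is reachable. The sketch below assumes Ollama's default local address (`http://localhost:11434`) and its `/api/tags` endpoint for listing models; the helper name `list_ollama_models` is hypothetical.

```python
import json
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return model names from a running Ollama server, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers bad JSON
        return None
    if not isinstance(payload, dict):
        return None
    return [m.get("name", "unknown") for m in payload.get("models", [])]


models_found = list_ollama_models()
if models_found is None:
    print("Ollama server not reachable; try starting it with `ollama serve`")
else:
    print(f"Ollama is running with {len(models_found)} model(s): {models_found}")
```

An empty list means the server is up but no models have been pulled yet.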

In [None]:
print("🧠 Setting Up Local LLM Models")
print("=" * 32)

# Model configurations for different use cases
model_configs = {
    "fast": {
        "model": "llama3.2:1b",
        "temperature": 0.1,
        "description": "Fast, lightweight model for quick responses"
    },
    "balanced": {
        "model": "llama3.2",
        "temperature": 0.3,
        "description": "Balanced model for general use"
    },
    "creative": {
        "model": "llama3.2",
        "temperature": 0.7,
        "description": "Higher temperature for creative tasks"
    }
}

# Initialize models
models = {}
available_models = []

for config_name, config in model_configs.items():
    try:
        model = ChatOllama(
            model=config["model"],
            temperature=config["temperature"]
        )
        
        # Test the model with a simple query
        test_response = model.invoke([HumanMessage(content="Hello! Respond with just 'OK' if you're working.")])
        
        models[config_name] = model
        available_models.append(config_name)
        
        print(f"   ✅ {config_name}: {config['model']} (temp: {config['temperature']})")
        print(f"      {config['description']}")
        print(f"      Test: {test_response.content}")
        
    except Exception as e:
        print(f"   ❌ {config_name}: {config['model']} - Failed")
        print(f"      Error: {str(e)}")
        print(f"      Try: ollama pull {config['model']}")

if available_models:
    print(f"\n🎯 Available models: {', '.join(available_models)}")
    # Use the first available model as default
    default_model = models[available_models[0]]
    print(f"   Using '{available_models[0]}' as default")
else:
    print("\n⚠️  No models available. Please install Ollama and download a model:")
    print("     1. Install Ollama: https://ollama.com")
    print("     2. Download model: ollama pull llama3.2")
    print("     3. Restart this notebook")
    default_model = None

print("\n✅ Model setup complete!")

### 1.3 Output Parsers and Response Formatting

Output parsers help you structure and validate LLM responses.
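
To make "structure and validate" concrete, here is a minimal plain-Python sketch of what a structured parser has to do: find the JSON in a chatty response, load it, and check the fields. The `Analysis` dataclass, the `parse_analysis` helper, and the sample response text are all made up for illustration; this is not how `JsonOutputParser` is implemented.

```python
import json
from dataclasses import dataclass


@dataclass
class Analysis:
    """Hypothetical target structure for a parsed response."""
    summary: str
    confidence: float


def parse_analysis(text: str) -> Analysis:
    """Parse the JSON object spanning the first '{' to the last '}' in text."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in response")
    data = json.loads(text[start:end + 1])
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Analysis(summary=str(data["summary"]), confidence=confidence)


# Simulated model output: JSON wrapped in conversational text
raw_response = 'Sure! Here it is: {"summary": "Churn is rising", "confidence": 0.8} Hope that helps.'
parsed_analysis = parse_analysis(raw_response)
print(parsed_analysis)
```

Raising on malformed or out-of-range output lets the calling code decide whether to retry or fall back.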

In [None]:
print("📤 Output Parsers and Response Formatting")
print("=" * 42)

# 1. String Output Parser (most common)
str_parser = StrOutputParser()
print("🔹 String Parser:")
print("   Converts AI message to plain string")

# 2. JSON Output Parser (for structured data)
# Import from pydantic directly (assumes Pydantic v2); the
# langchain_core.pydantic_v1 compatibility shim is deprecated
from pydantic import BaseModel, Field

class DataAnalysis(BaseModel):
    """Structured data analysis output"""
    summary: str = Field(description="Brief summary of findings")
    insights: list = Field(description="List of key insights")
    confidence: float = Field(description="Confidence score 0-1")
    recommendations: list = Field(description="List of recommendations")

json_parser = JsonOutputParser(pydantic_object=DataAnalysis)

print("\n🔹 JSON Parser:")
print("   Parses response into structured format")
print(f"   Schema: {list(DataAnalysis.model_fields.keys())}")

# 3. Custom parser example
class ListOutputParser:
    """Custom parser to extract numbered lists"""
    
    def parse(self, text: str) -> list:
        """Extract numbered list items from text"""
        import re
        
        # Find numbered items (1. item, 2. item, etc.)
        pattern = r'\d+\.\s*(.+?)(?=\n\d+\.|$)'
        matches = re.findall(pattern, text, re.DOTALL)
        
        # Clean up the matches
        return [match.strip() for match in matches]

list_parser = ListOutputParser()

print("\n🔹 Custom List Parser:")
print("   Extracts numbered lists from responses")

# Test parsers with sample responses
if available_models:
    print("\n🧪 Testing parsers...")
    
    # Test string parser
    test_prompt = "List 3 benefits of machine learning in one sentence each."
    try:
        response = default_model.invoke([HumanMessage(content=test_prompt)])
        # invoke() accepts the AIMessage and returns its text; parse() expects a plain str
        parsed_string = str_parser.invoke(response)
        
        print(f"   Original response type: {type(response)}")
        print(f"   Parsed string type: {type(parsed_string)}")
        print(f"   Content preview: {parsed_string[:100]}...")
        
        # Test custom list parser
        parsed_list = list_parser.parse(parsed_string)
        print(f"   Extracted list items: {len(parsed_list)}")
        for i, item in enumerate(parsed_list[:3], 1):
            print(f"     {i}. {item[:50]}...")
            
    except Exception as e:
        print(f"   Parser test failed: {e}")

print("\n✅ Output parsers configured!")

## 2. Building Your First LCEL Chains

Now let's put it all together using LangChain Expression Language (LCEL)!

### 2.1 Simple Chain: prompt | model | output_parser

The most basic LCEL pattern connects three components with the pipe operator.

In [None]:
print("🔗 Building Simple LCEL Chains")
print("=" * 30)

if not available_models:
    print("⚠️  Skipping chain examples - no models available")
else:
    # Chain 1: Basic explanation chain
    explanation_prompt = ChatPromptTemplate.from_template(
        "Explain {concept} in simple terms with a practical example."
    )
    
    explanation_chain = explanation_prompt | default_model | str_parser
    
    print("🔹 Explanation Chain: prompt | model | str_parser")
    print("   Input: concept name")
    print("   Output: simple explanation")
    
    # Test the chain
    try:
        result = explanation_chain.invoke({"concept": "neural networks"})
        print(f"\n   Test Result: {result[:150]}...")
    except Exception as e:
        print(f"   Test failed: {e}")
    
    # Chain 2: Data analysis chain
    analysis_prompt = ChatPromptTemplate.from_template(
        """Analyze this dataset information:
        
Dataset: {dataset_name}
Rows: {rows}
Columns: {columns}
Target: {target_variable}

Provide 3 key insights and 2 recommendations."""
    )
    
    analysis_chain = analysis_prompt | default_model | str_parser
    
    print("\n🔹 Analysis Chain: prompt | model | str_parser")
    print("   Input: dataset metadata")
    print("   Output: insights and recommendations")
    
    # Test with sample data
    try:
        analysis_result = analysis_chain.invoke({
            "dataset_name": "Customer Churn",
            "rows": 5000,
            "columns": 12,
            "target_variable": "churn (yes/no)"
        })
        print(f"\n   Analysis Result: {analysis_result[:200]}...")
    except Exception as e:
        print(f"   Analysis test failed: {e}")
    
    # Chain 3: Code generation chain
    code_prompt = ChatPromptTemplate.from_template(
        "Write a Python function to {task}. Include docstring and example usage."
    )
    
    code_chain = code_prompt | default_model | str_parser
    
    print("\n🔹 Code Generation Chain: prompt | model | str_parser")
    print("   Input: task description")
    print("   Output: Python code with documentation")
    
    try:
        code_result = code_chain.invoke({"task": "calculate the mean and standard deviation of a list"})
        print(f"\n   Code Result: {code_result[:200]}...")
    except Exception as e:
        print(f"   Code generation test failed: {e}")

print("\n✅ Simple chains created and tested!")

### 2.2 Advanced Chain Composition

Let's build more sophisticated chains with multiple steps and data transformations.
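
One composition pattern in this section is parallel fan-out: the same input goes to several branches, and the results come back in a dict keyed by branch name. Here is a conceptual sketch in plain Python using `concurrent.futures`. The `run_parallel` helper and the two branch functions are hypothetical stand-ins for model-backed chains, not LangChain code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(branches: dict, value):
    """Call every branch with the same input and return {name: result}."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {name: pool.submit(func, value) for name, func in branches.items()}
        return {name: future.result() for name, future in futures.items()}


def technical_view(inputs):
    return f"Technical analysis of {inputs['topic']}"


def business_view(inputs):
    return f"Business analysis of {inputs['topic']}"


fan_out_results = run_parallel(
    {"technical": technical_view, "business": business_view},
    {"topic": "ML in customer service"},
)
print(fan_out_results)
```

Fan-out pays off when the branches are independent and each one spends most of its time waiting on a model call.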

In [None]:
print("🔗 Advanced Chain Composition")
print("=" * 28)

if not available_models:
    print("⚠️  Skipping advanced examples - no models available")
else:
    from langchain_core.runnables import RunnablePassthrough, RunnableLambda
    from operator import itemgetter
    
    # 1. Multi-step analysis chain
    print("🔹 Multi-step Analysis Chain")
    
    # Step 1: Data preprocessing function
    def preprocess_data_info(data_dict):
        """Process and format data information"""
        processed = {
            "formatted_info": f"Dataset '{data_dict['name']}' has {data_dict['rows']:,} rows and {data_dict['cols']} columns.",
            "complexity": "high" if data_dict['rows'] > 10000 else "medium" if data_dict['rows'] > 1000 else "low",
            "original": data_dict
        }
        return processed
    
    # Step 2: Analysis prompt
    analysis_prompt = ChatPromptTemplate.from_template(
        """Data Information: {formatted_info}
Complexity Level: {complexity}

Provide analysis recommendations based on the complexity level:
- Low complexity: Simple statistical analysis
- Medium complexity: Advanced statistical methods
- High complexity: Machine learning approaches

Give 3 specific recommendations."""
    )
    
    # Build the chain
    multi_step_chain = (
        RunnableLambda(preprocess_data_info)
        | analysis_prompt
        | default_model
        | str_parser
    )
    
    print("   Chain: preprocess | prompt | model | parser")
    
    # Test the multi-step chain
    try:
        test_data = {"name": "Sales Data", "rows": 50000, "cols": 25}
        multi_result = multi_step_chain.invoke(test_data)
        print(f"   Result: {multi_result[:150]}...")
    except Exception as e:
        print(f"   Test failed: {e}")
    
    # 2. Parallel processing chain
    print("\n🔹 Parallel Processing Chain")
    
    from langchain_core.runnables import RunnableParallel
    
    # Define different analysis perspectives
    technical_prompt = ChatPromptTemplate.from_template(
        "From a technical perspective, analyze: {topic}"
    )
    
    business_prompt = ChatPromptTemplate.from_template(
        "From a business perspective, analyze: {topic}"
    )
    
    # Create parallel chains
    parallel_chain = RunnableParallel(
        technical=technical_prompt | default_model | str_parser,
        business=business_prompt | default_model | str_parser
    )
    
    print("   Chain: parallel(technical + business analysis)")
    
    try:
        parallel_result = parallel_chain.invoke({"topic": "implementing machine learning in customer service"})
        print(f"   Technical: {parallel_result['technical'][:100]}...")
        print(f"   Business: {parallel_result['business'][:100]}...")
    except Exception as e:
        print(f"   Parallel test failed: {e}")
    
    # 3. Conditional chain
    print("\n🔹 Conditional Chain")
    
    def route_by_complexity(data):
        """Route to different prompts based on data complexity"""
        rows = data.get('rows', 0)
        if rows > 100000:
            return "big_data"
        elif rows > 10000:
            return "medium_data"
        else:
            return "small_data"
    
    # Different prompts for different data sizes
    prompts = {
        "small_data": ChatPromptTemplate.from_template("Analyze this small dataset ({rows} rows): {description}"),
        "medium_data": ChatPromptTemplate.from_template("Analyze this medium dataset ({rows} rows): {description}"),
        "big_data": ChatPromptTemplate.from_template("Analyze this large dataset ({rows} rows): {description}")
    }
    
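    # Wire the router into a chain. When the function wrapped by RunnableLambda
    # returns a runnable (here, the selected prompt), it is invoked with the same input.
    conditional_chain = (
        RunnableLambda(lambda data: prompts[route_by_complexity(data)])
        | default_model
        | str_parser
    )
    
    print("   Chain: router → appropriate_prompt | model | parser")
    print("   Routes based on dataset size")
    
    # Test with illustrative sample input
    try:
        conditional_result = conditional_chain.invoke({
            "rows": 250000,
            "description": "web clickstream logs"
        })
        print(f"   Result: {conditional_result[:100]}...")
    except Exception as e:
        print(f"   Conditional test failed: {e}")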

print("\n✅ Advanced chains demonstrated!")

### 2.3 Chain with Memory and Context

Let's create chains that maintain context across multiple interactions.
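
The core idea is a sliding window: keep only the most recent messages and render them into the prompt. As a minimal sketch, assuming a fixed window is enough, `collections.deque` with `maxlen` handles the trimming automatically. The `WindowMemory` class is hypothetical and is not a LangChain class.

```python
from collections import deque


class WindowMemory:
    """Hypothetical fixed-size conversation buffer (not a LangChain class)."""

    def __init__(self, max_messages: int = 4):
        # deque(maxlen=...) silently discards the oldest item when full
        self.messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self.messages.append((role, content))

    def as_context(self) -> str:
        if not self.messages:
            return "No previous conversation."
        return "\n".join(f"{role}: {content}" for role, content in self.messages)


memory_demo = WindowMemory(max_messages=2)
memory_demo.add("user", "What is overfitting?")
memory_demo.add("assistant", "A model memorizing noise in training data.")
memory_demo.add("user", "How can I prevent it?")
print(memory_demo.as_context())
```

With a window of two, the first user message has already been dropped by the time the third message arrives, which bounds the prompt length at the cost of forgetting older turns.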

In [None]:
print("🧠 Chains with Memory and Context")
print("=" * 32)

if not available_models:
    print("⚠️  Skipping memory examples - no models available")
else:
    # Simple conversation memory
    class SimpleConversationMemory:
        """Basic conversation memory for maintaining context"""
        
        def __init__(self, max_messages=10):
            self.messages = []
            self.max_messages = max_messages
        
        def add_message(self, role, content):
            self.messages.append({"role": role, "content": content})
            # Keep only recent messages
            if len(self.messages) > self.max_messages:
                self.messages = self.messages[-self.max_messages:]
        
        def get_context(self):
            """Format conversation history for prompt"""
            if not self.messages:
                return "No previous conversation."
            
            context = "Previous conversation:\n"
            for msg in self.messages[-5:]:  # Last 5 messages
                context += f"{msg['role']}: {msg['content']}\n"
            return context
    
    # Initialize memory
    memory = SimpleConversationMemory()
    
    # Contextual prompt template
    contextual_prompt = ChatPromptTemplate.from_template(
        """{context}

User: {user_input}

As a helpful data science assistant, respond to the user's question considering the conversation history."""
    )
    
    # Contextual chain
    def create_contextual_response(user_input):
        """Generate response with conversation context"""
        context = memory.get_context()
        
        # Create the chain
        chain = contextual_prompt | default_model | str_parser
        
        # Generate response
        response = chain.invoke({
            "context": context,
            "user_input": user_input
        })
        
        # Update memory
        memory.add_message("user", user_input)
        memory.add_message("assistant", response)
        
        return response
    
    print("🔹 Contextual Conversation Chain")
    print("   Maintains conversation history")
    print("   Considers previous context in responses")
    
    # Test the contextual chain
    try:
        print("\n🧪 Testing contextual conversation:")
        
        # First interaction
        response1 = create_contextual_response("What is overfitting in machine learning?")
        print(f"   Q1: What is overfitting?")
        print(f"   A1: {response1[:100]}...")
        
        # Second interaction (references previous)
        response2 = create_contextual_response("How can I prevent it?")
        print(f"\n   Q2: How can I prevent it?")
        print(f"   A2: {response2[:100]}...")
        
        # Show memory state
        print(f"\n   Memory contains {len(memory.messages)} messages")
        
    except Exception as e:
        print(f"   Contextual test failed: {e}")
    
    # Data analysis session memory
    print("\n🔹 Data Analysis Session Memory")
    
    class AnalysisSession:
        """Maintains state for a data analysis session"""
        
        def __init__(self):
            self.dataset_info = {}
            self.analysis_steps = []
            self.findings = []
        
        def set_dataset(self, name, rows, cols, description=""):
            self.dataset_info = {
                "name": name,
                "rows": rows,
                "cols": cols,
                "description": description
            }
        
        def add_step(self, step_description):
            self.analysis_steps.append(step_description)
        
        def add_finding(self, finding):
            self.findings.append(finding)
        
        def get_session_summary(self):
            return {
                "dataset": self.dataset_info,
                "steps_completed": len(self.analysis_steps),
                "recent_steps": self.analysis_steps[-3:],
                "key_findings": self.findings[-3:]
            }
    
    # Example session
    session = AnalysisSession()
    session.set_dataset("Customer Data", 10000, 15, "E-commerce customer behavior")
    session.add_step("Loaded dataset and performed initial exploration")
    session.add_finding("High correlation between purchase amount and session duration")
    
    print("   Tracks dataset info, analysis steps, and findings")
    print(f"   Session summary: {session.get_session_summary()}")

print("\n✅ Memory and context patterns demonstrated!")

## 3. Practical Applications

Let's build some real-world applications using LangChain!

### 3.1 Data Analysis Assistant

A comprehensive assistant for data analysis tasks.

In [None]:
print("🔍 Data Analysis Assistant")
print("=" * 25)

if not available_models:
    print("⚠️  Skipping assistant demo - no models available")
else:
    class DataAnalysisAssistant:
        """AI-powered data analysis assistant"""
        
        def __init__(self, model):
            self.model = model
            self.setup_chains()
        
        def setup_chains(self):
            """Initialize different analysis chains"""
            
            # 1. Dataset overview chain
            self.overview_prompt = ChatPromptTemplate.from_template(
                """Dataset Analysis Overview:
                
Name: {name}
Size: {rows} rows × {cols} columns
Description: {description}
Missing Data: {missing_info}

Provide:
1. Initial assessment of data quality
2. Suggested first steps for analysis
3. Potential challenges to watch for"""
            )
            
            self.overview_chain = self.overview_prompt | self.model | str_parser
            
            # 2. Statistical analysis chain
            self.stats_prompt = ChatPromptTemplate.from_template(
                """Statistical Summary:
                
{statistical_summary}

Interpret these statistics and identify:
1. Key patterns in the data
2. Outliers or anomalies
3. Relationships between variables
4. Recommendations for further analysis"""
            )
            
            self.stats_chain = self.stats_prompt | self.model | str_parser
            
            # 3. Visualization recommendation chain
            self.viz_prompt = ChatPromptTemplate.from_template(
                """Data Visualization Recommendations:
                
Dataset Type: {data_type}
Variables: {variables}
Analysis Goal: {goal}

Recommend 3-5 specific visualizations with:
1. Chart type and rationale
2. Variables to include
3. Python code snippet (matplotlib/seaborn)"""
            )
            
            self.viz_chain = self.viz_prompt | self.model | str_parser
        
        def analyze_dataset_overview(self, dataset_info):
            """Generate initial dataset analysis"""
            return self.overview_chain.invoke(dataset_info)
        
        def interpret_statistics(self, stats_summary):
            """Interpret statistical summary"""
            return self.stats_chain.invoke({"statistical_summary": stats_summary})
        
        def recommend_visualizations(self, data_info):
            """Suggest appropriate visualizations"""
            return self.viz_chain.invoke(data_info)
    
    # Initialize assistant
    assistant = DataAnalysisAssistant(default_model)
    
    print("✅ Data Analysis Assistant initialized")
    print("   Capabilities: overview, statistics, visualizations")
    
    # Test the assistant
    try:
        print("\n🧪 Testing assistant capabilities:")
        
        # Test dataset overview
        test_dataset = {
            "name": "E-commerce Customer Behavior",
            "rows": 15000,
            "cols": 12,
            "description": "Customer purchase history and demographics",
            "missing_info": "Age column has 5% missing values"
        }
        
        overview = assistant.analyze_dataset_overview(test_dataset)
        print(f"   Overview: {overview[:150]}...")
        
        # Test visualization recommendations
        viz_request = {
            "data_type": "customer behavior",
            "variables": "age, purchase_amount, session_duration, category_preference",
            "goal": "understand customer segmentation patterns"
        }
        
        viz_recommendations = assistant.recommend_visualizations(viz_request)
        print(f"\n   Visualizations: {viz_recommendations[:150]}...")
        
    except Exception as e:
        print(f"   Assistant test failed: {e}")

print("\n✅ Data Analysis Assistant demo complete!")

### 3.2 Code Generation and Review Assistant

An assistant that helps with data science code generation and review.

In [None]:
print("💻 Code Generation and Review Assistant")
print("=" * 38)

if not available_models:
    print("⚠️  Skipping code assistant demo - no models available")
else:
    class CodeAssistant:
        """AI assistant for data science code tasks"""
        
        def __init__(self, model):
            self.model = model
            self.setup_chains()
        
        def setup_chains(self):
            """Setup code-related chains"""
            
            # Code generation chain
            self.code_gen_prompt = ChatPromptTemplate.from_template(
                """Generate Python code for the following task:
                
Task: {task}
Requirements: {requirements}
Libraries: {libraries}

Provide:
1. Complete, working Python code
2. Proper docstrings and comments
3. Example usage
4. Error handling where appropriate"""
            )
            
            self.code_gen_chain = self.code_gen_prompt | self.model | str_parser
            
            # Code review chain
            self.code_review_prompt = ChatPromptTemplate.from_template(
                """Review this Python code for data science:
                
```python
{code}
```

Provide feedback on:
1. Code quality and best practices
2. Potential bugs or issues
3. Performance considerations
4. Suggestions for improvement
5. Data science best practices"""
            )
            
            self.code_review_chain = self.code_review_prompt | self.model | str_parser
            
            # Debugging chain
            self.debug_prompt = ChatPromptTemplate.from_template(
                """Debug this Python code error:
                
Code:
```python
{code}
```

Error:
{error}

Provide:
1. Explanation of what's causing the error
2. Corrected code
3. Prevention tips for similar errors"""
            )
            
            self.debug_chain = self.debug_prompt | self.model | str_parser
        
        def generate_code(self, task, requirements="", libraries="pandas, numpy, matplotlib"):
            """Generate code for a specific task"""
            return self.code_gen_chain.invoke({
                "task": task,
                "requirements": requirements,
                "libraries": libraries
            })
        
        def review_code(self, code):
            """Review and provide feedback on code"""
            return self.code_review_chain.invoke({"code": code})
        
        def debug_code(self, code, error):
            """Help debug code errors"""
            return self.debug_chain.invoke({"code": code, "error": error})
    
    # Initialize code assistant
    code_assistant = CodeAssistant(default_model)
    
    print("✅ Code Assistant initialized")
    print("   Capabilities: generation, review, debugging")
    
    # Test the code assistant
    try:
        print("\n🧪 Testing code assistant:")
        
        # Test code generation
        code_task = "Create a function to detect outliers in a dataset using the IQR method"
        requirements = "Function should handle both pandas DataFrame and numpy arrays"
        
        generated_code = code_assistant.generate_code(code_task, requirements)
        print(f"   Generated code: {generated_code[:200]}...")
        
        # Test code review
        sample_code = """
def analyze_data(df):
    mean = df.mean()
    std = df.std()
    return mean, std
"""
        
        code_review = code_assistant.review_code(sample_code)
        print(f"\n   Code review: {code_review[:200]}...")
        
    except Exception as e:
        print(f"   Code assistant test failed: {e}")

print("\n✅ Code Assistant demo complete!")

## 4. Best Practices and Production Tips

Key recommendations for building robust LangChain applications.
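
Several of these recommendations, such as error handling and fallback strategies, often come down to retrying transient failures. Here is a minimal sketch of retry with exponential backoff in plain Python. The `call_with_retry` helper and the simulated flaky call are hypothetical, and the delay values are arbitrary.

```python
import time


def call_with_retry(func, max_retries: int = 3, base_delay: float = 0.1):
    """Call func(); on failure wait base_delay * 2**attempt, then try again."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)


# Simulated flaky call: fails twice, then succeeds
call_counter = {"count": 0}


def flaky_model_call():
    call_counter["count"] += 1
    if call_counter["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "analysis complete"


retry_outcome = call_with_retry(flaky_model_call, base_delay=0.01)
print(retry_outcome, "after", call_counter["count"], "attempts")
```

In real use it is usually better to retry only errors that can plausibly succeed on a second attempt, and to validate inputs before the first call rather than inside the loop.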

In [None]:
print("🏭 LangChain Best Practices and Production Tips")
print("=" * 45)

best_practices = {
    "Prompt Engineering": [
        "Use clear, specific instructions",
        "Provide examples in prompts when helpful",
        "Structure prompts with clear sections",
        "Test prompts with different inputs",
        "Version control your prompt templates"
    ],
    "Chain Design": [
        "Keep chains focused on single responsibilities",
        "Use intermediate steps for complex workflows",
        "Implement proper error handling",
        "Add logging and monitoring",
        "Make chains testable and debuggable"
    ],
    "Model Management": [
        "Choose appropriate model sizes for your use case",
        "Implement fallback strategies",
        "Monitor model performance and costs",
        "Cache responses when appropriate",
        "Consider fine-tuning for specific domains"
    ],
    "Performance": [
        "Use async operations for concurrent requests",
        "Implement request batching",
        "Set appropriate timeouts",
        "Monitor response times and success rates",
        "Optimize prompt length and complexity"
    ],
    "Security": [
        "Validate and sanitize user inputs",
        "Implement rate limiting",
        "Protect sensitive data in prompts",
        "Use secure model hosting",
        "Log security-relevant events"
    ]
}

for category, practices in best_practices.items():
    print(f"\n🎯 {category}:")
    for practice in practices:
        print(f"   • {practice}")

# Example production-ready chain pattern
print("\n🔧 Production-Ready Chain Pattern:")

class ProductionChain:
    """Example of a production-ready LangChain implementation"""
    
    def __init__(self, model, max_retries=3, timeout=30):
        self.model = model
        self.max_retries = max_retries
        # Stored for reference only: this class does not enforce the timeout,
        # so it must be applied wherever the model itself is configured.
        self.timeout = timeout
        self.setup_chain()
    
    def setup_chain(self):
        """Build the prompt | model | parser chain"""
        self.prompt = ChatPromptTemplate.from_template(
            "Analyze the following data and provide insights: {data}"
        )
        # Assumes str_parser (a string output parser) exists from an earlier cell
        self.chain = self.prompt | self.model | str_parser
    
    def analyze_with_retry(self, data):
        """Execute analysis with retry logic"""
        import time
        
        # Validate input once, up front: retrying cannot fix bad input
        text = "" if data is None else str(data)
        if not text.strip() or len(text) > 10000:
            return {"success": False, "error": "Invalid input data", "attempts": 0}
        
        for attempt in range(self.max_retries):
            try:
                # Execute chain
                result = self.chain.invoke({"data": data})
                
                # Validate output
                if len(result) < 10:
                    raise ValueError("Response too short")
                
                return {
                    "success": True,
                    "result": result,
                    "attempts": attempt + 1
                }
                
            except Exception as e:
                if attempt == self.max_retries - 1:
                    return {
                        "success": False,
                        "error": str(e),
                        "attempts": attempt + 1
                    }
                time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...
        
        # Only reached when max_retries < 1
        return {"success": False, "error": "max_retries must be at least 1", "attempts": 0}

print("   ✅ Includes error handling, retries, and validation")
print("   ✅ Input/output validation")
print("   ✅ Exponential backoff for retries")
print("   ✅ Structured error responses")

print("\n📚 Additional Resources:")
resources = [
    "LangChain Documentation: https://python.langchain.com/",
    "LCEL Guide: https://python.langchain.com/docs/expression_language/",
    "Ollama Models: https://ollama.com/library",
    "Prompt Engineering Guide: https://www.promptingguide.ai/"
]

for resource in resources:
    print(f"   📖 {resource}")

print("\n✅ Best practices overview complete!")
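
The "Cache responses when appropriate" recommendation above can be sketched without any LangChain dependency. The wrapper below is a minimal illustration, not a LangChain API: `CachedChain` and `EchoChain` are hypothetical names, and the only assumption is that the wrapped object exposes an `invoke(dict)` method, as the chains in this notebook do. Caching like this is only sensible when repeated calls with the same input should return the same answer (for example, a model run at temperature 0).

```python
import hashlib
import json


class CachedChain:
    """Wrap any object exposing .invoke(dict) and memoize results by input."""

    def __init__(self, chain):
        self.chain = chain
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def _key(self, inputs):
        # Serialize with sorted keys so equal dicts map to the same cache key.
        payload = json.dumps(inputs, sort_keys=True, default=str)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def invoke(self, inputs):
        key = self._key(inputs)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = self.chain.invoke(inputs)
        self._cache[key] = result
        return result


class EchoChain:
    """Hypothetical stand-in for a real chain; counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def invoke(self, inputs):
        self.calls += 1
        return f"analysis of {inputs['data']}"


backend = EchoChain()
cached = CachedChain(backend)
first = cached.invoke({"data": "sales.csv"})
second = cached.invoke({"data": "sales.csv"})  # served from the cache
print(f"backend calls: {backend.calls}, cache hits: {cached.hits}")
```

In a real notebook session, `EchoChain()` would be replaced by an actual chain such as `prompt | model | str_parser`. The in-memory dict grows without bound, so a long-running service would want an eviction policy or an external cache.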

## 🎯 Practice Exercises

Now it's your turn! Try these exercises to reinforce your learning.

### Exercise 1: Build a Data Science Tutor Chain

Create a chain that explains data science concepts with examples.
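
As a possible starting point (not a full solution), the prompt side of the tutor can be sketched with plain Python strings. The template below uses the same `{placeholder}` syntax that `ChatPromptTemplate.from_template` accepts; the names `LEVEL_GUIDANCE`, `TUTOR_TEMPLATE`, and `build_tutor_inputs` and the guidance wording are illustrative assumptions.

```python
# Hypothetical level descriptions; adjust the wording to your audience.
LEVEL_GUIDANCE = {
    "beginner": "Use plain language and an everyday analogy; avoid formulas.",
    "intermediate": "Assume basic statistics and Python; include a short example.",
    "advanced": "Discuss assumptions, failure modes, and trade-offs in depth.",
}

# Template text with one placeholder per input variable.
TUTOR_TEMPLATE = (
    "You are a data science tutor.\n"
    "Explain the concept '{concept}' for a {level} learner.\n"
    "Guidance: {guidance}\n"
    "Context: {context}\n"
    "Finish with two follow-up questions that test understanding."
)


def build_tutor_inputs(concept, level="beginner", context="general data science"):
    """Validate the level and return the variables the template expects."""
    if level not in LEVEL_GUIDANCE:
        raise ValueError(f"Unknown level: {level!r}")
    return {
        "concept": concept,
        "level": level,
        "guidance": LEVEL_GUIDANCE[level],
        "context": context,
    }


inputs = build_tutor_inputs("cross-validation", level="intermediate")
preview = TUTOR_TEMPLATE.format(**inputs)  # preview the rendered prompt text
print(preview)
```

To turn this into a chain, `TUTOR_TEMPLATE` could be passed to `ChatPromptTemplate.from_template` and the resulting chain invoked with the `inputs` dictionary.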

In [None]:
# 🎯 EXERCISE 1: Your solution here!
print("💻 Exercise 1: Data Science Tutor Chain")
print("=" * 38)

print("📋 Requirements:")
print("   1. Create a prompt that explains concepts at different levels (beginner, intermediate, advanced)")
print("   2. Include practical examples and use cases")
print("   3. Add follow-up questions to test understanding")
print("   4. Test with concepts like 'cross-validation', 'feature engineering', 'bias-variance tradeoff'")

if available_models:
    print("\n🔧 Build your tutor chain here:")
    
    # Your code here!
    # Hint: Use a prompt template with variables for concept, level, and context
    
    print("   Add your implementation above!")
else:
    print("\n⚠️  Install Ollama and download a model to complete this exercise")

print("\n✅ Exercise 1 space ready!")

### Exercise 2: Create a Model Comparison Chain

Build a chain that compares different machine learning models.
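
The idea of analyzing several aspects at once can be illustrated with the standard library alone. The sketch below fans one input out to several branch functions and collects the results by name, which is conceptually what a `RunnableParallel` step does in LCEL. The helper `run_parallel` and the three branch functions are hypothetical stand-ins; in the exercise each branch would be a real chain.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(branches, inputs):
    """Run every branch on the same inputs and collect results by branch name."""
    with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
        futures = {name: pool.submit(fn, inputs) for name, fn in branches.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for per-aspect chains (each would normally call a model).
def accuracy_view(inputs):
    return f"accuracy notes for {inputs['problem']}"


def interpretability_view(inputs):
    return f"interpretability notes for {inputs['problem']}"


def training_time_view(inputs):
    return f"training-time notes for {inputs['problem']}"


branches = {
    "accuracy": accuracy_view,
    "interpretability": interpretability_view,
    "training_time": training_time_view,
}

report = run_parallel(branches, {"problem": "churn prediction on a small dataset"})
for aspect, notes in report.items():
    print(f"{aspect}: {notes}")
```

The per-aspect results arrive as one dictionary, which a final prompt can then combine into a single recommendation.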

In [None]:
# 🎯 EXERCISE 2: Your solution here!
print("💻 Exercise 2: Model Comparison Chain")
print("=" * 35)

print("📋 Requirements:")
print("   1. Create a chain that compares ML models for a given dataset/problem")
print("   2. Consider factors: accuracy, interpretability, training time, overfitting risk")
print("   3. Provide recommendations based on dataset characteristics")
print("   4. Test with different scenarios (large dataset, small dataset, high-dimensional, etc.)")

if available_models:
    print("\n🔧 Build your model comparison chain here:")
    
    # Your code here!
    # Hint: Use parallel chains to analyze different aspects simultaneously
    
    print("   Add your implementation above!")
else:
    print("\n⚠️  Install Ollama and download a model to complete this exercise")

print("\n✅ Exercise 2 space ready!")

### Exercise 3: Advanced Chain with Error Handling

Create a robust chain with comprehensive error handling and fallbacks.
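
One piece of this exercise, logging each failure and falling back to a backup, can be sketched in plain Python before wiring in real models. Everything named here (`invoke_with_fallback`, `flaky_primary`, `steady_backup`, `FALLBACK_MESSAGE`) is a hypothetical illustration. LCEL runnables also provide `with_fallbacks` and `with_retry` methods that cover similar ground.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("robust_chain")

FALLBACK_MESSAGE = "Sorry, the analysis is unavailable right now."  # hypothetical text


def invoke_with_fallback(runners, inputs):
    """Try each (name, runner) pair in order; log failures, then fall back."""
    for name, runner in runners:
        try:
            result = runner(inputs)
            logger.info("runner %s succeeded", name)
            return {"success": True, "source": name, "result": result}
        except Exception as exc:
            logger.warning("runner %s failed: %s", name, exc)
    # Every runner failed: return a fixed, user-friendly message instead.
    return {"success": False, "source": "fallback", "result": FALLBACK_MESSAGE}


# Hypothetical stand-ins for a primary and a backup model.
def flaky_primary(inputs):
    raise TimeoutError("primary model timed out")


def steady_backup(inputs):
    return f"summary of {inputs['data']}"


outcome = invoke_with_fallback(
    [("primary", flaky_primary), ("backup", steady_backup)],
    {"data": "quarterly sales"},
)
print(outcome)
```

Returning a dictionary with `success` and `source` fields keeps the caller informed about which path produced the answer, which helps with the monitoring requirement.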

In [None]:
# 🎯 EXERCISE 3: Your solution here!
print("💻 Exercise 3: Robust Chain with Error Handling")
print("=" * 47)

print("📋 Requirements:")
print("   1. Build a chain that gracefully handles various error conditions")
print("   2. Implement retry logic with exponential backoff")
print("   3. Add input validation and output verification")
print("   4. Provide meaningful error messages and fallback responses")
print("   5. Include logging for debugging and monitoring")

if available_models:
    print("\n🔧 Build your robust chain here:")
    
    # Your code here!
    # Hint: Use try/except blocks, validation functions, and fallback strategies
    
    print("   Add your implementation above!")
else:
    print("\n⚠️  Install Ollama and download a model to complete this exercise")

print("\n✅ Exercise 3 space ready!")

## 🎉 Summary and Next Steps

Congratulations! You've mastered the fundamentals of LangChain and LCEL.

In [None]:
print("🎓 LANGCHAIN FUNDAMENTALS COMPLETE!")
print("=" * 36)

print("🏆 What You've Learned:")
skills_learned = [
    "✅ LangChain core components (Prompts, Models, Parsers)",
    "✅ LCEL syntax and chain composition",
    "✅ Local LLM integration with Ollama",
    "✅ Advanced chain patterns and error handling",
    "✅ Memory and context management",
    "✅ Practical AI application development",
    "✅ Production best practices"
]

for skill in skills_learned:
    print(f"   {skill}")

print("\n🛠️ Key Concepts Mastered:")
concepts = {
    "Chain Patterns": ["Simple chains", "Parallel processing", "Conditional routing", "Multi-step workflows"],
    "LCEL Operations": ["Pipe operator |", "RunnablePassthrough", "RunnableParallel", "RunnableLambda"],
    "Production Skills": ["Error handling", "Retry logic", "Input validation", "Performance optimization"]
}

for category, skill_list in concepts.items():
    print(f"   🎯 {category}: {', '.join(skill_list)}")

print("\n🔮 What's Next:")
next_topics = [
    "🤖 LangChain Agents and Tools",
    "🗃️ Vector Databases and Embeddings",
    "🔍 Retrieval-Augmented Generation (RAG)",
    "💾 Advanced Memory Patterns",
    "🌐 Building Full AI Applications",
    "📊 LangSmith for Monitoring and Debugging"
]

for topic in next_topics:
    print(f"   {topic}")

print("\n💡 Practice Recommendations:")
practice_tasks = [
    "🔄 Complete the exercises above with different prompts and models",
    "🎨 Build a creative writing assistant using multiple LLMs",
    "📈 Create a financial analysis chain with data processing",
    "🔍 Develop a research assistant that summarizes multiple sources",
    "🏗️ Design a multi-agent system for complex workflows"
]

for task in practice_tasks:
    print(f"   {task}")

print("\n📚 Additional Learning:")
learning_resources = [
    "📖 Explore LangChain community integrations",
    "🎥 Watch LangChain YouTube tutorials",
    "💻 Contribute to open-source LangChain projects",
    "🏆 Build and share your own LangChain applications",
    "📝 Write about your LangChain experiences"
]

for resource in learning_resources:
    print(f"   {resource}")

print("\n🎯 Self-Assessment:")
assessment_questions = [
    "❓ Can you explain the difference between a prompt template and a chain?",
    "❓ When would you use parallel vs sequential chain composition?",
    "❓ How do you handle errors in production LangChain applications?",
    "❓ What are the trade-offs between local and API-based models?",
    "❓ How do you optimize chain performance for large-scale applications?"
]

for question in assessment_questions:
    print(f"   {question}")

print("\n🌟 Remember:")
reminders = [
    "🧠 Start with simple chains and gradually add complexity",
    "🔧 Always test your chains with various inputs",
    "📊 Monitor performance and costs in production",
    "🔒 Implement proper security and validation",
    "📈 Iterate and improve based on user feedback"
]

for reminder in reminders:
    print(f"   {reminder}")

print("\n🎉 You're now ready to build powerful AI applications with LangChain!")
print("🏆 - INRIVA AI Academy Team")

# Build a learning-progress record (kept in memory; nothing is written to disk)
import json
from datetime import datetime

progress = {
    'module': 'langchain_fundamentals',
    'completed': True,
    'timestamp': datetime.now().isoformat(),
    'skills_learned': [skill.replace('✅ ', '') for skill in skills_learned],
    'concepts_mastered': dict(concepts),
    # Each topic is "<emoji> <title>"; keep only the title
    'next_steps': [topic.split(' ', 1)[1] for topic in next_topics],
    'exercises_completed': 'Available for practice',
    'models_used': available_models if 'available_models' in globals() else []
}

print(f"\n💾 Learning progress recorded: {len(json.dumps(progress))} characters of JSON")
print("📋 Ready for advanced LangChain topics in the next module!")