# Introduction to LangChain & Environment Setup

## Learning Objectives
By the end of this notebook, you will be able to:
- Understand what LangChain is and why it's essential for building LLM applications
- Set up your development environment with all necessary dependencies
- Make your first LangChain API calls
- Understand the core components of the LangChain ecosystem
- Navigate the LangChain documentation and resources

## Why This Matters: Real-World AI Applications

**In AI Systems:**
- LangChain provides a unified interface for working with multiple LLM providers
- Enables rapid prototyping and production deployment of AI applications
- Reduces boilerplate code and complexity

**In RAG Pipelines:**
- Standardizes document loading, splitting, and embedding generation
- Provides pre-built components for vector stores and retrieval
- Simplifies the integration of knowledge bases with LLMs

**In Agentic AI:**
- Offers tools and frameworks for building autonomous agents
- Manages agent memory and conversation state
- Enables complex multi-step reasoning and tool usage

## Prerequisites
- Python 3.10+ installed
- Basic Python knowledge (variables, functions, loops)
- Understanding of APIs and JSON
- OpenAI API key (API usage may require billing credits; check your account's current limits)


## Setup: Install Dependencies

Run this cell first to install all required packages:

In [None]:
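# Install core LangChain packages
# (langchain-anthropic is only needed for the Anthropic examples in Instructor Activity 1)
!pip install -q langchain langchain-openai langchain-community langchain-anthropic python-dotenv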

---

## Instructor Activity 1: The Problem LangChain Solves

**Concept**: Understanding why LangChain exists by comparing development with and without it.

### Example 1: Building a Simple Q&A System Without LangChain

**Problem**: Create a system that answers questions using an LLM
**Expected Output**: Complex, vendor-specific code

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
import openai
import anthropic

# Without LangChain - vendor-specific implementations
def answer_with_openai(question):
    """OpenAI-specific implementation"""
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": question}
        ]
    )
    return response.choices[0].message.content

def answer_with_anthropic(question):
    """Anthropic-specific implementation"""
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        messages=[
            {"role": "user", "content": question}
        ]
    )
    return message.content[0].text

# Notice: Different APIs, different response structures, vendor lock-in
print("Without LangChain: Multiple implementations needed!")
```

**Why this is problematic:**
- Different API patterns for each provider
- Vendor lock-in makes switching providers difficult
- No standardized way to handle prompts, memory, or tools
- Lots of boilerplate code for common patterns

</details>
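
To make the "different response structures" point concrete, here is a minimal sketch of the adapter code a team would otherwise maintain by hand. The dictionaries are simplified stand-ins shaped like the two responses accessed in the solution above (`choices[0].message.content` versus `content[0].text`), and no API call is made. The names `extract_openai_text`, `extract_anthropic_text`, and `extract_text` are hypothetical.

```python
# Simplified stand-ins for the two provider responses (assumed shapes, no API call).
openai_style = {"choices": [{"message": {"content": "Paris"}}]}
anthropic_style = {"content": [{"type": "text", "text": "Paris"}]}


def extract_openai_text(response: dict) -> str:
    """Mirror response.choices[0].message.content."""
    return response["choices"][0]["message"]["content"]


def extract_anthropic_text(response: dict) -> str:
    """Mirror message.content[0].text."""
    return response["content"][0]["text"]


# Hand-written dispatch table: every new provider adds another entry to maintain.
EXTRACTORS = {"openai": extract_openai_text, "anthropic": extract_anthropic_text}


def extract_text(provider: str, response: dict) -> str:
    """Return the answer text regardless of which provider produced the response."""
    if provider not in EXTRACTORS:
        raise ValueError(f"Unknown provider: {provider}")
    return EXTRACTORS[provider](response)


print(extract_text("openai", openai_style))
print(extract_text("anthropic", anthropic_style))
```

A framework with a shared response type removes the need for this kind of per-provider glue.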

### Example 2: The Same System WITH LangChain

**Problem**: Create the same Q&A system using LangChain
**Expected Output**: Clean, provider-agnostic code

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic  # requires: pip install langchain-anthropic

# With LangChain - unified interface
def answer_with_langchain(question, provider="openai"):
    """Provider-agnostic implementation"""
    
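    # Switch providers by changing one argument
    if provider == "openai":
        llm = ChatOpenAI(model="gpt-4o-mini")
    elif provider == "anthropic":
        llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
    else:
        # Fail clearly instead of hitting an unbound `llm` below
        raise ValueError(f"Unknown provider: {provider}")
    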
    # Same interface for all providers!
    response = llm.invoke(question)
    return response.content

# Example usage
question = "What is LangChain?"
print(answer_with_langchain(question))
```

**Why this works better:**
- One interface for all LLM providers
- Easy to switch between providers
- Consistent response format
- Less code, more maintainable

</details>

### Example 3: LangChain Ecosystem Components

**Problem**: Understand what components LangChain provides
**Expected Output**: Overview of major components

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
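# LangChain provides these core components.
# Note: these import paths follow the LangChain 0.2/0.3 package layout. Some of them
# (for example langchain.memory) may be deprecated or relocated in later releases,
# so check the documentation for your installed version.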

# 1. Models - Interface with LLMs
from langchain_openai import ChatOpenAI

# 2. Prompts - Manage and optimize prompts
from langchain_core.prompts import ChatPromptTemplate

# 3. Chains - Connect components
from langchain_core.runnables import RunnablePassthrough

# 4. Memory - Maintain conversation state
from langchain.memory import ConversationBufferMemory

# 5. Document Loaders - Load various data sources
from langchain_community.document_loaders import PyPDFLoader

# 6. Vector Stores - Semantic search
from langchain_community.vectorstores import Chroma

# 7. Agents - Autonomous decision-making
from langchain.agents import create_react_agent

# 8. Tools - Give LLMs capabilities
from langchain_community.tools import DuckDuckGoSearchRun

print("LangChain Components Overview:")
print("- Models: Interface with any LLM")
print("- Prompts: Template and manage prompts")
print("- Chains: Build complex workflows")
print("- Memory: Maintain conversation context")
print("- Document Processing: Load and split documents")
print("- Vector Stores: Semantic search and retrieval")
print("- Agents: Build autonomous AI systems")
print("- Tools: Extend LLM capabilities")
```

**Why this ecosystem matters:**
- Pre-built components for common AI patterns
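- Most model, prompt, and parser components share the Runnable interface (`invoke`, `stream`, `batch`), so they compose predictably
- Widely used, including in production systems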
- Active community and continuous updates

</details>

---

## Learner Activity 1: Understanding the Value of LangChain

**Practice**: Explore why LangChain is valuable for AI development

### Exercise 1: Identify the Problems

**Task**: List 3 problems you might face when building AI applications without a framework
**Expected Output**: A list of challenges

In [None]:
# Your code here
# Think about: What would be difficult about building a chatbot from scratch?

<details>
<summary>Solution</summary>

```python
# Common problems without a framework like LangChain (any three are a valid answer):

problems = [
    "1. Vendor lock-in: Each LLM provider has different APIs",
    "2. Boilerplate code: Lots of repeated code for common patterns",
    "3. Complex integrations: Connecting documents, memory, and tools is difficult",
    "4. No standardization: Every developer reinvents the wheel",
    "5. Production challenges: Error handling, retries, streaming all custom-built"
]

for problem in problems:
    print(problem)

print("\nLangChain addresses these problems with pre-built, tested components!")
```

**Why these matter:**
- Real teams face these issues daily
- LangChain's standardization can save significant development time
- Focus on your application logic, not infrastructure

</details>

### Exercise 2: Component Matching

**Task**: Match LangChain components to real-world use cases
**Expected Output**: Correct component for each scenario

In [None]:
# Your code here
# Match each use case to a LangChain component:

use_cases = {
    "Building a PDF Q&A system": "?",
    "Remembering user preferences in a chat": "?",
    "Searching the web for current information": "?",
    "Finding similar documents": "?",
    "Creating dynamic prompts with variables": "?"
}

# Fill in the ? with the correct component

<details>
<summary>Solution</summary>

```python
# Matching LangChain components to use cases

use_cases = {
    "Building a PDF Q&A system": "Document Loaders + Vector Stores",
    "Remembering user preferences in a chat": "Memory components",
    "Searching the web for current information": "Tools (like DuckDuckGo search)",
    "Finding similar documents": "Vector Stores + Embeddings",
    "Creating dynamic prompts with variables": "Prompt Templates"
}

print("LangChain Component Matching:")
print("=" * 50)
for use_case, component in use_cases.items():
    print(f"Use Case: {use_case}")
    print(f"Component: {component}\n")
```

**Why these matches work:**
- Each component is purpose-built for specific AI patterns
- Components can be combined for complex applications
- Pre-built solutions for common use cases

</details>
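
The "Vector Stores + Embeddings" answer rests on one idea: texts become numeric vectors, and "similar" means the vectors point in nearly the same direction. The sketch below illustrates this with cosine similarity in plain Python. The three-dimensional vectors are invented for illustration (real embeddings come from an embedding model and have far more dimensions), and `cosine_similarity` and `most_similar` are hypothetical helper names, not LangChain APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (made-up numbers, not produced by a real model).
documents = {
    "How to brew espresso": [0.9, 0.1, 0.0],
    "Latte art for beginners": [0.8, 0.3, 0.1],
    "Intro to Python loops": [0.0, 0.2, 0.9],
}


def most_similar(query_vector, docs):
    """Return the title whose vector is closest in direction to the query."""
    return max(docs, key=lambda title: cosine_similarity(query_vector, docs[title]))


query = [1.0, 0.2, 0.0]  # pretend embedding of a coffee-related question
print(most_similar(query, documents))
```

A vector store performs the same kind of comparison, but over many stored vectors and with indexing so the search stays fast.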

---

## Instructor Activity 2: Setting Up Your Environment

**Concept**: Properly configure your development environment for LangChain

### Example 1: Environment Variables Setup

**Problem**: Securely manage API keys
**Expected Output**: Loaded environment variables

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
import os
from dotenv import load_dotenv

# Method 1: Using .env file (recommended for development)
load_dotenv()  # Loads from .env file

# Method 2: Direct setting (for Google Colab or testing)
# Uncomment and add your key if not using .env file
# os.environ["OPENAI_API_KEY"] = "your-key-here"

# Verify key is loaded (without revealing it)
api_key = os.getenv("OPENAI_API_KEY")
if api_key:
    # Mask the key for security
    masked = api_key[:8] + "..." + api_key[-4:] if len(api_key) > 12 else "***"
    print(f"✅ OpenAI API key loaded: {masked}")
else:
    print("❌ No API key found. Please set OPENAI_API_KEY")
    print("Get your key from: https://platform.openai.com/api-keys")
```

**Why environment variables matter:**
- Keep secrets out of code
- Different keys for dev/prod
- Industry best practice for security

</details>
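
A small helper can make the "key is missing" failure explicit before any API call is attempted. This is a sketch, not a LangChain feature: `require_env` and `mask_secret` are hypothetical names, and the value set below is a dummy placeholder stored under the made-up variable name `DEMO_API_KEY` so the cell runs without a real key.

```python
import os


def mask_secret(value: str, visible: int = 4) -> str:
    """Show only the last few characters of a secret."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def require_env(name: str) -> str:
    """Return the variable's value, or fail fast with an actionable message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set. Add it to your .env file or environment.")
    return value


os.environ["DEMO_API_KEY"] = "sk-demo-1234567890"  # dummy value for illustration only
print(mask_secret(require_env("DEMO_API_KEY")))
```

Calling `require_env("OPENAI_API_KEY")` at the top of a notebook surfaces a missing key immediately rather than at the first model call.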

### Example 2: Verifying LangChain Installation

**Problem**: Ensure all components are properly installed
**Expected Output**: Version information and successful imports

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
# Check LangChain installation
from importlib.metadata import version

print("LangChain Installation Check:")
print("=" * 40)
# Read versions from package metadata: not every package exposes __version__
for dist in ["langchain", "langchain-core", "langchain-openai", "langchain-community"]:
    print(f"✅ {dist}: {version(dist)}")

# Test critical imports
try:
    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    print("\n✅ All critical imports successful!")
except ImportError as e:
    print(f"\n❌ Import error: {e}")
    print("Run: pip install langchain langchain-openai langchain-community")
```

**Why verification is important:**
- Catch issues early
- Ensure compatibility between packages
- Confirm environment is ready for development

</details>

---

## Learner Activity 2: Set Up Your Environment

**Practice**: Configure your development environment

### Exercise 1: Create and Load Environment Variables

**Task**: Set up your API key securely
**Expected Output**: Confirmation that key is loaded

In [None]:
# Your code here
# TODO: Import necessary modules and load your API key

<details>
<summary>Solution</summary>

```python
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# For Google Colab users (uncomment if needed):
# import getpass
# os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter OpenAI API Key: ")

# Check if key is loaded
if os.getenv("OPENAI_API_KEY"):
    print("✅ API key successfully loaded!")
    print("You're ready to use LangChain with OpenAI")
else:
    print("⚠️ No API key found")
    print("Please set OPENAI_API_KEY in your .env file or environment")
```

**Why this works:**
- `load_dotenv()` reads from .env file
- `os.getenv()` safely retrieves environment variables
- Never hardcode API keys in your code

</details>

### Exercise 2: Verify Your Installation

**Task**: Check that all required packages are installed
**Expected Output**: List of installed packages with versions

In [None]:
# Your code here
# TODO: Import packages and print their versions

<details>
<summary>Solution</summary>

```python
# Check all required packages
packages_to_check = [
    ("langchain", "LangChain"),
    ("langchain_core", "LangChain Core Components"),
    ("langchain_openai", "OpenAI Integration"),
    ("langchain_community", "Community Integrations"),
    ("dotenv", "Environment Management")
]

print("Package Installation Status:")
print("=" * 40)

for package_name, description in packages_to_check:
    try:
        package = __import__(package_name)
        version = getattr(package, "__version__", "installed")
        print(f"✅ {description}: {version}")
    except ImportError:
        print(f"❌ {description}: Not installed")

print("\nIf any packages are missing, run:")
print("pip install langchain langchain-openai langchain-community python-dotenv")
```

**Why this verification helps:**
- Ensures all dependencies are present
- Catches version mismatches early
- Provides clear fix instructions if needed

</details>

---

## Instructor Activity 3: Your First LangChain Call

**Concept**: Make your first successful API call with LangChain

### Example 1: Basic LLM Initialization

**Problem**: Create and configure an LLM instance
**Expected Output**: Configured ChatOpenAI object

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI

# Initialize the LLM with configuration
llm = ChatOpenAI(
    model="gpt-4o-mini",      # Efficient, cost-effective model
    temperature=0.7,          # Balance between creativity and consistency
    max_tokens=150,          # Limit response length
    timeout=30,              # API timeout in seconds
    max_retries=2,           # Retry on failure
)

# Display configuration
print("LLM Configuration:")
print(f"Model: {llm.model_name}")
print(f"Temperature: {llm.temperature}")
print(f"Max Tokens: {llm.max_tokens}")
print("\n✅ LLM initialized and ready!")
```

**Why these settings matter:**
- **model**: gpt-4o-mini is fast and affordable
- **temperature**: values near 0 give more consistent output; higher values give more varied output
- **max_tokens**: Controls response length and cost
- **timeout/retries**: Production reliability

</details>
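
To see why temperature changes output variety, it helps to look at how temperature is conventionally applied during sampling: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The sketch below uses invented logits for three candidate tokens. It illustrates the general mechanism and does not describe any provider's exact implementation; `softmax_with_temperature` is a hypothetical helper name.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # invented scores for three candidate tokens

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a low temperature almost all of the probability lands on the top-scoring token, while a higher temperature spreads it across the alternatives. A temperature of exactly 0 is not modeled here; APIs that accept it typically treat it as always choosing the top-scoring token.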

### Example 2: Your First API Call

**Problem**: Make a simple call to the LLM
**Expected Output**: AI-generated response

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI

# Initialize LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Make your first call!
response = llm.invoke("Hello LangChain! Tell me a fun fact about AI in one sentence.")

# Display the response
print("🎉 Your First LangChain Response:")
print("=" * 50)
print(response.content)
print("=" * 50)
print("\n🚀 Congratulations! You've made your first LangChain call!")
```

**Why this is significant:**
- You've successfully connected to an LLM
- The same `invoke` call works with other chat models that have a LangChain integration
- Foundation for all future LangChain applications

</details>

### Example 3: Understanding the Response Object

**Problem**: Explore what the LLM returns
**Expected Output**: Response object details

In [None]:
# Empty cell for demonstration

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI

# Initialize and call LLM
llm = ChatOpenAI(model="gpt-4o-mini")
response = llm.invoke("What is Python?")

# Explore the response object
print("Response Object Analysis:")
print("=" * 50)
print(f"Type: {type(response)}")
print(f"\nContent: {response.content[:100]}...")
print(f"\nResponse metadata:")
print(f"- Token usage: {response.response_metadata.get('token_usage', 'N/A')}")
print(f"- Model: {response.response_metadata.get('model_name', 'N/A')}")
print(f"- Finish reason: {response.response_metadata.get('finish_reason', 'N/A')}")

# Access just the text content
text_content = response.content
print(f"\nJust the text (first 200 chars):")
print(text_content[:200])
```

**Why understanding responses matters:**
- Access token usage for cost tracking
- Check finish_reason for truncation
- Extract metadata for logging/debugging

</details>
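
The metadata printed above can feed a simple cost and truncation check. The sketch below works on a plain dictionary with the same keys the solution reads from `response.response_metadata` (`token_usage`, `finish_reason`), so it runs without an API call. The per-token prices are placeholders, not real rates, so consult your provider's pricing page. `estimate_cost` and `was_truncated` are hypothetical helper names.

```python
# Placeholder prices in USD per 1M tokens (invented for illustration only).
PRICE_PER_MILLION = {"prompt": 1.00, "completion": 2.00}


def estimate_cost(token_usage: dict) -> float:
    """Estimate the cost of one call from its prompt and completion token counts."""
    prompt_cost = token_usage.get("prompt_tokens", 0) * PRICE_PER_MILLION["prompt"]
    completion_cost = token_usage.get("completion_tokens", 0) * PRICE_PER_MILLION["completion"]
    return (prompt_cost + completion_cost) / 1_000_000


def was_truncated(response_metadata: dict) -> bool:
    """A finish_reason of "length" means the reply stopped at the token limit."""
    return response_metadata.get("finish_reason") == "length"


# Sample metadata with the same keys the solution accesses.
sample_metadata = {
    "token_usage": {"prompt_tokens": 12, "completion_tokens": 150, "total_tokens": 162},
    "model_name": "gpt-4o-mini",
    "finish_reason": "length",
}

print(f"Estimated cost: ${estimate_cost(sample_metadata['token_usage']):.6f}")
print(f"Truncated: {was_truncated(sample_metadata)}")
```

With a live response, the same helpers would be called as `estimate_cost(response.response_metadata["token_usage"])` and `was_truncated(response.response_metadata)`.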

---

## Learner Activity 3: Make Your First LangChain Calls

**Practice**: Create and use your own LLM instances

### Exercise 1: Initialize Your LLM

**Task**: Create a ChatOpenAI instance with custom settings
**Expected Output**: Configured LLM ready to use

In [None]:
# Your code here
# TODO: Create a ChatOpenAI instance with:
# - model: gpt-4o-mini
# - temperature: 0.5
# - max_tokens: 100

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI

# Create your LLM instance
my_llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.5,      # Balanced creativity
    max_tokens=100       # Keep responses concise
)

# Verify configuration
print("Your LLM Configuration:")
print(f"✅ Model: {my_llm.model_name}")
print(f"✅ Temperature: {my_llm.temperature}")
print(f"✅ Max Tokens: {my_llm.max_tokens}")
print("\nYour LLM is ready to use!")
```

**Why these settings:**
- Temperature 0.5 balances consistency with creativity
- Max tokens 100 keeps responses focused
- gpt-4o-mini is perfect for learning

</details>

### Exercise 2: Make Your First Call

**Task**: Ask the LLM to explain LangChain in simple terms
**Expected Output**: Simple explanation of LangChain

In [None]:
# Your code here
# TODO: Use your LLM to get an explanation of LangChain

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI

# Initialize your LLM
my_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.5)

# Make your first call
prompt = "Explain LangChain to me like I'm a beginner programmer in 2-3 sentences."
response = my_llm.invoke(prompt)

# Display the response
print("🎓 LangChain Explained Simply:")
print("=" * 50)
print(response.content)
print("=" * 50)
print("\n✅ Success! You've used LangChain to get AI-generated content!")
```

**What just happened:**
- You sent a prompt to an LLM
- LangChain handled all the API complexity
- You received structured response data

</details>

### Exercise 3: Experiment with Parameters

**Task**: Compare outputs with different temperature settings
**Expected Output**: See how temperature affects responses

In [None]:
# Your code here
# TODO: Create two LLMs with temperature 0 and 1
# Ask both the same creative question
# Compare the outputs

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI

# Create two LLMs with different temperatures
deterministic_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
creative_llm = ChatOpenAI(model="gpt-4o-mini", temperature=1)

# Same prompt for both
prompt = "Invent a creative name for an AI-powered coffee shop."

# Get responses
deterministic_response = deterministic_llm.invoke(prompt)
creative_response = creative_llm.invoke(prompt)

# Compare outputs
print("Temperature Comparison:")
print("=" * 50)
print(f"Temperature 0 (Deterministic): {deterministic_response.content}")
print()
print(f"Temperature 1 (Creative): {creative_response.content}")
print("=" * 50)
print("\n💡 Notice how temperature affects creativity and consistency!")
```

**Key insight:**
- Temperature 0: Mostly consistent, predictable outputs (small variations can still occur)
- Temperature 1: More creative, varied outputs
- Choose based on your use case needs

</details>

---

## Optional Extra Practice

**Challenge yourself with these exercises that combine all concepts**

### Challenge 1: Build a Simple Q&A Function

**Task**: Create a reusable function that answers questions
**Expected Output**: A function that takes questions and returns answers

In [None]:
# Your code here

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI

def create_qa_assistant(model="gpt-4o-mini", temperature=0.7):
    """Create a Q&A assistant using LangChain"""
    
    # Initialize LLM
    llm = ChatOpenAI(model=model, temperature=temperature)
    
    def ask_question(question):
        """Ask a question and get an answer"""
        try:
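            # Wrap the question with brief answering instructions
            # (built from separate strings so code indentation does not leak into the prompt)
            enhanced_prompt = (
                "Please provide a clear, helpful answer to this question:\n\n"
                f"Question: {question}\n\n"
                "Answer in a friendly, informative way."
            )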
            
            response = llm.invoke(enhanced_prompt)
            return response.content
        except Exception as e:
            return f"Error: {str(e)}"
    
    return ask_question

# Create and test the assistant
qa = create_qa_assistant(temperature=0.7)

# Test with different questions
questions = [
    "What is machine learning?",
    "How do I get started with Python?",
    "What are the benefits of using LangChain?"
]

for q in questions:
    print(f"Q: {q}")
    print(f"A: {qa(q)[:200]}...\n")  # Show first 200 chars
```

**Why this pattern is useful:**
- Encapsulates LLM logic
- Reusable across your application
- Easy to test and maintain

</details>
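
The "easy to test" point can be made concrete by passing the model in as a parameter, so a test can substitute a stub that never touches the network. This is a variation on the solution above, not the same code: `make_qa_function`, `StubLLM`, and `StubMessage` are hypothetical names, and the stub imitates only the two things the function relies on (an `invoke` method and a `.content` attribute on the result).

```python
from dataclasses import dataclass


@dataclass
class StubMessage:
    """Minimal stand-in for a chat model reply."""
    content: str


class StubLLM:
    """Stand-in for a chat model: records prompts and returns a canned reply."""

    def __init__(self, reply: str):
        self.reply = reply
        self.prompts = []

    def invoke(self, prompt: str) -> StubMessage:
        self.prompts.append(prompt)
        return StubMessage(content=self.reply)


def make_qa_function(llm):
    """Build a question-answering function around any object with .invoke()."""

    def ask_question(question: str) -> str:
        prompt = f"Please answer clearly.\n\nQuestion: {question}"
        return llm.invoke(prompt).content

    return ask_question


stub = StubLLM(reply="LangChain is a framework for building LLM applications.")
qa = make_qa_function(stub)
print(qa("What is LangChain?"))
print(stub.prompts[0])
```

In the notebook, passing `ChatOpenAI(model="gpt-4o-mini")` instead of the stub gives the same function a real model.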

### Challenge 2: Multi-Provider Comparison

**Task**: Create a function that can use different LLM providers
**Expected Output**: Same interface for different providers

In [None]:
# Your code here

<details>
<summary>Solution</summary>

```python
from langchain_openai import ChatOpenAI
# from langchain_anthropic import ChatAnthropic  # If you have Anthropic key
# from langchain_google_genai import ChatGoogleGenerativeAI  # If you have Google key

def create_multi_provider_llm(provider="openai", model=None, temperature=0.7):
    """Create LLM instance for different providers"""
    
    if provider == "openai":
        model = model or "gpt-4o-mini"
        return ChatOpenAI(model=model, temperature=temperature)
    
    # Uncomment if you have other API keys:
    # elif provider == "anthropic":
    #     model = model or "claude-3-5-sonnet-20241022"
    #     return ChatAnthropic(model=model, temperature=temperature)
    
    # elif provider == "google":
    #     model = model or "gemini-1.5-flash"
    #     return ChatGoogleGenerativeAI(model=model, temperature=temperature)
    
    else:
        raise ValueError(f"Unknown provider: {provider}")

# Test with available providers
providers_to_test = ["openai"]  # Add others if you have keys

prompt = "What makes Python a great programming language? (Answer in one sentence)"

for provider in providers_to_test:
    try:
        llm = create_multi_provider_llm(provider)
        response = llm.invoke(prompt)
        print(f"{provider.upper()} Response:")
        print(f"{response.content}\n")
    except Exception as e:
        print(f"{provider.upper()} Error: {e}\n")

print("💡 Same interface, different providers - that's the power of LangChain!")
```

**Why multi-provider support matters:**
- Avoid vendor lock-in
- Use best model for each task
- Fallback options for reliability
- Cost optimization

</details>
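
The "fallback options for reliability" bullet can be sketched as trying providers in order until one succeeds. The example uses plain callables in place of real models so it runs offline; `ask_with_fallback`, `flaky_provider`, and `backup_provider` are hypothetical names. LangChain runnables also expose a `with_fallbacks` method for this purpose, which is not shown here.

```python
def flaky_provider(prompt: str) -> str:
    """Simulates a provider outage."""
    raise TimeoutError("primary provider timed out")


def backup_provider(prompt: str) -> str:
    """Simulates a healthy secondary provider."""
    return f"[backup] answer to: {prompt}"


def ask_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return the first successful answer."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All providers failed: " + "; ".join(errors))


used, answer = ask_with_fallback(
    "What is LangChain?",
    [("primary", flaky_provider), ("backup", backup_provider)],
)
print(used, "->", answer)
```

Because every chat model is called the same way, the order of the provider list is the only thing that needs to change to prefer a different model.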

---

## Summary & Next Steps

### What You've Learned
✅ What LangChain is and why it's valuable  
✅ How to set up your development environment  
✅ Making your first LangChain API calls  
✅ Understanding LLM parameters and responses  
✅ The core components of the LangChain ecosystem  

### Key Takeaways
1. **LangChain simplifies AI development** by providing a unified interface
2. **Environment setup is crucial** - always use environment variables for API keys
3. **Temperature controls creativity** - adjust based on your use case
4. **Same code, multiple providers** - avoid vendor lock-in
5. **Foundation for complex apps** - these basics enable RAG, agents, and more

### What's Next?
In the next notebook (`01_basic_llm_calls.ipynb`), you'll learn:
- Working with different message types (System, Human, AI)
- Building multi-turn conversations
- Streaming responses for better UX
- Error handling and retries
- Cost optimization techniques

### Resources
- [LangChain Documentation](https://python.langchain.com/)
- [OpenAI API Keys](https://platform.openai.com/api-keys)
- [LangChain GitHub](https://github.com/langchain-ai/langchain)
- [Community Discord](https://discord.gg/langchain)

---

🎉 **Congratulations!** You've completed your introduction to LangChain! You're now ready to build AI-powered applications.