# 🐍 Python Typing Fundamentals & Pydantic Integration

## 📋 Learning Objectives
By the end of this notebook, you'll understand:
- **Basic Type Hints** - Foundation of Python typing
- **Generic Types** - Flexible, reusable type definitions
- **Union & Optional** - Handling multiple possible types
- **Literal & TypedDict** - Precise value and structure definitions
- **Protocol & ABC** - Interface definitions
- **Pydantic Integration** - Real-world data validation

## 🎯 Why Typing Matters for LangChain/LangGraph
- **Data Validation** - Ensure correct input/output formats
- **IDE Support** - Better autocomplete and error detection
- **Documentation** - Self-documenting code
- **Runtime Safety** - Catch bad data early: type checkers flag mismatches before the code runs, and Pydantic validates values at runtime

---

## 1️⃣ Basic Type Hints - The Foundation

### 🔍 What Are Type Hints?
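Type hints are annotations that tell developers, IDEs, and static type checkers (such as mypy or Pyright) what type of data a variable, parameter, or return value should hold. Python itself does not enforce them at runtime.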
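
To make this concrete, here is a minimal sketch (the function `double` is a hypothetical name used only for illustration). It suggests that Python stores annotations as metadata but does not check them when the code runs; flagging the mismatched call is left to a static type checker.

```python
def double(value: int) -> int:
    """Return the argument multiplied by two."""
    return value * 2

# Annotations are stored as metadata on the function...
print(double.__annotations__)

# ...but they are not enforced at call time.
print(double(21))    # an int, as the hint promises
print(double("ab"))  # a str slips through at runtime; a type checker would flag this call
```

The last call succeeds because `*` also works on strings, which is exactly the kind of silent mismatch that type checkers and Pydantic are meant to surface.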

### 🏗️ Basic Syntax Pattern:
```python
variable_name: Type = value
def function_name(param: InputType) -> ReturnType:
    return something
```

In [3]:
# Basic Type Hints Examples
from typing import List, Dict, Tuple, Set, Optional

# 🎯 Simple Variables
name: str = "LangChain User"
age: int = 25
temperature: float = 98.6
is_learning: bool = True

print(f"Name: {name} (type: {type(name).__name__})")
print(f"Age: {age} (type: {type(age).__name__})")

# 🎯 Function with Type Hints
def greet_user(username: str, times: int = 1) -> str:
    """
    Greets user specified number of times

    Args:
        username: The name to greet
        times: How many times to repeat greeting

    Returns:
        Formatted greeting string
    """
    return f"Hello {username}! " * times

# Test the function
result = greet_user("Python Developer", 2)
print(f"Greeting: {result}")

# 🎯 Collection Types
user_scores: List[int] = [85, 92, 78, 96]
user_data: Dict[str, str] = {"name": "Alice", "role": "Developer"}
coordinates: Tuple[float, float] = (40.7128, -74.0060)  # NYC coordinates
unique_tags: Set[str] = {"python", "typing", "langchain"}

print(f"Scores: {user_scores}")
print(f"User Data: {user_data}")
print(f"NYC Coordinates: {coordinates}")
print(f"Tags: {unique_tags}")

Name: LangChain User (type: str)
Age: 25 (type: int)
Greeting: Hello Python Developer! Hello Python Developer! 
Scores: [85, 92, 78, 96]
User Data: {'name': 'Alice', 'role': 'Developer'}
NYC Coordinates: (40.7128, -74.006)
Tags: {'langchain', 'python', 'typing'}


## 2️⃣ Generic Types - Flexible & Reusable

### 🔍 What Are Generics?
Generics allow you to create flexible, reusable type definitions that work with multiple types.

### 🏗️ Common Generic Types:
- `List[T]` - List containing items of type T
- `Dict[K, V]` - Dictionary with keys of type K and values of type V  
- `Optional[T]` - Either type T or None
- `Union[A, B]` - Either type A or type B
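
As a small illustration of the last two entries, the sketch below (with a hypothetical `describe_limit` helper) shows that `Optional[T]` is shorthand for `Union[T, None]`, and how a function typically narrows such a value before using it.

```python
from typing import Optional, Union

# Optional[int] is the same type as Union[int, None]
print(Optional[int] == Union[int, None])

def describe_limit(limit: Optional[Union[int, str]] = None) -> str:
    """Turn a token limit that may be missing, numeric, or textual into a message."""
    if limit is None:              # handle the "missing" case first
        return "no limit set"
    if isinstance(limit, int):     # narrowed to int inside this branch
        return f"limit of {limit} tokens"
    return f"limit mode: {limit}"  # only str remains here

print(describe_limit())
print(describe_limit(1000))
print(describe_limit("auto"))
```

Checking for `None` first and then using `isinstance` lets a type checker follow along, so each branch only sees the type it can actually handle.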

### 💡 Real-World Analogy:
Think of generics like a **container label** - you can have a "Box[Books]" or "Box[Tools]", but the box structure remains the same.
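
The box analogy can be sketched directly with `TypeVar` and `Generic`. The `Box` class below is a hypothetical example written for this notebook, not something from LangChain or the standard library.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")  # placeholder for "whatever type the box holds"

class Box(Generic[T]):
    """A container whose item type is chosen by the caller."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def put(self, item: T) -> None:
        self._items.append(item)

    def take(self) -> T:
        # Removes and returns the most recently added item
        return self._items.pop()

books: Box[str] = Box()   # a box of strings
books.put("Fluent Python")

scores: Box[int] = Box()  # same structure, different item type
scores.put(42)

print(books.take(), scores.take())
```

A type checker would reject `books.put(42)` because `books` is declared as `Box[str]`, while the class body itself is written only once.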

```mermaid
graph TD
    A[Generic Container] --> B["List[str]"]
    A --> C["List[int]"]
    A --> D["Dict[str, Any]"]
    B --> E["['hello', 'world']"]
    C --> F["[1, 2, 3]"]
    D --> G["{'key': 'value'}"]
```

In [4]:
from typing import List, Dict, Tuple, Optional, Union, Any, Generic, TypeVar

# 🎯 Generic Collections for LangChain-like Data
def process_chat_messages(messages: List[Dict[str, str]]) -> List[str]:
    """
    Process chat messages and extract content
    Similar to LangChain message processing
    """
    processed = []
    for msg in messages:
        role = msg.get("role", "unknown")
        content = msg.get("content", "")
        processed.append(f"[{role.upper()}]: {content}")
    return processed

# Sample chat data (similar to LangChain format)
chat_history: List[Dict[str, str]] = [
    {"role": "human", "content": "What is Python typing?"},
    {"role": "assistant", "content": "Python typing helps with code reliability..."},
    {"role": "human", "content": "Can you show examples?"}
]

processed_messages = process_chat_messages(chat_history)
for msg in processed_messages:
    print(msg)

print("\n" + "="*50)

# 🎯 Optional and Union Types (Very Common in LangChain)
def create_llm_config(
    model_name: str,
    temperature: Optional[float] = None,  # Could be None or float
    max_tokens: Union[int, str] = "auto"  # Could be int or string
) -> Dict[str, Any]:
    """
    Create LLM configuration (similar to LangChain patterns)
    """
    # Annotate explicitly: values are mixed (str, float, int), so Any is required
    config: Dict[str, Any] = {"model": model_name}

    if temperature is not None:
        config["temperature"] = temperature

    config["max_tokens"] = max_tokens
    return config

# Test different configurations
config1 = create_llm_config("gpt-3.5-turbo")
config2 = create_llm_config("gpt-4", temperature=0.7, max_tokens=1000)

print("Config 1:", config1)
print("Config 2:", config2)

[HUMAN]: What is Python typing?
[ASSISTANT]: Python typing helps with code reliability...
[HUMAN]: Can you show examples?

Config 1: {'model': 'gpt-3.5-turbo', 'max_tokens': 'auto'}
Config 2: {'model': 'gpt-4', 'temperature': 0.7, 'max_tokens': 1000}


## 3️⃣ Literal & TypedDict - Precise Definitions

### 🔍 What is Literal?
`Literal` restricts values to specific options - perfect for API endpoints, model names, etc.

### 🔍 What is TypedDict?
`TypedDict` defines the exact structure of dictionaries - like a schema for data.

### 🏗️ LangChain Use Cases:
- **Model Names**: `Literal["gpt-3.5-turbo", "gpt-4", "claude"]`
- **Message Types**: `Literal["human", "assistant", "system"]`
- **Configuration Schemas**: Structured dictionaries with known keys

### 💡 Benefits:
- **Validation** - Type checkers (and Pydantic) reject values outside the allowed set; plain Python does not check them at runtime
- **Autocomplete** - IDE knows exact options
- **Documentation** - Clear constraints
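
Because `Literal` and `TypedDict` are checked by type checkers rather than by the interpreter, a runtime check has to be written explicitly (or delegated to Pydantic). The sketch below is one possible way to do that with `typing.get_args`; the names `Role`, `Message`, and `make_message` are hypothetical.

```python
from typing import Literal, TypedDict, cast, get_args

Role = Literal["human", "assistant", "system"]

class Message(TypedDict):
    role: Role
    content: str

def make_message(role: str, content: str) -> Message:
    """Build a Message, checking the role against the Literal's options at runtime."""
    allowed = get_args(Role)  # tuple of the literal values
    if role not in allowed:
        raise ValueError(f"role must be one of {allowed}, got {role!r}")
    # cast() only informs the type checker; the check above did the real work
    return {"role": cast(Role, role), "content": content}

msg = make_message("human", "Hi there")
print(type(msg).__name__, msg)  # a TypedDict value is a plain dict at runtime

try:
    make_message("robot", "beep")
except ValueError as err:
    print("Rejected:", err)
```

Reusing the `Literal` definition as the single source of allowed values keeps the static hint and the runtime check from drifting apart.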

In [5]:
from typing import Any, Dict, List, Literal, Optional, TypedDict, Union

# 🎯 Literal Types for LangChain-style Constraints
ModelName = Literal["gpt-3.5-turbo", "gpt-4", "claude-3", "llama-2"]
MessageRole = Literal["human", "assistant", "system"]
ChainType = Literal["conversation", "retrieval", "summary", "analysis"]

# 🎯 TypedDict for Structured Data (Like LangChain Messages)
class ChatMessage(TypedDict):
    role: MessageRole
    content: str
    timestamp: Optional[str]

class LLMConfig(TypedDict, total=False):  # total=False makes every key optional
    model: ModelName
    temperature: float
    max_tokens: int
    system_prompt: str

# 🎯 Function using Literal and TypedDict
def create_chat_chain(
    model: ModelName,
    chain_type: ChainType,
    config: Optional[LLMConfig] = None
) -> Dict[str, Any]:
    """
    Create a chat chain configuration (LangChain-style)
    """
    # Create base configuration; values are mixed (str, list, float, int), hence Any
    chain_config: Dict[str, Any] = {
        "model": model,
        "type": chain_type,
        "messages": []
    }

    # Add optional configuration
    if config:
        chain_config.update(config)

    return chain_config

# 🎯 Test with valid values
messages: List[ChatMessage] = [
    {"role": "system", "content": "You are a helpful assistant", "timestamp": "2024-01-01"},
    {"role": "human", "content": "Explain Python typing", "timestamp": None},
    {"role": "assistant", "content": "Python typing helps...", "timestamp": "2024-01-01"}
]

config: LLMConfig = {
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 1000
}

chain = create_chat_chain("gpt-4", "conversation", config)
print("Chain Configuration:")
print(chain)

print("\nMessages:")
for msg in messages:
    role = msg["role"]
    content = msg["content"][:50] + "..." if len(msg["content"]) > 50 else msg["content"]
    print(f"  {role.upper()}: {content}")

# 🎯 Try uncommenting this to see IDE warnings (if using an IDE with type checking)
# invalid_model = "invalid-model"  # This would cause a type error
# create_chat_chain(invalid_model, "conversation")  # Type checker would warn!

Chain Configuration:
{'model': 'gpt-4', 'type': 'conversation', 'messages': [], 'temperature': 0.7, 'max_tokens': 1000}

Messages:
  SYSTEM: You are a helpful assistant
  HUMAN: Explain Python typing
  ASSISTANT: Python typing helps...


## 4️⃣ Protocol & Abstract Base Classes

### 🔍 What is Protocol?
`Protocol` defines interfaces - what methods/attributes a class should have, without inheritance.

### 🔍 What are Abstract Base Classes (ABC)?
ABC defines a template that subclasses must follow - enforces implementation of specific methods.

### 🏗️ LangChain Use Cases:
- **Tool Interface**: All tools must have `name`, `description`, and `run()` method
- **Memory Interface**: All memory systems must implement `save()` and `load()`
- **Chain Interface**: All chains must implement `run()` or `invoke()`

### 💡 Protocol vs ABC:
- **Protocol** = Duck Typing ("If it walks like a duck...")
- **ABC** = Inheritance-based ("Must inherit from Duck class")
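
The difference can be observed at runtime with a minimal sketch. All class names below (`Speaker`, `Robot`, `Animal`, `Dog`) are hypothetical, and `@runtime_checkable` is added only so that `isinstance` can be used with the protocol; it checks that the method exists, not its signature.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable

@runtime_checkable
class Speaker(Protocol):
    def speak(self) -> str: ...

class Robot:  # no inheritance: matches Speaker by shape alone
    def speak(self) -> str:
        return "beep"

class Animal(ABC):
    @abstractmethod
    def speak(self) -> str: ...

class Dog(Animal):  # must inherit from Animal and implement speak()
    def speak(self) -> str:
        return "woof"

print(isinstance(Robot(), Speaker))  # structural match
print(isinstance(Robot(), Animal))   # Robot does not inherit from Animal
print(isinstance(Dog(), Speaker))    # Dog also has a speak() method

try:
    Animal()  # abstract classes cannot be instantiated
except TypeError as err:
    print("TypeError:", err)
```

`Robot` satisfies the protocol without knowing it exists, whereas `Dog` only counts as an `Animal` because it inherits from it and implements every abstract method.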

```mermaid
graph TD
    A[Protocol: Runnable] --> B[Has run method]
    A --> C[No inheritance needed]
    D[ABC: BaseChain] --> E[Must inherit]
    D --> F[Must implement abstract methods]

    G[Tool Class] --> A
    H[CustomChain] --> D
```

In [6]:
from typing import Protocol, Any, Dict, List
from abc import ABC, abstractmethod

# 🎯 Protocol Example (LangChain-style Tool Interface)
class RunnableProtocol(Protocol):
    """Protocol for any runnable component (like LangChain's Runnable)"""
    name: str
    description: str

    def run(self, input_data: Any) -> Any:
        """Execute the runnable with input data"""
        ...

class ToolProtocol(Protocol):
    """Protocol for tools (similar to LangChain tools)"""
    name: str
    description: str

    def execute(self, query: str) -> str:
        """Execute tool with query and return result"""
        ...

# 🎯 Classes that follow the Protocol (no inheritance needed!)
class WeatherTool:
    name = "weather_check"
    description = "Get current weather information"

    def execute(self, query: str) -> str:
        return f"Weather for {query}: Sunny, 72°F"

class CalculatorTool:
    name = "calculator"
    description = "Perform mathematical calculations"

    def execute(self, query: str) -> str:
        try:
            result = eval(query)  # Don't use eval in production!
            return f"Result: {result}"
        except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
            return "Invalid calculation"

# 🎯 Function that works with any tool following the protocol
def use_tool(tool: ToolProtocol, user_query: str) -> Dict[str, str]:
    """Use any tool that follows the ToolProtocol"""
    return {
        "tool_name": tool.name,
        "description": tool.description,
        "query": user_query,
        "result": tool.execute(user_query)
    }

# Test with different tools
weather = WeatherTool()
calc = CalculatorTool()

weather_result = use_tool(weather, "New York")
calc_result = use_tool(calc, "2 + 3 * 4")

print("Weather Tool Result:")
print(weather_result)
print("\nCalculator Tool Result:")
print(calc_result)

print("\n" + "="*50)

# 🎯 Abstract Base Class Example (LangChain-style Chain)
class BaseChain(ABC):
    """Abstract base class for all chains (similar to LangChain)"""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def invoke(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        """All chains must implement invoke method"""
        pass

    @abstractmethod
    def get_input_schema(self) -> Dict[str, str]:
        """All chains must define their input schema"""
        pass

class ConversationChain(BaseChain):
    """Concrete implementation of BaseChain"""

    def invoke(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        user_input = input_data.get("message", "")
        return {
            "response": f"Echo: {user_input}",
            "chain_name": self.name
        }

    def get_input_schema(self) -> Dict[str, str]:
        return {"message": "string - user input message"}

# Test the chain
chain = ConversationChain("echo_chain")
result = chain.invoke({"message": "Hello, LangChain!"})
schema = chain.get_input_schema()

print("Chain Result:")
print(result)
print("\nInput Schema:")
print(schema)

Weather Tool Result:
{'tool_name': 'weather_check', 'description': 'Get current weather information', 'query': 'New York', 'result': 'Weather for New York: Sunny, 72°F'}

Calculator Tool Result:
{'tool_name': 'calculator', 'description': 'Perform mathematical calculations', 'query': '2 + 3 * 4', 'result': 'Result: 14'}

Chain Result:
{'response': 'Echo: Hello, LangChain!', 'chain_name': 'echo_chain'}

Input Schema:
{'message': 'string - user input message'}


## 5️⃣ Pydantic Integration - Data Validation Powerhouse

### 🔍 What is Pydantic?
Pydantic uses Python type hints to validate and coerce data at runtime. It underpins widely used frameworks such as FastAPI.

### 🏗️ Key Features:
- **Automatic Validation** - Converts and validates data based on type hints
- **Error Messages** - Clear, helpful validation errors
- **Serialization** - Easy conversion to/from JSON, dict
- **IDE Support** - Full autocomplete and type checking
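
The first three features can be seen in a few lines. This is a minimal sketch assuming Pydantic v2 is installed; the `Message` model and its fields are hypothetical.

```python
from pydantic import BaseModel, ValidationError

class Message(BaseModel):
    role: str
    content: str
    tokens: int = 0

# Automatic validation: the string "42" is coerced to the int 42
msg = Message(role="human", content="Hello", tokens="42")
print(msg.tokens, type(msg.tokens).__name__)

# Serialization: convert to a dict or JSON string, and parse back from a dict
as_dict = msg.model_dump()
as_json = msg.model_dump_json()
print(as_dict)
print(Message.model_validate(as_dict) == msg)

# Error messages: invalid data raises ValidationError describing each bad field
try:
    Message(role="human", content="Hello", tokens="many")
except ValidationError as err:
    print(err.error_count(), "validation error(s)")
    print(err.errors()[0]["loc"])  # location of the offending field
```

Note that `model_dump`, `model_dump_json`, and `model_validate` are the Pydantic v2 names; Pydantic v1 used `dict()`, `json()`, and `parse_obj()` instead.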

### 💡 LangChain Connection:
LangChain heavily uses Pydantic for:
- **Message Models** - Structured chat messages
- **Tool Schemas** - Input/output validation for tools
- **Chain Configuration** - Validating parameters
- **Agent State** - Managing agent memory and state

### 🎯 Why Pydantic + Typing?
```python
from pydantic import BaseModel

# Without Pydantic (error-prone)
config = {"model": "gpt-4", "temp": "0.7"}  # temp is a string, but should be a float!

# With Pydantic (automatic validation & conversion)
class Config(BaseModel):
    model: str
    temperature: float = 0.0

config = Config(model="gpt-4", temperature="0.7")  # Auto-converts "0.7" to 0.7!
```

In [9]:
# Requires Pydantic v2: pip install "pydantic>=2"
from pydantic import BaseModel, ConfigDict, Field, ValidationError, field_validator, model_validator
from typing import List, Optional, Literal, Dict, Any
from datetime import datetime

# 🎯 Basic Pydantic Model (LangChain Message Style)
class ChatMessage(BaseModel):
    # Allow extra fields (useful for extensibility)
    model_config = ConfigDict(extra="allow")

    role: Literal["human", "assistant", "system"]
    content: str
    timestamp: Optional[datetime] = None
    metadata: Dict[str, Any] = Field(default_factory=dict)

# 🎯 Advanced Model with Validation (LangChain LLM Config Style)
class LLMConfig(BaseModel):
    # `model_name` starts with "model_", a prefix some Pydantic v2 releases
    # treat as protected; clearing protected_namespaces avoids the warning
    model_config = ConfigDict(protected_namespaces=())

    model_name: Literal["gpt-3.5-turbo", "gpt-4", "claude-3"] = "gpt-3.5-turbo"
    temperature: float = Field(default=0.0, ge=0.0, le=2.0, description="Controls randomness")
    max_tokens: int = Field(default=1000, gt=0, le=4000, description="Maximum tokens to generate")
    system_prompt: Optional[str] = Field(default=None, max_length=1000)

    @field_validator('temperature')
    @classmethod
    def validate_temperature(cls, v):
        """Custom validation for temperature (illustrative: ge/le above already enforce this range)"""
        if v < 0 or v > 2:
            raise ValueError('Temperature must be between 0 and 2')
        return v

    @field_validator('system_prompt')
    @classmethod
    def validate_system_prompt(cls, v):
        """Ensure system prompt is meaningful if provided"""
        if v is not None and len(v.strip()) < 10:
            raise ValueError('System prompt must be at least 10 characters')
        return v

# 🎯 Complex Model with Nested Validation (LangChain Chain Style)
class ChatChain(BaseModel):
    name: str = Field(..., min_length=1, description="Chain identifier")
    llm_config: LLMConfig
    messages: List[ChatMessage] = Field(default_factory=list)
    memory_size: int = Field(default=10, ge=1, le=100)

    @model_validator(mode="after")
    def validate_chain(self):
        """Validate the entire chain configuration"""
        # Ensure at least one system message if messages exist
        if self.messages and not any(msg.role == "system" for msg in self.messages):
            # Add default system message
            default_system = ChatMessage(
                role="system",
                content="You are a helpful assistant.",
                timestamp=datetime.now()
            )
            self.messages = [default_system] + self.messages
        return self

    def add_message(self, role: str, content: str) -> None:
        """Add a new message to the chain"""
        message = ChatMessage(
            role=role,
            content=content,
            timestamp=datetime.now()
        )
        self.messages.append(message)

        # Keep only recent messages based on memory_size
        if len(self.messages) > self.memory_size:
            # Keep system messages + recent messages
            system_msgs = [msg for msg in self.messages if msg.role == "system"]
            other_msgs = [msg for msg in self.messages if msg.role != "system"]
            # Guard against a zero-length slice: other_msgs[-0:] would keep everything
            keep = max(self.memory_size - len(system_msgs), 0)
            recent_msgs = other_msgs[-keep:] if keep else []
            self.messages = system_msgs + recent_msgs

# Test the models
print("🎯 Testing Pydantic Models")
print("="*40)

# Create LLM config
llm_config = LLMConfig(
    model_name="gpt-4",
    temperature=0.7,
    max_tokens=2000,
    system_prompt="You are an expert Python developer specializing in typing and Pydantic."
)

print("LLM Config:")
print(llm_config.model_dump_json(indent=2))

# Create chat chain
chain = ChatChain(
    name="python_tutor",
    llm_config=llm_config
)

# Add messages
chain.add_message("human", "What is Pydantic?")
chain.add_message("assistant", "Pydantic is a data validation library...")

print(f"\nChain Messages ({len(chain.messages)} total):")
for msg in chain.messages:
    print(f"  [{msg.role.upper()}]: {msg.content[:50]}...")

# Test validation errors
print("\n🚨 Testing Validation Errors:")
try:
    invalid_config = LLMConfig(temperature=3.0)  # Too high!
except ValidationError as e:
    print(f"Temperature Error: {e}")

try:
    invalid_config = LLMConfig(system_prompt="Short")  # Too short!
except ValidationError as e:
    print(f"System Prompt Error: {e}")

## 6️⃣ Real-World Pydantic Patterns for LangChain

### 🔍 Common Patterns You'll See:
1. **Tool Input/Output Schemas** - Define what tools expect and return
2. **Agent State Management** - Track agent memory and context
3. **Configuration Validation** - Ensure valid parameters
4. **API Response Models** - Structure external API responses

### 💡 Best Practices:
- **Use Field() for constraints** - min/max values, descriptions
- **Custom validators** - Business logic validation
- **Nested models** - Compose complex structures
- **Default values** - Sensible fallbacks
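
The four practices above can be combined in one small sketch. It assumes Pydantic v2, and the `SearchArgs` and `AgentState` models are hypothetical examples rather than LangChain classes.

```python
from typing import List
from pydantic import BaseModel, Field, ValidationError, field_validator

class SearchArgs(BaseModel):
    """Hypothetical input schema for a search tool."""
    # Field() for constraints and descriptions
    query: str = Field(min_length=1, description="What to search for")
    # Default value as a sensible fallback, still range-checked
    top_k: int = Field(default=3, ge=1, le=10)

    @field_validator("query")
    @classmethod
    def strip_query(cls, value: str) -> str:
        """Custom validator: a query made only of whitespace is rejected."""
        value = value.strip()
        if not value:
            raise ValueError("query must not be blank")
        return value

class AgentState(BaseModel):
    """Hypothetical agent state that nests another model."""
    pending: List[SearchArgs] = Field(default_factory=list)  # nested models
    step: int = 0

# Nested dicts are validated and converted into SearchArgs instances
state = AgentState(pending=[{"query": "  pydantic validators  "}])
print(state.pending[0].query, state.pending[0].top_k)

try:
    AgentState(pending=[{"query": "   "}])
except ValidationError as err:
    print(err.errors()[0]["loc"])  # path to the failing nested field
```

The error location includes the list index and field name, which makes it practical to report exactly which nested item failed validation.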

### 🎯 Next Steps:
- Practice with your own models
- Explore Pydantic documentation
- Try integrating with actual APIs
- Build validation for your specific use cases

In [10]:
# 🎯 Real-World Example: Building a Tool with Pydantic Validation
# Requires Pydantic v2: pip install "pydantic>=2"
from pydantic import BaseModel, ConfigDict, Field
from typing import Dict, Any, Optional, Union

class ToolInput(BaseModel):
    """Input schema for tools (similar to LangChain tools)"""
    query: str = Field(..., min_length=1, max_length=500, description="The user's query")
    context: Optional[Dict[str, Any]] = Field(default=None, description="Additional context")

class ToolOutput(BaseModel):
    """Output schema for tools"""
    result: str = Field(..., description="The tool's response")
    confidence: float = Field(default=1.0, ge=0.0, le=1.0, description="Confidence score")
    metadata: Dict[str, Any] = Field(default_factory=dict, description="Additional information")

class Tool(BaseModel):
    """A complete tool definition with Pydantic validation"""
    model_config = ConfigDict(arbitrary_types_allowed=True)

    # Pydantic v2 renamed the `regex` keyword to `pattern`
    name: str = Field(..., pattern=r'^[a-zA-Z_][a-zA-Z0-9_]*$', description="Tool identifier")
    description: str = Field(..., min_length=10, description="What the tool does")
    input_schema: type[ToolInput] = ToolInput
    output_schema: type[ToolOutput] = ToolOutput

    def run(self, input_data: Union[Dict[str, Any], ToolInput]) -> ToolOutput:
        """Execute the tool with validated input"""
        # Validate input
        if isinstance(input_data, dict):
            validated_input = self.input_schema(**input_data)
        else:
            validated_input = input_data

        # Mock tool execution (replace with actual logic)
        result = f"Processed query: '{validated_input.query}'"
        if validated_input.context:
            result += f" with context: {validated_input.context}"

        return self.output_schema(
            result=result,
            confidence=0.95,
            metadata={"tool_name": self.name, "processed_at": "2024-01-01"}
        )

# Create and test a tool
search_tool = Tool(
    name="web_search",
    description="Search the web for information on any topic"
)

# Test with valid input
test_input = ToolInput(
    query="What is Python typing?",
    context={"user_level": "beginner"}
)

result = search_tool.run(test_input)
print("Tool Result:")
print(result.model_dump_json(indent=2))

# Test with dictionary input (auto-validation)
dict_input = {
    "query": "How does Pydantic work?",
    "context": {"source": "tutorial"}
}

result2 = search_tool.run(dict_input)
print(f"\nSecond Result: {result2.result}")
print(f"Confidence: {result2.confidence}")

# 🎯 Summary of Key Concepts
print("\n" + "="*60)
print("🎓 KEY CONCEPTS LEARNED:")
print("="*60)
concepts = [
    "✅ Basic Type Hints - Foundation of type safety",
    "✅ Generic Types - Flexible, reusable definitions",
    "✅ Literal & TypedDict - Precise constraints",
    "✅ Protocol & ABC - Interface definitions",
    "✅ Pydantic Models - Runtime validation powerhouse",
    "✅ Real-world patterns - Ready for LangChain/LangGraph!"
]

for concept in concepts:
    print(concept)

print("\n🚀 Next: Advanced typing patterns for LangChain/LangGraph!")

