## Structured Learning Agent using LangGraph
In this notebook, we will build a structured learning agent using LangGraph. The system guides learners through a sequence of defined but customizable checkpoints, verifying understanding at each step and providing Feynman-style teaching when needed.

## Motivation
- Personalized 1:1 tutoring is expensive and not accessible to everyone.
- An agent can give each learner an individualized learning experience and feedback 24/7.
- It can use the learner's own notes and web-retrieved content as context.
- It can offer patient, simple explanations of complex topics.

## Key Components
1. Learning State Graph: Orchestrates the sequential learning workflow.
2. Checkpoint System: Defines structured learning milestones.
3. Web Search Integration: Dynamically retrieves relevant learning materials.
4. Context Processing: Chunks and processes learning materials.
5. Question Generation: Creates checkpoint-specific verification questions.
6. Understanding Verification: Evaluates learner comprehension with a clear threshold.
7. Feynman-style Teaching: Provides patient, simple explanations of complex topics.
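
Component 6 mentions a clear threshold. A minimal sketch of what such a decision could look like is shown here. The cutoff of 0.7 and the route names `next_checkpoint` and `teach` are assumptions made for illustration, not values taken from this notebook.

```python
# Hypothetical cutoff; the value used by the actual agent may differ.
UNDERSTANDING_THRESHOLD = 0.7


def route_after_verification(
    understanding_level: float,
    threshold: float = UNDERSTANDING_THRESHOLD,
) -> str:
    """Pick the next step from an understanding score in the range [0, 1]."""
    if not 0.0 <= understanding_level <= 1.0:
        raise ValueError("understanding_level must be between 0 and 1")
    # Advance when the score meets the threshold, otherwise fall back to teaching.
    return "next_checkpoint" if understanding_level >= threshold else "teach"


print(route_after_verification(0.85))
print(route_after_verification(0.40))
```

Keeping the threshold as a parameter makes it easy to tighten or relax the bar per topic.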

## Method
The system follows a structured learning cycle:
1. Checkpoint Definition
* Generates sequential learning milestones with clear success criteria.
2. Context Building
* Processes student-provided materials or retrieves relevant web content.
3. Context Validation
* Validates context based on checkpoint criteria.
* Performs additional web searches if context doesn't meet criteria.
4. Embedding Storage
* Stores embeddings for retrieving only relevant chunks during verification.
5. Understanding Verification
* Generates checkpoint-specific questions.
* Evaluates learner's answers against correct answers.
* Provides clear feedback on understanding level.
6. Progressive Learning
* Advances to the next checkpoint when understanding is verified.
* Provides Feynman-style teaching when needed.
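
Step 4 of the cycle above relies on retrieving only the chunks most relevant to a checkpoint. The following sketch illustrates the idea with plain Python and toy three-dimensional vectors. The helper names `cosine` and `top_k_chunks` are hypothetical, and the vectors stand in for real embeddings such as those an embedding model would produce.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(
    query_vec: Sequence[float],
    chunk_vecs: List[Sequence[float]],
    chunks: List[str],
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the k chunks whose vectors are most similar to the query."""
    scored = [(cosine(query_vec, vec), text) for vec, text in zip(chunk_vecs, chunks)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: in practice these vectors would come from an embedding model.
chunks = [
    "Photosynthesis converts light into chemical energy.",
    "Cellular respiration releases energy from glucose.",
    "Light reactions happen in the thylakoid membrane.",
]
chunk_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.8, 0.2, 0.0]]
query_vec = [1.0, 0.0, 0.0]

for score, text in top_k_chunks(query_vec, chunk_vecs, chunks):
    print(f"{score:.2f}  {text}")
```

Passing only the top-scoring chunks to the verification prompt keeps the context short and focused on the current checkpoint.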

## Conclusion
This structured learning agent provides a personalized, 24/7 learning experience. It can be easily extended to include additional features such as progress tracking, personalized recommendations, and more.




## Requirements
#!pip install langchain-community langchain-openai langgraph pydantic python-dotenv semantic-chunkers semantic-router tavily-python

In [2]:
import os 
import operator
import uuid
from typing import Annotated, Dict, List, Optional, Tuple, TypedDict

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from IPython.display import Image, display
from langchain_community.utils.math import cosine_similarity
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.messages import HumanMessage, SystemMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph, START
from pydantic import BaseModel, Field
from dotenv import load_dotenv
from semantic_chunkers import StatisticalChunker
from semantic_router.encoders import OpenAIEncoder



## Setup
This agent is implemented with OpenAI's models, but it can also be used with self-hosted LLM and embedding models.

In [3]:
load_dotenv()
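# load_dotenv() has already copied the .env values into os.environ;
# fail early with a clear message if a required key is missing.
for key in ("OPENAI_API_KEY", "TAVILY_API_KEY"):
    if not os.getenv(key):
        raise EnvironmentError(f"{key} is not set. Add it to your .env file.")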

tavily_search = TavilySearchResults(max_results=3)
llm = ChatOpenAI(model="gpt-4o", temperature=0)
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")



## Data Models Definition
We define the data structures for our adaptive learning system as Pydantic models. These models enforce type safety and provide a clear structure for:
* Learning goals and objectives
* Checkpoint definitions and tracking
* Search queries for dynamic content retrieval
* Verification of learning progress
* Feynman teaching output format
* Question generation

Each model is designed to capture specific aspects of the learning process while maintaining type safety.

In [4]:
class Goals(BaseModel):
    """ Structure for defining learning goals."""
    goals: Optional[str] = Field(None, description="Learning goals")

class LearningCheckpoint(BaseModel):
    """ Structure for a single checkpoint """
    description: str = Field(..., description="Main checkpoint description")
    criteria: List[str] = Field(..., description="List of success criteria")
    verification: str = Field(..., description="How to verify this checkpoint")

class Checkpoints(BaseModel):
    """ Main checkpoints container with index tracking """
    checkpoints: List[LearningCheckpoint] = Field(
        ...,
        description="List of checkpoints covering foundation, application, and mastery levels"
    )

class SearchQuery(BaseModel):
    """ Structure for search query collection"""
    search_queries: Optional[List[str]] = Field(None, description="Search queries for retrieval.")

class LearningVerification(BaseModel):
    """Structure for verification results"""
    understanding_level: float = Field(..., ge=0, le=1)
    feedback: str
    suggestions: List[str]
    context_alignment: bool

class FeynmanTeaching(BaseModel):
    """ Structure for the Feynman teaching method """
    simplified_explanation: str
    KeyConcepts: List[str]
    analogies: List[str]
    
class QuestionOutput(BaseModel):
    """ Structure for question output """
    questions: str

class InContext(BaseModel):
    """ Structure for context verification """
    is_in_context: str = Field(..., description="Yes or No")
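
As a quick illustration of what the field constraints buy us, the sketch below shows Pydantic rejecting an out-of-range score. The `LearningVerification` class is repeated here only so the example runs on its own, and the sample values are made up.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class LearningVerification(BaseModel):
    """Copy of the model above, repeated so this example is self-contained."""
    understanding_level: float = Field(..., ge=0, le=1)
    feedback: str
    suggestions: List[str]
    context_alignment: bool


# A well-formed result passes validation.
ok = LearningVerification(
    understanding_level=0.8,
    feedback="Good grasp of the basics.",
    suggestions=["Review edge cases"],
    context_alignment=True,
)
print(ok.understanding_level)

# A score outside [0, 1] violates the ge/le bounds and is rejected.
try:
    LearningVerification(
        understanding_level=1.5,
        feedback="",
        suggestions=[],
        context_alignment=True,
    )
    rejected = False
except ValidationError:
    rejected = True
print("out-of-range score rejected:", rejected)
```

Because invalid model output fails at parse time, downstream steps can rely on `understanding_level` always lying between 0 and 1.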

## Learning State Definition
The learning state is the shared object passed between the graph's nodes. It tracks:
* The learning topic and goals
* Context and search results
* Current progress through checkpoints
* Verification results and teaching outputs
* Current question-answer pair

In [5]:
class LearningState(BaseModel):
    topic: str
    goals: List[str]
    context: str
    context_chunks: Annotated[list, operator.add]
    context_key: str
    search_queries: SearchQuery
    checkpoints: Checkpoints
    verifications: LearningVerification
    teachings: FeynmanTeaching
    current_checkpoint: int
    current_question: QuestionOutput
    current_answer: str
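
The `Annotated[list, operator.add]` annotation on `context_chunks` tells LangGraph to combine updates to that key with `operator.add` (list concatenation) instead of overwriting the previous value. The plain-Python sketch below mimics that merge behavior to make it concrete. `DemoState` and `apply_update` are hypothetical names, and this is not LangGraph's actual implementation.

```python
import operator
from typing import Annotated, TypedDict, get_args, get_origin, get_type_hints


class DemoState(TypedDict):
    context: str
    context_chunks: Annotated[list, operator.add]


def apply_update(state: dict, update: dict, schema=DemoState) -> dict:
    """Merge an update into the state, honoring Annotated reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        hint = hints.get(key)
        if get_origin(hint) is Annotated and key in merged:
            # The first metadata item is treated as the reducer function.
            reducer = get_args(hint)[1]
            merged[key] = reducer(merged[key], value)
        else:
            # Keys without a reducer are simply overwritten.
            merged[key] = value
    return merged


state = {"context": "old notes", "context_chunks": ["chunk A"]}
update = {"context": "new notes", "context_chunks": ["chunk B", "chunk C"]}
new_state = apply_update(state, update)
print(new_state)
```

Here `context` is replaced, while `context_chunks` grows to include all three chunks.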

## Helper Functions
The system uses three utility functions:
1. extract_content_from_chunks: Processes and combines text chunks into coherent content.
2. format_checkpoints_as_message: Converts checkpoint data into a prompt-ready string.
3. generate_checkpoint_message: Creates a formatted message for context retrieval.

In [7]:
def extract_content_from_chunks(chunks):
    """Extract and combine content from chunks with splits attribute.
    
    Args:
        chunks: List of chunk objects that may contain splits attribute
        
    Returns:
        str: Combined content from all chunks joined with newlines
    """
    content = []
    
    for chunk in chunks:
        if hasattr(chunk, 'splits') and chunk.splits:
            chunk_content = ' '.join(chunk.splits)
            content.append(chunk_content)
    
    return '\n'.join(content)

def format_checkpoints_as_message(checkpoints: Checkpoints) -> str:
    """Convert Checkpoints object to a formatted string for the message.
    
    Args:
        checkpoints (Checkpoints): Checkpoints object containing learning checkpoints
        
    Returns:
        str: Formatted string containing numbered checkpoints with descriptions and criteria
    """
    message = "Here are the learning checkpoints:\n\n"
    for i, checkpoint in enumerate(checkpoints.checkpoints, 1):
        message += f"Checkpoint {i}:\n"
        message += f"Description: {checkpoint.description}\n"
        message += "Success Criteria:\n"
        for criterion in checkpoint.criteria:
            message += f"- {criterion}\n"
    return message

def generate_checkpoint_message(checks: List[LearningCheckpoint]) -> HumanMessage:
    """Generate a formatted message for learning checkpoints that need context.
    
    Args:
        checks (List[LearningCheckpoint]): List of learning checkpoint objects
        
    Returns:
        HumanMessage: Formatted message containing checkpoint descriptions, criteria and 
                     verification methods, ready for context search
    """
    formatted_checks = []
    
    for check in checks:
        checkpoint_text = f"""
        Description: {check.description}
        Success Criteria:
        {chr(10).join(f'- {criterion}' for criterion in check.criteria)}
        Verification Method: {check.verification}
        """
        formatted_checks.append(checkpoint_text)
    
    all_checks = "\n---\n".join(formatted_checks)
    
    checkpoints_message = HumanMessage(content=f"""The following learning checkpoints need additional context:
        {all_checks}
        
        Please generate search queries to find relevant information.""")
    
    return checkpoints_message
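
To see how `extract_content_from_chunks` treats different inputs, here is a small usage sketch. The function is repeated so the example runs on its own, and `SimpleNamespace` objects are used as hypothetical stand-ins for the chunk objects a chunker would return.

```python
from types import SimpleNamespace


def extract_content_from_chunks(chunks):
    """Same logic as the helper above, repeated so the example is self-contained."""
    content = []
    for chunk in chunks:
        if hasattr(chunk, "splits") and chunk.splits:
            content.append(" ".join(chunk.splits))
    return "\n".join(content)


chunks = [
    SimpleNamespace(splits=["Plants absorb light.", "Chlorophyll captures it."]),
    SimpleNamespace(splits=[]),                   # skipped: empty splits
    SimpleNamespace(text="no splits attribute"),  # skipped: no splits attribute
    SimpleNamespace(splits=["Glucose is produced."]),
]

combined = extract_content_from_chunks(chunks)
print(combined)
```

Each chunk becomes one line of output, and chunks without usable `splits` are silently dropped.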

## Prompt Configuration
Here we define the core instruction prompts for our LLM. Each message serves a specific purpose in the learning process.
1. learning_checkpoint_generator: Creates structured learning milestones with clear criteria.
2. checkpoint_based_query_generator: Generates targeted search queries for content retrieval.
3. question_generator: Creates verification questions aligned with checkpoints.
4. answer_verifier: Evaluates learner responses against success criteria.
5. feyman_teacher: Crafts simplified explanations using the Feynman technique.