# Syllabus Generation with Two-Layer Validation

This notebook implements an intelligent syllabus generation system with:
- User preference collection (6 learning style options)
- Detailed syllabus generation with modules and topics
- Two-layer validation system (Generator + Reviewer)
- Iterative improvement with feedback loops
- Hallucination mitigation through detailed, AI-facing topic descriptions

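## Install Required Packages

The notebook talks to a local Ollama server through `ChatOllama`, so Ollama must be running and the `phi3:mini` model must be available before the generation cells are executed.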

In [None]:
%pip install langchain langchain-ollama pydantic python-dotenv -q

## Import Dependencies

In [None]:
from typing import List
from pydantic import BaseModel, Field, field_validator
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
import os
import pickle

## Define Course Model

The `Course` model is redefined here so that this notebook can run on its own and so that a pickled course can be loaded later. It matches the definition in the course planning notebook.

In [None]:
class Course(BaseModel):
    """Represents a single course in the learning path."""
    course_name: str = Field(..., description="Name of the course")
    description: str = Field(..., description="Detailed description of the course")
    difficulty: str = Field(..., description="Difficulty level: Beginner, Intermediate, or Advanced")
    prerequisites: List[str] = Field(default_factory=list, description="List of prerequisite courses")
    
    @field_validator('difficulty')
    @classmethod
    def validate_difficulty(cls, v: str) -> str:
        """Ensure difficulty is one of the allowed values."""
        allowed = ['Beginner', 'Intermediate', 'Advanced']
        if v not in allowed:
            raise ValueError(f'Difficulty must be one of {allowed}')
        return v

## Define Syllabus Models

These models represent the detailed structure of a course syllabus.

In [None]:
class Topic(BaseModel):
    """Represents a single topic within a module."""
    topic_name: str = Field(..., description="Name of the topic")
    short_description: str = Field(..., description="Short description for users (1-2 sentences)")
    detailed_description: str = Field(..., description="Detailed description for AI to understand the scope and content")
    estimated_duration_minutes: int = Field(..., description="Estimated time to complete this topic in minutes", gt=0)


class Module(BaseModel):
    """Represents a module containing multiple topics."""
    module_name: str = Field(..., description="Name of the module")
    module_description: str = Field(..., description="Overview of what this module covers")
    topics: List[Topic] = Field(..., description="List of topics in this module")
    
    @field_validator('topics')
    @classmethod
    def validate_topics(cls, v: List[Topic]) -> List[Topic]:
        """Ensure at least one topic per module."""
        if len(v) < 1:
            raise ValueError('Each module must have at least 1 topic')
        return v


class Syllabus(BaseModel):
    """Represents a complete course syllabus."""
    course_name: str = Field(..., description="Name of the course")
    course_objective: str = Field(..., description="Main objective of the course")
    learning_style: str = Field(..., description="Learning style preferences applied")
    total_modules: int = Field(..., description="Total number of modules")
    modules: List[Module] = Field(..., description="List of modules in the syllabus")
    total_estimated_hours: float = Field(..., description="Total estimated hours to complete the course")
    
    @field_validator('modules')
    @classmethod
    def validate_modules(cls, v: List[Module], info) -> List[Module]:
        """Ensure at least one module and that total_modules matches the list length."""
        if len(v) < 1:
            raise ValueError('Syllabus must have at least 1 module')
        # Fields are validated in declaration order, so total_modules is already
        # in info.data here. A validator on total_modules cannot see modules yet,
        # which is why the count check lives on this field.
        declared = info.data.get('total_modules')
        if declared is not None and declared != len(v):
            raise ValueError(f'total_modules ({declared}) must equal the number of modules ({len(v)})')
        return v


class SyllabusReview(BaseModel):
    """Represents a review of a syllabus by the validation layer."""
    approved: bool = Field(..., description="Whether the syllabus is approved")
    feedback: str = Field(..., description="Feedback or reasons for rejection/approval")
    issues: List[str] = Field(default_factory=list, description="List of specific issues found (if rejected)")
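
The models above do not tie `total_estimated_hours` to the per-topic durations, so the two values can drift apart in LLM output. A minimal sketch of a consistency check follows, written against the plain dictionary that `Syllabus.model_dump()` would return. The helper names `sum_topic_hours` and `hours_are_consistent`, the 0.25-hour tolerance, and the sample data are assumptions for illustration, not part of the notebook.

```python
from typing import Any, Dict


def sum_topic_hours(syllabus: Dict[str, Any]) -> float:
    """Add up every topic's estimated_duration_minutes and convert to hours."""
    total_minutes = sum(
        topic["estimated_duration_minutes"]
        for module in syllabus["modules"]
        for topic in module["topics"]
    )
    return total_minutes / 60


def hours_are_consistent(syllabus: Dict[str, Any], tolerance: float = 0.25) -> bool:
    """True when the declared total is within `tolerance` hours of the topic sum."""
    declared = syllabus["total_estimated_hours"]
    return abs(declared - sum_topic_hours(syllabus)) <= tolerance


# Hypothetical data shaped like Syllabus.model_dump() output (only the fields used here).
example = {
    "total_estimated_hours": 2.5,
    "modules": [
        {"topics": [{"estimated_duration_minutes": 60}, {"estimated_duration_minutes": 45}]},
        {"topics": [{"estimated_duration_minutes": 45}]},
    ],
}

print(sum_topic_hours(example))       # 150 minutes expressed in hours
print(hours_are_consistent(example))
```

A check like this could run right after generation, with a mismatch either corrected in place or reported as an issue.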

## User Preference System

Handles learning style preferences that customize the syllabus generation.

In [None]:
class UserPreferences:
    """Handles user preference collection and system prompt generation."""
    
    PREFERENCE_OPTIONS = {
        "1": {
            "name": "Real-world Examples",
            "description": "Course filled with practical, real-world examples and case studies",
            "prompt_addition": "Include extensive real-world examples, case studies, and practical applications for every concept. Show how each topic is used in industry."
        },
        "2": {
            "name": "Theory-Focused",
            "description": "Deep theoretical understanding with mathematical foundations",
            "prompt_addition": "Focus on theoretical foundations, mathematical principles, and academic rigor. Include proofs and formal definitions where applicable."
        },
        "3": {
            "name": "Project-Based",
            "description": "Learning through building projects and practical exercises",
            "prompt_addition": "Structure each module around hands-on projects and exercises. Ensure learners build something tangible in each module."
        },
        "4": {
            "name": "Quick & Practical",
            "description": "Fast-paced, focused on immediate practical skills",
            "prompt_addition": "Keep content concise and practical. Focus on essential skills needed to get started quickly. Minimize theory, maximize practical application."
        },
        "5": {
            "name": "Comprehensive & Deep",
            "description": "Thorough coverage with both theory and practice",
            "prompt_addition": "Provide comprehensive coverage balancing theory and practice. Include edge cases, best practices, and deep dives into complex topics."
        },
        "6": {
            "name": "Beginner-Friendly",
            "description": "Slow-paced with lots of explanations and support",
            "prompt_addition": "Use simple language, provide detailed explanations, include many examples, and assume no prior knowledge. Break down complex concepts into digestible parts."
        }
    }
    
    @staticmethod
    def display_options():
        """Display available learning style options."""
        print("=" * 80)
        print("SELECT YOUR LEARNING STYLE PREFERENCE")
        print("=" * 80)
        for key, value in UserPreferences.PREFERENCE_OPTIONS.items():
            print(f"{key}. {value['name']}")
            print(f"   {value['description']}")
            print()
    
    @staticmethod
    def get_user_preference(auto_select: str = None) -> dict:
        """
        Get user preference either interactively or auto-select.
        
        Args:
            auto_select: If provided, auto-select this option (for notebook automation)
            
        Returns:
            Dictionary containing preference details
        """
        if auto_select:
            choice = auto_select
        else:
            UserPreferences.display_options()
            choice = input("Enter your choice (1-6): ").strip()
        
        if choice in UserPreferences.PREFERENCE_OPTIONS:
            selected = UserPreferences.PREFERENCE_OPTIONS[choice]
            print(f"\n✓ Selected: {selected['name']}")
            return selected
        else:
            print("\n⚠ Invalid choice. Defaulting to 'Real-world Examples'")
            return UserPreferences.PREFERENCE_OPTIONS["1"]
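
`get_user_preference` combines the lookup with printing and `input()`, which makes the fallback rule awkward to test. The lookup on its own can be sketched as a pure function. `resolve_preference` and the trimmed-down `OPTIONS` table are hypothetical stand-ins for `UserPreferences.PREFERENCE_OPTIONS`, not code from the notebook.

```python
from typing import Dict, Optional, Tuple

# Trimmed-down stand-in for UserPreferences.PREFERENCE_OPTIONS (illustrative only).
OPTIONS: Dict[str, Dict[str, str]] = {
    "1": {"name": "Real-world Examples"},
    "3": {"name": "Project-Based"},
}


def resolve_preference(
    choice: Optional[str],
    options: Dict[str, Dict[str, str]],
    default_key: str = "1",
) -> Tuple[Dict[str, str], bool]:
    """Return (preference, used_default) without printing or prompting."""
    key = (choice or "").strip()
    if key in options:
        return options[key], False
    # Unknown or missing choice: fall back to the default option.
    return options[default_key], True


selected, used_default = resolve_preference(" 3 ", OPTIONS)
print(selected["name"], used_default)

fallback, fell_back = resolve_preference("9", OPTIONS)
print(fallback["name"], fell_back)
```

Returning the `used_default` flag lets the caller decide how to report an invalid choice, rather than printing from inside the helper.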

## Two-Layer Syllabus Generation System

This is the core system: one LLM call generates a syllabus, and a second, lower-temperature call to the same model reviews it and either approves or rejects it.

In [None]:
class SyllabusGenerator:
    """Two-layer system for generating and validating course syllabi."""
    
    def __init__(self, model_name: str = "phi3:mini", temperature: float = 0.7):
        """Initialize the syllabus generator with two layers."""
        self.model_name = model_name
        self.temperature = temperature
        
        # Layer 1: Syllabus Generator
        self.generator_llm = ChatOllama(model=model_name, temperature=temperature)
        self.structured_generator = self.generator_llm.with_structured_output(Syllabus)
        
        # Layer 2: Syllabus Reviewer (lower temperature for consistency)
        self.reviewer_llm = ChatOllama(model=model_name, temperature=0.3)
        self.structured_reviewer = self.reviewer_llm.with_structured_output(SyllabusReview)
        
    def _create_generator_prompt(self, user_preference: dict) -> ChatPromptTemplate:
        """Create the prompt for syllabus generation (Layer 1)."""
        system_message = f"""You are an expert curriculum designer creating a detailed course syllabus.

LEARNING STYLE PREFERENCE:
{user_preference['prompt_addition']}

SYLLABUS STRUCTURE REQUIREMENTS:
1. Create 3-8 modules that comprehensively cover the course
2. Each module should have 3-10 topics
3. Modules should progress logically from foundational to advanced concepts
4. Topics within a module should be related and build upon each other

TOPIC REQUIREMENTS:
- short_description: 1-2 sentences explaining what the user will learn (user-facing)
- detailed_description: Comprehensive description of the topic scope, key concepts, learning outcomes, and teaching approach (AI-facing, 3-5 sentences)
- estimated_duration_minutes: Realistic time estimate (typically 15-120 minutes per topic)

CONTENT QUALITY RULES:
- No overlapping content between topics or modules
- Each topic must be substantial and well-defined
- Ensure logical flow and prerequisites are respected
- Total course should be comprehensive but not overwhelming
- Calculate total_estimated_hours accurately based on all topic durations

AVOID:
- Vague or generic topic descriptions
- Redundant content
- Unrealistic time estimates
- Missing key concepts for the course level"""

        user_message = """Create a detailed syllabus for the following course:

Course Name: {course_name}
Course Description: {course_description}
Difficulty Level: {difficulty}

Generate a complete, well-structured syllabus following all requirements."""

        return ChatPromptTemplate.from_messages([
            ("system", system_message),
            ("user", user_message)
        ])
    
    def _create_reviewer_prompt(self) -> ChatPromptTemplate:
        """Create the prompt for syllabus review (Layer 2)."""
        system_message = """You are a senior educational quality assurance expert reviewing course syllabi.

Your task is to critically evaluate syllabi for quality, coherence, and educational value.

EVALUATION CRITERIA:

1. STRUCTURE (Critical):
   - Are modules logically organized and progressive?
   - Do topics within modules relate to each other?
   - Is there a clear learning path from beginner to advanced?

2. CONTENT QUALITY (Critical):
   - Are topic descriptions clear and specific?
   - Do detailed descriptions provide enough context for content creation?
   - Is content appropriate for the stated difficulty level?

3. NO REDUNDANCY (Critical):
   - Are there overlapping topics?
   - Is any content repeated across modules?
   - Are topics distinct and well-defined?

4. COMPLETENESS (Critical):
   - Does the syllabus cover all essential aspects of the course?
   - Are there any obvious gaps in coverage?
   - Are prerequisites and dependencies clear?

5. REALISM (Important):
   - Are time estimates realistic?
   - Is the total course length appropriate?
   - Can topics be reasonably covered in the estimated time?

DECISION RULES:
- APPROVE if: All critical criteria are met and the syllabus is high quality
- REJECT if: Any critical criteria fail or there are major quality issues

When REJECTING:
- Provide specific, actionable feedback
- List all issues found
- Explain what needs to be fixed

When APPROVING:
- Provide positive feedback
- May suggest minor improvements (but still approve)"""

        user_message = """Review the following syllabus:

{syllabus_json}

Evaluate based on all criteria and decide whether to approve or reject."""

        return ChatPromptTemplate.from_messages([
            ("system", system_message),
            ("user", user_message)
        ])
    
    def generate_syllabus(self, course: Course, user_preference: dict, max_attempts: int = 3) -> tuple[Syllabus, list]:
        """
        Generate a syllabus with two-layer validation.
        
        Args:
            course: Course object to create syllabus for
            user_preference: User's learning style preference
            max_attempts: Maximum number of generation attempts
            
        Returns:
            Tuple of (approved_syllabus, attempt_history)
        """
        generator_prompt = self._create_generator_prompt(user_preference)
        generator_chain = generator_prompt | self.structured_generator
        
        reviewer_prompt = self._create_reviewer_prompt()
        reviewer_chain = reviewer_prompt | self.structured_reviewer
        
        attempt_history = []
        feedback_note = ""  # reviewer feedback carried into the next attempt
        
        print("=" * 80)
        print(f"GENERATING SYLLABUS FOR: {course.course_name}")
        print(f"Learning Style: {user_preference['name']}")
        print("=" * 80)
        
        for attempt in range(1, max_attempts + 1):
            print(f"\n📝 Attempt {attempt}/{max_attempts}: Generating syllabus...")
            
            # Layer 1: Generate syllabus. The prompt template has no feedback
            # variable, so reviewer feedback is appended to the description.
            syllabus = generator_chain.invoke({
                "course_name": course.course_name,
                "course_description": course.description + feedback_note,
                "difficulty": course.difficulty
            })
            
            # Layer 2: Review syllabus
            print(f"🔍 Attempt {attempt}/{max_attempts}: Reviewing syllabus...")
            review = reviewer_chain.invoke({
                "syllabus_json": syllabus.model_dump_json(indent=2)
            })
            
            attempt_info = {
                "attempt": attempt,
                "syllabus": syllabus,
                "review": review
            }
            attempt_history.append(attempt_info)
            
            if review.approved:
                print(f"\n✅ APPROVED on attempt {attempt}!")
                print(f"Feedback: {review.feedback}")
                return syllabus, attempt_history
            else:
                print(f"\n❌ REJECTED on attempt {attempt}")
                print(f"Feedback: {review.feedback}")
                if review.issues:
                    print("Issues found:")
                    for issue in review.issues:
                        print(f"  - {issue}")
                
                if attempt < max_attempts:
                    print("\n🔄 Regenerating with feedback...")
                    issue_lines = "\n".join(f"- {issue}" for issue in review.issues)
                    feedback_note = (
                        "\n\nA reviewer rejected the previous draft. "
                        f"Address this feedback: {review.feedback}\n{issue_lines}"
                    )
        
        # If we get here, all attempts failed
        print(f"\n⚠️  WARNING: Syllabus not approved after {max_attempts} attempts.")
        print("Returning the last generated syllabus (with issues).")
        return syllabus, attempt_history
    
    def print_syllabus(self, syllabus: Syllabus) -> None:
        """Pretty print a syllabus."""
        print("\n" + "=" * 80)
        print(f"COURSE SYLLABUS: {syllabus.course_name}")
        print("=" * 80)
        print(f"Objective: {syllabus.course_objective}")
        print(f"Learning Style: {syllabus.learning_style}")
        print(f"Total Modules: {syllabus.total_modules}")
        print(f"Total Estimated Hours: {syllabus.total_estimated_hours:.1f}")
        print("=" * 80)
        
        for mod_idx, module in enumerate(syllabus.modules, 1):
            print(f"\nüìö MODULE {mod_idx}: {module.module_name}")
            print(f"   {module.module_description}")
            print(f"   Topics: {len(module.topics)}")
            print()
            
            for topic_idx, topic in enumerate(module.topics, 1):
                print(f"   {mod_idx}.{topic_idx} {topic.topic_name}")
                print(f"       üìñ User Description: {topic.short_description}")
                print(f"       ü§ñ AI Description: {topic.detailed_description}")
                print(f"       ‚è±Ô∏è  Duration: {topic.estimated_duration_minutes} minutes")
                print()
        
        print("=" * 80)
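
The generator prompt asks for 3-8 modules, 3-10 topics per module, and topic durations that are typically 15-120 minutes, while the Pydantic models only enforce "at least one". Those numeric rules can be checked deterministically, without an LLM. The sketch below assumes the syllabus is available as the dictionary produced by `model_dump()`; `find_structure_issues` and the sample `draft` are hypothetical and not part of `SyllabusGenerator`.

```python
from typing import Any, Dict, List


def find_structure_issues(syllabus: Dict[str, Any]) -> List[str]:
    """Check the numeric rules stated in the generator prompt."""
    issues: List[str] = []
    modules = syllabus["modules"]
    if not 3 <= len(modules) <= 8:
        issues.append(f"expected 3-8 modules, found {len(modules)}")
    for module in modules:
        topics = module["topics"]
        if not 3 <= len(topics) <= 10:
            issues.append(
                f"module '{module['module_name']}': expected 3-10 topics, found {len(topics)}"
            )
        for topic in topics:
            minutes = topic["estimated_duration_minutes"]
            if not 15 <= minutes <= 120:
                issues.append(
                    f"topic '{topic['topic_name']}': {minutes} minutes is outside 15-120"
                )
    return issues


# Hypothetical draft with too few modules, a thin module, and an overlong topic.
draft = {
    "modules": [
        {
            "module_name": "Foundations",
            "topics": [
                {"topic_name": "Setup", "estimated_duration_minutes": 30},
                {"topic_name": "Syntax", "estimated_duration_minutes": 45},
                {"topic_name": "Data types", "estimated_duration_minutes": 60},
            ],
        },
        {
            "module_name": "Capstone",
            "topics": [{"topic_name": "Final project", "estimated_duration_minutes": 200}],
        },
    ]
}

for issue in find_structure_issues(draft):
    print(issue)
```

Issues found this way could be merged into the reviewer's list, or used to skip the reviewer call for a draft that already fails the basic rules.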

## Load a Course (Option 1: From Saved File)

Load a course that was saved from the course planning notebook.

In [None]:
# Try to load a saved course
try:
    with open("output/selected_course.pkl", "rb") as f:
        loaded_course = pickle.load(f)
    
    print("✓ Successfully loaded course from file:")
    print(f"  - Course: {loaded_course.course_name}")
    print(f"  - Difficulty: {loaded_course.difficulty}")
    print(f"  - Description: {loaded_course.description}")
except FileNotFoundError:
    print("⚠ No saved course found. Please run the course planning notebook first,")
    print("  or manually create a course in the next cell.")
    loaded_course = None
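
`pickle.load` can execute code embedded in the file, so it should only be used on files you created yourself. If the course planning notebook can be changed, a JSON file carries the same four fields without that risk. The sketch below assumes the course is exchanged as a plain dictionary using the `Course` field names. The helpers `save_course` and `load_course`, the file name, and the temporary directory are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


def save_course(data: Dict[str, Any], path: Path) -> None:
    """Write the course fields as UTF-8 JSON."""
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")


def load_course(path: Path) -> Optional[Dict[str, Any]]:
    """Return the stored course fields, or None when the file does not exist."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


course_data = {
    "course_name": "Introduction to Data Science",
    "description": "Fundamentals of data science.",
    "difficulty": "Beginner",
    "prerequisites": [],
}

# A temporary directory stands in for the notebook's output folder.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "selected_course.json"
    save_course(course_data, target)
    restored = load_course(target)
    missing = load_course(Path(tmp) / "no_such_course.json")

print(restored == course_data)
print(missing)
```

In the notebook, the restored dictionary could be passed to `Course(**restored)` so that the difficulty validator runs again on load.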

## Create a Course Manually (Option 2: Create Course Directly)

If you don't have a saved course, create one manually here.

In [None]:
# Option: Create a course manually for testing
manual_course = Course(
    course_name="Introduction to Data Science",
    description="Learn the fundamentals of data science including Python programming, statistics, data manipulation with pandas, and basic data visualization techniques.",
    difficulty="Beginner",
    prerequisites=[]
)

print("✓ Manually created course:")
print(f"  - Course: {manual_course.course_name}")
print(f"  - Difficulty: {manual_course.difficulty}")
print(f"  - Description: {manual_course.description}")

# Prefer the course loaded from file; fall back to the manual one
selected_course = loaded_course if loaded_course is not None else manual_course
print(f"\n🎯 Using course: {selected_course.course_name}")

## Select Learning Style Preference

In [None]:
# Display options and get user preference
# For notebook automation, we auto-select option "1" (Real-world Examples)
# In interactive mode, set auto_select=None

user_pref = UserPreferences.get_user_preference(auto_select="1")  # Change to None for interactive

## Generate Syllabus with Two-Layer Validation

In [None]:
# Initialize the syllabus generator
syllabus_gen = SyllabusGenerator(model_name="phi3:mini", temperature=0.7)

# Generate syllabus with validation
approved_syllabus, history = syllabus_gen.generate_syllabus(
    course=selected_course,
    user_preference=user_pref,
    max_attempts=3
)

# Display the final syllabus
syllabus_gen.print_syllabus(approved_syllabus)
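
The control flow of `generate_syllabus` can be exercised without an Ollama server by replacing both layers with plain functions. In the sketch below, `generate_with_review`, `fake_generate`, and `fake_review` are hypothetical. A string stands in for a `Syllabus` and a `(approved, feedback)` tuple for a `SyllabusReview`. It shows the reviewer's feedback being handed to the next generation attempt.

```python
from typing import Callable, List, Optional, Tuple

Draft = str                 # stand-in for a Syllabus
Review = Tuple[bool, str]   # (approved, feedback), stand-in for SyllabusReview


def generate_with_review(
    generate: Callable[[Optional[str]], Draft],
    review: Callable[[Draft], Review],
    max_attempts: int = 3,
) -> Tuple[Draft, List[Review]]:
    """Generate, review, and retry, passing reviewer feedback to the next attempt."""
    history: List[Review] = []
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_attempts):
        draft = generate(feedback)
        approved, feedback = review(draft)
        history.append((approved, feedback))
        if approved:
            break
    # When no attempt is approved, the last draft is returned as-is.
    return draft, history


# Stub generator: it only adds a capstone module once feedback asks for one.
def fake_generate(feedback: Optional[str]) -> Draft:
    return "intro, pandas, capstone" if feedback else "intro, pandas"


# Stub reviewer: it approves only drafts that contain a capstone module.
def fake_review(draft: Draft) -> Review:
    if "capstone" in draft:
        return True, "Looks complete."
    return False, "Add a capstone module."


final_draft, reviews = generate_with_review(fake_generate, fake_review)
print(final_draft)
print(len(reviews))
```

Keeping the loop separate from the LLM calls in this way makes the retry and fallback behaviour testable in isolation.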

## Review Attempt History

In [None]:
# Display the history of all attempts
print("=" * 80)
print("SYLLABUS GENERATION ATTEMPT HISTORY")
print("=" * 80)

for attempt_info in history:
    attempt = attempt_info['attempt']
    review = attempt_info['review']
    syllabus = attempt_info['syllabus']
    
    status = "✅ APPROVED" if review.approved else "❌ REJECTED"
    print(f"\nAttempt {attempt}: {status}")
    print(f"Modules Generated: {syllabus.total_modules}")
    print(f"Total Topics: {sum(len(m.topics) for m in syllabus.modules)}")
    print(f"Estimated Hours: {syllabus.total_estimated_hours:.1f}")
    print(f"Review Feedback: {review.feedback}")
    
    if review.issues:
        print(f"Issues Found:")
        for issue in review.issues:
            print(f"  - {issue}")
    print("-" * 80)

## Export Syllabus to JSON

In [None]:
# Export the approved syllabus to JSON
syllabus_json = approved_syllabus.model_dump_json(indent=2)

print("SYLLABUS JSON OUTPUT:")
print("=" * 80)
print(syllabus_json)
print("=" * 80)

# Save to file
output_dir = "output"
os.makedirs(output_dir, exist_ok=True)

filename = f"{output_dir}/{approved_syllabus.course_name.replace(' ', '_')}_syllabus.json"
with open(filename, 'w', encoding='utf-8') as f:
    f.write(syllabus_json)

print(f"\n✓ Syllabus saved to: {filename}")
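
The file name above is built from `course_name`, which comes from the LLM and may contain characters such as `/` or `:` that are unsafe in paths. A minimal sketch of a sanitizer follows. `safe_filename`, its allowed character set, and the `"course"` fallback are assumptions for illustration.

```python
import re


def safe_filename(name: str, suffix: str = "_syllabus.json") -> str:
    """Reduce a course name to letters, digits, '-' and '_' before adding the suffix."""
    # Collapse every run of disallowed characters into a single underscore.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
    # Fall back to a fixed stem when nothing usable is left.
    return (stem or "course") + suffix


print(safe_filename("Introduction to Data Science"))
print(safe_filename("C/C++: Basics"))
```

The result could replace the `course_name.replace(' ', '_')` expression when building `filename`.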

## Try Different Learning Styles

Generate syllabi with different learning preferences to see how they differ.

In [None]:
# Try with a different learning style - Project-Based
project_pref = UserPreferences.get_user_preference(auto_select="3")

project_syllabus, project_history = syllabus_gen.generate_syllabus(
    course=selected_course,
    user_preference=project_pref,
    max_attempts=3
)

print("\n" + "=" * 80)
print("COMPARISON: Real-world Examples vs Project-Based")
print("=" * 80)
print(f"\nReal-world Examples Style:")
print(f"  - Modules: {approved_syllabus.total_modules}")
print(f"  - Topics: {sum(len(m.topics) for m in approved_syllabus.modules)}")
print(f"  - Hours: {approved_syllabus.total_estimated_hours:.1f}")

print(f"\nProject-Based Style:")
print(f"  - Modules: {project_syllabus.total_modules}")
print(f"  - Topics: {sum(len(m.topics) for m in project_syllabus.modules)}")
print(f"  - Hours: {project_syllabus.total_estimated_hours:.1f}")

## Summary: Two-Layer System Benefits

The two-layer validation system provides:

1. **Quality Assurance**: Second layer catches issues like:
   - Overlapping content
   - Incomplete coverage
   - Vague descriptions
   - Unrealistic time estimates

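2. **Hallucination Mitigation**:
   - Detailed, AI-facing descriptions constrain the scope of later content generation
   - Review layer checks coherence, coverage, and redundancy; it does not independently verify facts

3. **Iterative Improvement**:
   - Automatic regeneration when the reviewer rejects a draft
   - Up to `max_attempts` attempts (3 by default); if none is approved, the last draft is returned with a warning
   - Clear tracking of issues across attempts in the attempt history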

4. **Flexibility**:
   - Multiple learning style options (6 choices)
   - Customizable prompts per preference
   - Variable module/topic structure (the prompt requests 3-8 modules with 3-10 topics each)

5. **Integration Features**:
   - Pydantic validation of every generated syllabus and review
   - JSON export for API integration
   - Detailed logging and history tracking
   - Dual descriptions (user-friendly + AI-detailed)