# Lesson 2: Advanced Agentic AI

## Grounding Foundation Models & Multi-Agent Committees

In Lesson 1, you learned that LLMs are powerful pattern predictors but have limitations:
- They can't access real-time information
- They don't know about YOUR specific data
- They work in isolation

Today, we'll solve these problems by:
1. **Grounding** LLMs with your own knowledge base
2. **Coordinating** multiple agents to solve complex tasks

### Guiding Questions:
1. How can we make LLMs answer questions about information they've never seen?
2. Can multiple AI agents work together better than one?

---

# Part 1: Grounding Foundation Models

## What Does "Grounding" Mean?

**Grounding** means connecting an LLM's responses to specific, verifiable sources of information.

### The Problem:
- LLMs only know what was in their training data, which ends at a cutoff date that varies by model
- They can hallucinate (give confident but unfounded answers) when they don't know something
- They can't access YOUR documents, databases, or private information

### The Solution: RAG (Retrieval-Augmented Generation)
**RAG** = Retrieve relevant information → Augment the prompt → Generate grounded responses

Think of it like an open-book exam vs. a closed-book exam!
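
The three steps can be sketched in plain Python before bringing in any library. This is a minimal illustration rather than the lesson's implementation: `TOY_DOCS`, `retrieve`, `augment`, and `generate` are hypothetical names, and `generate` is a stand-in that only returns the prompt where a real system would call an LLM.

```python
# Toy document store (made-up contents, for illustration only).
TOY_DOCS = {
    "parking": "Visitors may park in Lot B for up to 2 hours.",
    "wifi": "Guest wifi is available in the lobby and the cafeteria.",
}


def retrieve(question: str, docs: dict[str, str]) -> list[str]:
    """Step 1: return documents whose topic key appears in the question."""
    q = question.lower()
    return [text for key, text in docs.items() if key in q]


def augment(question: str, passages: list[str]) -> str:
    """Step 2: place the retrieved passages in the prompt next to the question."""
    context = "\n".join(passages) if passages else "(no documents found)"
    return (
        "Answer using ONLY the context below. "
        "If the answer is not there, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def generate(prompt: str) -> str:
    """Step 3: placeholder for an LLM call; here it just returns the prompt."""
    return prompt


question = "Where can visitors find parking?"
prompt = augment(question, retrieve(question, TOY_DOCS))
print(generate(prompt))
```

The printed prompt shows what the model would actually see: the question plus the retrieved passage, which is the "open book" in the exam analogy.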

## Setup: Import Libraries

Let's set up our environment for building a grounded agent.

In [None]:
import os
os.environ['HF_HUB_DISABLE_PROGRESS_BARS'] = '1'
os.environ['TRANSFORMERS_NO_ADVISORY_WARNINGS'] = '1'
os.environ['TOKENIZERS_PARALLELISM'] = 'false'

import asyncio
from dotenv import load_dotenv
from fairlib.modules.mal.huggingface_adapter import HuggingFaceAdapter
from fairlib.core.message import Message
from fairlib.core.tool import BaseTool, ToolRegistry
from fairlib.core.tool_executor import ToolExecutor
from fairlib.core.memory import WorkingMemory
from fairlib.prompting.role_definition import RoleDefinition
from fairlib.agents.simple_agent import SimpleAgent
from fairlib.planning.simple_react_planner import SimpleReActPlanner

# Load environment variables
load_dotenv()
token = os.getenv("HUGGING_FACE_HUB_TOKEN")

if not token:
    print("Warning: HUGGING_FACE_HUB_TOKEN not found in .env file!")
else:
    print("Token loaded successfully!")

## Experiment 1: The Hallucination Problem

Let's see what happens when we ask an LLM about information it doesn't have.

In [None]:
# Load a language model
print("Loading language model...")
llm = HuggingFaceAdapter(model_name="dolphin3-qwen25-3b", auth_token=token)
print("Model loaded!")

In [None]:
# Ask about fictional company policy
question = "What is Acme Corporation's policy on remote work?"

messages = [Message(role="user", content=question)]
response = llm.invoke(messages)

print(f"Question: {question}")
print(f"\nLLM Response:\n{response.content}")
print("\n⚠️ WARNING: This information might be completely made up! ⚠️")

### Reflection Question 1

**Did the LLM admit it doesn't know, or did it make something up?**

This is the hallucination problem: LLMs try to answer even when they don't have the information!

## Building a Knowledge Base

Let's create a simple document store that represents company knowledge.

In [None]:
# Create a fake knowledge base (in real systems, this would be a vector database)
KNOWLEDGE_BASE = {
    "remote_work": """
    Acme Corporation Remote Work Policy (Updated 2024):
    
    - All employees may work remotely up to 3 days per week
    - Managers must approve remote schedules in advance
    - Core collaboration hours are 10 AM - 3 PM EST
    - Video must be enabled during team meetings
    - Home office equipment reimbursement: up to $500/year
    """,
    
    "vacation": """
    Acme Corporation Vacation Policy:
    
    - New employees: 15 days PTO per year
    - After 3 years: 20 days PTO per year
    - After 7 years: 25 days PTO per year
    - Vacation requests must be submitted 2 weeks in advance
    - Maximum carryover: 5 unused days to next year
    """,
    
    "benefits": """
    Acme Corporation Benefits Package:
    
    - Health insurance: 80% employer-paid premium
    - 401(k) matching: 6% of salary
    - Professional development: $2,000/year
    - Gym membership: $50/month reimbursement
    - Parental leave: 12 weeks paid for primary caregiver
    """
}

print("Knowledge base created!")
print(f"Documents: {list(KNOWLEDGE_BASE.keys())}")

## Creating a Document Search Tool

Now we'll create a tool that can search this knowledge base.

In [None]:
class DocumentSearchTool(BaseTool):
    """A tool that searches company documents"""
    
    name = "search_documents"
    description = (
        "Searches Acme Corporation's internal documents. "
        "Input should be a search query or topic (e.g., 'remote work', 'vacation', 'benefits'). "
        "Returns relevant company policy information."
    )
    
    def __init__(self, knowledge_base):
        super().__init__()
        self.knowledge_base = knowledge_base
    
    def execute(self, tool_input: str) -> str:
        """Search for documents matching the query"""
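        query = tool_input.lower().strip()
        
        # Simple keyword matching (real systems use vector similarity).
        # Keys use underscores ("remote_work"), so compare against a spaced form.
        # Note: any shared word counts as a match, so common words match every document.
        results = []
        for doc_key, doc_content in self.knowledge_base.items():
            key_text = doc_key.replace("_", " ")
            if query in key_text or any(word in doc_content.lower() for word in query.split()):
                results.append(doc_content)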
        
        if results:
            return "\n\n---\n\n".join(results)
        else:
            return "No relevant documents found."

# Create the tool
doc_search = DocumentSearchTool(KNOWLEDGE_BASE)
print("Document search tool created!")

### Test the Search Tool

In [None]:
# Test the tool directly
result = doc_search.execute("remote work")
print("Search Results:")
print(result)
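
The search tool's comment notes that real systems use vector similarity instead of keyword matching. The sketch below gives a rough feel for that idea by ranking documents with cosine similarity over word counts. It is an assumption-laden stand-in: production systems use learned embeddings and a vector database, and the names `bow_vector`, `cosine`, and `rank_documents` are hypothetical.

```python
import math
import re
from collections import Counter


def bow_vector(text: str) -> Counter:
    """Turn text into a bag-of-words count vector (stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(query: str, docs: dict[str, str]) -> list[tuple[str, float]]:
    """Score every document against the query, highest similarity first."""
    q = bow_vector(query)
    scores = [(key, cosine(q, bow_vector(text))) for key, text in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Tiny stand-in corpus so the block runs on its own.
docs = {
    "remote_work": "Employees may work remotely up to 3 days per week.",
    "vacation": "New employees receive 15 days of paid vacation per year.",
}
for key, score in rank_documents("how many vacation days", docs):
    print(f"{key}: {score:.2f}")
```

Unlike the yes/no keyword match, every document gets a score, so a tool could return only the top result instead of everything that shares a word with the query.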

## Building a Grounded Agent

Now let's create an agent that can use this tool to provide grounded answers!

In [None]:
# Create a separate LLM instance for the agent
agent_llm = HuggingFaceAdapter(model_name="dolphin3-qwen25-3b", auth_token=token)

# Register the tool
tool_registry = ToolRegistry()
tool_registry.register(doc_search)

# Create executor, memory, and planner
executor = ToolExecutor(tool_registry)
memory = WorkingMemory()
planner = SimpleReActPlanner(agent_llm, tool_registry)

# Define the agent's role
planner.prompt_builder.role_definition = RoleDefinition(
    "You are Acme Corporation's HR assistant. "
    "Your job is to answer employee questions about company policies accurately. "
    "ALWAYS search the company documents before answering - never make up information. "
    "Use the search_documents tool to find relevant policies, then base your answer on those documents. "
    "If the information isn't in the documents, say so."
)

# Assemble the agent
grounded_agent = SimpleAgent(
    llm=agent_llm,
    planner=planner,
    tool_executor=executor,
    memory=memory,
    max_steps=5
)

print("Grounded agent created successfully!")

## Testing the Grounded Agent

Let's ask the same question - but now the agent has access to real documents!

In [None]:
async def test_grounded_agent(question):
    print(f"Question: {question}")
    print("\nAgent thinking...\n")
    
    response = await grounded_agent.arun(question)
    
    print(f"\nAgent Response:\n{response}")
    return response

In [None]:
# Test 1: Question about remote work
await test_grounded_agent("What is Acme Corporation's policy on remote work?")

In [None]:
# Test 2: Question about vacation
await test_grounded_agent("How many vacation days do employees get after 5 years?")

In [None]:
# Test 3: Question about something NOT in the documents
await test_grounded_agent("What is the dress code policy?")

### Reflection Question 2

**What's different between the pure LLM and the grounded agent?**

Notice:
- The grounded agent searches documents first
- Answers are based on actual policy text
- When the documents don't cover a topic, the agent is instructed to say so instead of guessing (check whether it did in Test 3!)

This is **Retrieval-Augmented Generation (RAG)** in action!

## Key Insights: Grounding

### Without Grounding (Pure LLM):
```
User: "What's the remote work policy?"
  ↓
LLM: [Predicts based on general patterns]
  ↓
LLM: "Most companies allow 2-3 days..." (HALLUCINATION!)
```

### With Grounding (RAG Agent):
```
User: "What's the remote work policy?"
  ↓
Agent: [Searches documents]
  ↓
Tool: Returns actual policy text
  ↓
Agent: [Generates answer based on retrieved text]
  ↓
Agent: "According to company policy, employees may work remotely up to 3 days per week..."
```

**Grounding = Answers You Can Trace Back to a Source and Verify**
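
Because a grounded answer comes with its source text, it can be checked mechanically. The sketch below is one very rough check, assuming that any number in the answer should also appear in the retrieved document. `unsupported_numbers` is a hypothetical helper, not part of any library, and it will miss errors that don't involve numbers.

```python
import re


def unsupported_numbers(answer: str, source: str) -> list[str]:
    """Return numbers that appear in the answer but not in the source text."""
    source_numbers = set(re.findall(r"\d+(?:\.\d+)?", source))
    answer_numbers = re.findall(r"\d+(?:\.\d+)?", answer)
    return [n for n in answer_numbers if n not in source_numbers]


source = "All employees may work remotely up to 3 days per week."
grounded = "Employees may work remotely up to 3 days per week."
invented = "Employees may work remotely up to 4 days per week."

print(unsupported_numbers(grounded, source))  # []
print(unsupported_numbers(invented, source))  # ['4']
```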

---

# Part 2: Committees of AI Agents

## Why Use Multiple Agents?

Just like human teams, different agents can have different:
- **Roles** (specialist vs. generalist)
- **Tools** (some agents have access to certain resources)
- **Expertise** (some are better at specific tasks)

### Real-World Examples:
- **Code Review Committee**: One agent writes code, another reviews for bugs, another checks style
- **Essay Grading Team**: One checks grammar, another evaluates argument quality, another verifies facts
- **Research Team**: One searches papers, another summarizes, another synthesizes findings

## Example Task: Essay Evaluation

Let's build a committee to grade essays with three specialized agents:
1. **Grammar Agent**: Checks spelling, grammar, and clarity
2. **Content Agent**: Evaluates argument quality and evidence
3. **Coordinator Agent**: Synthesizes feedback and assigns final grade

## Sample Essay for Testing

In [None]:
sample_essay = """
The Impact of Artificial Intelligence on Education

Artificial intelligence is revolutionizing education in many ways. AI can personalize learning 
by adapting to each students pace and style. For example, intelligent tutoring systems can 
identify where a student struggles and provide targeted practice.

However, their are concerns about AI in education. Some worry that students might become to 
dependent on AI tools and not develop critical thinking skills. Others point out that AI 
systems can perpetuate biases if there trained on biased data.

Despite these challenges, the benefits outweigh the risks. AI can help teachers by automating 
administrative tasks, allowing them to focus on actual teaching. It can also make education 
more accessible to students in remote areas who otherwise wouldn't have access to quality 
instruction.

In conclusion, AI has the potential to transform education for the better, but we must 
implement it thoughtfully and address the ethical concerns. The future of education will 
likely involve a partnership between human teachers and AI systems, combining the best of both.
"""

print("Sample essay loaded!")
print(f"Length: {len(sample_essay.split())} words")

## Building Specialized Agent Tools

First, let's create tools that represent specialized capabilities.

In [None]:
class GrammarCheckTool(BaseTool):
    """Tool for checking grammar and writing quality"""
    
    name = "check_grammar"
    description = (
        "Analyzes text for grammar, spelling, and clarity issues. "
        "Input: essay text. Returns: detailed grammar feedback."
    )
    
    def execute(self, tool_input: str) -> str:
        """Simulate grammar checking"""
        issues = []
        
        # Simple checks (real systems use NLP libraries)
        if "their are" in tool_input.lower():
            issues.append("- Found 'their are' - should be 'there are'")
        if "to dependent" in tool_input.lower():
            issues.append("- Found 'to dependent' - should be 'too dependent'")
        if "students pace" in tool_input.lower():
            issues.append("- Found 'students pace' - missing apostrophe: 'student's pace'")
        if "there trained" in tool_input.lower():
            issues.append("- Found 'there trained' - should be 'they're trained'")
        
        if issues:
            return "Grammar Issues Found:\n" + "\n".join(issues)
        else:
            return "No major grammar issues detected."


class ContentAnalysisTool(BaseTool):
    """Tool for analyzing argument quality and evidence"""
    
    name = "analyze_content"
    description = (
        "Evaluates the quality of arguments and evidence in an essay. "
        "Input: essay text. Returns: content quality assessment."
    )
    
    def execute(self, tool_input: str) -> str:
        """Simulate content analysis"""
        text_lower = tool_input.lower()
        
        # Count evidence markers
        evidence_markers = ["for example", "research shows", "studies indicate", "data suggests"]
        evidence_count = sum(1 for marker in evidence_markers if marker in text_lower)
        
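        # Check structure (crude heuristic: three or more paragraphs implies an introduction)
        paragraphs = [p for p in tool_input.split("\n\n") if p.strip()]
        has_intro = "introduction" in text_lower or len(paragraphs) >= 3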
        has_conclusion = "conclusion" in text_lower or "in summary" in text_lower
        
        # Check for counterarguments
        has_counterargument = any(word in text_lower for word in ["however", "although", "despite", "concern"])
        
        feedback = "Content Analysis:\n"
        feedback += f"- Evidence examples: {evidence_count}\n"
        feedback += f"- Has clear introduction: {has_intro}\n"
        feedback += f"- Has conclusion: {has_conclusion}\n"
        feedback += f"- Addresses counterarguments: {has_counterargument}\n"
        
        if evidence_count < 2:
            feedback += "\nSuggestion: Add more specific examples and evidence to support claims."
        
        return feedback


# Create the tools
grammar_tool = GrammarCheckTool()
content_tool = ContentAnalysisTool()

print("Specialized tools created!")

## Creating Specialized Agents

Now we'll create three agents with different roles and tools.

In [None]:
# Agent 1: Grammar Specialist
grammar_llm = HuggingFaceAdapter(model_name="dolphin3-qwen25-3b", auth_token=token)
grammar_registry = ToolRegistry()
grammar_registry.register(grammar_tool)
grammar_executor = ToolExecutor(grammar_registry)
grammar_memory = WorkingMemory()
grammar_planner = SimpleReActPlanner(grammar_llm, grammar_registry)

grammar_planner.prompt_builder.role_definition = RoleDefinition(
    "You are a Grammar Specialist. "
    "Your job is to evaluate the grammar, spelling, and writing clarity of essays. "
    "Use the check_grammar tool to analyze the text, then provide specific feedback. "
    "Rate grammar quality on a scale of 1-10 and explain your rating."
)

grammar_agent = SimpleAgent(
    llm=grammar_llm,
    planner=grammar_planner,
    tool_executor=grammar_executor,
    memory=grammar_memory,
    max_steps=5
)

print("✓ Grammar Agent created")

In [None]:
# Agent 2: Content Specialist
content_llm = HuggingFaceAdapter(model_name="dolphin3-qwen25-3b", auth_token=token)
content_registry = ToolRegistry()
content_registry.register(content_tool)
content_executor = ToolExecutor(content_registry)
content_memory = WorkingMemory()
content_planner = SimpleReActPlanner(content_llm, content_registry)

content_planner.prompt_builder.role_definition = RoleDefinition(
    "You are a Content Specialist. "
    "Your job is to evaluate the quality of arguments, evidence, and logical structure in essays. "
    "Use the analyze_content tool to assess the text, then provide feedback on argument strength. "
    "Rate content quality on a scale of 1-10 and explain your rating."
)

content_agent = SimpleAgent(
    llm=content_llm,
    planner=content_planner,
    tool_executor=content_executor,
    memory=content_memory,
    max_steps=5
)

print("✓ Content Agent created")

In [None]:
# Agent 3: Coordinator (no special tools, just synthesis)
coordinator_llm = HuggingFaceAdapter(model_name="dolphin3-qwen25-3b", auth_token=token)
coordinator_registry = ToolRegistry()  # No tools needed
coordinator_executor = ToolExecutor(coordinator_registry)
coordinator_memory = WorkingMemory()
coordinator_planner = SimpleReActPlanner(coordinator_llm, coordinator_registry)

coordinator_planner.prompt_builder.role_definition = RoleDefinition(
    "You are the Lead Evaluator and Coordinator. "
    "Your job is to synthesize feedback from the Grammar and Content specialists. "
    "Review their assessments, then provide: "
    "1) An overall grade (A-F) "
    "2) A summary of strengths "
    "3) Key areas for improvement "
    "Be fair but thorough in your evaluation."
)

coordinator_agent = SimpleAgent(
    llm=coordinator_llm,
    planner=coordinator_planner,
    tool_executor=coordinator_executor,
    memory=coordinator_memory,
    max_steps=5
)

print("✓ Coordinator Agent created")

## Running the Committee

Now let's coordinate the agents to evaluate our essay!

In [None]:
async def evaluate_essay_with_committee(essay_text):
    """Coordinate multiple agents to evaluate an essay"""
    
    print("="*60)
    print("ESSAY EVALUATION COMMITTEE")
    print("="*60)
    
    # Step 1: Grammar Agent evaluates
    print("\n[GRAMMAR SPECIALIST ANALYZING...]\n")
    grammar_feedback = await grammar_agent.arun(
        f"Please evaluate the grammar and writing quality of this essay:\n\n{essay_text}"
    )
    print(f"Grammar Specialist Report:\n{grammar_feedback}")
    
    # Step 2: Content Agent evaluates
    print("\n" + "="*60)
    print("\n[CONTENT SPECIALIST ANALYZING...]\n")
    content_feedback = await content_agent.arun(
        f"Please evaluate the content quality and argumentation of this essay:\n\n{essay_text}"
    )
    print(f"Content Specialist Report:\n{content_feedback}")
    
    # Step 3: Coordinator synthesizes
    print("\n" + "="*60)
    print("\n[LEAD EVALUATOR SYNTHESIZING...]\n")
    
    coordinator_prompt = f"""
    Please review the following specialist reports and provide a final evaluation:
    
    GRAMMAR SPECIALIST REPORT:
    {grammar_feedback}
    
    CONTENT SPECIALIST REPORT:
    {content_feedback}
    
    Provide:
    1. Overall Grade (A-F)
    2. Summary of Strengths
    3. Key Areas for Improvement
    """
    
    final_evaluation = await coordinator_agent.arun(coordinator_prompt)
    
    print("="*60)
    print("FINAL EVALUATION")
    print("="*60)
    print(final_evaluation)
    print("\n" + "="*60)
    
    return final_evaluation

In [None]:
# Run the committee evaluation!
await evaluate_essay_with_committee(sample_essay)

### Reflection Question 3

**What advantages does the committee approach have over a single agent?**

Think about:
- Specialization: Each agent focuses on what it does best
- Division of labor: Complex task broken into manageable pieces
- Checks and balances: Multiple perspectives reduce bias
- Transparency: You can see each agent's reasoning

## Understanding Multi-Agent Patterns

### Common Committee Structures:

**1. Sequential Pipeline** (what we just built; the agents run one at a time, and the coordinator reads both specialist reports):
```
Essay → Grammar Agent → Content Agent → Coordinator → Final Grade
```

**2. Parallel Processing**:
```
               ┌─→ Grammar Agent ─┐
    Essay ────┤                   ├──→ Coordinator → Result
               └─→ Content Agent ─┘
```

**3. Hierarchical Structure**:
```
    Manager Agent
         |
    ┌────┴────┐
  Worker 1  Worker 2
```

**4. Debate/Consensus**:
```
Agent A ←→ Agent B ←→ Agent C
     ↓         ↓         ↓
        Consensus Result
```
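
The difference between the sequential and parallel structures can be sketched with `asyncio`. In this minimal example, `fake_agent` is a hypothetical stand-in that sleeps instead of calling a model, so the block runs without any agent library; only the coordination pattern is the point.

```python
import asyncio
import time


async def fake_agent(name: str, essay: str, delay: float = 0.2) -> str:
    """Stand-in for an agent's async run; sleeps instead of calling a model."""
    await asyncio.sleep(delay)
    return f"{name} report on {len(essay.split())} words"


async def run_sequential(essay: str) -> list[str]:
    """Run the specialists one after another."""
    grammar = await fake_agent("Grammar", essay)
    content = await fake_agent("Content", essay)
    return [grammar, content]


async def run_parallel(essay: str) -> list[str]:
    """Start both specialists at once; gather returns results in argument order."""
    return list(await asyncio.gather(
        fake_agent("Grammar", essay),
        fake_agent("Content", essay),
    ))


async def compare(essay: str) -> None:
    """Time both strategies on the same essay."""
    for runner in (run_sequential, run_parallel):
        start = time.perf_counter()
        reports = await runner(essay)
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f}s -> {reports}")


if __name__ == "__main__":
    asyncio.run(compare("AI is changing education"))
```

In a notebook cell, where an event loop is already running, call `await compare(...)` instead of `asyncio.run(...)`. Whether real agents speed up this way depends on whether their async run method yields control while the model is working; with a single local model the gain may be small.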

## Try It Yourself!

Test the committee on your own essay or writing sample:

In [None]:
your_essay = """
Paste your essay here!
"""

# Uncomment to test:
# await evaluate_essay_with_committee(your_essay)

---

## Key Insights: Multi-Agent Systems

### Why Committees Can Beat Single Agents:

**Specialization**:
- Each agent focuses on one skill
- Like human experts, focused agents tend to perform better on their own task

**Scalability**:
- Add new specialists as needed
- Agents can work in parallel for speed

**Reliability**:
- Multiple perspectives catch more issues
- One agent's weakness is another's strength

**Transparency**:
- See each agent's reasoning
- Easier to debug and improve

### When to Use Committees:
- ✅ Complex tasks requiring multiple skills
- ✅ Quality-critical applications (grading, review, analysis)
- ✅ Tasks that benefit from different perspectives
- ❌ Simple queries (overkill)
- ❌ Extremely time-sensitive tasks (coordination adds latency)
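
The reliability point above, that multiple perspectives catch more issues, is easiest to see in a voting setup where several graders each propose a grade and the committee takes the most common one. The sketch below shows only the tallying step. `majority_grade` and the tie-breaking rule (the stricter grade wins) are assumptions chosen for illustration.

```python
from collections import Counter

GRADE_ORDER = "ABCDF"  # best to worst


def majority_grade(votes: list[str]) -> str:
    """Return the most common grade; ties go to the lower (stricter) grade."""
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    top = max(counts.values())
    tied = [grade for grade, n in counts.items() if n == top]
    return max(tied, key=GRADE_ORDER.index)


print(majority_grade(["B", "B", "C"]))  # B
print(majority_grade(["A", "C"]))       # C (tie, stricter grade wins)
```

In a real committee, each vote would come from a separate grading agent's output, and parsing a letter grade out of free text would need its own handling.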

---

## Combining Grounding + Committees

The most powerful systems combine BOTH techniques:

```
Research Team Example:

Search Agent (grounded on academic papers)
     ↓
Summarization Agent (processes search results)
     ↓
Synthesis Agent (combines insights)
     ↓
Final Report
```

Each agent is:
- **Grounded** (has access to relevant knowledge)
- **Specialized** (focuses on one task)
- **Coordinated** (works with other agents)
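
The research team pipeline can be sketched with plain functions standing in for agents. Everything here is hypothetical: `PAPERS` is a made-up two-entry corpus, and the three `*_agent` functions use simple string operations where real agents would call an LLM. The sketch only shows how grounded retrieval feeds specialized steps in sequence.

```python
# Made-up mini-corpus standing in for a store of academic papers.
PAPERS = {
    "Tutoring systems": "Adaptive tutoring improved practice efficiency. Effects varied by subject.",
    "Bias in models": "Models trained on skewed data can reproduce that skew. Audits help detect it.",
}


def search_agent(topic: str) -> dict[str, str]:
    """Grounded step: return only papers whose text mentions the topic."""
    return {t: body for t, body in PAPERS.items() if topic.lower() in body.lower()}


def summarize_agent(papers: dict[str, str]) -> dict[str, str]:
    """Specialized step: keep the first sentence of each paper as its summary."""
    return {t: body.split(". ")[0].rstrip(".") + "." for t, body in papers.items()}


def synthesize_agent(topic: str, summaries: dict[str, str]) -> str:
    """Coordinated step: merge summaries into one report that cites its sources."""
    if not summaries:
        return f"No sources found for '{topic}'."
    lines = [f"- {summary} [{title}]" for title, summary in summaries.items()]
    return f"Report on '{topic}':\n" + "\n".join(lines)


def research(topic: str) -> str:
    """Run the three steps in order: search, summarize, synthesize."""
    return synthesize_agent(topic, summarize_agent(search_agent(topic)))


print(research("data"))
print(research("quantum"))
```

The second call shows the grounded behavior: with no matching source, the pipeline reports that nothing was found instead of producing a report anyway.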

## Final Reflection

1. How does grounding solve the hallucination problem?
2. When would you use RAG vs. fine-tuning a model?
3. What are the tradeoffs between single agents and committees?
4. Can you think of a real-world task that would benefit from BOTH grounding AND multiple agents?
5. How might you apply these techniques to your final project?

## Congratulations!

You've learned advanced agentic AI techniques:

### Grounding Foundation Models:
- ✓ Why LLMs hallucinate
- ✓ How RAG works
- ✓ Building document-grounded agents
- ✓ Retrieval → Augmentation → Generation pattern

### Multi-Agent Committees:
- ✓ Agent specialization and roles
- ✓ Coordinating multiple agents
- ✓ Sequential and parallel workflows
- ✓ Synthesizing diverse feedback

### What's Next?
- Experiment with different committee structures
- Build agents grounded on YOUR data
- Combine techniques for your final projects
- Explore advanced patterns (debate, voting, hierarchies)

---

## Additional Challenges (Optional)

If you want to explore further, try these:

### Challenge 1: Improve the Knowledge Base
Add more documents to the knowledge base and test different search strategies.

### Challenge 2: Add More Specialists
Create a third specialist agent (e.g., "Citation Checker" or "Originality Detector").

### Challenge 3: Parallel Evaluation
Modify the committee to run grammar and content agents in parallel using `asyncio.gather()`.

### Challenge 4: Voting System
Create multiple grading agents that vote on the final grade rather than having one coordinator.

### Challenge 5: Your Own Committee
Design a multi-agent system for a task relevant to your final project!