<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/116_TxtSummarizerAgent_Claude_02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Claude Code Starter Notebook

This notebook is a clean template for working with **Claude** (Anthropic's models) in Colab.

It supports:
- Loading your API key from a `.env` file
- A helper function `chat_with_claude` for sending prompts and rendering replies
- A simple **conversation manager** to keep history across multiple turns
- Running shell commands via `!` or `%%bash`


## 1. Install dependencies

In [2]:
!pip -q install anthropic python-dotenv rich openai

## 2. Load API key

In [3]:
import os
import re
import time
import inspect
import textwrap
from dataclasses import dataclass
from typing import Callable, Optional

# External Libraries
from dotenv import load_dotenv
from openai import OpenAI

# Load secrets from .env — avoid hardcoding API keys!
load_dotenv("/content/API_KEYS.env")
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    raise RuntimeError("OPENAI_API_KEY not found. Please check your .env file.")

# Initialize OpenAI client (given its own name so the Anthropic client created later does not overwrite it)
openai_client = OpenAI(api_key=api_key)

anthropic_key = os.getenv("ANTHROPIC_API_KEY")
if not anthropic_key:
    raise RuntimeError("Missing ANTHROPIC_API_KEY in /content/API_KEYS.env")

print("✅ OpenAI & Anthropic keys loaded")

✅ OpenAI & Anthropic keys loaded
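
The two key checks above follow the same pattern, so they could be folded into one helper. A minimal sketch, assuming a hypothetical `require_env` function (it is not part of `python-dotenv` or of this notebook) and a placeholder variable name used only for the demo:

```python
import os

def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty.

    `require_env` is a hypothetical helper for illustration.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} not found. Please check your .env file.")
    return value

# Demo with a placeholder value; real keys would come from load_dotenv(...)
os.environ["DEMO_API_KEY"] = "placeholder-value"
demo_key = require_env("DEMO_API_KEY")
```

With a helper like this, each additional key needs one line instead of a separate `if not ...: raise` block.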


## 3. Import libraries and set up client

In [4]:
from anthropic import Anthropic, APIError
from rich.console import Console
from rich.markdown import Markdown

console = Console()
client = Anthropic(api_key=anthropic_key)

# Default to Claude 3.5 Haiku for speed & low cost
MODEL_NAME = os.environ.get("CLAUDE_MODEL", "claude-3-5-haiku-latest")

## 4. Conversation manager for multi-turn chats

In [5]:
conversation = []
console = Console()

def smart_print_markdown(output: str, width: int = 100):
    """
    Wrap plain text, preserve fenced code blocks.
    """
    in_code = False
    para_buf = []

    def flush_paragraph():
        if para_buf:
            text = " ".join(para_buf)
            print(textwrap.fill(text, width=width, replace_whitespace=False))
            print()
            para_buf.clear()

    for line in output.splitlines():
        fence = line.strip().startswith("```")
        if fence:
            # Finish any pending wrapped paragraph before toggling code
            flush_paragraph()
            print(line)
            in_code = not in_code
            continue

        if in_code:
            # Inside code block -> print verbatim
            print(line)
        else:
            # Outside code block -> buffer/wrap paragraphs
            if line.strip() == "":
                flush_paragraph()
            else:
                para_buf.append(line)

    flush_paragraph()

def chat_with_claude(
    prompt: str,
    system: str = "You are a helpful coding assistant.",
    render: str = "markdown",      # 'markdown' | 'wrapped' | 'none'
    return_text: bool = False,
    wrap_width: int = 100,
) -> str | None:
    """
    Send a prompt with conversation memory.
    - render='markdown'  -> pretty Markdown rendering (code blocks look great)
    - render='wrapped'   -> wrap only plain text, preserve code fences
    - render='none'      -> print nothing (use return_text=True if you need the string)
    """
    if not anthropic_key:
        raise RuntimeError("Missing ANTHROPIC_API_KEY.")

    conversation.append({"role": "user", "content": prompt})

    try:
        msg = client.messages.create(
            model=MODEL_NAME,
            max_tokens=3000,
            temperature=0.2,
            system=system,
            messages=conversation,
        )
        parts = [b.text for b in msg.content if getattr(b, "type", None) == "text"]
        output = "\n\n".join(parts).strip() or "(No text)"

        if render == "markdown":
            console.print(Markdown(output))
        elif render == "wrapped":
            smart_print_markdown(output, width=wrap_width)
        # render == 'none' -> no printing

        conversation.append({"role": "assistant", "content": output})
        return output if return_text else None

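    except APIError as e:
        # Roll back the unanswered user turn so a retry does not send two user messages in a row
        conversation.pop()
        print("Anthropic API error:", e)
        raise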

# Optional helpers
def reset_conversation():
    conversation.clear()

def last_reply() -> str | None:
    for m in reversed(conversation):
        if m["role"] == "assistant":
            return m["content"]
    return None
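
Because `conversation` grows with every turn, long sessions resend an ever larger history. One possible way to cap it is sketched below. `trim_history` and `max_turns` are hypothetical names (not part of the Anthropic SDK), and the sketch assumes the list strictly alternates user and assistant messages.

```python
from typing import Dict, List

def trim_history(messages: List[Dict[str, str]], max_turns: int = 10) -> List[Dict[str, str]]:
    """Keep only the most recent `max_turns` user/assistant pairs.

    Hypothetical helper: assumes messages alternate user, assistant, user, ...
    """
    if max_turns <= 0:
        return []
    trimmed = messages[-2 * max_turns:]
    # Drop a leading assistant message so the history still starts with a user turn
    if trimmed and trimmed[0]["role"] == "assistant":
        trimmed = trimmed[1:]
    return trimmed

demo_history = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"},
]
recent = trim_history(demo_history, max_turns=1)
```

The trade-off is that the model loses any context dropped from the front of the list, so a cap like this suits sessions where only recent turns matter.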


## Design Scaffold

In [17]:
# Read a file from Colab
with open('/content/_Agent_03_Recipe.txt', 'r') as f:
    doc_content = f.read()

prompt='''
OK, this was a valuable test. I think using OpenAI for these first two steps makes sense. We may want to give
Claude a second chance in later steps. For now, let's work on fleshing out the scaffold of the agent now that we have
our first two agent steps: Goal and Step design. We can begin creating the tools and formalizing the steps without
writing the finished code. Let's refer to the Recipe to make sure we are on track.
'''
# chat_with_claude(prompt)
chat_with_claude(f"{prompt} {doc_content}")

In [18]:
prompt='''
Great, can you set up our agent scaffold next so we can review it and make sure it includes all the tools and functions we will need?
This will basically be our dress rehearsal for the final agent implementation, where we make sure we have all the components
in place before we start writing the code.
'''
# chat_with_claude(prompt)
chat_with_claude(prompt)

## Script I

In [19]:
agent_script = r'''
from typing import List, Dict, Any, Optional
from enum import Enum
from dataclasses import dataclass, field
import time

# 1. Define Purpose
GOAL = """
Extract and synthesize key concepts from text documents,
creating high-quality, educational summaries that enhance reader understanding.

Constraints:
- Maintain academic integrity
- Preserve original context
- Generate concise, clear summaries
"""

# 2. Tool Categories
class ToolCategory(Enum):
    PREPROCESSING = "preprocessing"
    ANALYSIS = "analysis"
    SYNTHESIS = "synthesis"
    VALIDATION = "validation"
    OUTPUT = "output"

# 3. Tool Dependencies and Context
@dataclass
class DocumentProcessingContext:
    # Shared context for document processing
    input_document: str = ""
    document_metadata: Dict[str, Any] = field(default_factory=dict)
    processing_config: Dict[str, Any] = field(default_factory=lambda: {
        "max_summary_length": 500,
        "complexity_level": "intermediate",
        "language": "english"
    })
    extracted_concepts: List[Dict[str, Any]] = field(default_factory=list)
    summary_drafts: List[str] = field(default_factory=list)
    validation_results: List[Dict[str, Any]] = field(default_factory=list)

# 4. Tool Definitions
@dataclass
class Tool:
    name: str
    category: ToolCategory
    description: str
    required_dependencies: List[str]

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        raise NotImplementedError("Subclasses must implement execution")

# Specific Tool Implementations (Scaffold)
class TextPreprocessingTool(Tool):
    def __init__(self):
        super().__init__(
            name="text_preprocessor",
            category=ToolCategory.PREPROCESSING,
            description="Clean and normalize input text",
            required_dependencies=["text_cleaning_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for text preprocessing logic
        return {
            "status": "success",
            "processed_text": context.input_document.lower().strip(),
            "preprocessing_steps": [
                "lowercase conversion",
                "whitespace removal",
                "special character handling"
            ]
        }

class ConceptExtractionTool(Tool):
    def __init__(self):
        super().__init__(
            name="concept_extractor",
            category=ToolCategory.ANALYSIS,
            description="Extract key concepts and their relationships",
            required_dependencies=["nlp_library", "concept_mapping_tool"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for concept extraction logic
        return {
            "status": "success",
            "extracted_concepts": [
                {
                    "term": "key concept",
                    "definition": "brief explanation",
                    "importance_score": 0.8
                }
            ],
            "concept_relationships": []
        }

class SummarizationTool(Tool):
    def __init__(self):
        super().__init__(
            name="summarization_tool",
            category=ToolCategory.SYNTHESIS,
            description="Generate educational summary from extracted concepts",
            required_dependencies=["language_model", "summary_generation_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for summary generation logic
        return {
            "status": "success",
            "summary_draft": "Synthesized summary based on extracted concepts",
            "summary_metadata": {
                "length": 250,
                "complexity_level": context.processing_config.get("complexity_level")
            }
        }

class ValidationTool(Tool):
    def __init__(self):
        super().__init__(
            name="summary_validator",
            category=ToolCategory.VALIDATION,
            description="Validate generated summary for quality and accuracy",
            required_dependencies=["validation_library", "fact_checking_service"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for summary validation logic
        return {
            "status": "success",
            "validation_results": [
                {
                    "aspect": "factual_accuracy",
                    "score": 0.9,
                    "recommendations": []
                },
                {
                    "aspect": "readability",
                    "score": 0.85,
                    "recommendations": []
                }
            ]
        }

class OutputGenerationTool(Tool):
    def __init__(self):
        super().__init__(
            name="output_generator",
            category=ToolCategory.OUTPUT,
            description="Generate final formatted output",
            required_dependencies=["formatting_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for final output generation
        return {
            "status": "success",
            "output_formats": ["markdown", "html", "plain_text"],
            "final_output": "# Summary\n\nDetailed educational summary..."
        }

# 5. Tool Registry
class ToolRegistry:
    def __init__(self):
        self.tools: Dict[str, Tool] = {}

    def register_tool(self, tool: Tool):
        self.tools[tool.name] = tool

    def get_tool(self, name: str) -> Tool:
        return self.tools.get(name)

    def get_tools_by_category(self, category: ToolCategory) -> List[Tool]:
        return [tool for tool in self.tools.values() if tool.category == category]

# 6. Agent Capabilities
class Capability:
    def pre_process(self, context: DocumentProcessingContext):
        pass

    def post_process(self, context: DocumentProcessingContext):
        pass

class QualityAssuranceCapability(Capability):
    def post_process(self, context: DocumentProcessingContext):
        # Additional quality checks
        pass

class ProgressTrackingCapability(Capability):
    def pre_process(self, context: DocumentProcessingContext):
        context.document_metadata['start_time'] = time.time()

    def post_process(self, context: DocumentProcessingContext):
        context.document_metadata['end_time'] = time.time()
        context.document_metadata['total_processing_time'] = (
            context.document_metadata['end_time'] -
            context.document_metadata['start_time']
        )

# 7. Agent Framework
class DocumentProcessingAgent:
    def __init__(
        self,
        tool_registry: ToolRegistry,
        capabilities: Optional[List[Capability]] = None
    ):
        self.tool_registry = tool_registry
        self.capabilities = capabilities or []

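    def process_document(self, document: str, config: Optional[Dict[str, Any]] = None) -> DocumentProcessingContext:
        # Create processing context and apply any caller-supplied config overrides
        context = DocumentProcessingContext(input_document=document)
        if config:
            context.processing_config.update(config)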

        # Apply pre-processing capabilities
        for capability in self.capabilities:
            capability.pre_process(context)

        # Define processing pipeline
        pipeline_order = [
            "text_preprocessor",
            "concept_extractor",
            "summarization_tool",
            "summary_validator",
            "output_generator"
        ]

        # Execute tools in sequence
        for tool_name in pipeline_order:
            tool = self.tool_registry.get_tool(tool_name)
            result = tool.execute(context)
            # Handle tool execution results

        # Apply post-processing capabilities
        for capability in self.capabilities:
            capability.post_process(context)

        return context

# 8. Initialization and Wiring
def setup_document_processing_agent() -> DocumentProcessingAgent:
    # Create tool registry
    registry = ToolRegistry()

    # Register tools
    registry.register_tool(TextPreprocessingTool())
    registry.register_tool(ConceptExtractionTool())
    registry.register_tool(SummarizationTool())
    registry.register_tool(ValidationTool())
    registry.register_tool(OutputGenerationTool())

    # Create capabilities
    capabilities = [
        QualityAssuranceCapability(),
        ProgressTrackingCapability()
    ]

    # Create and return agent
    return DocumentProcessingAgent(registry, capabilities)

# Example Usage
def main():
    agent = setup_document_processing_agent()
    sample_document = """Your sample text goes here..."""

    result = agent.process_document(sample_document)
    print(result.summary_drafts)
    print(result.document_metadata)

if __name__ == "__main__":
    main()


'''

with open("agent_script.py", "w") as file:
    file.write(agent_script)

print("Script successfully written to agent_script.py")

Script successfully written to agent_script.py
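
In Script I the pipeline loop discards each tool's return value, so `summary_drafts` and `extracted_concepts` stay empty. One possible way to fill the `# Handle tool execution results` placeholder is sketched below. `MiniContext` and `merge_result` are hypothetical stand-ins, and the key names mirror the placeholder dictionaries returned by the scaffold's tools.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class MiniContext:
    """Hypothetical, trimmed-down stand-in for DocumentProcessingContext."""
    extracted_concepts: List[Dict[str, Any]] = field(default_factory=list)
    summary_drafts: List[str] = field(default_factory=list)
    validation_results: List[Dict[str, Any]] = field(default_factory=list)

def merge_result(context: MiniContext, result: Dict[str, Any]) -> None:
    """Copy known keys from a tool result into the shared context."""
    if result.get("status") != "success":
        raise RuntimeError(f"Tool failed: {result.get('error_message', 'unknown error')}")
    context.extracted_concepts.extend(result.get("extracted_concepts", []))
    context.validation_results.extend(result.get("validation_results", []))
    if "summary_draft" in result:
        context.summary_drafts.append(result["summary_draft"])

ctx = MiniContext()
merge_result(ctx, {"status": "success", "extracted_concepts": [{"term": "key concept"}]})
merge_result(ctx, {"status": "success", "summary_draft": "Synthesized summary"})
```

In the agent, a call such as `merge_result(context, result)` would sit directly after `tool.execute(context)`. A tool that already writes to the context itself should not also be merged this way, or its output would be stored twice.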


In [20]:
prompt='''
OK, now that we have the code, let's test it out. I saved everything to a script called
agent_script.py ("/content/agent_script.py"). I want to make the code portable so it
can be used in any notebook. What do you think of that idea? What is the best way to work
with the script and save it for future use?
'''
chat_with_claude(prompt)

In [22]:
prompt='''
 Can you help me understand all the options you provided? I am a data scientist and not a
 software developer, so a lot of this is new to me, but I want to learn. Part of why I want
 to use Claude Code is the opportunity to build my skills, and this is exactly what I am looking
 for. You provided a number of options. Are those each standalone options, or do I need to implement
 all of them together?

 Maybe let's start with the last bit of code. Can we run the agent with just this code:

 import sys
 sys.path.append('/content/agent_script.py')  # Add script directory to path

 from agent_script import initialize_agent, process_document

 # Initialize agent
 agent = initialize_agent()

 # Process multiple documents
 path = '/content/files'
 documents = [doc1, doc2, doc3]
 results = [process_document(agent, doc) for doc in documents]
 '''
chat_with_claude(prompt)
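
One detail in the snippet quoted in this prompt is worth checking: `sys.path` entries are directories (or zip archives), so appending the path of the `.py` file itself does not make the module importable. Note also that the `agent_script.py` written above defines `setup_document_processing_agent`, not `initialize_agent`. A self-contained sketch of the import mechanics, using a throwaway module in a temporary directory (`demo_agent_module` and `make_agent` are hypothetical names):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a tiny module into a temporary directory
module_dir = Path(tempfile.mkdtemp())
(module_dir / "demo_agent_module.py").write_text(
    "def make_agent():\n"
    "    return 'agent ready'\n"
)

# Add the DIRECTORY that contains the file, not the file path itself
sys.path.append(str(module_dir))
importlib.invalidate_caches()

demo_module = importlib.import_module("demo_agent_module")
status = demo_module.make_agent()
```

For this notebook the equivalent would be appending `/content` and importing from `agent_script`.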

In [23]:
# Read a file from Colab
with open('/content/agent_script.py', 'r') as f:
    doc_content = f.read()

prompt='''
OK, a couple of questions. First, do we need to refactor the script in order to use it, or can we
run it as is?
'''
# chat_with_claude(prompt)
chat_with_claude(f"{prompt} {doc_content}")

## Script II

In [30]:
agent_script = r'''
from typing import List, Dict, Any, Optional
from enum import Enum
from dataclasses import dataclass, field
import time
import logging

# 1. Define Purpose
GOAL = """
Extract and synthesize key concepts from text documents,
creating high-quality, educational summaries that enhance reader understanding.

Constraints:
- Maintain academic integrity
- Preserve original context
- Generate concise, clear summaries
"""

# 2. Tool Categories
class ToolCategory(Enum):
    PREPROCESSING = "preprocessing"
    ANALYSIS = "analysis"
    SYNTHESIS = "synthesis"
    VALIDATION = "validation"
    OUTPUT = "output"

# 3. Tool Dependencies and Context
@dataclass
class DocumentProcessingContext:
    # Shared context for document processing
    input_document: str = ""
    document_metadata: Dict[str, Any] = field(default_factory=dict)
    processing_config: Dict[str, Any] = field(default_factory=lambda: {
        "max_summary_length": 500,
        "complexity_level": "intermediate",
        "language": "english"
    })
    extracted_concepts: List[Dict[str, Any]] = field(default_factory=list)
    summary_drafts: List[str] = field(default_factory=list)
    validation_results: List[Dict[str, Any]] = field(default_factory=list)

# 4. Tool Definitions
@dataclass
class Tool:
    name: str
    category: ToolCategory
    description: str
    required_dependencies: List[str]

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        raise NotImplementedError("Subclasses must implement execution")

# Specific Tool Implementations (Scaffold)
class TextPreprocessingTool(Tool):
    def __init__(self):
        super().__init__(
            name="text_preprocessor",
            category=ToolCategory.PREPROCESSING,
            description="Clean and normalize input text",
            required_dependencies=["text_cleaning_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for text preprocessing logic
        return {
            "status": "success",
            "processed_text": context.input_document.lower().strip(),
            "preprocessing_steps": [
                "lowercase conversion",
                "whitespace removal",
                "special character handling"
            ]
        }

class ConceptExtractionTool(Tool):
    def __init__(self):
        super().__init__(
            name="concept_extractor",
            category=ToolCategory.ANALYSIS,
            description="Extract key concepts and their relationships",
            required_dependencies=["nlp_library", "concept_mapping_tool"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for concept extraction logic
        return {
            "status": "success",
            "extracted_concepts": [
                {
                    "term": "key concept",
                    "definition": "brief explanation",
                    "importance_score": 0.8
                }
            ],
            "concept_relationships": []
        }

class SummarizationTool(Tool):
    def __init__(self):
        # Tool's dataclass-generated __init__ has four required fields,
        # so the subclass supplies them to allow SummarizationTool() with no arguments
        super().__init__(
            name="summarization_tool",
            category=ToolCategory.SYNTHESIS,
            description="Generate educational summary from extracted concepts",
            required_dependencies=["language_model", "summary_generation_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # More meaningful placeholder summary generation
        summary = f"Summary of document (length: {len(context.input_document)} characters):\n\n"
        summary += "Key Points:\n"
        summary += "1. Document provides insights into tool design best practices\n"
        summary += "2. Focuses on creating modular and flexible agent systems\n"
        summary += "3. Emphasizes the importance of separation of concerns"

        context.summary_drafts.append(summary)

        return {
            "status": "success",
            "summary_draft": summary,
            "summary_metadata": {
                "length": len(summary),
                "complexity_level": context.processing_config.get("complexity_level")
            }
        }

class ValidationTool(Tool):
    def __init__(self):
        super().__init__(
            name="summary_validator",
            category=ToolCategory.VALIDATION,
            description="Validate generated summary for quality and accuracy",
            required_dependencies=["validation_library", "fact_checking_service"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for summary validation logic
        return {
            "status": "success",
            "validation_results": [
                {
                    "aspect": "factual_accuracy",
                    "score": 0.9,
                    "recommendations": []
                },
                {
                    "aspect": "readability",
                    "score": 0.85,
                    "recommendations": []
                }
            ]
        }

class OutputGenerationTool(Tool):
    def __init__(self):
        super().__init__(
            name="output_generator",
            category=ToolCategory.OUTPUT,
            description="Generate final formatted output",
            required_dependencies=["formatting_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for final output generation
        return {
            "status": "success",
            "output_formats": ["markdown", "html", "plain_text"],
            "final_output": "# Summary\n\nDetailed educational summary..."
        }

# 5. Tool Registry
class ToolRegistry:
    def __init__(self):
        self.tools: Dict[str, Tool] = {}

    def register_tool(self, tool: Tool):
        self.tools[tool.name] = tool

    def get_tool(self, name: str) -> Tool:
        return self.tools.get(name)

    def get_tools_by_category(self, category: ToolCategory) -> List[Tool]:
        return [tool for tool in self.tools.values() if tool.category == category]

# 6. Agent Capabilities
class Capability:
    def pre_process(self, context: DocumentProcessingContext):
        pass

    def post_process(self, context: DocumentProcessingContext):
        pass

class QualityAssuranceCapability(Capability):
    def post_process(self, context: DocumentProcessingContext):
        # Additional quality checks
        pass

class ProgressTrackingCapability(Capability):
    def pre_process(self, context: DocumentProcessingContext):
        context.document_metadata['start_time'] = time.time()

    def post_process(self, context: DocumentProcessingContext):
        context.document_metadata['end_time'] = time.time()
        context.document_metadata['total_processing_time'] = (
            context.document_metadata['end_time'] -
            context.document_metadata['start_time']
        )

# 7. Agent Framework
class DocumentProcessingAgent:
    def __init__(
        self,
        tool_registry: ToolRegistry,
        capabilities: Optional[List[Capability]] = None
    ):
        self.tool_registry = tool_registry
        self.capabilities = capabilities or []

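    def process_document(self, document: str, config: Optional[Dict[str, Any]] = None) -> DocumentProcessingContext:
        # Create processing context and apply any caller-supplied config overrides
        context = DocumentProcessingContext(input_document=document)
        if config:
            context.processing_config.update(config)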

        # Apply pre-processing capabilities
        for capability in self.capabilities:
            capability.pre_process(context)

        # Define processing pipeline
        pipeline_order = [
            "text_preprocessor",
            "concept_extractor",
            "summarization_tool",
            "summary_validator",
            "output_generator"
        ]

        # Execute tools in sequence
        for tool_name in pipeline_order:
            tool = self.tool_registry.get_tool(tool_name)
            result = tool.execute(context)
            # Handle tool execution results

        # Apply post-processing capabilities
        for capability in self.capabilities:
            capability.post_process(context)

        return context

# 8. Initialization and Wiring
def setup_document_processing_agent() -> DocumentProcessingAgent:
    # Create tool registry
    registry = ToolRegistry()

    # Register tools
    registry.register_tool(TextPreprocessingTool())
    registry.register_tool(ConceptExtractionTool())
    registry.register_tool(SummarizationTool())
    registry.register_tool(ValidationTool())
    registry.register_tool(OutputGenerationTool())

    # Create capabilities
    capabilities = [
        QualityAssuranceCapability(),
        ProgressTrackingCapability()
    ]

    # Create and return agent
    return DocumentProcessingAgent(registry, capabilities)

# ============ Logging =========== #

# Configure logging at the module level
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("/content/agent_processing.log"),  # Log to file
        logging.StreamHandler()  # Also print to console
    ]
)

# Create a logger for the module
logger = logging.getLogger(__name__)

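# Logging-enabled version of the preprocessing tool (replaces the earlier definition)
class TextPreprocessingTool(Tool):
    def __init__(self):
        super().__init__(
            name="text_preprocessor",
            category=ToolCategory.PREPROCESSING,
            description="Clean and normalize input text",
            required_dependencies=["text_cleaning_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]: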
        logger.info(f"Starting text preprocessing for document of length {len(context.input_document)}")
        try:
            # Preprocessing logic
            processed_text = context.input_document.lower().strip()
            logger.info("Text preprocessing completed successfully")
            return {
                "status": "success",
                "processed_text": processed_text,
                "preprocessing_steps": ["lowercase", "strip"]
            }
        except Exception as e:
            logger.error(f"Error in text preprocessing: {e}")
            return {
                "status": "error",
                "error_message": str(e)
            }


# Add this at the end of the script or in a separate main block
def main():
    # Create the agent
    agent = setup_document_processing_agent()

    # Sample document for testing
    sample_document = """
    Machine learning is a subset of artificial intelligence that focuses on the use of data
    and algorithms to imitate the way that humans learn, gradually improving its accuracy.
    Unlike traditional programming, machine learning allows computers to learn from and make
    predictions or decisions based on data.
    """

    # Process the document
    result = agent.process_document(sample_document)

    # Print out results
    print("Summary Drafts:", result.summary_drafts)
    print("\nDocument Metadata:")
    for key, value in result.document_metadata.items():
        print(f"{key}: {value}")

    print("\nExtracted Concepts:")
    print(result.extracted_concepts)

# Ensure the main block runs
if __name__ == "__main__":
    main()


'''

with open("agent_script.py", "w") as file:
    file.write(agent_script)

print("Script successfully written to agent_script.py")

Script successfully written to agent_script.py
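
A point to keep in mind when subclassing `Tool`: it is a dataclass with four required fields, so a subclass that does not define its own `__init__` inherits the generated one and cannot be instantiated with no arguments. A minimal sketch with hypothetical stand-in classes (`BaseTool`, `NoInitTool`, `WithInitTool`):

```python
from dataclasses import dataclass

@dataclass
class BaseTool:
    """Hypothetical stand-in for Tool: every field is required."""
    name: str
    description: str

class NoInitTool(BaseTool):
    """Inherits the dataclass-generated __init__(name, description)."""

class WithInitTool(BaseTool):
    def __init__(self):
        # Supply the required fields so the class can be built with no arguments
        super().__init__(name="with_init", description="Fills in its own fields")

try:
    NoInitTool()
    no_arg_call_failed = False
except TypeError:
    no_arg_call_failed = True

tool = WithInitTool()
```

Since `setup_document_processing_agent` calls every tool class with no arguments, each tool class it registers needs an `__init__` like the one in `WithInitTool`.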


## Run Script II

In [25]:

# Import the entire script
import sys
sys.path.append('/content')  # Adjust path as needed

from agent_script import setup_document_processing_agent

# Create agent
agent = setup_document_processing_agent()

# Process a document
sample_document ="/files/_Agent_04_Tool_Design_Best_Practices.txt"
result = agent.process_document(sample_document)

In [26]:
prompt = '''
OK, I added the def main function to the script and saved it.
Then I added the file path to the code like this:

# Import the entire script
import sys
sys.path.append('/content')  # Adjust path as needed

from agent_script import setup_document_processing_agent

# Create agent
agent = setup_document_processing_agent()

# Process a document
sample_document ="/files/_Agent_04_Tool_Design_Best_Practices.txt"
result = agent.process_document(sample_document)

Do I need to do anything else? I don't get any output. Are we saving the summary to a file?
'''
chat_with_claude(prompt)


In [28]:
prompt='''
Does logging go in the script or in the code that runs the script?
'''
chat_with_claude(prompt)

## Run Script II

In [34]:
# Import the entire script
import sys
import os
import logging

sys.path.append('/content')  # Adjust path as needed

from agent_script import setup_document_processing_agent

# Create agent
agent = setup_document_processing_agent()

# Configure additional logging for the running script
logging.basicConfig(
    level=logging.DEBUG,  # More detailed logging
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("/content/script_execution.log"),
        logging.StreamHandler()
    ]
)

# Create a logger for the script
script_logger = logging.getLogger('document_processing_script')

try:
    # Import and setup
    sys.path.append('/content')
    from agent_script import setup_document_processing_agent

    script_logger.info("Starting document processing")

    # Create agent
    agent = setup_document_processing_agent()
    script_logger.info("Agent initialized successfully")

    # Read and process document
    file_path = "files/_Agent_04_Tool_Design_Best_Practices.txt"

    script_logger.info(f"Attempting to read document from {file_path}")
    with open(file_path, 'r', encoding='utf-8') as file:
        sample_document = file.read()

    script_logger.info(f"Document read successfully. Length: {len(sample_document)} characters")

    # Process the document
    result = agent.process_document(sample_document)

    script_logger.info("Document processing completed")

    # Save summary
    output_path = "/content/summary.txt"
    with open(output_path, 'w', encoding='utf-8') as output_file:
        output_file.write(result.summary_drafts[0] if result.summary_drafts else "No summary generated")

    script_logger.info(f"Summary saved to {output_path}")

except Exception as e:
    script_logger.error(f"An error occurred during document processing: {e}", exc_info=True)


In [35]:
prompt = '''
I updated the script and then ran this code, but nothing is returned. Why?

# Import the entire script
import sys
import os
import logging

sys.path.append('/content')  # Adjust path as needed

from agent_script import setup_document_processing_agent

# Create agent
agent = setup_document_processing_agent()

# Configure additional logging for the running script
logging.basicConfig(
    level=logging.DEBUG,  # More detailed logging
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("/content/script_execution.log"),
        logging.StreamHandler()
    ]
)

# Create a logger for the script
script_logger = logging.getLogger('document_processing_script')

try:
    # Import and setup
    sys.path.append('/content')
    from agent_script import setup_document_processing_agent

    script_logger.info("Starting document processing")

    # Create agent
    agent = setup_document_processing_agent()
    script_logger.info("Agent initialized successfully")

    # Read and process document
    file_path = "files/_Agent_04_Tool_Design_Best_Practices.txt"

    script_logger.info(f"Attempting to read document from {file_path}")
    with open(file_path, 'r', encoding='utf-8') as file:
        sample_document = file.read()

    script_logger.info(f"Document read successfully. Length: {len(sample_document)} characters")

    # Process the document
    result = agent.process_document(sample_document)

    script_logger.info("Document processing completed")

    # Save summary
    output_path = "/content/summary.txt"
    with open(output_path, 'w', encoding='utf-8') as output_file:
        output_file.write(result.summary_drafts[0] if result.summary_drafts else "No summary generated")

    script_logger.info(f"Summary saved to {output_path}")

except Exception as e:
    script_logger.error(f"An error occurred during document processing: {e}", exc_info=True)

'''
chat_with_claude(prompt)
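
A possible reason the `script_logger.info(...)` calls in the prompt above show nothing: `logging.basicConfig` does nothing when the root logger already has handlers, which is common in notebook kernels and also happens if an imported module attaches its own handlers first. The sketch below assumes Python 3.8+ (for `force=True`) and uses an in-memory stream as a stand-in for the console and log file; it is an illustration, not the notebook's actual logging setup.

```python
import io
import logging

# Simulate a kernel whose root logger was configured before our code runs.
root = logging.getLogger()
root.addHandler(logging.NullHandler())
root.setLevel(logging.WARNING)

# Without force=True this call would be ignored because the root logger
# already has a handler. force=True removes the existing root handlers
# first and then applies the new configuration.
log_stream = io.StringIO()  # stand-in for a console or file target
logging.basicConfig(
    level=logging.DEBUG,
    format="%(name)s - %(levelname)s - %(message)s",
    handlers=[logging.StreamHandler(log_stream)],
    force=True,
)

script_logger = logging.getLogger("document_processing_script")
script_logger.info("Starting document processing")

captured = log_stream.getvalue()
print(captured)
```

If the INFO message appears once `force=True` is set, the earlier silence was most likely a configuration issue rather than a failure in the agent itself.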

## Script III

In [66]:
agent_script = r'''
from typing import List, Dict, Any, Optional
from enum import Enum
from dataclasses import dataclass, field
import time
import logging
import re

# 1. Define Purpose
GOAL = """
Extract and synthesize key concepts from text documents,
creating high-quality, educational summaries that enhance reader understanding.

Constraints:
- Maintain academic integrity
- Preserve original context
- Generate concise, clear summaries
"""

# 2. Tool Categories
class ToolCategory(Enum):
    PREPROCESSING = "preprocessing"
    ANALYSIS = "analysis"
    SYNTHESIS = "synthesis"
    VALIDATION = "validation"
    OUTPUT = "output"

# 3. Tool Dependencies and Context
@dataclass
class DocumentProcessingContext:
    # Shared context for document processing
    input_document: str = ""
    document_metadata: Dict[str, Any] = field(default_factory=dict)
    processing_config: Dict[str, Any] = field(default_factory=lambda: {
        "max_summary_length": 500,
        "complexity_level": "intermediate",
        "language": "english"
    })
    extracted_concepts: List[Dict[str, Any]] = field(default_factory=list)
    summary_drafts: List[str] = field(default_factory=list)
    validation_results: List[Dict[str, Any]] = field(default_factory=list)

# 4. Tool Definitions
@dataclass
class Tool:
    name: str
    category: ToolCategory
    description: str
    required_dependencies: List[str]

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        raise NotImplementedError("Subclasses must implement execution")

# Specific Tool Implementations (Scaffold)
class TextPreprocessingTool(Tool):
    def __init__(self):
        super().__init__(
            name="text_preprocessor",
            category=ToolCategory.PREPROCESSING,
            description="Clean and normalize input text",
            required_dependencies=["text_cleaning_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for text preprocessing logic
        return {
            "status": "success",
            "processed_text": context.input_document.lower().strip(),
            "preprocessing_steps": [
                "lowercase conversion",
                "whitespace removal",
                "special character handling"
            ]
        }

# class ConceptExtractionTool(Tool):
#     def __init__(self):
#         super().__init__(
#             name="concept_extractor",
#             category=ToolCategory.ANALYSIS,
#             description="Extract key concepts and their relationships",
#             required_dependencies=["nlp_library", "concept_mapping_tool"]
#         )

#     def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
#         # Placeholder for concept extraction logic
#         return {
#             "status": "success",
#             "extracted_concepts": [
#                 {
#                     "term": "key concept",
#                     "definition": "brief explanation",
#                     "importance_score": 0.8
#                 }
#             ],
#             "concept_relationships": []
#         }

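class ConceptExtractionTool(Tool):
    def __init__(self):
        # Tool is a dataclass, so its generated __init__ requires these fields.
        # Supplying them here lets the registry build the tool with no arguments
        # and makes the name match the entry in pipeline_order.
        super().__init__(
            name="concept_extractor",
            category=ToolCategory.ANALYSIS,
            description="Extract key concepts and their relationships",
            required_dependencies=["nlp_library", "concept_mapping_tool"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]: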
        try:
            # More sophisticated concept extraction
            if not context.input_document:
                logger.warning("Input document is empty")
                return {
                    "status": "error",
                    "error_message": "Input document is empty"
                }

            # Basic keyword extraction (very simple implementation)
            words = context.input_document.lower().split()
            word_freq = {}

            for word in words:
                if len(word) > 3:  # Filter out very short words
                    word_freq[word] = word_freq.get(word, 0) + 1

            # Sort words by frequency
            top_concepts = sorted(word_freq.items(), key=lambda x: x[1], reverse=True)[:5]

            # Convert to concept format
            concepts = [
                {
                    "term": concept,
                    "frequency": freq,
                    "importance_score": freq / len(words)
                }
                for concept, freq in top_concepts
            ]

            # Store concepts in context
            context.extracted_concepts = concepts

            return {
                "status": "success",
                "extracted_concepts": concepts,
                "concept_count": len(concepts)
            }
        except Exception as e:
            logger.error(f"Concept extraction error: {e}")
            return {
                "status": "error",
                "error_message": str(e)
            }


# class SummarizationTool(Tool):
#     def __init__(self):
#         super().__init__(
#             name="summarization_tool",
#             category=ToolCategory.SYNTHESIS,
#             description="Generate educational summary from extracted concepts",
#             required_dependencies=["language_model", "summary_generation_library"]
#         )
#     def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
#         # More meaningful placeholder summary generation
#         summary = f"Summary of document (length: {len(context.input_document)} characters):\n\n"
#         summary += "Key Points:\n"
#         summary += "1. Document provides insights into tool design best practices\n"
#         summary += "2. Focuses on creating modular and flexible agent systems\n"
#         summary += "3. Emphasizes the importance of separation of concerns"

#         context.summary_drafts.append(summary)

#         return {
#             "status": "success",
#             "summary_draft": summary,
#             "summary_metadata": {
#                 "length": len(summary),
#                 "complexity_level": context.processing_config.get("complexity_level")
#             }
#         }


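class SummarizationTool(Tool):
    def __init__(self):
        # Tool is a dataclass, so its generated __init__ requires these fields.
        # Supplying them here lets the registry build the tool with no arguments
        # and makes the name match the entry in pipeline_order.
        super().__init__(
            name="summarization_tool",
            category=ToolCategory.SYNTHESIS,
            description="Generate educational summary from extracted concepts",
            required_dependencies=["language_model", "summary_generation_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]: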
        try:
            # Log input document details
            logger.info(f"Summarization input document length: {len(context.input_document)}")

            # More robust summary generation
            if not context.input_document:
                logger.warning("Input document is empty")
                return {
                    "status": "error",
                    "error_message": "Input document is empty"
                }

            # Improved sentence splitting
            sentences = re.split(r'(?<=[.!?])\s+', context.input_document)

            # Generate a summary
            summary = f"Summary of document (length: {len(context.input_document)} characters):\n\n"
            summary += "Key Points:\n"

            # Take first 3 non-empty sentences
            key_points = [s.strip() for s in sentences if s.strip()][:3]

            for i, sentence in enumerate(key_points, 1):
                summary += f"{i}. {sentence}\n"

            # Append summary to context
            context.summary_drafts.append(summary)

            logger.info(f"Generated summary: {summary}")

            return {
                "status": "success",
                "summary_draft": summary,
                "summary_metadata": {
                    "length": len(summary),
                    "complexity_level": context.processing_config.get("complexity_level"),
                    "sentences_used": len(key_points)
                }
            }
        except Exception as e:
            logger.error(f"Error in summarization: {e}")
            return {
                "status": "error",
                "error_message": str(e)
            }


class ValidationTool(Tool):
    def __init__(self):
        super().__init__(
            name="summary_validator",
            category=ToolCategory.VALIDATION,
            description="Validate generated summary for quality and accuracy",
            required_dependencies=["validation_library", "fact_checking_service"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for summary validation logic
        return {
            "status": "success",
            "validation_results": [
                {
                    "aspect": "factual_accuracy",
                    "score": 0.9,
                    "recommendations": []
                },
                {
                    "aspect": "readability",
                    "score": 0.85,
                    "recommendations": []
                }
            ]
        }

class OutputGenerationTool(Tool):
    def __init__(self):
        super().__init__(
            name="output_generator",
            category=ToolCategory.OUTPUT,
            description="Generate final formatted output",
            required_dependencies=["formatting_library"]
        )

    def execute(self, context: DocumentProcessingContext) -> Dict[str, Any]:
        # Placeholder for final output generation
        return {
            "status": "success",
            "output_formats": ["markdown", "html", "plain_text"],
            "final_output": "# Summary\n\nDetailed educational summary..."
        }

# 5. Tool Registry
class ToolRegistry:
    def __init__(self):
        self.tools: Dict[str, Tool] = {}

    def register_tool(self, tool: Tool):
        self.tools[tool.name] = tool

    def get_tool(self, name: str) -> Optional[Tool]:
        # Returns None when no tool is registered under this name
        return self.tools.get(name)

    def get_tools_by_category(self, category: ToolCategory) -> List[Tool]:
        return [tool for tool in self.tools.values() if tool.category == category]

# 6. Agent Capabilities
class Capability:
    def pre_process(self, context: DocumentProcessingContext):
        pass

    def post_process(self, context: DocumentProcessingContext):
        pass

class QualityAssuranceCapability(Capability):
    def post_process(self, context: DocumentProcessingContext):
        # Additional quality checks
        pass

class ProgressTrackingCapability(Capability):
    def pre_process(self, context: DocumentProcessingContext):
        context.document_metadata['start_time'] = time.time()

    def post_process(self, context: DocumentProcessingContext):
        context.document_metadata['end_time'] = time.time()
        context.document_metadata['total_processing_time'] = (
            context.document_metadata['end_time'] -
            context.document_metadata['start_time']
        )

# 7. Agent Framework
class DocumentProcessingAgent:
    def __init__(
        self,
        tool_registry: ToolRegistry,
        capabilities: Optional[List[Capability]] = None
    ):
        self.tool_registry = tool_registry
        self.capabilities = capabilities or []

    def process_document(self, document: str, config: Optional[Dict[str, Any]] = None) -> DocumentProcessingContext:
        # Create processing context
        context = DocumentProcessingContext(input_document=document)

        # Use provided config or default
        if config:
            context.processing_config.update(config)

        # Apply pre-processing capabilities
        for capability in self.capabilities:
            try:
                capability.pre_process(context)
            except Exception as e:
                logger.error(f"Error in pre-processing capability: {e}")

        # Define processing pipeline
        pipeline_order = [
            "text_preprocessor",
            "concept_extractor",
            "summarization_tool",
            "summary_validator",
            "output_generator"
        ]

        # Execute tools in sequence
        for tool_name in pipeline_order:
            tool = self.tool_registry.get_tool(tool_name)
            if tool is None:
                logger.error(f"Tool {tool_name} not found in registry")
                continue

            try:
                result = tool.execute(context)

                # Log tool results
                logger.info(f"Tool {tool_name} result: {result}")
                print(f"Tool {tool_name} result: {result}")  # Also print for console visibility
            except Exception as e:
                logger.error(f"Error executing tool {tool_name}: {e}")

        # Apply post-processing capabilities
        for capability in self.capabilities:
            try:
                capability.post_process(context)
            except Exception as e:
                logger.error(f"Error in post-processing capability: {e}")

        # Ensure summary_drafts is not empty
        if not context.summary_drafts:
            context.summary_drafts.append("No summary generated")

        return context

# 8. Initialization and Wiring
def setup_document_processing_agent() -> DocumentProcessingAgent:
    # Create tool registry
    registry = ToolRegistry()

    # Register tools
    registry.register_tool(TextPreprocessingTool())
    registry.register_tool(ConceptExtractionTool())
    registry.register_tool(SummarizationTool())
    registry.register_tool(ValidationTool())
    registry.register_tool(OutputGenerationTool())

    # Create capabilities
    capabilities = [
        QualityAssuranceCapability(),
        ProgressTrackingCapability()
    ]

    # Create and return agent
    return DocumentProcessingAgent(registry, capabilities)


# Logging Configuration
import logging
import sys

def setup_logging():
    # Create logger
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    # Create handlers
    c_handler = logging.StreamHandler(sys.stdout)  # Console handler
    f_handler = logging.FileHandler("/content/agent_processing.log")

    # Create formatters and add it to handlers
    log_format = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    c_handler.setFormatter(log_format)
    f_handler.setFormatter(log_format)

    # Add handlers to the logger
    logger.addHandler(c_handler)
    logger.addHandler(f_handler)

    return logger

# Call this at the start of your script
logger = setup_logging()

# Main function
def main():
    # Create the agent
    agent = setup_document_processing_agent()

    # Sample documents for testing
    sample_documents = [
        """Machine learning is a subset of artificial intelligence that focuses on the use of data
        and algorithms to imitate the way that humans learn, gradually improving its accuracy.
        Unlike traditional programming, machine learning allows computers to learn from and make
        predictions or decisions based on data.""",

        """Natural language processing (NLP) is a branch of artificial intelligence that helps
        computers understand, interpret, and manipulate human language. It bridges the communication
        gap between computers and humans by enabling machines to process and analyze large amounts
        of natural language data."""
    ]

    # Process multiple documents
    results = []
    for i, document in enumerate(sample_documents, 1):
        print(f"\n--- Processing Document {i} ---")
        result = agent.process_document(document)
        results.append(result)

        # Print out results
        print(f"Summary Drafts (Document {i}):", result.summary_drafts)
        print(f"Document Metadata (Document {i}):")
        for key, value in result.document_metadata.items():
            print(f"{key}: {value}")

        print(f"Extracted Concepts (Document {i}):")
        print(result.extracted_concepts)

    return results

# Optional: Add error handling wrapper
def run_main():
    try:
        results = main()
        return results
    except Exception as e:
        logger.error(f"Error in main execution: {e}")
        print(f"An error occurred: {e}")
        return None

# Ensure the main block runs
if __name__ == "__main__":
    run_main()

'''

with open("agent_script.py", "w") as file:
    file.write(agent_script)

print("Script successfully written to agent_script.py")

Script successfully written to agent_script.py
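
`Tool` is declared with `@dataclass`, so its generated `__init__` requires every field. A subclass that does not define its own `__init__` inherits that requirement, which matters because `setup_document_processing_agent` constructs every tool with no arguments. The sketch below uses simplified stand-in classes (`BaseTool`, `BareTool`, and `ConfiguredTool` are illustrative names, not part of the script) to show the difference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BaseTool:  # simplified stand-in for the Tool dataclass
    name: str
    required_dependencies: List[str]


class BareTool(BaseTool):
    """Defines no __init__, so the generated BaseTool.__init__ is inherited."""


class ConfiguredTool(BaseTool):
    """Supplies the base-class fields itself, so it needs no arguments."""

    def __init__(self):
        super().__init__(name="configured_tool", required_dependencies=[])


# Building the bare subclass without arguments fails with a TypeError.
try:
    BareTool()
    bare_tool_error = None
except TypeError as exc:
    bare_tool_error = str(exc)

# The configured subclass can be created the way the registry creates tools.
configured = ConfiguredTool()

print(bare_tool_error)
print(configured.name)
```

Checking each tool class for an `__init__` of this kind is a quick way to confirm that the no-argument calls in the setup function will succeed.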


In [62]:
# Read a file from Colab
with open('/content/agent_script.py', 'r') as f:
    doc_content = f.read()

prompt='''
I made the modifications. Can you review the agent_script now please?
'''
# chat_with_claude(prompt)
chat_with_claude(f"{prompt} {doc_content}")

### Run Script III

In [69]:
prompt = '''
I ran this code but the return was minimal:

 import os
 import logging

 # Configure logging to see more details
 logging.basicConfig(level=logging.DEBUG)

 # Import the agent
 from agent_script import setup_document_processing_agent

 # Create the agent
 agent = setup_document_processing_agent()

 # Process a file
 file_path = "/content/files/_Agent_04_Tool_Design_Best_Practices.txt"

 # Read the file
 with open(file_path, 'r', encoding='utf-8') as file:
     document_text = file.read()

 # Print document length for verification
 print("Document Length:", len(document_text))
 print("First 200 characters:", document_text[:200])

 # Process the document
 result = agent.process_document(document_text)

 # Detailed debugging
 print("\n--- Debugging Information ---")
 print("Summary Drafts:", result.summary_drafts)
 print("Extracted Concepts:", result.extracted_concepts)
 print("Processing Time:", result.document_metadata.get('total_processing_time'))

 # Check each tool's execution
 print("\n--- Tool Execution Logs ---")
 # You might want to add more logging in your agent_script.py to capture tool execution details

 Output:

 Document Length: 1127
First 200 characters:
## **Chapter 4: Tool Design Best Practices**

### Concept Summary

Tools are the **hands of your agent** — the way it interacts with the world. Good design makes the difference between fragile hacks

--- Debugging Information ---
Summary Drafts: []
Extracted Concepts: []
Processing Time: 5.8650970458984375e-05

--- Tool Execution Logs ---
'''
chat_with_claude(prompt)

In [68]:
import os
import logging

# Configure logging to see more details
logging.basicConfig(level=logging.DEBUG)

# Import the agent
from agent_script import setup_document_processing_agent

# Create the agent
agent = setup_document_processing_agent()

# Process a file
file_path = "/content/files/_Agent_04_Tool_Design_Best_Practices.txt"

# Read the file
with open(file_path, 'r', encoding='utf-8') as file:
    document_text = file.read()

# Print document length for verification
print("Document Length:", len(document_text))
print("First 200 characters:", document_text[:200])

# Process the document
result = agent.process_document(document_text)

# Detailed debugging
print("\n--- Debugging Information ---")
print("Summary Drafts:", result.summary_drafts)
print("Extracted Concepts:", result.extracted_concepts)
print("Processing Time:", result.document_metadata.get('total_processing_time'))

# Check each tool's execution
print("\n--- Tool Execution Logs ---")
# You might want to add more logging in your agent_script.py to capture tool execution details

Document Length: 1127
First 200 characters: 
## **Chapter 4: Tool Design Best Practices**

### Concept Summary

Tools are the **hands of your agent** — the way it interacts with the world. Good design makes the difference between fragile hacks 

--- Debugging Information ---
Summary Drafts: []
Extracted Concepts: []
Processing Time: 5.8650970458984375e-05

--- Tool Execution Logs ---
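
The output above does not match what the Script III version of `process_document` would produce: that version prints a `Tool ... result` line for each tool and falls back to `"No summary generated"` when no draft exists, yet the result here is an empty list with no tool output. One likely explanation is that the kernel is still using an older `agent_script` module cached from an earlier import, because a repeated `import` does not re-read the file. The sketch below demonstrates the effect with a hypothetical throwaway module (`demo_agent_script`, not part of the notebook) and shows `importlib.reload` picking up the rewritten file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for agent_script.py, written to a temporary directory.
module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "demo_agent_script.py"
sys.path.insert(0, str(module_dir))
sys.dont_write_bytecode = True  # demo only: avoid a stale .pyc on quick rewrites

module_path.write_text("SCRIPT_VERSION = 'first draft'\n")
importlib.invalidate_caches()
import demo_agent_script

# Rewrite the file, as the notebook does when it saves a new script version.
module_path.write_text("SCRIPT_VERSION = 'second draft, rewritten'\n")

# A plain import returns the cached module object, so the old value remains.
import demo_agent_script as cached_module
version_before_reload = cached_module.SCRIPT_VERSION

# importlib.reload re-executes the file and updates the module in place.
importlib.reload(demo_agent_script)
version_after_reload = demo_agent_script.SCRIPT_VERSION

print(version_before_reload)
print(version_after_reload)
```

In this notebook the equivalent would presumably be `import agent_script` followed by `importlib.reload(agent_script)`, then re-running the `from agent_script import ...` line before creating the agent. Restarting the runtime has the same effect.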


In [60]:
import os
import logging

# Configure logging to see more details
logging.basicConfig(level=logging.DEBUG)

# Import the agent
from agent_script import setup_document_processing_agent

# Create the agent
agent = setup_document_processing_agent()

# Process a file
file_path = "/content/files/_Agent_04_Tool_Design_Best_Practices.txt"

# Read the file
with open(file_path, 'r', encoding='utf-8') as file:
    document_text = file.read()

# Process the document
result = agent.process_document(document_text)

# Display results
print("Summary:", result.summary_drafts)
print("Processing Time:", result.document_metadata.get('total_processing_time'))
print("Extracted Concepts:", result.extracted_concepts)


Summary: []
Processing Time: 5.364418029785156e-05
Extracted Concepts: []


In [53]:
# Import specific components
from agent_script import (
    setup_document_processing_agent,
    DocumentProcessingContext
)

# Create the agent
agent = setup_document_processing_agent()

# Define your document
sample_document = """
Your text document goes here. This could be a long piece of text
that you want to process and summarize.
"""

# Process the document
result = agent.process_document(sample_document)

# Access different parts of the result
print("Summary:", result.summary_drafts)
print("Metadata:", result.document_metadata)
print("Extracted Concepts:", result.extracted_concepts)

Summary: []
Metadata: {'start_time': 1757108917.048882, 'end_time': 1757108917.0489202, 'total_processing_time': 3.814697265625e-05}
Extracted Concepts: []


In [54]:
import os

# Import the agent
from agent_script import setup_document_processing_agent

# Create the agent
agent = setup_document_processing_agent()

# Directory containing documents
document_dir = "/content/files/"

# Process all text files in the directory
results = {}
for filename in os.listdir(document_dir):
    if filename.endswith('.txt'):
        file_path = os.path.join(document_dir, filename)

        # Read the file
        with open(file_path, 'r', encoding='utf-8') as file:
            document_text = file.read()

        # Process the document
        result = agent.process_document(document_text)

        # Store results
        results[filename] = result

# Analyze results
for filename, result in results.items():
    print(f"\nFile: {filename}")
    print("Summary:", result.summary_drafts)
    print("Processing Time:", result.document_metadata.get('total_processing_time'))


File: _Agent_04_Tool_Design_Best_Practices.txt
Summary: []
Processing Time: 4.9114227294921875e-05

File: _Agent_05_GAME_Framework.txt
Summary: []
Processing Time: 8.726119995117188e-05

File: _Agent_06_Modular_Agent_Design.txt
Summary: []
Processing Time: 1.049041748046875e-05
