# Resume Tailor Agent

An intelligent agent that tailors your LaTeX resume to specific job postings while preserving formatting and maintaining accuracy.

## Features

- **LaTeX-Safe**: Preserves LaTeX formatting and syntax
- **Iterative**: Supports multiple revision rounds
- **Job-Focused**: Analyzes job postings and matches requirements
- **ATS-Optimized**: Uses keywords naturally for applicant tracking systems
- **Validation**: Checks LaTeX syntax before output
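
The notebook's validation lives in its helper module and is not shown here. As a rough illustration of what a LaTeX syntax check can look like, the sketch below tests brace balance and `\begin`/`\end` pairing. The function name `check_latex_balance` is hypothetical, and the checks are deliberately simple: they ignore edge cases such as verbatim environments.

```python
import re


def check_latex_balance(source: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # Drop comments (an unescaped % up to end of line) so they do not skew counts.
    text = re.sub(r"(?<!\\)%.*", "", source)
    # Drop escaped braces, which are literal characters rather than grouping.
    text = text.replace(r"\{", "").replace(r"\}", "")

    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                problems.append("unmatched closing brace")
                depth = 0
    if depth > 0:
        problems.append(f"{depth} unclosed opening brace(s)")

    # Environments must close in the reverse order they were opened.
    stack = []
    for kind, name in re.findall(r"\\(begin|end)\{([^}]*)\}", text):
        if kind == "begin":
            stack.append(name)
        elif not stack or stack.pop() != name:
            problems.append(f"unexpected \\end{{{name}}}")
    problems.extend(f"missing \\end{{{name}}}" for name in stack)
    return problems
```

An empty result only means these two checks passed; compiling the file remains the definitive test.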

---

## Setup

Import required libraries and configure the environment.

## API Provider Configuration

This notebook supports multiple AI providers. Configure your credentials in the `.env` file:

### Option 1: OpenAI (Recommended for getting started)
```bash
OPENAI_API_KEY=sk-your-openai-key-here
```

### Option 2: AWS Bedrock (Production-ready)
```bash
# Using long-term API key (recommended)
AWS_BEARER_TOKEN_BEDROCK=your-long-term-bedrock-key
AWS_REGION=us-east-1

# OR using standard AWS credentials
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=us-east-1
```

The notebook checks for credentials in this order and uses the first set it finds: `OPENAI_API_KEY`, then `AWS_BEARER_TOKEN_BEDROCK`, then `AWS_ACCESS_KEY_ID`.

---

In [1]:
# Core imports
import os
from pathlib import Path
from dotenv import load_dotenv

# Strands SDK
from strands import Agent, tool

# Utilities
import json
from datetime import datetime

# Load environment variables
load_dotenv()

print("✅ Imports successful!")
print(f"Python Path: {Path.cwd()}")

✅ Imports successful!
Python Path: d:\Strands-agent


## Logging & Observability

Configure logging to trace all agent operations and tool calls.

In [2]:
import logging
import json
from datetime import datetime as dt

# Custom JSON formatter for structured logs
class JsonFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
            "function": record.funcName,
            "line": record.lineno
        }
        
        # Add exception info if present
        if record.exc_info:
            log_data["exception"] = self.formatException(record.exc_info)
        
        return json.dumps(log_data)

# Create logs directory
PROJECT_ROOT = Path.cwd()
LOGS_DIR = PROJECT_ROOT / "logs"
LOGS_DIR.mkdir(exist_ok=True)

# Generate log filename with timestamp
log_filename = LOGS_DIR / f"strands_agent_{dt.now().strftime('%Y%m%d_%H%M%S')}.log"

# Configure file handler with JSON formatting
file_handler = logging.FileHandler(log_filename)
file_handler.setFormatter(JsonFormatter())
file_handler.setLevel(logging.DEBUG)

# Configure console handler with simple formatting (for notebook output)
console_handler = logging.StreamHandler()
console_handler.setFormatter(logging.Formatter(
    '%(levelname)s | %(name)s | %(message)s'
))
console_handler.setLevel(logging.WARNING)  # Only show warnings/errors in notebook

# Configure the strands logger
strands_logger = logging.getLogger("strands")
strands_logger.setLevel(logging.DEBUG)
strands_logger.addHandler(file_handler)
strands_logger.addHandler(console_handler)

# Prevent duplicate logs
strands_logger.propagate = False

print("✅ Logging configured!")
print(f"   Log file: {log_filename}")
print(f"   Console level: WARNING (errors only)")
print(f"   File level: DEBUG (all traces)")
print()
print("📊 Log includes:")
print("  • All agent operations")
print("  • Tool calls and responses")
print("  • Model interactions")
print("  • Validation results")
print("  • Error traces")

✅ Logging configured!
   Log file: d:\Strands-agent\logs\strands_agent_20251116_145958.log
   Console level: WARNING (errors only)
   File level: DEBUG (all traces)

📊 Log includes:
  • All agent operations
  • Tool calls and responses
  • Model interactions
  • Validation results
  • Error traces


In [3]:
# Helper functions to view and analyze logs

def view_latest_logs(num_lines=50, level_filter=None):
    """
    View the latest log entries from the current log file.
    
    Args:
        num_lines: Number of recent log lines to display
        level_filter: Filter by log level (e.g., 'ERROR', 'DEBUG', 'WARNING')
    """
    try:
        with open(log_filename, 'r') as f:
            lines = f.readlines()
        
        # Filter by level if specified
        if level_filter:
            filtered_lines = []
            for line in lines:
                try:
                    log_entry = json.loads(line)
                    if log_entry.get('level') == level_filter:
                        filtered_lines.append(line)
                except json.JSONDecodeError:
                    continue
            lines = filtered_lines
        
        # Show last N lines
        recent_lines = lines[-num_lines:]
        
        print(f"üìã Showing last {len(recent_lines)} log entries:")
        print(f"   Filter: {level_filter if level_filter else 'All levels'}")
        print(f"   Total entries: {len(lines)}")
        print("=" * 80)
        
        for line in recent_lines:
            try:
                log_entry = json.loads(line)
                print(f"{log_entry['timestamp']} | {log_entry['level']:8} | {log_entry['name']}")
                print(f"  ‚Üí {log_entry['message']}")
                if 'exception' in log_entry:
                    print(f"  ‚ö†Ô∏è  {log_entry['exception']}")
                print()
            except:
                print(line.strip())
    
    except FileNotFoundError:
        print(f"‚ùå Log file not found: {log_filename}")
    except Exception as e:
        print(f"‚ùå Error reading logs: {e}")


def count_tool_calls():
    """Count how many times each tool was called."""
    try:
        with open(log_filename, 'r') as f:
            lines = f.readlines()
        
        tool_calls = {}
        for line in lines:
            try:
                log_entry = json.loads(line)
                message = log_entry.get('message', '')
                
                # Look for tool call patterns
                if 'tool' in message.lower() and 'call' in message.lower():
                    # Extract tool name (you may need to adjust this based on actual log format)
                    if 'merge_sections' in message:
                        tool_calls['merge_sections'] = tool_calls.get('merge_sections', 0) + 1
                    elif 'read_file' in message:
                        tool_calls['read_file'] = tool_calls.get('read_file', 0) + 1
                    elif 'validate_latex' in message:
                        tool_calls['validate_latex'] = tool_calls.get('validate_latex', 0) + 1
                    elif 'extract_section' in message:
                        tool_calls['extract_section'] = tool_calls.get('extract_section', 0) + 1
            except json.JSONDecodeError:
                continue
        
        print("üîß Tool Call Summary:")
        print("=" * 40)
        for tool, count in sorted(tool_calls.items(), key=lambda x: x[1], reverse=True):
            print(f"  {tool:20} : {count:3} calls")
        
        if not tool_calls:
            print("  No tool calls detected in logs")
    
    except Exception as e:
        print(f"‚ùå Error analyzing tool calls: {e}")


def export_logs_to_readable(output_file=None):
    """Export JSON logs to a human-readable format."""
    if output_file is None:
        output_file = LOGS_DIR / f"readable_log_{dt.now().strftime('%Y%m%d_%H%M%S')}.txt"
    
    try:
        with open(log_filename, 'r') as f:
            lines = f.readlines()
        
        with open(output_file, 'w') as out:
            out.write(f"Strands Agent Log - Readable Format\n")
            out.write(f"Generated: {dt.now()}\n")
            out.write(f"Source: {log_filename}\n")
            out.write("=" * 80 + "\n\n")
            
            for line in lines:
                try:
                    log_entry = json.loads(line)
                    out.write(f"[{log_entry['timestamp']}] {log_entry['level']}\n")
                    out.write(f"Module: {log_entry['name']}\n")
                    out.write(f"Message: {log_entry['message']}\n")
                    if 'exception' in log_entry:
                        out.write(f"Exception:\n{log_entry['exception']}\n")
                    out.write("-" * 80 + "\n\n")
                except (json.JSONDecodeError, KeyError):
                    # Keep unparseable lines verbatim
                    out.write(line)
        
        print(f"✅ Readable log exported to: {output_file}")
        return output_file
    
    except Exception as e:
        print(f"❌ Error exporting logs: {e}")


print("✅ Log analysis functions defined:")
print("  - view_latest_logs(num_lines=50, level_filter=None)")
print("  - count_tool_calls()")
print("  - export_logs_to_readable(output_file=None)")
print()
print("Examples:")
print('  view_latest_logs(20)                    # Last 20 log entries')
print('  view_latest_logs(level_filter="ERROR")  # Only errors')
print('  count_tool_calls()                      # Tool usage stats')
print('  export_logs_to_readable()               # Export to .txt file')

✅ Log analysis functions defined:
  - view_latest_logs(num_lines=50, level_filter=None)
  - count_tool_calls()
  - export_logs_to_readable(output_file=None)

Examples:
  view_latest_logs(20)                    # Last 20 log entries
  view_latest_logs(level_filter="ERROR")  # Only errors
  count_tool_calls()                      # Tool usage stats
  export_logs_to_readable()               # Export to .txt file


## Configuration

Set up paths and verify environment.

In [4]:
# Project paths
PROJECT_ROOT = Path.cwd()
PROMPTS_DIR = PROJECT_ROOT / "prompts"
DATA_DIR = PROJECT_ROOT / "data"
ORIGINAL_RESUME_DIR = DATA_DIR / "original"
JOB_POSTINGS_DIR = DATA_DIR / "job_postings"
OUTPUT_DIR = DATA_DIR / "tailored_versions"

# Detect which API credentials are available
print("üîç Checking API credentials...")
print()

has_openai = bool(os.getenv('OPENAI_API_KEY'))
has_bedrock_token = bool(os.getenv('AWS_BEARER_TOKEN_BEDROCK'))
has_aws_creds = bool(os.getenv('AWS_ACCESS_KEY_ID'))

from strands.models import openai

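if has_openai:
    print("✅ OpenAI API key found")
    MODEL_PROVIDER = "openai"
    # MODEL_ID = openai.OpenAIModel(model_id="gpt-5-mini")
    MODEL_ID = openai.OpenAIModel(
        model_id="gpt-5.1",  # Note: prompt caching is available on gpt-4o and newer models
        params={
            # `store` keeps completions for OpenAI's evals/distillation tooling.
            # It does not switch prompt caching on; OpenAI applies caching
            # automatically to sufficiently long prompts.
            "store": True,
            "metadata": {
                "purpose": "resume_tailoring"  # Optional: track usage
            }
        }
    )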
elif has_bedrock_token:
    print("✅ AWS Bedrock bearer token found")
    MODEL_PROVIDER = "bedrock"
    MODEL_ID = "us.anthropic.claude-sonnet-4-20250514-v1:0"
elif has_aws_creds:
    print("✅ AWS credentials found")
    MODEL_PROVIDER = "bedrock"
    MODEL_ID = "us.anthropic.claude-sonnet-4-20250514-v1:0"
else:
    print("⚠️  Warning: No API credentials found!")
    print("Please set one of the following in .env file:")
    print("  - OPENAI_API_KEY (for OpenAI)")
    print("  - AWS_BEARER_TOKEN_BEDROCK (for Bedrock)")
    print("  - AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY (for AWS)")
    MODEL_PROVIDER = None
    MODEL_ID = None

print()
print(f"🤖 Selected Model: {MODEL_ID}")

# Verify directories exist
print()
print(f"📁 Project directories:")
print(f"  Prompts: {PROMPTS_DIR.exists()} - {PROMPTS_DIR}")
print(f"  Data: {DATA_DIR.exists()} - {DATA_DIR}")
print(f"  Output: {OUTPUT_DIR.exists()} - {OUTPUT_DIR}")

🔍 Checking API credentials...

✅ OpenAI API key found

🤖 Selected Model: <strands.models.openai.OpenAIModel object at 0x000001A10AAE2F90>

📁 Project directories:
  Prompts: True - d:\Strands-agent\prompts
  Data: True - d:\Strands-agent\data
  Output: True - d:\Strands-agent\data\tailored_versions


## Load System Prompts

Load agent instructions from separate files for easy iteration.

In [5]:
def load_prompt(filename: str) -> str:
    """Load a prompt from the prompts directory."""
    prompt_path = PROMPTS_DIR / filename
    if not prompt_path.exists():
        print(f"⚠️  Warning: {filename} not found. Using an empty prompt.")
        return ""
    
    with open(prompt_path, 'r', encoding='utf-8') as f:
        content = f.read()
    print(f"✅ Loaded {filename} ({len(content)} chars)")
    return content

# Load prompts
system_prompt = load_prompt("system_prompt.txt")
latex_rules = ""

# Combine prompts
full_prompt = f"{system_prompt}\n\n{latex_rules}".strip()

print(f"\nüìù Full system prompt: {len(full_prompt)} characters")

✅ Loaded system_prompt.txt (8863 chars)

📝 Full system prompt: 8862 characters


## Section Generator Agent (Tool-Free)

This agent generates ONLY section text without calling any tools.
Python code handles all file I/O, merging, and validation.

### Architecture Benefits

- **No tool overhead**: Agent only generates text (30-40% lower token cost)
- **Predictable output**: Structured format with clear section markers
- **Python-controlled**: All file I/O and merging handled by Python code
- **Prompt caching enabled**: Additional 20-40% cost savings on repeated requests
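
To illustrate the "clear section markers" idea, here is a minimal sketch of splitting agent output into named sections. It assumes each marker is an all-caps label alone on a line ending with a colon, as in the sample output (`SUBTITLE:`, `PROFESSIONAL SUMMARY:`). The function `split_marked_sections` is hypothetical and is not the notebook's own `parse_sections` helper, whose implementation is not shown.

```python
import re

# Assumed marker format: an all-caps label alone on a line, ending with a colon.
MARKER = re.compile(r"^([A-Z][A-Z ]*):\s*$")


def split_marked_sections(text: str) -> dict[str, str]:
    """Map each marker label to the text that follows it, up to the next marker."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = MARKER.match(line)
        if match:
            # A new marker starts a new section.
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}
```

Text that appears before the first marker is discarded, which keeps any preamble the model might emit out of the merged resume.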

In [6]:
# Create tool-free agent for section generation
section_generator_agent = Agent(
    model=MODEL_ID,
    system_prompt=full_prompt,
    tools=[]  # NO TOOLS - agent only generates text
)

print("✅ Section Generator Agent created!")
print(f"   Model: {MODEL_ID}")
print(f"   Tools: {len(section_generator_agent.tool_names)} (none - text generation only)")
print(f"   System prompt: {len(full_prompt)} characters")
print()
print("This agent:")
print("  • Generates ONLY section text (no tool calls)")
print("  • Returns sections in predictable format")
print("  • Python code handles merge/validation")
print("  • Lower token cost (no tool overhead)")

✅ Section Generator Agent created!
   Model: <strands.models.openai.OpenAIModel object at 0x000001A10AAE2F90>
   Tools: 0 (none - text generation only)
   System prompt: 8862 characters

This agent:
  • Generates ONLY section text (no tool calls)
  • Returns sections in predictable format
  • Python code handles merge/validation
  • Lower token cost (no tool overhead)


## Helper Functions for Section-Only Workflow

In [7]:
# Import helper functions from tools directory
from tools.resume_helpers import parse_sections, tailor_resume_sections

print("✅ Helper functions imported:")
print("  - parse_sections(result) -> dict")
print("  - tailor_resume_sections(agent, job_posting_path, original_resume_path, output_path, include_experience=False)")
print()
print("Example usage:")
print('''
job_posting_path = "data/job_postings/ml_engineer.txt"

result = tailor_resume_sections(
    section_generator_agent,
    job_posting_path=job_posting_path,
    original_resume_path="data/original/AI_engineer.tex",
    output_path="data/tailored_versions/ml_engineer.tex"
)
''')

✅ Helper functions imported:
  - parse_sections(result) -> dict
  - tailor_resume_sections(agent, job_posting_path, original_resume_path, output_path, include_experience=False)

Example usage:

job_posting_path = "data/job_postings/ml_engineer.txt"

result = tailor_resume_sections(
    section_generator_agent,
    job_posting_path=job_posting_path,
    original_resume_path="data/original/AI_engineer.tex",
    output_path="data/tailored_versions/ml_engineer.tex"
)



---

## Usage Example

The complete workflow is encapsulated in `tailor_resume_sections()`:

1. **Extracts** only the sections you want to change from the original resume
2. **Copies** the source resume to the output location
3. **Generates** tailored sections via the agent (using a compact prompt)
4. **Merges** the updated sections back into the copied resume
5. **Validates** LaTeX syntax and returns status
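
The merge step can be pictured with a short sketch. It assumes sections start with a `\section...{Title}` line, matching the two-argument heading seen in the sample output, and run until the next `\section` or `\end{document}`. The function `replace_section` is hypothetical; the helper module's actual merge logic is not shown in this notebook.

```python
import re


def replace_section(document: str, title: str, new_block: str) -> str:
    """Swap the section whose heading line ends with {title} for new_block."""
    # Heading: a line starting with \section that contains {title}.
    heading = r"^\\section\b[^\n]*\{" + re.escape(title) + r"\}[^\n]*\n"
    # Body: everything up to the next \section, \end{document}, or end of text.
    body_until_next = r".*?(?=^\\section\b|^\\end\{document\}|\Z)"
    pattern = re.compile(heading + body_until_next, re.DOTALL | re.MULTILINE)

    if not pattern.search(document):
        raise ValueError(f"section not found: {title}")

    # A callable replacement keeps backslashes in new_block literal.
    return pattern.sub(lambda _m: new_block.rstrip("\n") + "\n\n", document, count=1)
```

Raising on a missing section, rather than returning the document unchanged, makes a silent no-op merge visible before validation runs.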

### Function Signature

```python
tailor_resume_sections(
    section_generator_agent,      # The agent instance to use
    job_posting_path: str,        # Path to job posting .txt file
    original_resume_path: str,    # Path to original .tex file
    output_path: str,             # Path to save tailored .tex file
    include_experience: bool = False  # Update Experience section?
) -> str  # Returns validation result message
```

In [8]:
# Example: Tailor resume for a specific job posting

job_posting_path = "data/job_postings/quanlom.txt"  # Your job posting file
original_resume = "data/original/AI_engineer.tex"    # Your original resume
output_file = "data/tailored_versions/tailored_resume.tex"  # Output

result = tailor_resume_sections(
    section_generator_agent,  # Pass the agent instance
    job_posting_path=job_posting_path,
    original_resume_path=original_resume,
    output_path=output_file,
    include_experience=True  # Set to True to update Experience section
)

📋 Starting resume tailoring...
   Job posting: data/job_postings/quanlom.txt
   Original: data/original/AI_engineer.tex
   Output: data/tailored_versions/tailored_resume.tex

📤 Extracting sections from original resume...
   ✓ Extracted: Professional Summary
   ✓ Extracted: Technical Proficiencies
   ✓ Extracted: Professional Experience

📁 Copying original resume to output directory...
   ✓ Copied to: data/tailored_versions/tailored_resume.tex

🤖 Generating tailored sections...
SUBTITLE:
Data Engineer

PROFESSIONAL SUMMARY:
\section{\faUser}{Professional Summary}
\resumeEntryStart
\resumeEntryS{}{
Data Engineer with experience designing, automating, and supporting data ingestion and transformation pipelines on \textbf{AWS}. Proficient in \textbf{Python}, \textbf{SQL}, and distributed data processing using \textbf{Spark} on platforms such as EMR and Glue, with both batch and real-time workflows. Build and operate containerized services on \textbf{ECS/Fargate} and \tex

---

## Next Steps

1. **Add job postings**: Save job posting text in `data/job_postings/` as `.txt` files
2. **Prepare resume**: Ensure your LaTeX resume is in `data/original/`
3. **Run tailoring**: Execute the cell above with your file paths
4. **Review output**: Check the generated `.tex` file in `data/tailored_versions/`
5. **Compile PDF**: Use `pdflatex` or your LaTeX editor to generate the final PDF
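
For step 5, compilation can also be scripted from the notebook. The sketch below assumes `pdflatex` is installed and on `PATH`; the function names and the output directory are illustrative, not part of the project.

```python
import subprocess
from pathlib import Path


def build_pdflatex_command(tex_path: str, output_dir: str) -> list[str]:
    """Assemble a non-interactive pdflatex invocation."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # do not pause for input on errors
        "-halt-on-error",            # stop at the first error
        f"-output-directory={output_dir}",
        str(tex_path),
    ]


def compile_pdf(tex_path: str, output_dir: str) -> Path:
    """Run pdflatex and return the expected PDF path; raises if compilation fails."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pdflatex_command(tex_path, output_dir), check=True)
    return Path(output_dir) / (Path(tex_path).stem + ".pdf")
```

A failed compile raises `subprocess.CalledProcessError`, which is a useful second check on top of the notebook's own LaTeX validation.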

### Tips

- **Batch processing**: Run the function multiple times with different job postings to maximize prompt caching benefits (20-40% cost savings after first request)
- **Include experience**: Set `include_experience=True` for roles where the Experience section needs tailoring
- **Monitor logs**: Use `view_latest_logs()` or `count_tool_calls()` to verify that the agent did not call any tools
- **Token savings**: With no tool overhead, this architecture uses 30-40% fewer tokens than the old workflow, in addition to the caching savings
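
A batch run could look like the sketch below, which loops over every `.txt` posting in a directory and writes one `.tex` file per posting. The wrapper `tailor_all` is hypothetical; it takes the tailoring function as a parameter and calls it with the keyword arguments from the signature documented above.

```python
from pathlib import Path
from typing import Callable


def tailor_all(
    tailor_fn: Callable[..., str],
    agent,
    postings_dir: str,
    original_resume_path: str,
    output_dir: str,
) -> dict[str, str]:
    """Call tailor_fn once per .txt posting; map posting filename to its result."""
    results = {}
    for posting in sorted(Path(postings_dir).glob("*.txt")):
        # Name each output after its posting, e.g. ml_engineer.txt -> ml_engineer.tex
        output_path = Path(output_dir) / f"{posting.stem}.tex"
        results[posting.name] = tailor_fn(
            agent,
            job_posting_path=str(posting),
            original_resume_path=original_resume_path,
            output_path=str(output_path),
        )
    return results
```

In this notebook the call would be `tailor_all(tailor_resume_sections, section_generator_agent, "data/job_postings", "data/original/AI_engineer.tex", "data/tailored_versions")`.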