feat: package restructure with backward compatibility and error handling #2

allenday · 2026-01-24T09:37:44Z

This PR implements the package restructuring (PR1) with critical error
handling improvements added for production resilience.

Package Structure & Backward Compatibility

New Package Structure

Moved all modules to agentproxy/ package directory
Created pyproject.toml with entry points for pa and pa-server
Added agentproxy/__init__.py with proper exports
Added agentproxy/__main__.py for module execution

Backward Compatibility (CRITICAL)

cli.py - Shim for old python cli.py entry point
server.py - Shim for old python server.py entry point
All old entry points continue to work without modification
New entry points available after pip install -e .:
- pa command
- pa-server command
- python -m agentproxy
- python -m agentproxy.server
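A minimal sketch of how a root-level shim could delegate to the packaged code. The PR text does not show the shim bodies, so the helper name `run_entry_point`, the target module `agentproxy.cli`, and the `main` function name are all assumptions made for illustration.

```python
import importlib


def run_entry_point(module_name: str, func_name: str = "main") -> int:
    """Import a packaged module and call its entry function.

    Hypothetical helper: the real shims may delegate differently.
    """
    module = importlib.import_module(module_name)
    entry = getattr(module, func_name)
    result = entry()
    # Treat a None return as success, as console scripts do.
    return 0 if result is None else int(result)
```

Under these assumptions, the old `cli.py` would reduce to a single call such as `sys.exit(run_entry_point("agentproxy.cli"))`, which keeps `python cli.py` working while the logic lives in the package.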

Tests (29 passing)

tests/test_baseline_compatibility.py - 23 tests covering:
- Old entry points (python cli.py, python server.py)
- New entry points (pa, pa-server, python -m agentproxy)
- Package imports and structure
- Core components
tests/unit/test_gemini_error_handling.py - 4 unit tests
tests/integration/test_integration_error_handling.py - 2 integration tests

Gemini Error Handling & Resilience (Critical Addition)

Problem Solved

Previous implementation had no retry logic or error recovery, causing:

Lost instruction context during transient API errors
Abrupt failures on rate limits or network issues
No graceful degradation

Solution Implemented

1. Retry Logic with Exponential Backoff (gemini_client.py)

GeminiAPIError exception with retry metadata
Automatic retry on 5xx errors (max 3 attempts)
Exponential backoff: 1s, 2s, 4s
No retry on 4xx client errors (won't succeed)
Structured error messages for upstream handling
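The retry behaviour listed above could look roughly like the sketch below. It is not the code in gemini_client.py: the `status_code` and `retryable` fields, the `call_with_retry` helper, and the injectable `sleep` parameter are assumptions. It also assumes one initial call followed by up to three retries, which is one reading of "max 3 attempts" that matches the three listed delays.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class GeminiAPIError(Exception):
    """API failure carrying retry metadata (field names are assumed)."""

    def __init__(self, message: str, status_code: Optional[int] = None,
                 retryable: bool = True) -> None:
        super().__init__(message)
        self.status_code = status_code
        self.retryable = retryable

    @property
    def is_client_error(self) -> bool:
        # 4xx responses will not succeed on retry.
        return self.status_code is not None and 400 <= self.status_code < 500


def call_with_retry(call: Callable[[], T], max_retries: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, sleeping 1s, 2s, 4s between retryable failures."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except GeminiAPIError as err:
            out_of_retries = attempt == max_retries
            if err.is_client_error or not err.retryable or out_of_retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting for them.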

2. Error Recovery & Context Preservation (pa_agent.py)

Track consecutive errors (resets on success)
First 2 errors: Return NO_OP to preserve instruction context
3rd consecutive error: Trigger SAVE_SESSION for graceful exit
Parse error strings to extract type, status code, message
Increment error counter on both API and parse failures
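The counter rules above can be sketched as a small state holder. The class name `ConsecutiveErrorTracker` and its methods are hypothetical; only the NO_OP and SAVE_SESSION names and the threshold of three come from the description.

```python
class ConsecutiveErrorTracker:
    """Map consecutive failures to the action named in the PR text."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.count = 0

    def record_success(self) -> None:
        # Any successful response resets the streak.
        self.count = 0

    def record_error(self) -> str:
        # Called for both API failures and parse failures.
        self.count += 1
        if self.count >= self.threshold:
            return "SAVE_SESSION"
        return "NO_OP"
```

Because a success resets the count, two errors followed by a success and then another error yields NO_OP again rather than SAVE_SESSION.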

3. Instruction Preservation (pa.py)

Track _original_task and _last_valid_instruction
NO_OP returns empty string (signals "no change")
Main loop keeps last valid instruction during errors
Prevents sending vague "Continue" during error states
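One way to picture the instruction-preservation logic is the sketch below. The attribute names `_original_task` and `_last_valid_instruction` come from the description; the `InstructionState` class, the `next_instruction` method, and the fallback to the original task when no instruction has been seen yet are assumptions.

```python
from typing import Optional


class InstructionState:
    """Keep the last usable instruction while the agent returns NO_OP."""

    def __init__(self, original_task: str) -> None:
        self._original_task = original_task
        self._last_valid_instruction: Optional[str] = None

    def next_instruction(self, agent_output: str) -> str:
        if agent_output:
            # A non-empty result is a real instruction, so remember it.
            self._last_valid_instruction = agent_output
            return agent_output
        # Empty string means NO_OP: reuse what we had instead of "Continue".
        return self._last_valid_instruction or self._original_task
```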

4. Session Save Function (function_executor.py)

New SAVE_SESSION function for graceful exit
Saves session state with error context
Returns session ID for resumption
Metadata includes error type, status code, reason
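As an illustration of what saving a session with error context might involve, here is a sketch that writes a JSON file and returns its ID. The function name `save_session`, the on-disk layout, and the use of a UUID as the session ID are assumptions; the PR text only states that error type, status code, and reason are stored and that a session ID is returned.

```python
import json
import uuid
from pathlib import Path
from typing import Any, Dict, Optional


def save_session(state: Dict[str, Any], error_type: str,
                 status_code: Optional[int], reason: str,
                 directory: str) -> str:
    """Persist session state plus error metadata; return the session ID."""
    session_id = uuid.uuid4().hex
    payload = {
        "session_id": session_id,
        "state": state,
        "error": {
            "type": error_type,
            "status_code": status_code,
            "reason": reason,
        },
    }
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    (target_dir / f"{session_id}.json").write_text(json.dumps(payload, indent=2))
    return session_id
```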

Error Flow

Gemini API Error
  ↓
Retry 3 times with exponential backoff
  ↓
If all retries fail → Return [GEMINI_ERROR:type:code:message]
  ↓
PA Agent detects error string
  ↓
Error #1-2: Return NO_OP (preserve instruction)
  ↓
Error #3: Trigger SAVE_SESSION (graceful exit)
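The `[GEMINI_ERROR:type:code:message]` string in the flow above has to be parsed back into its parts on the agent side. A possible parser is sketched below; the regular expression, the `ParsedError` tuple, and the choice to return `None` for a non-numeric code are assumptions rather than the behaviour of pa_agent.py.

```python
import re
from typing import NamedTuple, Optional

# Type and code may not contain ':'; the message may, so it is matched last.
_ERROR_RE = re.compile(r"\[GEMINI_ERROR:([^:\]]*):([^:\]]*):(.*)\]", re.DOTALL)


class ParsedError(NamedTuple):
    error_type: str
    status_code: Optional[int]
    message: str


def parse_gemini_error(text: str) -> Optional[ParsedError]:
    """Return the error parts, or None if `text` holds no error marker."""
    match = _ERROR_RE.search(text)
    if match is None:
        return None
    error_type, code, message = match.groups()
    status_code = int(code) if code.isdigit() else None
    return ParsedError(error_type, status_code, message)
```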

Testing

Unit tests verify error detection, counter increment, session save
Integration tests verify instruction preservation across error cycles
All 29 tests passing

Files Changed

New Files

cli.py - Backward compatibility shim
server.py - Backward compatibility shim
tests/test_baseline_compatibility.py - Package structure tests
tests/unit/test_gemini_error_handling.py - Error handling unit tests
tests/integration/test_integration_error_handling.py - Integration tests
tests/__init__.py, tests/unit/__init__.py, tests/integration/__init__.py

Modified Files

agentproxy/gemini_client.py - Retry logic, GeminiAPIError exception
agentproxy/pa_agent.py - Error detection, consecutive error tracking
agentproxy/pa.py - Instruction preservation during errors
agentproxy/function_executor.py - SAVE_SESSION function
.gitignore - Updated patterns

Deleted Files

test_state_detection.py - Obsolete test file

Verification

All entry points tested and working:

✅ python cli.py --help
✅ python server.py --help
✅ pa --help
✅ pa-server --help
✅ python -m agentproxy --help
✅ python -m agentproxy.server --help
✅ Package imports work correctly
✅ All 29 tests passing

Breaking Changes

None - full backward compatibility maintained.

- Restructure project into agentproxy package
- Add OpenTelemetry support with traces and metrics
- Add --claude-bin CLI option for custom Claude binary path
- Fix .env loading path to project root
- Add security fixes for command injection in BrowserVerifier
- Add OTEL stack examples with Grafana dashboards
- Add pyproject.toml for proper packaging

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>

- Add build artifacts (*.egg-info, build/, dist/) to gitignore
- Add vim undo files (*.un~) to gitignore
- Add sandbox/ directory to gitignore
- Include claude-code-otel.sh wrapper for testing upstream telemetry
- Include implementation plan document

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>

- Add CLAUDE_BIN env var support for custom Claude binary paths
- Add automatic .env file loading in CLI and server
- Update .env.example with CLAUDE_BIN documentation

These changes enable using local Claude builds with stream-json support and simplify configuration management.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Fixed multiple critical bugs preventing PA from terminating correctly:

1. Path duplication in verification (./sandbox/./sandbox/file.py)
   - Use os.path.relpath() since cwd is already set
2. Gemini retry logic not respecting retryable flag
   - Check both is_client_error and retryable flag
3. Missing DONE state in ControllerState enum
   - Added DONE state for proper termination
   - PA checks for DONE state and breaks iteration loop
4. PA hallucinating requirements from task breakdown
   - Clarified breakdown is GUIDE not REQUIREMENTS
   - PA now validates against ORIGINAL TASK only
   - Updated prompts to prevent adding hallucinated requirements
5. Verification results not in PA reasoning context
   - PA now sees verification results when making decisions
   - Includes [VERIFICATION RESULT] in context passed to run_iteration
6. AttributeError accessing event.source
   - Use event.metadata.get('source') instead

These fixes enable PA to properly terminate after task completion instead of continuing to iterate unnecessarily.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
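The path duplication in fix 1 can be illustrated with a small sketch. The helper name `to_cwd_relative` is hypothetical, and the sketch assumes the verifier already runs with the sandbox directory as its working directory, as the commit message states.

```python
import os


def to_cwd_relative(file_path: str, workdir: str) -> str:
    """Express `file_path` relative to `workdir`.

    Joining workdir and file_path again would give a duplicated path such
    as ./sandbox/./sandbox/file.py; relpath strips the shared prefix.
    """
    return os.path.relpath(file_path, workdir)
```

With `workdir="./sandbox"` and `file_path="./sandbox/file.py"`, the helper returns a path that no longer repeats the sandbox prefix and can be used directly from inside that directory.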

allenday and others added 6 commits January 24, 2026 15:51

chore(docs): .env.example

bc83358

allenday force-pushed the feat/package-restructure branch from f17f63e to ea5502e Compare January 25, 2026 09:47

allenday mentioned this pull request Jan 25, 2026

feat: persist PA state for supervisor/worker composability #4

Open
