Skip to content

refactor(agentkit): engineering-standards uplift (logging / exceptions / timeouts)#104

Merged
cuericlee merged 4 commits into
mainfrom
feat/agentkit-engineering-standards
Jul 1, 2026
Merged

refactor(agentkit): engineering-standards uplift (logging / exceptions / timeouts)#104
cuericlee merged 4 commits into
mainfrom
feat/agentkit-engineering-standards

Conversation

@tiantt

@tiantt tiantt commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

What

Raises the core HTTP / credential / SDK inner-loop to one consistent engineering standard across three dimensions, plus the underlying transient-retry fix.

Foundations (new leaves): agentkit/errors.py (single stdlib-only root AgentKitError), utils/http_defaults.py (one source of truth for timeout/retry env+defaults), utils/redact.py + a RedactionFilter on every log handler.

  • Logging: get_logger(__name__) now actually used in the scoped modules (previously 0 callers); library stays silent-by-default.
  • Exceptions: transport errors wrapped into NetworkError; typed ApiError (with error_code); from e chaining; raise eraise; runtime-validation asserts→raises.
  • Secret leaks closed: API key (identity/auth), full response (request.py, cr.py debug, ve_sign.check_error). print(1)/print(2) removed. request() now re-raises instead of silently returning None.
  • Timeouts: base_service_client reads http_timeout()/http_retries() instead of hardcoded 30/30; cli_mount urlopen bounded.
  • Transient retry (base commit): OpenAPI calls get a bounded timeout + conservative retry (connection errors + 429/503 only, honors Retry-After, never retries read-timeouts — safe for non-idempotent Create*).

Tests

+22 offline unit tests (hierarchy, http_defaults clamping, RedactionFilter, _signed_request/_invoke_api error paths). Backward-compatible: except AuthError still catches the same set; no code did except AgentKitError before.

Stack

Base of the stack. Followed by the test PR (test/agentkit-apps-pipeline-ut) then the fix PR (fix/agentkit-latent-bugs).

🤖 Generated with Claude Code

tiantt and others added 3 commits July 1, 2026 10:47
…/exception/timeout standard

Establishes one standard for each of three engineering dimensions on the
inner-loop HTTP/credential/SDK layer, and applies it. Good building blocks
already existed but were unused (get_logger had 0 callers; redact()/mask()
were auth-only).

Foundations (new, dependency-light leaves):
- agentkit/errors.py: single stdlib-only root AgentKitError. toolkit.errors
  .AgentKitError and auth.errors.AuthError both reparent onto it, so a caller
  can `except agentkit.errors.AgentKitError` to catch any AgentKit failure
  while agentkit.auth stays free of any agentkit.toolkit import.
- agentkit/utils/http_defaults.py: single source of truth for timeout/retry
  defaults + env vars (AGENTKIT_HTTP_TIMEOUT/RETRIES/STREAM_TIMEOUT), clamped.
- agentkit/utils/redact.py: redact()/mask() moved here; auth/_redact.py is now
  a re-export shim. RedactionFilter installed on every log handler.

Logging: swap logging.getLogger()/idioms for get_logger(__name__) in the
scoped core modules; library stays silent-by-default.

Exceptions: wrap transport errors into NetworkError at the HTTP-owning layer;
typed ApiError (with optional error_code) for backend failures; always chain
with `from e`; `raise e` -> bare `raise`; runtime-validation asserts -> raises.

Secret-leak fixes: drop raw response/secret interpolation from identity/auth
(API key), utils/request, cr.py debug log, ve_sign check_error. request() now
re-raises instead of silently returning None on failure.

Timeouts: base_service_client reads http_timeout()/http_retries() instead of
hardcoded 30/30; cli_mount urlopen bounded via http_timeout().

Tests: +22 offline tests (hierarchy, http_defaults clamping, RedactionFilter,
_signed_request/_invoke_api error paths). Full suite: only the 1 pre-existing
failure on main (test_cli_add_harness env leak) remains; 0 new failures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Targets the two highest-risk under-tested areas found by a coverage audit:
the deployable app server surfaces (apps/, ~9.9% -> 76.3%) and the deploy
pipeline logic (toolkit runners/executors/strategies). Overall line coverage
45.9% -> 50.9%. +258 tests, all offline (no network/docker/uvicorn).

apps/ (Starlette TestClient + asyncio.run, telemetry singletons spied):
- simple_app: health/readiness/liveness + /ping wiring + ping() arg guard
- simple_app_handlers: InvokeHandler signature dispatch + response matrix,
  _convert_to_sse, AsyncTaskHandler, PingHandler
- a2a_app / mcp_app: decorator guards, ping endpoint, env-detect routes/tools
- agent_server: AgentKitAgentLoader + ctor guards; telemetry latency gating
  and error_type formatting; ASGI middleware header-exclusion + finish gating

pipeline (fakes injected via the existing DI seams):
- hybrid_strategy: _should_push_to_cr / _validate_cr_image_url error mapping,
  build/deploy config-updates assembly, invoke/status/destroy passthrough
- base_executor: _classify_error / _handle_exception (pins the ErrorCode
  mapping introduced by the engineering-standards work), preflight handling
- init_executor: render context, template/name validation, project scaffold
- runners/base + ve_agentkit: pure payload/detection helpers, _needs_runtime_
  update env diffing, deploy guard/dispatch, destroy/status mapping

Pins (NOT fixes — this branch is tests-only) three latent source bugs to
their current behavior with # NOTE comments, for a follow-up fix:
- mcp_app tool()/agent_as_a_tool() wrappers: on a tool exception the finally
  passes func_result=result while result is unbound -> UnboundLocalError masks
  the real error and skips telemetry (all 4 wrapper branches).
- simple_app_handlers PingHandler._format_ping_status: self.func.__name__
  raises AttributeError when no ping func is registered.
- simple_app_handlers AsyncTaskHandler.handle: signature omits request,
  violating the BaseHandler.handle(request) contract.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…st surface)

The prior batch raised the coverage number but left the product's core
orchestration ~0%. This batch tests that spine directly with proper fakes,
asserting behavior (error codes, response shapes, collaborator calls),
lifting the critical functions from ~0% to tested:

  ve_pipeline.build (build->push->deploy spine)   1.9% -> 100%
  ve_pipeline._upload_to_tos (bucket-ownership gate) 0% -> 70%
  ve_agentkit._create_new_runtime                    3% -> 91%
  ve_agentkit._update_existing_runtime             1.8% -> 88%
  ve_agentkit._wait_for_runtime_status             9.5% -> 100%
  runners/base._http_post_invoke (invoke transport)  1% -> 91%
  builders/local_docker.build                      0.7% -> 86%
  agent_server_app._invoke_compat (/invoke)          0% -> 89%

Overall coverage 45.9% -> 53.7%. +81 tests, all offline (no network/docker/
uvicorn; polling loops patched). No source modified.

Pins three more latent source bugs to current behavior (# NOTE, for follow-up):
- ve_pipeline._upload_to_tos: the documented ValueError for an unrendered
  {{template}} bucket name is caught by the broad `except Exception` tail and
  re-raised as a generic Exception, so callers can never catch ValueError.
- ve_agentkit._update_existing_runtime: client_token=generate_client_token()
  is passed to UpdateRuntimeRequest, which has no such field and extra="ignore",
  so the idempotency token is silently dropped (Create keeps it, Update sends none).
- builders/local_docker.build: BuildResult.build_logs is typed List[str] but
  the raw str from DockerManager.build_image is assigned directly on both the
  success and failure paths.

Also hardens test_ve_pipeline_upload_tos's AccountDisable assertion to be
cloud-provider-independent (console host varies volcengine vs byteplus and
provider global state leaks across the full suite).

Remaining gap: _execute_build's poll loop is still thin (mocked out to isolate
the build spine) — a follow-up target.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@tiantt tiantt force-pushed the feat/agentkit-engineering-standards branch from fedc9c8 to ad13eaf Compare July 1, 2026 02:48

@cuericlee cuericlee left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Overall: LGTM

This is a well-structured refactoring PR. Key strengths:

Category | Grade -- | -- assert → raise | ✅ Eliminates -O-dependent behavior across the SDK Exception hierarchy | ✅ AgentKitError root enables uniform error handling Redact extraction | ✅ Clean module relocation with backward-compatible shim HTTP defaults consolidation | ✅ DRY improvement Logging fixes | ✅ Exc_info, lazy formatting, and critical raise bugfix in request.py Test coverage | ✅ New dedicated test files; latent-bug-pinning pattern Backward compatibility | ✅ All public APIs preserved

One note: The raise addition in agentkit/utils/request.py (Theme 6) changes the function's contract from "may return None on error" to "always raises on error." This is the right fix, but verify that all call sites handle the exception rather than relying on a None-check pattern.

@cuericlee cuericlee merged commit 5fb9f81 into main Jul 1, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants