feat(builder-sidecar): log API calls for run-pov, apply-patch-build, run-test #148

@0xdkay

Description

Problem

There is currently no way to count how many times a CRS called the builder sidecar APIs (run-pov, apply-patch-build, run-test) during a trial, or to measure their success/failure rates. This data is needed to evaluate each CRS's behavior: how many build iterations, POV verifications, and test runs it performs, how often they succeed, and how the CRS iterates through build-test-verify cycles.

Currently the sidecar processes requests and returns results, but it does not persist any record of API calls. The only trace is the Docker service stdout, which is lost if the container is SIGKILL'd.

Proposal

Register a log directory via libCRS register-log-dir at sidecar startup and write a structured JSONL log of every API call with its result. This ensures the log is persisted to the host-mounted LOG_DIR and survives container termination.
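A minimal sketch of the append-only write path, to illustrate why JSONL suits this: each call becomes one self-contained line, so a log cut short by container termination is still parseable up to the last complete line. The function name `append_jsonl`, the file name, and reading `LOG_DIR` from an environment variable are assumptions for illustration; the libCRS register-log-dir step is not shown because its interface is not described here.

```python
import json
import os
import tempfile
from pathlib import Path


def append_jsonl(log_path: Path, record: dict) -> None:
    """Append one record as a single JSON line (hypothetical helper)."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    line = json.dumps(record, separators=(",", ":")) + "\n"
    # Append mode: a restarted sidecar adds to the log instead of truncating it.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(line)
        # flush() hands the line to the OS, so it no longer lives only in
        # this process's buffer if the process is killed right afterwards.
        f.flush()
        # fsync() additionally asks the OS to commit the data to storage.
        os.fsync(f.fileno())


if __name__ == "__main__":
    # Assumption: the log directory is exposed as the LOG_DIR environment
    # variable; fall back to a temporary directory for this demo.
    log_dir = Path(os.environ.get("LOG_DIR", tempfile.mkdtemp()))
    path = log_dir / "builder_sidecar_api.jsonl"  # hypothetical file name
    append_jsonl(path, {"api": "run-pov", "success": True})
    print(path.read_text(encoding="utf-8"), end="")
```

Writing and flushing per call costs a little latency on each request, but it keeps the log complete up to the last finished call without relying on a clean shutdown.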

Each log entry should record: timestamp, API name, key parameters (harness, build_id), exit code, success/failure, and duration.
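The fields listed above could be captured roughly as follows. This is a sketch under stated assumptions, not the sidecar's actual code: the key names (`api`, `duration_s`, and so on), the helpers `make_record`, `logged_call`, and `summarize`, and the rule that exit code 0 means success are all hypothetical. The `sink` argument stands in for whatever writes the line to the log file.

```python
import json
import time
from collections import Counter
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional


def make_record(api: str, harness: Optional[str], build_id: Optional[str],
                exit_code: int, duration_s: float) -> dict:
    """Build one log entry with the proposed fields (key names are assumed)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "api": api,
        "harness": harness,
        "build_id": build_id,
        "exit_code": exit_code,
        "success": exit_code == 0,  # assumption: exit code 0 means success
        "duration_s": round(duration_s, 3),
    }


def logged_call(api: str, handler: Callable[[], int],
                sink: Callable[[str], None],
                harness: Optional[str] = None,
                build_id: Optional[str] = None) -> int:
    """Run handler, time it, and emit one JSON line even if it raises."""
    start = time.monotonic()
    exit_code = -1  # recorded if the handler raises before returning
    try:
        exit_code = handler()
        return exit_code
    finally:
        duration = time.monotonic() - start
        sink(json.dumps(make_record(api, harness, build_id, exit_code, duration)))


def summarize(lines: Iterable[str]) -> dict:
    """Return {api: (total_calls, successful_calls)} from JSONL lines."""
    calls, ok = Counter(), Counter()
    for line in lines:
        rec = json.loads(line)
        calls[rec["api"]] += 1
        ok[rec["api"]] += 1 if rec["success"] else 0
    return {api: (calls[api], ok[api]) for api in calls}


if __name__ == "__main__":
    log = []  # in-memory sink for the demo; a real sink would append to a file
    logged_call("apply-patch-build", lambda: 0, log.append, build_id="b1")
    logged_call("run-pov", lambda: 1, log.append, harness="fuzz_target", build_id="b1")
    print(summarize(log))
```

With entries in this shape, per-API call counts and success rates for a trial reduce to a single pass over the file. Whether a non-zero exit from run-pov should count as failure likely depends on the mode, so the `success` rule would need to be defined per API.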

Motivation

Evaluating AI-agent CRSes (claude-code, codex, copilot-cli, gemini-cli) requires understanding their interaction patterns with the builder sidecar: how efficiently they iterate, their build success rate, how many POV attempts they make before finding a crash, and so on. This applies to both bug-finding and bug-fixing modes.
