Add support for multiple service endpoints and process monitoring modes #785

Merged
thinkingfish merged 5 commits into main from
claude/rezolus-service-lifecycle-NfIs4
Apr 13, 2026
@thinkingfish
Member

Problem

The rezolus-capture scripts currently support only a single service metrics endpoint and require a fixed duration for capture. This limits flexibility for:

  • Monitoring multiple services simultaneously
  • Capturing metrics while running a command or monitoring an existing process
  • Validating that endpoint/source pairs are properly configured

Solution

Enhanced both scripts/rezolus-capture and docker/rezolus-capture with the following changes:

  1. Multiple Service Endpoints: Changed ENDPOINT and SOURCE from single variables to arrays (ENDPOINTS and SOURCES), allowing users to specify multiple --endpoint and --source pairs. Each endpoint is recorded to a separate temporary parquet file.

  2. Flexible Capture Modes: Replaced the required --duration flag with three mutually exclusive capture modes:

    • --duration <DURATION>: Capture for a fixed time period (original behavior)
    • --command <CMD...>: Run a command and capture while it executes
    • --pid <PID>: Monitor an existing process and capture until it exits
  3. Improved Validation:

    • Enforce that exactly one capture mode is specified
    • Validate endpoint/source pairing (each endpoint must have a corresponding source)
    • Check for duplicate source names
    • Verify target process exists when using --pid
  4. Enhanced Recording Management:

    • Track all recorder processes in RECORDER_PIDS array
    • Added stop_recorders() helper function for graceful shutdown
    • Properly handle cleanup of multiple service parquet files
    • Support command execution with proper signal handling and exit code tracking
  5. Flexible Output Combination:

    • Dynamically combine all non-empty parquet files (system + all services)
    • Handle the single-file case by copying instead of combining
    • Improved error handling when no recordings are produced
  6. Documentation Updates:

    • Updated help text to reflect new capture modes and multiple endpoint support
    • Added examples for multiple endpoints, command execution, and PID monitoring
    • Clarified that sources must be specified once per endpoint in order
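The argument handling and validation rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only `ENDPOINTS`, `SOURCES`, and the flag names come from the description, while `parse_args`, `validate_args`, `DURATION`, `TARGET_PID`, and `COMMAND` are hypothetical names. It also assumes that `--command` consumes all remaining arguments, and that `kill -0` is an acceptable existence check for the target process.

```bash
#!/usr/bin/env bash
# Sketch only: ENDPOINTS and SOURCES are named in the PR; everything else is hypothetical.

ENDPOINTS=()
SOURCES=()
DURATION=""
TARGET_PID=""
COMMAND=()

parse_args() {
    ENDPOINTS=(); SOURCES=(); DURATION=""; TARGET_PID=""; COMMAND=()
    while [ "$#" -gt 0 ]; do
        # Flags that take a value must be followed by one.
        case "$1" in
            --endpoint|--source|--duration|--pid)
                if [ "$#" -lt 2 ]; then
                    echo "error: $1 requires a value" >&2
                    return 2
                fi
                ;;
        esac
        case "$1" in
            --endpoint) ENDPOINTS+=("$2"); shift 2 ;;
            --source)   SOURCES+=("$2"); shift 2 ;;
            --duration) DURATION="$2"; shift 2 ;;
            --pid)      TARGET_PID="$2"; shift 2 ;;
            # Assumption: everything after --command is the command line to run.
            --command)  shift; COMMAND=("$@"); break ;;
            *) echo "error: unknown argument: $1" >&2; return 2 ;;
        esac
    done
}

validate_args() {
    # Exactly one capture mode.
    local modes=0 i j
    if [ -n "$DURATION" ]; then modes=$((modes + 1)); fi
    if [ -n "$TARGET_PID" ]; then modes=$((modes + 1)); fi
    if [ "${#COMMAND[@]}" -gt 0 ]; then modes=$((modes + 1)); fi
    if [ "$modes" -ne 1 ]; then
        echo "error: specify exactly one of --duration, --command, --pid" >&2
        return 1
    fi

    # Each endpoint needs a source, paired by position.
    if [ "${#ENDPOINTS[@]}" -ne "${#SOURCES[@]}" ]; then
        echo "error: each --endpoint needs a matching --source" >&2
        return 1
    fi

    # Source names must be unique.
    for ((i = 0; i < ${#SOURCES[@]}; i++)); do
        for ((j = i + 1; j < ${#SOURCES[@]}; j++)); do
            if [ "${SOURCES[i]}" = "${SOURCES[j]}" ]; then
                echo "error: duplicate source name: ${SOURCES[i]}" >&2
                return 1
            fi
        done
    done

    # The target process must exist. Note that kill -0 also fails when the
    # caller is not permitted to signal the process.
    if [ -n "$TARGET_PID" ] && ! kill -0 "$TARGET_PID" 2>/dev/null; then
        echo "error: no such process: $TARGET_PID" >&2
        return 1
    fi
}
```

The duplicate check uses a nested loop rather than an associative array so that the sketch also runs on older bash versions.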

Result

Users can now:

  • Monitor multiple services in a single capture session
  • Capture metrics while running benchmarks or tests with --command
  • Attach to running processes with --pid for dynamic monitoring
  • Specify multiple endpoints with proper validation of source pairing
  • Benefit from improved error messages and validation

The changes keep the --duration mode backward compatible while adding command-driven and PID-driven capture alongside multi-endpoint recording.

https://claude.ai/code/session_01WV3WMHXPjqKaWrNo6KJN2n

claude added 5 commits April 13, 2026 09:19
…o capture scripts

Both scripts now support multiple --endpoint/--source pairs for recording
from several Prometheus-compatible services simultaneously. The main script
gains --command and --pid as alternatives to --duration, tying the capture
lifecycle to a running process. The Docker variant gains --pid support.
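Tying the capture lifecycle to a process could look something like the sketch below. `RECORDER_PIDS` and `stop_recorders` are named in the PR description; `wait_for_pid` and `run_and_capture` are hypothetical helpers, and the choice of SIGTERM and a one-second poll interval are assumptions rather than details confirmed by the source.

```bash
#!/usr/bin/env bash
# Sketch only: signal choice, poll interval, and helper names are assumptions.

RECORDER_PIDS=()

# Ask every tracked recorder to exit, then reap them.
stop_recorders() {
    local pid
    for pid in "${RECORDER_PIDS[@]}"; do
        kill -TERM "$pid" 2>/dev/null || true
    done
    for pid in "${RECORDER_PIDS[@]}"; do
        wait "$pid" 2>/dev/null || true
    done
    RECORDER_PIDS=()
}

# --pid mode: block until the monitored process no longer exists.
wait_for_pid() {
    local pid="$1"
    while kill -0 "$pid" 2>/dev/null; do
        sleep 1
    done
}

# --command mode: run the command in the foreground, stop the recorders
# once it finishes, and hand back the command's own exit code.
run_and_capture() {
    local status=0
    "$@" || status=$?
    stop_recorders
    return "$status"
}
```

Signal handling is left out of this sketch; a real script would presumably also install a `trap` so that recorders are stopped when the capture itself is interrupted.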

After recording, copy each source's parquet to the output directory
with descriptive names (agent-<port>.parquet, <source>-<port>.parquet)
before combining. This lets users view each source independently.
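The combine step referred to here, which per the PR description uses only non-empty parquet files and falls back to a plain copy when a single file remains, might be structured like this. `finalize_output` is a hypothetical name, and `combine_parquet` is a placeholder that only prints what it would do, since the source does not say which tool performs the merge.

```bash
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical.

# Placeholder for the real merge tool, which the source does not name.
combine_parquet() {
    local out="$1"
    shift
    echo "would combine $* into $out"
}

# Usage: finalize_output <output-file> <input-file>...
finalize_output() {
    local out="$1" f
    shift
    local inputs=()
    for f in "$@"; do
        # -s is true only for files that exist and are non-empty.
        if [ -s "$f" ]; then
            inputs+=("$f")
        fi
    done
    case "${#inputs[@]}" in
        0)
            echo "error: no recordings were produced" >&2
            return 1
            ;;
        1) cp "${inputs[0]}" "$out" ;;
        *) combine_parquet "$out" "${inputs[@]}" ;;
    esac
}
```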

When an endpoint URL has no explicit port (e.g., http://host/metrics),
the service parquet is named <source>-<pid>.parquet using the monitored
PID instead. extract_port now validates the result is numeric.
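One way to implement the port extraction and the PID fallback described in this commit is shown below. Only the name `extract_port` and the `<source>-<port>.parquet` / `<source>-<pid>.parquet` patterns come from the source; the parsing approach and the helper `service_parquet_name` are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: the parsing logic and service_parquet_name are assumptions.

# Print the explicit port of a URL; fail if there is none or it is not numeric.
extract_port() {
    local url="$1" hostport port
    hostport="${url#*://}"       # drop the scheme, if any
    hostport="${hostport%%/*}"   # drop the path
    case "$hostport" in
        *:*) port="${hostport##*:}" ;;
        *)   return 1 ;;
    esac
    case "$port" in
        ''|*[!0-9]*) return 1 ;;  # empty or contains a non-digit
    esac
    printf '%s\n' "$port"
}

# Usage: service_parquet_name <source> <endpoint-url> <monitored-pid>
service_parquet_name() {
    local source="$1" endpoint="$2" pid="$3" suffix
    if ! suffix="$(extract_port "$endpoint")"; then
        suffix="$pid"
    fi
    printf '%s-%s.parquet\n' "$source" "$suffix"
}
```

For example, `service_parquet_name cache http://localhost:9090/metrics 4242` prints `cache-9090.parquet`, while the same call with `http://host/metrics` prints `cache-4242.parquet`.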

When running under sudo, output files were owned by root. Now detects
the real user via SUDO_USER and chowns the output directory and all
parquet files to that user. Sets file mode to 755.
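The ownership fix could be approximated as follows. `SUDO_USER` is the standard variable set by sudo and is named in the commit message; `sudo_owner` and `fix_ownership` are hypothetical helpers, and applying mode 755 to both the directory and the parquet files is one reading of the message, not a confirmed detail.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and the exact chmod targets are assumptions.

# Print "user:group" for the invoking user, or fail when not run under sudo.
sudo_owner() {
    [ -n "${SUDO_USER:-}" ] || return 1
    local group
    group="$(id -gn "$SUDO_USER")" || return 1
    printf '%s:%s\n' "$SUDO_USER" "$group"
}

# Usage: fix_ownership <output-dir>
fix_ownership() {
    local dir="$1" owner f
    # Not under sudo: files already belong to the caller, nothing to do.
    owner="$(sudo_owner)" || return 0
    chown "$owner" "$dir" || return 1
    chmod 755 "$dir" || return 1
    for f in "$dir"/*.parquet; do
        [ -e "$f" ] || continue   # the glob matched nothing
        chown "$owner" "$f" || return 1
        chmod 755 "$f" || return 1
    done
}
```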
@thinkingfish thinkingfish merged commit 0ca8da7 into main Apr 13, 2026
24 checks passed