
Gateway Remote Debug#1233

Open
renuka-fernando wants to merge 5 commits into wso2:main from renuka-fernando:gateway

@renuka-fernando (Contributor) commented Feb 18, 2026

Purpose

fix #1232

Summary

  • Enable remote debugging of gateway components (controller, runtime, builder) via Delve, with Docker Compose and VS Code launch configurations
  • Make the control plane host optional so the gateway runs in standalone mode when no platform API is configured
  • Remove redundant build-local Makefile targets across all gateway components

Changes

Remote Debug Infrastructure

  • Add docker-compose.debug.yaml with debug variants of gateway-controller and gateway-runtime services that expose Delve ports (2345/2346) and disable compiler optimizations
  • Add gateway/gateway-runtime/docker-entrypoint-debug.sh — entrypoint for debug runtime image; starts the policy engine under Delve, then launches Envoy
  • Update gateway-controller/Dockerfile and gateway-runtime/Dockerfile with a debug build stage that compiles binaries with -gcflags="all=-N -l" and embeds Delve
  • Add make build-debug and make run-debug targets to root gateway/Makefile, gateway-controller/Makefile, and gateway-runtime/Makefile
  • Add VS Code launch configurations (.vscode/launch.json) for attaching to all three gateway components via remote Delve
  • Extend gateway-builder compilation options to pass through build flags (-gcflags, -ldflags) for debug policy-engine builds

Standalone Gateway Mode

  • Make controlplane.host optional in gateway-controller config — an empty host now means standalone mode (no platform API connection); a token without a host is treated as a misconfiguration
  • Update docker-compose.yaml to omit the control plane host by default, allowing the gateway to start without a running platform API

Build Cleanup

  • Remove build-local targets from root gateway/Makefile and from gateway-controller, gateway-builder, and gateway-runtime component Makefiles — these targets used DOCKER_BUILDKIT=1 docker build but invoked the same BuildKit daemon as docker buildx build, making them redundant

Documentation

  • Expand gateway/DEBUG_GUIDE.md with a full remote debug walkthrough: building debug images, starting the debug Compose stack, and attaching VS Code to each component

Summary by CodeRabbit

Release Notes

  • New Features

    • Added remote debugging support for gateway controller and policy engine components with Delve debugger integration.
    • Introduced two debugging modes: Remote Debug (all components in Docker) and Local Process Debug (selective local execution).
    • Added VS Code debug configurations for seamless debugger attachment.
  • Documentation

    • Expanded debugging guide with comprehensive setup instructions and troubleshooting for both debug modes.

The build-local targets used `DOCKER_BUILDKIT=1 docker build` with the
claim of being faster due to "no buildx", but both invocations use the
same underlying BuildKit daemon. The distinction was misleading and the
targets were redundant with the standard `build` targets.

Removes build-local, build-local-controller, build-local-gateway-builder,
and build-local-gateway-runtime from the root gateway Makefile and from
the gateway-controller, gateway-builder, and gateway-runtime component
Makefiles. Updates the gateway/it/Makefile error message to reference
`make build` instead of `make build-local`.

@coderabbitai coderabbitai bot (Contributor) commented Feb 18, 2026

Walkthrough

This PR introduces remote debugging capabilities for gateway components using Delve (dlv), including VS Code launch configurations, dual debugging options in documentation, docker-compose debug overlays, debug-enabled Dockerfiles with multi-stage builds, updated build targets across modules, and a new entrypoint script orchestrating Policy Engine and Envoy in debug mode. Additionally, control plane configuration gains standalone mode support.

Changes

| Cohort / File(s) | Summary |
|---|---|
| VS Code & Documentation: .vscode/launch.json, gateway/DEBUG_GUIDE.md | Added remote debug configurations for Gateway Controller (port 2345) and Policy Engine (port 2346) with path substitutions. Replaced single debug guide with dual-option approach: Remote Debug (all in Docker with dlv) and Local Process Debug (controller/engine as local processes, router in Docker). |
| Docker Compose: gateway/docker-compose.debug.yaml, gateway/docker-compose.yaml | Introduced new debug compose overlay (docker-compose.debug.yaml) with gateway-controller and gateway-runtime services exposing dlv ports 2345/2346, SYS_PTRACE capabilities, and debug images. Updated existing compose to reorganize and document port mappings for gateway-runtime (Envoy, Policy Engine, metrics). |
| Build System — Gateway: gateway/Makefile, gateway/gateway-builder/Makefile, gateway/it/Makefile | Replaced build-local targets with build-debug targets across modules. Updated fallback error messages in integration test Makefile to reference build instead of build-local. |
| Build System — Gateway Controller: gateway/gateway-controller/Makefile, gateway/gateway-controller/Dockerfile | Added build-debug target to controller Makefile with docker buildx invocation and ENABLE_DEBUG build arg. Updated Dockerfile with conditional debug path, ENABLE_DEBUG flag, separate debug stage with Delve, and port 2345 exposure for dlv attachment. |
| Build System — Gateway Runtime: gateway/gateway-runtime/Makefile, gateway/gateway-runtime/Dockerfile | Added build-debug target with multi-context buildx workflow and ENABLE_DEBUG flag. Expanded Dockerfile with debug flag propagation, debug-dlv-builder stage installing Delve, separate debug runtime stage (Envoy-based) with port 2346, and supporting tooling (tini, curl, ca-certificates, net utilities). |
| Compilation Support: gateway/gateway-builder/internal/compilation/compiler.go, gateway/gateway-builder/internal/compilation/options.go, gateway/gateway-builder/internal/compilation/options_test.go, gateway/gateway-builder/pkg/types/policy.go | Added EnableDebug boolean field to CompilationOptions; updated generateLDFlags signature to accept enableDebug parameter; modified runGoBuild to append -gcflags all=-N -l when debug enabled; updated test calls to pass new parameter. Preserves debug symbols when coverage or debug mode active. |
| Configuration: gateway/gateway-controller/pkg/config/config.go, gateway/gateway-controller/pkg/config/config_test.go | Changed default controlplane.host from "localhost:9243" to empty string enabling standalone mode. Updated validateControlPlaneConfig to support standalone operation (no host required unless token is set); added token validation requiring host when token present. Added token field to test cases. |
| Debug Entrypoint & Orchestration: gateway/gateway-runtime/docker-entrypoint-debug.sh | Introduced comprehensive Bash entrypoint script that orchestrates Policy Engine under dlv (port 2346) and Envoy router, with Unix socket coordination, graceful shutdown handling (SIGTERM/SIGINT/SIGQUIT), process monitoring, and structured logging with prefixes for each component ([pol], [rtr], [ent]). |

Sequence Diagram

sequenceDiagram
    participant VSCode as VS Code<br/>(Debugger Client)
    participant Dlv as Delve<br/>(2345/2346)
    participant Container as Docker Container
    participant PolicyEngine as Policy Engine<br/>(Debug Process)
    participant Envoy as Envoy Router
    participant Entrypoint as docker-entrypoint-debug.sh

    VSCode->>VSCode: Load remote debug config
    VSCode->>Dlv: Attach to port 2345/2346
    
    Container->>Entrypoint: Start container
    Entrypoint->>Entrypoint: Parse args & init defaults
    Entrypoint->>PolicyEngine: Start under dlv headless mode
    PolicyEngine->>Dlv: Initialize debugger session
    Dlv->>Dlv: Listen on port 2346 (Policy Engine)<br/>or 2345 (Controller)
    
    Entrypoint->>Entrypoint: Wait for Unix socket creation
    Entrypoint->>Envoy: Start Envoy with generated config
    Envoy->>Envoy: Begin routing traffic
    
    VSCode->>Dlv: Set breakpoints & inspect state
    Dlv->>PolicyEngine: Execute/pause at breakpoints
    PolicyEngine-->>Dlv: Return execution context
    Dlv-->>VSCode: Display debug info
    
    rect rgba(255, 100, 100, 0.5)
    Entrypoint->>Entrypoint: Receive SIGTERM/SIGINT
    Entrypoint->>PolicyEngine: Send SIGTERM
    Entrypoint->>Envoy: Send SIGTERM
    PolicyEngine-->>Entrypoint: Exit gracefully
    Envoy-->>Entrypoint: Exit gracefully
    Entrypoint->>Entrypoint: Cleanup socket & resources
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

Poem

🐰 A rabbit's ode to debugging dreams,
Where dlv and breakpoints rule the streams,
Remote ports 2345 glow bright,
VS Code peers into the night,
Gateway dreams become crystal sight! ✨🔍

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Gateway Remote Debug' accurately describes the primary feature introduced in this PR: remote debugging support for gateway components via Delve and VS Code. |
| Description check | ✅ Passed | The PR description comprehensively covers purpose, summary, and detailed changes organized into logical sections (Remote Debug Infrastructure, Standalone Gateway Mode, Build Cleanup, Documentation), meeting the template requirements. |
| Linked Issues check | ✅ Passed | The PR fully addresses the requirements from issue #1232 by implementing remote debugging capability for all gateway components, enabling standalone mode, and removing redundant build targets. |
| Out of Scope Changes check | ✅ Passed | All code changes are directly aligned with the linked issue #1232: remote debugging support, standalone gateway mode, and cleanup of redundant build targets; no out-of-scope modifications detected. |


@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
gateway/gateway-builder/internal/compilation/options_test.go (1)

123-166: 🛠️ Refactor suggestion | 🟠 Major

Missing test coverage for enableDebug=true path in generateLDFlags and BuildOptions.

Every updated generateLDFlags call passes false for the third argument. There is no test that exercises enableDebug=true, so:

  • It is not verified that -s -w is absent when debug is enabled (required for dlv to have DWARF info).
  • BuildOptions is not tested for an ENABLE_DEBUG env var or the resulting EnableDebug field.
  • TestBuildOptions_Default never asserts opts.EnableDebug == false.
🧪 Suggested test additions
+func TestGenerateLDFlags_WithDebug(t *testing.T) {
+    metadata := &types.BuildMetadata{
+        Version:   "v1.0.0",
+        GitCommit: "abc123",
+        Timestamp: time.Date(2025, 6, 15, 14, 30, 0, 0, time.UTC),
+    }
+
+    ldflags := generateLDFlags(metadata, false, true)
+
+    // Debug builds must NOT strip symbols/DWARF (dlv requires them)
+    assert.NotContains(t, ldflags, "-s -w")
+    assert.Contains(t, ldflags, "-X main.Version=v1.0.0")
+}
+
+func TestBuildOptions_WithDebugEnabled(t *testing.T) {
+    // t.Setenv restores the previous value automatically when the test ends
+    t.Setenv("ENABLE_DEBUG", "true")
+
+    metadata := &types.BuildMetadata{Version: "v1.0.0", GitCommit: "abc", Timestamp: time.Now()}
+    opts := BuildOptions("/output/binary", metadata)
+
+    assert.True(t, opts.EnableDebug)
+    assert.NotContains(t, opts.LDFlags, "-s -w")
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/compilation/options_test.go` around lines
123 - 166, Add tests that exercise the enableDebug=true path: create a
TestGenerateLDFlags_DebugEnabled that calls generateLDFlags(metadata, false,
true) and assert that "-s -w" is NOT present while the -X
main.Version/GitCommit/BuildDate entries are present; add
TestBuildOptions_EnableDebug which sets the ENABLE_DEBUG env var (true/1) and
verifies BuildOptions()/opts.EnableDebug==true (and unset/restore env
afterwards); and update TestBuildOptions_Default to explicitly assert
opts.EnableDebug==false to cover the default behavior.
🧹 Nitpick comments (8)
gateway/gateway-controller/pkg/config/config.go (1)

519-526: Breaking-change risk: default controlplane.host changed to empty string.

Any deployment that previously relied on the compiled-in default (former "localhost:9243") without explicitly overriding controlplane.host in config/env will now silently enter standalone mode after upgrading — no control-plane connection, no error. Operators need to explicitly set APIP_GW_CONTROLPLANE_HOST (or equivalent) to preserve previous connected-mode behaviour.

Consider adding a prominent note in the migration guide / DEBUG_GUIDE.md or CHANGELOG about this default change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/pkg/config/config.go` around lines 519 - 526, The
default for ControlPlane in the config was changed to an empty Host which
silently forces standalone mode; restore the previous compiled-in default host
(set ControlPlane.Host back to "localhost:9243") in the ControlPlaneConfig
default block (the ControlPlane literal in config.go) or, alternatively, add a
startup warning/error when ControlPlane.Host is empty so operators are informed;
update the ControlPlaneConfig default initialization (refer to
ControlPlaneConfig and the ControlPlane default struct) to implement one of
these fixes and ensure tests/config docs are updated accordingly.
gateway/gateway-controller/pkg/config/config_test.go (1)

643-665: Consider adding a test case for the valid "token + host" combination.

The test suite covers:

  • ✅ host set, no token (line 643)
  • ✅ no host, no token — standalone (line 652)
  • ✅ no host, token set — misconfiguration (line 659)
  • ❌ host set, token set — should be valid, but not explicitly tested

Without this case, a future regression that accidentally rejects host + token would go undetected.

🧪 Suggested additional test case
 {
     name:             "Missing host (standalone mode)",
     ...
 },
+{
+    name:             "Token set with host (valid)",
+    host:             "localhost",
+    token:            "some-token",
+    reconnectInitial: 1 * time.Second,
+    reconnectMax:     30 * time.Second,
+    pollingInterval:  5 * time.Second,
+    wantErr:          false,
+},
 {
     name:        "Token set without host",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/pkg/config/config_test.go` around lines 643 - 665,
Add a table-driven test case that verifies the valid "host + token" combination
is accepted: insert an entry in the existing test cases with name: "Host and
token set (valid)", host: "localhost", token: "some-token", reconnectInitial:
1*time.Second, reconnectMax: 30*time.Second, pollingInterval: 5*time.Second,
wantErr: false (no errContains). Ensure the new case follows the same struct
shape used by the other entries (fields name, host, token, reconnectInitial,
reconnectMax, pollingInterval, wantErr, errContains) so the existing test runner
(the Test... function that iterates over the table) will execute it.
gateway/gateway-runtime/Makefile (1)

124-129: clean target doesn't remove debug images.

The clean target removes the regular and coverage images but not the debug variants, leaving stale debug images around.

Proposed fix
 clean: ## Clean Docker images
 	@echo "Cleaning Docker images..."
 	@docker rmi $(IMAGE_NAME):$(VERSION) 2>/dev/null || true
 	@docker rmi $(IMAGE_NAME):latest 2>/dev/null || true
 	@docker rmi $(IMAGE_NAME)-coverage:$(VERSION) 2>/dev/null || true
+	@docker rmi $(IMAGE_NAME)-debug:$(VERSION) 2>/dev/null || true
+	@docker rmi $(IMAGE_NAME)-debug:latest 2>/dev/null || true
 	@rm -rf target
 	@echo "Clean completed"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Makefile` around lines 124 - 129, The clean target in
the Makefile currently removes normal and coverage images but not the debug
variants; update the clean recipe (target "clean") to also remove the debug
image tags using the same variables (e.g. docker rmi
$(IMAGE_NAME)-debug:$(VERSION) and docker rmi $(IMAGE_NAME)-debug:latest),
ensuring you silence failures with "2>/dev/null || true" just like the other rmi
lines so stale debug images are removed safely; keep IMAGE_NAME and VERSION
variable usage consistent with the existing lines.
gateway/DEBUG_GUIDE.md (1)

121-148: Option 2 instructions reference the new port layout but ask users to hand-edit docker-compose.yaml.

The instructions tell users to modify docker-compose.yaml to comment out Policy Engine ports and change GATEWAY_CONTROLLER_HOST. Consider providing a dedicated docker-compose.local-debug.yaml override file (or an env-file) instead, so users don't have to edit — and potentially forget to revert — the checked-in compose file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/DEBUG_GUIDE.md` around lines 121 - 148, The current guide asks users
to edit gateway/docker-compose.yaml directly (service gateway-runtime) to change
GATEWAY_CONTROLLER_HOST and comment out Policy Engine ports, which is
error-prone; instead add a local override file (e.g.,
docker-compose.local-debug.yaml) or an env-file that sets
GATEWAY_CONTROLLER_HOST=host.docker.internal and redefines the gateway-runtime
ports block to omit the Policy Engine ports (9002, 9003), then document using
docker-compose -f docker-compose.yaml -f docker-compose.local-debug.yaml up (or
exporting the env-file) so users can apply local debug changes without modifying
the checked-in compose file.
gateway/gateway-runtime/docker-entrypoint-debug.sh (1)

121-144: Shutdown handler has no timeout — could hang if a process ignores SIGTERM.

The bare wait on line 137 blocks indefinitely. In a debug context this is usually fine, but a kill -9 fallback after a grace period would make the script more robust.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh` around lines 121 - 144,
The shutdown() handler can hang because the bare wait may block forever if a
process ignores SIGTERM; modify shutdown to implement a configurable grace
period (e.g., GRACE_PERIOD or a hardcoded few seconds) after sending SIGTERM to
PE_PID and ENVOY_PID, poll/wait for their termination during that grace window,
and if either process still exists after the timeout send SIGKILL (kill -KILL)
as a fallback; ensure you still remove the socket (${POLICY_ENGINE_SOCKET}) and
exit with appropriate status after cleanup. Reference symbols: shutdown(),
PE_PID, ENVOY_PID, POLICY_ENGINE_SOCKET.
gateway/docker-compose.debug.yaml (1)

29-59: Consider adding the sample-backend service or a note about it.

The main docker-compose.yaml includes sample-backend which the debug guide's test steps rely on (curl http://localhost:8080/petstore/v1/pets). Without it, developers using only this file will need to start the backend separately. A brief comment or inclusion of the sample-backend service would improve the out-of-box debug experience.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/docker-compose.debug.yaml` around lines 29 - 59, The debug compose is
missing the sample-backend used in the guide; add a minimal service named
sample-backend (matching the main compose's service name) that exposes port 8080
and uses the same image/ports used by the main docker-compose (or include a
clear comment near the gateway-controller service mentioning developers must
start the sample-backend separately) so the guide's test command (curl
http://localhost:8080/petstore/v1/pets) works out-of-the-box; if you add the
service ensure it joins gateway-network and any required volumes/envs match the
main setup.
gateway/gateway-controller/Dockerfile (1)

79-80: Pin the Delve version for reproducible debug builds.

Using @latest means each image build may pull a different dlv version, potentially introducing debugging inconsistencies or breaking changes. The current stable version is v1.26.0.

Suggested fix
-RUN go install github.com/go-delve/delve/cmd/dlv@latest
+RUN go install github.com/go-delve/delve/cmd/dlv@v1.26.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/Dockerfile` around lines 79 - 80, Replace the
non-deterministic Delve install that uses "@latest" with a pinned stable release
to ensure reproducible builds: update the Dockerfile RUN that calls "go install
github.com/go-delve/delve/cmd/dlv@latest" to use the known stable tag (e.g.,
"@v1.26.0") so the "dlv" binary version is fixed across image builds.
gateway/gateway-runtime/Dockerfile (1)

129-163: Consider extracting a shared base stage to reduce duplication with the production stage.

Lines 130-144 (base image, apt packages) and 146 (mkdir) are nearly identical to the production stage (lines 166-185). A shared intermediate stage would keep both in sync and reduce maintenance drift.

♻️ Sketch of a shared base stage
+# Stage 3: Common runtime base
+FROM envoyproxy/envoy:v1.35.3 AS runtime-base
+USER root
+RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
+    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
+    apt-get update && apt-get install -y --no-install-recommends \
+    tini gettext-base ca-certificates curl \
+    netcat-openbsd dnsutils iproute2 iputils-ping net-tools
+RUN mkdir -p /app /var/run/api-platform /coverage
+
 # Stage 3a: dlv installer
 FROM golang:1.25.7-bookworm AS debug-dlv-builder
 RUN go install github.com/go-delve/delve/cmd/dlv@latest
 
-# Stage 3b: Debug Runtime
-FROM envoyproxy/envoy:v1.35.3 AS debug
-USER root
-RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
-    ...
-RUN mkdir -p /app /var/run/api-platform /coverage
+# Stage 3b: Debug Runtime
+FROM runtime-base AS debug
 ...
 
-# Stage 4: Runtime (production)
-FROM envoyproxy/envoy:v1.35.3
-USER root
-RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
-    ...
-RUN mkdir -p /app /var/run/api-platform /coverage
+# Stage 4: Runtime (production)
+FROM runtime-base
 ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Dockerfile` around lines 129 - 163, Extract the
repeated base setup (the FROM envoyproxy/envoy:v1.35.3 image, apt-get install of
packages, and the mkdir /app /var/run/api-platform /coverage) into a single
intermediate stage (e.g., a runtime-base stage) and have both the debug stage
and the production/runtime stage use COPY --from=runtime-base instead of
duplicating the same RUN/USER/ENV setup; keep stage-specific actions (like COPY
--from=debug-dlv-builder /go/bin/dlv and adding dlv, or any debug-specific
EXPOSE/ENTRYPOINT changes) in the debug stage (the stage currently named debug)
and keep policy-engine and other runtime-only copies (COPY
--from=policy-compiler /workspace/output/gateway-runtime/policy-engine) in the
final stages, ensuring file permissions (chmod +x ...) and ENV/EXPOSE/ENTRYPOINT
are applied in the appropriate final stages rather than the shared base.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/DEBUG_GUIDE.md`:
- Around line 22-25: The header comment inside docker-compose.debug.yaml
currently suggests using it as an overlay ("Usage: docker compose -f
docker-compose.yaml -f docker-compose.debug.yaml up"), which conflicts with the
README command in DEBUG_GUIDE.md that runs docker-compose.debug.yaml standalone;
update the comment in docker-compose.debug.yaml (the "Usage: docker compose -f
docker-compose.yaml -f docker-compose.debug.yaml up" line) to reflect standalone
usage (e.g., instruct to run "docker compose -f docker-compose.debug.yaml up")
or otherwise clarify that this file contains full service definitions and must
not be used as an override, so the two docs are consistent.

In `@gateway/docker-compose.debug.yaml`:
- Around line 19-20: The header comment ("Compose overlay for remote debugging
gateway components with dlv.") and the usage line are incorrect — this YAML is a
standalone docker-compose file, not an overlay; update the top comment and the
usage example to state it is a standalone compose configuration for remote
debugging and show the correct command (e.g., "docker compose -f
docker-compose.debug.yaml up") so users don't mistakenly combine it with
docker-compose.yaml and cause duplicate bindings; keep the description
mentioning dlv and debugging gateway components but remove or replace "overlay"
language and the "-f docker-compose.yaml -f" usage pattern.

In `@gateway/gateway-builder/internal/compilation/compiler.go`:
- Around line 130-136: Add a unit test that mirrors
TestGenerateLDFlags_WithCoverage but sets enableDebug=true to assert that
generateLDFlags does not include the stripped flags "-s -w"; specifically create
TestGenerateLDFlags_WithDebug (or TestBuildOptions_WithDebugEnabled) which calls
generateLDFlags with enableDebug=true (and enableCoverage=false) and asserts the
returned ldflags string/array does not contain "-s -w" while preserving other
expected flags — reference generateLDFlags and the existing
TestGenerateLDFlags_WithCoverage for test structure and assertions.

In `@gateway/gateway-builder/internal/compilation/options.go`:
- Around line 39-43: Replace the generic DEBUG env var lookup with ENABLE_DEBUG:
change the call that assigns debugEnv := os.Getenv("DEBUG") to use
os.Getenv("ENABLE_DEBUG") and keep the strings.EqualFold check and enableDebug
variable logic intact (symbols: enableDebug, debugEnv, os.Getenv,
strings.EqualFold). Also audit sibling coverage handling and ensure any similar
lookup uses "ENABLE_COVERAGE" instead of "COVERAGE" so env names align with the
Dockerfile (symbols to check: any COVERAGE variable, os.Getenv calls that
reference "COVERAGE", and the corresponding enableCoverage variable).

In `@gateway/gateway-controller/Makefile`:
- Line 31: The Copilot instructions reference a removed Makefile target
(`build-local`) so update .github/copilot-instructions.md to instruct users to
run the current target (`make build`) or `make build-debug` for debug workflows
instead; search for the text "rebuild Docker images using `cd gateway && make
build-local`" and replace it with a short, explicit recommendation to use `cd
gateway && make build` (or `make build-debug`), and ensure the guidance still
references the same file globs (`gateway/**/*.{go,yaml,yml,Dockerfile}`) so
automated prompts remain accurate.

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh`:
- Line 32: The script uses set -e which causes the monitor block using wait -n
to abort before cleanup (exit code capture, sibling termination, socket cleanup)
when a child exits non-zero; either disable errexit before the monitor block
(use set +e before the wait -n loop and restore set -e after) or change the wait
invocation to capture the child exit status inline (e.g., run wait -n and assign
its exit code to a variable without letting it trigger errexit), ensuring the
subsequent logic that sets rc, kills siblings, and removes sockets (the
monitor/cleanup block around wait -n and the exit code capture) always runs.

In `@gateway/gateway-runtime/Dockerfile`:
- Around line 125-127: The Dockerfile uses the build stage named
debug-dlv-builder and runs "go install
github.com/go-delve/delve/cmd/dlv@latest", which pulls a moving target; replace
the `@latest` suffix with a specific Delve release tag to ensure reproducible
debug builds (e.g., use a chosen semver tag like `@vX.Y.Z`) so the
debug-dlv-builder stage always installs the same dlv binary.

---

Nitpick comments:
In `@gateway/gateway-runtime/Dockerfile`:
- Around line 129-163: Extract the repeated base setup (the FROM
envoyproxy/envoy:v1.35.3 image, apt-get install of packages, and the mkdir /app
/var/run/api-platform /coverage) into a single intermediate stage (e.g., a
runtime-base stage) and have both the debug stage and the production/runtime
stage start from it (FROM runtime-base AS ...) instead of duplicating the same
RUN/USER/ENV setup; keep stage-specific actions (like COPY
--from=debug-dlv-builder /go/bin/dlv, or any debug-specific EXPOSE/ENTRYPOINT
changes) in the debug stage (the stage currently named debug) and keep
policy-engine and other runtime-only copies (COPY --from=policy-compiler
/workspace/output/gateway-runtime/policy-engine) in the final stages, ensuring
file permissions (chmod +x ...) and ENV/EXPOSE/ENTRYPOINT are applied in the
appropriate final stages rather than the shared base.

In `@gateway/gateway-runtime/Makefile`:
- Around line 124-129: The clean target in the Makefile currently removes normal
and coverage images but not the debug variants; update the clean recipe (target
"clean") to also remove the debug image tags using the same variables (e.g.
docker rmi $(IMAGE_NAME)-debug:$(VERSION) and docker rmi
$(IMAGE_NAME)-debug:latest), ensuring you silence failures with "2>/dev/null ||
true" just like the other rmi lines so stale debug images are removed safely;
keep IMAGE_NAME and VERSION variable usage consistent with the existing lines.
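
For the `config_test.go` and `config.go` items above, the intended rule (empty host means standalone mode, host plus token is valid, token without host is a misconfiguration) can be sketched as a small table-driven check. This is only an illustration: `controlPlane`, `validate`, and `standalone` are hypothetical names, not the controller's actual types, and the error text is made up.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// controlPlane is a hypothetical stand-in for the controller's control plane settings.
type controlPlane struct {
	Host  string
	Token string
}

// validate rejects a token that has no host to be sent to.
// An empty host with an empty token is accepted and means standalone mode.
func (c controlPlane) validate() error {
	if c.Host == "" && c.Token != "" {
		return errors.New("controlplane.token is set but controlplane.host is empty")
	}
	return nil
}

// standalone reports whether the gateway would run without a platform API.
func (c controlPlane) standalone() bool { return c.Host == "" }

func main() {
	cases := []struct {
		name        string
		host, token string
		wantErr     bool
		errContains string
	}{
		{name: "Standalone (no host, no token)"},
		{name: "Host and token set (valid)", host: "localhost", token: "some-token"},
		{name: "Token without host", token: "some-token", wantErr: true, errContains: "host is empty"},
	}
	for _, tc := range cases {
		err := controlPlane{Host: tc.host, Token: tc.token}.validate()
		// A case passes when the presence of an error matches wantErr
		// and, for expected errors, the message contains errContains.
		ok := (err != nil) == tc.wantErr
		if ok && tc.wantErr {
			ok = strings.Contains(err.Error(), tc.errContains)
		}
		fmt.Printf("%-32s ok=%v\n", tc.name, ok)
	}
}
```

If the compiled-in default host stays empty, a startup log line based on something like `standalone()` would address the "silent standalone mode" concern without restoring the old default.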

Comment on lines +22 to +25
```bash
cd gateway
docker compose -f docker-compose.debug.yaml up
```

⚠️ Potential issue | 🟡 Minor

Usage command here conflicts with the header comment in docker-compose.debug.yaml.

The guide correctly runs the debug compose file standalone (-f docker-compose.debug.yaml), but docker-compose.debug.yaml line 20 says:

# Usage: docker compose -f docker-compose.yaml -f docker-compose.debug.yaml up

Since the debug file defines complete service definitions (not partial overrides), running it as an overlay on top of docker-compose.yaml would cause port-binding conflicts. One of the two should be updated for consistency — the guide appears correct here, so the comment in docker-compose.debug.yaml should be fixed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/DEBUG_GUIDE.md` around lines 22 - 25, The header comment inside
docker-compose.debug.yaml currently suggests using it as an overlay ("Usage:
docker compose -f docker-compose.yaml -f docker-compose.debug.yaml up"), which
conflicts with the README command in DEBUG_GUIDE.md that runs
docker-compose.debug.yaml standalone; update the comment in
docker-compose.debug.yaml (the "Usage: docker compose -f docker-compose.yaml -f
docker-compose.debug.yaml up" line) to reflect standalone usage (e.g., instruct
to run "docker compose -f docker-compose.debug.yaml up") or otherwise clarify
that this file contains full service definitions and must not be used as an
override, so the two docs are consistent.

Comment on lines +19 to +20
# Compose overlay for remote debugging gateway components with dlv.
# Usage: docker compose -f docker-compose.yaml -f docker-compose.debug.yaml up

⚠️ Potential issue | 🟡 Minor

Fix the usage comment — this file is standalone, not an overlay.

As noted in the debug guide review, this file defines complete service definitions. Running it as an overlay on docker-compose.yaml would cause duplicate port bindings and conflicting configs.

Proposed fix
 # Compose overlay for remote debugging gateway components with dlv.
-# Usage: docker compose -f docker-compose.yaml -f docker-compose.debug.yaml up
+# Usage: docker compose -f docker-compose.debug.yaml up
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Compose overlay for remote debugging gateway components with dlv.
-# Usage: docker compose -f docker-compose.yaml -f docker-compose.debug.yaml up
+# Compose overlay for remote debugging gateway components with dlv.
+# Usage: docker compose -f docker-compose.debug.yaml up
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/docker-compose.debug.yaml` around lines 19 - 20, The header comment
("Compose overlay for remote debugging gateway components with dlv.") and the
usage line are incorrect — this YAML is a standalone docker-compose file, not an
overlay; update the top comment and the usage example to state it is a
standalone compose configuration for remote debugging and show the correct
command (e.g., "docker compose -f docker-compose.debug.yaml up") so users don't
mistakenly combine it with docker-compose.yaml and cause duplicate bindings;
keep the description mentioning dlv and debugging gateway components but remove
or replace "overlay" language and the "-f docker-compose.yaml -f" usage pattern.

Comment on lines +130 to +136
	// Add debug flags if enabled (no optimizations/inlining for dlv)
	if options.EnableDebug {
		slog.Info("Building with debug flags (no optimizations/inlining)",
			"step", "build",
			"phase", "compilation")
		args = append(args, "-gcflags", "all=-N -l")
	}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

fd -t f 'options.go' gateway/gateway-builder/

Repository: wso2/api-platform

Length of output: 116


🏁 Script executed:

fd -t f 'options.go' gateway/gateway-builder/ -exec cat -n {} \;

Repository: wso2/api-platform

Length of output: 291


🏁 Script executed:

fd -t f 'options_test.go' gateway/gateway-builder/ -exec cat -n {} \;

Repository: wso2/api-platform

Length of output: 291


🏁 Script executed:

cat gateway/gateway-builder/internal/compilation/options.go

Repository: wso2/api-platform

Length of output: 3064


🏁 Script executed:

cat gateway/gateway-builder/internal/compilation/options_test.go

Repository: wso2/api-platform

Length of output: 5887


Add test coverage for enableDebug=true ldflags behavior.

The implementation in generateLDFlags correctly omits -s -w when enableDebug=true (line 73: if !enableCoverage && !enableDebug), ensuring DWARF debug info is preserved for dlv. Combined with the -gcflags all=-N -l in compiler.go, debug builds are properly configured.

However, there is no test case asserting this behavior. All existing tests call generateLDFlags with enableDebug=false. Add a TestGenerateLDFlags_WithDebug (or TestBuildOptions_WithDebugEnabled) test to explicitly verify that -s -w is absent when enableDebug=true, matching the existing TestGenerateLDFlags_WithCoverage pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/compilation/compiler.go` around lines 130 -
136, Add a unit test that mirrors TestGenerateLDFlags_WithCoverage but sets
enableDebug=true to assert that generateLDFlags does not include the stripped
flags "-s -w"; specifically create TestGenerateLDFlags_WithDebug (or
TestBuildOptions_WithDebugEnabled) which calls generateLDFlags with
enableDebug=true (and enableCoverage=false) and asserts the returned ldflags
string/array does not contain "-s -w" while preserving other expected flags —
reference generateLDFlags and the existing TestGenerateLDFlags_WithCoverage for
test structure and assertions.

Comment on lines +39 to +43
	// Check for debug mode from environment variable
	enableDebug := false
	if debugEnv := os.Getenv("DEBUG"); strings.EqualFold(debugEnv, "true") {
		enableDebug = true
	}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check what env var names and ARG names are used for debug configuration in gateway
rg -n 'ENABLE_DEBUG|"DEBUG"' gateway/ --type=go --type=dockerfile -g 'Dockerfile*' -g '*.sh' -g 'Makefile' --max-count=50

Repository: wso2/api-platform

Length of output: 95


🏁 Script executed:

# Also check for ARG definitions in Dockerfiles in gateway
find gateway -name 'Dockerfile*' -type f | head -20 | xargs cat | grep -A 5 -B 5 'ARG.*DEBUG\|ARG.*ENABLE_DEBUG'

Repository: wso2/api-platform

Length of output: 656


🏁 Script executed:

# Check the current state of the options.go file to verify the exact code
cat -n gateway/gateway-builder/internal/compilation/options.go | sed -n '39,43p'

Repository: wso2/api-platform

Length of output: 266


🏁 Script executed:

# Check if there's a Makefile target 'build-local' vs 'build' in gateway
grep -n "build-local\|^build:" gateway/Makefile 2>/dev/null || echo "Checking gateway structure..."
ls -la gateway/ | grep -i make

Repository: wso2/api-platform

Length of output: 223


🏁 Script executed:

# Check how ENABLE_DEBUG is passed through in the Dockerfile - look for ENV statements
grep -n 'ENV.*DEBUG\|ARG.*DEBUG' gateway/Dockerfile

Repository: wso2/api-platform

Length of output: 112


🏁 Script executed:

# Get more context on the Dockerfile build process to see how args are passed
cat gateway/Dockerfile | head -100

Repository: wso2/api-platform

Length of output: 111


🏁 Script executed:

# Check the full options.go file to understand the context better
cat -n gateway/gateway-builder/internal/compilation/options.go | head -80

Repository: wso2/api-platform

Length of output: 3280


🏁 Script executed:

# Check for build-local target in Makefile
grep -n "build-local" gateway/Makefile

Repository: wso2/api-platform

Length of output: 43


🏁 Script executed:

# Find all Dockerfiles in gateway directory
find gateway -name 'Dockerfile*' -type f

Repository: wso2/api-platform

Length of output: 303


🏁 Script executed:

# Find where ENABLE_DEBUG is actually used/referenced
rg -n 'ENABLE_DEBUG' gateway/ -g '*.Dockerfile' -g 'Makefile' -g '*.sh' --max-count=20

Repository: wso2/api-platform

Length of output: 203


🏁 Script executed:

# Check the entire gateway Makefile for build targets
cat gateway/Makefile | grep -A 2 "^build"

Repository: wso2/api-platform

Length of output: 3064


🏁 Script executed:

# Search in gateway-builder specifically for how env vars are used
rg -n 'os.Getenv.*DEBUG|os.Getenv.*COVERAGE' gateway/gateway-builder/ --max-count=20

Repository: wso2/api-platform

Length of output: 335


🏁 Script executed:

# Check gateway-builder Dockerfile to see if ENV is set from ARG
cat gateway/gateway-builder/Dockerfile | grep -A 5 -B 5 'ENV.*DEBUG\|ARG.*DEBUG'

Repository: wso2/api-platform

Length of output: 43


🏁 Script executed:

# Check all Dockerfiles for ENV/ARG debug mappings
for f in gateway/gateway-builder/Dockerfile gateway/gateway-controller/Dockerfile gateway/gateway-runtime/Dockerfile; do
  echo "=== $f ==="
  grep -n 'DEBUG\|COVERAGE' "$f" | head -20
done

Repository: wso2/api-platform

Length of output: 784


🏁 Script executed:

# Check if DEBUG/COVERAGE env vars are set in any of the Dockerfiles
rg -n 'ENV DEBUG|ENV COVERAGE' gateway/ --max-count=20

Repository: wso2/api-platform

Length of output: 43


🏁 Script executed:

# Check the build process in a component's Makefile to see how these args are passed
cat gateway/gateway-controller/Makefile | grep -A 10 "build-arg ENABLE_DEBUG"

Repository: wso2/api-platform

Length of output: 353


🏁 Script executed:

# Get full gateway-builder Dockerfile to see build process
cat gateway/gateway-builder/Dockerfile

Repository: wso2/api-platform

Length of output: 3267


🏁 Script executed:

# Check gateway-controller Dockerfile more completely for ENV setup
cat gateway/gateway-controller/Dockerfile | head -150

Repository: wso2/api-platform

Length of output: 5185


🏁 Script executed:

# Check the complete build step in gateway-controller where it might export DEBUG
cat gateway/gateway-controller/Dockerfile | grep -A 30 "RUN.*go build"

Repository: wso2/api-platform

Length of output: 43


🏁 Script executed:

# Also check if there's a separate Makefile in gateway-controller that might set these vars
grep -n "DEBUG\|COVERAGE" gateway/gateway-controller/Makefile | head -30

Repository: wso2/api-platform

Length of output: 137


🏁 Script executed:

# Get the full gateway-controller/Dockerfile to see how build is invoked
cat gateway/gateway-controller/Dockerfile

Repository: wso2/api-platform

Length of output: 5185


🏁 Script executed:

# Also check if there's an entrypoint or CMD that shows how the builder is invoked
grep -n "ENTRYPOINT\|CMD\|RUN.*gateway-builder" gateway/gateway-controller/Dockerfile

Repository: wso2/api-platform

Length of output: 154


Change environment variable name from DEBUG to ENABLE_DEBUG for consistency and to avoid naming collisions.

The Dockerfile ARG names are ENABLE_DEBUG and ENABLE_COVERAGE, but the code reads generic env vars DEBUG and COVERAGE. Using the specific names avoids potential conflicts with other tools (Node.js debug module, shell utilities, etc.) and aligns with the existing Dockerfile conventions.

Proposed fix
 	// Check for debug mode from environment variable
 	enableDebug := false
-	if debugEnv := os.Getenv("DEBUG"); strings.EqualFold(debugEnv, "true") {
+	if debugEnv := os.Getenv("ENABLE_DEBUG"); strings.EqualFold(debugEnv, "true") {
 		enableDebug = true
 	}
📝 Committable suggestion

Suggested change
-	// Check for debug mode from environment variable
-	enableDebug := false
-	if debugEnv := os.Getenv("DEBUG"); strings.EqualFold(debugEnv, "true") {
-		enableDebug = true
-	}
+	// Check for debug mode from environment variable
+	enableDebug := false
+	if debugEnv := os.Getenv("ENABLE_DEBUG"); strings.EqualFold(debugEnv, "true") {
+		enableDebug = true
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/compilation/options.go` around lines 39 -
43, Replace the generic DEBUG env var lookup with ENABLE_DEBUG: change the call
that assigns debugEnv := os.Getenv("DEBUG") to use os.Getenv("ENABLE_DEBUG") and
keep the strings.EqualFold check and enableDebug variable logic intact (symbols:
enableDebug, debugEnv, os.Getenv, strings.EqualFold). Also audit sibling
coverage handling and ensure any similar lookup uses "ENABLE_COVERAGE" instead
of "COVERAGE" so env names align with the Dockerfile (symbols to check: any
COVERAGE variable, os.Getenv calls that reference "COVERAGE", and the
corresponding enableCoverage variable).

IMAGE_NAME := $(DOCKER_REGISTRY)/gateway-controller

-.PHONY: help generate build build-local test generate-listener-certs push build-coverage-image
+.PHONY: help generate build test generate-listener-certs push build-coverage-image build-debug

🛠️ Refactor suggestion | 🟠 Major

build-local removal leaves the project's Copilot instructions outdated.

.github/copilot-instructions.md currently instructs: "rebuild Docker images using cd gateway && make build-local", but build-local is removed across all component Makefiles in this PR. That instruction will be stale for everyone using Copilot/the repo's coding guidelines after this merges.

Consider updating .github/copilot-instructions.md in the same PR to reference make build (or make build-debug for debug workflows). Based on learnings, the existing instruction covers gateway/**/*.{go,yaml,yml,Dockerfile}.

Would you like me to draft the updated copilot-instructions.md snippet? I can open a tracking issue if preferred.

🧰 Tools
🪛 checkmake (0.2.2)

[warning] 31-31: Missing required phony target "all"

(minphony)


[warning] 31-31: Missing required phony target "clean"

(minphony)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/Makefile` at line 31, The Copilot instructions
reference a removed Makefile target (`build-local`) so update
.github/copilot-instructions.md to instruct users to run the current target
(`make build`) or `make build-debug` for debug workflows instead; search for the
text "rebuild Docker images using `cd gateway && make build-local`" and replace
it with a short, explicit recommendation to use `cd gateway && make build` (or
`make build-debug`), and ensure the guidance still references the same file
globs (`gateway/**/*.{go,yaml,yml,Dockerfile}`) so automated prompts remain
accurate.

# Examples:
# docker run gateway-runtime-debug --rtr.component-log-level upstream:debug --pol.log-format text

set -e

⚠️ Potential issue | 🔴 Critical

set -e will skip cleanup when a child process exits with non-zero.

Under set -e, if wait -n (line 201) returns a non-zero exit code (i.e., either child crashed), the script aborts immediately — lines 202–221 (exit code capture, sibling termination, socket cleanup) are never reached.

Either disable set -e before the monitor block, or capture the exit code inline:

Proposed fix
 # Monitor both processes - exit if either dies
-wait -n "$PE_PID" "$ENVOY_PID"
-EXIT_CODE=$?
+EXIT_CODE=0
+wait -n "$PE_PID" "$ENVOY_PID" || EXIT_CODE=$?

Also applies to: 200-202

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/docker-entrypoint-debug.sh` at line 32, The script
uses set -e which causes the monitor block using wait -n to abort before cleanup
(exit code capture, sibling termination, socket cleanup) when a child exits
non-zero; either disable errexit before the monitor block (use set +e before the
wait -n loop and restore set -e after) or change the wait invocation to capture
the child exit status inline (e.g., run wait -n and assign its exit code to a
variable without letting it trigger errexit), ensuring the subsequent logic that
sets rc, kills siblings, and removes sockets (the monitor/cleanup block around
wait -n and the exit code capture) always runs.

Comment on lines +125 to +127
# Stage 3a: dlv installer (builds dlv for the debug stage)
FROM golang:1.25.7-bookworm AS debug-dlv-builder
RUN go install github.com/go-delve/delve/cmd/dlv@latest

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

go-delve/delve latest release 2025 2026

💡 Result:

As of February 18, 2026, the latest go-delve/delve release is:

  • v1.26.0 — released December 19, 2025 (marked “Latest” on GitHub Releases). [1][2]

Delve releases during 2025 (highest patch per series shown):

  • v1.24.1 — March 6, 2025 [1]
  • v1.24.2 — April 10, 2025 [1]
  • v1.25.0 — June 16, 2025 [1]
  • v1.25.1 — July 18, 2025 [1]
  • v1.25.2 — August 27, 2025 [1]
  • v1.26.0 — December 19, 2025 [1][2]

No 2026 Delve release is shown on the GitHub Releases page up to today. [1]

Sources:
[1] GitHub Releases — go-delve/delve: Releases page (shows v1.26.0 as Latest, plus 2025 releases)
[2] pkg.go.dev — module page (shows v1.26.0 “Latest”, published Dec 19, 2025)


Pin the Delve version for reproducible debug builds.

@latest means each build may pull a different dlv binary, which can lead to subtle behavioral differences when debugging. Pin it to a specific release tag.

♻️ Suggested fix
 # Stage 3a: dlv installer (builds dlv for the debug stage)
 FROM golang:1.25.7-bookworm AS debug-dlv-builder
-RUN go install github.com/go-delve/delve/cmd/dlv@latest
+RUN go install github.com/go-delve/delve/cmd/dlv@v1.26.0
📝 Committable suggestion

Suggested change
 # Stage 3a: dlv installer (builds dlv for the debug stage)
 FROM golang:1.25.7-bookworm AS debug-dlv-builder
-RUN go install github.com/go-delve/delve/cmd/dlv@latest
+RUN go install github.com/go-delve/delve/cmd/dlv@v1.26.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Dockerfile` around lines 125 - 127, The Dockerfile
uses the build stage named debug-dlv-builder and runs "go install
github.com/go-delve/delve/cmd/dlv@latest", which pulls a moving target; replace
the `@latest` suffix with a specific Delve release tag to ensure reproducible
debug builds (e.g., use a chosen semver tag like `@vX.Y.Z`) so the
debug-dlv-builder stage always installs the same dlv binary.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Gateway Remote Debug

1 participant

Comments