Python Policy Support via Python Runtime by sehan-dissanayake · Pull Request #1343 · wso2/api-platform

sehan-dissanayake · 2026-03-12T08:20:13Z

Related Discussion

Refers to [Proposal] Python Policy Support: Enable ML/AI-Powered Policies via Python Runtime #1103
Resolve [Feature]: Python Policy Support in the Policy Engine #1683

Purpose

The WSO2 API Gateway currently supports only Go-based policies. This limits policy authors to Go, which has a steep learning curve and lacks the rich machine-learning and data-processing ecosystem available in Python. API developers and AI/ML practitioners need to write gateway policies using Python — for example, to apply NLP transformations, call ML model APIs, manipulate structured data with libraries like pandas, or perform text processing with spacy, slugify, etc.

This PR introduces a Python Policy Runtime that allows developers to write, package, and deploy gateway policies in Python while integrating seamlessly with the existing Go-based Policy Engine. Python policies follow the same lifecycle model as Go policies (one instance per route, factory pattern, on_request/on_response/close) and participate in the same policy chain execution.

Goals

Introduce a Python Policy SDK (sdk/policy.py) that mirrors the Go policy/v1alpha interface, exposing the same context types (RequestContext, ResponseContext, SharedContext), action types (UpstreamRequestModifications, ImmediateResponse, UpstreamResponseModifications), and processing mode enums.
Introduce a Python Executor — a standalone gRPC server that loads and executes Python policy instances. It communicates with the Go Policy Engine via a bidirectional gRPC stream over a Unix Domain Socket (UDS), providing high-throughput concurrent policy execution using asyncio and a thread pool.
Introduce a Go-side Python Bridge (PythonBridge, BridgeFactory, StreamManager) that integrates Python policies transparently into the existing Policy Engine kernel — no kernel changes required.
Extend the Gateway Builder to auto-detect Python policies (by runtime), generate the python_policy_registry.py, merge requirements.txt files, copy Python policy sources, and register Python policies via BridgeFactory in the generated plugin_registry.go.
Support pip-distributed Python policies via a new pipPackage field in build.yaml, enabling teams to publish and consume Python policies from PyPI or private indices.
Provide a canonical reference policy (slugify-body-python) that demonstrates the full Python policy lifecycle: factory, request-phase body transformation, graceful error handling, and close().
Expose Python Executor health in the Policy Engine admin health endpoint so operators can monitor executor readiness and loaded policy count.

Approach

The implementation follows a co-process sidecar pattern:

┌─────────────────────────────────────────────────────────┐
│ gateway-runtime container                               │
│                                                         │
│  tini (PID 1)                                           │
│    └─ docker-entrypoint.sh                              │
│         ├─ Router (Envoy)          [rtr]                │
│         ├─ Go Policy Engine        [pol]                │
│         └─ Python Policy Executor  [py-pol]  ← NEW      │
│                                                         │
│  /app/policy-engine.sock      ← Router ↔ Go PE          │
│  /app/python-policy.sock      ← Go PE ↔ Python PE       │
└─────────────────────────────────────────────────────────┘

New Components

1. Python Executor (`gateway/gateway-runtime/python-executor/`)

A Python gRPC server that:

Exposes four RPCs: InitPolicy, DestroyPolicy, ExecuteStream (bidirectional), and HealthCheck.
InitPolicy: Called once per route during chain building. Invokes the get_policy(metadata, params) factory and stores the instance with a UUID instance_id.
DestroyPolicy: Called when a route is removed. Calls instance.close() and removes it from the store.
ExecuteStream: Hot-path bidirectional stream. Uses an asyncio reader task + ThreadPoolExecutor for concurrent policy execution. Responses are out-of-order; the Go side correlates by request_id.
PolicyLoader: Imports python_policy_registry.py (generated by Gateway Builder) and caches get_policy() factory functions.
PolicyInstanceStore: Thread-safe dict mapping instance_id → Policy.
Translator: Converts between protobuf messages and Python SDK types.

2. Python Policy SDK (`gateway/gateway-runtime/python-executor/sdk/policy.py`)

Mirrors Go's policy/v1alpha interface:

Policy ABC with abstract on_request(), on_response(), and default no-op close().
Context dataclasses: SharedContext, RequestContext, ResponseContext, Body.
Action types: UpstreamRequestModifications, ImmediateResponse, UpstreamResponseModifications.
Processing mode: ProcessingMode, HeaderProcessingMode, BodyProcessingMode.

3. gRPC Contract (`gateway/gateway-runtime/proto/python_executor.proto`)

Defines PythonExecutorService with:

ExecuteStream(stream ExecutionRequest) → stream ExecutionResponse — hot-path execution.
InitPolicy(InitPolicyRequest) → InitPolicyResponse — instance lifecycle.
DestroyPolicy(DestroyPolicyRequest) → DestroyPolicyResponse — instance lifecycle.
HealthCheck(HealthCheckRequest) → HealthCheckResponse — readiness probe.

4. Go Python Bridge (`gateway/gateway-runtime/policy-engine/internal/pythonbridge/`)

PythonBridge (bridge.go): Implements policy.Policy. On creation calls InitPolicy RPC to get an instance_id. On OnRequest/OnResponse builds an ExecutionRequest proto, delegates to StreamManager, merges returned updated_metadata back into SharedContext, and translates the ExecutionResponse to a Go action. Implements Close() to call DestroyPolicy RPC.
BridgeFactory (factory.go): Registered as the PolicyFactory for Python policies in the generated plugin_registry.go. Calls InitPolicy to create the Python instance before returning the PythonBridge. Also exposes PythonHealthAdapter for the admin health endpoint.
StreamManager (client.go): Singleton managing the persistent bidirectional ExecuteStream gRPC connection to the Python Executor over UDS. Thread-safe: uses sendMu to serialize stream.Send() calls and a pendingReqs map to correlate request_id → response channel. Also provides InitPolicy, DestroyPolicy, and HealthCheck unary RPC calls. Configurable via PYTHON_EXECUTOR_SOCKET and PYTHON_POLICY_TIMEOUT env vars.
Translator (translator.go): Converts ExecutionResponse proto to Go SDK RequestAction/ResponseAction, including error handling.

5. Gateway Builder Extensions (`gateway/gateway-builder/`)

Runtime auto-detection (discovery/): A policy directory with no go.mod but at least one .py file is classified as runtime: "python".
pipPackage support (pkg/types/manifest.go, discovery/python_module.go): New pipPackage field in build.yaml (format: <package>==<version>[@<index-url>]) downloads a Python policy from PyPI or a private index using pip install --target with cross-platform executable resolution (pip3 → pip → python3 -m pip).
Python artifact generation (policyengine/generator.go): Copies the Python Executor base, copies policy source directories into output/python-executor/policies/, generates python_policy_registry.py, and merges all requirements.txt files.
Updated plugin_registry.go.tmpl: Generates BridgeFactory registrations for Python policies, reading ProcessingMode from policy-definition.yaml at build time. Imports the pythonbridge package only when Python policies are present ({{- if .HasPythonPolicies }}).
Validation (validation/structure.go): ValidatePythonDirectoryStructure checks for policy-definition.yaml and at least one .py file.

6. Docker & Entrypoint (`gateway/gateway-runtime/`)

Dockerfile: Added a new python-deps build stage that installs Python dependencies from the merged requirements.txt into /app/python-libs using pip install --target. The production stage copies the Python Executor (/app/python-executor) and pre-installed libs (/app/python-libs), sets PYTHONPATH, and installs python3 in the final image. The python3-pip and python3-venv tools are not present in the production image.
docker-entrypoint.sh: Conditionally starts the Python Executor when main.py is present in the image (i.e., at least one Python policy was compiled). Waits for the UDS socket to be ready before starting the Policy Engine. Handles graceful SIGTERM shutdown of all three processes (Python Executor, Policy Engine, Envoy). Configurable via PYTHON_POLICY_WORKERS, PYTHON_POLICY_TIMEOUT, PYTHON_EXECUTOR_SOCKET.

7. Admin Health Endpoint (`gateway/gateway-runtime/policy-engine/internal/admin/`)

types.go: Added PythonExecutorHealth { Status, LoadedPolicies } to HealthResponse.
handlers.go: Added PythonHealthChecker interface; HealthHandler optionally calls it.
main.go: Instantiates PythonHealthAdapter if the Python Executor socket exists at startup.

8. Sample Policy (`gateway/sample-policies/slugify-body-python/`)

A canonical reference implementation (SlugifyBodyPolicy) that:

Uses python-slugify (a third-party PyPI package) to convert request body text into URL-safe slugs.
Demonstrates factory pattern, ProcessingMode with requestBodyMode: BUFFER, parameter parsing, body decoding/encoding, error handling, and close().
Includes policy-definition.yaml, requirements.txt, and a heavily commented policy.py.

User stories

As an API developer, I want to write gateway policies in Python so that I can leverage Python's rich library ecosystem (ML, NLP, data processing) without learning Go.
As a platform engineer, I want to distribute Python policies as PyPI packages so teams can consume and version them like any other Python dependency.
As a gateway operator, I want the Python Executor's health and loaded policy count to appear in the Policy Engine health endpoint so I can monitor it alongside the rest of the gateway.
As a policy developer, I want a clear reference policy (slugify-body-python) that shows the full Python policy lifecycle, including how to use third-party Python libraries.

Documentation

N/A — Documentation for the Python policy authoring guide, SDK reference, and pipPackage build.yaml field should be added as follow-up work. The slugify-body-python/policy.py file is heavily commented and serves as the inline developer guide for this release.

Automation tests

Unit tests

Gateway Builder (gateway/gateway-builder):

TestDetectRuntime_GoPolicy / TestDetectRuntime_PythonPolicy / TestDetectRuntime_GoTakesPrecedence / TestDetectRuntime_EmptyDirectory / TestDetectRuntime_NonexistentDirectory — runtime auto-detection logic.
TestParsePipPackageRef_SimplePackage / TestParsePipPackageRef_WithPrivateIndex / TestParsePipPackageRef_MissingVersion / TestParsePipPackageRef_EmptyVersion / TestParsePipPackageRef_EmptyPackageName — pip package reference parsing.
TestDiscoverPoliciesFromBuildFile_PythonAutoDetect / TestDiscoverPoliciesFromBuildFile_MixedGoAndPython — build file discovery with Python policies.
TestGenerateCode_Success / TestGenerateCode_EmptyPolicies / TestGenerateCode_MultiplePolicies — code generation including the generatePythonExecutorBase step.

Note: The pythonbridge Go package (gateway/gateway-runtime/policy-engine/internal/pythonbridge/) does not have unit tests yet — these require a live Python Executor or test doubles. Unit tests for PythonBridge, StreamManager, BridgeFactory, and Translator are planned as follow-up work.

Integration tests

No dedicated integration test feature file for Python policies has been added yet (the slugify-body-python policy is compiled and registered in build.yaml, but no BDD scenario exercises it end-to-end). Integration tests for the Python policy runtime — verifying request body slugification, SharedContext metadata propagation, policy lifecycle (InitPolicy / DestroyPolicy), and Python Executor health endpoint — are planned as follow-up work.

Security checks

Followed secure coding standards in http://wso2.com/technical-reports/wso2-secure-engineering-guidelines? yes
- The Python Executor communicates exclusively over a UDS (Unix Domain Socket), not a TCP port. No external network exposure.
- Python dependencies are installed at build time into an isolated --target directory inside the Docker image. pip and venv are absent from the production image.
- The production container runs as a non-root wso2 user (uid=10001).
- No raw SQL, user-controlled command execution, or deserialization of untrusted data.
- ExecutionRequest proto parameters are passed as google.protobuf.Struct (structured, typed); no shell interpolation or SQL involved.
Ran FindSecurityBugs plugin and verified report? no (Python-side; FindSecurityBugs is Java-specific. Python static analysis with bandit is recommended as follow-up.)
Confirmed that this PR doesn't commit any keys, passwords, tokens, usernames, or other secrets? yes

Samples

`slugify-body-python` — Canonical Python Policy Reference

This policy is the authoritative sample for Python policy development. It:

Transforms the request body — converts any plain-text body into a URL-safe ASCII slug using the third-party python-slugify library.
- Example: "Hello World! Héllo?" → "hello-world-hello"
Demonstrates the full lifecycle:
- get_policy(metadata, params) factory — parses separator and maxLength params once.
- on_request(ctx, params) hot path — guard → decode → validate → transform → respond.
- close() — no-op here, but shows where to put cleanup logic.
Shows error handling patterns — returns a 400 Bad Request ImmediateResponse for empty bodies instead of silently passing them.
Declares a third-party dependency in requirements.txt — demonstrates how policy-level pip dependencies are merged into the executor's dependency set at build time.

Usage in build.yaml:

policies:
  - name: slugify-body-python
    filePath: ./sample-policies/slugify-body-python

Usage in an API definition:

operations:
  - method: POST
    path: /slugify
    policies:
      - name: slugify-body-python
        version: v1.0.0
        params:
          separator: "-"
          maxLength: 0

PyPI-distributed policy (`pipPackage`)

Python policies can be distributed as PyPI packages and consumed via:

policies:
  - name: my-org-auth-policy
    pipPackage: my-org-auth-policy==2.3.0@https://pypi.my-company.com/simple

Related PRs

N/A

Test environment

Component	Version / Details
Go	1.25.7
Python	3.x
grpcio	≥ 1.59.0
protobuf	≥ 4.24.0
OS	macOS (development), Linux (Docker production image)
Envoy	v1.35.3 (base image: `envoyproxy/envoy:v1.35.3`)
Docker	Multi-platform build (`linux/amd64`, `linux/arm64`)

Add two Python policies (python-body-validator, python-json-body-transform) with policy-definition.yaml, example policy.py implementations and requirements. Update build config to register sample-policies and include them in the runtime build context. Enhance the runtime Dockerfile to preinstall Python dependencies in a dedicated stage and copy them into the production image; adjust Makefile to pass sample-policies as a build context. Minor logger message change in policy_loader.py and add results.md with analysis notes.

Replace the sequential await/yield execution in PythonExecutorServicer.ExecuteStream with a concurrent reader/worker/writer fan-out pattern so multiple policy executions can run concurrently in the thread pool. Add a bounded asyncio.Queue and semaphore to cap in-flight tasks, spawn per-request processing tasks with run_in_executor, and yield responses as they complete (out-of-order). Introduce PYTHON_POLICY_MAX_CONCURRENT (MAX_CONCURRENT) and propagate it from main.py into the server/servicer constructors. Also add CONCURRENT_FANOUT_PLAN.md and update generated .pyc artifacts. Files changed: gateway/gateway-runtime/python-executor/executor/server.py (core change), gateway/gateway-runtime/python-executor/main.py (new env var and constructor arg), added CONCURRENT_FANOUT_PLAN.md, and updated __pycache__ artifacts.

Introduce lifecycle RPCs and instance management for Python policies. Adds InitPolicy and DestroyPolicy to the proto and regenerates Go/Python stubs; ExecutionRequest gains an instance_id. Python executor rewritten to support factories and instances: new PolicyInstanceStore, rewritten PolicyLoader, and a gRPC server that handles InitPolicy/DestroyPolicy and dispatches ExecuteStream to instances by instance_id. Python SDK switched to class-based Policy ABC with a close() hook; sample policy updated to use get_policy() factory. Go pythonbridge updated to call InitPolicy in GetPolicy, store the returned instance_id, include it on Execute requests, and expose PythonBridge.Close() which calls DestroyPolicy. Documentation and a discussion file for Close() (wso2#734) are added. This enables factory-controlled instancing and proper cleanup of Python policy resources.

…tation # Conflicts: # gateway/gateway-builder/go.sum

Convert AuthContext to a flat map[string]string for the Python bridge proto wireformat by adding authContextToMap and using it in buildSharedContext. Fix import path in python_executor_pb2_grpc.py to import from proto. Update several policy dependency versions in build-lock.yaml (api-key-auth/basic-auth/jwt-auth -> v0.8.1; mcp-auth/mcp-authz -> v0.2.2). Also include minor import/formatting tidy-ups in factory.go and buildPolicyMetadata.

Delete three sample Python policies (add-python-response-header, python-body-validator, python-json-body-transform) including their policy-definition.yaml, policy.py, and requirements.txt files. Update gateway/build.yaml and gateway/build-lock.yaml to remove references to the removed sample policies and keep the build manifests consistent.

Introduce a new Python policy 'prompt-compression' to compress LLM prompts: adds policy implementation (policy.py), policy-definition.yaml, user docs, and requirements. Registers the policy in build manifests and sample-policies build config, and adds a gateway controller default policy manifest. Also adds a CURL_REFERENCE.md with end-to-end test commands and removes the obsolete STATELESS_PYTHON_PLAN.md. This change ships the new policy, documentation, and integration points so the policy can be built, deployed, and manually tested.

Reworks gateway/CURL_REFERENCE.md into a comprehensive, up-to-date AI Gateway curl reference: renames the doc, adds a one-shot payload generator, reorganizes sections (health, LLM provider/proxy management, policies, prompt-compression tests, rate-limiting, observability, API key management), updates example endpoints to Gemini provider/proxy routes, and includes a full smoke-test script. Also updates gateway/build-lock.yaml to bump the request-rewrite policy version from v0.2.0 to v0.2.1.

Introduce pipPackage as a manifest source for Python policies and add local runtime detection and pip-based fetching. Key changes: - Rename BuildEntry.pythonmodule -> pipPackage and propagate throughout types and validation. - Add DetectRuntime to auto-detect go/python by directory fingerprint (go.mod vs .py files). - Implement ParsePipPackageRef, FetchPipPackage, findPolicyDir and pip executable resolution to install packages into an isolated target and locate policy-definition.yaml. - Extend WriteBuildLockWithVersions to include pipPackage entries. - Route discovery: support explicit pipPackage (remote), explicit gomodule (Go), and filePath with runtime auto-detection (local Go or Python). Consolidate Python discovery logic into helpers (discoverPipPolicy, discoverLocalPythonPolicy, buildPythonDiscoveredPolicy). - Update tests to cover runtime detection, pip ref parsing, fingerprint discovery, and adjust policyengine tests to supply outputDir and ensure python-executor sibling dir exists. - Update Dockerfile to install python3, pip and venv for building Python policies. - Update build.yaml/build-lock.yaml to mark prompt-compression as a pipPackage and add uppercase-body sample policy; bump request-rewrite version. - Remove some default Python policy YAMLs and remove explicit runtime: python from sample/default policy files where appropriate. - Minor formatting and struct field alignment changes in generator.go. These changes enable Python policies to be referenced as pip packages or local directories with correct discovery and installation during build time.

Add a new slugify-body-python sample policy and update build manifests to reference it; remove legacy Python sample policies. Integrate Python executor health into the admin server and /health handler via a PythonHealthChecker adapter (created from the pythonbridge StreamManager) so the policy engine reports executor readiness and loaded policy count. Improve pythonbridge StreamManager lifecycle (long-lived stream context, stream cancellation on disconnect/close) and add a HealthCheck RPC wrapper. Refactor PythonBridge request builders, update BridgeFactory to expose instance IDs and provide the PythonHealthAdapter. Update python-executor server and instance store: use InstanceRecord dataclass, switch to a deque+Condition response pipeline for streaming concurrency, and ensure correct instance record handling. Update tests and server wiring to accommodate new handler signatures.

Delete a set of outdated planning, proposal, and documentation markdown files related to Python policy support and agent guidance to clean up the repo. Files removed: AGENTS.md, CONCURRENT_FANOUT_PLAN.md, PYTHON_LIFECYCLE_PLAN.md, discussion#734.md, first_proposal.md, gateway/docs/python-policy-support.md, python-policy-comprehensive-plan.md, results.md. This is a docs-only cleanup; no source code or proto changes were made.

coderabbitai · 2026-03-12T08:20:39Z

Python Policy Runtime Support

This PR introduces comprehensive support for authoring and executing gateway policies in Python, enabling developers to leverage Python libraries and ecosystems alongside existing Go-based policies. The implementation maintains transparent integration with the Policy Engine without kernel-level changes.

Gateway Builder Enhancements

Extended the policy discovery and validation pipeline to recognize Python policies:

Runtime Detection: Automatically detects policy runtime (Go or Python) based on directory contents (go.mod for Go, .py files for Python)
Multiple Source Options: Policies can now be sourced from local directories (filePath), Go modules (gomodule), or pip packages (pipPackage)
Pip Package Resolution: Implements pip reference parsing (format: package==version[@index-url]), downloads wheels, and extracts module metadata for dependency management
Build Manifest Extension: Updated manifest schema to include pipPackage field and pip-specific metadata for resolved Python policies
Validation Logic: Introduced Python-specific directory structure and file validation; enforces that policy definitions and at least one .py source file exist
Code Generation: Extended phase 3 code generation to accept output directory parameter; generates python_policy_registry.py mapping policy identifiers to module paths; merges base and policy-specific requirements

Python Executor Implementation

Created a standalone asyncio-based gRPC server providing Python policy runtime:

Service Architecture: Implements PythonExecutorService with lifecycle (InitPolicy, DestroyPolicy) and streaming execution (ExecuteStream) RPCs plus health checks
Policy Lifecycle: Manages policy instances with thread-safe storage, execution concurrency control, and graceful shutdown semantics
Policy Loading: Dynamically imports policy modules from generated registry; validates exported factories and falls back to major version when exact versions unavailable
Bidirectional Streaming: Supports out-of-order response emission and per-request concurrency limits for efficient parallel execution
Python SDK: Provides Policy abstract base class, processing mode configuration, typed contexts for request/response phases, and action types for upstream modifications or immediate responses

Policy Engine Integration

Bridged Go and Python policy ecosystems:

Bridge Pattern: Created Python bridge that implements Go policy interfaces and translates between Go and Python execution models via gRPC
Request/Response Handling: Translates policy context, metadata, and parameters to protobuf messages; converts protobuf execution results back to Go action types
Streaming Support: Supports header/body streaming for both request and response phases with appropriate error handling (immediate response for non-streaming phases, forwarding for streaming)
Instance Management: Coordinates policy instance lifecycle (initialization, parameter resolution, cleanup) across process boundaries
Health Integration: Exposes Python executor health status (readiness and loaded policy count) through the admin endpoint

Runtime and Infrastructure

Updated container and process management to support the new Python execution tier:

Docker Build: Added Python tooling to builder stages; introduced separate python-deps stage to isolate pip-installed dependencies; configures Python path for executor module discovery
Entrypoint: Extended startup logic to conditionally launch Python executor when registry exists; manages three-process lifecycle (Envoy, Policy Engine, Python Executor); coordinates graceful shutdown and socket cleanup
Environment Configuration: Introduced configurable parameters for Python executor behavior (worker pool size, concurrent execution limit, request timeout)
Socket Communication: Enforces Unix domain socket-only communication for Python executor, improving isolation and security posture
Process Management: Implements health checking during startup, timeout handling, and robust process termination sequences

Testing and Validation

Comprehensive test coverage for new components:

Gateway Builder Tests: Validates discovery of Python policies from all source types (local, pip packages); verifies pip reference parsing; tests runtime auto-detection and error handling for validation constraints
Python Bridge Tests: Covers request/response metadata handling, context translation, streaming behavior, and instance lifecycle management
Proto Regeneration: Added automated checks ensuring protobuf definitions remain synchronized; supports both Go and Python code generation
Python Unit Tests: Tests for executor services, policy loading, instance management, and message translation

Security Considerations

Implemented isolation boundaries and input validation:

Socket-Based IPC: Restricts communication between Policy Engine and Python executor to Unix domain sockets, preventing network exposure
Isolated Dependencies: Pip packages installed in separate layer without pip/venv artifacts in runtime image
Process Isolation: Python executor runs as distinct process with independent lifecycle management
Input Validation: Enforces policy metadata constraints, validates Python directory structure, and validates pip reference formats

Walkthrough

Wires Python policy support end-to-end: build-file pip discovery and wheel extraction, runtime detection, Python policy discovery/validation, policy codegen emitting Python executor artifacts and requirements, a gRPC Python Executor service + SDK, Go <-> Python bridge and stream manager, runtime packaging/entrypoint changes, and aggregated health checks.

Changes

Cohort / File(s)	Summary
Builder CLI `gateway/gateway-builder/cmd/builder/main.go`	Passes `outputDir` into policy code generation call (`policyengine.GenerateCode(..., outputDir)`).
Types & Manifest `gateway/gateway-builder/pkg/types/manifest.go`, `gateway/gateway-builder/pkg/types/policy.go`, `gateway/gateway-runtime/proto/python_executor.proto`	Added `PipPackage` to BuildEntry; extended `DiscoveredPolicy` with runtime and Python/pip metadata; introduced PythonExecutorService protobuf schema and messages.
Discovery & Pip support `gateway/gateway-builder/internal/discovery/...`, `gateway/gateway-builder/internal/discovery/python_module.go`, `gateway/gateway-builder/internal/discovery/discovery_test.go`	Refactored discovery to accept `pipPackage` alongside `filePath`/`gomodule`; added DetectRuntime, pip-ref parsing, wheel download/extract (FetchPipPackage), local Python discovery helpers, and extensive Python-focused tests.
Build manifest & Lockfile `gateway/gateway-builder/internal/buildfile/generator.go`	Added `pipPackage` to lock entries, populate from policy metadata, and resolve Python pip package versions when selecting candidates.
Code generation & registry `gateway/gateway-builder/internal/policyengine/*.go`, `gateway/gateway-builder/templates/plugin_registry.go.tmpl`	GenerateCode now accepts `outputDir`, splits policies by runtime, always generates/copies Python executor base, copies local Python sources, emits `python_policy_registry.py`, merges/updates `requirements.txt`, and templates/registry updated to register both Go and Python policies.
Policyengine tests `gateway/gateway-builder/internal/policyengine/policyengine_test.go`	Updated tests to pass new `outputDir`, added test ensuring Python executor base files are copied into output.
Validation & runtime-aware dispatch `gateway/gateway-builder/internal/validation/*.go`, `gateway/gateway-builder/internal/discovery/policy.go`	Added Python directory validation and source-collection helpers; validator dispatches based on `Runtime` (python vs go).
Gateway runtime: Python Executor (Python) `gateway/gateway-runtime/python-executor/**`	New Python executor package: proto-generated modules, async gRPC server (InitPolicy/DestroyPolicy/ExecuteStream/HealthCheck), PolicyLoader, instance store, translator, CLI `main.py`, SDK, requirements and proto Python bindings.
Gateway runtime: Go Python Bridge & Client `gateway/gateway-runtime/policy-engine/internal/pythonbridge/*`	Added StreamManager gRPC client, Translator, BridgeFactory and bridge implementation to call Python executor over ExecuteStream, including health adapter and translator for proto↔Go SDK conversions.
Runtime integration & health `gateway/gateway-runtime/Dockerfile`, `gateway/gateway-runtime/docker-entrypoint.sh`, `gateway/gateway-runtime/policy-engine/internal/admin/*`	Dockerfile and entrypoint extended for Python executor and Python libs packaging; new env vars and socket handling; process lifecycle updated to manage Python executor; admin health API and HealthHandler extended to aggregate Python health and expose `Reason`.
Makefile, proto workflow & tests `gateway/gateway-runtime/Makefile`, various tests, `.gitignore`	Make targets for proto generation and python tests, proto regeneration checks; added Python-related .gitignore entries; many new/updated tests for discovery, bridge, and admin behavior.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant PolicyEngine as Policy Engine
    participant PythonBridge as Python Bridge
    participant StreamMgr as StreamManager
    participant PythonExec as Python Executor

    Client->>PolicyEngine: HTTP request
    PolicyEngine->>PythonBridge: Invoke policy hook (OnRequest/OnResponse)
    PythonBridge->>StreamMgr: Execute(ExecutionRequest) (assign request_id)
    StreamMgr->>PythonExec: send over ExecuteStream (gRPC)
    PythonExec->>PythonExec: run policy instance, produce ExecutionResponse
    PythonExec->>StreamMgr: stream ExecutionResponse (request_id)
    StreamMgr->>PythonBridge: deliver matched ExecutionResponse
    PythonBridge->>PolicyEngine: translated RequestAction/ResponseAction
    PolicyEngine->>Client: HTTP response

sequenceDiagram
    participant Builder as Gateway Builder
    participant Discovery as Discovery Module
    participant PipModule as Pip Handler
    participant PipExec as pip
    participant Generator as Code Generator

    Builder->>Discovery: DiscoverPoliciesFromBuildFile(buildfile)
    alt entry has pipPackage
        Discovery->>PipModule: discoverPipPolicy()
        PipModule->>PipModule: ParsePipPackageRef()
        PipModule->>PipExec: pip download --no-deps (wheel)
        PipExec->>PipModule: wheel file
        PipModule->>PipModule: read top_level.txt, extract module
        PipModule->>Discovery: DiscoveredPolicy (Runtime: python, extracted dir)
    else local filePath
        Discovery->>Discovery: DetectRuntime(policyDir)
        Discovery->>Discovery: discoverLocalPythonPolicy() or discoverGoPolicy()
    end
    Discovery->>Builder: []DiscoveredPolicy
    Builder->>Generator: GenerateCode(srcDir, policies, outputDir)
    Generator->>Generator: split by runtime, generate Python artifacts, merge requirements

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 63.64% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Python Policy Support via Python Runtime' clearly and concisely describes the primary change—introducing Python policy support to the gateway. It is specific enough to convey the main feature without excessive detail.
Description check	✅ Passed	The PR description is comprehensive and follows the template structure with all major sections present: Purpose, Goals, Approach, User stories, Documentation, Automation tests, Security checks, Samples, Related PRs, and Test environment. It provides clear context and justification for the changes.
Linked Issues check	✅ Passed	The PR fully addresses the linked issue `#1683` by implementing Python policy support in the Policy Engine. The code changes introduce a complete Python Executor runtime, Python policy SDK, Go-side bridge integration, and Gateway Builder extensions—all required to support Python-based policies alongside Go policies.
Out of Scope Changes check	✅ Passed	All changes are directly related to the Python policy runtime feature. No unrelated modifications to unrelated systems or components are present. Changes span Gateway Builder (policy discovery, pip support, code generation), Python Executor (gRPC server, SDK, lifecycle management), Go Bridge (streamManager, translator, factory), Docker/entrypoint (process management), admin health (Python health checking), and tests—all scoped to Python policy support.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Delete the default slugify-body-python policy manifest and clean up build manifests: remove slugify-body-python and uppercase-body entries from gateway/build.yaml; delete gateway-controller/default-policies/slugify-body-python.yaml; update gateway/sample-policies/build.yaml to reference ./slugify-body (and simplify the comment), and remove the prompt-compression entry. These changes consolidate the sample policy location and remove now-unused default policy references.

CLAassistant · 2026-03-12T08:27:01Z

All committers have signed the CLA.

coderabbitai

Actionable comments posted: 8

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

gateway/gateway-runtime/Makefile (1)

85-102: ⚠️ Potential issue | 🟠 Major

Missing sample-policies build context in build-coverage-image target.

The sample-policies build context was added to build and build-debug targets but is missing from build-coverage-image (and build-and-push-multiarch). If the Dockerfile references --from=sample-policies, these builds will fail.

🐛 Proposed fix

 build-coverage-image: test ## Build Gateway Runtime Docker image with coverage instrumentation
 	`@echo` "Building Gateway Runtime coverage image ($(IMAGE_NAME)-coverage:$(VERSION))..."
 	`@mkdir` -p target && cp ../build.yaml target/
 	DOCKER_BUILDKIT=1 docker build -f Dockerfile \
 		--build-context sdk=../../sdk \
 		--build-context common=../../common \
 		--build-context gateway-builder=../gateway-builder \
 		--build-context system-policies=../system-policies \
 		--build-context dev-policies=../dev-policies \
+		--build-context sample-policies=../sample-policies \
 		--build-context target=target \

Similarly for build-and-push-multiarch around line 117.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Makefile` around lines 85 - 102, The
build-coverage-image target is missing the sample-policies build context which
the Dockerfile may reference; add the build context flag --build-context
sample-policies=../sample-policies to the DOCKER_BUILDKIT docker build
invocation in the build-coverage-image recipe (same place where other
--build-context flags are listed) and also add the same build-context flag to
the build-and-push-multiarch target so any --from=sample-policies stage in the
Dockerfile resolves correctly (keep the existing --build-arg VERSION/ GIT_COMMIT
and tagging behavior unchanged).

🟡 Minor comments (14)

gateway/gateway-runtime/python-executor/sdk/policy.py-109-116 (1)
109-116: ⚠️ Potential issue | 🟡 Minor

Inconsistency confirmed: Go SDK defaults StatusCode to 0 (invalid HTTP status), Python defaults to 500. Align defaults or document the intentional difference.

The proto definition (ImmediateResponseAction) and Go SDK both default to 0, which is not a valid HTTP status code. The Python SDK defaults to 500 (a valid error status), creating an inconsistency. While developers should always explicitly set status_code (as shown in the sample policy), the SDK defaults should be aligned across runtimes or the difference should be explicitly documented if intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/sdk/policy.py` around lines 109 -
116, The ImmediateResponse dataclass currently defaults status_code to 500,
which is inconsistent with the proto ImmediateResponseAction and Go SDK (which
default to 0); change the default for status_code in ImmediateResponse to 0 to
match the proto/Go behavior and add a brief docstring line on ImmediateResponse
noting the default 0 is intentional (and that callers should set a valid HTTP
status) so the runtime defaults are consistent across SDKs (refer to
ImmediateResponse.status_code and ImmediateResponseAction).
gateway/gateway-runtime/python-executor/main.py-73-80 (1)
73-80: ⚠️ Potential issue | 🟡 Minor

Store task reference and use get_running_loop() instead of deprecated get_event_loop().

Two issues in the signal handling setup:

asyncio.get_event_loop() is deprecated since Python 3.10 and should be replaced with asyncio.get_running_loop().

The task created by asyncio.create_task(server.shutdown()) is not stored, which means it could be garbage collected before completion.
🐛 Proposed fix
+    shutdown_task = None
+
     def signal_handler():
+        nonlocal shutdown_task
         logger.info("Received shutdown signal")
-        asyncio.create_task(server.shutdown())
+        shutdown_task = asyncio.create_task(server.shutdown())

-    loop = asyncio.get_event_loop()
     for sig in (signal.SIGTERM, signal.SIGINT):
-        loop.add_signal_handler(sig, signal_handler)
+        asyncio.get_running_loop().add_signal_handler(sig, signal_handler)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/main.py` around lines 73 - 80,
Replace the deprecated asyncio.get_event_loop() call with
asyncio.get_running_loop() when initializing loop, and in signal_handler capture
and store the shutdown task so it isn't garbage-collected (e.g., assign the
result of asyncio.create_task(server.shutdown()) to a variable scoped where it
will live). Update references in the code around loop, signal_handler, and
server.shutdown() to use the new loop and to keep the created task reference
alive until completion.
gateway/gateway-runtime/python-executor/main.py-82-88 (1)
82-88: ⚠️ Potential issue | 🟡 Minor

Add a synchronization guard to prevent concurrent shutdown() calls.

The signal handler (line 77) creates an unawaited task calling server.shutdown(), while the finally block (line 88) also calls it. These can execute concurrently in the event loop without synchronization. Although PolicyInstanceStore.clear() is idempotent and closing instances twice is safe, server.stop() (line 386) lacks an explicit guard. Add a flag or asyncio.Event to ensure shutdown executes only once:
# In __init__:
self._shutdown_event = asyncio.Event()

# In shutdown():
if self._shutdown_event.is_set():
    return
self._shutdown_event.set()
# ... rest of shutdown logic
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/main.py` around lines 82 - 88, The
shutdown() method can be invoked concurrently by the signal handler task and the
finally block; add a one-time synchronization guard to ensure shutdown runs only
once by initializing a flag/event (e.g., self._shutdown_event = asyncio.Event()
in __init__) and at the start of shutdown() check if the event is set and return
early if so, otherwise set the event and proceed with the existing shutdown
logic (which calls PolicyInstanceStore.clear() and server.stop()); update any
callers (signal handler/unawaited task and the finally block) to simply
call/await server.shutdown() without worrying about duplicate execution.
gateway/gateway-controller/default-policies/prompt-compression.yaml-38-47 (1)
38-47: ⚠️ Potential issue | 🟡 Minor

Consider deprecating the no-op parameter.

minInputTokens is documented as having no effect and kept only for backward compatibility. Consider formally marking it as deprecated (e.g., via a deprecated: true field or naming convention) to signal that it may be removed in a future version, rather than silently accepting a parameter that does nothing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-controller/default-policies/prompt-compression.yaml` around
lines 38 - 47, The minInputTokens parameter is a no-op kept for backward
compatibility; mark it as deprecated so callers know it may be removed—add a
formal deprecation marker (e.g., deprecated: true or an x-wso2-deprecated: true
field) to the minInputTokens schema entry in prompt-compression.yaml, update its
description to state it is deprecated and slated for removal, and ensure any
schema consumers (validation or docs generators) will surface this deprecation
by using the same property name minInputTokens to locate and annotate it.
gateway/gateway-builder/internal/policyengine/policyengine_test.go-378-381 (1)
378-381: ⚠️ Potential issue | 🟡 Minor

Test isolation concern with sibling directory pattern.

Using filepath.Join(tmpDir, "..", "python-executor") creates a directory in the parent of t.TempDir(). Since t.TempDir() returns paths under a shared OS temp folder, parallel test runs may collide on the python-executor directory name.

Consider creating a dedicated parent directory within the temp dir to avoid potential flakiness:
Proposed fix
-	// Create python-executor sibling directory (required by generatePythonExecutorBase)
-	pythonExecDir := filepath.Join(tmpDir, "..", "python-executor")
-	testutils.CreateDir(t, pythonExecDir)
+	// Create a workspace with policy-engine and python-executor as siblings
+	workspace := t.TempDir()
+	tmpDir := filepath.Join(workspace, "policy-engine")
+	testutils.CreateDir(t, tmpDir)
+	pythonExecDir := filepath.Join(workspace, "python-executor")
+	testutils.CreateDir(t, pythonExecDir)
This pattern should be applied consistently across all affected tests (TestGenerateCode_Success, TestGenerateCode_EmptyPolicies, TestGenerateCode_MissingCmdDirectory, TestGenerateCode_MissingGoMod, TestGenerateCode_MultiplePolicies).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/policyengine_test.go` around
lines 378 - 381, The tests create a sibling "python-executor" directory using
pythonExecDir := filepath.Join(tmpDir, "..", "python-executor"), which can
collide across parallel test runs; change the setup to create an isolated parent
inside the temp dir (e.g., make a subdir under tmpDir to act as the parent) and
then create the python-executor directory as a child of that new parent so
generatePythonExecutorBase sees the sibling without touching the OS temp parent;
apply this change to the tests referenced (TestGenerateCode_Success,
TestGenerateCode_EmptyPolicies, TestGenerateCode_MissingCmdDirectory,
TestGenerateCode_MissingGoMod, TestGenerateCode_MultiplePolicies) and update the
pythonExecDir variable and any setup/teardown in those tests accordingly.
gateway/sample-policies/slugify-body-python/policy.py-137-153 (1)
137-153: ⚠️ Potential issue | 🟡 Minor

Reject inputs that collapse to an empty slug.

Bodies made only of punctuation or unsupported characters pass text.strip() but can still produce "" here, so the policy forwards an empty body upstream.
Suggested fix
         slug = slugify(
             text,
             separator=self._separator,
             max_length=self._max_length,
         )
+        if not slug:
+            return ImmediateResponse(
+                status_code=400,
+                headers={"Content-Type": "text/plain; charset=utf-8"},
+                body=b"slugify-body: request body must contain slugifiable text",
+            )
         logger.info(
             "on_request: slugified body — %r → %r",
             text[:60],
             slug[:60],
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/sample-policies/slugify-body-python/policy.py` around lines 137 -
153, The slug can be empty after slugify (variables slug and text), so before
returning the UpstreamRequestModifications you must detect slug == "" and reject
the request instead of forwarding an empty body: in the function/method where
slugify is called (look for slugify(...) and the return of
UpstreamRequestModifications), check if slug is empty and return an appropriate
rejection (e.g., raise an HTTP error or return a rejection response) with a
clear message rather than returning body=slug.encode(...).
gateway/gateway-builder/internal/policyengine/generator.go-234-281 (1)
234-281: ⚠️ Potential issue | 🟡 Minor

Requirements merging uses exact string matching, which won't detect version conflicts.

If one policy requires requests==2.28.0 and another requires requests==2.31.0, both lines will be added. This can cause unpredictable pip behavior. Consider extracting package names and detecting conflicts.

Would you like me to propose a more robust requirements merger that detects package version conflicts?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 234
- 281, The mergeRequirements function currently deduplicates by exact line text
which misses package/version conflicts (e.g., requests==2.28.0 vs
requests==2.31.0); update mergeRequirements to parse each requirement line into
a package name and version specifier (use a helper that splits on first
comparison operator or version separator), track seen packages in a map from
package name -> (versionSpec, sourcePolicy), and when encountering a different
version spec for the same package either (a) decide a deterministic resolution
(e.g., pick the highest compatible version) or (b) return a descriptive error
indicating the conflicting package, the differing version specs and which
DiscoveredPolicy (use p.Name and p.PythonSourceDir) introduced each; ensure
final output reconstructs a normalized requirements.txt (one entry per package)
and adjust the function signature/return to surface conflict errors so callers
can handle them.
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go-155-161 (1)
155-161: ⚠️ Potential issue | 🟡 Minor

Dropping responses when channel is full may cause silent failures.

The non-blocking send with default case logs an error but the caller waiting on respCh will timeout without knowing the response was actually received. Consider increasing buffer size or blocking with a short timeout.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go` around
lines 155 - 161, The current non-blocking send to ch (sending resp and logging
via slogger.Error with resp.RequestId) drops responses silently; either increase
the channel buffer where respCh is created to a safe size or replace the
non-blocking send with a bounded-blocking send using a short timeout (use
time.After) so the sender waits briefly before giving up and logs the timeout
including resp.RequestId; locate the channel creation for respCh to adjust its
capacity or update the send site (the select that currently has case ch <- resp
and default) to include a timeout case that logs the failure and preserves
observability for the caller.
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go-241-245 (1)
241-245: ⚠️ Potential issue | 🟡 Minor

Flattening ac.Properties may overwrite standard auth context keys.

If ac.Properties contains keys like "subject" or "issuer", they will overwrite the standard values set above. Consider namespacing properties or skipping reserved keys.
🛡️ Proposed fix
 	// Flatten properties
 	for k, v := range ac.Properties {
+		// Skip reserved keys to prevent overwriting standard fields
+		switch k {
+		case "subject", "issuer", "auth_type", "authenticated", "authorized", "credential_id":
+			continue
+		}
 		m[k] = v
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 241 - 245, The loop that flattens ac.Properties into m can clobber
reserved auth-context keys (e.g., "subject", "issuer") because it blindly
assigns m[k]=v; update the logic in the flattening block (the for k,v := range
ac.Properties loop) to either namespace property keys (e.g., prefix with
"prop.") or explicitly skip a set of reserved keys (subject, issuer, audience,
etc.) before assigning into m to prevent overwriting the standard auth context
values.
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go-273-290 (1)
273-290: ⚠️ Potential issue | 🟡 Minor

Error responses expose internal error details to clients.

The error message from err.Error() is returned directly in the response body, which could leak internal implementation details or stack traces to API consumers.
🛡️ Proposed fix - use generic error message
 func (b *PythonBridge) errorResponse(err error) policy.RequestAction {
+	b.slogger.Error("Policy execution error", "error", err)
 	return policy.ImmediateResponse{
 		StatusCode: 500,
 		Headers:    map[string]string{"Content-Type": "text/plain"},
-		Body:       []byte(fmt.Sprintf("Policy execution error: %v", err)),
+		Body:       []byte("Internal policy error"),
 	}
 }

 func (b *PythonBridge) errorResponseAction(err error) policy.ResponseAction {
+	b.slogger.Error("Policy execution error in response phase", "error", err)
 	statusCode := 500
 	return policy.UpstreamResponseModifications{
 		StatusCode: &statusCode,
-		Body:       []byte(fmt.Sprintf("Policy execution error: %v", err)),
+		Body:       []byte("Internal policy error"),
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 273 - 290, The current implementations of errorResponse and
errorResponseAction return err.Error() in response bodies which can leak
internals; change both functions (errorResponse and errorResponseAction) to
return a generic, non-sensitive message (e.g. "Policy execution failed") and
avoid including the actual err string or stack trace, while preserving the 500
status and Content-Type header (for errorResponse) and setting Body to the
generic message in policy.ImmediateResponse and
policy.UpstreamResponseModifications respectively.
gateway/gateway-builder/internal/discovery/manifest.go-240-257 (1)
240-257: ⚠️ Potential issue | 🟡 Minor

DetectRuntime silently defaults to Go on directory read errors.

If os.ReadDir fails (e.g., permission denied), the function returns "go" without logging, which could lead to confusing build failures downstream when a Python policy is misidentified.
🛡️ Proposed fix to add logging
 func DetectRuntime(policyDir string) string {
 	goMod := filepath.Join(policyDir, "go.mod")
 	if _, err := os.Stat(goMod); err == nil {
 		return "go"
 	}
 
 	entries, err := os.ReadDir(policyDir)
 	if err != nil {
+		slog.Warn("Failed to read policy directory for runtime detection, defaulting to Go",
+			"path", policyDir,
+			"error", err)
 		return "go"
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 240 -
257, DetectRuntime currently swallows os.ReadDir errors and silently returns
"go"; update DetectRuntime to log the ReadDir error before returning so
misidentification is visible. Specifically, in DetectRuntime (function name)
when os.ReadDir(policyDir) returns an error, call the existing logger (or fmt if
no logger is available) to emit a clear message including the policyDir and the
error, then continue to return "go" (preserving behavior). Ensure the log call
is placed right where the ReadDir error is checked so the error context is
captured.
gateway/gateway-runtime/python-executor/executor/policy_loader.py-99-122 (1)
99-122: ⚠️ Potential issue | 🟡 Minor

Ensure policies are loaded before querying factories.

get_factory doesn't verify that load_policies() has been called. If a caller invokes get_factory before loading, it will always raise KeyError even for valid policies.
🛡️ Proposed fix to add a guard
     def get_factory(self, name: str, version: str) -> PolicyFactory:
         """Get the factory function for a policy.
 
         Args:
             name: Policy name.
             version: Policy version.
 
         Returns:
             The get_policy factory callable.
 
         Raises:
             KeyError: If policy not found.
         """
+        if not self._loaded:
+            raise RuntimeError("PolicyLoader.load_policies() must be called before get_factory()")
+
         key = f"{name}:{version}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/policy_loader.py` around
lines 99 - 122, get_factory currently queries self._factories without ensuring
policies are loaded, so callers get KeyError if load_policies() wasn't run;
modify get_factory in PolicyLoader to ensure policies are loaded before lookup
(e.g., call self.load_policies() if not already loaded or check a boolean like
self._loaded and call load_policies()), then proceed to check self._factories
for key and fallback major version; this ensures valid policies are discoverable
and avoids spurious KeyError.
gateway/gateway-runtime/python-executor/executor/translator.py-82-87 (1)
82-87: ⚠️ Potential issue | 🟡 Minor

Inconsistent body presence detection between request and response bodies.

The request_body uses a truthiness check (line 83), which treats empty bytes as missing, while response_body uses an explicit response_body_present flag (line 90). To match the pattern and correctly handle empty request bodies, either add a request_body_present field to the ResponseContext proto definition, or refactor both to use consistent presence detection logic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/translator.py` around lines
82 - 87, Make request body presence detection consistent with the response by
using an explicit presence flag: update the proto to add request_body_present on
RequestContext and then in translator.py set request_body =
Body(content=proto_ctx.request_body, present=proto_ctx.request_body_present)
(referencing Body, proto_ctx.request_body and the new request_body_present
field) so empty byte bodies are treated as present the same way
response_body_present is used for responses.
gateway/gateway-runtime/python-executor/executor/server.py-41-56 (1)
41-56: ⚠️ Potential issue | 🟡 Minor

ThreadPoolExecutor is never shut down, risking resource leaks.

The _executor ThreadPoolExecutor created in __init__ is not explicitly shut down when the server terminates. While Python's garbage collector will eventually clean it up, best practice is to call shutdown() on the executor during graceful termination to ensure all worker threads are properly joined.

Consider adding a shutdown method to the servicer or tracking it in PythonExecutorServer.shutdown().
Proposed fix — add executor reference to server for cleanup

In PythonExecutorServer, after server stop, shut down the servicer's executor:
     async def shutdown(self):
         """Shutdown: close all live policy instances, then stop server."""
         logger.info("Shutting down Python Executor...")

         # Close all live instances
         instances = self._store.clear()
         for instance in instances:
             try:
                 instance.close()
             except Exception as e:
                 logger.warning(f"Error closing policy instance during shutdown: {e}")

         if self.server:
             await self.server.stop(grace_period=5)
+
+        if hasattr(self, '_servicer') and self._servicer:
+            self._servicer._executor.shutdown(wait=True)
+
         logger.info("Python Executor shutdown complete")
You'll also need to store the servicer reference in start():
         servicer = PythonExecutorServicer(
             self._loader, self._store, self.worker_count, self.max_concurrent
         )
+        self._servicer = servicer
         proto_grpc.add_PythonExecutorServiceServicer_to_server(servicer, self.server)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 41 -
56, The ThreadPoolExecutor stored as self._executor in PythonExecutorServicer
(created in __init__) is never shut down; add a shutdown method to
PythonExecutorServicer that calls self._executor.shutdown(wait=True) (and
optionally catch/log exceptions), then ensure the PythonExecutorServer stores
the servicer instance created in start() and invokes servicer.shutdown() from
PythonExecutorServer.shutdown()/stop to join worker threads and avoid resource
leaks.

🧹 Nitpick comments (19)

gateway/gateway-runtime/python-executor/executor/instance_store.py (1)

27-30: Avoid exposing a mutable record from a thread-safe store.

Line 49 returns the live InstanceRecord, so callers can mutate record.policy outside the lock. Freezing the dataclass, or returning Policy | None directly, keeps the thread-safety boundary inside the store.

Minimal hardening

-@dataclass
+@dataclass(frozen=True, slots=True)
 class InstanceRecord:
     """Holds a policy instance."""
     policy: Policy

Also applies to: 49-52

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/instance_store.py` around
lines 27 - 30, The InstanceRecord dataclass currently exposes a mutable policy
(InstanceRecord.policy) to callers, breaking the store's thread-safety because a
live InstanceRecord is returned (see the method that returns the record at the
49-52 region); fix by making InstanceRecord immutable (add frozen=True) or
change the store API to return Policy | None (a copy or the policy object only)
instead of returning the InstanceRecord instance, and update all callers to
handle the new return type so mutations cannot occur outside the store's lock.

gateway/gateway-builder/internal/policyengine/registry.go (1)

160-167: Placeholder implementation: formatSystemParams always returns "nil".

This function ignores the input params and always returns "nil". The comment indicates this needs proper Go map syntax. If system parameters are expected to be passed to policies, this will silently drop them.

Is this intentional for this PR, or should I help implement proper map serialization? If deferred intentionally, consider adding a TODO comment or tracking issue.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/registry.go` around lines 160 -
167, The function formatSystemParams currently ignores its input and always
returns "nil"; implement proper serialization of params into a Go literal map
string so generated code receives the actual values: handle nil by returning
"nil", iterate over params in formatSystemParams to produce a Go
map[string]interface{} literal (quoting string keys/values, formatting
numbers/bools, and recursively serializing nested maps or slices) and return
that string, or if you intentionally want to defer, replace the body with a
clear TODO comment referencing an issue/PR to avoid silently dropping
parameters; update any callers expecting real params accordingly.

gateway/gateway-runtime/Dockerfile (1)

157-191: Debug stage lacks Python Executor support.

The debug stage doesn't include Python Executor setup (no python3, no python-libs, no python-executor copy). If debugging Python policy issues is needed, this stage won't support it.

If Python debugging is out of scope for now, consider adding a comment documenting this limitation. Otherwise, mirror the production stage's Python setup here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Dockerfile` around lines 157 - 191, The debug Docker
stage named "debug" currently lacks the Python Executor setup required for
debugging Python policies; either add the same Python install and executor
artifacts used in the production stage (install python3 and any python-libs,
copy the python-executor binary or package from the builder stage, and set
executable permissions on it) referencing the existing copy/chmod pattern used
for "/app/policy-engine" and "/usr/local/bin/dlv" so the debug image can run
Python policies, or if Python debugging is intentionally unsupported, add a
clear comment in the "debug" stage explaining that Python Executor is omitted
and why.

gateway/gateway-runtime/python-executor/sdk/policy.py (1)

168-176: Consider marking close() with a lint suppression or brief inline comment.

The Ruff B027 warning flags this as an empty method in an ABC without @abstractmethod. The current design is intentional (opt-in cleanup), but adding a brief # noqa: B027 or a one-line comment clarifying the no-op intent would silence the linter and improve clarity for future readers.

💡 Optional: suppress lint warning

     def close(self) -> None:
         """Release resources held by this policy instance.

         Called when the route is removed or replaced (via DestroyPolicy RPC).
         Override to close connections, stop background threads, flush caches, etc.
         Must be idempotent — may be called multiple times.
         Default implementation does nothing.
         """
-        pass
+        pass  # noqa: B027 - intentional no-op default

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/sdk/policy.py` around lines 168 -
176, The empty, intentionally no-op method Policy.close() is triggering Ruff
B027; update the method to explicitly indicate the intent by adding a brief
inline comment or lint suppression (e.g., add either "# Intentional no-op;
cleanup is optional" above the pass or append "pass  # noqa: B027") so the
linter and future readers know the no-op is deliberate while leaving the method
callable/optional.

gateway/gateway-runtime/proto/python_executor.proto (1)

20-22: Buf lint: Package directory mismatch.

The proto package wso2.gateway.python.v1 doesn't match the file location gateway/gateway-runtime/proto. This is a Buf style convention warning. If the project uses Buf for linting/breaking change detection, either:

Move the proto to wso2/gateway/python/v1/python_executor.proto, or

Configure Buf to ignore this rule if the current location is intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/proto/python_executor.proto` around lines 20 - 22,
The proto package declaration wso2.gateway.python.v1 in python_executor.proto
does not match its repository location, causing a Buf lint "package directory
mismatch"; fix by either (A) moving the proto file so the directory structure
matches the package (i.e., place python_executor.proto under directories
corresponding to wso2/gateway/python/v1) or (B) if the current location is
intentional, update the Buf configuration to ignore the
package-directory-mismatch rule for this proto (or add a
buf.module_reference/experimental configuration) so the linter won't flag
wso2.gateway.python.v1; ensure the package line remains exactly "package
wso2.gateway.python.v1;" when choosing the move option and validate with buf
lint after the change.

gateway/gateway-runtime/python-executor/main.py (1)

34-35: Add error handling for environment variable parsing.

int() will raise ValueError if the environment variable contains a non-integer value, causing the executor to crash with an unclear error message.

♻️ Proposed fix with graceful defaults

-WORKER_COUNT = int(os.environ.get("PYTHON_POLICY_WORKERS", "4"))
-MAX_CONCURRENT = int(os.environ.get("PYTHON_POLICY_MAX_CONCURRENT", "100"))
+def _parse_int_env(name: str, default: int) -> int:
+    """Parse integer environment variable with fallback to default."""
+    value = os.environ.get(name)
+    if value is None:
+        return default
+    try:
+        return int(value)
+    except ValueError:
+        logging.getLogger(__name__).warning(
+            f"Invalid integer for {name}='{value}', using default {default}"
+        )
+        return default
+
+WORKER_COUNT = _parse_int_env("PYTHON_POLICY_WORKERS", 4)
+MAX_CONCURRENT = _parse_int_env("PYTHON_POLICY_MAX_CONCURRENT", 100)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/main.py` around lines 34 - 35, Wrap
the int() parsing of PYTHON_POLICY_WORKERS and PYTHON_POLICY_MAX_CONCURRENT in
safe parsing with ValueError handling so bad env values don't crash startup:
implement a small helper (e.g., parse_int_env(var_name, default)) or try/except
around the conversions used to set WORKER_COUNT and MAX_CONCURRENT, fall back to
the existing defaults (4 and 100) on parse failure, and emit a clear warning/log
message (use the module logger or print) that includes the invalid value and the
default being used; update references to WORKER_COUNT and MAX_CONCURRENT to use
the sanitized values.

gateway/gateway-builder/internal/validation/validator.go (1)

73-102: Consider handling unknown runtimes explicitly.

The current logic treats any non-"python" runtime as Go, which could silently apply incorrect validations if new runtimes are added in the future.

♻️ Proposed enhancement for explicit runtime handling

 		if policy.Runtime == "python" {
 			// Python-specific validations
 			structErrors := ValidatePythonDirectoryStructure(policy)
 			result.Errors = append(result.Errors, structErrors...)
 			if len(structErrors) > 0 {
 				result.Valid = false
 			}
-		} else {
+		} else if policy.Runtime == "" || policy.Runtime == "go" {
 			// Go-specific validations
 			// Directory structure validation
 			structErrors := ValidateDirectoryStructure(policy)
+			// ... existing code ...
+		} else {
+			result.Errors = append(result.Errors, types.ValidationError{
+				PolicyName:    policy.Name,
+				PolicyVersion: policy.Version,
+				FilePath:      policy.Path,
+				Message:       fmt.Sprintf("unsupported runtime: %s", policy.Runtime),
+			})
+			result.Valid = false
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/validation/validator.go` around lines 73 -
102, The code assumes any non-"python" runtime is Go; update the runtime
handling in the validator to be explicit: check policy.Runtime for "python"
(call ValidatePythonDirectoryStructure) and for "go" (call
ValidateDirectoryStructure, ValidateGoInterface, ValidateGoMod), and add a
default branch that appends a clear validation error to result.Errors and sets
result.Valid = false when an unknown runtime is provided; reference the existing
symbols policy.Runtime, ValidatePythonDirectoryStructure,
ValidateDirectoryStructure, ValidateGoInterface, ValidateGoMod, result.Errors
and result.Valid so the new logic replaces the current if/else block with an
explicit switch or if/else-if branches plus the default error path.

gateway/gateway-runtime/policy-engine/cmd/policy-engine/main.go (1)

242-251: Consider consolidating socket detection logic.

The current implementation has duplicate code in both branches. Consider refactoring to determine the socket path first, then create the health checker once:

Suggested refactor

 		// Check if Python executor is available (socket configured)
 		var pythonHealthChecker admin.PythonHealthChecker
-		if _, err := os.Stat(os.Getenv("PYTHON_EXECUTOR_SOCKET")); err == nil {
-			sm := pythonbridge.GetStreamManager()
-			pythonHealthChecker = pythonbridge.NewPythonHealthAdapter(sm)
-		} else if _, err := os.Stat("/var/run/api-platform/python-executor.sock"); err == nil {
+		pythonSocket := os.Getenv("PYTHON_EXECUTOR_SOCKET")
+		if pythonSocket == "" {
+			pythonSocket = "/var/run/api-platform/python-executor.sock"
+		}
+		if _, err := os.Stat(pythonSocket); err == nil {
 			sm := pythonbridge.GetStreamManager()
 			pythonHealthChecker = pythonbridge.NewPythonHealthAdapter(sm)
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/cmd/policy-engine/main.go` around lines
242 - 251, The socket detection duplicates creation logic for
pythonHealthChecker; refactor by first resolving the socket path (check
os.Getenv("PYTHON_EXECUTOR_SOCKET") and if empty or not present fall back to
"/var/run/api-platform/python-executor.sock"), then Stat that resolved path once
and, if it exists, call pythonbridge.GetStreamManager() and
pythonbridge.NewPythonHealthAdapter(sm) to assign pythonHealthChecker; finally
pass pythonHealthChecker into admin.NewServer(&cfg.PolicyEngine.Admin, k, reg,
xdsSyncStatusProvider, healthProvider, pythonHealthChecker).

gateway/gateway-runtime/python-executor/executor/policy_loader.py (2)

95-97: Use logging.exception to preserve stack trace.

Catching a broad Exception and using logger.error loses the stack trace. Using logger.exception will automatically include the traceback, which is valuable for debugging policy loading failures.

♻️ Proposed fix

         except Exception as e:
-            logger.error(f"Failed to load policy {policy_key} from {module_path}: {e}")
+            logger.exception(f"Failed to load policy {policy_key} from {module_path}")
             return False

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/policy_loader.py` around
lines 95 - 97, The except block in policy_loader.py that catches Exception and
calls logger.error (for example in the load policy routine handling policy_key
and module_path) should be changed to use logger.exception to preserve the
traceback; update the exception handler around where policy loading occurs (the
except Exception as e: block) to call logger.exception with a descriptive
message including policy_key and module_path, then return False as before so the
stack trace is included in logs for debugging.

86-89: Simplify attribute access.

Using getattr with a constant attribute is unnecessary. Direct attribute access is cleaner and equivalent.

♻️ Proposed fix

-            factory = getattr(module, 'get_policy')
-            if not callable(factory):
-                logger.error(f"Policy module {module_path} 'get_policy' is not callable")
-                return False
+            factory = module.get_policy
+            if not callable(factory):
+                logger.error(f"Policy module {module_path} 'get_policy' is not callable")
+                return False

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/policy_loader.py` around
lines 86 - 89, Replace the getattr call with direct attribute access: instead of
factory = getattr(module, 'get_policy'), use factory = module.get_policy; keep
the existing callable check (if not callable(factory): ...). This simplifies
access to the module's get_policy attribute while preserving the current
behavior in policy_loader.py where the variable factory is used to validate and
load the policy.

gateway/gateway-builder/internal/discovery/manifest.go (1)

500-533: parseProcessingMode silently ignores unrecognized mode strings.

Invalid values like "INVALID" are silently ignored, leaving default values. Consider logging a warning for unrecognized values to help policy authors debug configuration issues.

♻️ Proposed fix

 func parseProcessingMode(config *policy.ProcessingModeConfig) *policy.ProcessingMode {
 	if config == nil {
 		// Default: process request headers only
 		return &policy.ProcessingMode{
 			RequestHeaderMode:  policy.HeaderModeProcess,
 			RequestBodyMode:    policy.BodyModeSkip,
 			ResponseHeaderMode: policy.HeaderModeSkip,
 			ResponseBodyMode:   policy.BodyModeSkip,
 		}
 	}

 	mode := &policy.ProcessingMode{
 		RequestHeaderMode:  policy.HeaderModeProcess,
 		RequestBodyMode:    policy.BodyModeSkip,
 		ResponseHeaderMode: policy.HeaderModeSkip,
 		ResponseBodyMode:   policy.BodyModeSkip,
 	}

 	if config.RequestHeaderMode == "SKIP" {
 		mode.RequestHeaderMode = policy.HeaderModeSkip
+	} else if config.RequestHeaderMode != "" && config.RequestHeaderMode != "PROCESS" {
+		slog.Warn("Unrecognized RequestHeaderMode, using default", "value", config.RequestHeaderMode)
 	}
 	if config.RequestBodyMode == "BUFFER" {
 		mode.RequestBodyMode = policy.BodyModeBuffer
+	} else if config.RequestBodyMode != "" && config.RequestBodyMode != "SKIP" {
+		slog.Warn("Unrecognized RequestBodyMode, using default", "value", config.RequestBodyMode)
 	}
 	// ... similar for other modes

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 500 -
533, parseProcessingMode currently ignores unrecognized string values (e.g.,
"INVALID"), leaving defaults silently; update parseProcessingMode to validate
each config field (RequestHeaderMode, RequestBodyMode, ResponseHeaderMode,
ResponseBodyMode) and emit a warning when a value is not one of the expected
literals ("SKIP"/"PROCESS" for headers, "BUFFER" for bodies) so authors can
debug misconfiguration; implement this by either (a) adding a logger parameter
to parseProcessingMode and calling logger.Warnf with the field name and invalid
value, or (b) using the package-level logger if one exists, and change the
per-field if checks to explicit switches or maps that fall through to the
warning path when encountering an unknown string, then keep existing defaults.

gateway/gateway-runtime/python-executor/executor/translator.py (2)

221-226: Unused method _proto_list_to_python.

This helper method is defined but never called. The list conversion is handled inline within _proto_value_to_python at line 217.

♻️ Proposed fix - remove unused method

-    `@staticmethod`
-    def _proto_list_to_python(list_value: ListValue) -> list:
-        """Convert a protobuf ListValue to a Python list."""
-        if list_value is None:
-            return []
-        return [Translator._proto_value_to_python(v) for v in list_value.values]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/translator.py` around lines
221 - 226, The method _proto_list_to_python is unused (list conversion is done
inline in _proto_value_to_python); remove the unused method
_proto_list_to_python from translator.py and also delete any now-unreferenced
imports or comments that only supported it, ensuring _proto_value_to_python
remains the single conversion path and tests/other callers still use
Translator._proto_value_to_python as before.

20-20: Unused import: NullValue.

NullValue is imported but not used in the code.

♻️ Proposed fix

-from google.protobuf.struct_pb2 import Struct, Value, ListValue, NullValue
+from google.protobuf.struct_pb2 import Struct, Value, ListValue

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/translator.py` at line 20,
The import line in translator.py includes an unused symbol NullValue from
google.protobuf.struct_pb2; remove NullValue from the import list so it reads
from google.protobuf.struct_pb2 import Struct, Value, ListValue (cleaning up the
unused import and avoiding linter warnings), ensuring any references to
NullValue in functions or classes (if present) are also handled or replaced
before removing.

gateway/gateway-builder/internal/policyengine/generator.go (2)

177-178: Log ignored errors when reading base requirements.

Silently ignoring the error from os.ReadFile can hide misconfiguration. A debug log would help troubleshoot missing base requirements.

♻️ Proposed fix

 	// 3. Merge requirements.txt files (including base requirements)
 	baseReqPath := filepath.Join(pythonOutputDir, "requirements.txt")
-	baseRequirements, _ := os.ReadFile(baseReqPath)
+	baseRequirements, err := os.ReadFile(baseReqPath)
+	if err != nil && !os.IsNotExist(err) {
+		slog.Debug("Could not read base requirements.txt", "path", baseReqPath, "error", err)
+	}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 177
- 178, The read of base requirements using os.ReadFile(baseReqPath) currently
ignores errors; update the code around baseReqPath (constructed with
filepath.Join and pythonOutputDir) to check the returned error, log a
debug/error message that includes baseReqPath and the error details, and handle
the failure (e.g., treat baseRequirements as empty or return the error up)
instead of swallowing it; reference the variables baseReqPath and
baseRequirements when adding the log and handling logic.

289-316: copyDir doesn't handle symlinks.

filepath.Walk follows symlinks by default, which could lead to unexpected behavior (infinite loops with circular symlinks, or copying unintended files). Consider using filepath.WalkDir and checking for symlinks.

♻️ Proposed fix using WalkDir with symlink handling

 // copyDir copies a directory recursively
 func copyDir(src, dst string) error {
-	return filepath.Walk(src, func(path string, info os.FileInfo, err error) error {
+	return filepath.WalkDir(src, func(path string, d os.DirEntry, err error) error {
 		if err != nil {
 			return err
 		}

+		// Skip symlinks to avoid potential issues
+		if d.Type()&os.ModeSymlink != 0 {
+			slog.Debug("Skipping symlink during copy", "path", path)
+			return nil
+		}
+
 		// Compute relative path
 		relPath, err := filepath.Rel(src, path)
 		if err != nil {
 			return err
 		}

 		dstPath := filepath.Join(dst, relPath)

-		if info.IsDir() {
+		if d.IsDir() {
+			info, err := d.Info()
+			if err != nil {
+				return err
+			}
 			return os.MkdirAll(dstPath, info.Mode())
 		}

 		// Copy file
 		data, err := os.ReadFile(path)
 		if err != nil {
 			return err
 		}

+		info, err := d.Info()
+		if err != nil {
+			return err
+		}
 		return os.WriteFile(dstPath, data, info.Mode())
 	})
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 289
- 316, copyDir currently uses filepath.Walk which follows symlinks and can cause
loops or copy unintended files; rewrite copyDir to use filepath.WalkDir and
inspect each DirEntry (or use os.Lstat on path) to detect if an entry is a
symlink (ModeSymlink). For symlinks, do not follow them: call os.Readlink to get
the link target and create a corresponding symlink at dst with os.Symlink
(preserving relative targets when possible), and for regular files/dirs continue
to create dirs or copy file contents as before. Ensure you avoid recursing into
symlinked directories and preserve file modes for non-symlink files (as in the
existing os.ReadFile/os.WriteFile and info.Mode usage).

gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go (1)

89-95: grpc.Dial, grpc.WithBlock, and grpc.WithTimeout are deprecated. In grpc-go v1.63+, grpc.Dial is deprecated in favor of grpc.NewClient, and the "Dial-era options" (WithBlock, WithTimeout) are ignored by NewClient. Migrate to grpc.NewClient with explicit context-based timeouts and readiness management (e.g., call conn.Connect() to wait for connection if needed).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go` around
lines 89 - 95, The current call using grpc.Dial with grpc.WithBlock and
grpc.WithTimeout is deprecated; replace it by creating a client with
grpc.NewClient and an explicit context timeout: build a context.Context with
context.WithTimeout for the 10s deadline, call grpc.NewClient (passing
grpc.WithContextDialer(dialer) and
grpc.WithTransportCredentials(insecure.NewCredentials()) as dial options), then
use the returned client/connection's Connect(ctx) to wait for readiness and
handle errors. Update the code paths that currently reference conn from
grpc.Dial to use the new client/conn returned by grpc.NewClient and ensure
cancellation of the timeout context after Connect returns.

gateway/gateway-runtime/python-executor/executor/server.py (3)

313-319: Consider moving import to module level.

The google.protobuf.struct_pb2.Struct import inside _dict_to_struct works but is unconventional. Moving it to the module-level imports improves clarity and avoids repeated import overhead on each call.
Proposed change

At the top of the file with other imports:
from google.protobuf.struct_pb2 import Struct
Then simplify the method:
     `@staticmethod`
     def _dict_to_struct(d: dict):
-        from google.protobuf.struct_pb2 import Struct
         s = Struct()
         if d:
             s.update(d)
         return s
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 313
- 319, The import of Struct is done inside the static method _dict_to_struct
which is unconventional and incurs repeated import overhead; move "from
google.protobuf.struct_pb2 import Struct" to the module-level imports and then
simplify _dict_to_struct to directly reference Struct (keeping the same
behavior: create s = Struct(), update if d, return s).
160-247: Well-structured concurrent fan-out implementation.

The bidirectional streaming with asyncio.Condition for coordination, Semaphore for concurrency limiting, and out-of-order response handling is correctly designed. The Go side correlating by request_id makes this pattern safe.

Minor style improvement per static analysis (line 237):
Optional: Use list unpacking instead of concatenation
-            all_tasks = [reader_task] + list(in_flight)
+            all_tasks = [reader_task, *in_flight]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 160
- 247, The code builds all_tasks with list concatenation ([reader_task] +
list(in_flight)) inside ExecuteStream; replace that concatenation with list
unpacking (e.g., [reader_task, *in_flight]) to be more idiomatic and avoid an
extra list() allocation—update the expression that constructs all_tasks
(referenced alongside reader_task and the in_flight set) accordingly.
68-74: Use asyncio.get_running_loop() instead of get_event_loop() in async contexts.

When called from within an async function, asyncio.get_event_loop() is discouraged and emits a DeprecationWarning in Python 3.10+. Use asyncio.get_running_loop() which is the recommended approach for accessing the event loop from inside coroutines.

This pattern appears at lines 69, 117, and 182. Apply the fix:
Example fix
-            loop = asyncio.get_event_loop()
+            loop = asyncio.get_running_loop()
Apply to all three locations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 68 -
74, Replace asyncio.get_event_loop() with asyncio.get_running_loop() in the
async contexts inside server.py where run_in_executor is used (specifically the
calls that use self._init_policy_sync and the other synchronous handlers invoked
via run_in_executor). Locate the three occurrences where the code does loop =
asyncio.get_event_loop() before calling loop.run_in_executor(...) and change
them to loop = asyncio.get_running_loop() so the coroutine uses the running
event loop; keep the subsequent run_in_executor(...) calls and arguments
unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 98f57ec5-1534-4663-97b6-bf83c074188c

📥 Commits

Reviewing files that changed from the base of the PR and between 8f0c592 and 7d09d99.

⛔ Files ignored due to path filters (3)

gateway/gateway-builder/go.sum is excluded by !**/*.sum
gateway/gateway-runtime/policy-engine/internal/pythonbridge/proto/python_executor.pb.go is excluded by !**/*.pb.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/proto/python_executor_grpc.pb.go is excluded by !**/*.pb.go

📒 Files selected for processing (52)

gateway/build-lock.yaml
gateway/build.yaml
gateway/gateway-builder/cmd/builder/main.go
gateway/gateway-builder/go.mod
gateway/gateway-builder/internal/buildfile/generator.go
gateway/gateway-builder/internal/discovery/discovery_test.go
gateway/gateway-builder/internal/discovery/manifest.go
gateway/gateway-builder/internal/discovery/policy.go
gateway/gateway-builder/internal/discovery/python_module.go
gateway/gateway-builder/internal/policyengine/generator.go
gateway/gateway-builder/internal/policyengine/policyengine_test.go
gateway/gateway-builder/internal/policyengine/registry.go
gateway/gateway-builder/internal/validation/structure.go
gateway/gateway-builder/internal/validation/validator.go
gateway/gateway-builder/pkg/types/manifest.go
gateway/gateway-builder/pkg/types/policy.go
gateway/gateway-builder/templates/plugin_registry.go.tmpl
gateway/gateway-controller/default-policies/add-python-header.yaml
gateway/gateway-controller/default-policies/prompt-compression.yaml
gateway/gateway-controller/default-policies/slugify-body-python.yaml
gateway/gateway-runtime/Dockerfile
gateway/gateway-runtime/Makefile
gateway/gateway-runtime/docker-entrypoint.sh
gateway/gateway-runtime/policy-engine/cmd/policy-engine/main.go
gateway/gateway-runtime/policy-engine/internal/admin/handlers.go
gateway/gateway-runtime/policy-engine/internal/admin/handlers_test.go
gateway/gateway-runtime/policy-engine/internal/admin/server.go
gateway/gateway-runtime/policy-engine/internal/admin/server_test.go
gateway/gateway-runtime/policy-engine/internal/admin/types.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go
gateway/gateway-runtime/proto/python_executor.proto
gateway/gateway-runtime/python-executor/executor/__init__.py
gateway/gateway-runtime/python-executor/executor/instance_store.py
gateway/gateway-runtime/python-executor/executor/policy_loader.py
gateway/gateway-runtime/python-executor/executor/server.py
gateway/gateway-runtime/python-executor/executor/translator.py
gateway/gateway-runtime/python-executor/main.py
gateway/gateway-runtime/python-executor/proto/__init__.py
gateway/gateway-runtime/python-executor/proto/python_executor_pb2.py
gateway/gateway-runtime/python-executor/proto/python_executor_pb2_grpc.py
gateway/gateway-runtime/python-executor/requirements.txt
gateway/gateway-runtime/python-executor/sdk/__init__.py
gateway/gateway-runtime/python-executor/sdk/policy.py
gateway/sample-policies/build.yaml
gateway/sample-policies/slugify-body-python/policy-definition.yaml
gateway/sample-policies/slugify-body-python/policy.py
gateway/sample-policies/slugify-body-python/requirements.txt
gateway/target/build.yaml
sdk/gateway/policy/v1alpha/definition.go

coderabbitai · 2026-03-12T08:35:58Z

+func findPolicyDir(installDir string) (string, error) {
+	var found string
+	err := filepath.Walk(installDir, func(path string, info os.FileInfo, err error) error {
+		if err != nil {
+			return err
+		}
+		if !info.IsDir() && info.Name() == "policy-definition.yaml" {
+			found = filepath.Dir(path)
+			return filepath.SkipAll
+		}
+		return nil
+	})
+	if err != nil {
+		return "", err
+	}
+	if found == "" {
+		return "", fmt.Errorf("no policy-definition.yaml found in installed package")
+	}
+	return found, nil


⚠️ Potential issue | 🟠 Major

Require exactly one policy-definition.yaml in the installed tree.

This returns the first match it finds. If the install tree contains multiple policy definitions, discovery silently picks one path instead of failing fast on an ambiguous package.

Suggested fix

func findPolicyDir(installDir string) (string, error) { - var found string + var matches []string err := filepath.Walk(installDir, func(path string, info os.FileInfo, err error) error { if err != nil { return err } if !info.IsDir() && info.Name() == "policy-definition.yaml" { - found = filepath.Dir(path) - return filepath.SkipAll + matches = append(matches, filepath.Dir(path)) } return nil }) if err != nil { return "", err } - if found == "" { + if len(matches) == 0 { return "", fmt.Errorf("no policy-definition.yaml found in installed package") } - return found, nil + if len(matches) > 1 { + return "", fmt.Errorf("multiple policy-definition.yaml files found in installed package") + } + return matches[0], nil }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 165 - 183, The function findPolicyDir currently stops at the first policy-definition.yaml and should instead ensure exactly one exists; modify findPolicyDir to scan the installDir and detect multiple matches by either collecting all matching directories or tracking the first found path and returning an error if a second match is encountered (e.g., include both paths in the error), and only return the single directory if exactly one match is found; keep references to the existing function name findPolicyDir and the "policy-definition.yaml" filename when implementing the change.

coderabbitai · 2026-03-12T08:35:58Z

+func (b *PythonBridge) OnRequest(ctx *policy.RequestContext, params map[string]interface{}) policy.RequestAction {
+	req := b.buildRequest(ctx, "on_request", params)
+
+	// Execute via stream manager
+	resp, err := b.streamManager.Execute(context.Background(), req)
+	if err != nil {
+		b.slogger.Error("Failed to execute Python policy", "error", err, "phase", "on_request")
+		return b.errorResponse(err)
+	}


⚠️ Potential issue | 🟠 Major

Request context is discarded in favor of context.Background().

Using context.Background() loses the original request's cancellation signals, deadlines, and tracing context. If the HTTP request is cancelled, the Python execution will continue unnecessarily.

🐛 Proposed fix - accept context parameter

The OnRequest and OnResponse methods should ideally receive a context.Context. If the SDK interface doesn't support this, consider extracting a context from RequestContext or using a context stored in the bridge.

For now, a workaround using a derived context:

func (b *PythonBridge) OnRequest(ctx *policy.RequestContext, params map[string]interface{}) policy.RequestAction { req := b.buildRequest(ctx, "on_request", params) + // TODO: Propagate proper context when SDK interface supports it + execCtx, cancel := context.WithTimeout(context.Background(), getTimeout()) + defer cancel() + // Execute via stream manager - resp, err := b.streamManager.Execute(context.Background(), req) + resp, err := b.streamManager.Execute(execCtx, req)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 57 - 65, The code is discarding the incoming request context by calling streamManager.Execute with context.Background(); change it to derive and pass the real request context from the RequestContext (e.g., use ctx.Context(), ctx.Request.Context(), or the bridge-stored context) when calling streamManager.Execute in PythonBridge.OnRequest, and make the equivalent change in PythonBridge.OnResponse so cancellation, deadlines and tracing propagate into buildRequest/streamManager.Execute.

CLAassistant · 2026-03-12T08:37:47Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/sample-policies/build.yaml`:
- Around line 14-15: The manifest uses the wrong directory name for the sample:
update the filePath value (the entry.FilePath that the gateway builder resolves
and validates) to match the actual sample directory (e.g., change
"./slugify-body" to the exact directory name "./slugify-body-python") so the
os.Stat() check in the discovery/manifest code and path validation in
fsutil/path_validation.go will succeed; ensure the filePath string exactly
matches the real directory name used in the repository.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 51f78fcc-54da-4b5b-bdc4-5ff7c9d3fe38

📥 Commits

Reviewing files that changed from the base of the PR and between 7d09d99 and 03812ba.

📒 Files selected for processing (1)

gateway/sample-policies/build.yaml

Update example PyPI package name in discovery code and tests from "wso2-gateway-policy-prompt-compression" to "my-gateway-policy" (comments and TestParsePipPackageRef). Remove two default policy YAMLs: gateway-controller/default-policies/add-python-header.yaml and prompt-compression.yaml.

Update .gitignore to exclude common Python artifacts (__pycache__, .venv, .pytest_cache, .coverage, .mypy_cache, .ipynb_checkpoints, egg metadata, etc.). Remove two sample Python policies (slugify-body-python and uppercase-body) from gateway/build-lock.yaml. Tweak gateway-runtime/docker-entrypoint.sh to check for python_policy_registry.py (generated only when Python policies exist) before starting the Python executor to avoid launching it when no Python policies are present.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (3)

gateway/gateway-builder/internal/discovery/python_module.go (3)

165-184: ⚠️ Potential issue | 🟠 Major

Fail fast when multiple policy-definition.yaml files are found.

The function returns the first match found. If the installed package contains multiple policy definitions (e.g., from transitive dependencies), discovery silently picks one instead of failing with an ambiguity error.

Suggested fix to detect and reject ambiguous packages

 func findPolicyDir(installDir string) (string, error) {
-	var found string
+	var matches []string
 	err := filepath.Walk(installDir, func(path string, info os.FileInfo, err error) error {
 		if err != nil {
 			return err
 		}
 		if !info.IsDir() && info.Name() == "policy-definition.yaml" {
-			found = filepath.Dir(path)
-			return filepath.SkipAll
+			matches = append(matches, filepath.Dir(path))
 		}
 		return nil
 	})
 	if err != nil {
 		return "", err
 	}
-	if found == "" {
+	if len(matches) == 0 {
 		return "", fmt.Errorf("no policy-definition.yaml found in installed package")
 	}
-	return found, nil
+	if len(matches) > 1 {
+		return "", fmt.Errorf("multiple policy-definition.yaml files found in installed package: %v", matches)
+	}
+	return matches[0], nil
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 165
- 184, The function findPolicyDir currently stops at the first
"policy-definition.yaml" (uses filepath.SkipAll) and returns that, which hides
ambiguity; change it to collect all matching directories during filepath.Walk
(remove filepath.SkipAll) into a slice or map to deduplicate, track the number
of unique matches, and after the walk return an error if count == 0 or count > 1
(ambiguous) and only return the single directory when exactly one match is
found; update variables used in findPolicyDir (e.g., replace/augment found with
a matches slice or set) and ensure error paths propagate the new ambiguous
error.

69-127: ⚠️ Potential issue | 🟠 Major

Redact credentials from logs and error messages.

The pipPackage reference and indexURL may contain credentials when using private indexes (e.g., https://user:token@host/simple). Line 69 logs the full reference, line 92 logs indexURL directly, and line 111 includes raw stderr which may echo credentials.

Suggested fix to sanitize sensitive data

-	slog.Info("Fetching pip package", "reference", pipPackage, "phase", "discovery")
-
 	pkgName, version, indexURL, err := ParsePipPackageRef(pipPackage)
 	if err != nil {
 		return nil, err
 	}
+	slog.Info("Fetching pip package",
+		"package", pkgName,
+		"version", version,
+		"phase", "discovery")

 	// ... later ...

 	slog.Debug("Running pip install",
 		"package", pipSpec,
 		"target", installDir,
-		"indexURL", indexURL,
+		"hasCustomIndex", indexURL != "",
 		"phase", "discovery")

 	// ... in error handling ...
-	return nil, fmt.Errorf("pip install failed for %s: %w; stderr: %s", pipSpec, err, stderr.String())
+	return nil, fmt.Errorf("pip install failed for %s: %w", pipSpec, err)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 69
- 127, The logs and error messages expose credentials from pipPackage, indexURL
and pip stderr; update discovery code in python_module.go to sanitize/mask
credentials before logging or including in errors: add a helper (e.g.,
sanitizeURL or maskCredentials) that strips user:pass@ from URLs and use it when
logging pipPackage, indexURL and when building the pip error message (use
sanitized stderr or redact any URLs found in stderr) in the pip install block
(references: pipPackage, indexURL, stderr, pipSpec, the exec.CommandContext call
and resolvePipExecutable call); ensure RemoveAll(installDir) remains but never
logs raw credential-bearing strings.

139-160: ⚠️ Potential issue | 🟠 Major

Fix parsing for index URLs containing credentials.

Using strings.LastIndex(ref, "@") misparses values like pkg==1.0.0@https://user:token@host/simple because the last @ is inside the URL, not the separator. The version would incorrectly absorb part of the URL.

Suggested fix: split by `==` first, then handle index URL

 func ParsePipPackageRef(ref string) (pkgName, version, indexURL string, err error) {
-	// Split off optional @<index-url> suffix
-	mainPart := ref
-	if idx := strings.LastIndex(ref, "@"); idx > 0 {
-		candidate := ref[idx+1:]
-		// Only treat as index URL if it looks like a URL (contains "://")
-		if strings.Contains(candidate, "://") {
-			indexURL = candidate
-			mainPart = ref[:idx]
-		}
-	}
-
-	// Split "package==version"
-	parts := strings.SplitN(mainPart, "==", 2)
-	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
-		return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>': %s", ref)
+	// Split "package==version[`@index-url`]" by == first
+	parts := strings.SplitN(ref, "==", 2)
+	if len(parts) != 2 {
+		return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]': %s", ref)
 	}
 
 	pkgName = strings.TrimSpace(parts[0])
-	version = strings.TrimSpace(parts[1])
+	versionPart := strings.TrimSpace(parts[1])
+
+	// Check for optional @<index-url> suffix in the version part
+	if idx := strings.Index(versionPart, "@"); idx >= 0 {
+		version = strings.TrimSpace(versionPart[:idx])
+		indexURL = strings.TrimSpace(versionPart[idx+1:])
+		if !strings.Contains(indexURL, "://") {
+			return "", "", "", fmt.Errorf("invalid pipPackage format, index URL must contain '://': %s", ref)
+		}
+	} else {
+		version = versionPart
+	}
+
+	if pkgName == "" || version == "" {
+		return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]': %s", ref)
+	}
 
 	return pkgName, version, indexURL, nil
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 139
- 160, ParsePipPackageRef currently finds the separator "@" before splitting by
"==", which breaks when the index URL contains credentials; change the logic to
first split ref by "==" (using strings.SplitN(ref, "==", 2)) to get pkg and
rightSide, validate both parts, then detect an index URL inside rightSide by
locating "://"; if a scheme is found, find the separator '@' that precedes that
scheme (e.g., use strings.LastIndex on rightSide up to the scheme index) and
treat the substring after that '@' as indexURL and the substring before it as
version; update variables pkgName, version, indexURL in ParsePipPackageRef
accordingly and keep existing error handling for invalid formats.

🧹 Nitpick comments (3)

gateway/gateway-builder/internal/discovery/python_module.go (2)

76-80: Document temp directory cleanup responsibility.

The installDir is created but only cleaned up on error paths. On success, PipPackageInfo.Dir points inside this temp directory, so the caller must handle cleanup. Consider documenting this contract or providing a cleanup helper.

Suggested documentation addition

 // FetchPipPackage downloads a Python policy package using pip.
 //
 // Format: "<package>==<version>[@<index-url>]"
 //
 // Examples:
 //   - "my-gateway-policy==1.0.0"
 //   - "my-org-auth-policy==2.3.0@https://pypi.my-company.com/simple"
 //
 // The package is installed into an isolated directory using pip install --target.
+//
+// NOTE: On success, the returned PipPackageInfo.Dir resides within a temporary
+// directory that the caller is responsible for cleaning up when no longer needed.
+// Use os.RemoveAll(filepath.Dir(info.Dir)) or track the parent temp directory.
 func FetchPipPackage(pipPackage string) (*PipPackageInfo, error) {

Also applies to: 117-121

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 76
- 80, The temp install directory created by os.MkdirTemp (installDir) is only
removed on error, yet PipPackageInfo.Dir is set to a path inside that temp dir
on success; update the code and/or docs to make cleanup responsibility explicit:
either add a clear comment/docstring on the function that returns
*PipPackageInfo stating callers must remove PipPackageInfo.Dir/installDir when
done, or provide a cleanup helper (e.g., a Close/Remove method or a returned
cleanup func) alongside PipPackageInfo to encapsulate os.RemoveAll of
installDir; ensure references to installDir and PipPackageInfo.Dir (and any
created helper) are added where the function is defined and where PipPackageInfo
is constructed so callers know how to release the temp directory.

41-49: Fallback to python3 without validation may cause confusing failures.

If neither pip3 nor pip is found, the function silently returns python3 without verifying it exists. This could produce a confusing "executable not found" error later during cmd.Run().

Suggested fix to validate fallback

 func resolvePipExecutable() (exe string, prefixArgs []string) {
 	for _, candidate := range []string{"pip3", "pip"} {
 		if _, err := exec.LookPath(candidate); err == nil {
 			return candidate, nil
 		}
 	}
 	// Fallback: python3 -m pip
+	if _, err := exec.LookPath("python3"); err != nil {
+		// Return python3 anyway; caller will get a clear error from exec
+		slog.Warn("No pip executable found; falling back to python3 -m pip", "phase", "discovery")
+	}
 	return "python3", []string{"-m", "pip"}
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 41
- 49, The fallback in resolvePipExecutable should validate that "python3" exists
before returning it; update resolvePipExecutable to call
exec.LookPath("python3") and only return "python3" with []string{"-m","pip"} if
LookPath succeeds, otherwise return an explicit missing indicator (e.g., "" and
nil) so callers know no pip executable was found; also update the caller(s) that
build/Run exec.Command (where cmd.Run() is used) to check for an empty exe and
return a clear error message instead of letting exec.Command/Run fail with a
confusing "executable not found" error.

gateway/gateway-builder/internal/discovery/discovery_test.go (1)

871-908: Consider adding a test case for credentialed index URLs.

The ParsePipPackageRef tests cover basic scenarios, but a test case with credentials in the index URL (e.g., pkg==1.0.0@https://user:token@host/simple) would help verify correct parsing behavior for private registries using basic auth.

Example test case

func TestParsePipPackageRef_WithCredentialedIndex(t *testing.T) {
	pkg, version, indexURL, err := ParsePipPackageRef("my-policy==1.0.0@https://user:token@pypi.private.com/simple")

	require.NoError(t, err)
	assert.Equal(t, "my-policy", pkg)
	assert.Equal(t, "1.0.0", version)
	assert.Equal(t, "https://user:token@pypi.private.com/simple", indexURL)
}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/discovery_test.go` around lines
871 - 908, Add a unit test that verifies ParsePipPackageRef correctly parses
index URLs containing credentials by adding a new test (e.g.,
TestParsePipPackageRef_WithCredentialedIndex) that calls ParsePipPackageRef with
an input like "my-policy==1.0.0@https://user:token@pypi.private.com/simple",
asserts no error, and checks pkg == "my-policy", version == "1.0.0" and indexURL
== "https://user:token@pypi.private.com/simple"; place the test alongside the
existing tests in discovery_test.go so it covers credentialed private registries
and ensures ParsePipPackageRef handles user:password in the host component.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/docker-entrypoint.sh`:
- Around line 104-107: The exported environment variables for the Python
executor are incorrect: replace or add exports so they match what main.py reads
(use PYTHON_POLICY_MAX_CONCURRENT with the same default of 100 instead of
PYTHON_POLICY_TIMEOUT), keep PYTHON_POLICY_WORKERS and PYTHON_EXECUTOR_SOCKET
as-is, and remove the unused PYTHON_POLICY_TIMEOUT export; also update the
related startup log message (the log near the executor export block) to report
PYTHON_POLICY_MAX_CONCURRENT (and its value) instead of PYTHON_POLICY_TIMEOUT so
logs reflect the actual configuration used by main.py.
- Around line 71-78: The --py.* flags collected into PY_ARGS are never used
because main.py reads only environment variables; update docker-entrypoint.sh to
translate any --py.<name> <value> pairs into environment variables before
invoking main.py (e.g., convert --py.foo bar into export FOO=bar or the specific
env var names main.py expects), remove the unused accumulation into PY_ARGS, and
ensure PY_ARGS is not passed to python; alternatively, if you prefer CLI
support, add argument parsing in main.py to consume these flags (reference:
PY_ARGS, docker-entrypoint.sh, main.py, Python Executor) so the flags actually
affect runtime configuration.

---

Duplicate comments:
In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 165-184: The function findPolicyDir currently stops at the first
"policy-definition.yaml" (uses filepath.SkipAll) and returns that, which hides
ambiguity; change it to collect all matching directories during filepath.Walk
(remove filepath.SkipAll) into a slice or map to deduplicate, track the number
of unique matches, and after the walk return an error if count == 0 or count > 1
(ambiguous) and only return the single directory when exactly one match is
found; update variables used in findPolicyDir (e.g., replace/augment found with
a matches slice or set) and ensure error paths propagate the new ambiguous
error.
- Around line 69-127: The logs and error messages expose credentials from
pipPackage, indexURL and pip stderr; update discovery code in python_module.go
to sanitize/mask credentials before logging or including in errors: add a helper
(e.g., sanitizeURL or maskCredentials) that strips user:pass@ from URLs and use
it when logging pipPackage, indexURL and when building the pip error message
(use sanitized stderr or redact any URLs found in stderr) in the pip install
block (references: pipPackage, indexURL, stderr, pipSpec, the
exec.CommandContext call and resolvePipExecutable call); ensure
RemoveAll(installDir) remains but never logs raw credential-bearing strings.
- Around line 139-160: ParsePipPackageRef currently finds the separator "@"
before splitting by "==", which breaks when the index URL contains credentials;
change the logic to first split ref by "==" (using strings.SplitN(ref, "==", 2))
to get pkg and rightSide, validate both parts, then detect an index URL inside
rightSide by locating "://"; if a scheme is found, find the separator '@' that
precedes that scheme (e.g., use strings.LastIndex on rightSide up to the scheme
index) and treat the substring after that '@' as indexURL and the substring
before it as version; update variables pkgName, version, indexURL in
ParsePipPackageRef accordingly and keep existing error handling for invalid
formats.

---

Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/discovery_test.go`:
- Around line 871-908: Add a unit test that verifies ParsePipPackageRef
correctly parses index URLs containing credentials by adding a new test (e.g.,
TestParsePipPackageRef_WithCredentialedIndex) that calls ParsePipPackageRef with
an input like "my-policy==1.0.0@https://user:token@pypi.private.com/simple",
asserts no error, and checks pkg == "my-policy", version == "1.0.0" and indexURL
== "https://user:token@pypi.private.com/simple"; place the test alongside the
existing tests in discovery_test.go so it covers credentialed private registries
and ensures ParsePipPackageRef handles user:password in the host component.

In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 76-80: The temp install directory created by os.MkdirTemp
(installDir) is only removed on error, yet PipPackageInfo.Dir is set to a path
inside that temp dir on success; update the code and/or docs to make cleanup
responsibility explicit: either add a clear comment/docstring on the function
that returns *PipPackageInfo stating callers must remove
PipPackageInfo.Dir/installDir when done, or provide a cleanup helper (e.g., a
Close/Remove method or a returned cleanup func) alongside PipPackageInfo to
encapsulate os.RemoveAll of installDir; ensure references to installDir and
PipPackageInfo.Dir (and any created helper) are added where the function is
defined and where PipPackageInfo is constructed so callers know how to release
the temp directory.
- Around line 41-49: The fallback in resolvePipExecutable should validate that
"python3" exists before returning it; update resolvePipExecutable to call
exec.LookPath("python3") and only return "python3" with []string{"-m","pip"} if
LookPath succeeds, otherwise return an explicit missing indicator (e.g., "" and
nil) so callers know no pip executable was found; also update the caller(s) that
build/Run exec.Command (where cmd.Run() is used) to check for an empty exe and
return a clear error message instead of letting exec.Command/Run fail with a
confusing "executable not found" error.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 61a52953-9abb-496b-9acb-1ca0b85a9b8c

📥 Commits

Reviewing files that changed from the base of the PR and between 03812ba and 2fdc9f8.

📒 Files selected for processing (5)

.gitignore
gateway/build-lock.yaml
gateway/gateway-builder/internal/discovery/discovery_test.go
gateway/gateway-builder/internal/discovery/python_module.go
gateway/gateway-runtime/docker-entrypoint.sh

✅ Files skipped from review due to trivial changes (1)

.gitignore

Fetch pip wheels (no deps) for Python policies during discovery, extract top_level module and source for build-time validation, and record pip spec/index so dependency installation can be deferred to the python-deps Docker stage. Added helpers to find the downloaded .whl, read top_level.txt, and extract the module from the wheel. Adjusted FetchPipPackage to use `pip download --no-deps` and return PipPackageInfo with PipSpec, PipIndexURL, TopLevelModule and extracted Dir. Updated GeneratePythonArtifacts to skip copying pip packages, use the package's top-level module in the generated python registry, and merge requirements by adding pip specs and extra index URLs while reading requirements.txt only for local policies. Extended DiscoveredPolicy with pip-related fields and clarified PythonSourceDir usage.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (5)

gateway/gateway-builder/internal/policyengine/generator.go (3)

236-250: Consider handling malformed version strings gracefully.

strings.Split(p.Version, ".")[0] will panic with an index-out-of-range if Version is an empty string. While upstream validation should prevent this, defensive coding would add robustness.

♻️ Proposed defensive check

 for _, p := range policies {
-	majorVersion := strings.Split(p.Version, ".")[0]
+	versionParts := strings.Split(p.Version, ".")
+	majorVersion := "0"
+	if len(versionParts) > 0 && versionParts[0] != "" {
+		majorVersion = versionParts[0]
+	}
 	key := fmt.Sprintf("%s:%s", p.Name, majorVersion)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 236
- 250, The loop over policies can panic when p.Version is empty; in the block
that computes majorVersion (currently using strings.Split(p.Version, ".")[0])
add a defensive check: trim and verify p.Version is non-empty (or use
strings.SplitN and default to "0") before indexing, then use that safe
majorVersion when building key; update the code around variables majorVersion,
key and the sb.WriteString call and keep sanitizeModuleName(p.Name) and the
IsPipPackage branch unchanged.

336-361: copyDir follows symlinks, which could lead to unexpected behavior or security issues.

filepath.Walk follows symlinks by default. If a policy directory contains a symlink pointing outside the source tree (or a circular symlink), this could copy unintended files or cause infinite loops.

♻️ Proposed fix to skip symlinks

 func copyDir(src, dst string) error {
 	return filepath.Walk(src, func(path string, info os.FileInfo, err error) error {
 		if err != nil {
 			return err
 		}
+
+		// Skip symlinks to avoid following links outside the source tree
+		if info.Mode()&os.ModeSymlink != 0 {
+			return nil
+		}

 		// Compute relative path
 		relPath, err := filepath.Rel(src, path)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 336
- 361, The copyDir function currently follows symlinks via filepath.Walk; update
it to detect and skip symlinks so external or circular links aren't followed.
Inside copyDir (or when switching to filepath.WalkDir), use os.Lstat (or
DirEntry.Type()/IsSymlink) to check whether the current path is a symlink and,
if so, return nil to skip it; otherwise proceed with creating directories and
copying files using the file's regular mode. Ensure both directory and file
handling branches perform the symlink check so no symlink targets are copied or
traversed.

329-333: sanitizeModuleName may not handle all invalid Python identifier characters.

Python module names cannot contain spaces, dots, or start with digits. Currently only hyphens are replaced with underscores. Consider validating/sanitizing more comprehensively to prevent import errors at runtime.

♻️ Proposed more thorough sanitization

 // sanitizeModuleName converts a policy name to a valid Python module name
 func sanitizeModuleName(name string) string {
-	// Replace hyphens with underscores
-	return strings.ReplaceAll(name, "-", "_")
+	// Replace hyphens with underscores
+	result := strings.ReplaceAll(name, "-", "_")
+	// Replace dots with underscores
+	result = strings.ReplaceAll(result, ".", "_")
+	// Ensure it doesn't start with a digit
+	if len(result) > 0 && result[0] >= '0' && result[0] <= '9' {
+		result = "_" + result
+	}
+	return result
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 329
- 333, sanitizeModuleName currently only replaces hyphens and can still produce
invalid Python module names; update sanitizeModuleName to (1) normalize to
lowercase, (2) replace any sequence of non-alphanumeric/underscore characters
(including spaces, dots, punctuation) with a single underscore, (3) trim
leading/trailing underscores, (4) if the result starts with a digit prefix it
with an underscore, and (5) fallback to a safe constant like "module" when the
result is empty — apply these rules in the sanitizeModuleName function so
generated names are valid Python identifiers and safe for imports.

gateway/gateway-builder/internal/discovery/manifest.go (2)

516-549: parseProcessingMode uses case-sensitive string comparison.

The mode strings are compared exactly (e.g., "SKIP", "BUFFER"). If the YAML contains lowercase or mixed-case values (e.g., "skip", "buffer"), they won't be recognized and the default will apply silently. Consider normalizing to uppercase for robustness.

♻️ Proposed fix to normalize case

+import "strings"
+
 func parseProcessingMode(config *policy.ProcessingModeConfig) *policy.ProcessingMode {
 	if config == nil {
 		// Default: process request headers only
 		return &policy.ProcessingMode{
 			RequestHeaderMode:  policy.HeaderModeProcess,
 			RequestBodyMode:    policy.BodyModeSkip,
 			ResponseHeaderMode: policy.HeaderModeSkip,
 			ResponseBodyMode:   policy.BodyModeSkip,
 		}
 	}

 	mode := &policy.ProcessingMode{
 		RequestHeaderMode:  policy.HeaderModeProcess,
 		RequestBodyMode:    policy.BodyModeSkip,
 		ResponseHeaderMode: policy.HeaderModeSkip,
 		ResponseBodyMode:   policy.BodyModeSkip,
 	}

-	if config.RequestHeaderMode == "SKIP" {
+	if strings.ToUpper(config.RequestHeaderMode) == "SKIP" {
 		mode.RequestHeaderMode = policy.HeaderModeSkip
 	}
-	if config.RequestBodyMode == "BUFFER" {
+	if strings.ToUpper(config.RequestBodyMode) == "BUFFER" {
 		mode.RequestBodyMode = policy.BodyModeBuffer
 	}
-	if config.ResponseHeaderMode == "PROCESS" {
+	if strings.ToUpper(config.ResponseHeaderMode) == "PROCESS" {
 		mode.ResponseHeaderMode = policy.HeaderModeProcess
 	}
-	if config.ResponseBodyMode == "BUFFER" {
+	if strings.ToUpper(config.ResponseBodyMode) == "BUFFER" {
 		mode.ResponseBodyMode = policy.BodyModeBuffer
 	}

 	return mode
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 516 -
549, The parseProcessingMode function compares config fields case-sensitively so
values like "skip" or "buffer" are ignored; normalize the incoming strings
before checks (e.g., call strings.ToUpper or use strings.EqualFold) when
inspecting config.RequestHeaderMode, config.RequestBodyMode,
config.ResponseHeaderMode, and config.ResponseBodyMode, then set
mode.RequestHeaderMode, mode.RequestBodyMode, mode.ResponseHeaderMode, and
mode.ResponseBodyMode accordingly; ensure you import strings if using ToUpper.

238-257: DetectRuntime only checks top-level directory for .py files.

If a Python policy has its .py files in a subdirectory (e.g., src/policy.py), this detection will incorrectly classify it as Go. Consider whether the convention requires top-level .py files or if recursive checking is needed.

♻️ Alternative: Check for policy.py specifically

 func DetectRuntime(policyDir string) string {
 	goMod := filepath.Join(policyDir, "go.mod")
 	if _, err := os.Stat(goMod); err == nil {
 		return "go"
 	}

-	entries, err := os.ReadDir(policyDir)
-	if err != nil {
-		return "go"
-	}
-	for _, e := range entries {
-		if !e.IsDir() && filepath.Ext(e.Name()) == ".py" {
-			return "python"
-		}
-	}
+	// Check for policy.py which is required for Python policies
+	policyPy := filepath.Join(policyDir, "policy.py")
+	if _, err := os.Stat(policyPy); err == nil {
+		return "python"
+	}

 	return "go"
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 238 -
257, DetectRuntime currently only inspects the policyDir top-level entries and
will miss Python policies whose .py files live in subdirectories; update
DetectRuntime to recursively scan the policyDir (e.g., use filepath.WalkDir or a
recursive os.ReadDir) and return "python" if any file with extension ".py" is
found (or, alternatively, specifically look for "policy.py" anywhere in the
tree), while preserving the existing error handling and returning "go" only if
no Python file is discovered.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 64-139: FetchPipPackage leaves the temporary downloadDir behind on
success; after calling findWheelFile and extractModuleFromWheel you must clean
up downloadDir before returning. Modify FetchPipPackage so that once extractDir
is prepared it removes the temporary downloadDir (e.g.
os.RemoveAll(downloadDir)) prior to the final return; if extractModuleFromWheel
currently returns a path inside downloadDir, first create a separate temp
directory (or move/copy the extracted module to a new temp dir) and set
PipPackageInfo.Dir to that new location, then remove downloadDir to avoid
deleting the returned extractDir; refer to FetchPipPackage, findWheelFile, and
extractModuleFromWheel to locate the relevant logic.

In `@gateway/gateway-builder/internal/policyengine/generator.go`:
- Around line 191-197: The code currently ignores the error from os.ReadFile
when loading baseReqPath; change the logic in the block around baseReqPath :=
filepath.Join(pythonOutputDir, "requirements.txt") and baseRequirements, _ :=
os.ReadFile(baseReqPath) to properly handle errors: call os.ReadFile and if it
returns an error check if os.IsNotExist(err) then treat baseRequirements as an
empty string, otherwise return the error (propagate it) so permission or I/O
errors are not silenced; keep the subsequent mergeRequirements(pythonPolicies,
string(baseRequirements)) call unchanged.

---

Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/manifest.go`:
- Around line 516-549: The parseProcessingMode function compares config fields
case-sensitively so values like "skip" or "buffer" are ignored; normalize the
incoming strings before checks (e.g., call strings.ToUpper or use
strings.EqualFold) when inspecting config.RequestHeaderMode,
config.RequestBodyMode, config.ResponseHeaderMode, and config.ResponseBodyMode,
then set mode.RequestHeaderMode, mode.RequestBodyMode, mode.ResponseHeaderMode,
and mode.ResponseBodyMode accordingly; ensure you import strings if using
ToUpper.
- Around line 238-257: DetectRuntime currently only inspects the policyDir
top-level entries and will miss Python policies whose .py files live in
subdirectories; update DetectRuntime to recursively scan the policyDir (e.g.,
use filepath.WalkDir or a recursive os.ReadDir) and return "python" if any file
with extension ".py" is found (or, alternatively, specifically look for
"policy.py" anywhere in the tree), while preserving the existing error handling
and returning "go" only if no Python file is discovered.

In `@gateway/gateway-builder/internal/policyengine/generator.go`:
- Around line 236-250: The loop over policies can panic when p.Version is empty;
in the block that computes majorVersion (currently using
strings.Split(p.Version, ".")[0]) add a defensive check: trim and verify
p.Version is non-empty (or use strings.SplitN and default to "0") before
indexing, then use that safe majorVersion when building key; update the code
around variables majorVersion, key and the sb.WriteString call and keep
sanitizeModuleName(p.Name) and the IsPipPackage branch unchanged.
- Around line 336-361: The copyDir function currently follows symlinks via
filepath.Walk; update it to detect and skip symlinks so external or circular
links aren't followed. Inside copyDir (or when switching to filepath.WalkDir),
use os.Lstat (or DirEntry.Type()/IsSymlink) to check whether the current path is
a symlink and, if so, return nil to skip it; otherwise proceed with creating
directories and copying files using the file's regular mode. Ensure both
directory and file handling branches perform the symlink check so no symlink
targets are copied or traversed.
- Around line 329-333: sanitizeModuleName currently only replaces hyphens and
can still produce invalid Python module names; update sanitizeModuleName to (1)
normalize to lowercase, (2) replace any sequence of non-alphanumeric/underscore
characters (including spaces, dots, punctuation) with a single underscore, (3)
trim leading/trailing underscores, (4) if the result starts with a digit prefix
it with an underscore, and (5) fallback to a safe constant like "module" when
the result is empty — apply these rules in the sanitizeModuleName function so
generated names are valid Python identifiers and safe for imports.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b0f49b13-b8e2-4ac5-94d6-55e8ff0806a4

📥 Commits

Reviewing files that changed from the base of the PR and between 2fdc9f8 and 2c46bd4.

📒 Files selected for processing (4)

gateway/gateway-builder/internal/discovery/manifest.go
gateway/gateway-builder/internal/discovery/python_module.go
gateway/gateway-builder/internal/policyengine/generator.go
gateway/gateway-builder/pkg/types/policy.go

Add sanitizeURL to redact credentials from index URLs in logs and errors to avoid leaking secrets. Update FetchPipPackage to log a sanitized indexURL, replace occurrences of the raw index URL in pip stderr on failure, and store the sanitized value in PipPackageInfo.IndexURL. Improve ParsePipPackageRef parsing and validation: split on '==', handle an '@<index-url>' suffix in the version part, trim values, and return clearer errors for invalid formats.

Add a unit test for parsing pip refs with credentialed indexes. Harden pip package fetching: document temporary dir ownership, improve pip executable resolution and return a clear error when no pip/python3 is found. Wire up CLI args in the Python executor (socket, workers, max-concurrent, log-level) so docker-entrypoint can override env defaults; update startup logging. Rename/replace PYTHON_POLICY_TIMEOUT with PYTHON_POLICY_MAX_CONCURRENT in the docker entrypoint and logs. Fix StreamManager.Connect to cancel/close previous streams/connections before creating new ones to avoid resource leaks. Adjust Python gRPC server shutdown call (use grace) and reduce sensitive logging in the sample slugify policy (log lengths instead of content). Update sample build file path for the Python slugify policy.

coderabbitai

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

gateway/gateway-builder/internal/discovery/discovery_test.go (1)
535-555: ⚠️ Potential issue | 🟠 Major

Don't treat missing or ambiguous policy dirs as Go.

These assertions lock in the fallback that makes an empty or typo'd Python policy path surface as a Go-specific go.mod failure (see Line 554). DetectRuntime should return an explicit unknown/error state here so discovery can report the actual problem.

Also applies to: 855-867
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/discovery_test.go` around lines
535 - 555, The test reveals that DiscoverPoliciesFromBuildFile currently treats
ambiguous/missing runtime files as Go and surfaces a go.mod error; update
DetectRuntime to return an explicit unknown/err state (not default to Go) when
it cannot determine runtime (no go.mod and no .py files), and modify
DiscoverPoliciesFromBuildFile to check for that unknown result and return a
clear, contextual error (e.g., "unable to detect runtime for policy X" or
propagate the DetectRuntime error) instead of proceeding to Go-specific parsing;
locate references to DetectRuntime and the runtime-switch logic inside
DiscoverPoliciesFromBuildFile to implement this behavior and add corresponding
error handling paths so the tests and other callers get an explicit
unknown-runtime error.

♻️ Duplicate comments (2)

gateway/gateway-builder/internal/discovery/python_module.go (2)

97-100: ⚠️ Potential issue | 🟠 Major

Still leaking downloadDir on the success path.

After extractModuleFromWheel succeeds, everything needed lives under extractDir; returning here without cleanup leaves the downloaded wheel temp directory behind on every successful fetch.

🧹 Proposed fix

 	downloadDir, err := os.MkdirTemp("", "pip-download-*")
 	if err != nil {
 		return nil, fmt.Errorf("failed to create temp directory: %w", err)
 	}
+	defer os.RemoveAll(downloadDir)

Also applies to: 152-172

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 97
- 100, The temp download directory (downloadDir) created with os.MkdirTemp in
python_module.go is not removed on the success path, leaking files; update the
logic around extractModuleFromWheel to ensure downloadDir is cleaned up after
successful extraction (either add a defer os.RemoveAll(downloadDir) immediately
after creating downloadDir, or explicitly call os.RemoveAll(downloadDir) after
extractModuleFromWheel returns successfully), and apply the same cleanup fix to
the other occurrence around lines 152-172; reference symbols: downloadDir,
extractDir, and extractModuleFromWheel.

177-201: ⚠️ Potential issue | 🟡 Minor

Reject malformed @ suffixes during parse.

pkg==1.0.0@garbage and pkg==@https://host currently pass this helper because the @ stays inside versionPart. That pushes a simple manifest validation error down into pip download.

🔎 Proposed fix

 	parts := strings.SplitN(ref, "==", 2)
 	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
-		return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>'")
+		return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'")
 	}
@@
-	if idx := strings.Index(versionPart, "@"); idx > 0 {
-		candidate := versionPart[idx+1:]
-		if strings.Contains(candidate, "://") {
-			indexURL = candidate
-			versionPart = versionPart[:idx]
-		}
+	if idx := strings.Index(versionPart, "@"); idx >= 0 {
+		if idx == 0 {
+			return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'")
+		}
+		candidate := strings.TrimSpace(versionPart[idx+1:])
+		if !strings.Contains(candidate, "://") {
+			return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'")
+		}
+		indexURL = candidate
+		versionPart = versionPart[:idx]
 	}
@@
 	if pkgName == "" || version == "" {
-		return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>'")
+		return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'")
 	}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 177
- 201, The ParsePipPackageRef function currently allows malformed '@' suffixes
to pass (e.g., "pkg==1.0.0@garbage" or "pkg==@https://host"); update
ParsePipPackageRef to explicitly reject any '@' in versionPart unless there's a
non-empty version before the '@' and the suffix after the '@' is a valid index
URL (contains "://") — i.e., find idx := strings.Index(versionPart, "@"); if idx
== 0 return an error, if idx > 0 ensure candidate := versionPart[idx+1:] is
non-empty and contains "://", then set indexURL=candidate and
versionPart=versionPart[:idx]; otherwise return the same invalid-format error.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/docker-entrypoint.sh`:
- Around line 159-160: The script currently relies on set -e and calls wait (the
bare wait invocation at the shown spot and the similar waits around lines
255-260), which can cause the script to exit immediately if a child returns
non-zero and skip the cleanup/termination logic; update the script so that waits
cannot trigger errexit—temporarily disable errexit (or otherwise ensure wait
failures are swallowed), capture each child exit status into a variable, and
then restore errexit before running the existing cleanup/termination logic so
cleanup always runs regardless of child exit codes.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go`:
- Around line 188-201: The disconnect handler is sending on client response
channels while holding sm.pendingMu which can block; modify handleDisconnect so
it does not perform blocking sends under the lock: inside the locked section
(using sm.pendingMu) copy sm.pendingReqs into a local map/slice and remove
entries from sm.pendingReqs, then unlock; after unlocking, iterate the copied
entries and send the proto.ExecutionResponse (or perform non-blocking sends with
select to drop if full) to each ch; this preserves the original messages
(proto.ExecutionResponse{...}) but ensures no goroutine can be blocked while
holding sm.pendingMu.
- Around line 101-107: The Connect method currently ignores the caller's ctx by
using grpc.Dial with grpc.WithTimeout(10*time.Second); replace that dial with
grpc.DialContext(ctx, sm.socketPath, grpc.WithContextDialer(dialer),
grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithBlock()) so
the caller's deadline/cancellation is honored and remove grpc.WithTimeout; keep
existing error handling and use the same sm.socketPath and dialer identifiers.

In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 177-185: The semaphore (concurrency_limit) is acquired inside
process_request, but tasks are spawned for every incoming message, allowing an
unbounded in_flight set; instead acquire the semaphore before creating a task so
task creation itself is bounded. Change the flow so the dispatcher awaits
concurrency_limit.acquire() (or uses async with concurrency_limit) before adding
a task to in_flight, then create an asyncio.create_task that runs a wrapper
which calls _execute_policy and in its finally block releases the semaphore and
removes the task from in_flight; refer to concurrency_limit, process_request,
_execute_policy and the in_flight set when locating where to move the
acquire/release.
- Around line 362-364: The code unconditionally removes self.socket_path which
can delete regular files; update the cleanup in server.py (where
self.socket_path is handled, e.g., in the server start/teardown code) to first
check that the path exists and is a UNIX domain socket using os.stat +
stat.S_ISSOCK(self.socket_path) (or equivalent), and only call os.remove when
that check passes; if the path exists but is not a socket, do not remove it and
instead log or raise a clear error; also catch and log exceptions from
os.stat/os.remove to avoid crashing.
- Line 51: Add a proper shutdown path: implement a close(self) method on
PythonExecutorServicer that calls self._executor.shutdown(wait=True) to stop its
ThreadPoolExecutor; in PythonExecutorServer.store the servicer instance on
self._servicer instead of a local variable in start(), and in
PythonExecutorServer.shutdown() call self._servicer.close(); also ensure any
ThreadPoolExecutor created for aio.server() (the server-side worker pool) is
kept as an instance variable (e.g., self._aio_worker_executor) and explicitly
shutdown with shutdown(wait=True) in shutdown(); finally locate and explicitly
shutdown the migration thread pool used by this module (call shutdown(wait=True)
on that pool) so no worker threads are leaked.

In `@gateway/gateway-runtime/python-executor/main.py`:
- Around line 42-51: The CLI currently allows 0 or negative values for --workers
and --max-concurrent which will break runtime concurrency primitives; add
validation in the argument parsing by replacing the raw int conversion with a
positive-int checker (e.g., a small helper function like positive_int(value)
used as the type for parser.add_argument) and call parser.error when value <= 0
so startup fails fast with a clear message; apply this change for both
occurrences of parser.add_argument for "--workers" and "--max-concurrent"
(including the second set around lines 104-110) so both locations validate > 0.
- Around line 115-117: The signal handler currently fires an untracked
fire-and-forget task (asyncio.create_task(server.shutdown())) which can race
with the main finally block and swallow exceptions; introduce a module-level
variable (e.g., shutdown_task) and change signal_handler to: if shutdown_task is
None or shutdown_task.done(): assign shutdown_task =
asyncio.create_task(server.shutdown()) and attach a done-callback that logs
exceptions, otherwise skip creating a new task; update the finally block that
currently awaits server.shutdown() to instead await shutdown_task if it exists
so server.shutdown() is only invoked once and any exceptions are surfaced and
logged.

In `@gateway/sample-policies/slugify-body-python/policy.py`:
- Around line 222-227: The int() conversion of params.get("maxLength") should be
wrapped in a try/except to avoid raising ValueError for non-numeric inputs;
update the code around max_length = int(params.get("maxLength", 0)) (in the
slugify-body policy) to catch ValueError (and TypeError), log a warning that the
provided maxLength is invalid and will be treated as unlimited/default, and set
max_length = 0 in the exception handler; keep the existing negative-value check
for max_length afterwards.

---

Outside diff comments:
In `@gateway/gateway-builder/internal/discovery/discovery_test.go`:
- Around line 535-555: The test reveals that DiscoverPoliciesFromBuildFile
currently treats ambiguous/missing runtime files as Go and surfaces a go.mod
error; update DetectRuntime to return an explicit unknown/err state (not default
to Go) when it cannot determine runtime (no go.mod and no .py files), and modify
DiscoverPoliciesFromBuildFile to check for that unknown result and return a
clear, contextual error (e.g., "unable to detect runtime for policy X" or
propagate the DetectRuntime error) instead of proceeding to Go-specific parsing;
locate references to DetectRuntime and the runtime-switch logic inside
DiscoverPoliciesFromBuildFile to implement this behavior and add corresponding
error handling paths so the tests and other callers get an explicit
unknown-runtime error.

---

Duplicate comments:
In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 97-100: The temp download directory (downloadDir) created with
os.MkdirTemp in python_module.go is not removed on the success path, leaking
files; update the logic around extractModuleFromWheel to ensure downloadDir is
cleaned up after successful extraction (either add a defer
os.RemoveAll(downloadDir) immediately after creating downloadDir, or explicitly
call os.RemoveAll(downloadDir) after extractModuleFromWheel returns
successfully), and apply the same cleanup fix to the other occurrence around
lines 152-172; reference symbols: downloadDir, extractDir, and
extractModuleFromWheel.
- Around line 177-201: The ParsePipPackageRef function currently allows
malformed '@' suffixes to pass (e.g., "pkg==1.0.0@garbage" or
"pkg==@https://host"); update ParsePipPackageRef to explicitly reject any '@' in
versionPart unless there's a non-empty version before the '@' and the suffix
after the '@' is a valid index URL (contains "://") — i.e., find idx :=
strings.Index(versionPart, "@"); if idx == 0 return an error, if idx > 0 ensure
candidate := versionPart[idx+1:] is non-empty and contains "://", then set
indexURL=candidate and versionPart=versionPart[:idx]; otherwise return the same
invalid-format error.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c08eab94-fbc3-48ea-b7d4-8ff374536c0c

📥 Commits

Reviewing files that changed from the base of the PR and between 2c46bd4 and ef48d5d.

📒 Files selected for processing (8)

gateway/gateway-builder/internal/discovery/discovery_test.go
gateway/gateway-builder/internal/discovery/python_module.go
gateway/gateway-runtime/docker-entrypoint.sh
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go
gateway/gateway-runtime/python-executor/executor/server.py
gateway/gateway-runtime/python-executor/main.py
gateway/sample-policies/build.yaml
gateway/sample-policies/slugify-body-python/policy.py

🚧 Files skipped from review as they are similar to previous changes (1)

gateway/sample-policies/build.yaml

Make Python executor and Python-bridge more robust: - docker-entrypoint.sh: Preserve full --py.* handling and export PYTHON_EXECUTOR_SOCKET for both --py.socket=<val> and --py.socket <val> forms so shell-provided flags stay in sync with the Go-side socket checks. - policy-engine/internal/pythonbridge/client.go: Add an atomic connID and pass stream/conn/connID into receiveLoop so stale goroutines cannot close newer connections; capture stream under connectMu before sending to avoid TOCTOU races; use stream.Send for the captured stream. - python-executor/executor/server.py: Reorder shutdown to stop accepting RPCs and thread-pool tasks first, then clear instances, and finally shut down IO/thread pools to avoid races during cleanup. - python-executor/main.py: Improve CLI/env handling for integer flags by deferring validation of env-derived defaults and allowing CLI overrides to take precedence; add helpers to detect provided flags and fall back on sane defaults when env vars are invalid. These changes fix race conditions, stale connection handling, and make startup/shutdown behavior more resilient and predictable.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Initialize response and protect shared state updates under the response_ready lock to avoid races. Only append non-None responses, discard in_flight task and notify waiters inside a finally block so deque/in_flight changes are observed atomically, and release the concurrency semaphore after notifying waiters so they re-check in_flight correctly. Also ensure reader_done is set while holding the lock and that notify is always called in the reader's finally block.

sehan-dissanayake · 2026-03-13T10:21:06Z

@coderabbitai resume

coderabbitai · 2026-03-13T10:21:13Z

✅ Actions performed

Reviews resumed.

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go`:
- Around line 331-339: The InitPolicy, DestroyPolicy, and HealthCheck methods
access sm.client without synchronizing with Connect which writes sm.client under
connectMu; protect reads by acquiring the same mutex (use
connectMu.RLock()/RUnlock() or Lock()/Unlock() around accesses to sm.client)
before checking IsConnected/using sm.client in InitPolicy, DestroyPolicy, and
HealthCheck so there is no data race with Connect which assigns sm.client.
- Around line 280-285: The select branch in Execute that reads from respCh
currently maps any ctx.Done() to a timeout error; change the ctx.Done() handling
(the select case using respCh and ctx.Done()) to inspect ctx.Err() and return a
cancellation error when ctx.Err() == context.Canceled (or return ctx.Err()
directly) and only return the timeout formatted error when ctx.Err() ==
context.DeadlineExceeded (or timeout matches); update the return in that branch
to use ctx.Err() or a descriptive message instead of always using
fmt.Errorf("timeout ...", timeout) so cancellation and timeout are
differentiated.

In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 142-156: DestroyPolicy currently removes the instance from
self._store and calls instance.close() immediately, which races with
_execute_policy that resolves the same instance later; change the lifecycle to
use an active-execution counter and deferred teardown: add/ensure an
active_count (or similar) field on the instance and update _execute_policy to
increment this counter before using the instance and decrement after
on_request/on_response completes, then modify _destroy_policy_sync to mark the
instance as pending_destroy (or set a flag) instead of removing/closing
immediately and only remove and call instance.close() once active_count reaches
zero (or perform removal/close in the decrement path when pending_destroy is
set); keep returning proto.DestroyPolicyResponse(success=True) as before when
scheduling teardown.
- Around line 59-61: The close() method currently calls
self._executor.shutdown(wait=True) which can block forever if a submitted policy
(via ExecuteStream() using run_in_executor()) is hung; change shutdown to a
non-blocking/interruptible form—either self._executor.shutdown(wait=False) or
self._executor.shutdown(wait=True, cancel_futures=True) and ensure you also
apply a bounded timeout/cleanup logic so shutdown cannot block indefinitely;
update every other site that calls shutdown (the other executor shutdown calls
around ExecuteStream/run_in_executor usage) to the same
non-blocking/cancel_futures-with-timeout pattern and ensure any necessary flags
are available in the Python version used.
- Around line 393-395: The code currently ignores the return value of
self.server.add_insecure_port(f"unix:{self.socket_path}") which returns 0 on
bind failure; modify the startup sequence in the server initialization (around
server.start in server.py) to capture the return value from add_insecure_port,
check if it equals 0, and if so log an error referencing self.socket_path and
raise an exception (or exit) before calling await self.server.start() and before
logging "Python Executor ready"; ensure the error path prevents startup
continuing when the UDS bind fails.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ac134c4d-d4df-4c44-b462-4f32a8fbe682

📥 Commits

Reviewing files that changed from the base of the PR and between 6c99a79 and e8ac1f8.

📒 Files selected for processing (5)

gateway/gateway-builder/internal/discovery/discovery_test.go
gateway/gateway-runtime/docker-entrypoint.sh
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go
gateway/gateway-runtime/python-executor/executor/server.py
gateway/gateway-runtime/python-executor/main.py

🚧 Files skipped from review as they are similar to previous changes (1)

gateway/gateway-runtime/docker-entrypoint.sh

coderabbitai · 2026-03-13T10:34:01Z

+    def close(self):
+        """Shutdown the thread pool executor."""
+        self._executor.shutdown(wait=True)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Python asyncio, does cancelling a coroutine that is awaiting loop.run_in_executor(...)cancel the underlyingThreadPoolExecutorjob, and how doesThreadPoolExecutor.shutdown(wait=True, cancel_futures=True) behave?

💡 Result:

Cancelling an asyncio Task that’s currently await loop.run_in_executor(...) does not reliably stop the underlying thread job. run_in_executor() submits work to a concurrent.futures.Executor and returns an asyncio.Future you can cancel/await, but cancelling that await only affects the awaiting side; the function already running in a worker thread generally keeps running because concurrent.futures.Future.cancel() cannot cancel once execution has started. [1], [2], [3]

What cancellation can do: if the executor job has not started yet, cancellation may succeed (because Future.cancel() can cancel only before the call begins). If it is running (or finished), cancel() returns False and the work continues. [1]

ThreadPoolExecutor.shutdown(wait=True, cancel_futures=True) behavior:

Stops accepting new submissions.

Cancels only futures that are still pending / not started.

With wait=True, it waits for already-started work to finish before returning; running (or completed) futures “won’t be cancelled” even with cancel_futures=True. [1]

Sources:
[1] Python docs: concurrent.futures (Future.cancel(), Executor.shutdown(cancel_futures=...)) https://docs.python.org/3/library/concurrent.futures.html
[2] Python docs: asyncio event loop run_in_executor() https://docs.python.org/3/library/asyncio-eventloop.html
[3] CPython issue documenting that cancelling run_in_executor does not stop the thread python/cpython#107505

🏁 Script executed:

cd gateway/gateway-runtime/python-executor/executor && head -n 65 server.py | tail -n 10

Repository: wso2/api-platform

Length of output: 458

🏁 Script executed:

cd gateway/gateway-runtime/python-executor/executor && sed -n '180,200p' server.py

Repository: wso2/api-platform

Length of output: 931

🏁 Script executed:

cd gateway/gateway-runtime/python-executor/executor && sed -n '405,415p' server.py

Repository: wso2/api-platform

Length of output: 489

🏁 Script executed:

cd gateway/gateway-runtime/python-executor/executor && sed -n '20,70p' server.py | grep -n "ExecuteStream\|run_in_executor" -A 3 -B 1

Repository: wso2/api-platform

Length of output: 43

close() can block shutdown indefinitely on stuck policy calls.

ExecuteStream() submits user policy code to the thread pool via run_in_executor(). When a stream is cancelled while work is running, the underlying thread job does not stop—concurrent.futures.Future.cancel() fails once execution has begun. Consequently, shutdown(wait=True) here blocks indefinitely waiting for that thread to complete. A single hung policy can therefore stall process exit well past the 5-second gRPC grace period.

Use shutdown(wait=False) or shutdown(wait=True, cancel_futures=True) with a timeout to allow graceful termination without indefinite blocking.

Also applies to: 188-192, 410-411

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 59 - 61, The close() method currently calls self._executor.shutdown(wait=True) which can block forever if a submitted policy (via ExecuteStream() using run_in_executor()) is hung; change shutdown to a non-blocking/interruptible form—either self._executor.shutdown(wait=False) or self._executor.shutdown(wait=True, cancel_futures=True) and ensure you also apply a bounded timeout/cleanup logic so shutdown cannot block indefinitely; update every other site that calls shutdown (the other executor shutdown calls around ExecuteStream/run_in_executor usage) to the same non-blocking/cancel_futures-with-timeout pattern and ensure any necessary flags are available in the Python version used.

Ensure in-flight policy executions aren't closed while running and avoid nil Python client RPCs. Add active_count and pending_destroy to InstanceRecord and new store methods (acquire_for_execution, release_execution, mark_for_destruction) to track active uses and mark instances for delayed destruction. Update server to mark instances for destruction, defer close until active_count reaches zero, and wrap execution with acquire/release logic. Also add mutex-protected client lookup and nil checks in the Go pythonbridge client (InitPolicy, DestroyPolicy, HealthCheck) to prevent RPCs when disconnected.

coderabbitai

🧹 Nitpick comments (1)

gateway/gateway-runtime/python-executor/executor/server.py (1)

251-258: Minor style suggestion: list unpacking instead of concatenation.

Static analysis suggests using [reader_task, *list(in_flight)] instead of concatenation, which is slightly more idiomatic.

Suggested fix

-            all_tasks = [reader_task] + list(in_flight)
+            all_tasks = [reader_task, *list(in_flight)]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 251
- 258, Replace the list concatenation used to build all_tasks with list
unpacking to be more idiomatic: in the asyncio.CancelledError except block where
reader_task and in_flight are combined (variables reader_task, in_flight, and
all_tasks), create all_tasks using [reader_task, *list(in_flight)] instead of
[reader_task] + list(in_flight); keep the surrounding logic (cancellation,
shielded gather, re-raise) unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 251-258: Replace the list concatenation used to build all_tasks
with list unpacking to be more idiomatic: in the asyncio.CancelledError except
block where reader_task and in_flight are combined (variables reader_task,
in_flight, and all_tasks), create all_tasks using [reader_task,
*list(in_flight)] instead of [reader_task] + list(in_flight); keep the
surrounding logic (cancellation, shielded gather, re-raise) unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7b25c53e-4d86-469b-946f-df1824d80d11

📥 Commits

Reviewing files that changed from the base of the PR and between e8ac1f8 and 623a6bc.

📒 Files selected for processing (3)

gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go
gateway/gateway-runtime/python-executor/executor/instance_store.py
gateway/gateway-runtime/python-executor/executor/server.py

Make Python bridge operations more robust: buildRequest/buildResponseRequest and buildSharedContext now return errors and use toProtoStruct with error handling; OnRequest/OnResponse log and handle build errors. Introduce sync.Once and closeErr for Close to make destruction safe for concurrent/multiple calls. Precompute PYTHON_POLICY_TIMEOUT at package init and use getTimeout wrapper. Update BridgeFactory to propagate param conversion errors. Change toProtoStruct to return (Struct, error) and adjust callers. Python executor: import Struct at module level, ensure concurrency semaphore is released if task creation fails, and remove redundant local import.

coderabbitai

♻️ Duplicate comments (2)

gateway/gateway-runtime/python-executor/executor/server.py (2)

410-412: ⚠️ Potential issue | 🔴 Critical

Fail fast when Unix socket bind is rejected.

Line 410 ignores the return from add_insecure_port(...). If bind fails (0), startup can continue and still log readiness on Line 412.

Suggested fix

-        self.server.add_insecure_port(f"unix:{self.socket_path}")
+        bind_result = self.server.add_insecure_port(f"unix:{self.socket_path}")
+        if bind_result == 0:
+            raise RuntimeError(
+                f"Failed to bind Python Executor to unix:{self.socket_path}"
+            )
         await self.server.start()

For grpc.aio.Server.add_insecure_port in Python, confirm the return value semantics on UDS: what does 0 mean, and should startup fail when 0 is returned?

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 410
- 412, Check the return value of
self.server.add_insecure_port(f"unix:{self.socket_path}") and fail fast if it
returns 0 (indicating bind failure on the UDS); specifically, after calling
add_insecure_port capture its return (e.g., rv =
self.server.add_insecure_port(...)), and if rv == 0 then log an error via
logger.error including the socket path and either raise an exception or abort
startup (do not proceed to await self.server.start() or logger.info). This
ensures startup does not report "Python Executor ready" when the Unix socket
bind was rejected.

60-63: ⚠️ Potential issue | 🟠 Major

Avoid potentially unbounded shutdown blocking on policy executor close.

Line 62 uses shutdown(wait=True). If a policy call running via run_in_executor gets stuck, shutdown can block indefinitely when Line 428 calls self._servicer.close().

Suggested fix

     def close(self):
         """Shutdown the thread pool executor."""
-        self._executor.shutdown(wait=True)
+        # Don't block indefinitely on hung user policy code.
+        self._executor.shutdown(wait=False, cancel_futures=True)

In Python asyncio, if a task awaiting loop.run_in_executor(...) is cancelled, does the underlying ThreadPoolExecutor job continue running? Also confirm shutdown(wait=True) vs shutdown(wait=False, cancel_futures=True) behavior for running tasks.

Also applies to: 427-428

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 60 -
63, The close method currently calls self._executor.shutdown(wait=True) which
can block indefinitely if a run_in_executor job is stuck; change it to a
non-blocking shutdown and cancel outstanding work by calling
self._executor.shutdown(wait=False, cancel_futures=True) (or
shutdown(wait=False) on older Python with additional cleanup) so
_servicer.close() won't hang; also ensure any code using loop.run_in_executor
handles cancellation correctly since cancelling the asyncio Task does not stop
the underlying ThreadPoolExecutor worker unless cancel_futures=True is used.

🧹 Nitpick comments (2)

gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
254-258: Potential key collision when flattening AuthContext.Properties.

Properties are merged directly into the map after reserved keys (subject, issuer, auth_type, etc.). If ac.Properties contains any of these keys, it will silently overwrite them. Consider adding a prefix or checking for collisions.
♻️ Proposed fix - prefix properties to avoid collision
 	// Flatten properties
 	for k, v := range ac.Properties {
-		m[k] = v
+		m["prop_"+k] = v
 	}
Alternatively, validate that properties don't contain reserved keys.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 254 - 258, The loop that flattens AuthContext.Properties into the map m
(iterating ac.Properties) can silently overwrite reserved keys like "subject",
"issuer", "auth_type", etc.; update the flattening in the function/method that
builds m so it detects collisions or namespaces properties: either skip or
throw/log on keys that match reserved names, or prepend a stable prefix (e.g.,
"properties." or "prop_") to each ac.Properties key before inserting into m;
ensure you update all references to m that expect these property names
accordingly and keep the insertion logic around the "for k, v := range
ac.Properties" loop as the change point.
gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go (1)
48-52: Consider sanitizing error messages returned to clients.

The error message from Python is directly included in the response body. This could leak internal implementation details to API consumers. Consider logging the full error server-side and returning a generic message to clients.
♻️ Proposed fix
 		return policy.ImmediateResponse{
 			StatusCode: 500,
 			Headers:    map[string]string{"Content-Type": "text/plain"},
-			Body:       []byte(errResult.Message),
+			Body:       []byte("Internal policy error"),
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go`
around lines 48 - 52, The code in translator.go currently returns
errResult.Message directly in policy.ImmediateResponse.Body (via
errResult.Message) which can leak internal details; instead log the full
errResult.Message (and stack/trace if available) to the server logger and set
the response body to a generic message such as "Internal server error" or "An
error occurred" before returning the policy.ImmediateResponse in the function
that constructs the 500 response; keep the StatusCode and Headers unchanged but
do not expose errResult.Message to clients.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 410-412: Check the return value of
self.server.add_insecure_port(f"unix:{self.socket_path}") and fail fast if it
returns 0 (indicating bind failure on the UDS); specifically, after calling
add_insecure_port capture its return (e.g., rv =
self.server.add_insecure_port(...)), and if rv == 0 then log an error via
logger.error including the socket path and either raise an exception or abort
startup (do not proceed to await self.server.start() or logger.info). This
ensures startup does not report "Python Executor ready" when the Unix socket
bind was rejected.
- Around line 60-63: The close method currently calls
self._executor.shutdown(wait=True) which can block indefinitely if a
run_in_executor job is stuck; change it to a non-blocking shutdown and cancel
outstanding work by calling self._executor.shutdown(wait=False,
cancel_futures=True) (or shutdown(wait=False) on older Python with additional
cleanup) so _servicer.close() won't hang; also ensure any code using
loop.run_in_executor handles cancellation correctly since cancelling the asyncio
Task does not stop the underlying ThreadPoolExecutor worker unless
cancel_futures=True is used.

---

Nitpick comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 254-258: The loop that flattens AuthContext.Properties into the
map m (iterating ac.Properties) can silently overwrite reserved keys like
"subject", "issuer", "auth_type", etc.; update the flattening in the
function/method that builds m so it detects collisions or namespaces properties:
either skip or throw/log on keys that match reserved names, or prepend a stable
prefix (e.g., "properties." or "prop_") to each ac.Properties key before
inserting into m; ensure you update all references to m that expect these
property names accordingly and keep the insertion logic around the "for k, v :=
range ac.Properties" loop as the change point.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go`:
- Around line 48-52: The code in translator.go currently returns
errResult.Message directly in policy.ImmediateResponse.Body (via
errResult.Message) which can leak internal details; instead log the full
errResult.Message (and stack/trace if available) to the server logger and set
the response body to a generic message such as "Internal server error" or "An
error occurred" before returning the policy.ImmediateResponse in the function
that constructs the 500 response; keep the StatusCode and Headers unchanged but
do not expose errResult.Message to clients.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7c41cf0a-2849-479a-9865-3860c8dafde3

📥 Commits

Reviewing files that changed from the base of the PR and between 623a6bc and 4c9eacd.

📒 Files selected for processing (5)

gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go
gateway/gateway-runtime/python-executor/executor/server.py

Introduce a configurable execution timeout and use a shared ThreadPoolExecutor across the Python executor. Pass timeout via CLI (--timeout) and env (PYTHON_POLICY_TIMEOUT) and log it in the docker entrypoint. Enforce per-request timeouts with asyncio.wait_for and return a timeout error response; servicer no longer owns executor shutdown. Also: namespace auth properties with a "prop_" prefix to avoid reserved key collisions, improve context error handling in the pythonbridge client, handle missing base requirements.txt safely when merging, check UNIX socket bind result and improve shutdown/cleanup logic and logging.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
59-77: ⚠️ Potential issue | 🟠 Major

Request context is discarded in favor of context.Background().

Using context.Background() loses the original request's cancellation signals and deadlines. If the HTTP request is cancelled, the Python execution will continue unnecessarily until the configured timeout expires.

This is a limitation of the policy.Policy SDK interface which doesn't pass context.Context to OnRequest/OnResponse. While Execute() applies its own timeout, cancellation signals from the original request are lost.

Consider tracking this as a future SDK enhancement to propagate request context.
🔧 Potential workaround - derive context with timeout
 func (b *PythonBridge) OnRequest(ctx *policy.RequestContext, params map[string]interface{}) policy.RequestAction {
 	req, err := b.buildRequest(ctx, "on_request", params)
 	if err != nil {
 		b.slogger.Error("Failed to build request", "error", err, "phase", "on_request")
 		return b.errorResponse(err)
 	}

-	resp, err := b.streamManager.Execute(context.Background(), req)
+	// TODO: Propagate proper context when SDK interface supports it.
+	// For now, apply the configured timeout to avoid unbounded waits.
+	execCtx, cancel := context.WithTimeout(context.Background(), getTimeout())
+	defer cancel()
+
+	resp, err := b.streamManager.Execute(execCtx, req)
Apply the same pattern to OnResponse at line 87.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 59 - 77, The code currently calls streamManager.Execute with
context.Background(), discarding request cancellation; change it to derive a
cancellable context from the incoming request context (use the request's
SharedContext available on policy.RequestContext) and apply the same execution
timeout used for Python calls (e.g., create ctxWithTimeout via
context.WithTimeout(ctx.SharedContext, <appropriate timeout>) and defer
cancel()) before calling b.streamManager.Execute(ctxWithTimeout, req) so
cancellations/deadlines propagate; apply the identical pattern in the OnResponse
handler as well.

🧹 Nitpick comments (2)

gateway/gateway-runtime/python-executor/main.py (1)

32-41: Chain the original exception for better traceability.

The static analysis correctly flags that raise ... from err should be used to preserve the exception chain, making debugging easier.

♻️ Proposed fix

 def positive_int(value):
     """Validate that the value is a positive integer."""
     try:
         ivalue = int(value)
         if ivalue <= 0:
             raise ValueError
         return ivalue
-    except (ValueError, TypeError):
-        raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer")
+    except (ValueError, TypeError) as err:
+        raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from err

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/main.py` around lines 32 - 41, The
positive_int validator currently raises argparse.ArgumentTypeError without
chaining the original exception; modify the except block in function
positive_int to capture the caught exception (e.g., except (ValueError,
TypeError) as err:) and re-raise argparse.ArgumentTypeError(f"'{value}' is not a
positive integer") from err so the original traceback is preserved for
debugging.

gateway/gateway-builder/internal/policyengine/generator.go (1)

343-370: Consider using filepath.WalkDir for better performance.

filepath.Walk calls os.Lstat on every file, while filepath.WalkDir avoids the stat call when not needed. For large policy directories, this can improve copy performance.

♻️ Proposed refactor using WalkDir

 // copyDir copies a directory recursively
 func copyDir(src, dst string) error {
-	return filepath.Walk(src, func(path string, info os.FileInfo, err error) error {
+	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
 		if err != nil {
 			return err
 		}

 		// Compute relative path
 		relPath, err := filepath.Rel(src, path)
 		if err != nil {
 			return err
 		}

 		dstPath := filepath.Join(dst, relPath)

-		if info.IsDir() {
-			return os.MkdirAll(dstPath, info.Mode())
+		if d.IsDir() {
+			return os.MkdirAll(dstPath, 0755)
 		}

 		// Copy file
 		data, err := os.ReadFile(path)
 		if err != nil {
 			return err
 		}

-		return os.WriteFile(dstPath, data, info.Mode())
+		info, err := d.Info()
+		if err != nil {
+			return err
+		}
+		return os.WriteFile(dstPath, data, info.Mode())
 	})
 }

Note: Add "io/fs" to imports.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 343
- 370, The copyDir function should be refactored to use filepath.WalkDir for
better performance: replace filepath.Walk(src, func(path string, info
os.FileInfo, err error) error { ... }) with filepath.WalkDir(src, func(path
string, d fs.DirEntry, err error) error { ... }), compute relPath the same way,
build dstPath, call d.IsDir() to handle directories and call d.Info() only when
you need the FileInfo (to get Mode()) for os.MkdirAll(dstPath, info.Mode()) and
for files use os.ReadFile(path) and os.WriteFile(dstPath, data, info.Mode()).
Preserve existing error handling and return semantics of copyDir and add "io/fs"
to imports.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/docker-entrypoint.sh`:
- Around line 112-117: The environment variable PYTHON_POLICY_TIMEOUT is set as
a bare number but the Go bridge (pythonbridge/client.go where time.ParseDuration
is used) expects a duration string; change the default export in
docker-entrypoint.sh to include a unit (e.g., export
PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30s}") and then update main.py’s
timeout handling to accept duration strings (parse with a duration parser and
fall back to integer seconds for backwards compatibility) so both consumers
interpret the value consistently; reference PYTHON_POLICY_TIMEOUT,
pythonbridge/client.go (time.ParseDuration usage) and main.py (current integer
parsing) when making the edits.

---

Duplicate comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 59-77: The code currently calls streamManager.Execute with
context.Background(), discarding request cancellation; change it to derive a
cancellable context from the incoming request context (use the request's
SharedContext available on policy.RequestContext) and apply the same execution
timeout used for Python calls (e.g., create ctxWithTimeout via
context.WithTimeout(ctx.SharedContext, <appropriate timeout>) and defer
cancel()) before calling b.streamManager.Execute(ctxWithTimeout, req) so
cancellations/deadlines propagate; apply the identical pattern in the OnResponse
handler as well.

---

Nitpick comments:
In `@gateway/gateway-builder/internal/policyengine/generator.go`:
- Around line 343-370: The copyDir function should be refactored to use
filepath.WalkDir for better performance: replace filepath.Walk(src, func(path
string, info os.FileInfo, err error) error { ... }) with filepath.WalkDir(src,
func(path string, d fs.DirEntry, err error) error { ... }), compute relPath the
same way, build dstPath, call d.IsDir() to handle directories and call d.Info()
only when you need the FileInfo (to get Mode()) for os.MkdirAll(dstPath,
info.Mode()) and for files use os.ReadFile(path) and os.WriteFile(dstPath, data,
info.Mode()). Preserve existing error handling and return semantics of copyDir
and add "io/fs" to imports.

In `@gateway/gateway-runtime/python-executor/main.py`:
- Around line 32-41: The positive_int validator currently raises
argparse.ArgumentTypeError without chaining the original exception; modify the
except block in function positive_int to capture the caught exception (e.g.,
except (ValueError, TypeError) as err:) and re-raise
argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from err so
the original traceback is preserved for debugging.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 73d266db-4963-4a3a-a71f-984c14b48f21

📥 Commits

Reviewing files that changed from the base of the PR and between 4c9eacd and 451aca5.

📒 Files selected for processing (6)

gateway/gateway-builder/internal/policyengine/generator.go
gateway/gateway-runtime/docker-entrypoint.sh
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go
gateway/gateway-runtime/python-executor/executor/server.py
gateway/gateway-runtime/python-executor/main.py

coderabbitai · 2026-03-17T06:53:54Z

+# Python Executor configuration
+export PYTHON_POLICY_WORKERS="${PYTHON_POLICY_WORKERS:-4}"
+export PYTHON_POLICY_MAX_CONCURRENT="${PYTHON_POLICY_MAX_CONCURRENT:-100}"
+export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30}"
+export PYTHON_EXECUTOR_SOCKET="${PYTHON_EXECUTOR_SOCKET:-/var/run/api-platform/python-executor.sock}"
+


⚠️ Potential issue | 🟡 Minor

Timeout format mismatch between Python and Go consumers.

PYTHON_POLICY_TIMEOUT is exported as a bare number (30), but the Go Python bridge at pythonbridge/client.go:404-407 parses it using time.ParseDuration(s), which requires a unit suffix (e.g., 30s). While the Go code falls back to 30 * time.Second on parse failure, this creates a silent configuration mismatch.

The Python executor (main.py) expects an integer, so it works correctly there.

🔧 Proposed fix - use duration format

# Python Executor configuration export PYTHON_POLICY_WORKERS="${PYTHON_POLICY_WORKERS:-4}" export PYTHON_POLICY_MAX_CONCURRENT="${PYTHON_POLICY_MAX_CONCURRENT:-100}" -export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30}" +export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30s}" export PYTHON_EXECUTOR_SOCKET="${PYTHON_EXECUTOR_SOCKET:-/var/run/api-platform/python-executor.sock}"

Then update main.py to parse duration strings, or alternatively update client.go to parse integer seconds.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/docker-entrypoint.sh` around lines 112 - 117, The environment variable PYTHON_POLICY_TIMEOUT is set as a bare number but the Go bridge (pythonbridge/client.go where time.ParseDuration is used) expects a duration string; change the default export in docker-entrypoint.sh to include a unit (e.g., export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30s}") and then update main.py’s timeout handling to accept duration strings (parse with a duration parser and fall back to integer seconds for backwards compatibility) so both consumers interpret the value consistently; reference PYTHON_POLICY_TIMEOUT, pythonbridge/client.go (time.ParseDuration usage) and main.py (current integer parsing) when making the edits.

Stop parsing ProcessingMode from policy-definition.yaml at build time and instead obtain processing requirements from the Python executor during InitPolicy. Removed ProcessingMode fields and helpers from the builder and templates, added a ProcessingMode message to the python_executor proto and included it in InitPolicyResponse, and added runtime conversion logic (processingModeFromProto) in the BridgeFactory. Enforced a fixed python-executor socket path in the pythonbridge (no env override) and updated docker-entrypoint/main to reject --py.socket overrides and use the fixed socket path. Standardized the /health response (Status/Timestamp/Reason), integrated python executor health checking, and updated/added tests accordingly. Also removed local SDK replace and sample-policies build context/copy from the runtime image and updated go.sum to reference the SDK release.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (7)

gateway/gateway-builder/internal/policyengine/registry.go (1)
154-162: formatSystemParams is dead code that silently discards parameters.

This function always returns "nil" even when params is not nil. While the current template handles SystemParameters inline and doesn't call this function, having it registered in the FuncMap creates confusion and risk if someone tries to use it later.

Either remove this function and its registration, or implement it properly.
Option 1: Remove dead code
-	tmpl, err := template.New("plugin_registry").Funcs(template.FuncMap{
-		"formatSystemParams": formatSystemParams,
-	}).Parse(templates.PluginRegistryTemplate)
+	tmpl, err := template.New("plugin_registry").Parse(templates.PluginRegistryTemplate)
And remove lines 154-162 entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/policyengine/registry.go` around lines 154 -
162, The function formatSystemParams currently always returns "nil" and should
either be removed along with its registration in the template FuncMap or
implemented to produce a valid Go literal for the input map; update the code
accordingly by modifying formatSystemParams to iterate over the params map and
build a Go map literal string (e.g., "map[string]interface{}{...}" with properly
quoted keys and Go-typed values) so templates can use it, or delete the
formatSystemParams function and remove any reference to it from the FuncMap to
avoid dead code and confusion.
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go (1)
402-414: Silent fallback on timeout parse failure.

If PYTHON_POLICY_TIMEOUT is set to a value without a unit suffix (e.g., 60 instead of 60s), time.ParseDuration fails and the code silently falls back to 30 seconds. Consider logging a warning or parsing integer-only values as seconds for better UX.
Proposed fix
 var pythonPolicyTimeout = func() time.Duration {
 	if s := os.Getenv("PYTHON_POLICY_TIMEOUT"); s != "" {
 		if d, err := time.ParseDuration(s); err == nil {
 			return d
 		}
+		// Try parsing as integer seconds for backward compatibility
+		if secs, err := strconv.Atoi(s); err == nil {
+			slog.Warn("PYTHON_POLICY_TIMEOUT should use duration format (e.g., '30s'), interpreting as seconds",
+				"value", s)
+			return time.Duration(secs) * time.Second
+		}
+		slog.Warn("Invalid PYTHON_POLICY_TIMEOUT value, using default",
+			"value", s, "default", "30s")
 	}
 	return 30 * time.Second
 }()
This requires adding "strconv" to imports.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go` around
lines 402 - 414, The env parsing for PYTHON_POLICY_TIMEOUT in the package-level
variable pythonPolicyTimeout silently falls back to 30s when time.ParseDuration
fails (e.g., "60" -> error); update the initializer for pythonPolicyTimeout to
(1) try time.ParseDuration as now, (2) if that fails attempt to parse the string
as an integer number of seconds using strconv.Atoi (or strconv.ParseInt) and
convert to time.Duration, and (3) log a warning via the package logger or
log.Printf when the provided value is invalid and the fallback to default is
used; ensure you add "strconv" to the imports and keep getTimeout() unchanged.
gateway/gateway-runtime/proto/python_executor.proto (1)
20-22: Buf lint: package directory mismatch (informational).

Buf reports that files with package wso2.gateway.python.v1 should be in a wso2/gateway/python/v1 directory. This is a structural lint rule that may conflict with your monorepo layout. If Buf is used in CI, consider either:

Adding a buf.yaml exception for this path

Reorganizing proto files to match the package structure

This doesn't affect runtime correctness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/proto/python_executor.proto` around lines 20 - 22,
Buf lint warns that the proto package declaration "package
wso2.gateway.python.v1" does not match its directory layout; resolve this by
either adding a Buf exception for this proto file in your Buf configuration
(buf.yaml) to allow the current placement, or move the file so its directory
path matches the package (e.g., wso2/gateway/python/v1). Ensure the change
references the same proto by checking the "package wso2.gateway.python.v1"
declaration (and the existing option go_package) so imports and generated code
continue to match after the change.
gateway/gateway-runtime/python-executor/sdk/policy.py (1)
177-185: Consider making close() abstract or suppressing the linter warning.

The close() method is intentionally a default no-op (as documented), but Ruff flags it as B027 because it's an empty method in an ABC without @abstractmethod. Two options:

Keep as-is and add # noqa: B027 to silence the warning

Make it @abstractmethod with a default implementation using super() pattern

The current design is valid—it mirrors typical resource-cleanup patterns where close() is optional. Adding a brief inline comment or noqa would clarify intent.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/sdk/policy.py` around lines 177 -
185, The empty close() method in the ABC is intentionally a no-op but triggers
Ruff B027; to silence this, add an explicit inline comment explaining it's a
deliberate no-op and append "# noqa: B027" to the method (i.e., update the
close(self) -> None: block to include the explanatory comment and the noqa
marker), or alternatively convert close into an `@abstractmethod` with a default
implementation pattern if you want subclasses to be forced to override; choose
the noqa approach for minimal change and clarity.
gateway/gateway-runtime/python-executor/main.py (1)
42-43: Chain the exception for better debugging context.

When re-raising as ArgumentTypeError, preserve the original exception context using from. This helps with debugging by showing the full exception chain.
Proposed fix
     except (ValueError, TypeError):
-        raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer")
+        raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/python-executor/main.py` around lines 42 - 43, The
except block that currently does "except (ValueError, TypeError): raise
argparse.ArgumentTypeError(f"'{value}' is not a positive integer")" should
capture the original exception and chain it when re-raising; change it to
"except (ValueError, TypeError) as e:" and re-raise the ArgumentTypeError with
"from e" (i.e., raise argparse.ArgumentTypeError(f"'{value}' is not a positive
integer") from e) so the original exception context is preserved for debugging
while keeping the same message and using the existing variable names.
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
286-303: Error messages may leak internal details to clients.

The error response bodies include the raw error message: fmt.Sprintf("Policy execution error: %v", err). This could expose internal implementation details, stack traces, or sensitive information to API clients.

Consider sanitizing error messages for client responses while logging full details server-side:
func (b *PythonBridge) errorResponse(err error) policy.RequestAction {
    b.slogger.Error("Policy execution failed", "error", err)
    return policy.ImmediateResponse{
        StatusCode: 500,
        Headers:    map[string]string{"Content-Type": "text/plain"},
        Body:       []byte("Internal policy error"),
    }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 286 - 303, The errorResponse and errorResponseAction functions currently
include raw err details in client-facing bodies; change both to log the full
error server-side (e.g., b.slogger.Error or equivalent) and return sanitized,
generic client messages (e.g., "Internal policy error") while preserving status
codes and headers (Content-Type: text/plain); update errorResponse to log the
error then return policy.ImmediateResponse with generic Body, and update
errorResponseAction to log the error then return
policy.UpstreamResponseModifications with a generic Body and StatusCode pointer,
ensuring no internal error strings are sent to clients.
gateway/gateway-builder/internal/discovery/manifest.go (1)
237-256: DetectRuntime silently defaults to "go" on directory read errors.

If os.ReadDir fails (e.g., permission denied), the function returns "go" without logging the error. This could mask legitimate filesystem issues and lead to confusing downstream errors when Go discovery fails on what's actually a Python directory.

Consider logging the error or propagating it:
Proposed fix
 func DetectRuntime(policyDir string) string {
 	goMod := filepath.Join(policyDir, "go.mod")
 	if _, err := os.Stat(goMod); err == nil {
 		return "go"
 	}

 	entries, err := os.ReadDir(policyDir)
 	if err != nil {
+		slog.Warn("Failed to read policy directory for runtime detection, defaulting to go",
+			"path", policyDir,
+			"error", err)
 		return "go"
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 237 -
256, DetectRuntime currently swallows os.ReadDir errors and always returns "go";
change DetectRuntime to return (string, error), keep the existing go.mod check,
and if os.ReadDir(policyDir) returns an error return "", the wrapped error (or
annotate with context) instead of defaulting to "go"; preserve the logic that
returns "python" when a .py file is found and otherwise return "go" with nil
error; update any call sites that call DetectRuntime to handle the (runtime,
error) return.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.go`:
- Around line 103-119: The processingModeFromProto function currently only
checks for empty strings but must also validate that each string corresponds to
a known enum constant; update processingModeFromProto (taking
*proto.ProcessingMode and returning policy.ProcessingMode) to map/validate
pm.RequestHeaderMode, pm.RequestBodyMode, pm.ResponseHeaderMode, and
pm.ResponseBodyMode against the supported values (the
policy.HeaderProcessingMode and policy.BodyProcessingMode enum constants) and
return an error if any value is unrecognized; implement the checks with explicit
branch logic (switch or lookup) for each field and only construct and return
policy.ProcessingMode when all four fields are valid.

---

Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/manifest.go`:
- Around line 237-256: DetectRuntime currently swallows os.ReadDir errors and
always returns "go"; change DetectRuntime to return (string, error), keep the
existing go.mod check, and if os.ReadDir(policyDir) returns an error return "",
the wrapped error (or annotate with context) instead of defaulting to "go";
preserve the logic that returns "python" when a .py file is found and otherwise
return "go" with nil error; update any call sites that call DetectRuntime to
handle the (runtime, error) return.

In `@gateway/gateway-builder/internal/policyengine/registry.go`:
- Around line 154-162: The function formatSystemParams currently always returns
"nil" and should either be removed along with its registration in the template
FuncMap or implemented to produce a valid Go literal for the input map; update
the code accordingly by modifying formatSystemParams to iterate over the params
map and build a Go map literal string (e.g., "map[string]interface{}{...}" with
properly quoted keys and Go-typed values) so templates can use it, or delete the
formatSystemParams function and remove any reference to it from the FuncMap to
avoid dead code and confusion.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 286-303: The errorResponse and errorResponseAction functions
currently include raw err details in client-facing bodies; change both to log
the full error server-side (e.g., b.slogger.Error or equivalent) and return
sanitized, generic client messages (e.g., "Internal policy error") while
preserving status codes and headers (Content-Type: text/plain); update
errorResponse to log the error then return policy.ImmediateResponse with generic
Body, and update errorResponseAction to log the error then return
policy.UpstreamResponseModifications with a generic Body and StatusCode pointer,
ensuring no internal error strings are sent to clients.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go`:
- Around line 402-414: The env parsing for PYTHON_POLICY_TIMEOUT in the
package-level variable pythonPolicyTimeout silently falls back to 30s when
time.ParseDuration fails (e.g., "60" -> error); update the initializer for
pythonPolicyTimeout to (1) try time.ParseDuration as now, (2) if that fails
attempt to parse the string as an integer number of seconds using strconv.Atoi
(or strconv.ParseInt) and convert to time.Duration, and (3) log a warning via
the package logger or log.Printf when the provided value is invalid and the
fallback to default is used; ensure you add "strconv" to the imports and keep
getTimeout() unchanged.

In `@gateway/gateway-runtime/proto/python_executor.proto`:
- Around line 20-22: Buf lint warns that the proto package declaration "package
wso2.gateway.python.v1" does not match its directory layout; resolve this by
either adding a Buf exception for this proto file in your Buf configuration
(buf.yaml) to allow the current placement, or move the file so its directory
path matches the package (e.g., wso2/gateway/python/v1). Ensure the change
references the same proto by checking the "package wso2.gateway.python.v1"
declaration (and the existing option go_package) so imports and generated code
continue to match after the change.

In `@gateway/gateway-runtime/python-executor/main.py`:
- Around line 42-43: The except block that currently does "except (ValueError,
TypeError): raise argparse.ArgumentTypeError(f"'{value}' is not a positive
integer")" should capture the original exception and chain it when re-raising;
change it to "except (ValueError, TypeError) as e:" and re-raise the
ArgumentTypeError with "from e" (i.e., raise
argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from e) so
the original exception context is preserved for debugging while keeping the same
message and using the existing variable names.

In `@gateway/gateway-runtime/python-executor/sdk/policy.py`:
- Around line 177-185: The empty close() method in the ABC is intentionally a
no-op but triggers Ruff B027; to silence this, add an explicit inline comment
explaining it's a deliberate no-op and append "# noqa: B027" to the method
(i.e., update the close(self) -> None: block to include the explanatory comment
and the noqa marker), or alternatively convert close into an `@abstractmethod`
with a default implementation pattern if you want subclasses to be forced to
override; choose the noqa approach for minimal change and clarity.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a7043647-8fd0-4365-a04b-ff23ee216068

📥 Commits

Reviewing files that changed from the base of the PR and between 451aca5 and 0c0e4b9.

⛔ Files ignored due to path filters (1)

gateway/gateway-runtime/policy-engine/internal/pythonbridge/proto/python_executor.pb.go is excluded by !**/*.pb.go

📒 Files selected for processing (23)

gateway/build-lock.yaml
gateway/gateway-builder/internal/discovery/discovery_test.go
gateway/gateway-builder/internal/discovery/manifest.go
gateway/gateway-builder/internal/policyengine/generator.go
gateway/gateway-builder/internal/policyengine/registry.go
gateway/gateway-builder/pkg/types/policy.go
gateway/gateway-builder/templates/plugin_registry.go.tmpl
gateway/gateway-runtime/Dockerfile
gateway/gateway-runtime/docker-entrypoint.sh
gateway/gateway-runtime/policy-engine/cmd/policy-engine/main.go
gateway/gateway-runtime/policy-engine/internal/admin/handlers.go
gateway/gateway-runtime/policy-engine/internal/admin/handlers_test.go
gateway/gateway-runtime/policy-engine/internal/admin/types.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.go
gateway/gateway-runtime/proto/python_executor.proto
gateway/gateway-runtime/python-executor/executor/server.py
gateway/gateway-runtime/python-executor/main.py
gateway/gateway-runtime/python-executor/proto/python_executor_pb2.py
gateway/gateway-runtime/python-executor/sdk/policy.py
gateway/sample-policies/slugify-body-python/policy-definition.yaml
gateway/sample-policies/slugify-body-python/policy.py

✅ Files skipped from review due to trivial changes (3)

gateway/gateway-runtime/policy-engine/internal/admin/types.go
gateway/sample-policies/slugify-body-python/policy-definition.yaml
gateway/gateway-runtime/python-executor/proto/python_executor_pb2.py

🚧 Files skipped from review as they are similar to previous changes (5)

gateway/gateway-builder/pkg/types/policy.go
gateway/sample-policies/slugify-body-python/policy.py
gateway/gateway-runtime/Dockerfile
gateway/gateway-builder/internal/policyengine/generator.go
gateway/build-lock.yaml

Refactor policy runtime detection to return (string, error) and propagate failures from DiscoverPoliciesFromBuildFile; update tests to check for errors. Harden ProcessingMode parsing with explicit enum validation and clearer error messages. Simplify plugin registry template parsing and remove an unused formatting helper. Improve Python bridge error handling by logging failures and returning generic internal-error responses (including Content-Type header for response actions). Chain exceptions in python-executor argparse error for proper context.

Merges 1032 upstream commits into the Python Policy feature branch. Conflict resolutions: - build-manifest.yaml: took upstream (v1.x policy versions) - plugin_registry.go.tmpl: merged upstream SDK path with Python loops - Dockerfile: merged upstream base (Go 1.26.1, sdk-core, wso2 user, llm-pricing) with Python feature (python-deps stage, python-executor, PYTHONPATH) - sample-policies/build.yaml: upstream policies + Python sample Post-merge updates: - Renamed build-lock.yaml refs to build-manifest.yaml in feature tests Note: pythonbridge SDK imports still reference sdk/gateway/policy/v1alpha and need manual migration to sdk/core/policy/v1alpha2.

…ha2 interface Introduce a redesigned pythonbridge that implements streaming RPCs and richer execution metadata (trace, deadline) and converts between Go and proto streaming payloads. Add StreamManager improvements (UDS dialer, suppressed late responses, typed StreamRequest/StreamResponse handling) and refactor bridge methods to handle header/body/stream phases, chunking and needs-more decisions. Add Makefile targets for proto generation and Python unit tests; regenerate/extend proto files and add python-executor SDK, executor code and unit tests. Remove sample/default Python policy YAMLs and update policy engine tests to verify copying of python-executor base files and bridge behavior.

sehan-dissanayake · 2026-04-10T05:16:32Z

@coderabbitai resume

coderabbitai · 2026-04-10T05:16:39Z

✅ Actions performed

Reviews resumed.

coderabbitai

🧹 Nitpick comments (1)

gateway/gateway-builder/internal/discovery/manifest.go (1)
243-262: Consider returning an error when no runtime indicators are found.

The default fallback to "go" at line 261 may mask configuration errors. If a directory lacks both go.mod and .py files, it defaults to Go and fails later in ValidateDirectoryStructure. While downstream validation will catch this, returning an error here would provide a clearer diagnostic earlier.
🔧 Optional: Return explicit error for ambiguous directories
 	for _, e := range entries {
 		if !e.IsDir() && filepath.Ext(e.Name()) == ".py" {
 			return "python", nil
 		}
 	}
 
-	return "go", nil
+	return "", fmt.Errorf("unable to detect runtime for %s: no go.mod or .py files found", policyDir)
 }
Note: This would require updating callers to handle the error case explicitly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 243 -
262, DetectRuntime currently defaults to returning "go" when neither go.mod nor
.py files are found, which can mask config issues; change DetectRuntime to
return an explicit error instead of "go" in the ambiguous case (update the final
return to something like return "", fmt.Errorf(...)) so callers get an immediate
clear diagnostic; ensure the error message references the policyDir and mentions
missing runtime indicators; update any callers (e.g., callers that call
DetectRuntime before ValidateDirectoryStructure) to handle the returned error
appropriately.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/manifest.go`:
- Around line 243-262: DetectRuntime currently defaults to returning "go" when
neither go.mod nor .py files are found, which can mask config issues; change
DetectRuntime to return an explicit error instead of "go" in the ambiguous case
(update the final return to something like return "", fmt.Errorf(...)) so
callers get an immediate clear diagnostic; ensure the error message references
the policyDir and mentions missing runtime indicators; update any callers (e.g.,
callers that call DetectRuntime before ValidateDirectoryStructure) to handle the
returned error appropriately.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9d1bdd5d-79f8-4a69-ab83-295ba67d70cc

📥 Commits

Reviewing files that changed from the base of the PR and between 451aca5 and 8b52625.

📒 Files selected for processing (11)

.gitignore
gateway/gateway-builder/cmd/builder/main.go
gateway/gateway-builder/internal/buildfile/generator.go
gateway/gateway-builder/internal/discovery/discovery_test.go
gateway/gateway-builder/internal/discovery/manifest.go
gateway/gateway-builder/internal/discovery/policy.go
gateway/gateway-builder/internal/policyengine/generator.go
gateway/gateway-builder/internal/policyengine/policyengine_test.go
gateway/gateway-builder/internal/policyengine/registry.go
gateway/gateway-builder/pkg/types/policy.go
gateway/gateway-builder/templates/plugin_registry.go.tmpl

✅ Files skipped from review due to trivial changes (3)

.gitignore
gateway/gateway-builder/internal/discovery/discovery_test.go
gateway/gateway-builder/internal/policyengine/generator.go

🚧 Files skipped from review as they are similar to previous changes (2)

gateway/gateway-builder/internal/policyengine/policyengine_test.go
gateway/gateway-builder/pkg/types/policy.go

Refine the bridge comment to note that the engine gates phase execution via Mode and that the bridge also performs internal checks as defense-in-depth. Add TestBuildPolicyChain_UsesBridgeModeForParticipation to ensure Kernel.BuildPolicyChain respects a bridge's ProcessingMode when determining RequiresRequestHeader and SupportsRequestStreaming. The test registers a dummy python policy and exercises buffered vs streaming modes, and adds necessary imports for kernel and registry.

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (1)

gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
166-188: ⚠️ Potential issue | 🟠 Major

Propagate real request context into the needs-more RPCs.

These two paths still use context.Background(), so streaming decision calls lose cancellation, deadlines, and trace propagation. That means a canceled request can keep waiting on the Python executor until the default timeout expires. If the current streaming-policy interface does not expose a context, the fix likely needs to thread one through that interface or persist the active request context on the bridge before issuing these calls.
#!/bin/bash
set -euo pipefail

echo "Streaming policy interface signatures:"
rg -n -C2 'NeedsMore(Request|Response)Data\(' sdk gateway/gateway-runtime/policy-engine

echo
echo "Current bridge call sites using context.Background():"
rg -n -C2 'NeedsMore(Request|Response)Data|context\.Background\(' \
  gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go
Also applies to: 216-238
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 166 - 188, The NeedsMoreRequestData path is using context.Background()
which drops cancellation, deadlines and traces; replace those Background
contexts by propagating the active request context into the streaming decision
calls: thread a context parameter through the
NeedsMoreRequestData/NeedsMoreResponseData call sites (or store the active
request context on the bridge before invoking buildNeedsMoreRequestDataRequest /
buildNeedsMoreResponseDataRequest), then pass that context into
buildNeedsMoreRequestDataRequest, streamManager.Execute and any translator calls
instead of context.Background(); update the signatures impacted (e.g.
NeedsMoreRequestData, buildNeedsMoreRequestDataRequest, streamManager.Execute
invocation sites and the symmetric response methods) so cancellations and
deadlines on the incoming request are honored and traces propagate to the Python
executor.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/Dockerfile`:
- Around line 124-128: The python-deps stage uses envoyproxy/envoy:v1.35.3 which
mismatches the production stage's envoyproxy/envoy:v1.37.1 and can break
compiled C extensions; update the Dockerfile so both stages use the same image
version—either change the FROM line for the python-deps stage to match the
production tag (production) or introduce a top-level ARG (e.g., ENVOY_VERSION)
and reference it from both FROM lines (python-deps and production) to guarantee
consistency.

In `@gateway/gateway-runtime/Makefile`:
- Around line 165-167: The test-proto Makefile target currently regenerates
protobuf bindings and then compiles them, which can silently overwrite committed
generated files; modify the test-proto target to (1) run the existing
regeneration commands (the ones that produce the outputs referenced by the
Makefile variable $(PROTO_GENERATED)), then (2) run a diff check (e.g., git diff
--exit-code or diff -u) comparing the working tree against $(PROTO_GENERATED)
(or explicitly diff the regenerated files) and fail if there are differences so
CI fails on checked-in codegen drift; update the target that currently runs in
test-proto to include this diff/assertion step after regeneration.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 558-568: The DestroyPolicy response currently only logs when
resp.GetSuccess() is false, causing Close() to return nil; change the block
handling DestroyPolicy by treating a non-success response as an error: set
b.closeErr = fmt.Errorf("destroy Python policy instance rejected by executor:
%s", resp.GetErrorMessage()), log the warning with the error, and return so the
Close() caller observes the failure; update any surrounding logic that relies on
b.closeErr (e.g., Close(), closeOnce) to propagate that stored error rather than
swallowing it.

---

Duplicate comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 166-188: The NeedsMoreRequestData path is using
context.Background() which drops cancellation, deadlines and traces; replace
those Background contexts by propagating the active request context into the
streaming decision calls: thread a context parameter through the
NeedsMoreRequestData/NeedsMoreResponseData call sites (or store the active
request context on the bridge before invoking buildNeedsMoreRequestDataRequest /
buildNeedsMoreResponseDataRequest), then pass that context into
buildNeedsMoreRequestDataRequest, streamManager.Execute and any translator calls
instead of context.Background(); update the signatures impacted (e.g.
NeedsMoreRequestData, buildNeedsMoreRequestDataRequest, streamManager.Execute
invocation sites and the symmetric response methods) so cancellations and
deadlines on the incoming request are honored and traces propagate to the Python
executor.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 90f02c2f-5b35-4af0-8e8c-a8fdb6206318

📥 Commits

Reviewing files that changed from the base of the PR and between 8b52625 and 09ce453.

📒 Files selected for processing (6)

gateway/gateway-runtime/Dockerfile
gateway/gateway-runtime/Makefile
gateway/gateway-runtime/policy-engine/internal/admin/handlers_test.go
gateway/gateway-runtime/policy-engine/internal/admin/types.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/factory_test.go

🚧 Files skipped from review as they are similar to previous changes (2)

gateway/gateway-runtime/policy-engine/internal/admin/types.go
gateway/gateway-runtime/policy-engine/internal/admin/handlers_test.go

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Set bridge.closeErr and return when DestroyPolicy is rejected by the Python executor, and log the wrapped error. Add unit test coverage with a fake executor client to verify Close returns the executor error, is idempotent, and only calls DestroyPolicy once. Also add a git-diff check to Makefile's test-proto target to ensure generated protobuf files are up-to-date.

…tation

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go (1)
136-157: Consider asserting that the unexpected-RPC stubs are never hit.

ExecuteStream, HealthCheck, and InitPolicy return errors but don't fail the test if called. Swapping errors.New(...) for t.Fatalf(...) (via a captured *testing.T) would make accidental regressions — e.g., Close() starting to issue a health check — surface immediately instead of being swallowed by the caller's error path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go`
around lines 136 - 157, The fakePythonExecutorClient's unexpected-RPC stubs
(ExecuteStream, HealthCheck, InitPolicy) should fail the test if invoked: add a
captured *testing.T field to fakePythonExecutorClient and, in
ExecuteStream/HealthCheck/InitPolicy, call t.Fatalf with a clear "unexpected
<MethodName> call" message (then return zero values to satisfy signatures);
update the test that constructs fakePythonExecutorClient to pass the *testing.T,
and ensure the test imports testing; keep DestroyPolicy unchanged.
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
525-547: Error parameter is unused in *ErrorAction helpers.

Each helper takes err error but never reads it — the caller already logs the error at the call site. Either drop the parameter for clarity, or thread the error into the immediate-response body/headers (e.g., a structured error code) so operators can correlate 500s with the underlying failure without having to cross-reference logs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 525 - 547, These helper methods (requestBodyErrorAction,
responseBodyErrorAction, requestHeaderErrorAction, responseHeaderErrorAction,
streamingRequestErrorAction, streamingResponseErrorAction) accept an err
parameter that is never used; remove the unused err parameter from their
signatures and update all call sites to stop passing an error, keeping the same
return types (policy.RequestAction, policy.ResponseAction,
policy.RequestHeaderAction, policy.ResponseHeaderAction,
policy.StreamingRequestAction, policy.StreamingResponseAction); ensure any
callers or tests that previously relied on the old signatures are updated
accordingly and run the build/tests to confirm no remaining references to the
removed parameter.
gateway/gateway-runtime/Makefile (1)
153-163: Consider aligning the Go and Python protoc invocations.

The Go step cds into proto/ and passes ../$(GO_PROTO_OUT), while the Python step runs from the Makefile directory and passes $(PROTO_FILE) directly. Both work, but the asymmetry makes the recipe harder to follow and prevents reusing a common include-path variable. Running both invocations from the same working directory (e.g., keeping the Makefile CWD and using -I proto for both) would make the target easier to extend and bring it closer to the 5-line body threshold flagged by checkmake.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gateway/gateway-runtime/Makefile` around lines 153 - 163, The proto Makefile
recipe should run both protoc invocations from the same CWD and use a common
include path; remove the cd proto in the Go step and change both commands to
pass -I proto, call the proto file as proto/python_executor.proto (or use
$(PROTO_FILE) with -I proto), and adjust Go outputs to --go_out=$(GO_PROTO_OUT)
and --go-grpc_out=$(GO_PROTO_OUT) (remove the ../ prefix), while preserving the
PATH="$(GO_BIN):$$PATH" prefix for the Go invocation and keeping the Python
invocation using $(PYTHON) -m grpc_tools.protoc; update the inputs and output
flags in the proto target so both invocations are symmetric and reuse the same
include-path.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 166-189: NeedsMoreRequestData and NeedsMoreResponseData currently
call b.streamManager.Execute with context.Background(), dropping cancellation,
deadlines and traces; update these helpers to use a proper context: either (A)
store a per-execution context on the bridge when
OnRequestBodyChunk/OnResponseBodyChunk (or an earlier lifecycle method) runs and
read that context inside NeedsMoreRequestData/NeedsMoreResponseData before
calling buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and
streamManager.Execute (e.g., use b.execCtx or similar), or (B) if a
per-execution context isn’t available yet, derive a bounded context with
context.WithTimeout(context.Background(), getTimeout()) and pass that into
buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and
streamManager.Execute to cap work; make sure to replace the context.Background()
usages in NeedsMoreRequestData and NeedsMoreResponseData with the chosen context
so cancelation and timeouts propagate to the Python executor.

---

Nitpick comments:
In `@gateway/gateway-runtime/Makefile`:
- Around line 153-163: The proto Makefile recipe should run both protoc
invocations from the same CWD and use a common include path; remove the cd proto
in the Go step and change both commands to pass -I proto, call the proto file as
proto/python_executor.proto (or use $(PROTO_FILE) with -I proto), and adjust Go
outputs to --go_out=$(GO_PROTO_OUT) and --go-grpc_out=$(GO_PROTO_OUT) (remove
the ../ prefix), while preserving the PATH="$(GO_BIN):$$PATH" prefix for the Go
invocation and keeping the Python invocation using $(PYTHON) -m
grpc_tools.protoc; update the inputs and output flags in the proto target so
both invocations are symmetric and reuse the same include-path.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go`:
- Around line 136-157: The fakePythonExecutorClient's unexpected-RPC stubs
(ExecuteStream, HealthCheck, InitPolicy) should fail the test if invoked: add a
captured *testing.T field to fakePythonExecutorClient and, in
ExecuteStream/HealthCheck/InitPolicy, call t.Fatalf with a clear "unexpected
<MethodName> call" message (then return zero values to satisfy signatures);
update the test that constructs fakePythonExecutorClient to pass the *testing.T,
and ensure the test imports testing; keep DestroyPolicy unchanged.

In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 525-547: These helper methods (requestBodyErrorAction,
responseBodyErrorAction, requestHeaderErrorAction, responseHeaderErrorAction,
streamingRequestErrorAction, streamingResponseErrorAction) accept an err
parameter that is never used; remove the unused err parameter from their
signatures and update all call sites to stop passing an error, keeping the same
return types (policy.RequestAction, policy.ResponseAction,
policy.RequestHeaderAction, policy.ResponseHeaderAction,
policy.StreamingRequestAction, policy.StreamingResponseAction); ensure any
callers or tests that previously relied on the old signatures are updated
accordingly and run the build/tests to confirm no remaining references to the
removed parameter.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 168ce9db-9c10-4df8-9a4f-53581358bc58

📥 Commits

Reviewing files that changed from the base of the PR and between 09ce453 and dff9d30.

📒 Files selected for processing (5)

gateway/gateway-runtime/Dockerfile
gateway/gateway-runtime/Makefile
gateway/gateway-runtime/policy-engine/internal/admin/handlers.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go

🚧 Files skipped from review as they are similar to previous changes (2)

gateway/gateway-runtime/Dockerfile
gateway/gateway-runtime/policy-engine/internal/admin/handlers.go

coderabbitai · 2026-04-17T04:34:56Z

+func (b *bridge) NeedsMoreRequestData(accumulated []byte) bool {
+	if b.mode.RequestBodyMode != policy.BodyModeStream {
+		return false
+	}
+
+	req, err := b.buildNeedsMoreRequestDataRequest(context.Background(), accumulated)
+	if err != nil {
+		b.slogger.Error("Failed to build request streaming decision payload", "error", err)
+		return false
+	}
+
+	resp, err := b.streamManager.Execute(context.Background(), req)
+	if err != nil {
+		b.slogger.Error("Failed to execute Python request streaming decision", "error", err)
+		return false
+	}
+
+	decision, err := b.translator.ToGoNeedsMoreDecision(resp)
+	if err != nil {
+		b.slogger.Error("Failed to translate Python request streaming decision", "error", err)
+		return false
+	}
+	return decision
+}


⚠️ Potential issue | 🟡 Minor

NeedsMoreRequestData/NeedsMoreResponseData drop the request context.

Both hooks call b.streamManager.Execute(context.Background(), ...), so cancellation, deadlines, and tracing from the in-flight request cannot propagate into the Python executor. The SDK interface (NeedsMoreRequestData(accumulated []byte) bool) doesn't accept a context.Context, so a local fix here is limited, but two options are worth considering:

Store a per-execution context on the bridge when OnRequestBodyChunk/OnResponseBodyChunk (or an earlier phase) runs and reuse it from these helpers, or

Derive a bounded context via context.WithTimeout(context.Background(), getTimeout()) to at least cap executor work when the caller has abandoned the request.

Right now a hung or slow Python executor will keep processing streaming-decision RPCs for a request the gateway has already cancelled.

Also applies to: 216-239

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 166 - 189, NeedsMoreRequestData and NeedsMoreResponseData currently call b.streamManager.Execute with context.Background(), dropping cancellation, deadlines and traces; update these helpers to use a proper context: either (A) store a per-execution context on the bridge when OnRequestBodyChunk/OnResponseBodyChunk (or an earlier lifecycle method) runs and read that context inside NeedsMoreRequestData/NeedsMoreResponseData before calling buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and streamManager.Execute (e.g., use b.execCtx or similar), or (B) if a per-execution context isn’t available yet, derive a bounded context with context.WithTimeout(context.Background(), getTimeout()) and pass that into buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and streamManager.Execute to cap work; make sure to replace the context.Background() usages in NeedsMoreRequestData and NeedsMoreResponseData with the chosen context so cancelation and timeouts propagate to the Python executor.

sehan-dissanayake added 13 commits February 25, 2026 15:04

Python Policy Support: Basic Implementation

1fa2b06

Merge remote-tracking branch 'upstream/main' into feature/my-implemen…

4c5a7e4

…tation # Conflicts: # gateway/gateway-builder/go.sum

Cleanup

7d09d99

sehan-dissanayake requested review from Krishanx92, malinthaprasan and pubudu538 as code owners March 12, 2026 08:20

coderabbitai bot reviewed Mar 12, 2026

View reviewed changes

Comment thread gateway/sample-policies/build.yaml Outdated

sehan-dissanayake added 2 commits March 12, 2026 14:12

coderabbitai bot reviewed Mar 12, 2026

View reviewed changes

Comment thread gateway/gateway-runtime/docker-entrypoint.sh Outdated

coderabbitai bot reviewed Mar 13, 2026

View reviewed changes

Comment thread gateway/gateway-builder/internal/discovery/python_module.go

Comment thread gateway/gateway-builder/internal/policyengine/generator.go

sehan-dissanayake added 2 commits March 13, 2026 10:47

coderabbitai bot reviewed Mar 13, 2026

View reviewed changes

sehan-dissanayake and others added 3 commits March 13, 2026 15:16

Duplicate section comment marker removal

478e8dd

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Mar 13, 2026

View reviewed changes

coderabbitai bot reviewed Mar 16, 2026

View reviewed changes

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

coderabbitai bot reviewed Mar 23, 2026

View reviewed changes

Comment thread gateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.go

sehan-dissanayake added 3 commits March 24, 2026 11:06

coderabbitai bot reviewed Apr 10, 2026

View reviewed changes

renuka-fernando mentioned this pull request Apr 10, 2026

[Feature]: Python Policy Support in the Policy Engine #1683

Open

sehan-dissanayake mentioned this pull request Apr 10, 2026

Gate policy phases by Mode() and add tests #1686

Merged

sehan-dissanayake added 3 commits April 16, 2026 11:47

Update Dockerfile

a9d21c2

Merge branch 'main' into feature/my-implementation

9cb947a

coderabbitai bot reviewed Apr 16, 2026

View reviewed changes

Comment thread gateway/gateway-runtime/Dockerfile Outdated

Comment thread gateway/gateway-runtime/Makefile

Comment thread gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go

sehan-dissanayake and others added 3 commits April 16, 2026 14:45

Update gateway/gateway-runtime/Dockerfile

0756639

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Merge remote-tracking branch 'upstream/main' into feature/my-implemen…

dff9d30

…tation

coderabbitai bot reviewed Apr 17, 2026

View reviewed changes

Conversation

sehan-dissanayake commented Mar 12, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related Discussion

Purpose

Goals

Approach

New Components

1. Python Executor (gateway/gateway-runtime/python-executor/)

2. Python Policy SDK (gateway/gateway-runtime/python-executor/sdk/policy.py)

3. gRPC Contract (gateway/gateway-runtime/proto/python_executor.proto)

4. Go Python Bridge (gateway/gateway-runtime/policy-engine/internal/pythonbridge/)

5. Gateway Builder Extensions (gateway/gateway-builder/)

6. Docker & Entrypoint (gateway/gateway-runtime/)

7. Admin Health Endpoint (gateway/gateway-runtime/policy-engine/internal/admin/)

8. Sample Policy (gateway/sample-policies/slugify-body-python/)

User stories

Documentation

Automation tests

Unit tests

Integration tests

Security checks

Samples

slugify-body-python — Canonical Python Policy Reference

PyPI-distributed policy (pipPackage)

Related PRs

Test environment

Uh oh!

coderabbitai bot commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python Policy Runtime Support

Gateway Builder Enhancements

Python Executor Implementation

Policy Engine Integration

Runtime and Infrastructure

Testing and Validation

Security Considerations

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

CLAassistant commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

CLAassistant commented Mar 12, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sehan-dissanayake commented Mar 12, 2026 •

edited by coderabbitai bot

Loading

1. Python Executor (`gateway/gateway-runtime/python-executor/`)

2. Python Policy SDK (`gateway/gateway-runtime/python-executor/sdk/policy.py`)

3. gRPC Contract (`gateway/gateway-runtime/proto/python_executor.proto`)

4. Go Python Bridge (`gateway/gateway-runtime/policy-engine/internal/pythonbridge/`)

5. Gateway Builder Extensions (`gateway/gateway-builder/`)

6. Docker & Entrypoint (`gateway/gateway-runtime/`)

7. Admin Health Endpoint (`gateway/gateway-runtime/policy-engine/internal/admin/`)

8. Sample Policy (`gateway/sample-policies/slugify-body-python/`)

`slugify-body-python` — Canonical Python Policy Reference

PyPI-distributed policy (`pipPackage`)

coderabbitai bot commented Mar 12, 2026 •

edited

Loading

CLAassistant commented Mar 12, 2026 •

edited

Loading