Python Policy Support via Python Runtime #1343
Python Policy Support via Python Runtime #1343sehan-dissanayake wants to merge 37 commits intowso2:mainfrom
Conversation
Add two Python policies (python-body-validator, python-json-body-transform) with policy-definition.yaml, example policy.py implementations and requirements. Update build config to register sample-policies and include them in the runtime build context. Enhance the runtime Dockerfile to preinstall Python dependencies in a dedicated stage and copy them into the production image; adjust Makefile to pass sample-policies as a build context. Minor logger message change in policy_loader.py and add results.md with analysis notes.
Replace the sequential await/yield execution in PythonExecutorServicer.ExecuteStream with a concurrent reader/worker/writer fan-out pattern so multiple policy executions can run concurrently in the thread pool. Add a bounded asyncio.Queue and semaphore to cap in-flight tasks, spawn per-request processing tasks with run_in_executor, and yield responses as they complete (out-of-order). Introduce PYTHON_POLICY_MAX_CONCURRENT (MAX_CONCURRENT) and propagate it from main.py into the server/servicer constructors. Also add CONCURRENT_FANOUT_PLAN.md and update generated .pyc artifacts. Files changed: gateway/gateway-runtime/python-executor/executor/server.py (core change), gateway/gateway-runtime/python-executor/main.py (new env var and constructor arg), added CONCURRENT_FANOUT_PLAN.md, and updated __pycache__ artifacts.
Introduce lifecycle RPCs and instance management for Python policies. Adds InitPolicy and DestroyPolicy to the proto and regenerates Go/Python stubs; ExecutionRequest gains an instance_id. Python executor rewritten to support factories and instances: new PolicyInstanceStore, rewritten PolicyLoader, and a gRPC server that handles InitPolicy/DestroyPolicy and dispatches ExecuteStream to instances by instance_id. Python SDK switched to class-based Policy ABC with a close() hook; sample policy updated to use get_policy() factory. Go pythonbridge updated to call InitPolicy in GetPolicy, store the returned instance_id, include it on Execute requests, and expose PythonBridge.Close() which calls DestroyPolicy. Documentation and a discussion file for Close() (wso2#734) are added. This enables factory-controlled instancing and proper cleanup of Python policy resources.
…tation # Conflicts: # gateway/gateway-builder/go.sum
Convert AuthContext to a flat map[string]string for the Python bridge proto wireformat by adding authContextToMap and using it in buildSharedContext. Fix import path in python_executor_pb2_grpc.py to import from proto. Update several policy dependency versions in build-lock.yaml (api-key-auth/basic-auth/jwt-auth -> v0.8.1; mcp-auth/mcp-authz -> v0.2.2). Also include minor import/formatting tidy-ups in factory.go and buildPolicyMetadata.
Delete three sample Python policies (add-python-response-header, python-body-validator, python-json-body-transform) including their policy-definition.yaml, policy.py, and requirements.txt files. Update gateway/build.yaml and gateway/build-lock.yaml to remove references to the removed sample policies and keep the build manifests consistent.
Introduce a new Python policy 'prompt-compression' to compress LLM prompts: adds policy implementation (policy.py), policy-definition.yaml, user docs, and requirements. Registers the policy in build manifests and sample-policies build config, and adds a gateway controller default policy manifest. Also adds a CURL_REFERENCE.md with end-to-end test commands and removes the obsolete STATELESS_PYTHON_PLAN.md. This change ships the new policy, documentation, and integration points so the policy can be built, deployed, and manually tested.
Reworks gateway/CURL_REFERENCE.md into a comprehensive, up-to-date AI Gateway curl reference: renames the doc, adds a one-shot payload generator, reorganizes sections (health, LLM provider/proxy management, policies, prompt-compression tests, rate-limiting, observability, API key management), updates example endpoints to Gemini provider/proxy routes, and includes a full smoke-test script. Also updates gateway/build-lock.yaml to bump the request-rewrite policy version from v0.2.0 to v0.2.1.
Introduce pipPackage as a manifest source for Python policies and add local runtime detection and pip-based fetching. Key changes: - Rename BuildEntry.pythonmodule -> pipPackage and propagate throughout types and validation. - Add DetectRuntime to auto-detect go/python by directory fingerprint (go.mod vs .py files). - Implement ParsePipPackageRef, FetchPipPackage, findPolicyDir and pip executable resolution to install packages into an isolated target and locate policy-definition.yaml. - Extend WriteBuildLockWithVersions to include pipPackage entries. - Route discovery: support explicit pipPackage (remote), explicit gomodule (Go), and filePath with runtime auto-detection (local Go or Python). Consolidate Python discovery logic into helpers (discoverPipPolicy, discoverLocalPythonPolicy, buildPythonDiscoveredPolicy). - Update tests to cover runtime detection, pip ref parsing, fingerprint discovery, and adjust policyengine tests to supply outputDir and ensure python-executor sibling dir exists. - Update Dockerfile to install python3, pip and venv for building Python policies. - Update build.yaml/build-lock.yaml to mark prompt-compression as a pipPackage and add uppercase-body sample policy; bump request-rewrite version. - Remove some default Python policy YAMLs and remove explicit runtime: python from sample/default policy files where appropriate. - Minor formatting and struct field alignment changes in generator.go. These changes enable Python policies to be referenced as pip packages or local directories with correct discovery and installation during build time.
Add a new slugify-body-python sample policy and update build manifests to reference it; remove legacy Python sample policies. Integrate Python executor health into the admin server and /health handler via a PythonHealthChecker adapter (created from the pythonbridge StreamManager) so the policy engine reports executor readiness and loaded policy count. Improve pythonbridge StreamManager lifecycle (long-lived stream context, stream cancellation on disconnect/close) and add a HealthCheck RPC wrapper. Refactor PythonBridge request builders, update BridgeFactory to expose instance IDs and provide the PythonHealthAdapter. Update python-executor server and instance store: use InstanceRecord dataclass, switch to a deque+Condition response pipeline for streaming concurrency, and ensure correct instance record handling. Update tests and server wiring to accommodate new handler signatures.
Delete a set of outdated planning, proposal, and documentation markdown files related to Python policy support and agent guidance to clean up the repo. Files removed: AGENTS.md, CONCURRENT_FANOUT_PLAN.md, PYTHON_LIFECYCLE_PLAN.md, discussion#734.md, first_proposal.md, gateway/docs/python-policy-support.md, python-policy-comprehensive-plan.md, results.md. This is a docs-only cleanup; no source code or proto changes were made.
Python Policy Runtime SupportThis PR introduces comprehensive support for authoring and executing gateway policies in Python, enabling developers to leverage Python libraries and ecosystems alongside existing Go-based policies. The implementation maintains transparent integration with the Policy Engine without kernel-level changes. Gateway Builder EnhancementsExtended the policy discovery and validation pipeline to recognize Python policies:
Python Executor ImplementationCreated a standalone asyncio-based gRPC server providing Python policy runtime:
Policy Engine IntegrationBridged Go and Python policy ecosystems:
Runtime and InfrastructureUpdated container and process management to support the new Python execution tier:
Testing and ValidationComprehensive test coverage for new components:
Security ConsiderationsImplemented isolation boundaries and input validation:
WalkthroughWires Python policy support end-to-end: build-file pip discovery and wheel extraction, runtime detection, Python policy discovery/validation, policy codegen emitting Python executor artifacts and requirements, a gRPC Python Executor service + SDK, Go <-> Python bridge and stream manager, runtime packaging/entrypoint changes, and aggregated health checks. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant PolicyEngine as Policy Engine
participant PythonBridge as Python Bridge
participant StreamMgr as StreamManager
participant PythonExec as Python Executor
Client->>PolicyEngine: HTTP request
PolicyEngine->>PythonBridge: Invoke policy hook (OnRequest/OnResponse)
PythonBridge->>StreamMgr: Execute(ExecutionRequest) (assign request_id)
StreamMgr->>PythonExec: send over ExecuteStream (gRPC)
PythonExec->>PythonExec: run policy instance, produce ExecutionResponse
PythonExec->>StreamMgr: stream ExecutionResponse (request_id)
StreamMgr->>PythonBridge: deliver matched ExecutionResponse
PythonBridge->>PolicyEngine: translated RequestAction/ResponseAction
PolicyEngine->>Client: HTTP response
sequenceDiagram
participant Builder as Gateway Builder
participant Discovery as Discovery Module
participant PipModule as Pip Handler
participant PipExec as pip
participant Generator as Code Generator
Builder->>Discovery: DiscoverPoliciesFromBuildFile(buildfile)
alt entry has pipPackage
Discovery->>PipModule: discoverPipPolicy()
PipModule->>PipModule: ParsePipPackageRef()
PipModule->>PipExec: pip download --no-deps (wheel)
PipExec->>PipModule: wheel file
PipModule->>PipModule: read top_level.txt, extract module
PipModule->>Discovery: DiscoveredPolicy (Runtime: python, extracted dir)
else local filePath
Discovery->>Discovery: DetectRuntime(policyDir)
Discovery->>Discovery: discoverLocalPythonPolicy() or discoverGoPolicy()
end
Discovery->>Builder: []DiscoveredPolicy
Builder->>Generator: GenerateCode(srcDir, policies, outputDir)
Generator->>Generator: split by runtime, generate Python artifacts, merge requirements
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Delete the default slugify-body-python policy manifest and clean up build manifests: remove slugify-body-python and uppercase-body entries from gateway/build.yaml; delete gateway-controller/default-policies/slugify-body-python.yaml; update gateway/sample-policies/build.yaml to reference ./slugify-body (and simplify the comment), and remove the prompt-compression entry. These changes consolidate the sample policy location and remove now-unused default policy references.
There was a problem hiding this comment.
Actionable comments posted: 8
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
gateway/gateway-runtime/Makefile (1)
85-102:⚠️ Potential issue | 🟠 MajorMissing
sample-policiesbuild context inbuild-coverage-imagetarget.The
sample-policiesbuild context was added tobuildandbuild-debugtargets but is missing frombuild-coverage-image(andbuild-and-push-multiarch). If the Dockerfile references--from=sample-policies, these builds will fail.🐛 Proposed fix
build-coverage-image: test ## Build Gateway Runtime Docker image with coverage instrumentation `@echo` "Building Gateway Runtime coverage image ($(IMAGE_NAME)-coverage:$(VERSION))..." `@mkdir` -p target && cp ../build.yaml target/ DOCKER_BUILDKIT=1 docker build -f Dockerfile \ --build-context sdk=../../sdk \ --build-context common=../../common \ --build-context gateway-builder=../gateway-builder \ --build-context system-policies=../system-policies \ --build-context dev-policies=../dev-policies \ + --build-context sample-policies=../sample-policies \ --build-context target=target \Similarly for
build-and-push-multiarcharound line 117.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/Makefile` around lines 85 - 102, The build-coverage-image target is missing the sample-policies build context which the Dockerfile may reference; add the build context flag --build-context sample-policies=../sample-policies to the DOCKER_BUILDKIT docker build invocation in the build-coverage-image recipe (same place where other --build-context flags are listed) and also add the same build-context flag to the build-and-push-multiarch target so any --from=sample-policies stage in the Dockerfile resolves correctly (keep the existing --build-arg VERSION/ GIT_COMMIT and tagging behavior unchanged).
🟡 Minor comments (14)
gateway/gateway-runtime/python-executor/sdk/policy.py-109-116 (1)
109-116:⚠️ Potential issue | 🟡 MinorInconsistency confirmed: Go SDK defaults
StatusCodeto 0 (invalid HTTP status), Python defaults to 500. Align defaults or document the intentional difference.The proto definition (
ImmediateResponseAction) and Go SDK both default to 0, which is not a valid HTTP status code. The Python SDK defaults to 500 (a valid error status), creating an inconsistency. While developers should always explicitly setstatus_code(as shown in the sample policy), the SDK defaults should be aligned across runtimes or the difference should be explicitly documented if intentional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/sdk/policy.py` around lines 109 - 116, The ImmediateResponse dataclass currently defaults status_code to 500, which is inconsistent with the proto ImmediateResponseAction and Go SDK (which default to 0); change the default for status_code in ImmediateResponse to 0 to match the proto/Go behavior and add a brief docstring line on ImmediateResponse noting the default 0 is intentional (and that callers should set a valid HTTP status) so the runtime defaults are consistent across SDKs (refer to ImmediateResponse.status_code and ImmediateResponseAction).gateway/gateway-runtime/python-executor/main.py-73-80 (1)
73-80:⚠️ Potential issue | 🟡 MinorStore task reference and use
get_running_loop()instead of deprecatedget_event_loop().Two issues in the signal handling setup:
asyncio.get_event_loop()is deprecated since Python 3.10 and should be replaced withasyncio.get_running_loop().- The task created by
asyncio.create_task(server.shutdown())is not stored, which means it could be garbage collected before completion.🐛 Proposed fix
+ shutdown_task = None + def signal_handler(): + nonlocal shutdown_task logger.info("Received shutdown signal") - asyncio.create_task(server.shutdown()) + shutdown_task = asyncio.create_task(server.shutdown()) - loop = asyncio.get_event_loop() for sig in (signal.SIGTERM, signal.SIGINT): - loop.add_signal_handler(sig, signal_handler) + asyncio.get_running_loop().add_signal_handler(sig, signal_handler)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/main.py` around lines 73 - 80, Replace the deprecated asyncio.get_event_loop() call with asyncio.get_running_loop() when initializing loop, and in signal_handler capture and store the shutdown task so it isn't garbage-collected (e.g., assign the result of asyncio.create_task(server.shutdown()) to a variable scoped where it will live). Update references in the code around loop, signal_handler, and server.shutdown() to use the new loop and to keep the created task reference alive until completion.gateway/gateway-runtime/python-executor/main.py-82-88 (1)
82-88:⚠️ Potential issue | 🟡 MinorAdd a synchronization guard to prevent concurrent shutdown() calls.
The signal handler (line 77) creates an unawaited task calling
server.shutdown(), while the finally block (line 88) also calls it. These can execute concurrently in the event loop without synchronization. AlthoughPolicyInstanceStore.clear()is idempotent and closing instances twice is safe,server.stop()(line 386) lacks an explicit guard. Add a flag or asyncio.Event to ensure shutdown executes only once:# In __init__: self._shutdown_event = asyncio.Event() # In shutdown(): if self._shutdown_event.is_set(): return self._shutdown_event.set() # ... rest of shutdown logic🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/main.py` around lines 82 - 88, The shutdown() method can be invoked concurrently by the signal handler task and the finally block; add a one-time synchronization guard to ensure shutdown runs only once by initializing a flag/event (e.g., self._shutdown_event = asyncio.Event() in __init__) and at the start of shutdown() check if the event is set and return early if so, otherwise set the event and proceed with the existing shutdown logic (which calls PolicyInstanceStore.clear() and server.stop()); update any callers (signal handler/unawaited task and the finally block) to simply call/await server.shutdown() without worrying about duplicate execution.gateway/gateway-controller/default-policies/prompt-compression.yaml-38-47 (1)
38-47:⚠️ Potential issue | 🟡 MinorConsider deprecating the no-op parameter.
minInputTokensis documented as having no effect and kept only for backward compatibility. Consider formally marking it as deprecated (e.g., via adeprecated: truefield or naming convention) to signal that it may be removed in a future version, rather than silently accepting a parameter that does nothing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-controller/default-policies/prompt-compression.yaml` around lines 38 - 47, The minInputTokens parameter is a no-op kept for backward compatibility; mark it as deprecated so callers know it may be removed—add a formal deprecation marker (e.g., deprecated: true or an x-wso2-deprecated: true field) to the minInputTokens schema entry in prompt-compression.yaml, update its description to state it is deprecated and slated for removal, and ensure any schema consumers (validation or docs generators) will surface this deprecation by using the same property name minInputTokens to locate and annotate it.gateway/gateway-builder/internal/policyengine/policyengine_test.go-378-381 (1)
378-381:⚠️ Potential issue | 🟡 MinorTest isolation concern with sibling directory pattern.
Using
filepath.Join(tmpDir, "..", "python-executor")creates a directory in the parent oft.TempDir(). Sincet.TempDir()returns paths under a shared OS temp folder, parallel test runs may collide on thepython-executordirectory name.Consider creating a dedicated parent directory within the temp dir to avoid potential flakiness:
Proposed fix
- // Create python-executor sibling directory (required by generatePythonExecutorBase) - pythonExecDir := filepath.Join(tmpDir, "..", "python-executor") - testutils.CreateDir(t, pythonExecDir) + // Create a workspace with policy-engine and python-executor as siblings + workspace := t.TempDir() + tmpDir := filepath.Join(workspace, "policy-engine") + testutils.CreateDir(t, tmpDir) + pythonExecDir := filepath.Join(workspace, "python-executor") + testutils.CreateDir(t, pythonExecDir)This pattern should be applied consistently across all affected tests (
TestGenerateCode_Success,TestGenerateCode_EmptyPolicies,TestGenerateCode_MissingCmdDirectory,TestGenerateCode_MissingGoMod,TestGenerateCode_MultiplePolicies).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/policyengine_test.go` around lines 378 - 381, The tests create a sibling "python-executor" directory using pythonExecDir := filepath.Join(tmpDir, "..", "python-executor"), which can collide across parallel test runs; change the setup to create an isolated parent inside the temp dir (e.g., make a subdir under tmpDir to act as the parent) and then create the python-executor directory as a child of that new parent so generatePythonExecutorBase sees the sibling without touching the OS temp parent; apply this change to the tests referenced (TestGenerateCode_Success, TestGenerateCode_EmptyPolicies, TestGenerateCode_MissingCmdDirectory, TestGenerateCode_MissingGoMod, TestGenerateCode_MultiplePolicies) and update the pythonExecDir variable and any setup/teardown in those tests accordingly.gateway/sample-policies/slugify-body-python/policy.py-137-153 (1)
137-153:⚠️ Potential issue | 🟡 MinorReject inputs that collapse to an empty slug.
Bodies made only of punctuation or unsupported characters pass
text.strip()but can still produce""here, so the policy forwards an empty body upstream.Suggested fix
slug = slugify( text, separator=self._separator, max_length=self._max_length, ) + if not slug: + return ImmediateResponse( + status_code=400, + headers={"Content-Type": "text/plain; charset=utf-8"}, + body=b"slugify-body: request body must contain slugifiable text", + ) logger.info( "on_request: slugified body — %r → %r", text[:60], slug[:60], )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/sample-policies/slugify-body-python/policy.py` around lines 137 - 153, The slug can be empty after slugify (variables slug and text), so before returning the UpstreamRequestModifications you must detect slug == "" and reject the request instead of forwarding an empty body: in the function/method where slugify is called (look for slugify(...) and the return of UpstreamRequestModifications), check if slug is empty and return an appropriate rejection (e.g., raise an HTTP error or return a rejection response) with a clear message rather than returning body=slug.encode(...).gateway/gateway-builder/internal/policyengine/generator.go-234-281 (1)
234-281:⚠️ Potential issue | 🟡 MinorRequirements merging uses exact string matching, which won't detect version conflicts.
If one policy requires
requests==2.28.0and another requiresrequests==2.31.0, both lines will be added. This can cause unpredictable pip behavior. Consider extracting package names and detecting conflicts.Would you like me to propose a more robust requirements merger that detects package version conflicts?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 234 - 281, The mergeRequirements function currently deduplicates by exact line text which misses package/version conflicts (e.g., requests==2.28.0 vs requests==2.31.0); update mergeRequirements to parse each requirement line into a package name and version specifier (use a helper that splits on first comparison operator or version separator), track seen packages in a map from package name -> (versionSpec, sourcePolicy), and when encountering a different version spec for the same package either (a) decide a deterministic resolution (e.g., pick the highest compatible version) or (b) return a descriptive error indicating the conflicting package, the differing version specs and which DiscoveredPolicy (use p.Name and p.PythonSourceDir) introduced each; ensure final output reconstructs a normalized requirements.txt (one entry per package) and adjust the function signature/return to surface conflict errors so callers can handle them.gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go-155-161 (1)
155-161:⚠️ Potential issue | 🟡 MinorDropping responses when channel is full may cause silent failures.
The non-blocking send with
defaultcase logs an error but the caller waiting onrespChwill timeout without knowing the response was actually received. Consider increasing buffer size or blocking with a short timeout.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go` around lines 155 - 161, The current non-blocking send to ch (sending resp and logging via slogger.Error with resp.RequestId) drops responses silently; either increase the channel buffer where respCh is created to a safe size or replace the non-blocking send with a bounded-blocking send using a short timeout (use time.After) so the sender waits briefly before giving up and logs the timeout including resp.RequestId; locate the channel creation for respCh to adjust its capacity or update the send site (the select that currently has case ch <- resp and default) to include a timeout case that logs the failure and preserves observability for the caller.gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go-241-245 (1)
241-245:⚠️ Potential issue | 🟡 MinorFlattening
ac.Propertiesmay overwrite standard auth context keys.If
ac.Propertiescontains keys like"subject"or"issuer", they will overwrite the standard values set above. Consider namespacing properties or skipping reserved keys.🛡️ Proposed fix
// Flatten properties for k, v := range ac.Properties { + // Skip reserved keys to prevent overwriting standard fields + switch k { + case "subject", "issuer", "auth_type", "authenticated", "authorized", "credential_id": + continue + } m[k] = v }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 241 - 245, The loop that flattens ac.Properties into m can clobber reserved auth-context keys (e.g., "subject", "issuer") because it blindly assigns m[k]=v; update the logic in the flattening block (the for k,v := range ac.Properties loop) to either namespace property keys (e.g., prefix with "prop.") or explicitly skip a set of reserved keys (subject, issuer, audience, etc.) before assigning into m to prevent overwriting the standard auth context values.gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go-273-290 (1)
273-290:⚠️ Potential issue | 🟡 MinorError responses expose internal error details to clients.
The error message from
err.Error()is returned directly in the response body, which could leak internal implementation details or stack traces to API consumers.🛡️ Proposed fix - use generic error message
func (b *PythonBridge) errorResponse(err error) policy.RequestAction { + b.slogger.Error("Policy execution error", "error", err) return policy.ImmediateResponse{ StatusCode: 500, Headers: map[string]string{"Content-Type": "text/plain"}, - Body: []byte(fmt.Sprintf("Policy execution error: %v", err)), + Body: []byte("Internal policy error"), } } func (b *PythonBridge) errorResponseAction(err error) policy.ResponseAction { + b.slogger.Error("Policy execution error in response phase", "error", err) statusCode := 500 return policy.UpstreamResponseModifications{ StatusCode: &statusCode, - Body: []byte(fmt.Sprintf("Policy execution error: %v", err)), + Body: []byte("Internal policy error"), } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 273 - 290, The current implementations of errorResponse and errorResponseAction return err.Error() in response bodies which can leak internals; change both functions (errorResponse and errorResponseAction) to return a generic, non-sensitive message (e.g. "Policy execution failed") and avoid including the actual err string or stack trace, while preserving the 500 status and Content-Type header (for errorResponse) and setting Body to the generic message in policy.ImmediateResponse and policy.UpstreamResponseModifications respectively.gateway/gateway-builder/internal/discovery/manifest.go-240-257 (1)
240-257:⚠️ Potential issue | 🟡 Minor
DetectRuntimesilently defaults to Go on directory read errors.If
os.ReadDirfails (e.g., permission denied), the function returns"go"without logging, which could lead to confusing build failures downstream when a Python policy is misidentified.🛡️ Proposed fix to add logging
func DetectRuntime(policyDir string) string { goMod := filepath.Join(policyDir, "go.mod") if _, err := os.Stat(goMod); err == nil { return "go" } entries, err := os.ReadDir(policyDir) if err != nil { + slog.Warn("Failed to read policy directory for runtime detection, defaulting to Go", + "path", policyDir, + "error", err) return "go" }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 240 - 257, DetectRuntime currently swallows os.ReadDir errors and silently returns "go"; update DetectRuntime to log the ReadDir error before returning so misidentification is visible. Specifically, in DetectRuntime (function name) when os.ReadDir(policyDir) returns an error, call the existing logger (or fmt if no logger is available) to emit a clear message including the policyDir and the error, then continue to return "go" (preserving behavior). Ensure the log call is placed right where the ReadDir error is checked so the error context is captured.gateway/gateway-runtime/python-executor/executor/policy_loader.py-99-122 (1)
99-122:⚠️ Potential issue | 🟡 MinorEnsure policies are loaded before querying factories.
get_factorydoesn't verify thatload_policies()has been called. If a caller invokesget_factorybefore loading, it will always raiseKeyErroreven for valid policies.🛡️ Proposed fix to add a guard
def get_factory(self, name: str, version: str) -> PolicyFactory: """Get the factory function for a policy. Args: name: Policy name. version: Policy version. Returns: The get_policy factory callable. Raises: KeyError: If policy not found. """ + if not self._loaded: + raise RuntimeError("PolicyLoader.load_policies() must be called before get_factory()") + key = f"{name}:{version}"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/policy_loader.py` around lines 99 - 122, get_factory currently queries self._factories without ensuring policies are loaded, so callers get KeyError if load_policies() wasn't run; modify get_factory in PolicyLoader to ensure policies are loaded before lookup (e.g., call self.load_policies() if not already loaded or check a boolean like self._loaded and call load_policies()), then proceed to check self._factories for key and fallback major version; this ensures valid policies are discoverable and avoids spurious KeyError.gateway/gateway-runtime/python-executor/executor/translator.py-82-87 (1)
82-87:⚠️ Potential issue | 🟡 MinorInconsistent body presence detection between request and response bodies.
The
request_bodyuses a truthiness check (line 83), which treats empty bytes as missing, whileresponse_bodyuses an explicitresponse_body_presentflag (line 90). To match the pattern and correctly handle empty request bodies, either add arequest_body_presentfield to theResponseContextproto definition, or refactor both to use consistent presence detection logic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/translator.py` around lines 82 - 87, Make request body presence detection consistent with the response by using an explicit presence flag: update the proto to add request_body_present on RequestContext and then in translator.py set request_body = Body(content=proto_ctx.request_body, present=proto_ctx.request_body_present) (referencing Body, proto_ctx.request_body and the new request_body_present field) so empty byte bodies are treated as present the same way response_body_present is used for responses.gateway/gateway-runtime/python-executor/executor/server.py-41-56 (1)
41-56:⚠️ Potential issue | 🟡 MinorThreadPoolExecutor is never shut down, risking resource leaks.
The
_executorThreadPoolExecutor created in__init__is not explicitly shut down when the server terminates. While Python's garbage collector will eventually clean it up, best practice is to callshutdown()on the executor during graceful termination to ensure all worker threads are properly joined.Consider adding a shutdown method to the servicer or tracking it in
PythonExecutorServer.shutdown().Proposed fix — add executor reference to server for cleanup
In
PythonExecutorServer, after server stop, shut down the servicer's executor:async def shutdown(self): """Shutdown: close all live policy instances, then stop server.""" logger.info("Shutting down Python Executor...") # Close all live instances instances = self._store.clear() for instance in instances: try: instance.close() except Exception as e: logger.warning(f"Error closing policy instance during shutdown: {e}") if self.server: await self.server.stop(grace_period=5) + + if hasattr(self, '_servicer') and self._servicer: + self._servicer._executor.shutdown(wait=True) + logger.info("Python Executor shutdown complete")You'll also need to store the servicer reference in
start():servicer = PythonExecutorServicer( self._loader, self._store, self.worker_count, self.max_concurrent ) + self._servicer = servicer proto_grpc.add_PythonExecutorServiceServicer_to_server(servicer, self.server)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 41 - 56, The ThreadPoolExecutor stored as self._executor in PythonExecutorServicer (created in __init__) is never shut down; add a shutdown method to PythonExecutorServicer that calls self._executor.shutdown(wait=True) (and optionally catch/log exceptions), then ensure the PythonExecutorServer stores the servicer instance created in start() and invokes servicer.shutdown() from PythonExecutorServer.shutdown()/stop to join worker threads and avoid resource leaks.
🧹 Nitpick comments (19)
gateway/gateway-runtime/python-executor/executor/instance_store.py (1)
27-30: Avoid exposing a mutable record from a thread-safe store.Line 49 returns the live
InstanceRecord, so callers can mutaterecord.policyoutside the lock. Freezing the dataclass, or returningPolicy | Nonedirectly, keeps the thread-safety boundary inside the store.Minimal hardening
-@dataclass +@dataclass(frozen=True, slots=True) class InstanceRecord: """Holds a policy instance.""" policy: PolicyAlso applies to: 49-52
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/instance_store.py` around lines 27 - 30, The InstanceRecord dataclass currently exposes a mutable policy (InstanceRecord.policy) to callers, breaking the store's thread-safety because a live InstanceRecord is returned (see the method that returns the record at the 49-52 region); fix by making InstanceRecord immutable (add frozen=True) or change the store API to return Policy | None (a copy or the policy object only) instead of returning the InstanceRecord instance, and update all callers to handle the new return type so mutations cannot occur outside the store's lock.gateway/gateway-builder/internal/policyengine/registry.go (1)
160-167: Placeholder implementation:formatSystemParamsalways returns"nil".This function ignores the input
paramsand always returns"nil". The comment indicates this needs proper Go map syntax. If system parameters are expected to be passed to policies, this will silently drop them.Is this intentional for this PR, or should I help implement proper map serialization? If deferred intentionally, consider adding a TODO comment or tracking issue.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/registry.go` around lines 160 - 167, The function formatSystemParams currently ignores its input and always returns "nil"; implement proper serialization of params into a Go literal map string so generated code receives the actual values: handle nil by returning "nil", iterate over params in formatSystemParams to produce a Go map[string]interface{} literal (quoting string keys/values, formatting numbers/bools, and recursively serializing nested maps or slices) and return that string, or if you intentionally want to defer, replace the body with a clear TODO comment referencing an issue/PR to avoid silently dropping parameters; update any callers expecting real params accordingly.gateway/gateway-runtime/Dockerfile (1)
157-191: Debug stage lacks Python Executor support.The
debugstage doesn't include Python Executor setup (no python3, no python-libs, no python-executor copy). If debugging Python policy issues is needed, this stage won't support it.If Python debugging is out of scope for now, consider adding a comment documenting this limitation. Otherwise, mirror the production stage's Python setup here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/Dockerfile` around lines 157 - 191, The debug Docker stage named "debug" currently lacks the Python Executor setup required for debugging Python policies; either add the same Python install and executor artifacts used in the production stage (install python3 and any python-libs, copy the python-executor binary or package from the builder stage, and set executable permissions on it) referencing the existing copy/chmod pattern used for "/app/policy-engine" and "/usr/local/bin/dlv" so the debug image can run Python policies, or if Python debugging is intentionally unsupported, add a clear comment in the "debug" stage explaining that Python Executor is omitted and why.gateway/gateway-runtime/python-executor/sdk/policy.py (1)
168-176: Consider markingclose()with a lint suppression or brief inline comment.The Ruff B027 warning flags this as an empty method in an ABC without
@abstractmethod. The current design is intentional (opt-in cleanup), but adding a brief# noqa: B027or a one-line comment clarifying the no-op intent would silence the linter and improve clarity for future readers.💡 Optional: suppress lint warning
def close(self) -> None: """Release resources held by this policy instance. Called when the route is removed or replaced (via DestroyPolicy RPC). Override to close connections, stop background threads, flush caches, etc. Must be idempotent — may be called multiple times. Default implementation does nothing. """ - pass + pass # noqa: B027 - intentional no-op default🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/sdk/policy.py` around lines 168 - 176, The empty, intentionally no-op method Policy.close() is triggering Ruff B027; update the method to explicitly indicate the intent by adding a brief inline comment or lint suppression (e.g., add either "# Intentional no-op; cleanup is optional" above the pass or append "pass # noqa: B027") so the linter and future readers know the no-op is deliberate while leaving the method callable/optional.gateway/gateway-runtime/proto/python_executor.proto (1)
20-22: Buf lint: Package directory mismatch.The proto package
wso2.gateway.python.v1doesn't match the file locationgateway/gateway-runtime/proto. This is a Buf style convention warning. If the project uses Buf for linting/breaking change detection, either:
- Move the proto to
wso2/gateway/python/v1/python_executor.proto, or- Configure Buf to ignore this rule if the current location is intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/proto/python_executor.proto` around lines 20 - 22, The proto package declaration wso2.gateway.python.v1 in python_executor.proto does not match its repository location, causing a Buf lint "package directory mismatch"; fix by either (A) moving the proto file so the directory structure matches the package (i.e., place python_executor.proto under directories corresponding to wso2/gateway/python/v1) or (B) if the current location is intentional, update the Buf configuration to ignore the package-directory-mismatch rule for this proto (or add a buf.module_reference/experimental configuration) so the linter won't flag wso2.gateway.python.v1; ensure the package line remains exactly "package wso2.gateway.python.v1;" when choosing the move option and validate with buf lint after the change.gateway/gateway-runtime/python-executor/main.py (1)
34-35: Add error handling for environment variable parsing.
int()will raiseValueErrorif the environment variable contains a non-integer value, causing the executor to crash with an unclear error message.♻️ Proposed fix with graceful defaults
-WORKER_COUNT = int(os.environ.get("PYTHON_POLICY_WORKERS", "4")) -MAX_CONCURRENT = int(os.environ.get("PYTHON_POLICY_MAX_CONCURRENT", "100")) +def _parse_int_env(name: str, default: int) -> int: + """Parse integer environment variable with fallback to default.""" + value = os.environ.get(name) + if value is None: + return default + try: + return int(value) + except ValueError: + logging.getLogger(__name__).warning( + f"Invalid integer for {name}='{value}', using default {default}" + ) + return default + +WORKER_COUNT = _parse_int_env("PYTHON_POLICY_WORKERS", 4) +MAX_CONCURRENT = _parse_int_env("PYTHON_POLICY_MAX_CONCURRENT", 100)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/main.py` around lines 34 - 35, Wrap the int() parsing of PYTHON_POLICY_WORKERS and PYTHON_POLICY_MAX_CONCURRENT in safe parsing with ValueError handling so bad env values don't crash startup: implement a small helper (e.g., parse_int_env(var_name, default)) or try/except around the conversions used to set WORKER_COUNT and MAX_CONCURRENT, fall back to the existing defaults (4 and 100) on parse failure, and emit a clear warning/log message (use the module logger or print) that includes the invalid value and the default being used; update references to WORKER_COUNT and MAX_CONCURRENT to use the sanitized values.gateway/gateway-builder/internal/validation/validator.go (1)
73-102: Consider handling unknown runtimes explicitly.The current logic treats any non-"python" runtime as Go, which could silently apply incorrect validations if new runtimes are added in the future.
♻️ Proposed enhancement for explicit runtime handling
if policy.Runtime == "python" { // Python-specific validations structErrors := ValidatePythonDirectoryStructure(policy) result.Errors = append(result.Errors, structErrors...) if len(structErrors) > 0 { result.Valid = false } - } else { + } else if policy.Runtime == "" || policy.Runtime == "go" { // Go-specific validations // Directory structure validation structErrors := ValidateDirectoryStructure(policy) + // ... existing code ... + } else { + result.Errors = append(result.Errors, types.ValidationError{ + PolicyName: policy.Name, + PolicyVersion: policy.Version, + FilePath: policy.Path, + Message: fmt.Sprintf("unsupported runtime: %s", policy.Runtime), + }) + result.Valid = false }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/validation/validator.go` around lines 73 - 102, The code assumes any non-"python" runtime is Go; update the runtime handling in the validator to be explicit: check policy.Runtime for "python" (call ValidatePythonDirectoryStructure) and for "go" (call ValidateDirectoryStructure, ValidateGoInterface, ValidateGoMod), and add a default branch that appends a clear validation error to result.Errors and sets result.Valid = false when an unknown runtime is provided; reference the existing symbols policy.Runtime, ValidatePythonDirectoryStructure, ValidateDirectoryStructure, ValidateGoInterface, ValidateGoMod, result.Errors and result.Valid so the new logic replaces the current if/else block with an explicit switch or if/else-if branches plus the default error path.gateway/gateway-runtime/policy-engine/cmd/policy-engine/main.go (1)
242-251: Consider consolidating socket detection logic.The current implementation has duplicate code in both branches. Consider refactoring to determine the socket path first, then create the health checker once:
Suggested refactor
// Check if Python executor is available (socket configured) var pythonHealthChecker admin.PythonHealthChecker - if _, err := os.Stat(os.Getenv("PYTHON_EXECUTOR_SOCKET")); err == nil { - sm := pythonbridge.GetStreamManager() - pythonHealthChecker = pythonbridge.NewPythonHealthAdapter(sm) - } else if _, err := os.Stat("/var/run/api-platform/python-executor.sock"); err == nil { + pythonSocket := os.Getenv("PYTHON_EXECUTOR_SOCKET") + if pythonSocket == "" { + pythonSocket = "/var/run/api-platform/python-executor.sock" + } + if _, err := os.Stat(pythonSocket); err == nil { sm := pythonbridge.GetStreamManager() pythonHealthChecker = pythonbridge.NewPythonHealthAdapter(sm) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/cmd/policy-engine/main.go` around lines 242 - 251, The socket detection duplicates creation logic for pythonHealthChecker; refactor by first resolving the socket path (check os.Getenv("PYTHON_EXECUTOR_SOCKET") and if empty or not present fall back to "/var/run/api-platform/python-executor.sock"), then Stat that resolved path once and, if it exists, call pythonbridge.GetStreamManager() and pythonbridge.NewPythonHealthAdapter(sm) to assign pythonHealthChecker; finally pass pythonHealthChecker into admin.NewServer(&cfg.PolicyEngine.Admin, k, reg, xdsSyncStatusProvider, healthProvider, pythonHealthChecker).gateway/gateway-runtime/python-executor/executor/policy_loader.py (2)
95-97: Uselogging.exceptionto preserve stack trace.Catching a broad
Exceptionand usinglogger.errorloses the stack trace. Usinglogger.exceptionwill automatically include the traceback, which is valuable for debugging policy loading failures.♻️ Proposed fix
except Exception as e: - logger.error(f"Failed to load policy {policy_key} from {module_path}: {e}") + logger.exception(f"Failed to load policy {policy_key} from {module_path}") return False🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/policy_loader.py` around lines 95 - 97, The except block in policy_loader.py that catches Exception and calls logger.error (for example in the load policy routine handling policy_key and module_path) should be changed to use logger.exception to preserve the traceback; update the exception handler around where policy loading occurs (the except Exception as e: block) to call logger.exception with a descriptive message including policy_key and module_path, then return False as before so the stack trace is included in logs for debugging.
86-89: Simplify attribute access.Using
getattrwith a constant attribute is unnecessary. Direct attribute access is cleaner and equivalent.♻️ Proposed fix
- factory = getattr(module, 'get_policy') - if not callable(factory): - logger.error(f"Policy module {module_path} 'get_policy' is not callable") - return False + factory = module.get_policy + if not callable(factory): + logger.error(f"Policy module {module_path} 'get_policy' is not callable") + return False🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/policy_loader.py` around lines 86 - 89, Replace the getattr call with direct attribute access: instead of factory = getattr(module, 'get_policy'), use factory = module.get_policy; keep the existing callable check (if not callable(factory): ...). This simplifies access to the module's get_policy attribute while preserving the current behavior in policy_loader.py where the variable factory is used to validate and load the policy.gateway/gateway-builder/internal/discovery/manifest.go (1)
500-533:parseProcessingModesilently ignores unrecognized mode strings.Invalid values like
"INVALID"are silently ignored, leaving default values. Consider logging a warning for unrecognized values to help policy authors debug configuration issues.♻️ Proposed fix
func parseProcessingMode(config *policy.ProcessingModeConfig) *policy.ProcessingMode { if config == nil { // Default: process request headers only return &policy.ProcessingMode{ RequestHeaderMode: policy.HeaderModeProcess, RequestBodyMode: policy.BodyModeSkip, ResponseHeaderMode: policy.HeaderModeSkip, ResponseBodyMode: policy.BodyModeSkip, } } mode := &policy.ProcessingMode{ RequestHeaderMode: policy.HeaderModeProcess, RequestBodyMode: policy.BodyModeSkip, ResponseHeaderMode: policy.HeaderModeSkip, ResponseBodyMode: policy.BodyModeSkip, } if config.RequestHeaderMode == "SKIP" { mode.RequestHeaderMode = policy.HeaderModeSkip + } else if config.RequestHeaderMode != "" && config.RequestHeaderMode != "PROCESS" { + slog.Warn("Unrecognized RequestHeaderMode, using default", "value", config.RequestHeaderMode) } if config.RequestBodyMode == "BUFFER" { mode.RequestBodyMode = policy.BodyModeBuffer + } else if config.RequestBodyMode != "" && config.RequestBodyMode != "SKIP" { + slog.Warn("Unrecognized RequestBodyMode, using default", "value", config.RequestBodyMode) } // ... similar for other modes🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 500 - 533, parseProcessingMode currently ignores unrecognized string values (e.g., "INVALID"), leaving defaults silently; update parseProcessingMode to validate each config field (RequestHeaderMode, RequestBodyMode, ResponseHeaderMode, ResponseBodyMode) and emit a warning when a value is not one of the expected literals ("SKIP"/"PROCESS" for headers, "BUFFER" for bodies) so authors can debug misconfiguration; implement this by either (a) adding a logger parameter to parseProcessingMode and calling logger.Warnf with the field name and invalid value, or (b) using the package-level logger if one exists, and change the per-field if checks to explicit switches or maps that fall through to the warning path when encountering an unknown string, then keep existing defaults.gateway/gateway-runtime/python-executor/executor/translator.py (2)
221-226: Unused method_proto_list_to_python.This helper method is defined but never called. The list conversion is handled inline within
_proto_value_to_pythonat line 217.♻️ Proposed fix - remove unused method
- `@staticmethod` - def _proto_list_to_python(list_value: ListValue) -> list: - """Convert a protobuf ListValue to a Python list.""" - if list_value is None: - return [] - return [Translator._proto_value_to_python(v) for v in list_value.values]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/translator.py` around lines 221 - 226, The method _proto_list_to_python is unused (list conversion is done inline in _proto_value_to_python); remove the unused method _proto_list_to_python from translator.py and also delete any now-unreferenced imports or comments that only supported it, ensuring _proto_value_to_python remains the single conversion path and tests/other callers still use Translator._proto_value_to_python as before.
20-20: Unused import:NullValue.
NullValueis imported but not used in the code.♻️ Proposed fix
-from google.protobuf.struct_pb2 import Struct, Value, ListValue, NullValue +from google.protobuf.struct_pb2 import Struct, Value, ListValue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/translator.py` at line 20, The import line in translator.py includes an unused symbol NullValue from google.protobuf.struct_pb2; remove NullValue from the import list so it reads from google.protobuf.struct_pb2 import Struct, Value, ListValue (cleaning up the unused import and avoiding linter warnings), ensuring any references to NullValue in functions or classes (if present) are also handled or replaced before removing.gateway/gateway-builder/internal/policyengine/generator.go (2)
177-178: Log ignored errors when reading base requirements.Silently ignoring the error from
os.ReadFilecan hide misconfiguration. A debug log would help troubleshoot missing base requirements.♻️ Proposed fix
// 3. Merge requirements.txt files (including base requirements) baseReqPath := filepath.Join(pythonOutputDir, "requirements.txt") - baseRequirements, _ := os.ReadFile(baseReqPath) + baseRequirements, err := os.ReadFile(baseReqPath) + if err != nil && !os.IsNotExist(err) { + slog.Debug("Could not read base requirements.txt", "path", baseReqPath, "error", err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 177 - 178, The read of base requirements using os.ReadFile(baseReqPath) currently ignores errors; update the code around baseReqPath (constructed with filepath.Join and pythonOutputDir) to check the returned error, log a debug/error message that includes baseReqPath and the error details, and handle the failure (e.g., treat baseRequirements as empty or return the error up) instead of swallowing it; reference the variables baseReqPath and baseRequirements when adding the log and handling logic.
289-316:copyDirdoesn't handle symlinks.
filepath.Walkfollows symlinks by default, which could lead to unexpected behavior (infinite loops with circular symlinks, or copying unintended files). Consider usingfilepath.WalkDirand checking for symlinks.♻️ Proposed fix using WalkDir with symlink handling
// copyDir copies a directory recursively func copyDir(src, dst string) error { - return filepath.Walk(src, func(path string, info os.FileInfo, err error) error { + return filepath.WalkDir(src, func(path string, d os.DirEntry, err error) error { if err != nil { return err } + // Skip symlinks to avoid potential issues + if d.Type()&os.ModeSymlink != 0 { + slog.Debug("Skipping symlink during copy", "path", path) + return nil + } + // Compute relative path relPath, err := filepath.Rel(src, path) if err != nil { return err } dstPath := filepath.Join(dst, relPath) - if info.IsDir() { + if d.IsDir() { + info, err := d.Info() + if err != nil { + return err + } return os.MkdirAll(dstPath, info.Mode()) } // Copy file data, err := os.ReadFile(path) if err != nil { return err } + info, err := d.Info() + if err != nil { + return err + } return os.WriteFile(dstPath, data, info.Mode()) }) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 289 - 316, copyDir currently uses filepath.Walk which follows symlinks and can cause loops or copy unintended files; rewrite copyDir to use filepath.WalkDir and inspect each DirEntry (or use os.Lstat on path) to detect if an entry is a symlink (ModeSymlink). For symlinks, do not follow them: call os.Readlink to get the link target and create a corresponding symlink at dst with os.Symlink (preserving relative targets when possible), and for regular files/dirs continue to create dirs or copy file contents as before. Ensure you avoid recursing into symlinked directories and preserve file modes for non-symlink files (as in the existing os.ReadFile/os.WriteFile and info.Mode usage).gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go (1)
89-95:grpc.Dial,grpc.WithBlock, andgrpc.WithTimeoutare deprecated. In grpc-go v1.63+,grpc.Dialis deprecated in favor ofgrpc.NewClient, and the "Dial-era options" (WithBlock,WithTimeout) are ignored byNewClient. Migrate togrpc.NewClientwith explicit context-based timeouts and readiness management (e.g., callconn.Connect()to wait for connection if needed).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go` around lines 89 - 95, The current call using grpc.Dial with grpc.WithBlock and grpc.WithTimeout is deprecated; replace it by creating a client with grpc.NewClient and an explicit context timeout: build a context.Context with context.WithTimeout for the 10s deadline, call grpc.NewClient (passing grpc.WithContextDialer(dialer) and grpc.WithTransportCredentials(insecure.NewCredentials()) as dial options), then use the returned client/connection's Connect(ctx) to wait for readiness and handle errors. Update the code paths that currently reference conn from grpc.Dial to use the new client/conn returned by grpc.NewClient and ensure cancellation of the timeout context after Connect returns.gateway/gateway-runtime/python-executor/executor/server.py (3)
313-319: Consider moving import to module level.The
google.protobuf.struct_pb2.Structimport inside_dict_to_structworks but is unconventional. Moving it to the module-level imports improves clarity and avoids repeated import overhead on each call.Proposed change
At the top of the file with other imports:
from google.protobuf.struct_pb2 import StructThen simplify the method:
`@staticmethod` def _dict_to_struct(d: dict): - from google.protobuf.struct_pb2 import Struct s = Struct() if d: s.update(d) return s🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 313 - 319, The import of Struct is done inside the static method _dict_to_struct which is unconventional and incurs repeated import overhead; move "from google.protobuf.struct_pb2 import Struct" to the module-level imports and then simplify _dict_to_struct to directly reference Struct (keeping the same behavior: create s = Struct(), update if d, return s).
160-247: Well-structured concurrent fan-out implementation.The bidirectional streaming with
asyncio.Conditionfor coordination,Semaphorefor concurrency limiting, and out-of-order response handling is correctly designed. The Go side correlating byrequest_idmakes this pattern safe.Minor style improvement per static analysis (line 237):
Optional: Use list unpacking instead of concatenation
- all_tasks = [reader_task] + list(in_flight) + all_tasks = [reader_task, *in_flight]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 160 - 247, The code builds all_tasks with list concatenation ([reader_task] + list(in_flight)) inside ExecuteStream; replace that concatenation with list unpacking (e.g., [reader_task, *in_flight]) to be more idiomatic and avoid an extra list() allocation—update the expression that constructs all_tasks (referenced alongside reader_task and the in_flight set) accordingly.
68-74: Useasyncio.get_running_loop()instead ofget_event_loop()in async contexts.When called from within an async function,
asyncio.get_event_loop()is discouraged and emits aDeprecationWarningin Python 3.10+. Useasyncio.get_running_loop()which is the recommended approach for accessing the event loop from inside coroutines.This pattern appears at lines 69, 117, and 182. Apply the fix:
Example fix
- loop = asyncio.get_event_loop() + loop = asyncio.get_running_loop()Apply to all three locations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 68 - 74, Replace asyncio.get_event_loop() with asyncio.get_running_loop() in the async contexts inside server.py where run_in_executor is used (specifically the calls that use self._init_policy_sync and the other synchronous handlers invoked via run_in_executor). Locate the three occurrences where the code does loop = asyncio.get_event_loop() before calling loop.run_in_executor(...) and change them to loop = asyncio.get_running_loop() so the coroutine uses the running event loop; keep the subsequent run_in_executor(...) calls and arguments unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 98f57ec5-1534-4663-97b6-bf83c074188c
⛔ Files ignored due to path filters (3)
gateway/gateway-builder/go.sumis excluded by!**/*.sumgateway/gateway-runtime/policy-engine/internal/pythonbridge/proto/python_executor.pb.gois excluded by!**/*.pb.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/proto/python_executor_grpc.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (52)
gateway/build-lock.yamlgateway/build.yamlgateway/gateway-builder/cmd/builder/main.gogateway/gateway-builder/go.modgateway/gateway-builder/internal/buildfile/generator.gogateway/gateway-builder/internal/discovery/discovery_test.gogateway/gateway-builder/internal/discovery/manifest.gogateway/gateway-builder/internal/discovery/policy.gogateway/gateway-builder/internal/discovery/python_module.gogateway/gateway-builder/internal/policyengine/generator.gogateway/gateway-builder/internal/policyengine/policyengine_test.gogateway/gateway-builder/internal/policyengine/registry.gogateway/gateway-builder/internal/validation/structure.gogateway/gateway-builder/internal/validation/validator.gogateway/gateway-builder/pkg/types/manifest.gogateway/gateway-builder/pkg/types/policy.gogateway/gateway-builder/templates/plugin_registry.go.tmplgateway/gateway-controller/default-policies/add-python-header.yamlgateway/gateway-controller/default-policies/prompt-compression.yamlgateway/gateway-controller/default-policies/slugify-body-python.yamlgateway/gateway-runtime/Dockerfilegateway/gateway-runtime/Makefilegateway/gateway-runtime/docker-entrypoint.shgateway/gateway-runtime/policy-engine/cmd/policy-engine/main.gogateway/gateway-runtime/policy-engine/internal/admin/handlers.gogateway/gateway-runtime/policy-engine/internal/admin/handlers_test.gogateway/gateway-runtime/policy-engine/internal/admin/server.gogateway/gateway-runtime/policy-engine/internal/admin/server_test.gogateway/gateway-runtime/policy-engine/internal/admin/types.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/client.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.gogateway/gateway-runtime/proto/python_executor.protogateway/gateway-runtime/python-executor/executor/__init__.pygateway/gateway-runtime/python-executor/executor/instance_store.pygateway/gateway-runtime/python-executor/executor/policy_loader.pygateway/gateway-runtime/python-executor/executor/server.pygateway/gateway-runtime/python-executor/executor/translator.pygateway/gateway-runtime/python-executor/main.pygateway/gateway-runtime/python-executor/proto/__init__.pygateway/gateway-runtime/python-executor/proto/python_executor_pb2.pygateway/gateway-runtime/python-executor/proto/python_executor_pb2_grpc.pygateway/gateway-runtime/python-executor/requirements.txtgateway/gateway-runtime/python-executor/sdk/__init__.pygateway/gateway-runtime/python-executor/sdk/policy.pygateway/sample-policies/build.yamlgateway/sample-policies/slugify-body-python/policy-definition.yamlgateway/sample-policies/slugify-body-python/policy.pygateway/sample-policies/slugify-body-python/requirements.txtgateway/target/build.yamlsdk/gateway/policy/v1alpha/definition.go
| func findPolicyDir(installDir string) (string, error) { | ||
| var found string | ||
| err := filepath.Walk(installDir, func(path string, info os.FileInfo, err error) error { | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if !info.IsDir() && info.Name() == "policy-definition.yaml" { | ||
| found = filepath.Dir(path) | ||
| return filepath.SkipAll | ||
| } | ||
| return nil | ||
| }) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| if found == "" { | ||
| return "", fmt.Errorf("no policy-definition.yaml found in installed package") | ||
| } | ||
| return found, nil |
There was a problem hiding this comment.
Require exactly one policy-definition.yaml in the installed tree.
This returns the first match it finds. If the install tree contains multiple policy definitions, discovery silently picks one path instead of failing fast on an ambiguous package.
Suggested fix
func findPolicyDir(installDir string) (string, error) {
- var found string
+ var matches []string
err := filepath.Walk(installDir, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if !info.IsDir() && info.Name() == "policy-definition.yaml" {
- found = filepath.Dir(path)
- return filepath.SkipAll
+ matches = append(matches, filepath.Dir(path))
}
return nil
})
if err != nil {
return "", err
}
- if found == "" {
+ if len(matches) == 0 {
return "", fmt.Errorf("no policy-definition.yaml found in installed package")
}
- return found, nil
+ if len(matches) > 1 {
+ return "", fmt.Errorf("multiple policy-definition.yaml files found in installed package")
+ }
+ return matches[0], nil
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 165
- 183, The function findPolicyDir currently stops at the first
policy-definition.yaml and should instead ensure exactly one exists; modify
findPolicyDir to scan the installDir and detect multiple matches by either
collecting all matching directories or tracking the first found path and
returning an error if a second match is encountered (e.g., include both paths in
the error), and only return the single directory if exactly one match is found;
keep references to the existing function name findPolicyDir and the
"policy-definition.yaml" filename when implementing the change.
| func (b *PythonBridge) OnRequest(ctx *policy.RequestContext, params map[string]interface{}) policy.RequestAction { | ||
| req := b.buildRequest(ctx, "on_request", params) | ||
|
|
||
| // Execute via stream manager | ||
| resp, err := b.streamManager.Execute(context.Background(), req) | ||
| if err != nil { | ||
| b.slogger.Error("Failed to execute Python policy", "error", err, "phase", "on_request") | ||
| return b.errorResponse(err) | ||
| } |
There was a problem hiding this comment.
Request context is discarded in favor of context.Background().
Using context.Background() loses the original request's cancellation signals, deadlines, and tracing context. If the HTTP request is cancelled, the Python execution will continue unnecessarily.
🐛 Proposed fix - accept context parameter
The OnRequest and OnResponse methods should ideally receive a context.Context. If the SDK interface doesn't support this, consider extracting a context from RequestContext or using a context stored in the bridge.
For now, a workaround using a derived context:
func (b *PythonBridge) OnRequest(ctx *policy.RequestContext, params map[string]interface{}) policy.RequestAction {
req := b.buildRequest(ctx, "on_request", params)
+ // TODO: Propagate proper context when SDK interface supports it
+ execCtx, cancel := context.WithTimeout(context.Background(), getTimeout())
+ defer cancel()
+
// Execute via stream manager
- resp, err := b.streamManager.Execute(context.Background(), req)
+ resp, err := b.streamManager.Execute(execCtx, req)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 57 - 65, The code is discarding the incoming request context by calling
streamManager.Execute with context.Background(); change it to derive and pass
the real request context from the RequestContext (e.g., use ctx.Context(),
ctx.Request.Context(), or the bridge-stored context) when calling
streamManager.Execute in PythonBridge.OnRequest, and make the equivalent change
in PythonBridge.OnResponse so cancellation, deadlines and tracing propagate into
buildRequest/streamManager.Execute.
|
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/sample-policies/build.yaml`:
- Around line 14-15: The manifest uses the wrong directory name for the sample:
update the filePath value (the entry.FilePath that the gateway builder resolves
and validates) to match the actual sample directory (e.g., change
"./slugify-body" to the exact directory name "./slugify-body-python") so the
os.Stat() check in the discovery/manifest code and path validation in
fsutil/path_validation.go will succeed; ensure the filePath string exactly
matches the real directory name used in the repository.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 51f78fcc-54da-4b5b-bdc4-5ff7c9d3fe38
📒 Files selected for processing (1)
gateway/sample-policies/build.yaml
Update example PyPI package name in discovery code and tests from "wso2-gateway-policy-prompt-compression" to "my-gateway-policy" (comments and TestParsePipPackageRef). Remove two default policy YAMLs: gateway-controller/default-policies/add-python-header.yaml and prompt-compression.yaml.
Update .gitignore to exclude common Python artifacts (__pycache__, .venv, .pytest_cache, .coverage, .mypy_cache, .ipynb_checkpoints, egg metadata, etc.). Remove two sample Python policies (slugify-body-python and uppercase-body) from gateway/build-lock.yaml. Tweak gateway-runtime/docker-entrypoint.sh to check for python_policy_registry.py (generated only when Python policies exist) before starting the Python executor to avoid launching it when no Python policies are present.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
gateway/gateway-builder/internal/discovery/python_module.go (3)
165-184:⚠️ Potential issue | 🟠 MajorFail fast when multiple
policy-definition.yamlfiles are found.The function returns the first match found. If the installed package contains multiple policy definitions (e.g., from transitive dependencies), discovery silently picks one instead of failing with an ambiguity error.
Suggested fix to detect and reject ambiguous packages
func findPolicyDir(installDir string) (string, error) { - var found string + var matches []string err := filepath.Walk(installDir, func(path string, info os.FileInfo, err error) error { if err != nil { return err } if !info.IsDir() && info.Name() == "policy-definition.yaml" { - found = filepath.Dir(path) - return filepath.SkipAll + matches = append(matches, filepath.Dir(path)) } return nil }) if err != nil { return "", err } - if found == "" { + if len(matches) == 0 { return "", fmt.Errorf("no policy-definition.yaml found in installed package") } - return found, nil + if len(matches) > 1 { + return "", fmt.Errorf("multiple policy-definition.yaml files found in installed package: %v", matches) + } + return matches[0], nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 165 - 184, The function findPolicyDir currently stops at the first "policy-definition.yaml" (uses filepath.SkipAll) and returns that, which hides ambiguity; change it to collect all matching directories during filepath.Walk (remove filepath.SkipAll) into a slice or map to deduplicate, track the number of unique matches, and after the walk return an error if count == 0 or count > 1 (ambiguous) and only return the single directory when exactly one match is found; update variables used in findPolicyDir (e.g., replace/augment found with a matches slice or set) and ensure error paths propagate the new ambiguous error.
69-127:⚠️ Potential issue | 🟠 MajorRedact credentials from logs and error messages.
The
pipPackagereference andindexURLmay contain credentials when using private indexes (e.g.,https://user:token@host/simple). Line 69 logs the full reference, line 92 logsindexURLdirectly, and line 111 includes rawstderrwhich may echo credentials.Suggested fix to sanitize sensitive data
- slog.Info("Fetching pip package", "reference", pipPackage, "phase", "discovery") - pkgName, version, indexURL, err := ParsePipPackageRef(pipPackage) if err != nil { return nil, err } + slog.Info("Fetching pip package", + "package", pkgName, + "version", version, + "phase", "discovery") // ... later ... slog.Debug("Running pip install", "package", pipSpec, "target", installDir, - "indexURL", indexURL, + "hasCustomIndex", indexURL != "", "phase", "discovery") // ... in error handling ... - return nil, fmt.Errorf("pip install failed for %s: %w; stderr: %s", pipSpec, err, stderr.String()) + return nil, fmt.Errorf("pip install failed for %s: %w", pipSpec, err)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 69 - 127, The logs and error messages expose credentials from pipPackage, indexURL and pip stderr; update discovery code in python_module.go to sanitize/mask credentials before logging or including in errors: add a helper (e.g., sanitizeURL or maskCredentials) that strips user:pass@ from URLs and use it when logging pipPackage, indexURL and when building the pip error message (use sanitized stderr or redact any URLs found in stderr) in the pip install block (references: pipPackage, indexURL, stderr, pipSpec, the exec.CommandContext call and resolvePipExecutable call); ensure RemoveAll(installDir) remains but never logs raw credential-bearing strings.
139-160:⚠️ Potential issue | 🟠 MajorFix parsing for index URLs containing credentials.
Using
strings.LastIndex(ref, "@")misparses values likepkg==1.0.0@https://user:token@host/simplebecause the last@is inside the URL, not the separator. The version would incorrectly absorb part of the URL.Suggested fix: split by `==` first, then handle index URL
func ParsePipPackageRef(ref string) (pkgName, version, indexURL string, err error) { - // Split off optional @<index-url> suffix - mainPart := ref - if idx := strings.LastIndex(ref, "@"); idx > 0 { - candidate := ref[idx+1:] - // Only treat as index URL if it looks like a URL (contains "://") - if strings.Contains(candidate, "://") { - indexURL = candidate - mainPart = ref[:idx] - } - } - - // Split "package==version" - parts := strings.SplitN(mainPart, "==", 2) - if len(parts) != 2 || parts[0] == "" || parts[1] == "" { - return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>': %s", ref) + // Split "package==version[`@index-url`]" by == first + parts := strings.SplitN(ref, "==", 2) + if len(parts) != 2 { + return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]': %s", ref) } pkgName = strings.TrimSpace(parts[0]) - version = strings.TrimSpace(parts[1]) + versionPart := strings.TrimSpace(parts[1]) + + // Check for optional @<index-url> suffix in the version part + if idx := strings.Index(versionPart, "@"); idx >= 0 { + version = strings.TrimSpace(versionPart[:idx]) + indexURL = strings.TrimSpace(versionPart[idx+1:]) + if !strings.Contains(indexURL, "://") { + return "", "", "", fmt.Errorf("invalid pipPackage format, index URL must contain '://': %s", ref) + } + } else { + version = versionPart + } + + if pkgName == "" || version == "" { + return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]': %s", ref) + } return pkgName, version, indexURL, nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 139 - 160, ParsePipPackageRef currently finds the separator "@" before splitting by "==", which breaks when the index URL contains credentials; change the logic to first split ref by "==" (using strings.SplitN(ref, "==", 2)) to get pkg and rightSide, validate both parts, then detect an index URL inside rightSide by locating "://"; if a scheme is found, find the separator '@' that precedes that scheme (e.g., use strings.LastIndex on rightSide up to the scheme index) and treat the substring after that '@' as indexURL and the substring before it as version; update variables pkgName, version, indexURL in ParsePipPackageRef accordingly and keep existing error handling for invalid formats.
🧹 Nitpick comments (3)
gateway/gateway-builder/internal/discovery/python_module.go (2)
76-80: Document temp directory cleanup responsibility.The
installDiris created but only cleaned up on error paths. On success,PipPackageInfo.Dirpoints inside this temp directory, so the caller must handle cleanup. Consider documenting this contract or providing a cleanup helper.Suggested documentation addition
// FetchPipPackage downloads a Python policy package using pip. // // Format: "<package>==<version>[@<index-url>]" // // Examples: // - "my-gateway-policy==1.0.0" // - "my-org-auth-policy==2.3.0@https://pypi.my-company.com/simple" // // The package is installed into an isolated directory using pip install --target. +// +// NOTE: On success, the returned PipPackageInfo.Dir resides within a temporary +// directory that the caller is responsible for cleaning up when no longer needed. +// Use os.RemoveAll(filepath.Dir(info.Dir)) or track the parent temp directory. func FetchPipPackage(pipPackage string) (*PipPackageInfo, error) {Also applies to: 117-121
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 76 - 80, The temp install directory created by os.MkdirTemp (installDir) is only removed on error, yet PipPackageInfo.Dir is set to a path inside that temp dir on success; update the code and/or docs to make cleanup responsibility explicit: either add a clear comment/docstring on the function that returns *PipPackageInfo stating callers must remove PipPackageInfo.Dir/installDir when done, or provide a cleanup helper (e.g., a Close/Remove method or a returned cleanup func) alongside PipPackageInfo to encapsulate os.RemoveAll of installDir; ensure references to installDir and PipPackageInfo.Dir (and any created helper) are added where the function is defined and where PipPackageInfo is constructed so callers know how to release the temp directory.
41-49: Fallback topython3without validation may cause confusing failures.If neither
pip3norpipis found, the function silently returnspython3without verifying it exists. This could produce a confusing "executable not found" error later duringcmd.Run().Suggested fix to validate fallback
func resolvePipExecutable() (exe string, prefixArgs []string) { for _, candidate := range []string{"pip3", "pip"} { if _, err := exec.LookPath(candidate); err == nil { return candidate, nil } } // Fallback: python3 -m pip + if _, err := exec.LookPath("python3"); err != nil { + // Return python3 anyway; caller will get a clear error from exec + slog.Warn("No pip executable found; falling back to python3 -m pip", "phase", "discovery") + } return "python3", []string{"-m", "pip"} }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 41 - 49, The fallback in resolvePipExecutable should validate that "python3" exists before returning it; update resolvePipExecutable to call exec.LookPath("python3") and only return "python3" with []string{"-m","pip"} if LookPath succeeds, otherwise return an explicit missing indicator (e.g., "" and nil) so callers know no pip executable was found; also update the caller(s) that build/Run exec.Command (where cmd.Run() is used) to check for an empty exe and return a clear error message instead of letting exec.Command/Run fail with a confusing "executable not found" error.gateway/gateway-builder/internal/discovery/discovery_test.go (1)
871-908: Consider adding a test case for credentialed index URLs.The
ParsePipPackageReftests cover basic scenarios, but a test case with credentials in the index URL (e.g.,pkg==1.0.0@https://user:token@host/simple) would help verify correct parsing behavior for private registries using basic auth.Example test case
func TestParsePipPackageRef_WithCredentialedIndex(t *testing.T) { pkg, version, indexURL, err := ParsePipPackageRef("my-policy==1.0.0@https://user:token@pypi.private.com/simple") require.NoError(t, err) assert.Equal(t, "my-policy", pkg) assert.Equal(t, "1.0.0", version) assert.Equal(t, "https://user:token@pypi.private.com/simple", indexURL) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/discovery_test.go` around lines 871 - 908, Add a unit test that verifies ParsePipPackageRef correctly parses index URLs containing credentials by adding a new test (e.g., TestParsePipPackageRef_WithCredentialedIndex) that calls ParsePipPackageRef with an input like "my-policy==1.0.0@https://user:token@pypi.private.com/simple", asserts no error, and checks pkg == "my-policy", version == "1.0.0" and indexURL == "https://user:token@pypi.private.com/simple"; place the test alongside the existing tests in discovery_test.go so it covers credentialed private registries and ensures ParsePipPackageRef handles user:password in the host component.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/docker-entrypoint.sh`:
- Around line 104-107: The exported environment variables for the Python
executor are incorrect: replace or add exports so they match what main.py reads
(use PYTHON_POLICY_MAX_CONCURRENT with the same default of 100 instead of
PYTHON_POLICY_TIMEOUT), keep PYTHON_POLICY_WORKERS and PYTHON_EXECUTOR_SOCKET
as-is, and remove the unused PYTHON_POLICY_TIMEOUT export; also update the
related startup log message (the log near the executor export block) to report
PYTHON_POLICY_MAX_CONCURRENT (and its value) instead of PYTHON_POLICY_TIMEOUT so
logs reflect the actual configuration used by main.py.
- Around line 71-78: The --py.* flags collected into PY_ARGS are never used
because main.py reads only environment variables; update docker-entrypoint.sh to
translate any --py.<name> <value> pairs into environment variables before
invoking main.py (e.g., convert --py.foo bar into export FOO=bar or the specific
env var names main.py expects), remove the unused accumulation into PY_ARGS, and
ensure PY_ARGS is not passed to python; alternatively, if you prefer CLI
support, add argument parsing in main.py to consume these flags (reference:
PY_ARGS, docker-entrypoint.sh, main.py, Python Executor) so the flags actually
affect runtime configuration.
---
Duplicate comments:
In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 165-184: The function findPolicyDir currently stops at the first
"policy-definition.yaml" (uses filepath.SkipAll) and returns that, which hides
ambiguity; change it to collect all matching directories during filepath.Walk
(remove filepath.SkipAll) into a slice or map to deduplicate, track the number
of unique matches, and after the walk return an error if count == 0 or count > 1
(ambiguous) and only return the single directory when exactly one match is
found; update variables used in findPolicyDir (e.g., replace/augment found with
a matches slice or set) and ensure error paths propagate the new ambiguous
error.
- Around line 69-127: The logs and error messages expose credentials from
pipPackage, indexURL and pip stderr; update discovery code in python_module.go
to sanitize/mask credentials before logging or including in errors: add a helper
(e.g., sanitizeURL or maskCredentials) that strips user:pass@ from URLs and use
it when logging pipPackage, indexURL and when building the pip error message
(use sanitized stderr or redact any URLs found in stderr) in the pip install
block (references: pipPackage, indexURL, stderr, pipSpec, the
exec.CommandContext call and resolvePipExecutable call); ensure
RemoveAll(installDir) remains but never logs raw credential-bearing strings.
- Around line 139-160: ParsePipPackageRef currently finds the separator "@"
before splitting by "==", which breaks when the index URL contains credentials;
change the logic to first split ref by "==" (using strings.SplitN(ref, "==", 2))
to get pkg and rightSide, validate both parts, then detect an index URL inside
rightSide by locating "://"; if a scheme is found, find the separator '@' that
precedes that scheme (e.g., use strings.LastIndex on rightSide up to the scheme
index) and treat the substring after that '@' as indexURL and the substring
before it as version; update variables pkgName, version, indexURL in
ParsePipPackageRef accordingly and keep existing error handling for invalid
formats.
---
Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/discovery_test.go`:
- Around line 871-908: Add a unit test that verifies ParsePipPackageRef
correctly parses index URLs containing credentials by adding a new test (e.g.,
TestParsePipPackageRef_WithCredentialedIndex) that calls ParsePipPackageRef with
an input like "my-policy==1.0.0@https://user:token@pypi.private.com/simple",
asserts no error, and checks pkg == "my-policy", version == "1.0.0" and indexURL
== "https://user:token@pypi.private.com/simple"; place the test alongside the
existing tests in discovery_test.go so it covers credentialed private registries
and ensures ParsePipPackageRef handles user:password in the host component.
In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 76-80: The temp install directory created by os.MkdirTemp
(installDir) is only removed on error, yet PipPackageInfo.Dir is set to a path
inside that temp dir on success; update the code and/or docs to make cleanup
responsibility explicit: either add a clear comment/docstring on the function
that returns *PipPackageInfo stating callers must remove
PipPackageInfo.Dir/installDir when done, or provide a cleanup helper (e.g., a
Close/Remove method or a returned cleanup func) alongside PipPackageInfo to
encapsulate os.RemoveAll of installDir; ensure references to installDir and
PipPackageInfo.Dir (and any created helper) are added where the function is
defined and where PipPackageInfo is constructed so callers know how to release
the temp directory.
- Around line 41-49: The fallback in resolvePipExecutable should validate that
"python3" exists before returning it; update resolvePipExecutable to call
exec.LookPath("python3") and only return "python3" with []string{"-m","pip"} if
LookPath succeeds, otherwise return an explicit missing indicator (e.g., "" and
nil) so callers know no pip executable was found; also update the caller(s) that
build/Run exec.Command (where cmd.Run() is used) to check for an empty exe and
return a clear error message instead of letting exec.Command/Run fail with a
confusing "executable not found" error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 61a52953-9abb-496b-9acb-1ca0b85a9b8c
📒 Files selected for processing (5)
.gitignoregateway/build-lock.yamlgateway/gateway-builder/internal/discovery/discovery_test.gogateway/gateway-builder/internal/discovery/python_module.gogateway/gateway-runtime/docker-entrypoint.sh
✅ Files skipped from review due to trivial changes (1)
- .gitignore
Fetch pip wheels (no deps) for Python policies during discovery, extract top_level module and source for build-time validation, and record pip spec/index so dependency installation can be deferred to the python-deps Docker stage. Added helpers to find the downloaded .whl, read top_level.txt, and extract the module from the wheel. Adjusted FetchPipPackage to use `pip download --no-deps` and return PipPackageInfo with PipSpec, PipIndexURL, TopLevelModule and extracted Dir. Updated GeneratePythonArtifacts to skip copying pip packages, use the package's top-level module in the generated python registry, and merge requirements by adding pip specs and extra index URLs while reading requirements.txt only for local policies. Extended DiscoveredPolicy with pip-related fields and clarified PythonSourceDir usage.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
gateway/gateway-builder/internal/policyengine/generator.go (3)
236-250: Consider handling malformed version strings gracefully.
strings.Split(p.Version, ".")[0]will panic with an index-out-of-range ifVersionis an empty string. While upstream validation should prevent this, defensive coding would add robustness.♻️ Proposed defensive check
for _, p := range policies { - majorVersion := strings.Split(p.Version, ".")[0] + versionParts := strings.Split(p.Version, ".") + majorVersion := "0" + if len(versionParts) > 0 && versionParts[0] != "" { + majorVersion = versionParts[0] + } key := fmt.Sprintf("%s:%s", p.Name, majorVersion)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 236 - 250, The loop over policies can panic when p.Version is empty; in the block that computes majorVersion (currently using strings.Split(p.Version, ".")[0]) add a defensive check: trim and verify p.Version is non-empty (or use strings.SplitN and default to "0") before indexing, then use that safe majorVersion when building key; update the code around variables majorVersion, key and the sb.WriteString call and keep sanitizeModuleName(p.Name) and the IsPipPackage branch unchanged.
336-361:copyDirfollows symlinks, which could lead to unexpected behavior or security issues.
filepath.Walkfollows symlinks by default. If a policy directory contains a symlink pointing outside the source tree (or a circular symlink), this could copy unintended files or cause infinite loops.♻️ Proposed fix to skip symlinks
func copyDir(src, dst string) error { return filepath.Walk(src, func(path string, info os.FileInfo, err error) error { if err != nil { return err } + + // Skip symlinks to avoid following links outside the source tree + if info.Mode()&os.ModeSymlink != 0 { + return nil + } // Compute relative path relPath, err := filepath.Rel(src, path)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 336 - 361, The copyDir function currently follows symlinks via filepath.Walk; update it to detect and skip symlinks so external or circular links aren't followed. Inside copyDir (or when switching to filepath.WalkDir), use os.Lstat (or DirEntry.Type()/IsSymlink) to check whether the current path is a symlink and, if so, return nil to skip it; otherwise proceed with creating directories and copying files using the file's regular mode. Ensure both directory and file handling branches perform the symlink check so no symlink targets are copied or traversed.
329-333:sanitizeModuleNamemay not handle all invalid Python identifier characters.Python module names cannot contain spaces, dots, or start with digits. Currently only hyphens are replaced with underscores. Consider validating/sanitizing more comprehensively to prevent import errors at runtime.
♻️ Proposed more thorough sanitization
// sanitizeModuleName converts a policy name to a valid Python module name func sanitizeModuleName(name string) string { - // Replace hyphens with underscores - return strings.ReplaceAll(name, "-", "_") + // Replace hyphens with underscores + result := strings.ReplaceAll(name, "-", "_") + // Replace dots with underscores + result = strings.ReplaceAll(result, ".", "_") + // Ensure it doesn't start with a digit + if len(result) > 0 && result[0] >= '0' && result[0] <= '9' { + result = "_" + result + } + return result }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 329 - 333, sanitizeModuleName currently only replaces hyphens and can still produce invalid Python module names; update sanitizeModuleName to (1) normalize to lowercase, (2) replace any sequence of non-alphanumeric/underscore characters (including spaces, dots, punctuation) with a single underscore, (3) trim leading/trailing underscores, (4) if the result starts with a digit prefix it with an underscore, and (5) fallback to a safe constant like "module" when the result is empty — apply these rules in the sanitizeModuleName function so generated names are valid Python identifiers and safe for imports.gateway/gateway-builder/internal/discovery/manifest.go (2)
516-549:parseProcessingModeuses case-sensitive string comparison.The mode strings are compared exactly (e.g.,
"SKIP","BUFFER"). If the YAML contains lowercase or mixed-case values (e.g.,"skip","buffer"), they won't be recognized and the default will apply silently. Consider normalizing to uppercase for robustness.♻️ Proposed fix to normalize case
+import "strings" + func parseProcessingMode(config *policy.ProcessingModeConfig) *policy.ProcessingMode { if config == nil { // Default: process request headers only return &policy.ProcessingMode{ RequestHeaderMode: policy.HeaderModeProcess, RequestBodyMode: policy.BodyModeSkip, ResponseHeaderMode: policy.HeaderModeSkip, ResponseBodyMode: policy.BodyModeSkip, } } mode := &policy.ProcessingMode{ RequestHeaderMode: policy.HeaderModeProcess, RequestBodyMode: policy.BodyModeSkip, ResponseHeaderMode: policy.HeaderModeSkip, ResponseBodyMode: policy.BodyModeSkip, } - if config.RequestHeaderMode == "SKIP" { + if strings.ToUpper(config.RequestHeaderMode) == "SKIP" { mode.RequestHeaderMode = policy.HeaderModeSkip } - if config.RequestBodyMode == "BUFFER" { + if strings.ToUpper(config.RequestBodyMode) == "BUFFER" { mode.RequestBodyMode = policy.BodyModeBuffer } - if config.ResponseHeaderMode == "PROCESS" { + if strings.ToUpper(config.ResponseHeaderMode) == "PROCESS" { mode.ResponseHeaderMode = policy.HeaderModeProcess } - if config.ResponseBodyMode == "BUFFER" { + if strings.ToUpper(config.ResponseBodyMode) == "BUFFER" { mode.ResponseBodyMode = policy.BodyModeBuffer } return mode }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 516 - 549, The parseProcessingMode function compares config fields case-sensitively so values like "skip" or "buffer" are ignored; normalize the incoming strings before checks (e.g., call strings.ToUpper or use strings.EqualFold) when inspecting config.RequestHeaderMode, config.RequestBodyMode, config.ResponseHeaderMode, and config.ResponseBodyMode, then set mode.RequestHeaderMode, mode.RequestBodyMode, mode.ResponseHeaderMode, and mode.ResponseBodyMode accordingly; ensure you import strings if using ToUpper.
238-257:DetectRuntimeonly checks top-level directory for.pyfiles.If a Python policy has its
.pyfiles in a subdirectory (e.g.,src/policy.py), this detection will incorrectly classify it as Go. Consider whether the convention requires top-level.pyfiles or if recursive checking is needed.♻️ Alternative: Check for policy.py specifically
func DetectRuntime(policyDir string) string { goMod := filepath.Join(policyDir, "go.mod") if _, err := os.Stat(goMod); err == nil { return "go" } - entries, err := os.ReadDir(policyDir) - if err != nil { - return "go" - } - for _, e := range entries { - if !e.IsDir() && filepath.Ext(e.Name()) == ".py" { - return "python" - } - } + // Check for policy.py which is required for Python policies + policyPy := filepath.Join(policyDir, "policy.py") + if _, err := os.Stat(policyPy); err == nil { + return "python" + } return "go" }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 238 - 257, DetectRuntime currently only inspects the policyDir top-level entries and will miss Python policies whose .py files live in subdirectories; update DetectRuntime to recursively scan the policyDir (e.g., use filepath.WalkDir or a recursive os.ReadDir) and return "python" if any file with extension ".py" is found (or, alternatively, specifically look for "policy.py" anywhere in the tree), while preserving the existing error handling and returning "go" only if no Python file is discovered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 64-139: FetchPipPackage leaves the temporary downloadDir behind on
success; after calling findWheelFile and extractModuleFromWheel you must clean
up downloadDir before returning. Modify FetchPipPackage so that once extractDir
is prepared it removes the temporary downloadDir (e.g.
os.RemoveAll(downloadDir)) prior to the final return; if extractModuleFromWheel
currently returns a path inside downloadDir, first create a separate temp
directory (or move/copy the extracted module to a new temp dir) and set
PipPackageInfo.Dir to that new location, then remove downloadDir to avoid
deleting the returned extractDir; refer to FetchPipPackage, findWheelFile, and
extractModuleFromWheel to locate the relevant logic.
In `@gateway/gateway-builder/internal/policyengine/generator.go`:
- Around line 191-197: The code currently ignores the error from os.ReadFile
when loading baseReqPath; change the logic in the block around baseReqPath :=
filepath.Join(pythonOutputDir, "requirements.txt") and baseRequirements, _ :=
os.ReadFile(baseReqPath) to properly handle errors: call os.ReadFile and if it
returns an error check if os.IsNotExist(err) then treat baseRequirements as an
empty string, otherwise return the error (propagate it) so permission or I/O
errors are not silenced; keep the subsequent mergeRequirements(pythonPolicies,
string(baseRequirements)) call unchanged.
---
Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/manifest.go`:
- Around line 516-549: The parseProcessingMode function compares config fields
case-sensitively so values like "skip" or "buffer" are ignored; normalize the
incoming strings before checks (e.g., call strings.ToUpper or use
strings.EqualFold) when inspecting config.RequestHeaderMode,
config.RequestBodyMode, config.ResponseHeaderMode, and config.ResponseBodyMode,
then set mode.RequestHeaderMode, mode.RequestBodyMode, mode.ResponseHeaderMode,
and mode.ResponseBodyMode accordingly; ensure you import strings if using
ToUpper.
- Around line 238-257: DetectRuntime currently only inspects the policyDir
top-level entries and will miss Python policies whose .py files live in
subdirectories; update DetectRuntime to recursively scan the policyDir (e.g.,
use filepath.WalkDir or a recursive os.ReadDir) and return "python" if any file
with extension ".py" is found (or, alternatively, specifically look for
"policy.py" anywhere in the tree), while preserving the existing error handling
and returning "go" only if no Python file is discovered.
In `@gateway/gateway-builder/internal/policyengine/generator.go`:
- Around line 236-250: The loop over policies can panic when p.Version is empty;
in the block that computes majorVersion (currently using
strings.Split(p.Version, ".")[0]) add a defensive check: trim and verify
p.Version is non-empty (or use strings.SplitN and default to "0") before
indexing, then use that safe majorVersion when building key; update the code
around variables majorVersion, key and the sb.WriteString call and keep
sanitizeModuleName(p.Name) and the IsPipPackage branch unchanged.
- Around line 336-361: The copyDir function currently follows symlinks via
filepath.Walk; update it to detect and skip symlinks so external or circular
links aren't followed. Inside copyDir (or when switching to filepath.WalkDir),
use os.Lstat (or DirEntry.Type()/IsSymlink) to check whether the current path is
a symlink and, if so, return nil to skip it; otherwise proceed with creating
directories and copying files using the file's regular mode. Ensure both
directory and file handling branches perform the symlink check so no symlink
targets are copied or traversed.
- Around line 329-333: sanitizeModuleName currently only replaces hyphens and
can still produce invalid Python module names; update sanitizeModuleName to (1)
normalize to lowercase, (2) replace any sequence of non-alphanumeric/underscore
characters (including spaces, dots, punctuation) with a single underscore, (3)
trim leading/trailing underscores, (4) if the result starts with a digit prefix
it with an underscore, and (5) fallback to a safe constant like "module" when
the result is empty — apply these rules in the sanitizeModuleName function so
generated names are valid Python identifiers and safe for imports.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b0f49b13-b8e2-4ac5-94d6-55e8ff0806a4
📒 Files selected for processing (4)
gateway/gateway-builder/internal/discovery/manifest.gogateway/gateway-builder/internal/discovery/python_module.gogateway/gateway-builder/internal/policyengine/generator.gogateway/gateway-builder/pkg/types/policy.go
Add sanitizeURL to redact credentials from index URLs in logs and errors to avoid leaking secrets. Update FetchPipPackage to log a sanitized indexURL, replace occurrences of the raw index URL in pip stderr on failure, and store the sanitized value in PipPackageInfo.IndexURL. Improve ParsePipPackageRef parsing and validation: split on '==', handle an '@<index-url>' suffix in the version part, trim values, and return clearer errors for invalid formats.
Add a unit test for parsing pip refs with credentialed indexes. Harden pip package fetching: document temporary dir ownership, improve pip executable resolution and return a clear error when no pip/python3 is found. Wire up CLI args in the Python executor (socket, workers, max-concurrent, log-level) so docker-entrypoint can override env defaults; update startup logging. Rename/replace PYTHON_POLICY_TIMEOUT with PYTHON_POLICY_MAX_CONCURRENT in the docker entrypoint and logs. Fix StreamManager.Connect to cancel/close previous streams/connections before creating new ones to avoid resource leaks. Adjust Python gRPC server shutdown call (use grace) and reduce sensitive logging in the sample slugify policy (log lengths instead of content). Update sample build file path for the Python slugify policy.
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
gateway/gateway-builder/internal/discovery/discovery_test.go (1)
535-555:⚠️ Potential issue | 🟠 MajorDon't treat missing or ambiguous policy dirs as Go.
These assertions lock in the fallback that makes an empty or typo'd Python policy path surface as a Go-specific
go.modfailure (see Line 554).DetectRuntimeshould return an explicit unknown/error state here so discovery can report the actual problem.Also applies to: 855-867
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/discovery_test.go` around lines 535 - 555, The test reveals that DiscoverPoliciesFromBuildFile currently treats ambiguous/missing runtime files as Go and surfaces a go.mod error; update DetectRuntime to return an explicit unknown/err state (not default to Go) when it cannot determine runtime (no go.mod and no .py files), and modify DiscoverPoliciesFromBuildFile to check for that unknown result and return a clear, contextual error (e.g., "unable to detect runtime for policy X" or propagate the DetectRuntime error) instead of proceeding to Go-specific parsing; locate references to DetectRuntime and the runtime-switch logic inside DiscoverPoliciesFromBuildFile to implement this behavior and add corresponding error handling paths so the tests and other callers get an explicit unknown-runtime error.
♻️ Duplicate comments (2)
gateway/gateway-builder/internal/discovery/python_module.go (2)
97-100:⚠️ Potential issue | 🟠 MajorStill leaking
downloadDiron the success path.After
extractModuleFromWheelsucceeds, everything needed lives underextractDir; returning here without cleanup leaves the downloaded wheel temp directory behind on every successful fetch.🧹 Proposed fix
downloadDir, err := os.MkdirTemp("", "pip-download-*") if err != nil { return nil, fmt.Errorf("failed to create temp directory: %w", err) } + defer os.RemoveAll(downloadDir)Also applies to: 152-172
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 97 - 100, The temp download directory (downloadDir) created with os.MkdirTemp in python_module.go is not removed on the success path, leaking files; update the logic around extractModuleFromWheel to ensure downloadDir is cleaned up after successful extraction (either add a defer os.RemoveAll(downloadDir) immediately after creating downloadDir, or explicitly call os.RemoveAll(downloadDir) after extractModuleFromWheel returns successfully), and apply the same cleanup fix to the other occurrence around lines 152-172; reference symbols: downloadDir, extractDir, and extractModuleFromWheel.
177-201:⚠️ Potential issue | 🟡 MinorReject malformed
@suffixes during parse.
pkg==1.0.0@garbageandpkg==@https://hostcurrently pass this helper because the@stays insideversionPart. That pushes a simple manifest validation error down intopip download.🔎 Proposed fix
parts := strings.SplitN(ref, "==", 2) if len(parts) != 2 || parts[0] == "" || parts[1] == "" { - return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>'") + return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'") } @@ - if idx := strings.Index(versionPart, "@"); idx > 0 { - candidate := versionPart[idx+1:] - if strings.Contains(candidate, "://") { - indexURL = candidate - versionPart = versionPart[:idx] - } + if idx := strings.Index(versionPart, "@"); idx >= 0 { + if idx == 0 { + return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'") + } + candidate := strings.TrimSpace(versionPart[idx+1:]) + if !strings.Contains(candidate, "://") { + return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'") + } + indexURL = candidate + versionPart = versionPart[:idx] } @@ if pkgName == "" || version == "" { - return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>'") + return "", "", "", fmt.Errorf("invalid pipPackage format, expected '<package>==<version>[@<index-url>]'") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/python_module.go` around lines 177 - 201, The ParsePipPackageRef function currently allows malformed '@' suffixes to pass (e.g., "pkg==1.0.0@garbage" or "pkg==@https://host"); update ParsePipPackageRef to explicitly reject any '@' in versionPart unless there's a non-empty version before the '@' and the suffix after the '@' is a valid index URL (contains "://") — i.e., find idx := strings.Index(versionPart, "@"); if idx == 0 return an error, if idx > 0 ensure candidate := versionPart[idx+1:] is non-empty and contains "://", then set indexURL=candidate and versionPart=versionPart[:idx]; otherwise return the same invalid-format error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/docker-entrypoint.sh`:
- Around line 159-160: The script currently relies on set -e and calls wait (the
bare wait invocation at the shown spot and the similar waits around lines
255-260), which can cause the script to exit immediately if a child returns
non-zero and skip the cleanup/termination logic; update the script so that waits
cannot trigger errexit—temporarily disable errexit (or otherwise ensure wait
failures are swallowed), capture each child exit status into a variable, and
then restore errexit before running the existing cleanup/termination logic so
cleanup always runs regardless of child exit codes.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go`:
- Around line 188-201: The disconnect handler is sending on client response
channels while holding sm.pendingMu which can block; modify handleDisconnect so
it does not perform blocking sends under the lock: inside the locked section
(using sm.pendingMu) copy sm.pendingReqs into a local map/slice and remove
entries from sm.pendingReqs, then unlock; after unlocking, iterate the copied
entries and send the proto.ExecutionResponse (or perform non-blocking sends with
select to drop if full) to each ch; this preserves the original messages
(proto.ExecutionResponse{...}) but ensures no goroutine can be blocked while
holding sm.pendingMu.
- Around line 101-107: The Connect method currently ignores the caller's ctx by
using grpc.Dial with grpc.WithTimeout(10*time.Second); replace that dial with
grpc.DialContext(ctx, sm.socketPath, grpc.WithContextDialer(dialer),
grpc.WithTransportCredentials(insecure.NewCredentials()), grpc.WithBlock()) so
the caller's deadline/cancellation is honored and remove grpc.WithTimeout; keep
existing error handling and use the same sm.socketPath and dialer identifiers.
In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 177-185: The semaphore (concurrency_limit) is acquired inside
process_request, but tasks are spawned for every incoming message, allowing an
unbounded in_flight set; instead acquire the semaphore before creating a task so
task creation itself is bounded. Change the flow so the dispatcher awaits
concurrency_limit.acquire() (or uses async with concurrency_limit) before adding
a task to in_flight, then create an asyncio.create_task that runs a wrapper
which calls _execute_policy and in its finally block releases the semaphore and
removes the task from in_flight; refer to concurrency_limit, process_request,
_execute_policy and the in_flight set when locating where to move the
acquire/release.
- Around line 362-364: The code unconditionally removes self.socket_path which
can delete regular files; update the cleanup in server.py (where
self.socket_path is handled, e.g., in the server start/teardown code) to first
check that the path exists and is a UNIX domain socket using os.stat +
stat.S_ISSOCK(self.socket_path) (or equivalent), and only call os.remove when
that check passes; if the path exists but is not a socket, do not remove it and
instead log or raise a clear error; also catch and log exceptions from
os.stat/os.remove to avoid crashing.
- Line 51: Add a proper shutdown path: implement a close(self) method on
PythonExecutorServicer that calls self._executor.shutdown(wait=True) to stop its
ThreadPoolExecutor; in PythonExecutorServer.store the servicer instance on
self._servicer instead of a local variable in start(), and in
PythonExecutorServer.shutdown() call self._servicer.close(); also ensure any
ThreadPoolExecutor created for aio.server() (the server-side worker pool) is
kept as an instance variable (e.g., self._aio_worker_executor) and explicitly
shutdown with shutdown(wait=True) in shutdown(); finally locate and explicitly
shutdown the migration thread pool used by this module (call shutdown(wait=True)
on that pool) so no worker threads are leaked.
In `@gateway/gateway-runtime/python-executor/main.py`:
- Around line 42-51: The CLI currently allows 0 or negative values for --workers
and --max-concurrent which will break runtime concurrency primitives; add
validation in the argument parsing by replacing the raw int conversion with a
positive-int checker (e.g., a small helper function like positive_int(value)
used as the type for parser.add_argument) and call parser.error when value <= 0
so startup fails fast with a clear message; apply this change for both
occurrences of parser.add_argument for "--workers" and "--max-concurrent"
(including the second set around lines 104-110) so both locations validate > 0.
- Around line 115-117: The signal handler currently fires an untracked
fire-and-forget task (asyncio.create_task(server.shutdown())) which can race
with the main finally block and swallow exceptions; introduce a module-level
variable (e.g., shutdown_task) and change signal_handler to: if shutdown_task is
None or shutdown_task.done(): assign shutdown_task =
asyncio.create_task(server.shutdown()) and attach a done-callback that logs
exceptions, otherwise skip creating a new task; update the finally block that
currently awaits server.shutdown() to instead await shutdown_task if it exists
so server.shutdown() is only invoked once and any exceptions are surfaced and
logged.
In `@gateway/sample-policies/slugify-body-python/policy.py`:
- Around line 222-227: The int() conversion of params.get("maxLength") should be
wrapped in a try/except to avoid raising ValueError for non-numeric inputs;
update the code around max_length = int(params.get("maxLength", 0)) (in the
slugify-body policy) to catch ValueError (and TypeError), log a warning that the
provided maxLength is invalid and will be treated as unlimited/default, and set
max_length = 0 in the exception handler; keep the existing negative-value check
for max_length afterwards.
---
Outside diff comments:
In `@gateway/gateway-builder/internal/discovery/discovery_test.go`:
- Around line 535-555: The test reveals that DiscoverPoliciesFromBuildFile
currently treats ambiguous/missing runtime files as Go and surfaces a go.mod
error; update DetectRuntime to return an explicit unknown/err state (not default
to Go) when it cannot determine runtime (no go.mod and no .py files), and modify
DiscoverPoliciesFromBuildFile to check for that unknown result and return a
clear, contextual error (e.g., "unable to detect runtime for policy X" or
propagate the DetectRuntime error) instead of proceeding to Go-specific parsing;
locate references to DetectRuntime and the runtime-switch logic inside
DiscoverPoliciesFromBuildFile to implement this behavior and add corresponding
error handling paths so the tests and other callers get an explicit
unknown-runtime error.
---
Duplicate comments:
In `@gateway/gateway-builder/internal/discovery/python_module.go`:
- Around line 97-100: The temp download directory (downloadDir) created with
os.MkdirTemp in python_module.go is not removed on the success path, leaking
files; update the logic around extractModuleFromWheel to ensure downloadDir is
cleaned up after successful extraction (either add a defer
os.RemoveAll(downloadDir) immediately after creating downloadDir, or explicitly
call os.RemoveAll(downloadDir) after extractModuleFromWheel returns
successfully), and apply the same cleanup fix to the other occurrence around
lines 152-172; reference symbols: downloadDir, extractDir, and
extractModuleFromWheel.
- Around line 177-201: The ParsePipPackageRef function currently allows
malformed '@' suffixes to pass (e.g., "pkg==1.0.0@garbage" or
"pkg==@https://host"); update ParsePipPackageRef to explicitly reject any '@' in
versionPart unless there's a non-empty version before the '@' and the suffix
after the '@' is a valid index URL (contains "://") — i.e., find idx :=
strings.Index(versionPart, "@"); if idx == 0 return an error, if idx > 0 ensure
candidate := versionPart[idx+1:] is non-empty and contains "://", then set
indexURL=candidate and versionPart=versionPart[:idx]; otherwise return the same
invalid-format error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c08eab94-fbc3-48ea-b7d4-8ff374536c0c
📒 Files selected for processing (8)
gateway/gateway-builder/internal/discovery/discovery_test.gogateway/gateway-builder/internal/discovery/python_module.gogateway/gateway-runtime/docker-entrypoint.shgateway/gateway-runtime/policy-engine/internal/pythonbridge/client.gogateway/gateway-runtime/python-executor/executor/server.pygateway/gateway-runtime/python-executor/main.pygateway/sample-policies/build.yamlgateway/sample-policies/slugify-body-python/policy.py
🚧 Files skipped from review as they are similar to previous changes (1)
- gateway/sample-policies/build.yaml
Make Python executor and Python-bridge more robust: - docker-entrypoint.sh: Preserve full --py.* handling and export PYTHON_EXECUTOR_SOCKET for both --py.socket=<val> and --py.socket <val> forms so shell-provided flags stay in sync with the Go-side socket checks. - policy-engine/internal/pythonbridge/client.go: Add an atomic connID and pass stream/conn/connID into receiveLoop so stale goroutines cannot close newer connections; capture stream under connectMu before sending to avoid TOCTOU races; use stream.Send for the captured stream. - python-executor/executor/server.py: Reorder shutdown to stop accepting RPCs and thread-pool tasks first, then clear instances, and finally shut down IO/thread pools to avoid races during cleanup. - python-executor/main.py: Improve CLI/env handling for integer flags by deferring validation of env-derived defaults and allowing CLI overrides to take precedence; add helpers to detect provided flags and fall back on sane defaults when env vars are invalid. These changes fix race conditions, stale connection handling, and make startup/shutdown behavior more resilient and predictable.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Initialize response and protect shared state updates under the response_ready lock to avoid races. Only append non-None responses, discard in_flight task and notify waiters inside a finally block so deque/in_flight changes are observed atomically, and release the concurrency semaphore after notifying waiters so they re-check in_flight correctly. Also ensure reader_done is set while holding the lock and that notify is always called in the reader's finally block.
|
@coderabbitai resume |
✅ Actions performedReviews resumed. |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go`:
- Around line 331-339: The InitPolicy, DestroyPolicy, and HealthCheck methods
access sm.client without synchronizing with Connect which writes sm.client under
connectMu; protect reads by acquiring the same mutex (use
connectMu.RLock()/RUnlock() or Lock()/Unlock() around accesses to sm.client)
before checking IsConnected/using sm.client in InitPolicy, DestroyPolicy, and
HealthCheck so there is no data race with Connect which assigns sm.client.
- Around line 280-285: The select branch in Execute that reads from respCh
currently maps any ctx.Done() to a timeout error; change the ctx.Done() handling
(the select case using respCh and ctx.Done()) to inspect ctx.Err() and return a
cancellation error when ctx.Err() == context.Canceled (or return ctx.Err()
directly) and only return the timeout formatted error when ctx.Err() ==
context.DeadlineExceeded (or timeout matches); update the return in that branch
to use ctx.Err() or a descriptive message instead of always using
fmt.Errorf("timeout ...", timeout) so cancellation and timeout are
differentiated.
In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 142-156: DestroyPolicy currently removes the instance from
self._store and calls instance.close() immediately, which races with
_execute_policy that resolves the same instance later; change the lifecycle to
use an active-execution counter and deferred teardown: add/ensure an
active_count (or similar) field on the instance and update _execute_policy to
increment this counter before using the instance and decrement after
on_request/on_response completes, then modify _destroy_policy_sync to mark the
instance as pending_destroy (or set a flag) instead of removing/closing
immediately and only remove and call instance.close() once active_count reaches
zero (or perform removal/close in the decrement path when pending_destroy is
set); keep returning proto.DestroyPolicyResponse(success=True) as before when
scheduling teardown.
- Around line 59-61: The close() method currently calls
self._executor.shutdown(wait=True) which can block forever if a submitted policy
(via ExecuteStream() using run_in_executor()) is hung; change shutdown to a
non-blocking/interruptible form—either self._executor.shutdown(wait=False) or
self._executor.shutdown(wait=True, cancel_futures=True) and ensure you also
apply a bounded timeout/cleanup logic so shutdown cannot block indefinitely;
update every other site that calls shutdown (the other executor shutdown calls
around ExecuteStream/run_in_executor usage) to the same
non-blocking/cancel_futures-with-timeout pattern and ensure any necessary flags
are available in the Python version used.
- Around line 393-395: The code currently ignores the return value of
self.server.add_insecure_port(f"unix:{self.socket_path}") which returns 0 on
bind failure; modify the startup sequence in the server initialization (around
server.start in server.py) to capture the return value from add_insecure_port,
check if it equals 0, and if so log an error referencing self.socket_path and
raise an exception (or exit) before calling await self.server.start() and before
logging "Python Executor ready"; ensure the error path prevents startup
continuing when the UDS bind fails.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ac134c4d-d4df-4c44-b462-4f32a8fbe682
📒 Files selected for processing (5)
gateway/gateway-builder/internal/discovery/discovery_test.gogateway/gateway-runtime/docker-entrypoint.shgateway/gateway-runtime/policy-engine/internal/pythonbridge/client.gogateway/gateway-runtime/python-executor/executor/server.pygateway/gateway-runtime/python-executor/main.py
🚧 Files skipped from review as they are similar to previous changes (1)
- gateway/gateway-runtime/docker-entrypoint.sh
| def close(self): | ||
| """Shutdown the thread pool executor.""" | ||
| self._executor.shutdown(wait=True) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Python asyncio, does cancelling a coroutine that is awaiting loop.run_in_executor(...)cancel the underlyingThreadPoolExecutorjob, and how doesThreadPoolExecutor.shutdown(wait=True, cancel_futures=True) behave?
💡 Result:
-
Cancelling an
asyncioTask that’s currentlyawait loop.run_in_executor(...)does not reliably stop the underlying thread job.run_in_executor()submits work to aconcurrent.futures.Executorand returns anasyncio.Futureyou can cancel/await, but cancelling that await only affects the awaiting side; the function already running in a worker thread generally keeps running becauseconcurrent.futures.Future.cancel()cannot cancel once execution has started. [1], [2], [3] -
What cancellation can do: if the executor job has not started yet, cancellation may succeed (because
Future.cancel()can cancel only before the call begins). If it is running (or finished),cancel()returnsFalseand the work continues. [1] -
ThreadPoolExecutor.shutdown(wait=True, cancel_futures=True)behavior:- Stops accepting new submissions.
- Cancels only futures that are still pending / not started.
- With
wait=True, it waits for already-started work to finish before returning; running (or completed) futures “won’t be cancelled” even withcancel_futures=True. [1]
Sources:
[1] Python docs: concurrent.futures (Future.cancel(), Executor.shutdown(cancel_futures=...)) https://docs.python.org/3/library/concurrent.futures.html
[2] Python docs: asyncio event loop run_in_executor() https://docs.python.org/3/library/asyncio-eventloop.html
[3] CPython issue documenting that cancelling run_in_executor does not stop the thread python/cpython#107505
🏁 Script executed:
cd gateway/gateway-runtime/python-executor/executor && head -n 65 server.py | tail -n 10Repository: wso2/api-platform
Length of output: 458
🏁 Script executed:
cd gateway/gateway-runtime/python-executor/executor && sed -n '180,200p' server.pyRepository: wso2/api-platform
Length of output: 931
🏁 Script executed:
cd gateway/gateway-runtime/python-executor/executor && sed -n '405,415p' server.pyRepository: wso2/api-platform
Length of output: 489
🏁 Script executed:
cd gateway/gateway-runtime/python-executor/executor && sed -n '20,70p' server.py | grep -n "ExecuteStream\|run_in_executor" -A 3 -B 1Repository: wso2/api-platform
Length of output: 43
close() can block shutdown indefinitely on stuck policy calls.
ExecuteStream() submits user policy code to the thread pool via run_in_executor(). When a stream is cancelled while work is running, the underlying thread job does not stop—concurrent.futures.Future.cancel() fails once execution has begun. Consequently, shutdown(wait=True) here blocks indefinitely waiting for that thread to complete. A single hung policy can therefore stall process exit well past the 5-second gRPC grace period.
Use shutdown(wait=False) or shutdown(wait=True, cancel_futures=True) with a timeout to allow graceful termination without indefinite blocking.
Also applies to: 188-192, 410-411
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 59 -
61, The close() method currently calls self._executor.shutdown(wait=True) which
can block forever if a submitted policy (via ExecuteStream() using
run_in_executor()) is hung; change shutdown to a non-blocking/interruptible
form—either self._executor.shutdown(wait=False) or
self._executor.shutdown(wait=True, cancel_futures=True) and ensure you also
apply a bounded timeout/cleanup logic so shutdown cannot block indefinitely;
update every other site that calls shutdown (the other executor shutdown calls
around ExecuteStream/run_in_executor usage) to the same
non-blocking/cancel_futures-with-timeout pattern and ensure any necessary flags
are available in the Python version used.
Ensure in-flight policy executions aren't closed while running and avoid nil Python client RPCs. Add active_count and pending_destroy to InstanceRecord and new store methods (acquire_for_execution, release_execution, mark_for_destruction) to track active uses and mark instances for delayed destruction. Update server to mark instances for destruction, defer close until active_count reaches zero, and wrap execution with acquire/release logic. Also add mutex-protected client lookup and nil checks in the Go pythonbridge client (InitPolicy, DestroyPolicy, HealthCheck) to prevent RPCs when disconnected.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
gateway/gateway-runtime/python-executor/executor/server.py (1)
251-258: Minor style suggestion: list unpacking instead of concatenation.Static analysis suggests using
[reader_task, *list(in_flight)]instead of concatenation, which is slightly more idiomatic.Suggested fix
- all_tasks = [reader_task] + list(in_flight) + all_tasks = [reader_task, *list(in_flight)]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 251 - 258, Replace the list concatenation used to build all_tasks with list unpacking to be more idiomatic: in the asyncio.CancelledError except block where reader_task and in_flight are combined (variables reader_task, in_flight, and all_tasks), create all_tasks using [reader_task, *list(in_flight)] instead of [reader_task] + list(in_flight); keep the surrounding logic (cancellation, shielded gather, re-raise) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 251-258: Replace the list concatenation used to build all_tasks
with list unpacking to be more idiomatic: in the asyncio.CancelledError except
block where reader_task and in_flight are combined (variables reader_task,
in_flight, and all_tasks), create all_tasks using [reader_task,
*list(in_flight)] instead of [reader_task] + list(in_flight); keep the
surrounding logic (cancellation, shielded gather, re-raise) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7b25c53e-4d86-469b-946f-df1824d80d11
📒 Files selected for processing (3)
gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.gogateway/gateway-runtime/python-executor/executor/instance_store.pygateway/gateway-runtime/python-executor/executor/server.py
Make Python bridge operations more robust: buildRequest/buildResponseRequest and buildSharedContext now return errors and use toProtoStruct with error handling; OnRequest/OnResponse log and handle build errors. Introduce sync.Once and closeErr for Close to make destruction safe for concurrent/multiple calls. Precompute PYTHON_POLICY_TIMEOUT at package init and use getTimeout wrapper. Update BridgeFactory to propagate param conversion errors. Change toProtoStruct to return (Struct, error) and adjust callers. Python executor: import Struct at module level, ensure concurrency semaphore is released if task creation fails, and remove redundant local import.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
gateway/gateway-runtime/python-executor/executor/server.py (2)
410-412:⚠️ Potential issue | 🔴 CriticalFail fast when Unix socket bind is rejected.
Line 410 ignores the return from
add_insecure_port(...). If bind fails (0), startup can continue and still log readiness on Line 412.Suggested fix
- self.server.add_insecure_port(f"unix:{self.socket_path}") + bind_result = self.server.add_insecure_port(f"unix:{self.socket_path}") + if bind_result == 0: + raise RuntimeError( + f"Failed to bind Python Executor to unix:{self.socket_path}" + ) await self.server.start()For grpc.aio.Server.add_insecure_port in Python, confirm the return value semantics on UDS: what does 0 mean, and should startup fail when 0 is returned?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 410 - 412, Check the return value of self.server.add_insecure_port(f"unix:{self.socket_path}") and fail fast if it returns 0 (indicating bind failure on the UDS); specifically, after calling add_insecure_port capture its return (e.g., rv = self.server.add_insecure_port(...)), and if rv == 0 then log an error via logger.error including the socket path and either raise an exception or abort startup (do not proceed to await self.server.start() or logger.info). This ensures startup does not report "Python Executor ready" when the Unix socket bind was rejected.
60-63:⚠️ Potential issue | 🟠 MajorAvoid potentially unbounded shutdown blocking on policy executor close.
Line 62 uses
shutdown(wait=True). If a policy call running viarun_in_executorgets stuck, shutdown can block indefinitely when Line 428 callsself._servicer.close().Suggested fix
def close(self): """Shutdown the thread pool executor.""" - self._executor.shutdown(wait=True) + # Don't block indefinitely on hung user policy code. + self._executor.shutdown(wait=False, cancel_futures=True)In Python asyncio, if a task awaiting loop.run_in_executor(...) is cancelled, does the underlying ThreadPoolExecutor job continue running? Also confirm shutdown(wait=True) vs shutdown(wait=False, cancel_futures=True) behavior for running tasks.Also applies to: 427-428
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/executor/server.py` around lines 60 - 63, The close method currently calls self._executor.shutdown(wait=True) which can block indefinitely if a run_in_executor job is stuck; change it to a non-blocking shutdown and cancel outstanding work by calling self._executor.shutdown(wait=False, cancel_futures=True) (or shutdown(wait=False) on older Python with additional cleanup) so _servicer.close() won't hang; also ensure any code using loop.run_in_executor handles cancellation correctly since cancelling the asyncio Task does not stop the underlying ThreadPoolExecutor worker unless cancel_futures=True is used.
🧹 Nitpick comments (2)
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
254-258: Potential key collision when flatteningAuthContext.Properties.Properties are merged directly into the map after reserved keys (
subject,issuer,auth_type, etc.). Ifac.Propertiescontains any of these keys, it will silently overwrite them. Consider adding a prefix or checking for collisions.♻️ Proposed fix - prefix properties to avoid collision
// Flatten properties for k, v := range ac.Properties { - m[k] = v + m["prop_"+k] = v }Alternatively, validate that properties don't contain reserved keys.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 254 - 258, The loop that flattens AuthContext.Properties into the map m (iterating ac.Properties) can silently overwrite reserved keys like "subject", "issuer", "auth_type", etc.; update the flattening in the function/method that builds m so it detects collisions or namespaces properties: either skip or throw/log on keys that match reserved names, or prepend a stable prefix (e.g., "properties." or "prop_") to each ac.Properties key before inserting into m; ensure you update all references to m that expect these property names accordingly and keep the insertion logic around the "for k, v := range ac.Properties" loop as the change point.gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go (1)
48-52: Consider sanitizing error messages returned to clients.The error message from Python is directly included in the response body. This could leak internal implementation details to API consumers. Consider logging the full error server-side and returning a generic message to clients.
♻️ Proposed fix
return policy.ImmediateResponse{ StatusCode: 500, Headers: map[string]string{"Content-Type": "text/plain"}, - Body: []byte(errResult.Message), + Body: []byte("Internal policy error"), }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go` around lines 48 - 52, The code in translator.go currently returns errResult.Message directly in policy.ImmediateResponse.Body (via errResult.Message) which can leak internal details; instead log the full errResult.Message (and stack/trace if available) to the server logger and set the response body to a generic message such as "Internal server error" or "An error occurred" before returning the policy.ImmediateResponse in the function that constructs the 500 response; keep the StatusCode and Headers unchanged but do not expose errResult.Message to clients.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@gateway/gateway-runtime/python-executor/executor/server.py`:
- Around line 410-412: Check the return value of
self.server.add_insecure_port(f"unix:{self.socket_path}") and fail fast if it
returns 0 (indicating bind failure on the UDS); specifically, after calling
add_insecure_port capture its return (e.g., rv =
self.server.add_insecure_port(...)), and if rv == 0 then log an error via
logger.error including the socket path and either raise an exception or abort
startup (do not proceed to await self.server.start() or logger.info). This
ensures startup does not report "Python Executor ready" when the Unix socket
bind was rejected.
- Around line 60-63: The close method currently calls
self._executor.shutdown(wait=True) which can block indefinitely if a
run_in_executor job is stuck; change it to a non-blocking shutdown and cancel
outstanding work by calling self._executor.shutdown(wait=False,
cancel_futures=True) (or shutdown(wait=False) on older Python with additional
cleanup) so _servicer.close() won't hang; also ensure any code using
loop.run_in_executor handles cancellation correctly since cancelling the asyncio
Task does not stop the underlying ThreadPoolExecutor worker unless
cancel_futures=True is used.
---
Nitpick comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 254-258: The loop that flattens AuthContext.Properties into the
map m (iterating ac.Properties) can silently overwrite reserved keys like
"subject", "issuer", "auth_type", etc.; update the flattening in the
function/method that builds m so it detects collisions or namespaces properties:
either skip or throw/log on keys that match reserved names, or prepend a stable
prefix (e.g., "properties." or "prop_") to each ac.Properties key before
inserting into m; ensure you update all references to m that expect these
property names accordingly and keep the insertion logic around the "for k, v :=
range ac.Properties" loop as the change point.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.go`:
- Around line 48-52: The code in translator.go currently returns
errResult.Message directly in policy.ImmediateResponse.Body (via
errResult.Message) which can leak internal details; instead log the full
errResult.Message (and stack/trace if available) to the server logger and set
the response body to a generic message such as "Internal server error" or "An
error occurred" before returning the policy.ImmediateResponse in the function
that constructs the 500 response; keep the StatusCode and Headers unchanged but
do not expose errResult.Message to clients.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7c41cf0a-2849-479a-9865-3860c8dafde3
📒 Files selected for processing (5)
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/client.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/translator.gogateway/gateway-runtime/python-executor/executor/server.py
Introduce a configurable execution timeout and use a shared ThreadPoolExecutor across the Python executor. Pass timeout via CLI (--timeout) and env (PYTHON_POLICY_TIMEOUT) and log it in the docker entrypoint. Enforce per-request timeouts with asyncio.wait_for and return a timeout error response; servicer no longer owns executor shutdown. Also: namespace auth properties with a "prop_" prefix to avoid reserved key collisions, improve context error handling in the pythonbridge client, handle missing base requirements.txt safely when merging, check UNIX socket bind result and improve shutdown/cleanup logic and logging.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
59-77:⚠️ Potential issue | 🟠 MajorRequest context is discarded in favor of
context.Background().Using
context.Background()loses the original request's cancellation signals and deadlines. If the HTTP request is cancelled, the Python execution will continue unnecessarily until the configured timeout expires.This is a limitation of the
policy.PolicySDK interface which doesn't passcontext.ContexttoOnRequest/OnResponse. WhileExecute()applies its own timeout, cancellation signals from the original request are lost.Consider tracking this as a future SDK enhancement to propagate request context.
🔧 Potential workaround - derive context with timeout
func (b *PythonBridge) OnRequest(ctx *policy.RequestContext, params map[string]interface{}) policy.RequestAction { req, err := b.buildRequest(ctx, "on_request", params) if err != nil { b.slogger.Error("Failed to build request", "error", err, "phase", "on_request") return b.errorResponse(err) } - resp, err := b.streamManager.Execute(context.Background(), req) + // TODO: Propagate proper context when SDK interface supports it. + // For now, apply the configured timeout to avoid unbounded waits. + execCtx, cancel := context.WithTimeout(context.Background(), getTimeout()) + defer cancel() + + resp, err := b.streamManager.Execute(execCtx, req)Apply the same pattern to
OnResponseat line 87.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 59 - 77, The code currently calls streamManager.Execute with context.Background(), discarding request cancellation; change it to derive a cancellable context from the incoming request context (use the request's SharedContext available on policy.RequestContext) and apply the same execution timeout used for Python calls (e.g., create ctxWithTimeout via context.WithTimeout(ctx.SharedContext, <appropriate timeout>) and defer cancel()) before calling b.streamManager.Execute(ctxWithTimeout, req) so cancellations/deadlines propagate; apply the identical pattern in the OnResponse handler as well.
🧹 Nitpick comments (2)
gateway/gateway-runtime/python-executor/main.py (1)
32-41: Chain the original exception for better traceability.The static analysis correctly flags that
raise ... from errshould be used to preserve the exception chain, making debugging easier.♻️ Proposed fix
def positive_int(value): """Validate that the value is a positive integer.""" try: ivalue = int(value) if ivalue <= 0: raise ValueError return ivalue - except (ValueError, TypeError): - raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") + except (ValueError, TypeError) as err: + raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from err🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/main.py` around lines 32 - 41, The positive_int validator currently raises argparse.ArgumentTypeError without chaining the original exception; modify the except block in function positive_int to capture the caught exception (e.g., except (ValueError, TypeError) as err:) and re-raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from err so the original traceback is preserved for debugging.gateway/gateway-builder/internal/policyengine/generator.go (1)
343-370: Consider usingfilepath.WalkDirfor better performance.
filepath.Walkcallsos.Lstaton every file, whilefilepath.WalkDiravoids the stat call when not needed. For large policy directories, this can improve copy performance.♻️ Proposed refactor using WalkDir
// copyDir copies a directory recursively func copyDir(src, dst string) error { - return filepath.Walk(src, func(path string, info os.FileInfo, err error) error { + return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error { if err != nil { return err } // Compute relative path relPath, err := filepath.Rel(src, path) if err != nil { return err } dstPath := filepath.Join(dst, relPath) - if info.IsDir() { - return os.MkdirAll(dstPath, info.Mode()) + if d.IsDir() { + return os.MkdirAll(dstPath, 0755) } // Copy file data, err := os.ReadFile(path) if err != nil { return err } - return os.WriteFile(dstPath, data, info.Mode()) + info, err := d.Info() + if err != nil { + return err + } + return os.WriteFile(dstPath, data, info.Mode()) }) }Note: Add
"io/fs"to imports.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/generator.go` around lines 343 - 370, The copyDir function should be refactored to use filepath.WalkDir for better performance: replace filepath.Walk(src, func(path string, info os.FileInfo, err error) error { ... }) with filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error { ... }), compute relPath the same way, build dstPath, call d.IsDir() to handle directories and call d.Info() only when you need the FileInfo (to get Mode()) for os.MkdirAll(dstPath, info.Mode()) and for files use os.ReadFile(path) and os.WriteFile(dstPath, data, info.Mode()). Preserve existing error handling and return semantics of copyDir and add "io/fs" to imports.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/docker-entrypoint.sh`:
- Around line 112-117: The environment variable PYTHON_POLICY_TIMEOUT is set as
a bare number but the Go bridge (pythonbridge/client.go where time.ParseDuration
is used) expects a duration string; change the default export in
docker-entrypoint.sh to include a unit (e.g., export
PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30s}") and then update main.py’s
timeout handling to accept duration strings (parse with a duration parser and
fall back to integer seconds for backwards compatibility) so both consumers
interpret the value consistently; reference PYTHON_POLICY_TIMEOUT,
pythonbridge/client.go (time.ParseDuration usage) and main.py (current integer
parsing) when making the edits.
---
Duplicate comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 59-77: The code currently calls streamManager.Execute with
context.Background(), discarding request cancellation; change it to derive a
cancellable context from the incoming request context (use the request's
SharedContext available on policy.RequestContext) and apply the same execution
timeout used for Python calls (e.g., create ctxWithTimeout via
context.WithTimeout(ctx.SharedContext, <appropriate timeout>) and defer
cancel()) before calling b.streamManager.Execute(ctxWithTimeout, req) so
cancellations/deadlines propagate; apply the identical pattern in the OnResponse
handler as well.
---
Nitpick comments:
In `@gateway/gateway-builder/internal/policyengine/generator.go`:
- Around line 343-370: The copyDir function should be refactored to use
filepath.WalkDir for better performance: replace filepath.Walk(src, func(path
string, info os.FileInfo, err error) error { ... }) with filepath.WalkDir(src,
func(path string, d fs.DirEntry, err error) error { ... }), compute relPath the
same way, build dstPath, call d.IsDir() to handle directories and call d.Info()
only when you need the FileInfo (to get Mode()) for os.MkdirAll(dstPath,
info.Mode()) and for files use os.ReadFile(path) and os.WriteFile(dstPath, data,
info.Mode()). Preserve existing error handling and return semantics of copyDir
and add "io/fs" to imports.
In `@gateway/gateway-runtime/python-executor/main.py`:
- Around line 32-41: The positive_int validator currently raises
argparse.ArgumentTypeError without chaining the original exception; modify the
except block in function positive_int to capture the caught exception (e.g.,
except (ValueError, TypeError) as err:) and re-raise
argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from err so
the original traceback is preserved for debugging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 73d266db-4963-4a3a-a71f-984c14b48f21
📒 Files selected for processing (6)
gateway/gateway-builder/internal/policyengine/generator.gogateway/gateway-runtime/docker-entrypoint.shgateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/client.gogateway/gateway-runtime/python-executor/executor/server.pygateway/gateway-runtime/python-executor/main.py
| # Python Executor configuration | ||
| export PYTHON_POLICY_WORKERS="${PYTHON_POLICY_WORKERS:-4}" | ||
| export PYTHON_POLICY_MAX_CONCURRENT="${PYTHON_POLICY_MAX_CONCURRENT:-100}" | ||
| export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30}" | ||
| export PYTHON_EXECUTOR_SOCKET="${PYTHON_EXECUTOR_SOCKET:-/var/run/api-platform/python-executor.sock}" | ||
|
|
There was a problem hiding this comment.
Timeout format mismatch between Python and Go consumers.
PYTHON_POLICY_TIMEOUT is exported as a bare number (30), but the Go Python bridge at pythonbridge/client.go:404-407 parses it using time.ParseDuration(s), which requires a unit suffix (e.g., 30s). While the Go code falls back to 30 * time.Second on parse failure, this creates a silent configuration mismatch.
The Python executor (main.py) expects an integer, so it works correctly there.
🔧 Proposed fix - use duration format
# Python Executor configuration
export PYTHON_POLICY_WORKERS="${PYTHON_POLICY_WORKERS:-4}"
export PYTHON_POLICY_MAX_CONCURRENT="${PYTHON_POLICY_MAX_CONCURRENT:-100}"
-export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30}"
+export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30s}"
export PYTHON_EXECUTOR_SOCKET="${PYTHON_EXECUTOR_SOCKET:-/var/run/api-platform/python-executor.sock}"Then update main.py to parse duration strings, or alternatively update client.go to parse integer seconds.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gateway/gateway-runtime/docker-entrypoint.sh` around lines 112 - 117, The
environment variable PYTHON_POLICY_TIMEOUT is set as a bare number but the Go
bridge (pythonbridge/client.go where time.ParseDuration is used) expects a
duration string; change the default export in docker-entrypoint.sh to include a
unit (e.g., export PYTHON_POLICY_TIMEOUT="${PYTHON_POLICY_TIMEOUT:-30s}") and
then update main.py’s timeout handling to accept duration strings (parse with a
duration parser and fall back to integer seconds for backwards compatibility) so
both consumers interpret the value consistently; reference
PYTHON_POLICY_TIMEOUT, pythonbridge/client.go (time.ParseDuration usage) and
main.py (current integer parsing) when making the edits.
Stop parsing ProcessingMode from policy-definition.yaml at build time and instead obtain processing requirements from the Python executor during InitPolicy. Removed ProcessingMode fields and helpers from the builder and templates, added a ProcessingMode message to the python_executor proto and included it in InitPolicyResponse, and added runtime conversion logic (processingModeFromProto) in the BridgeFactory. Enforced a fixed python-executor socket path in the pythonbridge (no env override) and updated docker-entrypoint/main to reject --py.socket overrides and use the fixed socket path. Standardized the /health response (Status/Timestamp/Reason), integrated python executor health checking, and updated/added tests accordingly. Also removed local SDK replace and sample-policies build context/copy from the runtime image and updated go.sum to reference the SDK release.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (7)
gateway/gateway-builder/internal/policyengine/registry.go (1)
154-162:formatSystemParamsis dead code that silently discards parameters.This function always returns
"nil"even whenparamsis not nil. While the current template handlesSystemParametersinline and doesn't call this function, having it registered in theFuncMapcreates confusion and risk if someone tries to use it later.Either remove this function and its registration, or implement it properly.
Option 1: Remove dead code
- tmpl, err := template.New("plugin_registry").Funcs(template.FuncMap{ - "formatSystemParams": formatSystemParams, - }).Parse(templates.PluginRegistryTemplate) + tmpl, err := template.New("plugin_registry").Parse(templates.PluginRegistryTemplate)And remove lines 154-162 entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/policyengine/registry.go` around lines 154 - 162, The function formatSystemParams currently always returns "nil" and should either be removed along with its registration in the template FuncMap or implemented to produce a valid Go literal for the input map; update the code accordingly by modifying formatSystemParams to iterate over the params map and build a Go map literal string (e.g., "map[string]interface{}{...}" with properly quoted keys and Go-typed values) so templates can use it, or delete the formatSystemParams function and remove any reference to it from the FuncMap to avoid dead code and confusion.gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go (1)
402-414: Silent fallback on timeout parse failure.If
PYTHON_POLICY_TIMEOUTis set to a value without a unit suffix (e.g.,60instead of60s),time.ParseDurationfails and the code silently falls back to 30 seconds. Consider logging a warning or parsing integer-only values as seconds for better UX.Proposed fix
var pythonPolicyTimeout = func() time.Duration { if s := os.Getenv("PYTHON_POLICY_TIMEOUT"); s != "" { if d, err := time.ParseDuration(s); err == nil { return d } + // Try parsing as integer seconds for backward compatibility + if secs, err := strconv.Atoi(s); err == nil { + slog.Warn("PYTHON_POLICY_TIMEOUT should use duration format (e.g., '30s'), interpreting as seconds", + "value", s) + return time.Duration(secs) * time.Second + } + slog.Warn("Invalid PYTHON_POLICY_TIMEOUT value, using default", + "value", s, "default", "30s") } return 30 * time.Second }()This requires adding
"strconv"to imports.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go` around lines 402 - 414, The env parsing for PYTHON_POLICY_TIMEOUT in the package-level variable pythonPolicyTimeout silently falls back to 30s when time.ParseDuration fails (e.g., "60" -> error); update the initializer for pythonPolicyTimeout to (1) try time.ParseDuration as now, (2) if that fails attempt to parse the string as an integer number of seconds using strconv.Atoi (or strconv.ParseInt) and convert to time.Duration, and (3) log a warning via the package logger or log.Printf when the provided value is invalid and the fallback to default is used; ensure you add "strconv" to the imports and keep getTimeout() unchanged.gateway/gateway-runtime/proto/python_executor.proto (1)
20-22: Buf lint: package directory mismatch (informational).Buf reports that files with package
wso2.gateway.python.v1should be in awso2/gateway/python/v1directory. This is a structural lint rule that may conflict with your monorepo layout. If Buf is used in CI, consider either:
- Adding a
buf.yamlexception for this path- Reorganizing proto files to match the package structure
This doesn't affect runtime correctness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/proto/python_executor.proto` around lines 20 - 22, Buf lint warns that the proto package declaration "package wso2.gateway.python.v1" does not match its directory layout; resolve this by either adding a Buf exception for this proto file in your Buf configuration (buf.yaml) to allow the current placement, or move the file so its directory path matches the package (e.g., wso2/gateway/python/v1). Ensure the change references the same proto by checking the "package wso2.gateway.python.v1" declaration (and the existing option go_package) so imports and generated code continue to match after the change.gateway/gateway-runtime/python-executor/sdk/policy.py (1)
177-185: Consider makingclose()abstract or suppressing the linter warning.The
close()method is intentionally a default no-op (as documented), but Ruff flags it as B027 because it's an empty method in an ABC without@abstractmethod. Two options:
- Keep as-is and add
# noqa: B027to silence the warning- Make it
@abstractmethodwith a default implementation usingsuper()patternThe current design is valid—it mirrors typical resource-cleanup patterns where
close()is optional. Adding a brief inline comment or noqa would clarify intent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/sdk/policy.py` around lines 177 - 185, The empty close() method in the ABC is intentionally a no-op but triggers Ruff B027; to silence this, add an explicit inline comment explaining it's a deliberate no-op and append "# noqa: B027" to the method (i.e., update the close(self) -> None: block to include the explanatory comment and the noqa marker), or alternatively convert close into an `@abstractmethod` with a default implementation pattern if you want subclasses to be forced to override; choose the noqa approach for minimal change and clarity.gateway/gateway-runtime/python-executor/main.py (1)
42-43: Chain the exception for better debugging context.When re-raising as
ArgumentTypeError, preserve the original exception context usingfrom. This helps with debugging by showing the full exception chain.Proposed fix
except (ValueError, TypeError): - raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") + raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/python-executor/main.py` around lines 42 - 43, The except block that currently does "except (ValueError, TypeError): raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer")" should capture the original exception and chain it when re-raising; change it to "except (ValueError, TypeError) as e:" and re-raise the ArgumentTypeError with "from e" (i.e., raise argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from e) so the original exception context is preserved for debugging while keeping the same message and using the existing variable names.gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
286-303: Error messages may leak internal details to clients.The error response bodies include the raw error message:
fmt.Sprintf("Policy execution error: %v", err). This could expose internal implementation details, stack traces, or sensitive information to API clients.Consider sanitizing error messages for client responses while logging full details server-side:
func (b *PythonBridge) errorResponse(err error) policy.RequestAction { b.slogger.Error("Policy execution failed", "error", err) return policy.ImmediateResponse{ StatusCode: 500, Headers: map[string]string{"Content-Type": "text/plain"}, Body: []byte("Internal policy error"), } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 286 - 303, The errorResponse and errorResponseAction functions currently include raw err details in client-facing bodies; change both to log the full error server-side (e.g., b.slogger.Error or equivalent) and return sanitized, generic client messages (e.g., "Internal policy error") while preserving status codes and headers (Content-Type: text/plain); update errorResponse to log the error then return policy.ImmediateResponse with generic Body, and update errorResponseAction to log the error then return policy.UpstreamResponseModifications with a generic Body and StatusCode pointer, ensuring no internal error strings are sent to clients.gateway/gateway-builder/internal/discovery/manifest.go (1)
237-256:DetectRuntimesilently defaults to "go" on directory read errors.If
os.ReadDirfails (e.g., permission denied), the function returns"go"without logging the error. This could mask legitimate filesystem issues and lead to confusing downstream errors when Go discovery fails on what's actually a Python directory.Consider logging the error or propagating it:
Proposed fix
func DetectRuntime(policyDir string) string { goMod := filepath.Join(policyDir, "go.mod") if _, err := os.Stat(goMod); err == nil { return "go" } entries, err := os.ReadDir(policyDir) if err != nil { + slog.Warn("Failed to read policy directory for runtime detection, defaulting to go", + "path", policyDir, + "error", err) return "go" }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 237 - 256, DetectRuntime currently swallows os.ReadDir errors and always returns "go"; change DetectRuntime to return (string, error), keep the existing go.mod check, and if os.ReadDir(policyDir) returns an error return "", the wrapped error (or annotate with context) instead of defaulting to "go"; preserve the logic that returns "python" when a .py file is found and otherwise return "go" with nil error; update any call sites that call DetectRuntime to handle the (runtime, error) return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.go`:
- Around line 103-119: The processingModeFromProto function currently only
checks for empty strings but must also validate that each string corresponds to
a known enum constant; update processingModeFromProto (taking
*proto.ProcessingMode and returning policy.ProcessingMode) to map/validate
pm.RequestHeaderMode, pm.RequestBodyMode, pm.ResponseHeaderMode, and
pm.ResponseBodyMode against the supported values (the
policy.HeaderProcessingMode and policy.BodyProcessingMode enum constants) and
return an error if any value is unrecognized; implement the checks with explicit
branch logic (switch or lookup) for each field and only construct and return
policy.ProcessingMode when all four fields are valid.
---
Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/manifest.go`:
- Around line 237-256: DetectRuntime currently swallows os.ReadDir errors and
always returns "go"; change DetectRuntime to return (string, error), keep the
existing go.mod check, and if os.ReadDir(policyDir) returns an error return "",
the wrapped error (or annotate with context) instead of defaulting to "go";
preserve the logic that returns "python" when a .py file is found and otherwise
return "go" with nil error; update any call sites that call DetectRuntime to
handle the (runtime, error) return.
In `@gateway/gateway-builder/internal/policyengine/registry.go`:
- Around line 154-162: The function formatSystemParams currently always returns
"nil" and should either be removed along with its registration in the template
FuncMap or implemented to produce a valid Go literal for the input map; update
the code accordingly by modifying formatSystemParams to iterate over the params
map and build a Go map literal string (e.g., "map[string]interface{}{...}" with
properly quoted keys and Go-typed values) so templates can use it, or delete the
formatSystemParams function and remove any reference to it from the FuncMap to
avoid dead code and confusion.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 286-303: The errorResponse and errorResponseAction functions
currently include raw err details in client-facing bodies; change both to log
the full error server-side (e.g., b.slogger.Error or equivalent) and return
sanitized, generic client messages (e.g., "Internal policy error") while
preserving status codes and headers (Content-Type: text/plain); update
errorResponse to log the error then return policy.ImmediateResponse with generic
Body, and update errorResponseAction to log the error then return
policy.UpstreamResponseModifications with a generic Body and StatusCode pointer,
ensuring no internal error strings are sent to clients.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/client.go`:
- Around line 402-414: The env parsing for PYTHON_POLICY_TIMEOUT in the
package-level variable pythonPolicyTimeout silently falls back to 30s when
time.ParseDuration fails (e.g., "60" -> error); update the initializer for
pythonPolicyTimeout to (1) try time.ParseDuration as now, (2) if that fails
attempt to parse the string as an integer number of seconds using strconv.Atoi
(or strconv.ParseInt) and convert to time.Duration, and (3) log a warning via
the package logger or log.Printf when the provided value is invalid and the
fallback to default is used; ensure you add "strconv" to the imports and keep
getTimeout() unchanged.
In `@gateway/gateway-runtime/proto/python_executor.proto`:
- Around line 20-22: Buf lint warns that the proto package declaration "package
wso2.gateway.python.v1" does not match its directory layout; resolve this by
either adding a Buf exception for this proto file in your Buf configuration
(buf.yaml) to allow the current placement, or move the file so its directory
path matches the package (e.g., wso2/gateway/python/v1). Ensure the change
references the same proto by checking the "package wso2.gateway.python.v1"
declaration (and the existing option go_package) so imports and generated code
continue to match after the change.
In `@gateway/gateway-runtime/python-executor/main.py`:
- Around line 42-43: The except block that currently does "except (ValueError,
TypeError): raise argparse.ArgumentTypeError(f"'{value}' is not a positive
integer")" should capture the original exception and chain it when re-raising;
change it to "except (ValueError, TypeError) as e:" and re-raise the
ArgumentTypeError with "from e" (i.e., raise
argparse.ArgumentTypeError(f"'{value}' is not a positive integer") from e) so
the original exception context is preserved for debugging while keeping the same
message and using the existing variable names.
In `@gateway/gateway-runtime/python-executor/sdk/policy.py`:
- Around line 177-185: The empty close() method in the ABC is intentionally a
no-op but triggers Ruff B027; to silence this, add an explicit inline comment
explaining it's a deliberate no-op and append "# noqa: B027" to the method
(i.e., update the close(self) -> None: block to include the explanatory comment
and the noqa marker), or alternatively convert close into an `@abstractmethod`
with a default implementation pattern if you want subclasses to be forced to
override; choose the noqa approach for minimal change and clarity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a7043647-8fd0-4365-a04b-ff23ee216068
⛔ Files ignored due to path filters (1)
gateway/gateway-runtime/policy-engine/internal/pythonbridge/proto/python_executor.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (23)
gateway/build-lock.yamlgateway/gateway-builder/internal/discovery/discovery_test.gogateway/gateway-builder/internal/discovery/manifest.gogateway/gateway-builder/internal/policyengine/generator.gogateway/gateway-builder/internal/policyengine/registry.gogateway/gateway-builder/pkg/types/policy.gogateway/gateway-builder/templates/plugin_registry.go.tmplgateway/gateway-runtime/Dockerfilegateway/gateway-runtime/docker-entrypoint.shgateway/gateway-runtime/policy-engine/cmd/policy-engine/main.gogateway/gateway-runtime/policy-engine/internal/admin/handlers.gogateway/gateway-runtime/policy-engine/internal/admin/handlers_test.gogateway/gateway-runtime/policy-engine/internal/admin/types.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/client.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/factory.gogateway/gateway-runtime/proto/python_executor.protogateway/gateway-runtime/python-executor/executor/server.pygateway/gateway-runtime/python-executor/main.pygateway/gateway-runtime/python-executor/proto/python_executor_pb2.pygateway/gateway-runtime/python-executor/sdk/policy.pygateway/sample-policies/slugify-body-python/policy-definition.yamlgateway/sample-policies/slugify-body-python/policy.py
✅ Files skipped from review due to trivial changes (3)
- gateway/gateway-runtime/policy-engine/internal/admin/types.go
- gateway/sample-policies/slugify-body-python/policy-definition.yaml
- gateway/gateway-runtime/python-executor/proto/python_executor_pb2.py
🚧 Files skipped from review as they are similar to previous changes (5)
- gateway/gateway-builder/pkg/types/policy.go
- gateway/sample-policies/slugify-body-python/policy.py
- gateway/gateway-runtime/Dockerfile
- gateway/gateway-builder/internal/policyengine/generator.go
- gateway/build-lock.yaml
Refactor policy runtime detection to return (string, error) and propagate failures from DiscoverPoliciesFromBuildFile; update tests to check for errors. Harden ProcessingMode parsing with explicit enum validation and clearer error messages. Simplify plugin registry template parsing and remove an unused formatting helper. Improve Python bridge error handling by logging failures and returning generic internal-error responses (including Content-Type header for response actions). Chain exceptions in python-executor argparse error for proper context.
Merges 1032 upstream commits into the Python Policy feature branch. Conflict resolutions: - build-manifest.yaml: took upstream (v1.x policy versions) - plugin_registry.go.tmpl: merged upstream SDK path with Python loops - Dockerfile: merged upstream base (Go 1.26.1, sdk-core, wso2 user, llm-pricing) with Python feature (python-deps stage, python-executor, PYTHONPATH) - sample-policies/build.yaml: upstream policies + Python sample Post-merge updates: - Renamed build-lock.yaml refs to build-manifest.yaml in feature tests Note: pythonbridge SDK imports still reference sdk/gateway/policy/v1alpha and need manual migration to sdk/core/policy/v1alpha2.
…ha2 interface Introduce a redesigned pythonbridge that implements streaming RPCs and richer execution metadata (trace, deadline) and converts between Go and proto streaming payloads. Add StreamManager improvements (UDS dialer, suppressed late responses, typed StreamRequest/StreamResponse handling) and refactor bridge methods to handle header/body/stream phases, chunking and needs-more decisions. Add Makefile targets for proto generation and Python unit tests; regenerate/extend proto files and add python-executor SDK, executor code and unit tests. Remove sample/default Python policy YAMLs and update policy engine tests to verify copying of python-executor base files and bridge behavior.
|
@coderabbitai resume |
✅ Actions performedReviews resumed. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
gateway/gateway-builder/internal/discovery/manifest.go (1)
243-262: Consider returning an error when no runtime indicators are found.The default fallback to
"go"at line 261 may mask configuration errors. If a directory lacks bothgo.modand.pyfiles, it defaults to Go and fails later inValidateDirectoryStructure. While downstream validation will catch this, returning an error here would provide a clearer diagnostic earlier.🔧 Optional: Return explicit error for ambiguous directories
for _, e := range entries { if !e.IsDir() && filepath.Ext(e.Name()) == ".py" { return "python", nil } } - return "go", nil + return "", fmt.Errorf("unable to detect runtime for %s: no go.mod or .py files found", policyDir) }Note: This would require updating callers to handle the error case explicitly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-builder/internal/discovery/manifest.go` around lines 243 - 262, DetectRuntime currently defaults to returning "go" when neither go.mod nor .py files are found, which can mask config issues; change DetectRuntime to return an explicit error instead of "go" in the ambiguous case (update the final return to something like return "", fmt.Errorf(...)) so callers get an immediate clear diagnostic; ensure the error message references the policyDir and mentions missing runtime indicators; update any callers (e.g., callers that call DetectRuntime before ValidateDirectoryStructure) to handle the returned error appropriately.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@gateway/gateway-builder/internal/discovery/manifest.go`:
- Around line 243-262: DetectRuntime currently defaults to returning "go" when
neither go.mod nor .py files are found, which can mask config issues; change
DetectRuntime to return an explicit error instead of "go" in the ambiguous case
(update the final return to something like return "", fmt.Errorf(...)) so
callers get an immediate clear diagnostic; ensure the error message references
the policyDir and mentions missing runtime indicators; update any callers (e.g.,
callers that call DetectRuntime before ValidateDirectoryStructure) to handle the
returned error appropriately.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9d1bdd5d-79f8-4a69-ab83-295ba67d70cc
📒 Files selected for processing (11)
.gitignoregateway/gateway-builder/cmd/builder/main.gogateway/gateway-builder/internal/buildfile/generator.gogateway/gateway-builder/internal/discovery/discovery_test.gogateway/gateway-builder/internal/discovery/manifest.gogateway/gateway-builder/internal/discovery/policy.gogateway/gateway-builder/internal/policyengine/generator.gogateway/gateway-builder/internal/policyengine/policyengine_test.gogateway/gateway-builder/internal/policyengine/registry.gogateway/gateway-builder/pkg/types/policy.gogateway/gateway-builder/templates/plugin_registry.go.tmpl
✅ Files skipped from review due to trivial changes (3)
- .gitignore
- gateway/gateway-builder/internal/discovery/discovery_test.go
- gateway/gateway-builder/internal/policyengine/generator.go
🚧 Files skipped from review as they are similar to previous changes (2)
- gateway/gateway-builder/internal/policyengine/policyengine_test.go
- gateway/gateway-builder/pkg/types/policy.go
Refine the bridge comment to note that the engine gates phase execution via Mode and that the bridge also performs internal checks as defense-in-depth. Add TestBuildPolicyChain_UsesBridgeModeForParticipation to ensure Kernel.BuildPolicyChain respects a bridge's ProcessingMode when determining RequiresRequestHeader and SupportsRequestStreaming. The test registers a dummy python policy and exercises buffered vs streaming modes, and adds necessary imports for kernel and registry.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
166-188:⚠️ Potential issue | 🟠 MajorPropagate real request context into the needs-more RPCs.
These two paths still use
context.Background(), so streaming decision calls lose cancellation, deadlines, and trace propagation. That means a canceled request can keep waiting on the Python executor until the default timeout expires. If the current streaming-policy interface does not expose a context, the fix likely needs to thread one through that interface or persist the active request context on the bridge before issuing these calls.#!/bin/bash set -euo pipefail echo "Streaming policy interface signatures:" rg -n -C2 'NeedsMore(Request|Response)Data\(' sdk gateway/gateway-runtime/policy-engine echo echo "Current bridge call sites using context.Background():" rg -n -C2 'NeedsMore(Request|Response)Data|context\.Background\(' \ gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.goAlso applies to: 216-238
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 166 - 188, The NeedsMoreRequestData path is using context.Background() which drops cancellation, deadlines and traces; replace those Background contexts by propagating the active request context into the streaming decision calls: thread a context parameter through the NeedsMoreRequestData/NeedsMoreResponseData call sites (or store the active request context on the bridge before invoking buildNeedsMoreRequestDataRequest / buildNeedsMoreResponseDataRequest), then pass that context into buildNeedsMoreRequestDataRequest, streamManager.Execute and any translator calls instead of context.Background(); update the signatures impacted (e.g. NeedsMoreRequestData, buildNeedsMoreRequestDataRequest, streamManager.Execute invocation sites and the symmetric response methods) so cancellations and deadlines on the incoming request are honored and traces propagate to the Python executor.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/Dockerfile`:
- Around line 124-128: The python-deps stage uses envoyproxy/envoy:v1.35.3 which
mismatches the production stage's envoyproxy/envoy:v1.37.1 and can break
compiled C extensions; update the Dockerfile so both stages use the same image
version—either change the FROM line for the python-deps stage to match the
production tag (production) or introduce a top-level ARG (e.g., ENVOY_VERSION)
and reference it from both FROM lines (python-deps and production) to guarantee
consistency.
In `@gateway/gateway-runtime/Makefile`:
- Around line 165-167: The test-proto Makefile target currently regenerates
protobuf bindings and then compiles them, which can silently overwrite committed
generated files; modify the test-proto target to (1) run the existing
regeneration commands (the ones that produce the outputs referenced by the
Makefile variable $(PROTO_GENERATED)), then (2) run a diff check (e.g., git diff
--exit-code or diff -u) comparing the working tree against $(PROTO_GENERATED)
(or explicitly diff the regenerated files) and fail if there are differences so
CI fails on checked-in codegen drift; update the target that currently runs in
test-proto to include this diff/assertion step after regeneration.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 558-568: The DestroyPolicy response currently only logs when
resp.GetSuccess() is false, causing Close() to return nil; change the block
handling DestroyPolicy by treating a non-success response as an error: set
b.closeErr = fmt.Errorf("destroy Python policy instance rejected by executor:
%s", resp.GetErrorMessage()), log the warning with the error, and return so the
Close() caller observes the failure; update any surrounding logic that relies on
b.closeErr (e.g., Close(), closeOnce) to propagate that stored error rather than
swallowing it.
---
Duplicate comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 166-188: The NeedsMoreRequestData path is using
context.Background() which drops cancellation, deadlines and traces; replace
those Background contexts by propagating the active request context into the
streaming decision calls: thread a context parameter through the
NeedsMoreRequestData/NeedsMoreResponseData call sites (or store the active
request context on the bridge before invoking buildNeedsMoreRequestDataRequest /
buildNeedsMoreResponseDataRequest), then pass that context into
buildNeedsMoreRequestDataRequest, streamManager.Execute and any translator calls
instead of context.Background(); update the signatures impacted (e.g.
NeedsMoreRequestData, buildNeedsMoreRequestDataRequest, streamManager.Execute
invocation sites and the symmetric response methods) so cancellations and
deadlines on the incoming request are honored and traces propagate to the Python
executor.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 90f02c2f-5b35-4af0-8e8c-a8fdb6206318
📒 Files selected for processing (6)
gateway/gateway-runtime/Dockerfilegateway/gateway-runtime/Makefilegateway/gateway-runtime/policy-engine/internal/admin/handlers_test.gogateway/gateway-runtime/policy-engine/internal/admin/types.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/factory_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- gateway/gateway-runtime/policy-engine/internal/admin/types.go
- gateway/gateway-runtime/policy-engine/internal/admin/handlers_test.go
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Set bridge.closeErr and return when DestroyPolicy is rejected by the Python executor, and log the wrapped error. Add unit test coverage with a fake executor client to verify Close returns the executor error, is idempotent, and only calls DestroyPolicy once. Also add a git-diff check to Makefile's test-proto target to ensure generated protobuf files are up-to-date.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go (1)
136-157: Consider asserting that the unexpected-RPC stubs are never hit.
ExecuteStream,HealthCheck, andInitPolicyreturn errors but don't fail the test if called. Swappingerrors.New(...)fort.Fatalf(...)(via a captured*testing.T) would make accidental regressions — e.g.,Close()starting to issue a health check — surface immediately instead of being swallowed by the caller's error path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go` around lines 136 - 157, The fakePythonExecutorClient's unexpected-RPC stubs (ExecuteStream, HealthCheck, InitPolicy) should fail the test if invoked: add a captured *testing.T field to fakePythonExecutorClient and, in ExecuteStream/HealthCheck/InitPolicy, call t.Fatalf with a clear "unexpected <MethodName> call" message (then return zero values to satisfy signatures); update the test that constructs fakePythonExecutorClient to pass the *testing.T, and ensure the test imports testing; keep DestroyPolicy unchanged.gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go (1)
525-547: Error parameter is unused in*ErrorActionhelpers.Each helper takes
err errorbut never reads it — the caller already logs the error at the call site. Either drop the parameter for clarity, or thread the error into the immediate-response body/headers (e.g., a structured error code) so operators can correlate 500s with the underlying failure without having to cross-reference logs.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around lines 525 - 547, These helper methods (requestBodyErrorAction, responseBodyErrorAction, requestHeaderErrorAction, responseHeaderErrorAction, streamingRequestErrorAction, streamingResponseErrorAction) accept an err parameter that is never used; remove the unused err parameter from their signatures and update all call sites to stop passing an error, keeping the same return types (policy.RequestAction, policy.ResponseAction, policy.RequestHeaderAction, policy.ResponseHeaderAction, policy.StreamingRequestAction, policy.StreamingResponseAction); ensure any callers or tests that previously relied on the old signatures are updated accordingly and run the build/tests to confirm no remaining references to the removed parameter.gateway/gateway-runtime/Makefile (1)
153-163: Consider aligning the Go and Pythonprotocinvocations.The Go step
cds intoproto/and passes../$(GO_PROTO_OUT), while the Python step runs from the Makefile directory and passes$(PROTO_FILE)directly. Both work, but the asymmetry makes the recipe harder to follow and prevents reusing a common include-path variable. Running both invocations from the same working directory (e.g., keeping the Makefile CWD and using-I protofor both) would make the target easier to extend and bring it closer to the 5-line body threshold flagged bycheckmake.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-runtime/Makefile` around lines 153 - 163, The proto Makefile recipe should run both protoc invocations from the same CWD and use a common include path; remove the cd proto in the Go step and change both commands to pass -I proto, call the proto file as proto/python_executor.proto (or use $(PROTO_FILE) with -I proto), and adjust Go outputs to --go_out=$(GO_PROTO_OUT) and --go-grpc_out=$(GO_PROTO_OUT) (remove the ../ prefix), while preserving the PATH="$(GO_BIN):$$PATH" prefix for the Go invocation and keeping the Python invocation using $(PYTHON) -m grpc_tools.protoc; update the inputs and output flags in the proto target so both invocations are symmetric and reuse the same include-path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 166-189: NeedsMoreRequestData and NeedsMoreResponseData currently
call b.streamManager.Execute with context.Background(), dropping cancellation,
deadlines and traces; update these helpers to use a proper context: either (A)
store a per-execution context on the bridge when
OnRequestBodyChunk/OnResponseBodyChunk (or an earlier lifecycle method) runs and
read that context inside NeedsMoreRequestData/NeedsMoreResponseData before
calling buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and
streamManager.Execute (e.g., use b.execCtx or similar), or (B) if a
per-execution context isn’t available yet, derive a bounded context with
context.WithTimeout(context.Background(), getTimeout()) and pass that into
buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and
streamManager.Execute to cap work; make sure to replace the context.Background()
usages in NeedsMoreRequestData and NeedsMoreResponseData with the chosen context
so cancelation and timeouts propagate to the Python executor.
---
Nitpick comments:
In `@gateway/gateway-runtime/Makefile`:
- Around line 153-163: The proto Makefile recipe should run both protoc
invocations from the same CWD and use a common include path; remove the cd proto
in the Go step and change both commands to pass -I proto, call the proto file as
proto/python_executor.proto (or use $(PROTO_FILE) with -I proto), and adjust Go
outputs to --go_out=$(GO_PROTO_OUT) and --go-grpc_out=$(GO_PROTO_OUT) (remove
the ../ prefix), while preserving the PATH="$(GO_BIN):$$PATH" prefix for the Go
invocation and keeping the Python invocation using $(PYTHON) -m
grpc_tools.protoc; update the inputs and output flags in the proto target so
both invocations are symmetric and reuse the same include-path.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go`:
- Around line 136-157: The fakePythonExecutorClient's unexpected-RPC stubs
(ExecuteStream, HealthCheck, InitPolicy) should fail the test if invoked: add a
captured *testing.T field to fakePythonExecutorClient and, in
ExecuteStream/HealthCheck/InitPolicy, call t.Fatalf with a clear "unexpected
<MethodName> call" message (then return zero values to satisfy signatures);
update the test that constructs fakePythonExecutorClient to pass the *testing.T,
and ensure the test imports testing; keep DestroyPolicy unchanged.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go`:
- Around line 525-547: These helper methods (requestBodyErrorAction,
responseBodyErrorAction, requestHeaderErrorAction, responseHeaderErrorAction,
streamingRequestErrorAction, streamingResponseErrorAction) accept an err
parameter that is never used; remove the unused err parameter from their
signatures and update all call sites to stop passing an error, keeping the same
return types (policy.RequestAction, policy.ResponseAction,
policy.RequestHeaderAction, policy.ResponseHeaderAction,
policy.StreamingRequestAction, policy.StreamingResponseAction); ensure any
callers or tests that previously relied on the old signatures are updated
accordingly and run the build/tests to confirm no remaining references to the
removed parameter.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 168ce9db-9c10-4df8-9a4f-53581358bc58
📒 Files selected for processing (5)
gateway/gateway-runtime/Dockerfilegateway/gateway-runtime/Makefilegateway/gateway-runtime/policy-engine/internal/admin/handlers.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.gogateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- gateway/gateway-runtime/Dockerfile
- gateway/gateway-runtime/policy-engine/internal/admin/handlers.go
| func (b *bridge) NeedsMoreRequestData(accumulated []byte) bool { | ||
| if b.mode.RequestBodyMode != policy.BodyModeStream { | ||
| return false | ||
| } | ||
|
|
||
| req, err := b.buildNeedsMoreRequestDataRequest(context.Background(), accumulated) | ||
| if err != nil { | ||
| b.slogger.Error("Failed to build request streaming decision payload", "error", err) | ||
| return false | ||
| } | ||
|
|
||
| resp, err := b.streamManager.Execute(context.Background(), req) | ||
| if err != nil { | ||
| b.slogger.Error("Failed to execute Python request streaming decision", "error", err) | ||
| return false | ||
| } | ||
|
|
||
| decision, err := b.translator.ToGoNeedsMoreDecision(resp) | ||
| if err != nil { | ||
| b.slogger.Error("Failed to translate Python request streaming decision", "error", err) | ||
| return false | ||
| } | ||
| return decision | ||
| } |
There was a problem hiding this comment.
NeedsMoreRequestData/NeedsMoreResponseData drop the request context.
Both hooks call b.streamManager.Execute(context.Background(), ...), so cancellation, deadlines, and tracing from the in-flight request cannot propagate into the Python executor. The SDK interface (NeedsMoreRequestData(accumulated []byte) bool) doesn't accept a context.Context, so a local fix here is limited, but two options are worth considering:
- Store a per-execution context on the bridge when
OnRequestBodyChunk/OnResponseBodyChunk(or an earlier phase) runs and reuse it from these helpers, or - Derive a bounded context via
context.WithTimeout(context.Background(), getTimeout())to at least cap executor work when the caller has abandoned the request.
Right now a hung or slow Python executor will keep processing streaming-decision RPCs for a request the gateway has already cancelled.
Also applies to: 216-239
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gateway/gateway-runtime/policy-engine/internal/pythonbridge/bridge.go` around
lines 166 - 189, NeedsMoreRequestData and NeedsMoreResponseData currently call
b.streamManager.Execute with context.Background(), dropping cancellation,
deadlines and traces; update these helpers to use a proper context: either (A)
store a per-execution context on the bridge when
OnRequestBodyChunk/OnResponseBodyChunk (or an earlier lifecycle method) runs and
read that context inside NeedsMoreRequestData/NeedsMoreResponseData before
calling buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and
streamManager.Execute (e.g., use b.execCtx or similar), or (B) if a
per-execution context isn’t available yet, derive a bounded context with
context.WithTimeout(context.Background(), getTimeout()) and pass that into
buildNeedsMoreRequestDataRequest/buildNeedsMoreResponseDataRequest and
streamManager.Execute to cap work; make sure to replace the context.Background()
usages in NeedsMoreRequestData and NeedsMoreResponseData with the chosen context
so cancelation and timeouts propagate to the Python executor.
Related Discussion
Purpose
The WSO2 API Gateway currently supports only Go-based policies. This limits policy authors to Go, which has a steep learning curve and lacks the rich machine-learning and data-processing ecosystem available in Python. API developers and AI/ML practitioners need to write gateway policies using Python — for example, to apply NLP transformations, call ML model APIs, manipulate structured data with libraries like
pandas, or perform text processing withspacy,slugify, etc.This PR introduces a Python Policy Runtime that allows developers to write, package, and deploy gateway policies in Python while integrating seamlessly with the existing Go-based Policy Engine. Python policies follow the same lifecycle model as Go policies (one instance per route, factory pattern,
on_request/on_response/close) and participate in the same policy chain execution.Goals
sdk/policy.py) that mirrors the Gopolicy/v1alphainterface, exposing the same context types (RequestContext,ResponseContext,SharedContext), action types (UpstreamRequestModifications,ImmediateResponse,UpstreamResponseModifications), and processing mode enums.PythonBridge,BridgeFactory,StreamManager) that integrates Python policies transparently into the existing Policy Engine kernel — no kernel changes required.python_policy_registry.py, mergerequirements.txtfiles, copy Python policy sources, and register Python policies viaBridgeFactoryin the generatedplugin_registry.go.pipPackagefield inbuild.yaml, enabling teams to publish and consume Python policies from PyPI or private indices.slugify-body-python) that demonstrates the full Python policy lifecycle: factory, request-phase body transformation, graceful error handling, andclose().Approach
The implementation follows a co-process sidecar pattern:
New Components
1. Python Executor (
gateway/gateway-runtime/python-executor/)A Python gRPC server that:
InitPolicy,DestroyPolicy,ExecuteStream(bidirectional), andHealthCheck.InitPolicy: Called once per route during chain building. Invokes theget_policy(metadata, params)factory and stores the instance with a UUIDinstance_id.DestroyPolicy: Called when a route is removed. Callsinstance.close()and removes it from the store.ExecuteStream: Hot-path bidirectional stream. Uses an asyncio reader task +ThreadPoolExecutorfor concurrent policy execution. Responses are out-of-order; the Go side correlates byrequest_id.PolicyLoader: Importspython_policy_registry.py(generated by Gateway Builder) and cachesget_policy()factory functions.PolicyInstanceStore: Thread-safedictmappinginstance_id → Policy.Translator: Converts betweenprotobufmessages and Python SDK types.2. Python Policy SDK (
gateway/gateway-runtime/python-executor/sdk/policy.py)Mirrors Go's
policy/v1alphainterface:PolicyABC with abstracton_request(),on_response(), and default no-opclose().SharedContext,RequestContext,ResponseContext,Body.UpstreamRequestModifications,ImmediateResponse,UpstreamResponseModifications.ProcessingMode,HeaderProcessingMode,BodyProcessingMode.3. gRPC Contract (
gateway/gateway-runtime/proto/python_executor.proto)Defines
PythonExecutorServicewith:ExecuteStream(stream ExecutionRequest) → stream ExecutionResponse— hot-path execution.InitPolicy(InitPolicyRequest) → InitPolicyResponse— instance lifecycle.DestroyPolicy(DestroyPolicyRequest) → DestroyPolicyResponse— instance lifecycle.HealthCheck(HealthCheckRequest) → HealthCheckResponse— readiness probe.4. Go Python Bridge (
gateway/gateway-runtime/policy-engine/internal/pythonbridge/)PythonBridge(bridge.go): Implementspolicy.Policy. On creation callsInitPolicyRPC to get aninstance_id. OnOnRequest/OnResponsebuilds anExecutionRequestproto, delegates toStreamManager, merges returnedupdated_metadataback intoSharedContext, and translates theExecutionResponseto a Go action. ImplementsClose()to callDestroyPolicyRPC.BridgeFactory(factory.go): Registered as thePolicyFactoryfor Python policies in the generatedplugin_registry.go. CallsInitPolicyto create the Python instance before returning thePythonBridge. Also exposesPythonHealthAdapterfor the admin health endpoint.StreamManager(client.go): Singleton managing the persistent bidirectionalExecuteStreamgRPC connection to the Python Executor over UDS. Thread-safe: usessendMuto serializestream.Send()calls and apendingReqsmap to correlaterequest_id → response channel. Also providesInitPolicy,DestroyPolicy, andHealthCheckunary RPC calls. Configurable viaPYTHON_EXECUTOR_SOCKETandPYTHON_POLICY_TIMEOUTenv vars.Translator(translator.go): ConvertsExecutionResponseproto to Go SDKRequestAction/ResponseAction, including error handling.5. Gateway Builder Extensions (
gateway/gateway-builder/)discovery/): A policy directory with nogo.modbut at least one.pyfile is classified asruntime: "python".pipPackagesupport (pkg/types/manifest.go,discovery/python_module.go): NewpipPackagefield inbuild.yaml(format:<package>==<version>[@<index-url>]) downloads a Python policy from PyPI or a private index usingpip install --targetwith cross-platform executable resolution (pip3→pip→python3 -m pip).policyengine/generator.go): Copies the Python Executor base, copies policy source directories intooutput/python-executor/policies/, generatespython_policy_registry.py, and merges allrequirements.txtfiles.plugin_registry.go.tmpl: GeneratesBridgeFactoryregistrations for Python policies, readingProcessingModefrompolicy-definition.yamlat build time. Imports thepythonbridgepackage only when Python policies are present ({{- if .HasPythonPolicies }}).validation/structure.go):ValidatePythonDirectoryStructurechecks forpolicy-definition.yamland at least one.pyfile.6. Docker & Entrypoint (
gateway/gateway-runtime/)Dockerfile: Added a newpython-depsbuild stage that installs Python dependencies from the mergedrequirements.txtinto/app/python-libsusingpip install --target. The production stage copies the Python Executor (/app/python-executor) and pre-installed libs (/app/python-libs), setsPYTHONPATH, and installspython3in the final image. Thepython3-pipandpython3-venvtools are not present in the production image.docker-entrypoint.sh: Conditionally starts the Python Executor whenmain.pyis present in the image (i.e., at least one Python policy was compiled). Waits for the UDS socket to be ready before starting the Policy Engine. Handles graceful SIGTERM shutdown of all three processes (Python Executor, Policy Engine, Envoy). Configurable viaPYTHON_POLICY_WORKERS,PYTHON_POLICY_TIMEOUT,PYTHON_EXECUTOR_SOCKET.7. Admin Health Endpoint (
gateway/gateway-runtime/policy-engine/internal/admin/)types.go: AddedPythonExecutorHealth { Status, LoadedPolicies }toHealthResponse.handlers.go: AddedPythonHealthCheckerinterface;HealthHandleroptionally calls it.main.go: InstantiatesPythonHealthAdapterif the Python Executor socket exists at startup.8. Sample Policy (
gateway/sample-policies/slugify-body-python/)A canonical reference implementation (
SlugifyBodyPolicy) that:python-slugify(a third-party PyPI package) to convert request body text into URL-safe slugs.ProcessingModewithrequestBodyMode: BUFFER, parameter parsing, body decoding/encoding, error handling, andclose().policy-definition.yaml,requirements.txt, and a heavily commentedpolicy.py.User stories
slugify-body-python) that shows the full Python policy lifecycle, including how to use third-party Python libraries.Documentation
N/A — Documentation for the Python policy authoring guide, SDK reference, and
pipPackagebuild.yaml field should be added as follow-up work. Theslugify-body-python/policy.pyfile is heavily commented and serves as the inline developer guide for this release.Automation tests
Unit tests
Gateway Builder (
gateway/gateway-builder):TestDetectRuntime_GoPolicy/TestDetectRuntime_PythonPolicy/TestDetectRuntime_GoTakesPrecedence/TestDetectRuntime_EmptyDirectory/TestDetectRuntime_NonexistentDirectory— runtime auto-detection logic.TestParsePipPackageRef_SimplePackage/TestParsePipPackageRef_WithPrivateIndex/TestParsePipPackageRef_MissingVersion/TestParsePipPackageRef_EmptyVersion/TestParsePipPackageRef_EmptyPackageName— pip package reference parsing.TestDiscoverPoliciesFromBuildFile_PythonAutoDetect/TestDiscoverPoliciesFromBuildFile_MixedGoAndPython— build file discovery with Python policies.TestGenerateCode_Success/TestGenerateCode_EmptyPolicies/TestGenerateCode_MultiplePolicies— code generation including thegeneratePythonExecutorBasestep.Note: The
pythonbridgeGo package (gateway/gateway-runtime/policy-engine/internal/pythonbridge/) does not have unit tests yet — these require a live Python Executor or test doubles. Unit tests forPythonBridge,StreamManager,BridgeFactory, andTranslatorare planned as follow-up work.Integration tests
No dedicated integration test feature file for Python policies has been added yet (the
slugify-body-pythonpolicy is compiled and registered inbuild.yaml, but no BDD scenario exercises it end-to-end). Integration tests for the Python policy runtime — verifying request body slugification,SharedContextmetadata propagation, policy lifecycle (InitPolicy / DestroyPolicy), and Python Executor health endpoint — are planned as follow-up work.Security checks
--targetdirectory inside the Docker image.pipandvenvare absent from the production image.wso2user (uid=10001).ExecutionRequestproto parameters are passed asgoogle.protobuf.Struct(structured, typed); no shell interpolation or SQL involved.banditis recommended as follow-up.)Samples
slugify-body-python— Canonical Python Policy ReferenceThis policy is the authoritative sample for Python policy development. It:
python-slugifylibrary."Hello World! Héllo?"→"hello-world-hello"get_policy(metadata, params)factory — parsesseparatorandmaxLengthparams once.on_request(ctx, params)hot path — guard → decode → validate → transform → respond.close()— no-op here, but shows where to put cleanup logic.400 Bad RequestImmediateResponsefor empty bodies instead of silently passing them.requirements.txt— demonstrates how policy-level pip dependencies are merged into the executor's dependency set at build time.Usage in
build.yaml:Usage in an API definition:
PyPI-distributed policy (
pipPackage)Python policies can be distributed as PyPI packages and consumed via:
Related PRs
N/A
Test environment
envoyproxy/envoy:v1.35.3)linux/amd64,linux/arm64)