Skip to content

implement HTTP action capability without deduplication#111

Merged
jinhoonbang merged 30 commits into
mainfrom
PRODCRE-412-implement-http-action-capability-without-turn-taking
Jul 2, 2025
Merged

implement HTTP action capability without deduplication#111
jinhoonbang merged 30 commits into
mainfrom
PRODCRE-412-implement-http-action-capability-without-turn-taking

Conversation

@jinhoonbang

@jinhoonbang jinhoonbang commented Jun 5, 2025

Copy link
Copy Markdown
Contributor
  • create a new standalone HTTP action capability
  • implement both direct and gateway-based outbound requests

https://smartcontract-it.atlassian.net/browse/PRODCRE-410

@jinhoonbang jinhoonbang requested a review from a team as a code owner June 5, 2025 16:44
@jinhoonbang jinhoonbang requested a review from Copilot June 5, 2025 16:44

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Implements the HTTP Actions capability module without deduplication, providing both direct and gateway-based request dispatch, with full validation, backoff retry, and test coverage.

  • Introduces new http module with go.mod and dependencies
  • Adds common proxy (direct HTTP client) and gateway outbound proxy (with retry/backoff)
  • Implements input validation/defaults for HTTP actions and wires up the service/CLI entrypoint
  • Covers behavior with unit tests for validation, gateway proxy, and backoff logic

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
http/go.mod New module declaration and required dependencies
http/common/service_config.go Defines service and gateway connection configuration structs
http/common/proxy.go Direct HTTP client implementation with limits
http/gateway/gateway_outbound_proxy_test.go Tests for gateway proxy’s connection retry, send, and backoff
http/cmd/action/main.go CLI entrypoint to serve the HTTP action capability
http/action/validate.go Input validation and default application for HTTP requests
http/action/validate_test.go Unit tests covering validation rules and limits
http/action/action.go Service implementation wiring proxy client based on ProxyMode
Comments suppressed due to low confidence (1)

http/common/proxy.go:17

  • The comment indicates 1 MB but the value 10 * 1024 * 1024 actually sets ~10 MB. Update the comment to reflect 10 MB or adjust the constant to match the intended size.
const defaultMaxBodyLength = 10 * 1024 * 1024 // 1 MB

Comment thread http/action/action.go Outdated
Comment thread http/action/validate.go Outdated
Comment thread http/gateway/gateway_outbound_proxy_test.go Outdated
Comment thread http/gateway/gateway_outbound_proxy_test.go Outdated
jinhoonbang and others added 4 commits June 6, 2025 02:03
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Comment thread http/common/proxy.go Outdated
for k, v := range input.Headers {
var values []string
values = append(values, v)
inputHeaders[k] = values

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Just noting that if a user assigns the same header key with multiple values e.g.

Accepted-key: 123
Accepted-key: 456

This logic will only propagate Accepted-key: 456, which could be confusing. Sometimes you might see headers declared that way, although the Input.Headers type being a map might discourage users from doing so.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input.Headers type being a map might discourage users from doing so

yeah +1 to this. Since headers are a map and header value is defined as a string, users are encouraged to set headers as:

Accepted-key: 123,456

Comment thread http/action/action.go
_ core.PipelineRunnerService,
_ core.RelayerSet,
_ core.OracleFactory,
gc core.GatewayConnector,

@vreff vreff Jun 9, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is core.GatewayConnector a new feature capabilities have?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It existed in core (already used for functions, web api trigger and target in DAG workflows, etc)

New thing here is that core.GatewayConnector is now exposed as a gRPC service (PR) so that standard capabilities (LOOP plugins) have access to it.

case <-ctx.Done():
return "", ctx.Err()
default:
gateway, err := selector.NextGateway()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long can this node stay connected to the selected gateway? Must it establish a connection with a fresh gateway for every single request?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like "no" this is only done in the initialization of the capability, and not per-request. Which is great!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Gateway nodes and workflow nodes have long-lived WS connections

Comment thread http/action/validate.go Outdated
Comment thread http/action/validate.go Outdated
Comment thread http/common/proxy.go Outdated
Comment thread http/common/proxy.go Outdated
Comment thread http/common/proxy.go Outdated
Comment thread http/common/proxy.go Outdated
Comment thread http/common/proxy.go Outdated
for k, v := range input.Headers {
var values []string
values = append(values, v)
inputHeaders[k] = values

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment on lines +108 to +114
workflowAllow, globalAllow := p.outgoingRateLimiter.AllowVerbose(metadata.WorkflowOwner)
if !workflowAllow {
return nil, errors.New(errorOutgoingRatelimitWorkflowOwner)
}
if !globalAllow {
return nil, errors.New(errorOutgoingRatelimitGlobal)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather global allow be based on how many workflows you have. It's unclear to auditors (or users who dont' have implicit trust of your workflow) of a workflow that you can make it fail intentionally by sending HTTP requests in another workflow.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the suggestion to increase the limit proportional to the number of workflows deployed for given workflow owner?

In my mind, these limits exist for 2 reasons:

  1. prevent malicious users from DoSing other users
  2. prevent workflow node(s) from DoSing gateway nodes

for #1, rate limiting based on orgID seems most logical. Since orgID is not available, workflow owner is the next best option. globalAllow exists to prevent #2.

Also, I imagine we would have rate-limiting in other parts of the stack like the engine. These rate-limits at capability-level are additional security measures. One good thing is that we can also introduce ability to override at workflow owner level (today, all workflowOwners get the same limit) so that we can react to malicious users if needed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When workflows can impact other workflows, it's hard to reason about what a workflow does. For example, If I need to prove my workflow, and you expect that as long as the weather network is up (and I have PoR on my funds that are locked in), you're covered for insurance, if I can create a second workflow that intentionally DoSes my other one it could prevent the insurance from paying out properly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

synced offline. Not a blocker for now. This specific concern is around DoS vectors among workflows that belong to the same workflow owner, not cross-tenant.

Comment thread http/gateway/gateway_outbound_proxy.go Outdated
}
}

func (p *gatewayOutboundProxy) getMessageID(metadata capabilities.RequestMetadata) string {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this ID do?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used to track outbound HTTP request/response between gateway node and workflow node. Workflow node sends a outbound request to the gateway with this messageID and waits for gateway messages with the same messageID.

Comment on lines +276 to +278
} else if !globalAllow {
errorMsg = errorIncomingRatelimitGlobal
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I DoS other workflows?

Comment thread http/common/proxy.go Outdated
}
}

func (h *httpClientProxy) SendRequest(ctx context.Context, metadata capabilities.RequestMetadata, input *httpactions.Request) (*httpactions.Response, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this client it seems like it would be straightforward to incorporate batched requests. Is there anything about the Gateway client that would add any difficulty?

From our end some important use cases include attested batch RPC calls like fetching events from many different sources at once, fetching feed data from many different APIs at once, etc... Would be quite natural as a behavior this capability would support.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these batched requests defined by workflow authors via SDK or is this something else?

If former, I think it should be trivial to support batched requests by introducing a new interface in the SDK like:

SendBatchRequests(
    ctx context.Context,
    metadata capabilities.RequestMetadata,
    inputs []*http.Request
) ([]*http.Response, error)   

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, defined by the workflow author + SDK as a single call to the capability. I think it will be more optimal down the road as our TEE will produce an attestation of the result which is ~1kb in size, so will be great for us to have support for this vs attesting each individual request.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense. sounds like something that we can easily converge on.

Comment thread http/common/proxy.go Outdated
// Build Outputs
outputs := &httpactions.Response{
StatusCode: uint32(resp.StatusCode), //nolint:gosec // G115
Headers: headers,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a field users can define to return/not return response headers? I'm nervous about passing response headers around as those may be sensitive and it is not in scope for our team to conceal them.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not as of now. Is there something particular about the header or does the same concern apply to body too?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the response body is returned in plaintext and so not all use cases will be a good fit for confidential HTTP, like fetching secret data from a database. We will align user expectations with that. However, important use cases like submitting payments / transactions etc.. will be fine since they generally return a transaction ID rather than anything sensitive, while the sensitive details exit in the request body & headers.

The reasoning around response headers being worrisome is that it's not obvious what information might be returned there and it will change case-by-case, compared to the response body which should be obvious for the user one way or another. Moreover, I would assume most users will not be using response headers.

Allowing it as a request variable returnResponseHeaders would make this a two-way door decision for us & will still allow users who need the response headers to opt-in and accept the risks.

@jinhoonbang jinhoonbang Jun 16, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense I think we should follow up with product on this. Tracking this as an open question here

@jinhoonbang jinhoonbang Jun 24, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vreff I gave this a little more thought - redacting/filtering of response headers would happen in the gateway node or the capability node. Isn't that a concern for the user? If sensitive data is included in the header, malicious gateway node and/or the capability node would have access to it. This makes me think ultimately, users must be responsible for ensuring their endpoint doesn't return any sensitive data in both the body and the header.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, in the non-TEE case you're not really safe (as is the case with all your secrets without a TEE :p). When you incorporate TEEs to execute the request they will simply not propagate the response headers back to the outside world.

An alternative is that we can simply do this ourselves once we add our NewConfidentialHTTPClientProxy to this capability, but then the responseHeaders field in the capability output object will always just be empty, which may confuse users. In my opinion a returnResponseHeaders field that is false by default and has some documentation around it will be the most user-friendly way to deal with response headers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it. In that case, what about adding returnResponseHeaders to the SDK when we start offering "TEE" mode / confidential HTTP? Adding a new field is backwards-compatible and would be less confusing to the users initially instead having a field that doesn't do anything in non-TEE mode.

@vreff vreff Jun 27, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's backwards-compatible that's good with me. That's just a property of mapstructure.Decode I guess? If we are adding a new boolean field that is false by default:

  • new capability using the old SDK will receive a falsy value for that field (nil)
  • old capability using the new SDK will ignore it.

Is that correct?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes that sounds right 👍

@jinhoonbang jinhoonbang requested a review from a team June 25, 2025 10:19
@jinhoonbang jinhoonbang requested a review from a team as a code owner June 26, 2025 15:24
Comment thread http/action/common/proxy.go Outdated
Comment thread http/action/common/proxy.go Outdated
}

func (h *httpClientProxy) SendRequest(ctx context.Context, metadata capabilities.RequestMetadata, input *httpactions.Request) (*httpactions.Response, error) {
req, err := http.NewRequestWithContext(ctx, input.Method, input.Url, bytes.NewReader(input.Body))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to sanitize the request, disallow domains like localhost, private IP ranges etc. This was already done for PoR HTTP Fetch and passed an audit. Make sure to include all that work here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

httpClientProxy is only experimental so I haven't prioritized that work. Should we create a ticket for now and come back to it?

on the other hand, gatewayOutboundProxy is meant to be used for production and all requests will go through the safeurl client in the gateway (related PR)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, ended up addressing this comment in this PR along with some unit tests

Comment thread http/action/common/service_config.go Outdated
// LimitsConfig groups HTTP-related configuration fields for both requests and responses.
type LimitsConfig struct {
// MaxTimeoutMs is the timeout for HTTP requests in milliseconds.
MaxTimeoutMs uint32 `toml:"timeoutMs" json:"timeoutMs" yaml:"timeoutMs" mapstructure:"timeoutMs"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just "TimeoutMs"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maxTimeoutMs is ok because timeoutMs refers to optional timeout set by the workflow author and maxTimeoutMs is the upper bound.

@jinhoonbang jinhoonbang requested review from bolekk, nolag and vreff June 27, 2025 10:06
Comment thread http/README.md Outdated
Comment on lines +26 to +37
[limits]
timeoutMs = 10000 # Request timeout in milliseconds
maxResponseBytes = 1048576 # Maximum allowed response size in bytes
headerCount = 50 # Maximum number of headers
maxHeaderKeyLength = 256 # Maximum length of a header key
maxRequestBytes = 1048576 # Maximum allowed request size in bytes
maxResponseBytes = 1048576 # Maximum allowed response size in bytes

[gatewayConnection]
initialIntervalMs = 100 # Initial retry interval in milliseconds
maxElapsedTimeMs = 10000 # Maximum total retry time in milliseconds
multiplier = 2.0 # Backoff multiplier for retries

@DeividasK DeividasK Jun 27, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great documentation! Are all of these required, or do we have defaults for some?

Also, does it come from the capability config from Capabilities Registry, or do we expect all job specs we distribute to match?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have defaults for some. that reminds me I should add links to the defaults in the readme

TIL capabilities registry has a bytes config. I don't see a reason why config couldn't come from a Capabilities Registry per DON?

type ServiceConfig struct {
// RateLimiter configuration for messages incoming to this node from the gateway.
// The sender is a Gateway node, which is identified by the Gateway ID.
RateLimiter ratelimit.RateLimiterConfig `json:"incomingRateLimiter"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe call it IncomingRateLimiter?

case resp := <-responseCh:
lggr.Debugw("received response from gateway")
if resp.ErrorMessage != "" {
if isRateLimitError(resp.ErrorMessage) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this extra wrapping instead of just sending the error directly?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like they can be merged. Will address this in the next PR

@jinhoonbang jinhoonbang added this pull request to the merge queue Jul 2, 2025
Merged via the queue into main with commit 213bb9f Jul 2, 2025
20 checks passed
@jinhoonbang jinhoonbang deleted the PRODCRE-412-implement-http-action-capability-without-turn-taking branch July 2, 2025 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants