in_splunk: Implement handling remote addr feature #11398
📝 Walkthrough
Adds configurable remote-address extraction to the Splunk input: new config options to enable injection and set the record key, per-request extraction from X-Forwarded-For or peer address, propagation into payload processing, and per-request state to avoid leakage.
Sequence Diagram
sequenceDiagram
participant Client
participant Handler as "Splunk Request Handler"
participant Headers as "HTTP Header Parser"
participant Resolver as "XFF / Peer Resolver"
participant Processor as "Payload Processor"
participant Emitter as "Record Emitter"
Client->>Handler: Send HTTP request (may include X-Forwarded-For)
Handler->>Headers: Parse request headers
Headers->>Resolver: Lookup "x-forwarded-for"
alt XFF present
Resolver->>Resolver: Extract first IP from XFF
else XFF absent
Resolver->>Resolver: Use peer connection address
end
Resolver->>Handler: Provide remote address
Handler->>Processor: Process payload + remote address
Processor->>Emitter: Append remote_addr key to record (if enabled)
Emitter->>Emitter: Emit enriched log event
Handler->>Handler: Clear per-request remote address state
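The resolver step in the diagram (first X-Forwarded-For entry, otherwise the peer address) can be sketched in isolation. This is a minimal standalone illustration, not the plugin's code: `first_xff_entry` and `resolve_remote_addr` are hypothetical names, and the peer address is passed in as a plain string instead of being read from a Fluent Bit connection.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: point *out at the first comma-separated entry of an
 * X-Forwarded-For value, trimmed of spaces and tabs. Nothing is copied, so
 * *out borrows from xff. Returns 0 on success, -1 if no entry is found. */
static int first_xff_entry(const char *xff, size_t xff_len,
                           const char **out, size_t *out_len)
{
    size_t start = 0;
    size_t end;

    if (xff == NULL || xff_len == 0) {
        return -1;
    }

    /* skip leading whitespace */
    while (start < xff_len && (xff[start] == ' ' || xff[start] == '\t')) {
        start++;
    }

    /* the first entry ends at the first comma or at the end of the value */
    end = start;
    while (end < xff_len && xff[end] != ',') {
        end++;
    }

    /* trim trailing whitespace */
    while (end > start && (xff[end - 1] == ' ' || xff[end - 1] == '\t')) {
        end--;
    }

    if (end == start) {
        return -1;
    }

    *out = xff + start;
    *out_len = end - start;
    return 0;
}

/* Hypothetical resolver: prefer the XFF entry, else fall back to the peer. */
static int resolve_remote_addr(const char *xff, size_t xff_len,
                               const char *peer,
                               const char **out, size_t *out_len)
{
    if (first_xff_entry(xff, xff_len, out, out_len) == 0) {
        return 0;
    }
    if (peer != NULL && peer[0] != '\0') {
        *out = peer;
        *out_len = strlen(peer);
        return 0;
    }
    return -1;
}
```

Because the result is a borrowed pointer plus a length, the caller must finish using it before the header buffer or the peer string goes away. X-Forwarded-For is supplied by the client or by intermediate proxies, so whether its first entry can be trusted depends on the deployment.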
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ba50be0c16
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: In splunk_prot_handle_ng() the per-request fields
context->current_remote_addr and context->current_remote_addr_len are set but
never reset, allowing stale addresses to persist across requests; update the
function to ensure these fields are cleared before every return (or funnel
returns through a single cleanup label), e.g., after using
extract_remote_address(), when falling back to peer
(flb_connection_get_remote_address(parent_session->connection)), and prior to
any early exits: set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0 (or free/reset as appropriate) so the same
cleanup performed in splunk_prot_handle() is applied here.
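The "single cleanup label" option mentioned above can be sketched as follows. This is a standalone illustration under assumptions: `struct demo_ctx` stands in for the plugin context and carries only the two per-request fields, and `demo_handle` is a hypothetical handler, not `splunk_prot_handle_ng()` itself.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the plugin context; only the two per-request fields. */
struct demo_ctx {
    const char *current_remote_addr;
    size_t current_remote_addr_len;
};

/* Hypothetical handler: every exit path goes through the cleanup label,
 * so the per-request address can never survive into the next request. */
static int demo_handle(struct demo_ctx *ctx, const char *addr,
                       const char *payload)
{
    int ret = -1;

    ctx->current_remote_addr = addr;
    ctx->current_remote_addr_len = (addr != NULL) ? strlen(addr) : 0;

    if (payload == NULL) {
        goto cleanup;           /* early exit: missing body */
    }
    if (payload[0] == '\0') {
        goto cleanup;           /* early exit: empty body */
    }

    /* ... process the payload using the per-request address ... */
    ret = 0;

cleanup:
    ctx->current_remote_addr = NULL;
    ctx->current_remote_addr_len = 0;
    return ret;
}
```

The alternative of clearing the fields before each `return` gives the same result, but every new early return then has to remember the reset.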
🧹 Nitpick comments (5)
plugins/in_splunk/splunk.h (1)
76-79: Consider using `const char *` instead of `flb_sds_t` for borrowed pointers.
`current_remote_addr` is assigned non-owned pointers from either the XFF header value or `flb_connection_get_remote_address()` in `splunk_prot.c`. Using `flb_sds_t` is misleading since it implies an owned/allocated string that should be managed with `flb_sds_*` functions.
For clarity and to prevent accidental misuse:
♻️ Suggested change
     /* Remote address */
-    flb_sds_t current_remote_addr;
+    const char *current_remote_addr;
     size_t current_remote_addr_len;
plugins/in_splunk/splunk_prot.c (4)
265-290: Const-correctness issue in output parameter.
The function assigns `const char *` values (from `extract_xff_value` and `flb_connection_get_remote_address`) to `*out`, but `out` is declared as `char **`. This discards the `const` qualifier and may cause compiler warnings.
♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)
 {
Also update the corresponding field type in `splunk.h` and call sites in `splunk_prot_handle()` and `splunk_prot_handle_ng()`.
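The const-correctness point can be shown with a small standalone sketch. The names here are hypothetical: `demo_peer_address` stands in for any lookup that returns a borrowed, read-only string, and `demo_get_address` mirrors only the shape of the output parameter, not the plugin's function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup returning a borrowed, read-only string. */
static const char *demo_peer_address(void)
{
    return "192.0.2.1";
}

/* The output parameter is const char **, so assigning a const char *
 * to *out keeps the qualifier and needs no cast. */
static int demo_get_address(const char **out, size_t *out_len)
{
    const char *value = demo_peer_address();

    if (value == NULL) {
        return -1;
    }

    *out = value;
    *out_len = strlen(value);
    return 0;
}
```

A caller then declares its local as `const char *hval = NULL;` and passes `&hval`, which also lets the compiler reject any attempt to write through the pointer.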
424-428: Unused parameters in function signature.
The `remote_addr` and `remote_addr_len` parameters are added to the signature but never used. The function uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly at lines 478-480.
Either use the passed parameters or remove them from the signature to avoid confusion:
♻️ Option 1: Remove unused parameters
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)
♻️ Option 2: Use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }
775-780: Unused parameters in function signature.
Similar to `process_flb_log_append()`, the `remote_addr` and `remote_addr_len` parameters are not used within this function. The downstream `process_raw_payload_pack()` accesses `ctx->current_remote_addr` directly.
Consider removing these unused parameters for consistency:
♻️ Suggested change
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)
1115-1118: Missing cleanup on early return paths.
The per-request remote address is cleared at the end of successful processing, but multiple early return paths (lines 861, 928, 974, 1040, 1066, 1088, 1104) skip this cleanup. While the state is re-initialized at the start of each request (lines 1003-1004), for defensive coding it would be cleaner to use a `goto cleanup` pattern to ensure consistent cleanup.
Alternatively, since the state is always re-initialized at the start of `splunk_prot_handle()`, this might be acceptable as-is. Just ensure this initialization always happens before any potential use.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)
424-481: Unused parameters: `remote_addr` and `remote_addr_len` are never referenced.
The function signature was updated to accept `remote_addr` and `remote_addr_len`, but the implementation uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly (lines 479-480). Either use the parameters or remove them from the signature.
♻️ Option 1: Remove unused parameters (simpler)
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)
 {
And update call sites accordingly.
♻️ Option 2: Use the parameters instead of context fields
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }
775-814: Unused parameters: `remote_addr` and `remote_addr_len` are never referenced.
Similar to `process_flb_log_append`, these parameters are added to the signature but never used. The underlying `process_raw_payload_pack` reads from `ctx->current_remote_addr` directly.
♻️ Suggested fix - remove unused parameters
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)
 {
Update the call site at line 1027 accordingly.
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1455-1459: In splunk_prot_handle_ng() the cleanup lines refer to
an undefined variable ctx; replace those uses with the correct function-local
variable name context (i.e., set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0) so the per-request remote address is
cleared on the correct struct instance.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
265-290: Const-correctness issue: discarding `const` qualifier.
The function accepts `char **out` but assigns `const char *` values to it (from `extract_xff_value` and `flb_connection_get_remote_address`). This silently discards the `const` qualifier. Consider changing the output parameter type to preserve const-correctness.
♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)
 {
     const char *value = NULL;
     size_t len = 0;
     extract_xff_value(xff_value, xff_value_len, &value, &len);
     if (value == NULL && connection != NULL) {
         value = flb_connection_get_remote_address(connection);
         if (value != NULL) {
             len = strlen(value);
         }
     }
     if (value == NULL || len == 0) {
         return -1;
     }
-    *out = value;
+    *out = value;
     *out_len = len;
     return 0;
 }
Also update the callers (`splunk_prot_handle` and `splunk_prot_handle_ng`) to declare `hval` as `const char *`:
-    char *hval = NULL;
+    const char *hval = NULL;
And update the context fields if they aren't already `const char *`.
Force-pushed from 7435927 to ae2ad69
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: Add a NULL-check for request->stream->parent before
dereferencing parent_session->connection: after assigning parent_session =
(struct flb_http_server_session *) request->stream->parent, verify
parent_session != NULL and return an error (e.g., -1) or perform appropriate
error handling if it is NULL; then continue with the existing logic that uses
parent_session->connection (used by extract_remote_address and
flb_connection_get_remote_address) to avoid a crash if the parent session is
missing.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
424-481: Use the passed `remote_addr` parameters to avoid shared mutable state.
Right now `process_flb_log_append()` ignores its new parameters and re-reads `ctx->current_remote_addr`. Using the parameters makes the function’s contract explicit and reduces reliance on shared state.
♻️ Suggested change
-    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
-        ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
-    }
+    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
+        ret = append_remote_addr(ctx, remote_addr, remote_addr_len);
+    }
Also applies to: 775-780
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)
424-486: Unused parameters: `remote_addr` and `remote_addr_len` are passed but ignored.
The function signature accepts `remote_addr` and `remote_addr_len` parameters (lines 427-428), but the implementation at lines 482-485 uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly instead. This inconsistency makes the API misleading.
🐛 Proposed fix: use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }
Alternatively, if the intent is to always use the context's current address, remove the unused parameters from the function signature.
780-819: Unused parameters in `process_hec_raw_payload`.
The function signature was extended to include `remote_addr` and `remote_addr_len` (lines 784-785), but these parameters are never used in the function body. The call to `process_raw_payload_pack` at line 816 doesn't pass them, and `process_raw_payload_pack` reads from `ctx->current_remote_addr` directly.
Either remove the unused parameters from the signature, or if they were intended for future use, add a comment explaining this.
♻️ Proposed fix: remove unused parameters
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)
Then update the call site at line 1032 accordingly.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
265-290: Const-correctness issue: output parameter should be `const char **`.
The function assigns a `const char *` (from `extract_xff_value` and `flb_connection_get_remote_address`) to `*out`, but the parameter is declared as `char **`. This casts away const, which could lead to undefined behavior if callers attempt to modify the returned string.
♻️ Proposed fix
-static int extract_remote_address(const char *xff_value,
-                                  size_t xff_value_len,
-                                  struct flb_connection *connection,
-                                  char **out,
-                                  size_t *out_len)
+static int extract_remote_address(const char *xff_value,
+                                  size_t xff_value_len,
+                                  struct flb_connection *connection,
+                                  const char **out,
+                                  size_t *out_len)
This will require updating the callers to use `const char *` for the corresponding local variables (`hval` in `splunk_prot_handle` and `splunk_prot_handle_ng`).
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 454-456: The call to
flb_log_event_encoder_set_body_from_msgpack_object is passing the wrong argument
(ctx) — replace the first parameter with the log encoder instance by passing
&ctx->log_encoder instead of ctx so the function receives a pointer to the
flb_log_event_encoder (change the call in the else branch where
flb_log_event_encoder_set_body_from_msgpack_object(ctx, record) is used).
🧹 Nitpick comments (3)
plugins/in_splunk/splunk_prot.c (3)
265-290: Consider improving const-correctness.
The function assigns `const char *value` to `char **out`, discarding the const qualifier. Since the returned pointer refers to either header data or the connection's address string (both effectively read-only), the output parameter should be `const char **out` to preserve type safety.
♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)
This would require updating the callers to use `const char *` for the address variables as well.
424-428: Unused parameters: `remote_addr` and `remote_addr_len` are never used.
The function signature was updated to accept `remote_addr` and `remote_addr_len` parameters, but the body uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` instead. Either use the passed parameters or remove them from the signature to avoid confusion.
♻️ Option 1: Use the parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }
♻️ Option 2: Remove unused parameters
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)
And update all call sites to remove the extra arguments.
Also applies to: 482-486
783-785: Unused parameters: `remote_addr` and `remote_addr_len` are never used.
Similar to `process_flb_log_append`, these parameters are added to the signature but never used in the function body. The called function `process_raw_payload_pack` uses `ctx->current_remote_addr` directly. Consider removing these parameters for consistency.
@cosmo0920 thanks, would you please reduce or squash the commits?
Force-pushed from c012c36 to 0b419ab
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/in_splunk/splunk_prot.c (1)
781-820: 🛠️ Refactor suggestion | 🟠 Major
Unused parameters `remote_addr` and `remote_addr_len` in `process_hec_raw_payload`.
These parameters are accepted but never referenced. The downstream `process_raw_payload_pack` reads from `ctx->current_remote_addr` directly. Either use the parameters or drop them from the signature to avoid confusion.
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1387-1407: In splunk_prot_handle_ng ensure
context->current_remote_addr and context->current_remote_addr_len are always
cleaned up before any early return: either move the existing cleanup into a
single exit/cleanup block and jump to it (e.g., add a cleanup label and use goto
for the early-return paths) or insert the same cleanup call(s) immediately
before each early return (the spots around the current early returns that
reference request handling); specifically reference and clear
context->current_remote_addr and reset context->current_remote_addr_len so no
leaked pointer remains.
- Around line 1008-1025: The per-request remote address
(ctx->current_remote_addr and ctx->current_remote_addr_len) is set by
extract_remote_address or flb_connection_get_remote_address but not cleared on
multiple early-return paths, causing state leakage across keep-alive requests;
modify the function to centralize cleanup (e.g., introduce a cleanup label and
jump to it on every early return) or explicitly reset ctx->current_remote_addr =
NULL and ctx->current_remote_addr_len = 0 before each return so that the state
is always cleared; update the code paths that follow the
extract_remote_address/flb_connection_get_remote_address logic to use the
centralized cleanup (or inline clears) and ensure existing final cleanup at the
original end of the function covers all exit paths.
- Around line 1008-1025: The code stores per-request state in struct flb_splunk
fields current_remote_addr/current_remote_addr_len; instead, resolve the remote
address into local variables and pass them as explicit parameters to downstream
functions (e.g., change calls that currently rely on ctx->current_remote_addr to
accept (remote_addr, remote_addr_len)), update extract_remote_address usage to
populate local vars, and modify process_hec_raw_payload and any other functions
invoked from the two handlers to accept the new parameters and use them instead
of ctx fields; remove setting/clearing of ctx->current_remote_addr and
ctx->current_remote_addr_len and ensure all call sites are updated to pass the
address/length through the call stack.
- Around line 424-428: The function process_flb_log_append currently declares
parameters remote_addr and remote_addr_len but reads ctx->current_remote_addr
and ctx->current_remote_addr_len instead; update the function body (locations
where ctx->current_remote_addr / ctx->current_remote_addr_len are used, e.g.,
around the existing lines ~484–486) to use the passed-in remote_addr and
remote_addr_len parameters instead so the signature is actually used, and remove
any compiler-warning workarounds if present; ensure callers already pass the
correct values (no caller changes needed if they do).
- Around line 162-227: http2 header values from nghttp2 are transient and are
being stored as dangling pointers; update the HTTP/2 handling so header values
are duplicated before storing: in the code path where http2 headers are added
(see http2_header_callback and calls to flb_http_request_set_header) allocate
and pass an owned copy of value (e.g. use cfl_sds_create_len or equivalent
strdup-like allocator used for "host"/"content-type") instead of the raw nghttp2
buffer; alternatively modify flb_http_request_set_header to always duplicate the
supplied value when request type is HTTP/2 (so flb_http_request_get_header
returns a stable pointer later). Ensure the duplicated buffer length is used
(not strlen on temporary buffer) and free the copy when the request is torn
down.
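The last item above asks for header values to be copied into owned storage, using the supplied length, before they are kept past the callback. A minimal sketch of that idea follows. It uses plain `malloc` instead of the SDS allocators the comment mentions, and `struct demo_header`, `demo_header_set` and `demo_header_destroy` are hypothetical names that do not exist in Fluent Bit. Whether the nghttp2 buffers are actually transient on this code path is the reviewer's claim, not something this sketch verifies.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stored header: owns a NUL-terminated copy of the value. */
struct demo_header {
    char *value;
    size_t value_len;
};

/* Copy exactly len bytes from a possibly transient buffer. The explicit
 * length is used, so the source does not need to be NUL-terminated. */
static int demo_header_set(struct demo_header *h,
                           const char *value, size_t len)
{
    char *copy;

    if (value == NULL) {
        return -1;
    }

    copy = malloc(len + 1);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, value, len);
    copy[len] = '\0';

    free(h->value);             /* release any previous value */
    h->value = copy;
    h->value_len = len;
    return 0;
}

/* Release the owned copy when the request is torn down. */
static void demo_header_destroy(struct demo_header *h)
{
    free(h->value);
    h->value = NULL;
    h->value_len = 0;
}
```

After `demo_header_set` returns, the caller's buffer can be reused or freed without affecting the stored value. The cost is one allocation per header.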
🧹 Nitpick comments (2)
plugins/in_splunk/splunk_prot.c (1)
265-290: Const-correctness: `extract_remote_address` assigns `const char *` to `char **out`.
`value` is declared `const char *` (line 271) and `flb_connection_get_remote_address` also returns `const char *` (line 277), but the output parameter `out` is `char **`. This silently drops the `const` qualifier, potentially allowing callers to modify read-only memory. Consider changing `out` to `const char **`.
tests/runtime/in_splunk.c (1)
30-30: Relative include path for plugin internal header.
`#include "../../plugins/in_splunk/splunk_prot.h"` is fragile and will break if the test file moves. This is a minor concern but consistent with how other plugin tests in this repo reference internal headers.
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Force-pushed from 0b419ab to f014367
Currently, in_splunk does not record the remote address of incoming requests.
This makes it hard to trace which client a given event came from.
Documentation
in_splunk: Add remote_addr related parameters' descriptions fluent-bit-docs#2360
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.