Fix rust helper double logging as error broken connections #3792
Merged
cataphract merged 2 commits into master, Apr 16, 2026
estringana approved these changes on Apr 15, 2026
🎉 All green! ❄️ No new flaky tests detected. 🔗 Commit SHA: 3124fa7
ce915da to 6fc6fd2
The special ForcefulDisconnect error was only being used in read, not
write paths, resulting in forceful disconnects (broken pipe, connection
reset) being logged as errors and sent via telemetry.
A second problem was that these errors were being logged twice.
Fix both problems. Tested with:
```
--- a/appsec/helper-rust/src/client.rs
+++ b/appsec/helper-rust/src/client.rs
@@ -912,11 +912,11 @@ async fn send_command_resp(
cmd: protocol::CommandResponse<'_>,
) -> anyhow::Result<()> {
debug!("Sending command: {:?}", cmd);
- match framed.send(cmd).await {
- Ok(_) => Ok(()),
- Err(err) => {
- error!("Error sending command: {}", err);
- Err(err)?
+ // Allows integration tests to kill the PHP process while the helper is mid-response,
+ // reproducing the broken-pipe scenario without relying on precise timing.
+ if let Ok(delay_str) = std::env::var("_DD_APPSEC_HELPER_SEND_RESP_DELAY_MS") {
+ if let Ok(ms) = delay_str.parse::<u64>() {
+ tokio::time::sleep(tokio::time::Duration::from_millis(ms)).await;
}
}
framed.send(cmd).await.map_err(|err| {
```
and:
```
package com.datadog.appsec.php.integration

import com.datadog.appsec.php.TelemetryHelpers
import com.datadog.appsec.php.docker.AppSecContainer
import groovy.util.logging.Slf4j
import org.junit.jupiter.api.Tag
import org.junit.jupiter.api.Test
import org.junit.jupiter.api.condition.DisabledIf
import org.testcontainers.junit.jupiter.Container
import org.testcontainers.junit.jupiter.Testcontainers

import java.net.http.HttpRequest
import java.net.http.HttpResponse

import static com.datadog.appsec.php.integration.TestParams.getPhpVersion
import static com.datadog.appsec.php.integration.TestParams.getVariant

@Testcontainers
@Slf4j
@Tag("musl")
@DisabledIf('isDisabled')
class BrokenPipeTests {
    // Only meaningful with the Rust helper, which is the default on PHP >= 8.5
    static boolean disabled = !TestParams.phpVersionAtLeast('8.5') || variant.contains('zts')

    @Container
    public static final AppSecContainer CONTAINER =
            new AppSecContainer(
                    workVolume: this.name,
                    baseTag: 'nginx-fpm-php',
                    phpVersion: phpVersion,
                    phpVariant: variant,
                    www: 'base',
            ) {
                @Override
                void configure() {
                    super.configure()
                    // The delay gives the test time to kill the PHP process between the helper
                    // receiving the command and writing the response back, reproducing the
                    // broken-pipe scenario from the wild crash report.
                    withEnv('_DD_APPSEC_HELPER_SEND_RESP_DELAY_MS', '3000')
                }
            }

    @Test
    void 'broken pipe logged when PHP process dies before helper writes response'() {
        // Fire a request that triggers WAF evaluation (Arachni is in the default rules)
        HttpRequest req = CONTAINER.buildReq('/hello.php')
                .header('User-Agent', 'Arachni/v1').GET().build()
        def responseFuture = CONTAINER.httpClient.sendAsync(req, HttpResponse.BodyHandlers.ofString())

        // Give PHP enough time to receive the HTTP request and send the request_init
        // command to the helper before we kill the process.
        Thread.sleep(500)

        // Kill the PHP-FPM worker. FPM master will respawn it. The helper is now sleeping
        // for _DD_APPSEC_HELPER_SEND_RESP_DELAY_MS ms before it tries to write the response,
        // at which point it will get EPIPE.
        def killResult = CONTAINER.execInContainer(
                'bash', '-c',
                'pkill -KILL -f "php-fpm: pool www" 2>/dev/null; pkill -KILL -f "php-fpm: pool" 2>/dev/null; true')
        log.info('Kill result: exit={} stderr={}', killResult.exitCode, killResult.stderr)

        // Wait for the helper's delay to expire and the error to be written to the log.
        // The delay is 3000 ms; 4000 ms leaves a comfortable margin.
        Thread.sleep(4000)
        responseFuture.cancel(true)
        CONTAINER.clearTraces()

        def logResult = CONTAINER.execInContainer('cat', '/tmp/logs/helper.log')
        def brokenPipeWarnLine = logResult.stdout.readLines().find { line ->
            line.contains('[WARN]') &&
                    (line.toLowerCase().contains('broken pipe') || line.toLowerCase().contains('connectivity issue'))
        }
        assert brokenPipeWarnLine != null :
                "Expected a [WARN] line about broken pipe / connectivity issue in helper.log but found none.\n" +
                        "helper.log contents:\n${logResult.stdout}"

        // Verify the broken pipe is NOT reported as a telemetry error. Since it is classified
        // as a connectivity issue (ForcefulDisconnect) and logged at WARN, TelemetryAwareLogger
        // should not forward it. Error telemetry is sent eagerly, so 5s is sufficient.
        def telemetryLogs = TelemetryHelpers.waitForLogs(CONTAINER, 5) { List<TelemetryHelpers.Logs> messages ->
            false // collect for full duration
        }.collectMany { it.logs }
        assert !telemetryLogs.any { log ->
            log.message.toLowerCase().contains('broken pipe') ||
                    log.message.toLowerCase().contains('epipe') ||
                    log.message.toLowerCase().contains('os error 32')
        } : 'Broken pipe should not generate any telemetry log'
    }
}
```
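The test hook in the diff above reads `_DD_APPSEC_HELPER_SEND_RESP_DELAY_MS` and sleeps only when the value parses as a `u64`. As a minimal sketch of that parsing step in isolation (the `parse_send_delay` function is hypothetical and not part of the helper; the async sleep is left out so the example needs only the standard library):

```rust
use std::time::Duration;

/// Hypothetical helper: converts the raw value of the delay variable into a Duration.
/// Returns None when the variable is unset or is not a valid u64, meaning "no delay".
fn parse_send_delay(raw: Option<&str>) -> Option<Duration> {
    raw.and_then(|s| s.parse::<u64>().ok())
        .map(Duration::from_millis)
}

fn main() {
    // Same variable name as in the diff; everything else here is illustrative.
    let raw = std::env::var("_DD_APPSEC_HELPER_SEND_RESP_DELAY_MS").ok();
    match parse_send_delay(raw.as_deref()) {
        Some(delay) => println!("would sleep for {:?} before sending the response", delay),
        None => println!("no delay configured"),
    }
}
```

Treating an unset or malformed value as "no delay" mirrors the nested `if let Ok(...)` checks in the diff, so a typo in the variable cannot stall the helper.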
…PIPE

Rename is_incomplete_stream_error to is_forceful_disconnect_error and extend it to also treat ECONNRESET and EPIPE as forceful disconnects.

On Linux, ECONNRESET is delivered to the peer when a Unix socket is closed while its receive buffer is non-empty (unix_release_sock), indicating the client crashed or was killed after we sent our response. This is a connectivity issue, not a protocol error.

Also remove redundant `git config --global --add safe.directory '*'` calls from integration Docker build tasks.
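The classification described in this commit message could look roughly like the sketch below. Only the name `is_forceful_disconnect_error` comes from the PR; its real signature is not shown here, so taking a `std::io::Error` is an assumption, as are the `Severity` enum, the `severity_for` function, and the inclusion of `UnexpectedEof` (guessed from the old name `is_incomplete_stream_error`).

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical log level used by this sketch only.
#[derive(Debug, PartialEq)]
enum Severity {
    Warn,
    Error,
}

/// Assumed signature: the helper's real function may inspect a different error type.
/// EPIPE maps to ErrorKind::BrokenPipe and ECONNRESET to ErrorKind::ConnectionReset.
fn is_forceful_disconnect_error(err: &Error) -> bool {
    matches!(
        err.kind(),
        ErrorKind::UnexpectedEof | ErrorKind::BrokenPipe | ErrorKind::ConnectionReset
    )
}

/// Decides the level in one place, so each failure is logged exactly once
/// by whoever calls this, instead of once per layer.
fn severity_for(err: &Error) -> Severity {
    if is_forceful_disconnect_error(err) {
        Severity::Warn
    } else {
        Severity::Error
    }
}

fn main() {
    let errors = [
        Error::from(ErrorKind::BrokenPipe),
        Error::from(ErrorKind::ConnectionReset),
        Error::from(ErrorKind::InvalidData),
    ];
    for err in &errors {
        println!("{:?} -> {:?}", err.kind(), severity_for(err));
    }
}
```

Applying the same check on both read and write paths is what keeps a killed client from surfacing as an error in telemetry, which is the behavior the integration test asserts.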
6fc6fd2 to 3124fa7
Benchmarks [ tracer ]
Benchmark execution time: 2026-04-15 16:38:03
Comparing candidate commit 3124fa7 in PR branch.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 193 metrics, 1 unstable metric.