[BUG]: AppSec In App WAF blocking causes uncaught exceptions with Next.js when headers already sent #5452

### Tracer Version(s)

5.42.0

### Node.js Version(s)

20.12.1

### Bug Report

## Environment

### Versions

- **Tracer Version**: 5.42.0
- **Node.js Version**: 20.12.1
- **Next.js Version**: 14.2.7
- **Platform**: AWS Fargate (Linux/ARM64)

### Resource Allocation

- **CPU**: 1024 units (1 vCPU)
- **Memory**: 3072 MB
- **Datadog Agent**: 256 CPU units

### Features Enabled

```
DD_APPSEC_ENABLED=true      # AppSec In App WAF
DD_IAST_ENABLED=true        # Interactive Application Security Testing
DD_PROFILING_ENABLED=true   # Continuous Profiler
DD_LOGS_INJECTION=true      # Log Correlation
DD_RUNTIME_METRICS=true     # Runtime Metrics Collection
DD_DBM_PROPAGATION=full     # Database Monitoring
```

## Bug Report

We've identified critical issues in the AppSec In App WAF blocking functionality that affect high-traffic Next.js applications:

1. **Race Condition and Crashes**: When the WAF attempts to block a request after Next.js has already sent headers (common with non-existent routes like `/admin.php`), the blocking call throws an exception that nothing catches (`Headers have already been sent`). Our error handling calls `process.exit` on unhandled exceptions, so each occurrence crashes the application.

2. **Status Code Discrepancy**: When the WAF successfully blocks a request with HTTP 403, the Datadog trace records it as HTTP 404. What clients receive and what our monitoring shows therefore disagree.

3. **Performance Impact**: Our workarounds have introduced high CPU usage in the "DD AppSec In App WAF Context" span, creating a performance bottleneck.

## Patch Evolution

We've tried two different patch approaches:

### First Approach (Effective but Performance-Heavy)

```javascript
try {
  // 1. Check headers first
  if (res.headersSent) {
    log.warn('[ASM] Cannot send blocking response when headers have already been sent')
    return false
  }

  // 2. Get blocking data and send response
  const { body, headers, statusCode } = getBlockingData(req, null, actionParameters)
  for (const headerName of res.getHeaderNames()) {
    res.removeHeader(headerName)
  }
  res.writeHead(statusCode, headers)
  res.constructor.prototype.end.call(res, body)

  // 3. Mark as blocked and cleanup
  responseBlockedSet.add(res)
  rootSpan.setTag('appsec.blocked', 'true')
  abortController?.abort()

  return true
} catch (err) {
  rootSpan?.setTag('_dd.appsec.block.failed', 1)
  log.error('[ASM] Blocking error', err)
  return false
}
```

### Current Minimal Approach

```javascript
// Current minimal patch - prevents crashes only
if (!res || res.headersSent || res.finished) {
  log.warn('[ASM] Cannot send blocking response when headers have already been sent')
  return false
}
```

**Key Differences**:

- The first approach successfully blocked requests but added performance overhead
- The current approach prevents crashes but fails to block when the race condition occurs
- Both approaches show similar CPU usage patterns
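
For discussion, the two patches above can be combined into one standalone helper that never throws and reports whether the block happened. This is a sketch under our own assumptions, not the dd-trace implementation: `tryBlock` and its return shape are hypothetical, it uses `res.writableEnded` in place of the deprecated `res.finished`, and it calls `res.end` directly rather than going through the prototype as our first patch did.

```javascript
const http = require('node:http')
const net = require('node:net')

// Hypothetical helper, not part of dd-trace: try to send a blocking response
// and report failure instead of throwing when the response has already started.
function tryBlock (res, { statusCode = 403, headers = {}, body = '' } = {}) {
  if (!res) {
    return { blocked: false, reason: 'response-unavailable' }
  }
  if (res.headersSent || res.writableEnded) {
    return { blocked: false, reason: 'response-already-started' }
  }
  try {
    // Drop headers set by earlier handlers so they do not leak into the block page.
    for (const name of res.getHeaderNames()) res.removeHeader(name)
    res.writeHead(statusCode, headers)
    res.end(body)
    return { blocked: true, reason: null }
  } catch (err) {
    // Defensive: any other failure while writing is reported, not thrown.
    return { blocked: false, reason: err.code || err.message }
  }
}

// Usage: a response whose headers were already sent is left untouched.
const started = new http.ServerResponse(new http.IncomingMessage(new net.Socket()))
started.writeHead(404)
console.log(tryBlock(started).blocked) // false
```

A caller could use the returned `reason` to tag the span when a block was skipped, instead of only logging a warning.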

## Root Cause Investigation

We're investigating deeper issues with the middleware/instrumentation timing:

1. **Next.js Middleware Timing**: Headers appear to be sent before AppSec evaluation completes
2. **Trace Status Capture**: Status codes are recorded before AppSec modifications
3. **Monkey Patching**: Potential issues with how response methods are patched
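
To make item 2 concrete: if the status is read when the request starts being handled, a later change to 403 is missed, whereas reading it on the response's `finish` event sees the value that was actually sent. The sketch below only illustrates that ordering with stand-in objects. `tagFinalStatus` is a hypothetical helper, and whether a tag set this late still reaches the reported trace depends on when the tracer finishes the span, which we have not verified.

```javascript
const { EventEmitter } = require('node:events')

// Hypothetical helper: copy the status code that was actually sent onto a span
// once the response has been fully written. `span` only needs setTag(key, value).
function tagFinalStatus (span, res) {
  res.once('finish', () => {
    span.setTag('http.status_code', res.statusCode)
  })
}

// Stand-ins for a real span and response, to show the ordering.
const tags = {}
const fakeSpan = { setTag: (key, value) => { tags[key] = value } }
const fakeRes = new EventEmitter()
fakeRes.statusCode = 404 // what the router set first

tagFinalStatus(fakeSpan, fakeRes)
fakeRes.statusCode = 403 // what the blocking response changed it to
fakeRes.emit('finish')

console.log(tags['http.status_code']) // 403
```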

### Performance Metrics

Load testing revealed consistent performance impact:

| CPU Usage | Mean Response Time | p95 Response Time |
| --------- | ------------------ | ----------------- |
| <50%      | 77-84ms            | ~500ms            |
| 50-70%    | 250-450ms          | ~1600ms           |
| >70%      | 600-900ms          | >2300ms           |

Performance zones identified:

- **Optimal**: Up to 50-60 req/sec (Response time < 500ms at p95)
- **Degraded**: 80-100 req/sec (Response time < 1600ms at p95)
- **Critical**: >110 req/sec (Unpredictable performance)

## Questions

1. **Response Handling**:

   - How should AppSec In App WAF handle requests where headers are already sent?
   - What is the correct point in the request lifecycle to perform WAF evaluation?

2. **Status Code Capture**:

   - How can we ensure trace spans capture the final response status? Is the current mismatch a bug or a configuration issue?
   - Is there a way to update span data after AppSec modifies the response?

3. **Performance & Timing**:
   - Are there recommended approaches for response tracking that minimize overhead?
   - How should AppSec integrate with Next.js routing to avoid race conditions?

## Current Investigation

We're examining several areas that may contribute to the timing issues:

1. **DD-Trace Instrumentation**:

   - Response hooks
   - HTTP instrumentation
   - Next.js specific code

2. **Request Flow Analysis**:

   ```
   Client → HTTP Server → Next.js Router → [Middleware] → AppSec WAF → Response
                    ↑
   Headers may be sent here, before WAF evaluation
   ```

3. **Status Code Capture Timing**:
   ```mermaid
   sequenceDiagram
       Client->>DD-Trace: Request
       DD-Trace->>AppSec: Process Request
       AppSec-->>Client: Return 403 (Actual Response)
       DD-Trace-->>Datadog: Report 404 (Incorrect Trace)
       Note over DD-Trace,Datadog: Status Code Mismatch
   ```

We believe the core issue may be related to the timing and order of HTTP method instrumentation rather than just the AppSec module itself.
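
One way we could confirm where headers are first sent is to wrap `writeHead` on the response and capture a stack trace on the first call. The probe below is a diagnostic sketch of our own, not tracer code. `traceFirstWriteHead` is a hypothetical name, and it is demonstrated on a standalone `http.ServerResponse` rather than inside Next.js.

```javascript
const http = require('node:http')
const net = require('node:net')

// Hypothetical diagnostic: report the first writeHead call on a response,
// including a stack trace that shows which layer triggered it.
function traceFirstWriteHead (res, onFirstCall) {
  const original = res.writeHead
  let seen = false
  res.writeHead = function (statusCode, ...rest) {
    if (!seen) {
      seen = true
      onFirstCall({ statusCode, stack: new Error('writeHead called').stack })
    }
    return original.call(this, statusCode, ...rest)
  }
  return res
}

const events = []
const res = traceFirstWriteHead(
  new http.ServerResponse(new http.IncomingMessage(new net.Socket())),
  (event) => events.push(event)
)
res.writeHead(404)
console.log(events[0].statusCode) // 404
```

Comparing the captured stack with the point where the WAF result arrives should show whether the router or the blocking call reaches `writeHead` first.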


### Reproduction Code

_No response_

### Error Logs

_No response_

### Tracer Config

_No response_

### Operating System

AWS Fargate (Linux/ARM64)

### Bundling

Next.js
