Incorrect encoding handling in Content-Disposition header

### Version

v22.15.0

### Platform

```text
Linux horsleyli-1q8kg06lua 5.4.119-19.0009.44 #1 SMP Tue May 7 20:08:55 CST 2024 x86_64 x86_64 x86_64 GNU/Linux
```

### Subsystem

http

### What steps will reproduce the bug?

```js
const net = require('net');
const http = require('http');

const contentLengthComesFirst = process.argv.includes('--content-length-comes-first');

const backend = net.createServer((socket) => {
  socket.once('data', (data) => {
    // Build raw HTTP response with binary-safe headers
    const filename = '漏洞.txt';
    const response =
      'HTTP/1.1 200 OK\r\n' +
      (contentLengthComesFirst ? `Content-Length: 12\r\n`:'') +
      // Raw UTF-8 bytes for Chinese filename (without URL encoding)
      `Content-Disposition: attachment; filename="${Buffer.from(filename).toString('binary')}"; filename*=UTF-8''${encodeURIComponent(filename)}\r\n` +
      `Content-Type: application/octet-stream\r\n` +
      (!contentLengthComesFirst ? `Content-Length: 12\r\n`:'') +
      'Connection: close\r\n\r\n' +
      'file content';

    const responseBuffer = Buffer.from(response, 'binary');
    socket.end(responseBuffer);
  });
});

backend.listen(() => {
  const proxy = http.createServer((req, res) => {
    const options = {
      hostname: 'localhost',
      port: backend.address().port,
      method: req.method,
      headers: req.headers,
      path: '/backend'
    };
    const proxyReq = http.request(options, (proxyRes) => {
      res.statusCode = proxyRes.statusCode;
      res.statusMessage = proxyRes.statusMessage;
      for (const header in proxyRes.headers) {
        res.setHeader(header, proxyRes.headers[header]);
      }
      
      // Handle the 'data' event to ensure the response is sent correctly
      proxyRes.on('data', (chunk) => {
        res.write(chunk);
      });
      // Handle the 'end' event to finish the response
      proxyRes.on('end', () => {
        res.end();
      });
    });
    
    req.pipe(proxyReq);
  }).listen(() => {
    const client = net.connect(proxy.address().port, () => {
      client.write(`GET /proxy HTTP/1.1\r\nHost: localhost:${backend.address().port}\r\n\r\n`);
    });

    let responseData = Buffer.alloc(0);
    client.on('data', (chunk) => {
      responseData = Buffer.concat([responseData, chunk]);
    });
    client.on('end', () => {
      const startFlag = Buffer.from('filename="');
      const endFlag = Buffer.from('"');
      const startIndex = responseData.indexOf(startFlag) + startFlag.length;
      const endIndex = responseData.indexOf(endFlag, startIndex);
      const filenameBuffer = responseData.subarray(startIndex, endIndex);
      console.log('filename Buffer:', filenameBuffer.toString('hex'));
      console.log('filename utf8:', filenameBuffer.toString('utf8'));

      proxy.close(() => backend.close());
    });
  });
});

```

![Image](https://github.com/user-attachments/assets/39a2fb04-2f11-47d8-b5e8-4c1fd4dc9730)

When the Content-Length header appears after the Content-Disposition header in an HTTP response, Node.js encodes the header correctly. However, when Content-Length is placed **before** Content-Disposition, the encoder corrupts the UTF-8 characters in the header value.

### How often does it reproduce? Is there a required condition?

always

### What is the expected behavior? Why is that the expected behavior?

Expected Behavior:
Regardless of header order (Content-Length before or after Content-Disposition), the proxied response should preserve the raw byte values of the `filename` parameter.

Technical Justification:
1. RFC 7230 Section 3.2.2:
   - "The order in which header fields with differing field names are received is not significant"
   - Content-Length position should not affect header parsing

2. RFC 6266 Section 4.3:
   - When both `filename` and `filename*` are present, recipients should pick `filename*` and ignore `filename`
   - UTF-8 encoding must be properly handled

3. Binary Safety:
   - filename parameter's raw bytes (e6 bc 8f e6 b4 9e) should remain intact
   - No double-encoding (Latin-1 → UTF-8 conversion) should occur

4. Test Case Consistency:
   Both scenarios should produce identical outputs:
   - Hex dump of original UTF-8 bytes
   - Proper Chinese character decoding
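
For point 2 above, a recipient that prefers `filename*` could decode it roughly as in the sketch below. The helper name `decodeExtendedFilename` and the regular expression are hypothetical and only handle the simple `UTF-8''...` form used in the repro. This is not a full RFC 6266 parser.

```js
// Hypothetical helper (not part of Node.js): extract and decode filename*.
function decodeExtendedFilename(headerValue) {
  // Matches filename*=<charset>'<language>'<percent-encoded value>
  const match = /filename\*\s*=\s*([^']*)'([^']*)'([^;\s]+)/i.exec(headerValue);
  if (!match) return null;
  const charset = match[1];
  const encoded = match[3];
  // Only UTF-8 is handled in this sketch.
  if (charset.toLowerCase() !== 'utf-8') return null;
  return decodeURIComponent(encoded);
}

const header =
  `attachment; filename="fallback.txt"; filename*=UTF-8''${encodeURIComponent('漏洞.txt')}`;
console.log(decodeExtendedFilename(header));
```

Because the `filename*` value is percent-encoded ASCII, it survives the proxy regardless of how the raw `filename` bytes are handled.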

### What do you see instead?

The UTF-8 bytes `e6bc8fe6b49e` (漏洞) are emitted as `0f1e` when Content-Length is placed before Content-Disposition.

### Additional information

https://github.com/nodejs/node/blob/3f5899f60a3049229da195c3acde6443c3ba0a04/lib/_http_outgoing.js#L598-L607

The linked code in Node.js applies special handling to the encoding of the Content-Disposition header when the Content-Length header is present. It converts the Content-Disposition header value into a Latin-1 encoded Buffer only if Content-Length has already been processed (`self._contentLength` is truthy).

Since the caller processes headers sequentially, the outcome depends on the order in which the headers are defined:
+ If Content-Length is set before Content-Disposition, the condition `self._contentLength` is met, and the encoding logic applies.
+ If Content-Length is set after Content-Disposition, `self._contentLength` is not yet set while Content-Disposition is processed, so the encoding step is skipped.

	if (isContentDispositionField(key) && self._contentLength) {
	  // The value could be an array here
	  if (ArrayIsArray(value)) {
	    for (let i = 0; i < value.length; i++) {
	      value[i] = Buffer.from(value[i], 'latin1');
	    }
	  } else {
	    value = Buffer.from(value, 'latin1');
	  }
	}
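
The snippet below is a minimal sketch of one way the `0f1e` output could arise. It rests on two assumptions that this report does not confirm: that the incoming header value reaches JavaScript as a one-byte-per-character (latin1) string, and that the converted Buffer is later turned back into a string with the default UTF-8 decoding before being written out as latin1. All variable names are illustrative.

```js
// UTF-8 bytes of the filename as the backend puts them on the wire.
const wireBytes = Buffer.from('漏洞', 'utf8');

// Assumption: the parsed header value is a latin1 string, one char per byte.
const headerValue = wireBytes.toString('latin1');

// Path 1 (conversion skipped): the string is written back as latin1,
// which restores the original bytes.
const untouched = Buffer.from(headerValue, 'latin1');

// Path 2 (conversion applied): the value becomes a Buffer, is stringified
// with the default UTF-8 decoding, and is then written as latin1, which
// keeps only the low byte of each code point.
const converted = Buffer.from(headerValue, 'latin1');
const corrupted = Buffer.from(String(converted), 'latin1');

console.log('untouched:', untouched.toString('hex'));
console.log('corrupted:', corrupted.toString('hex'));
```

Under these assumptions the second path yields the low bytes of U+6F0F and U+6D1E, which matches the bytes reported above.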
