Description
The dechunk filter removes the first line if it starts with a hex character. This is incorrect and is used in attacks. I propose that it pass the data through literally if the first line does not contain a valid length.
The dechunk filter is meant to decode HTTP chunked transfer encoding. It is used in the fopen URL wrapper. If the response contains "Transfer-Encoding: Chunked", it is passed through the dechunk filter.
It reads a length prefix and then that many bytes. For well-formed input it works correctly:
var_dump(file_get_contents(
"php://filter/dechunk/resource=data:text/plain,5\r\nhello\r\n"
));
// string(5) "hello"
If passed something that does not look chunked, it passes the data through literally:
var_dump(file_get_contents(
"php://filter/dechunk/resource=data:text/plain,mango"
));
// string(5) "mango"
This makes a bit of sense. It is clearly not chunked encoding, so it is probably not encoded, and this is the best it can do.
However, if the string starts with a valid hex character (0-9, a-f, A-F), it removes the content:
var_dump(file_get_contents(
"php://filter/dechunk/resource=data:text/plain,apple"
));
// string(0) ""
This is a technical artifact of how the filter works. Besides being unexpected, it is also heavily misused in filter chain attacks: it creates an oracle that tells the attacker whether the string starts with a hex character.
I propose changing the filter so that it passes the data through literally if the first line is not a valid length:
var_dump(file_get_contents(
"php://filter/dechunk/resource=data:text/plain,apple"
));
// string(5) "apple"
This would make its behavior more consistent, and make it harder for attackers to abuse this filter.
I think this will make filter chain attacks slightly harder, but not impossible.
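To make the proposed rule concrete, here is a rough userland sketch of it. It is not the filter's C implementation and not an existing PHP function: the name dechunk_or_passthrough is hypothetical, and I am assuming that a "valid length" means one or more hex digits, optionally followed by a chunk extension, terminated by CRLF. The real filter may be more lenient about line endings.

```php
<?php
// Hypothetical helper illustrating the proposal; not part of PHP.
// Assumption: a valid length line is hex digits, an optional ";extension", then CRLF.
function dechunk_or_passthrough(string $data): string
{
    // Proposed rule: if the first line is not a valid length, return the input unchanged.
    if (!preg_match('/\A[0-9a-fA-F]+(?:;[^\r\n]*)?\r\n/', $data)) {
        return $data;
    }

    $out = '';
    $pos = 0;
    $len = strlen($data);

    while ($pos < $len) {
        // \G anchors the match at the current offset.
        if (!preg_match('/\G([0-9a-fA-F]+)(?:;[^\r\n]*)?\r\n/', $data, $m, 0, $pos)) {
            break; // malformed later chunk: stop decoding
        }
        $size = (int) hexdec($m[1]);
        $pos += strlen($m[0]);
        if ($size === 0) {
            break; // terminating zero-length chunk
        }
        // Copy the chunk body (shorter if the input is truncated).
        $out .= substr($data, $pos, $size);
        $pos += $size;
        // Skip the CRLF that follows the chunk body, if present.
        if (substr($data, $pos, 2) === "\r\n") {
            $pos += 2;
        }
    }

    return $out;
}

var_dump(dechunk_or_passthrough("5\r\nhello\r\n")); // string(5) "hello"
var_dump(dechunk_or_passthrough("mango"));          // string(5) "mango"
var_dump(dechunk_or_passthrough("apple"));          // string(5) "apple"
```

With this rule, "apple" and "mango" are treated the same way, so the output no longer reveals whether the first character is a hex digit.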
This change is unlikely to break anyone's program:
- dechunk is not documented;
- the happy flow of actually decoding chunked data keeps working the same;
- dechunk already returns the input unchanged for some inputs (not starting with hex characters).
Perhaps it should also raise a warning or notice when the encoding is incorrect, but that is not what this issue is about.
PHP Version
PHP 8.5.5 (cli) (built: Apr 7 2026 16:24:10) (NTS)
Copyright (c) The PHP Group
Built by Homebrew
Zend Engine v4.5.5, Copyright (c) Zend Technologies
with Zend OPcache v8.5.5, Copyright (c), by Zend Technologies
Operating System
No response