Skip to content
This repository has been archived by the owner on Jan 30, 2020. It is now read-only.

Content Type Header Parsing #211

Merged
merged 5 commits into from
Jun 7, 2018
Merged

Content Type Header Parsing #211

merged 5 commits into from
Jun 7, 2018

Conversation

rodnaph
Copy link
Contributor

@rodnaph rodnaph commented May 16, 2018

When parsing the Content-Type header there is an assumption that the type and key/value pairs can be split by semi-colon, but this is incorrect when a value contains a semi-colon (eg. name="foo; bar"). This format is covered by the RFC (https://tools.ietf.org/html/rfc2045#page-12, Must be in quoted-string to use within parameter values), and is seen in data in the wild. This patch adds support for this.

Currently an exception is thrown in strict mode because of an undefined offset (Undefined offset: 1). I've updated the code to use parse_str to better handle this data, which I think is an improvement over the regexp splitting, but still not ideal without a proper lexer/tokenizer obv.

I've also enabled strict mode for PHPUnit by default to expose these kinds of bugs in the test suite.

@rodnaph rodnaph mentioned this pull request May 16, 2018
@bendavies
Copy link

@weierophinney any chance to take a look? thanks!

@weierophinney
Copy link
Member

@rodnaph Please rebase off of current master, and then look at https://github.com/zendframework/zend-mail/blob/79f0f72fbaf606f4a2db322bb2fe2e381d2e9d53/src/Header/AddressListParser.php — this class provides a parser for address list items currently, and I think the approach it presents (a proper parser) is likely a better fit for the fix you're trying to implement.

(Also, run composer cs-check and/or composer cs-fix before pushing again, to get rid of CS issues.)

@rodnaph
Copy link
Contributor Author

rodnaph commented Jun 7, 2018

@weierophinney Thanks - the parser class works for this case if you can allow a minor customisation (though the name is now perhaps a little misleading if it's used in both locations).

* @return array
*/
public static function parse($value)
public static function parse($value, $delims = self::CHAR_DELIMS)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array type hint?

rodnaph and others added 2 commits June 7, 2018 11:19
The purpose of the class is to parse a string into a list of values,
splitting on a set of delimiters.
@weierophinney
Copy link
Member

Since the AddressListParser has not yet been released, I've taken the liberty of renaming it to ListParser, and pushing the changes back to your branch. With those in place, I think this is ready to go.

@weierophinney weierophinney merged commit c2b4f76 into zendframework:master Jun 7, 2018
weierophinney added a commit that referenced this pull request Jun 7, 2018
weierophinney added a commit that referenced this pull request Jun 7, 2018
@@ -42,6 +42,10 @@ All notable changes to this project will be documented in this file, in reverse

### Fixed

- [#211](https://github.com/zendframework/zend-mail/pull/211) fixes how the `ContentType` header class parses the value it receives. Previously,
it was incorrectly splitting the value on semi-colons that were inside quotes; in now correctly

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it now correctly ignores them.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants