This is in part a follow-up to #1118, but creating a new issue as that was specifically about serialization order. I'd like to suggest (1) that #1118 mat not be correct per RFC6265, and (2) that it may be better for werkzeug.http.parse_cookie to use MultiDict by default.
1: Relying on serialization order in the Cookie header is not consistent with RFC6265
In #1118, @dorfsmay noted that browsers may order cookies from from specific scope to more generic, and that using the first instance of a duplicate key rather than the last instance of a duplicate key would better align werkzeug with practice.
But there are two places in RFC6265 that seem to suggest that serialization order should not be relied upon:
Although cookies are serialized linearly in the Cookie header,
servers SHOULD NOT rely upon the serialization order. In particular,
if the Cookie header contains two cookies with the same name (e.g.,
that were set with different Path or Domain attributes), servers
SHOULD NOT rely upon the order in which these cookies appear in the
Later on, the RFC suggests that (contra #1118) cookies SHOULD (but not MUST) be sorted based on path length and creation time; section 5.4:
The user agent SHOULD sort the cookie-list in the following
Cookies with longer paths are listed before cookies with
Among cookies that have equal-length path fields, cookies with
earlier creation-times are listed before cookies with later
NOTE: Not all user agents sort the cookie-list in this order, but
this order reflects common practice when this document was
written, and, historically, there have been servers that
(erroneously) depended on this order.
Given the leeway permitted by the RFC, it seems odd that werkzeug have a strong opinion about cookie order. Which leads me to suggest that...
2: werkzeug.http.parse_cookie should use MultiDict instead of TypeConversionDict by default
Currently, parse_cookie uses TypeConversionDict to hold cookies parsed from the header. This struct supports only a single value per key, which forces werkzeug to choose how to interpret key order (as above).
A less opinionated approach would be to use MultiDict instead. This is a subclass of TypeConversionDict, so no loss of functionality, and supports multiple values per key. It also has the nice behavior of using a list internally to store values, so order can be preserved. Since MultiDict. __getitem__ returns the first value, if the values were added to the MultiDict in serialization order this also would preserve the current behavior for users who do not currently pass their own struct to parse_cookie.
Ultimately this would leave the question of how to interpret cookie serialization order up to the application.
FWIW, this bit me recently in a case involving integration with a legacy system; client browsers were sending duplicate keys in different orders, causing what appeared to be unpredictable behavior. Using a MultiDict worked exceptionally well.
The text was updated successfully, but these errors were encountered: