Skip to content

Proposal: parse_cookie should use MultiDict by default #1562

@erickpeirson

Description

@erickpeirson

This is in part a follow-up to #1118, but creating a new issue as that was specifically about serialization order. I'd like to suggest (1) that #1118 mat not be correct per RFC6265, and (2) that it may be better for werkzeug.http.parse_cookie to use
MultiDict by default.

1: Relying on serialization order in the Cookie header is not consistent with RFC6265

In #1118, @dorfsmay noted that browsers may order cookies from from specific scope to more generic, and that using the first instance of a duplicate key rather than the last instance of a duplicate key would better align werkzeug with practice.

But there are two places in RFC6265 that seem to suggest that serialization order should not be relied upon:

In section 4.2.2:

Although cookies are serialized linearly in the Cookie header,
servers SHOULD NOT rely upon the serialization order. In particular,
if the Cookie header contains two cookies with the same name (e.g.,
that were set with different Path or Domain attributes), servers
SHOULD NOT rely upon the order in which these cookies appear in the
header.

Later on, the RFC suggests that (contra #1118) cookies SHOULD (but not MUST) be sorted based on path length and creation time; section 5.4:

  1. The user agent SHOULD sort the cookie-list in the following
    order:

    • Cookies with longer paths are listed before cookies with
      shorter paths.

    • Among cookies that have equal-length path fields, cookies with
      earlier creation-times are listed before cookies with later
      creation-times.

    NOTE: Not all user agents sort the cookie-list in this order, but
    this order reflects common practice when this document was
    written, and, historically, there have been servers that
    (erroneously) depended on this order.

Given the leeway permitted by the RFC, it seems odd that werkzeug have a strong opinion about cookie order. Which leads me to suggest that...

2: werkzeug.http.parse_cookie should use MultiDict instead of TypeConversionDict by default

Currently, parse_cookie uses TypeConversionDict to hold cookies parsed from the header. This struct supports only a single value per key, which forces werkzeug to choose how to interpret key order (as above).

A less opinionated approach would be to use MultiDict instead. This is a subclass of TypeConversionDict, so no loss of functionality, and supports multiple values per key. It also has the nice behavior of using a list internally to store values, so order can be preserved. Since MultiDict. __getitem__ returns the first value, if the values were added to the MultiDict in serialization order this also would preserve the current behavior for users who do not currently pass their own struct to parse_cookie.

Ultimately this would leave the question of how to interpret cookie serialization order up to the application.

FWIW, this bit me recently in a case involving integration with a legacy system; client browsers were sending duplicate keys in different orders, causing what appeared to be unpredictable behavior. Using a MultiDict worked exceptionally well.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions