Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DATE_FORMAT_COOKIE definition #6783

Closed
wants to merge 2 commits into from
Closed

Fix DATE_FORMAT_COOKIE definition #6783

wants to merge 2 commits into from

Conversation

mdhia
Copy link
Contributor

@mdhia mdhia commented Mar 17, 2021

In both http://curl.haxx.se/rfc/cookie_spec.html , https://docs.microsoft.com/de-de/windows/win32/wininet/http-cookies?redirectedfrom=MSDN and https://tools.ietf.org/html/rfc7234#section-5.3 the cookie datetime is specified as Mon, DD-Mon-YYYY HH:MM:SS GMT. However, the current definition returns Monday, DD-Mon-YYY HH:MM:SS GMT. Therefore, the "l" in "l, d-M-Y H:i:s T" must be changed to "D".

@Girgias
Copy link
Member

Girgias commented Mar 17, 2021

Can you add a test for this?

@derickr
Copy link
Member

derickr commented Mar 17, 2021

I'm OK with that, but I would like to mention that when I wrote the code (back in 2004), the general definition was "l", and not "D" :-)

@mdhia
Copy link
Contributor Author

mdhia commented Mar 17, 2021

I thought so, but I am happy that you agree with the change :)

@mdhia
Copy link
Contributor Author

mdhia commented Apr 7, 2021

@derickr Is there anything missing?

@nikic
Copy link
Member

nikic commented Apr 26, 2021

@derickr Just to double check, this is okay to merge (for master)?

@mvorisek
Copy link
Contributor

mvorisek commented Jul 8, 2021

Is this to be merged into PHP 8.1?

@mdhia
Copy link
Contributor Author

mdhia commented Sep 23, 2021

@nikic @derickr Is there anything against merging the PR? Or is something still missing?

@nikic nikic closed this in ac34648 Sep 23, 2021
@GPHemsley
Copy link

GPHemsley commented Sep 29, 2021

The most recent Cookie spec is:
https://datatracker.ietf.org/doc/html/rfc6265

The date format used is specified here:
https://datatracker.ietf.org/doc/html/rfc2616#section-3.3.1

This was later obsoleted by:
https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.1.1

There are 3 date formats listed there, 2 of which are considered obsolete. It is not clear to me which of these is meant to be represented by this constant (which I expect would be duplicated by other constants), but I don't think this change matches any of them?

@nikic
Copy link
Member

nikic commented Sep 29, 2021

@GPHemsley True. This used to match one of the "obsolete" formats, and now it matches none of them, because it combines short weekday with - separators...

nikic added a commit that referenced this pull request Sep 29, 2021
This reverts commit ac34648.

As pointed out on GH-6783, the new format doesn't match any of
the specified formats. Previously the constant generated

    Thursday, 14-Jul-2005 22:30:41 BST

which is obsolete. Now it generates

    Thu, 14-Jul-2005 22:30:41 BST

which is not specified at all. The correct version would be:

    Thu, 14 Jul 2005 22:30:41 BST

Reverting the change for now.
@nikic
Copy link
Member

nikic commented Sep 29, 2021

I've reverted the change for now in 9d8f97d.

@GPHemsley
Copy link

It looks like there's a fun little history here...

The original RFC specifying cookies is RFC 2109, published February 1997. It does not define the "Expires" attribute directly, but makes reference to "Netscape's original proposal" having done so and reproduces the date format as Wdy, DD-Mon-YY HH:MM:SS GMT:
https://datatracker.ietf.org/doc/html/rfc2109#section-10.1.2

RFC 2109 does not indicate where to find "Netscape's original proposal", but the curl link above indicates that it "is the original spec once found on netscape.com" and the Wayback Machine has this snapshot from June 1997. Both versions of the document indicate having been inspired by "RFC 822, RFC 850, RFC 1036, and RFC 1123" to cite a date format of Wdy, DD-Mon-YYYY HH:MM:SS GMT.

RFC 822 (published August 1982) defines a date format that can be described as Wdy, DD Mon YY HH:MM:SS ZZZ:
https://datatracker.ietf.org/doc/html/rfc822#section-5

RFC 1123 (published October 1989) modifies that to use 4-digit years and thus Wdy, DD Mon YYYY HH:MM:SS ZZZ:
https://datatracker.ietf.org/doc/html/rfc1123#page-55

Meanwhile, RFC 850 (published June 1983) specifies that a date format "must be acceptable both to the ARPANET and to the getdate routine". It cites Weekday, DD-Mon-YY HH:MM:SS TIMEZONE as one such format but indicates that Wdy Mon DD HH:MM:SS YYYY ("ctime format") is unacceptable:
https://datatracker.ietf.org/doc/html/rfc850#section-2.1.4

RFC 1036 (published December 1987) updates that to say that a date format "must be acceptable both in RFC-822 and to the getdate(3) routine that is provided with the Usenet software" (like Wdy, DD Mon YY HH:MM:SS TIMEZONE) and that "ctime(3) format" (still Wdy Mon DD HH:MM:SS YYYY) "is not acceptable because it is not a valid RFC-822 date".
https://datatracker.ietf.org/doc/html/rfc1036#section-2.1.2

RFC 2965 (published October 2000) updates RFC 2109 and removes direct reference to the "Expires" attribute:
https://datatracker.ietf.org/doc/html/rfc2965#section-9.1

This is the state of things as of July 2005, when e6c1ff2 introduces the DATE_COOKIE constant as an alias for DATE_FORMAT_RFC1123 (D, d M Y H:i:s T).

In May 2006, fb92e33 changes DATE_FORMAT_RFC1123 to be D, d M Y H:i:s O and then 61fc424 changes DATE_COOKIE to use DATE_FORMAT_RFC850 (l, d-M-y H:i:s T).

In April 2011, RFC 6265 replaces RFC 2965 as the cookie spec and officially defines the "Expires" attribute for the first time. It does so, however, by directly referencing "<rfc1123-date, defined in [RFC2616], Section 3.3.1>":
https://datatracker.ietf.org/doc/html/rfc6265#section-4.1.1

RFC 2616 (published June 1999) describes three date formats: "RFC 822, updated by RFC 1123" (Sun, 06 Nov 1994 08:49:37 GMT), "RFC 850, obsoleted by RFC 1036" (Sunday, 06-Nov-94 08:49:37 GMT), and "ANSI C's asctime() format" (Sun Nov 6 08:49:37 1994). The rfc1123-date production mentioned by RFC 6265 represents that first format and is defined as wkday "," SP date1 SP time SP "GMT":
https://datatracker.ietf.org/doc/html/rfc2616#section-3.3.1

In November 2013, 23ab257 changes DATE_COOKIE to use DATE_FORMAT_COOKIE, which is newly-created but has a value (l, d-M-Y H:i:s T) identical to DATE_FORMAT_RFC850.

This would nominally be where the cookie history stops, but, in June 2014, RFC 7231 replaces RFC 2616 and redefines its preferred date format in terms of RFC 5322, describing it as simply "IMF-fixdate" (Sun, 06 Nov 1994 08:49:37 GMT). In doing so, it eliminates the production named rfc1123-date without explicitly mentioning that the IMF-fixdate production is its successor, leaving RFC 6265 with a dangling pointer. It also further de-emphasizes the two obsolete date formats, now described as "obsolete RFC 850 format" (Sunday, 06-Nov-94 08:49:37 GMT) and "ANSI C's asctime() format" (Sun Nov 6 08:49:37 1994):
https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.1.1

To round things out, RFC 5322 (published October 2008) specifies its full date format here:
https://datatracker.ietf.org/doc/html/rfc5322#section-3.3

@boesing
Copy link

boesing commented Sep 30, 2021

So again, working with dates is a major pain.
If I see correct, both values (the one provided in this patch and the value provided prior this patch) are obsolete.

If I understand https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.1.1 correct, 2021-01-01 01:02:03 formatted with D, d M Y H:i:s \G\M\T would be the preferred value of a datetime representative in HTTP responses (and thus, for cookies).

This would require a userland implementation to ensure a DateTime(Immutable) contains the GMT timezone because the preferred format mentioned above already exists as a constant in DateTimeInterface::RFC7231.

Maybe userland code then has to create a dedicated logic to ensure this behavior as the format will not enforce the proper timezone:
https://3v4l.org/7TA1X#v8.1rc2

Imho, the format is quite "misleading" as I would prefer D, d M Y H:i:s T. This would create an invalid timestamp for RFC7231 unless the proper DateTimeZone is set to the date object we are "formatting".

Is this something which needs to be proposed to internals? Maybe worth having some kind of tooling around this? Is that even something PHP itself wants to address? Feedback is highly appreciated.

@mvorisek
Copy link
Contributor

mvorisek commented Sep 30, 2021

I have checked 3 major websites - google.com, facebook.com, microsoft.com. They all use the D, d-M-Y H:i:s T format with GMT timezone.

@GPHemsley
Copy link

One of the problems faced here is that most of the aforementioned date formats have multiple acceptable variations, so settling on a single value (as in a constant) is a decision being made outside the spec.

And to further complicate things, I've discovered that RFC 822, RFC 1123, RFC 2822, and RFC 5322 allow 1-digit days (j) but RFC 2616 and RFC 7231 restrict their "rfc1123-date"/"IMF-fixdate" definition to require 2-digit days (d).

Here are the constants compared with my interpretation of each spec:
https://3v4l.org/3pbah

And @boesing, I agree with you that \G\M\T instead of T is questionable given that the date may have a non-GMT timezone attached. And that speaks again to the question of what these constants are intending to capture.

But in any case, this issue was originally specific to DATE_COOKIE, and the research indicates that, if the Netscape spec was what this constant was meant to capture, then the original change to D, d-M-Y H:i:s T was indeed correct (modulo the timezone question).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants