BaseHTTPServer reinventing rfc822 date formatting #51619
Comments
While digging through Lib/BaseHTTPServer.py, I noticed that the |
Thanks for the patch. Per bpo-2849, use of rfc822 should be gone from the stdlib. Please re-open if you disagree. |
Following the last link on bpo-2849, I’ve found that “rfc822.formatdate(time)” can be changed to “email.utils.formatdate(time, usegmt=True)”. I’ll make a real diff in a few days if no one beats me to it. Regards |
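For readers following along, here is a minimal sketch of the replacement call suggested above. It assumes a modern Python 3 interpreter and uses a fixed timestamp (the Unix epoch) purely so the output is reproducible; the variable name `stamp` is illustrative only.

```python
from email.utils import formatdate

# Timestamp 0 is the Unix epoch; a fixed value keeps the result reproducible.
# usegmt=True makes formatdate emit the literal zone "GMT" rather than "-0000",
# which is the form HTTP date headers use.
stamp = formatdate(0, usegmt=True)
print(stamp)  # Thu, 01 Jan 1970 00:00:00 GMT
```

Called with no timestamp, `formatdate(usegmt=True)` formats the current time instead.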
The issue is not invalid. The code duplication should be removed, but using the email module as Éric suggests. Reopening. |
Instead of “email.utils.formatdate(time, usegmt=True)” we can simply use time.strftime() and clean the code in a better way. The duplication is there in date_time_string() as well as log_date_time_string(). Submitting the patch for review. |
Nice catch. The patch looks good to me and applies correctly on my trunk copy. There seems to be no test about this in the test suite; do you have a little test script to compare old and new code? On a side note, I always find all this business with time.time, time.gmtime, time.localtime and time.strftime confusing. We have datetime objects now; would it be OK to use them in this module? Regards |
There are a couple of problems with this patch. The first is that fixing date_time_string by using strftime rather than email.utils.formatdate is suboptimal from a code reuse standpoint. The reason is that the date in HTTP message headers is required to conform to the same RFC standard as that generated by formatdate, so it is better to use the same routine to generate it. That way any bug fixes needed to handle RFC compliance are centralized in one place. The second problem with the patch is that strftime generates locale-aware weekday and month names, but per the RFC the header timestamps must use English names (see for example msg53731 in bpo-665194; the comment about locale applies to both strftime and strptime). If bpo-665194 were implemented, formatdate could use it, and then BaseHTTPServer could also use it directly. But absent that, it should use email.utils.formatdate. (That issue should also answer Éric's question about whether we can use datetime here: not yet.) Now, the logging routine is a different story. That timestamp isn't required to follow the RFC, and one could argue that it makes sense for its timestamp to use the locale. (One could also ask whether BaseHTTPServer should use the logging module, but that is a whole separate issue.) We definitely should have a unit test before applying this patch, one that makes sure the timestamp gets generated without error. Checking the detailed format of the timestamp can be assumed to be covered by the unit tests for formatdate. (I don't think those tests are completely adequate; for example, they don't test that the date remains in English if the locale is different, but again that is a different issue.) |
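To make the locale point concrete, the following is a rough sketch (not code from any of the patches) that formats the same timestamp both ways under a given LC_TIME locale. The helper names `http_date_strftime`, `http_date_formatdate` and `compare_under_locale` are hypothetical, and whether a non-English locale such as "de_DE.UTF-8" is installed depends on the machine, so the helper returns None when the locale is unavailable.

```python
import locale
import time
from email.utils import formatdate


def http_date_strftime(timestamp):
    # %a and %b are locale-aware: they follow the current LC_TIME setting.
    return time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime(timestamp))


def http_date_formatdate(timestamp):
    # formatdate uses fixed English abbreviations regardless of locale.
    return formatdate(timestamp, usegmt=True)


def compare_under_locale(locale_name, timestamp=0):
    """Return (strftime_result, formatdate_result) under locale_name,
    or None if that locale is not available on this system."""
    saved = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, locale_name)
    except locale.Error:
        return None
    try:
        return http_date_strftime(timestamp), http_date_formatdate(timestamp)
    finally:
        locale.setlocale(locale.LC_TIME, saved)  # always restore


# Under the "C" locale both approaches agree.
print(compare_under_locale("C"))
# Under a non-English locale (if installed) only the strftime result changes.
print(compare_under_locale("de_DE.UTF-8"))
```

On a system where the German locale is installed, the first element of the second tuple would be expected to carry localized day and month abbreviations, while the formatdate result stays in English.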
“One could also ask whether BaseHTTPServer should use the logging Cheers |
Quoting from the docstring of trunk/Lib/email/utils.py -> formatdate() "We cannot use strftime() because that honors the locale and RFC 2822 requires that day and month names be the English abbreviations." So yes, I do agree that email.utils.formatdate() should be used instead of time.strftime() to remove the duplicated code and comply with the RFC. |
It seems that the earlier patch was incorrect. Rectifying and submitting the correct patch. |
You could get a minor speedup by doing “from email.utils import formatdate”. Do we have tests now to check that the patch does not break anything? Can this still go into 2.7? |
I guess I shall do that. |
Opinions are nice, tests are better! :) |
Agreed. |
The skeleton is good but you have to change one thing. Your test should 1a) Write a test that checks that the current code produces right values End of HOWTO write a regression test :) |
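As an illustration of the first step (pin down what the current code produces before changing it), here is a hedged sketch of such a regression test. The function `legacy_date_time_string` is a hand-rolled formatter written for this example in the style the issue describes; it is an assumption, not a verbatim copy of the BaseHTTPServer source, and the sample timestamps are arbitrary.

```python
import time
import unittest
from email.utils import formatdate

# Hypothetical stand-in for the hand-rolled formatting the issue complains about.
WEEKDAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTH_NAMES = [None, "Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Arbitrary sample points: epoch, end of first day, a leap day, and two others.
SAMPLE_TIMESTAMPS = [0, 86399, 951782400, 1234567890, 2**31 - 1]


def legacy_date_time_string(timestamp):
    # Build the header date by hand from the broken-down UTC time.
    year, month, day, hh, mm, ss, wd, _yday, _isdst = time.gmtime(timestamp)
    return "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (
        WEEKDAY_NAMES[wd], day, MONTH_NAMES[month], year, hh, mm, ss)


class DateTimeStringTest(unittest.TestCase):
    def test_known_value(self):
        # Step 1a: check that the existing code produces the right value.
        self.assertEqual(legacy_date_time_string(0),
                         "Thu, 01 Jan 1970 00:00:00 GMT")

    def test_formatdate_matches_legacy(self):
        # Then check that the replacement produces identical output.
        for ts in SAMPLE_TIMESTAMPS:
            self.assertEqual(formatdate(ts, usegmt=True),
                             legacy_date_time_string(ts))
```

Saved to a file, the tests can be run with `python -m unittest`, first against the old code and again after the patch is applied.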
I do not like the idea that BaseHTTPServer depends on the email package, which in turn may depend on another package, etc. Having a date formatting function inside the email package breaks the "single responsibility" principle that would be nice to have in the stdlib. |
The HTTP RFCs reference the email RFCs for the date format, so the email package is the logical place for this function: email is the correct responsible party. In any case, the function resides in email.utils, which has no dependencies on anything else in the email package. Of course, it does have dependencies on other parts of the python standard library, but I hardly think you'd want every module re-implementing every stdlib function. |
I think it is now fixed by my patch in http://bugs.python.org/issue747320 |
There is an up-to-date patch for Python 3 in bpo-747320. Closing this as a duplicate. |