-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Inconsistency between dangling '%' handling in time.strftime() and datetime.strftime() #79247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
A call to A similar call to Similar inputs like '%D %' behave similarly. I might take a crack at fixing this, but first I wanted to see what the official guidance is. Seems to me like similar error handling behavior between the functions would be desirable. |
for me, yep normally we should provide the same behavior. now, if you want, you can submit a PR but before your PR, you have to sign the CLA. thanks |
I think it would be a good idea to make this more consistent. We should run through a multi-release deprecation cycle, since it might break existing, working code. And we could only start that in 3.8. |
Ok, seems reasonable. What branch would I submit a PR against? On Thu, Oct 25, 2018 at 1:11 PM Eric V. Smith <report@bugs.python.org>
|
I am not sure time.strftime("%") should raise an error. There is an explicit test case and it's mentioned as platform dependent in the comment to raise a ValueError or succeed. So I don't know if it should be changed despite the inconsistency and there is any reason behind this. The error regarding datetime module comes from SVN version and I couldn't get to know the original reason behind it and why the same was not carried over to time module. I agree with Eric that raising a DeprecationWarning for this and then removing it in later versions if we are going forward with this since we are making a platform dependent error as an expected error across platforms. In the below test case "%" doesn't raise ValueError on my Mac OS and Ubuntu machine. Line 240 in 9e95eb0
def test_strftime_format_check(self):
# Test that strftime does not crash on invalid format strings
# that may trigger a buffer overread. When not triggered,
# strftime may succeed or raise ValueError depending on
# the platform.
for x in [ '', 'A', '%A', '%AA' ]:
for y in range(0x0, 0x10):
for z in [ '%', 'A%', 'AA%', '%A%', 'A%A%', '%#' ]:
try:
time.strftime(x * y + z)
except ValueError:
pass I am adding @belopolsky who might have thoughts on the change. Thanks for the report. |
Hmm, if there's a test for this, then that does complicate the decision. Is this behavior documented anywhere? If so, then we shouldn't change it. If we do decide to go forward with a change, it should be in the master branch, which will become 3.8. |
After a little more thinking: maybe we should just document this behavior, make it official, and not change it. |
From a pure usability standpoint I'd prefer for datetime to match the time Of course I defer to the dev team on this, but I want to make clear where On Thu, Oct 25, 2018 at 1:22 PM Karthikeyan Singaravelan <
|
Michael: I understand the inconsistency but since there is a test that says ValueError is platform dependent then making it as an intentional error there might be breakage. I am not against changing this but if it's done then it should be done with DeprecationWarning for 3.8 and then later removed on other versions. Some more information : Further, I looked into timemodule.c in CPython that says that it supports some common formats and "Other codes may be available on your platform. See documentation for the C library strftime function." . I looked into freebsd strftime there is an explicit comment if conversion char is undefined then the behavior is also undefined and to just print it out. Related issue that has the patch to an external implementation that refers to the same comment : https://bugs.python.org/issue3173 Meanwhile datetime strftime uses wrap_strftime that defines the custom error message when format ends with raw % and does some more error reporting. # datetime strftime error : cpython/Modules/_datetimemodule.c Line 1518 in 9e95eb0
# Freebsd https://github.com/freebsd/freebsd/blob/277918494930ec3fb0c7fdbd4d35060a3bc6d181/lib/libc/stdtime/strftime.c#L572 case '%':
Initially I thought this is the relevant code that is printing the '%' but looking at the loop itself if the first character is "%" followed by '\0' indicating that it's just '%' then it breaks out of the loop and just returns '%' which I hope is happening on my system. I don't think the above case of printing out the character itself in the comment i.e. "%" is done here. The above are based on my limited knowledge of C though so feel free to correct me if I am wrong on the above or took it out of context. So maybe this can be documented that for time.strftime the behavior is undefined when the conversion char is undefined and is based on the underlying operating system internals. Also a note that time.strftime with just '%' is system dependent meanwhile datetime.strftime '%' produces a ValueError. I think the same is noted in the test that this platform dependent depending on the implementation of strftime like in Windows. So if we are going to make '%' as an error from Python like datetime.strftime in time.strftime too then lies the breakage since Python behaves different from the underlying OS strftime implementation it uses for time module. Hope it helps and maybe someone else with a better understanding of C has a better explanation. |
Did a little digging. Seems that there are two versions of the datetime Both define a wrap_strftime function that replace %z, %Z and %f format Here's the strange thing. The C datetime module raises a ValueError on a cpython/Modules/_datetimemodule.c Lines 1517 to 1520 in 3df8540
and the python version is here Line 196 in 9e95eb0
So to summarize, it seems unnecessary to throw an error on a dangling % in Let me know if I'm off base here, or if this is a fair assessment. On Thu, Oct 25, 2018 at 2:35 PM Karthikeyan Singaravelan <
|
Michael Saah, when you reply by email, *please* delete the quoted post you are replying to (except possibly for a relevant line or two.). The quotation duplicates what is already on the web page and makes it harder to scroll through posts on the web page. |
Appologies, will do. |
Thanks for the details. The C implementation should be same as Python implementation which in this case differs as per your analysis if I am understanding it right and IIRC there is a PEP (PEP-399 I think) to enforce that C and Python implementation should behave the same. |
Summary to accompany my patch: Modules/_datetimemodule.c and Lib/datetime.py do not behave identically. This situation leads to a scenario in which, for example, "%D %" passed to To summarise, there are two problems: (1) datetime does not comply with This PR attempts to fix this problem by removing the case check from the There was much talk on the issue thread about there existing a test case |
Paul Ganssle asked me to look at PR 10692. This issue is about consistency, so I don't understand this part of the change:
self.skipTest('time module does not support trailing %') Would why datetime have the same behavior on all platforms, but time.strftime('%') may or may not raise an exception depending on the libc? Can't we get the same behavior on all platforms and the same behavior in time and datetime module. Honestly, I have no preference between always raising an exception or always success (just copy trailing "%"). This issue reminds me the old bpo-16322: time.strftime("%z") fails to format properly the timezone name. I would suggest to "preprocess" the input string passed to the C function strftime() / wcsftime() to replace %z or %Z with the timezone name, but only pass format substrings? Something similar can be done for the trailing "%": pass a substring (without the trailing %) to strftime() / wcsftime(), and later append "%". |
Hi Victor, thanks for taking a look.
If I understand the call stack correctly, datetime does not have the To be honest, I can't claim to understand the strftime
I like this idea, as it gets around the ill-defined parameters of |
I agree with Victor on this. In the future, I'd really like to see us do our best to add cross-platform uniformity to Python's strftime and strptime support. If there really is a platform out there that doesn't support a trailing That said, I don't think this should be a blocker on Michael's PR. I think that his contribution by itself improves on the current state of things and there's no pressing *need* to solve them both at the same time. Unless I'm misunderstanding, I think the existing PR is a prerequisite for solving the problem on all platforms anyway. Michael - do you think you can / would you like to add the functionality that Victor mentioned to your existing PR? If not, I recommend we merge the current PR and open a new issue for "Lone trailing % not supported on all platforms". |
I'd be happy to do so, but can't commit to a timeline at the moment. I'll leave that as a judgement call to you. |
The behavior of strftime() with non-ASCII is not portable: bpo-34512. A solution to make time.strftime() more portable would be to split the format string, format each "%xxx" substring separately but don't pass substrings between "%xxx" to strftime(). |
I proposed two different implementations to make time.strftime() more portable, so it seems like it's more complex than what I expected. I merged the datetime change since this one is self-sufficient, so someone can work on a time change on top of it. |
The new test added by changeset 454b3d4 fails on Android: ====================================================================== Traceback (most recent call last):
File "/data/local/tmp/python/lib/python3.8/test/datetimetester.py", line 1400, in test_strftime_trailing_percent
self.assertEqual(t.strftime('%'), '%')
AssertionError: '' != '%'
+ % The implementation of strftime() on Android does not seem to be posix compliant:
>>> import time
>>> time.strftime('A%Q')
'AQ'
>>> time.strftime('%')
'' However the new test is not about testing posix compliance and the following patch fixes this test failure on Android while still testing that the changes made by this changeset cause a trailing '%' to not raise the exception anymore: diff --git a/Lib/test/datetimetester.py b/Lib/test/datetimetester.py
index 715f0ea6b4..ae1a97f0b4 100644
--- a/Lib/test/datetimetester.py
+++ b/Lib/test/datetimetester.py
@@ -1397,8 +1397,10 @@ class TestDate(HarmlessMixedComparison, unittest.TestCase):
_time.strftime('%')
except ValueError:
self.skipTest('time module does not support trailing %')
- self.assertEqual(t.strftime('%'), '%')
- self.assertEqual(t.strftime("m:%m d:%d y:%y %"), "m:03 d:02 y:05 %")
+ trailing_percent = _time.strftime('%')
+ self.assertEqual(t.strftime('%'), trailing_percent)
+ self.assertEqual(t.strftime("m:%m d:%d y:%y %"),
+ "m:03 d:02 y:05 %s" % trailing_percent)
def test_format(self):
dt = self.theclass(2007, 9, 10) |
Xavier de Gaye: That's why I asked to stop relying on the exact behavior of strftime() of the libc to get portable behavior :-/ See my previous comments. IMHO the correct fix is to strip trailing % from the format string, call strftime() and then concatenate the trailing %. |
While I agree with Victor that reworking time.strftime to be more portable Xavier's bug shows that my test methodology didn't take into account the |
I think the proposed change to the test will work, or we can mark the test as an expected failure on Android (on the theory that the test *should* work because we want the behavior normalized, but we are not living up to that). In either case, I think a separate issue for normalizing the behavior of |
No please, don't do that :-( |
Interesting, I don't feel terribly strongly about it, but I would have thought that you'd be more in favor of that solution, maybe we have a different definition of "expected failure"? Usually in my projects, I use xfail if I have *tests* for a bug, but no fix for it yet. The xfail-ing test serves two purposes: 1. it notifies me if the bug is incidentally fixed (so that I can remove the xfail and it becomes a regression test, and I close the bug report) and 2. it allows me to encode acceptance criteria for fixing the bug directly into the test suite. I do personally like the idea of separate tests for "is this consistent across platforms" and "does this throw an error", but it is true that once it's possible to pass the consistency test it *also* serves as a test that no errors are thrown. |
FWIW (unsurprisingly) the new test added here is broken on Cygwin, whose libc's (newlib) behavior in this undefined case. So I get: >>> from datetime import date
>>> t = date(2005, 1, 1)
>>> t.strftime("%Y") # ok
'2005'
>>> t.strftime("%%") # ok
'%'
>>> t.strftime("%") # undefined behavior
''
>>> t.strftime("%Y %") # undefined behavior; discards the whole format string
''
>>> t.strftime("%Y%Q") # undefined format; discards the whole format string
'' This behavior is user-hostile I think; it should raise a ValueError instead of just return an empty string. I would have suggested the same for the trailing '%' case, though I understand the goal of this issue was consistency. Also worth noting that both before and after this patch: >>> import time
>>> time.strftime('%')
'' So the question of consistency between the interfaces, which was the main point of this issue, was already resolved in this case, and the *inconsistency* observed was dependent on system-dependent behavior. For now I might propose doing away with this test in its current form, and just test assert t.strftime('%') == time.strftime('%') or something like that. I agree with Victor that trying to make the strftime experience consistent across system-dependent quirks is a worthy goal, but that goes deeper than just this trailing '%' case. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: