bpo-34481: Fix surrogate-handling in strftime #8983

Closed
wants to merge 10 commits into from

Conversation

pganssle

@pganssle pganssle commented Aug 28, 2018

This is intended to be re-based against #8878 - that PR adds tests, this one adds the fix.

I am making this PR before that one is merged mainly because this is some crazy platform-specific behavior, and I want to test it on the full cross-platform test suite.

https://bugs.python.org/issue34481


izbyshev and others added 6 commits August 23, 2018 20:46
…gs in datetime classes

A follow-up of bpo-34454.
Now that the bug is fixed, the try/catch can be removed.
This is now also a more complete fix for bpo-6697.
This will cut down on some of the per-environment variability of this
function.
This passes a backslash-escaped unicode string to strftime in the event
that locale encoding fails. To ensure that the relevant string is
round-trippable, all backslashes in the original string are
double-escaped, and then unescaped when the result is decoded.
This reverts commit c1381d5.
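
The backslash-escaping commit above describes a round trip: double the existing backslashes, escape whatever cannot be encoded, and undo both steps on the way back. A minimal Python sketch of that idea follows. It is not the C code from this PR; the helper names `escape_for_locale` and `unescape_from_locale` are hypothetical, and it assumes an ASCII target rather than the real locale encoding.

```python
def escape_for_locale(text: str) -> bytes:
    """Hypothetical helper: make `text` representable in pure ASCII."""
    # Double every literal backslash first so it cannot be confused with
    # an escape sequence introduced by the next step.
    doubled = text.replace("\\", "\\\\")
    # Replace every non-ASCII character (lone surrogates included) with a
    # backslash escape such as \xNN, \uNNNN or \UNNNNNNNN.
    return doubled.encode("ascii", "backslashreplace")


def unescape_from_locale(raw: bytes) -> str:
    """Hypothetical helper: invert escape_for_locale()."""
    # unicode_escape collapses doubled backslashes and decodes the numeric
    # escapes, which restores lone surrogates as well.
    return raw.decode("unicode_escape")


if __name__ == "__main__":
    original = "%Y \\ \ud800 \u00e9"
    escaped = escape_for_locale(original)
    print(escaped)
    print(unescape_from_locale(escaped) == original)
```

Doubling the backslashes before escaping is what keeps a format string that already contains a literal `\ud800` (six characters) distinct from one that contains the actual surrogate.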
@pganssle

I have no idea how to reproduce this failure. It works fine on my home computer (Arch Linux), on a Debian VPS, and on Debian under WSL. All are succeeding.

I think that this will start succeeding on Linux as well if I re-enable wcsftime, but I'm quite concerned that it's failing in a way that I don't understand and can't reproduce.

@pganssle

pganssle commented Sep 4, 2018

I'm able to reproduce this issue only on Ubuntu 14. For some reason with '\ud800', this function is not raising an error in Ubuntu 14. However, a simple POC:

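#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

int main(void) {
    /* A lone high surrogate, which has no valid multibyte encoding. */
    const wchar_t str[] = L"\xd800";

    /* With a NULL destination, wcstombs only reports the required length,
       or (size_t)-1 if a character cannot be converted. */
    size_t rv = wcstombs(NULL, str, 0);

    printf("%d\n", (int)rv);
    return 0;
}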

works fine on both Ubuntu 14 and other Linux distributions (it returns -1). I am honestly deeply puzzled by this. I am also not sure how to make a minimal triggering example, because I'm not sure what the pure Python equivalent of PyUnicode_EncodeLocale is. '\ud800'.encode(errors="surrogateescape") doesn't hit this code path. Anyone have an idea?
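
One way to probe this from inside the interpreter is to call `wcstombs` through `ctypes`, so the call runs under whatever `LC_CTYPE` the Python process has set instead of the default "C" locale that a bare C program starts in. This is only a sketch: it is not an equivalent of PyUnicode_EncodeLocale, the helper name `wcstombs_length` is hypothetical, and `ctypes.CDLL(None)` assumes a POSIX system where the C library is visible in the running process.

```python
import ctypes
import locale


def wcstombs_length(text):
    """Hypothetical helper: length reported by wcstombs, or None on failure."""
    # Assumption: POSIX platform, libc symbols reachable via the main program.
    libc = ctypes.CDLL(None)
    libc.wcstombs.restype = ctypes.c_size_t
    libc.wcstombs.argtypes = [ctypes.c_char_p, ctypes.c_wchar_p, ctypes.c_size_t]
    # A NULL destination asks only for the required byte count.
    rv = libc.wcstombs(None, text, 0)
    # (size_t)-1 signals a character that cannot be converted.
    if rv == ctypes.c_size_t(-1).value:
        return None
    return rv


if __name__ == "__main__":
    # Show the locale the conversion runs under, then the two results.
    print(locale.setlocale(locale.LC_CTYPE))
    print(wcstombs_length("abc"))
    print(wcstombs_length("\ud800"))
```

Comparing the output across the affected and unaffected machines, together with the printed locale name, may help tell a libc difference apart from a locale difference.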

@pganssle

pganssle commented Sep 5, 2018

This is too annoying for me to continue working on it. Anyone else is free to pick up where I left off.
