Skip to content

bpo-31484: Cache single-character strings outside of the Latin1 range.#3600

Closed
serhiy-storchaka wants to merge 2 commits into
python:masterfrom
serhiy-storchaka:unicode-bmp-char-cache
Closed

bpo-31484: Cache single-character strings outside of the Latin1 range.#3600
serhiy-storchaka wants to merge 2 commits into
python:masterfrom
serhiy-storchaka:unicode-bmp-char-cache

Conversation

@serhiy-storchaka
Copy link
Copy Markdown
Member

@serhiy-storchaka serhiy-storchaka commented Sep 15, 2017

@serhiy-storchaka serhiy-storchaka added the type-feature A feature request or enhancement label Sep 15, 2017
@zhangyangyu zhangyangyu self-requested a review September 17, 2017 16:12
Copy link
Copy Markdown
Member

@methane methane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread Objects/unicodeobject.c Outdated
/* Single character Unicode objects can be shared when using this
constructor. */
if (size == 1) {
return unicode_char((Py_UCS4)*u);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    def test_issue17223(self):
        # this used to crash
        if sizeof_wchar == 4:
            # U+FFFFFFFF is an invalid code point in Unicode 6.0
            invalid_str = b'\xff\xff\xff\xff'
        else:
            # PyUnicode_FromUnicode() cannot fail with 16-bit wchar_t
            self.skipTest("specific to 32-bit wchar_t")
        a = array.array('u', invalid_str)
        self.assertRaises(ValueError, a.tounicode)
        self.assertRaises(ValueError, str, a)

PyUnicode_FromWideChar() should raise ValueError when u is out of unicode range.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not it do this? unicode_char() is an implementation of Python's chr().

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert(ch <= MAX_UNICODE); is in top of unicode_char()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Comment thread Objects/unicodeobject.c Outdated
/* Single character Unicode objects can be shared when using this
constructor. */
if (size == 1) {
return unicode_char((Py_UCS4)*u);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert(ch <= MAX_UNICODE); is in top of unicode_char()

@bedevere-bot
Copy link
Copy Markdown

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I didn't expect the Spanish Inquisition!. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

@serhiy-storchaka
Copy link
Copy Markdown
Member Author

I expect the Japan Inquisition!

@taleinat
Copy link
Copy Markdown
Contributor

taleinat commented Sep 6, 2018

@serhiy-storchaka, shouldn't this go in? Just add a NEWS entry...

@serhiy-storchaka
Copy link
Copy Markdown
Member Author

I have doubts because of diverse feedback on this proposition. This PR adds complexity in the code, and the gain is small. Most likely I'll close this PR unless other core devs have good need of this microoptimization.

@csabella
Copy link
Copy Markdown
Contributor

@serhiy-storchaka, based on your last comment, do you think this should be closed now?

@taleinat
Copy link
Copy Markdown
Contributor

@serhiy-storchaka, let's close this PR?

@serhiy-storchaka serhiy-storchaka removed their assignment Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting merge type-feature A feature request or enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants