bpo-37864: Correct and deduplicate "isprintable" docs; add test. #15300

Closed
gnprice wants to merge 3 commits

gnprice
Contributor

@gnprice gnprice commented Aug 15, 2019

We had the definition of what makes a character "printable"
documented in three places, giving two different definitions.

The definition in the comment on `_PyUnicode_IsPrintable` was
inverted; correct that.

With that correction, the two definitions turn out to be equivalent --
but to confirm that, you have to go look up, or happen to know, that
those are the only five "Other" categories and only three "Separator"
categories in the Unicode character database. That makes it hard for
the reader to tell whether they really are the same, or if there's
some subtle difference in the intended semantics.
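
To illustrate the lookup described above, here is a minimal sketch (not part of this PR's diff) that enumerates the general categories present in the interpreter's Unicode database. The helper name `categories_with_prefix` is hypothetical and exists only for this example.

```python
import sys
import unicodedata


def categories_with_prefix(prefix):
    """Collect every general category starting with `prefix`, over all code points."""
    return {
        category
        for category in (
            unicodedata.category(chr(cp)) for cp in range(sys.maxunicode + 1)
        )
        if category.startswith(prefix)
    }


# "Other" categories start with "C"; "Separator" categories start with "Z".
other_categories = categories_with_prefix("C")
separator_categories = categories_with_prefix("Z")

print(sorted(other_categories))      # ['Cc', 'Cf', 'Cn', 'Co', 'Cs']
print(sorted(separator_categories))  # ['Zl', 'Zp', 'Zs']
```

Because the sets are derived from the database itself, a check like this would notice if a future Unicode version added a category under either prefix.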

Fix that by cutting the C API docs' and the C comment's copies of
the subtle details, in favor of referring to the Python-level docs.
That ensures it's explicit that these are all meant to agree, and
also lets us concentrate improvements to the wording in one place.

Speaking of which, borrow some ideas from the C comment, along with
other tweaks, to hopefully add a bit more clarity to that one
newly-centralized copy in the docs.

Also add a thorough test that the implementation agrees with
this definition.
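
As a rough illustration of what such a test checks (a sketch, not necessarily the exact test added in this PR), the following compares `str.isprintable()` against the category-based definition for every code point. The helper name `is_printable_by_definition` is hypothetical.

```python
import sys
import unicodedata


def is_printable_by_definition(char):
    """Printable means: not in an "Other" (C*) or "Separator" (Z*) category,
    with the ASCII space as the one exception that counts as printable."""
    category = unicodedata.category(char)
    return char == " " or category[0] not in ("C", "Z")


# Collect every code point where the implementation and the definition disagree.
mismatches = [
    cp
    for cp in range(sys.maxunicode + 1)
    if chr(cp).isprintable() != is_printable_by_definition(chr(cp))
]

print(f"{len(mismatches)} mismatching code points")
```

Checking the whole code point range, rather than a few samples, is what lets the test catch drift between the docs and the implementation.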

https://bugs.python.org/issue37864

@terryjreedy
Member

"Generated files not up to date
M Objects/clinic/unicodeobject.c.h
M Objects/unicodeobject.c"

I believe you have to regenerate these files and include the new versions in the branch, so that the new diffs show up in this PR. If the devguide does not explain how, ask on the core-mentorship list.

@gnprice
Contributor Author

gnprice commented Aug 20, 2019

@terryjreedy Ah, thanks for spotting that!

Indeed, clinic needed to be run after I'd edited a docstring. Fixed now.

@gnprice
Contributor Author

gnprice commented Aug 21, 2019

(I also just added a NEWS entry, which should turn that red X into a happy green check-mark.)

@encukou
Member

encukou commented Feb 14, 2025

Merged in #130118, with another regen-all.

Sorry for the wait!

@encukou encukou closed this Feb 14, 2025