spec: refers to Unicode 'class' when it should be 'category' #44715

Open
aprice2704 opened this issue Mar 1, 2021 · 2 comments

@aprice2704 aprice2704 commented Mar 1, 2021

What version of Go are you using (go version)?

Spec dated: Feb 10, 2021
Apologies: I couldn't find where to submit corrections to the spec.

I was determining whether underscore is considered upper case or not. It seems 'Lu' is a Unicode category, not a Unicode class, and this distinction actually matters in this instance. Additionally, where the "Letters and digits" section says that '_' is considered a letter, it might be helpful to mention that it is considered lower case.

What did you expect to see?

Exported identifiers
An identifier may be exported to permit access to it from another package. An identifier is exported if both:

the first character of the identifier's name is a Unicode upper case letter (Unicode category "Lu"); and
the identifier is declared in the package block or it is a field name or method name.
All other identifiers are not exported.

also perhaps:

The underscore character _ (U+005F) is considered a lower case letter.

What did you see instead?

Exported identifiers
An identifier may be exported to permit access to it from another package. An identifier is exported if both:

the first character of the identifier's name is a Unicode upper case letter (Unicode class "Lu"); and
the identifier is declared in the package block or it is a field name or method name.
All other identifiers are not exported.

and also

The underscore character _ (U+005F) is considered a letter.

ref: https://en.wikipedia.org/wiki/Unicode_character_property#General_Category

@seankhliao seankhliao changed the title from "Language spec refers to Unicode 'class' when it should be 'category'" to "spec: refers to Unicode 'class' when it should be 'category'" Mar 1, 2021
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Mar 2, 2021
@griesemer griesemer modified the milestones: Backlog, Go1.16.1, Go1.17 Mar 2, 2021

@aprice2704 aprice2704 commented Mar 3, 2021

Finding a more definitive reference than wikipedia was harder than expected; however, one may go to the report linked and search for "General_Category" to confirm:

https://unicode.org/reports/tr44/

The table in section 5.1, for instance, indicates category is the correct term, though to be fair it is a little fuzzy.

Sorry for the extraordinarily picky point. A change of larger practical import is to add the 'lower case' to the line about '_'

(I have a colleague who wishes to use variable names starting with _. I find them aesthetically displeasing and wanted to find a real reason to discourage him -- that many folks may not know whether _ is upper or lower case and thus whether the variable is exported or not ;P Clarifying this point in the spec sabotages my case somewhat, but does improve it, albeit microscopically)

@griesemer griesemer commented Mar 3, 2021

Thanks, @aprice2704, for this report. Making it clear that _ is considered a lower-case letter seems worthwhile (and fixing the class vs category issue as well). From a cursory glance, the spec doesn't seem to state that explicitly.

Using _ may not be explicitly recommended style, but it can be useful. For instance, if we anticipate some API to be exported at some point in the future, but maybe not in the very next release, it can make sense to use the future, capitalized identifier names and prefix them temporarily with _. This shows what the actual future names will look like, except for the leading _.
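
A rough sketch of that convention, with hypothetical names (`_Config`, `_NewConfig`) invented purely for illustration: the identifiers stay unexported while the `_` prefix is present, and removing the prefix later would export them under their final names.

```go
package main

import "fmt"

// _Config is unexported today because its first character is '_',
// which is not an upper case letter. Dropping the prefix later
// would export it as Config. (Hypothetical type.)
type _Config struct {
	Name string
}

// _NewConfig is a hypothetical constructor following the same convention.
func _NewConfig(name string) _Config {
	return _Config{Name: name}
}

func main() {
	c := _NewConfig("demo")
	fmt.Println(c.Name)
}
```

Note that field names follow the same rule: `Name` above starts with an upper case letter, but it is only reachable from other packages once the enclosing type is exported as well.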
