regular expressions - whitespace #1005

Closed
ChristianGruen opened this issue Feb 7, 2024 · 6 comments
Labels
  • Editorial: Minor typos, wording clarifications, example fixes, etc.
  • Propose Closing with No Action: The WG should consider closing this issue with no action
  • XQFO: An issue related to Functions and Operators

Comments

@ChristianGruen
Contributor

A recent discussion on Slack revealed some confusion about the rationale behind the definition of the multi-character escape for whitespace:

  • \s is limited to [#x20\t\n\r]
  • In contrast, \w covers [#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}], i.e., it considers the full Unicode range

Do we know the reason?

I assume it’s both too late and out of scope to change that in our specs, but maybe we can improve the XQFO spec and…

  • mention why \s does not include \p{Zs} or \p{Z}
  • add an example for looking up non-breaking spaces… for example:
matches(
  string-join(('my', 'pleasure'), char(0xA0)),
  '\p{Z}'
)
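The difference can be checked outside XPath as well. The following is a minimal Python sketch, not part of the original proposal: it emulates the XSD definition of \s with an explicit character class and approximates \p{Z} through the Unicode general category. The names XSD_WHITESPACE and is_separator are hypothetical helpers introduced only for illustration. Python's built-in \s is Unicode-aware and is shown only for contrast.

```python
import re
import unicodedata

# Assumption: this class mirrors the XSD definition of \s quoted above,
# [#x20\t\n\r]. Python's own \s is Unicode-aware, so it is not used here.
XSD_WHITESPACE = re.compile(r"[\x20\t\n\r]")


def is_separator(ch: str) -> bool:
    """Rough stand-in for \\p{Z}: true for general categories Zs, Zl and Zp."""
    return unicodedata.category(ch).startswith("Z")


# Same input as the XPath example: two words joined by a non-breaking space.
text = "\u00a0".join(["my", "pleasure"])

# The XSD-style class does not see the non-breaking space.
xsd_s_found = XSD_WHITESPACE.search(text) is not None

# U+00A0 has general category Zs, so the \p{Z} stand-in finds it.
separator_found = any(is_separator(ch) for ch in text)

# For contrast: Python's Unicode-aware \s also matches U+00A0.
python_s_found = re.search(r"\s", text) is not None

print(xsd_s_found, separator_found, python_s_found)
```

Here the XSD-style class finds nothing, while the category check reports a separator, which is the behavior the proposed \p{Z} example is meant to illustrate.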
@ChristianGruen ChristianGruen added XQFO An issue related to Functions and Operators Editorial Minor typos, wording clarifications, example fixes, etc. labels Feb 7, 2024
@ChristianGruen ChristianGruen changed the title from "regular expression addition - whitespace" to "regular expressions - whitespace" Feb 7, 2024
@ndw
Contributor

ndw commented Feb 7, 2024

I think Syd Bauman's proposed rationale, that \s matches the S production in XML, is almost certainly correct.

It's too late to add such a note to the XML Schema specification, but we could certainly add one here.

@Arithmeticus
Contributor

Equivalent reminders exist within constituent definitions in the serialization spec (1.1, definition) and the XSLT spec (4 Data Model, para 4, definition), but not in the data model spec.

There is a rather opaque reminder in the XPath specs A.3.5.

The XQFO specs have a reminder in the rules and a note for fn:normalize-space, in the 5.6.2 flags documentation for x, in a note for fn:tokenize, and in a note for fn:parse-ietf-date. So, here and there, but not everywhere, and not captured as a definition as in the other two specs.

In all, there are already a lot of reminders about what whitespace means scattered throughout the specs. So the failure to communicate is not because no one tried.

One impulse is to keep spreading reminders. Another is to consolidate them and make them more prominent (via hyperlinks). I don't know.

@ChristianGruen ChristianGruen self-assigned this Feb 7, 2024
@michaelhkay
Contributor

michaelhkay commented Feb 7, 2024

> One impulse is to keep spreading reminders

My father taught me a useful principle on this: if you want to bring things to a reader's attention, you need to reduce the amount of text, not to increase it. Very hard to achieve in practice, but certainly, saying things several times in different places can be counter-productive.

@michaelhkay
Contributor

It's not easy to add tutorial information for regular expressions given that we specify them by reference to the XML Schema specification, with modifications and additions. There's nowhere obvious to put the requested information without adding a lot of other non-normative material to the spec, and on the whole, I think that would be counter-productive. I'm going to propose closing with no action.

@michaelhkay michaelhkay added the Propose Closing with No Action The WG should consider closing this issue with no action label Feb 13, 2024
@ChristianGruen
Contributor Author

ChristianGruen commented Feb 13, 2024

> I'm going to propose closing with no action.

+1. The best place would be the XML Schema spec anyway.

@ndw
Contributor

ndw commented Feb 21, 2024

The CG agreed to close this issue without action at meeting 066.

@ndw ndw closed this as completed Feb 21, 2024
@ChristianGruen ChristianGruen removed their assignment Mar 19, 2024