Editorial: nested surrogate pairs? #969
jmdyck commented Aug 6, 2017
In the 'Patterns' clause, the Syntax section contains the paragraph:
This strikes me as very odd. The wording is exactly parallel to that for resolving the dangling else problem, suggesting that surrogate pairs nest somehow. I.e., given
the last `TrailSurrogate` should be 'associated' with the first `LeadSurrogate`. But what would that even mean? (Note that the relevant semantics make no mention of "associated" or "corresponding" Surrogates.)

Now, granted, the grammar is formally ambiguous here (`Alternative[+U]` derives `\u LeadSurrogate \u TrailSurrogate` in two distinct ways), and so requires some disambiguation. But rather than the quoted paragraph, I'd prefer one of these approaches:

- Given `\u LeadSurrogate \u TrailSurrogate`, it must be parsed as a single `Atom` rather than two.
- `[+U] u LeadSurrogate [lookahead != \u TrailSurrogate]`

(One objection to the last approach is that `TrailSurrogate` is not a terminal symbol, and so this is not a 'legal' lookahead sequence. However, this doesn't bother me much: we could enlarge the definition of lookahead sequences if we wanted.)
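
For illustration only (this is not spec text), the effect of the single-`Atom` reading can be observed by attaching a quantifier to an escaped surrogate pair. This is a minimal sketch assuming an engine that supports the `u` flag; the variable names are made up for the example. With `u`, the quantifier applies to the whole pair; without it, only to the second escape.

```javascript
// Two U+1F600 code points, i.e. four UTF-16 code units.
const twoCodePoints = "\uD83D\uDE00\uD83D\uDE00";
// One U+1F600 followed by a lone trail surrogate, i.e. three code units.
const pairPlusTrail = "\uD83D\uDE00\uDE00";

// With the u flag, the escaped pair is a single Atom,
// so {2} repeats the whole code point.
const withU = /^\uD83D\uDE00{2}$/u;
// Without the u flag, each escape is its own Atom,
// so {2} repeats only \uDE00.
const withoutU = /^\uD83D\uDE00{2}$/;

// Hypothetical result names, chosen for this sketch.
const results = {
  unicodeMatchesTwoCodePoints: withU.test(twoCodePoints),
  unicodeMatchesPairPlusTrail: withU.test(pairPlusTrail),
  legacyMatchesTwoCodePoints: withoutU.test(twoCodePoints),
  legacyMatchesPairPlusTrail: withoutU.test(pairPlusTrail),
};
console.log(results);
```

If the pair were parsed as two `Atom`s under `+U`, the first two results would be expected to swap, which is the behaviour either proposed disambiguation is meant to rule out.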