Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[selectors][css-syntax] "parse a selector" missing trailing parenthesis #492

Closed
gsnedders opened this issue Sep 15, 2016 · 7 comments
Closed
Labels
Closed Rejected as Invalid Commenter Satisfied Commenter has indicated satisfaction with the resolution / edits. css-syntax-3 selectors-4 Current Work Tested Memory aid - issue has WPT tests

Comments

@gsnedders
Copy link
Member

gsnedders commented Sep 15, 2016

This is really @oyvind-stenhaug's issue, but hey. :)

Given an instruction to parse a selector where the source is :not([class], should this return failure or the selector? Notably, myself and @tabatkins seem to disagree on this.

Notably, step one of "parse a selector" is "Let selector be the result of parsing source as a <selector-list>. If it does not match the grammar, return failure.".

So, does :not([class] match the grammar?

  • We start at <selector-list>, so we match against <complex-selector-list>
  • Match against <complex-selector>#
  • Match against <compound-selector>
  • Match against <simple-selector>+
  • None of <type-selector>, <id-selector>, <class-selector>, <attribute-selector> match the first char, so match against <pseudo-class-selector>
  • Match against ':' <function-token> <any-value> ')', which is all fine and well until we need to match the ')'. Now the first big question is what exactly we do when we hit <function-token>? The cross-reference links to this which doesn't directly include a grammar, language, or machine. Do we "consume a token" until one is emitted, and if it is a <function-token> continue to match against that fragment of the grammar and if it is not try the next possible alternation or, if none exists, fail?

Regardless, following the grammar, we end up not matching, so we should return failure.

Now, per @tabatkins, this isn't the intent. Apparently, https://drafts.csswg.org/css-syntax/#rule-defs should apply, and specifically:

When defining a function or a block, the ending token must be specified in the grammar, but if it’s not present in the eventual token stream, it still matches.

However, Selectors provides a grammar to match against, and gives no indication that there should be any error handling done if <selector-list> does not directly match. Selectors only refers to specific tokens from Syntax, and hence it isn't obvious that anything outwith "consume a token" applies to the grammar specified in Selectors, such as whether ':' <function-token> <any-value> ')' is considered to define "a function".

I guess what would satisfy me for this (I'll let Øyvind speak for himself!) is text similar to what we have in Media Queries:

Informal descriptions of the media query syntax appear in the prose and railroad diagrams in previous sections. The formal media query syntax is described in this section, with the rule/property grammar syntax defined in [CSS3SYN] and [CSS3VAL].

Obviously we don't need quite that, perhaps something more like:

The formal selector syntax is described in this section, with the rule grammar syntax defined in [CSS3SYN].

This clarifies the status of the grammar by providing a normative reference to [CSS3SYN], and makes it clear that "Defining Grammars for Rules and Other Values" in it applies to that grammar.

Obviously this doesn't deal with the other two issues: what exactly one does when one hits <function-token> in the grammar, and what is considered to define a function.

@tabatkins
Copy link
Member

Regardless, following the grammar, we end up not matching, so we should return failure.

Incorrect. It is literally impossible for you to fail to match a function/block due to a missing ending token like this, because that information is completely erased by the parsing process. Token streams don't escape the Syntax spec; everything (explicitly or implicitly) uses one of the parsing entry points, and those collect functions/blocks automatically. (In this case, this is unfortunately implicit; I didn't link the word "parse" to https://drafts.csswg.org/css-syntax/#parse-grammar like I should have. That's an easy fix.)

You can't pay attention to just Selectors. The grammar that Selectors is using to define the selectors grammar is not derived from universal first principles; it's defined in the Syntax and V&U specs. This is somewhat implicit; we don't explicitly ref Syntax and V&U every time we use some of the value definition grammar, but everything in CSS uses it. I'm happy to add some refs, but it's not exactly a killer issue; this is definitely in the "everyone knows what we mean" camp.

Syntax, then, defines how blocks and functions work, via the paragraph you quoted - you have to include the closing token in grammars, but they're not actually required when matching things against the grammar, because it's literally impossible to require such; the data model returned by the Syntax parsing algos doesn't include such information, just "functions" and "blocks" with contents. Whether or not the ending token was present is completely erased by the time you get to the point of comparing a value against a CSS grammar.

and what is considered to define a function.

I guess I could put some redundant text being super-explicit about this? This runs real close to the "uh, obviously it means X" thing; we rely on a lot of implicit context in our definitions. I understand that you're coming at it from an "assume nothing, test everything" perspective, but still. ^_^

@gsnedders
Copy link
Member Author

gsnedders commented Sep 15, 2016

You can't pay attention to just Selectors.

Why not? Why should I, when implementing document.querySelector, think to look at any actual CSS spec? Why can I not just read Selectors and follow all cross-references to implement it? Given Selectors is a relatively stand-alone spec, I was assuming we were just using some BNF syntax definition rather than some module with CSS in the name! (I don't know if this was @oyvind-stenhaug's misunderstanding, too.)

And I'll point out we don't have interop here so I don't think we're in "everyone knows what we mean" camp, unless this has changed over time? Equally, given both me and Øyvind misunderstood this, I think that's obviously not true. If two people who are/were in the WG get this wrong, then heaven help someone outwith of it trying to implement it!

I guess I could put some redundant text being super-explicit about this? This runs real close to the "uh, obviously it means X" thing; we rely on a lot of implicit context in our definitions. I understand that you're coming at it from an "assume nothing, test everything" perspective, but still. ^_^

I won't deny that bit is down the anal end of the spectrum. :)

@oyvind-stenhaug
Copy link

It sounds like @gsnedders and I are on the same page.

I must admit that I initially didn't even look at the <function-token> and <ident-token> references, I just assumed that <ident-token> couldn't possibly allow a leading parenthesis, so since my string didn't contain any ), how could it possibly match the grammar?

Seems to me that parse a selector might be better off not referring to a grammar, but to some description of the actual algorithm. I haven't been following along in the goings-on here, but browsing throught the EDs now I don't really understand why Selectors has this grammar, and not a parser algorithm à la Syntax.

@tabatkins
Copy link
Member

And I'll point out we don't have interop here so I don't think we're in "everyone knows what we mean" camp, unless this has changed over time?

Bingo. Syntax post-dates querySelector implementations; implementations conforming to Syntax are relatively recent. I'll note that, prior to Syntax, every implementation error-corrected in a unique and ideosyncratic way (as the only description of how to handle CSS that didn't match the 2.1 grammar was a paragraph of vague error-handling guidelines).

I must admit that I initially didn't even look at the <function-token> and <ident-token> references, I just assumed that <ident-token> couldn't possibly allow a leading parenthesis, so since my string didn't contain any ), how could it possibly match the grammar?

Admitting that you didn't actually read the grammar details and instead just went off a gut feeling about how it should work does not strengthen your case about the spec being unclear. ^_^

Anyway, I'm adding an explicit link to the Syntax "parse" definition right now.

Seems to me that parse a selector might be better off not referring to a grammar, but to some description of the actual algorithm. I haven't been following along in the goings-on here, but browsing throught the EDs now I don't really understand why Selectors has this grammar, and not a parser algorithm à la Syntax.

Parsing algorithms are preferred when you have to do error-handling; grammars are terrible for recovering from errors. When errors instead just result in something being invalid, tho, grammars can often be much more compact, and make it easier to be consistent about corner details. (If we required parsing algos for everything in CSS, you can bet we'd have a super-inconsistent treatment of whether whitespace is required between tokens or not.) This is why it's fine for CSS properties to specify their value space with a grammar rather than bespoke parsing algorithms. Selectors are in the same boat - you never need to error-recover during Selectors parsing, you just parse it if it's valid and fail if it's not.

@gsnedders
Copy link
Member Author

b46e417 added this. I think that's good enough for now? @oyvind-stenhaug, look good to you (when you get back from holiday!)?

mathiasbynens referenced this issue in mathiasbynens/tibia.com-extension Jan 5, 2017
@annevk
Copy link
Member

annevk commented Jan 5, 2017

FWIW, I agree with @tabatkins with regards to how things should be. Whether they're defined that way...

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Nov 9, 2017
Discussed in
w3c/csswg-drafts#492

BUG=755015

Change-Id: I31172814ad0db99b2ede246f92c4611dab07c756
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Nov 9, 2017
Discussed in
w3c/csswg-drafts#492

Change-Id: I31172814ad0db99b2ede246f92c4611dab07c756
Reviewed-on: https://chromium-review.googlesource.com/759918
Commit-Queue: Eric Willigers <ericwilligers@chromium.org>
Reviewed-by: Darren Shen <shend@chromium.org>
Cr-Commit-Position: refs/heads/master@{#515117}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Nov 9, 2017
Discussed in
w3c/csswg-drafts#492

Change-Id: I31172814ad0db99b2ede246f92c4611dab07c756
Reviewed-on: https://chromium-review.googlesource.com/759918
Commit-Queue: Eric Willigers <ericwilligers@chromium.org>
Reviewed-by: Darren Shen <shend@chromium.org>
Cr-Commit-Position: refs/heads/master@{#515117}
MXEBot pushed a commit to mirror/chromium that referenced this issue Nov 10, 2017
Discussed in
w3c/csswg-drafts#492



Change-Id: I31172814ad0db99b2ede246f92c4611dab07c756
Reviewed-on: https://chromium-review.googlesource.com/759918
Commit-Queue: Eric Willigers <ericwilligers@chromium.org>
Reviewed-by: Darren Shen <shend@chromium.org>
Cr-Commit-Position: refs/heads/master@{#515117}
@tabatkins tabatkins added Closed Rejected as Invalid Commenter Satisfied Commenter has indicated satisfaction with the resolution / edits. Tested Memory aid - issue has WPT tests labels Jan 24, 2019
@tabatkins
Copy link
Member

Tested in web-platform-tests/wpt#15088

@tabatkins tabatkins added this to the CSS Syntax 3 June 2019 CR milestone Jun 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Closed Rejected as Invalid Commenter Satisfied Commenter has indicated satisfaction with the resolution / edits. css-syntax-3 selectors-4 Current Work Tested Memory aid - issue has WPT tests
Projects
None yet
Development

No branches or pull requests

4 participants