Group CSS features #1519

nzakas · 2025-04-08T18:54:30Z

One of the valuable parts of the mdn-data package is how it separates CSS features into different categories:

At-rules
functions
properties
selectors
syntaxes
types
units

In the current webref package, it's just a collection of objects that we then need to dig into to figure out what types are contained within. It would be helpful if the categories could be exposed at the top level of the package and list every entry for that category regardless of spec.
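To make the request concrete, here is a minimal sketch of what "exposed at the top level" could look like. The per-spec extract shape (`SpecExtract`), the `groupByCategory` helper, and the sample data are all assumptions made for illustration; this is not the actual webref API or data format.

```typescript
// Hypothetical shapes: the real webref extracts may be structured differently.
type Category = "atrules" | "properties" | "selectors" | "values";

interface Feature {
  name: string;
}

// One extract per spec, each listing the features that spec defines.
type SpecExtract = Partial<Record<Category, Feature[]>>;

interface IndexedFeature extends Feature {
  spec: string; // shortname of the spec the feature came from
}

const CATEGORIES: Category[] = ["atrules", "properties", "selectors", "values"];

// Flatten per-spec extracts into one list per category, regardless of spec.
function groupByCategory(
  extracts: Record<string, SpecExtract>
): Record<Category, IndexedFeature[]> {
  const grouped: Record<Category, IndexedFeature[]> = {
    atrules: [],
    properties: [],
    selectors: [],
    values: [],
  };
  for (const spec of Object.keys(extracts)) {
    const extract = extracts[spec] ?? {};
    for (const category of CATEGORIES) {
      for (const feature of extract[category] ?? []) {
        grouped[category].push({ ...feature, spec });
      }
    }
  }
  return grouped;
}

// Illustrative sample data only.
const sampleExtracts: Record<string, SpecExtract> = {
  "css-color": { properties: [{ name: "color" }] },
  "css-values": { values: [{ name: "abs()" }, { name: "<length>" }] },
};

const grouped = groupByCategory(sampleExtracts);
```

Keeping the originating spec on each entry means a consumer can still trace a feature back to its definition after the flattening.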

tidoust · 2025-04-09T10:23:51Z

I note the current @webref/css package already separates at the root level between:

at-rules
properties
selectors
and "values", which is a mixed bag of things.

The mixed bag of things exists because CSS specs do not really distinguish between other types when they define concepts. There is a notion of function but the specs do not necessarily use that consistently. That ambiguity seems to appear in mdn-data too. For example, the abs() function appears both as a "function" and as a "syntax" in mdn-data.
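One way to measure that ambiguity is to list every name that appears in more than one category. The sketch below assumes each category has already been reduced to a plain list of names; `findOverlaps` and the sample lists are hypothetical, and only the `abs()` overlap comes from the observation above.

```typescript
// Report names that appear in more than one category.
// Input: category name -> list of feature names (assumed, simplified shape).
function findOverlaps(
  categories: Record<string, string[]>
): Map<string, string[]> {
  const seen = new Map<string, string[]>();
  for (const category of Object.keys(categories)) {
    for (const name of categories[category] ?? []) {
      const list = seen.get(name) ?? [];
      // Ignore duplicates within the same category.
      if (list.indexOf(category) === -1) {
        list.push(category);
      }
      seen.set(name, list);
    }
  }
  const overlaps = new Map<string, string[]>();
  seen.forEach((list, name) => {
    if (list.length > 1) {
      overlaps.set(name, list);
    }
  });
  return overlaps;
}

// Illustrative sample data only.
const overlaps = findOverlaps({
  functions: ["abs()"],
  syntaxes: ["abs()", "angle-percentage"],
  units: ["px"],
});
```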

CSS specs also use a "type" definition type, which could perhaps be used to populate a related category. There seem to be many more type definitions in specs than in what mdn-data currently lists as types. For example, line-color-list, linear-color-stop, and ident-token are all type definitions from a spec perspective. If they are left out of the list on purpose, is there a way to distinguish between types?

CSS specs define units as value definitions that are for something. It may be relatively easy to assemble the list of units automatically with a short list of underlying types. For example, looking at all values defined for <angle>, <length> and a few others.
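A rough sketch of that idea, assuming each value definition carries a `for` list naming what it is defined for. The `ValueDefinition` shape, the `collectUnits` helper, the two-entry `UNIT_TYPES` list, and the sample data are assumptions for illustration; the real list of underlying types would be longer.

```typescript
// Hypothetical shape of a value definition extracted from a spec.
interface ValueDefinition {
  name: string;
  for?: string[]; // what the value is defined for, e.g. "<length>"
}

// Short, manually maintained list of underlying types whose values are units.
const UNIT_TYPES = new Set<string>(["<angle>", "<length>"]);

// Collect the names of all values defined for one of the unit types.
function collectUnits(
  values: ValueDefinition[],
  unitTypes: Set<string> = UNIT_TYPES
): string[] {
  const units = new Set<string>();
  for (const value of values) {
    if ((value.for ?? []).some((target) => unitTypes.has(target))) {
      units.add(value.name);
    }
  }
  return Array.from(units).sort();
}

// Illustrative sample data only.
const sampleValues: ValueDefinition[] = [
  { name: "deg", for: ["<angle>"] },
  { name: "px", for: ["<length>"] },
  { name: "auto", for: ["width"] },
  { name: "em", for: ["<length>"] },
];

const units = collectUnits(sampleValues); // ["deg", "em", "px"]
```

With this approach the only manual data to maintain would be the list of underlying types.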

Essentially, the question is: can CSS features be categorized automatically? If not, what amount of manual data would need to be maintained?

nzakas · 2025-04-09T17:49:08Z

Thanks for the response. A follow-up question: assuming everyone wants webref packages to be as useful as possible, is there a reason the specs themselves can't be updated to encode this information where appropriate?

tidoust · 2025-04-10T10:01:02Z

No reason in theory. On top of trying to reduce the amount of work needed to maintain Webref, we also restrict the amount of data that needs to be manually injected in Webref to a bare minimum, as a way to push fixes and improvements back to the underlying specs.

In practice there are ~120 CSS specs at various levels of maturity and activity, with dozens of editors and >3800 open issues. We already maintain a few patches in Webref for things that need fixing in CSS specs to get consistent data (these patches link back to issues raised against the specs). If most CSS specs need to be updated to provide additional semantics, that's likely going to require elbow grease both to convince CSS WG participants that the effort is worth prioritizing and to help with the actual updates. That's also why I'm trying to assess whether missing categories can already be determined automatically from available information.

nzakas · 2025-04-10T15:37:16Z

Ah gotcha, thanks for explaining. 👍

tidoust · 2025-04-28T15:27:17Z

I explored the differences between MDN data and Webref a bit. See the underlying code in tidoust/mdn-webref, along with the results:

The webref.json file, which could represent what we may want to end up with in Webref to ease consumption of data.
The report, which highlights differences between the two projects.

As far as I can tell, missing data in Webref is mostly stuff that is non standard or that has been obsoleted, but that is still present in MDN data (and sometimes documented on MDN). I do not know to what extent that data is a must have in Webref. There's more data missing in MDN data, perhaps because the underlying features are more recent and not yet documented.

There may be a few cases where data needs to be slightly improved in specs so that it can start appearing in Webref. One example is <general-enclosed>, which is currently defined in a <pre> tag without any class and is therefore skipped by the crawler as too generic. That seems easily fixable.

I still do not understand what syntaxes are meant to encompass. I managed to cover most of them by assembling functions and types, but that also creates hundreds of syntaxes that are not accounted for in MDN data. Are syntaxes used in practice? How?

(On top of the features themselves, I note that the grouping information in MDN data does not exist in Webref. That grouping seems more specific to MDN though. Same thing for links to MDN pages).

nzakas · 2025-04-28T15:51:04Z

Syntaxes are used in CSSTree to enable validation:
https://github.com/csstree/csstree/blob/9558ba790daeda2b24935838bf89990699ece66e/lib/data.js#L7

Basically, the parser creates an AST and the lexer validates the AST against these syntax definitions.

tidoust · 2025-05-01T10:20:27Z

Thanks @nzakas. I had not realized that entries in the "types" category in MDN data do not have a syntax key and that the "syntaxes" category collects that information. I'm not sure why functions are listed under the "syntaxes" category too, as that seems to duplicate the information already present in the functions.json file. All in all, I think the "syntaxes" category can be assembled by merging the "functions" and "types" categories, provided entries there do have a syntax key of course.
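As a minimal sketch of that merge, assuming both categories are maps from a name to an entry with an optional `syntax` key: the `Entry` shape, the `buildSyntaxes` helper, and the sample entries are hypothetical and do not reflect the exact mdn-data or Webref formats.

```typescript
// Hypothetical entry shape: only the optional syntax key matters here.
interface Entry {
  syntax?: string;
}

type CategoryMap = Record<string, Entry>;

// Build a "syntaxes" map from functions and types, keeping only entries
// that actually have a syntax. On a name clash, the first source wins.
function buildSyntaxes(
  functions: CategoryMap,
  types: CategoryMap
): Record<string, string> {
  const syntaxes: Record<string, string> = {};
  for (const source of [functions, types]) {
    for (const name of Object.keys(source)) {
      const entry = source[name];
      if (entry && entry.syntax !== undefined && !(name in syntaxes)) {
        syntaxes[name] = entry.syntax;
      }
    }
  }
  return syntaxes;
}

// Illustrative sample data only.
const syntaxes = buildSyntaxes(
  { "abs()": { syntax: "abs( <calc-sum> )" } },
  {
    "angle-percentage": { syntax: "<angle> | <percentage>" },
    "ident-token": {}, // no syntax key, so it is skipped
  }
);
```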

That initial exploration suggests that the categorization itself can be done automatically, with straightforward reasons that explain why some data is missing in Webref. That's a good first result!

I'll now look into actual syntax values to understand where and why Webref differs from MDN data. I somewhat expect to find more substantive differences: if I understand things correctly, MDN data syntaxes are manually curated to match reality in the main browsers, while Webref data is meant to be a view of what the latest spec drafts currently define, regardless of what browsers support. When specs lag behind implementations, they need fixing, and knowing about the problem creates a good feedback loop. When specs are more recent than implementations, it may be challenging to select the right syntax automatically. Anyway, let's find out ;)

nzakas · 2025-05-01T15:08:43Z

Thanks for the update and all of your work on this. 🙏
