Auto-legacy mode for dfns extraction #1120

Merged: tidoust merged 2 commits into main from dfn-auto-legacy on Nov 25, 2022

@tidoust (Member) commented Nov 24, 2022

Dfns extraction expected specs to always follow the data definitions model specified in Bikeshed's documentation. Older specs don't follow that model. Terms they define are still extracted but with a private access level.

This update makes the code peek at all definitions found in the spec. When at least one definition has a data-dfn-type attribute or some other attribute directly related to the data definitions model, extraction proceeds as before. When no definition has any of these attributes, the "legacy" extraction mode is activated and all extracted definitions are considered to be "public".
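
The detection step described above could be sketched roughly as follows. This is a minimal illustration and not Reffy's actual code: the function names, the plain-object shape of a definition, and every attribute in `MODEL_ATTRIBUTES` other than `data-dfn-type` are assumptions, and the export rule used outside legacy mode is deliberately simplified.

```javascript
// Attributes taken as evidence that a spec follows the data definitions model.
// Only data-dfn-type is named in the description; the rest are assumptions.
const MODEL_ATTRIBUTES = ['data-dfn-type', 'data-dfn-for', 'data-export', 'data-noexport'];

// Returns true when at least one definition carries a model attribute.
// Each definition is a hypothetical plain object: { linkingText, attributes }.
function followsDfnModel(dfns) {
  return dfns.some(dfn =>
    MODEL_ATTRIBUTES.some(attr => attr in dfn.attributes));
}

// Assigns an access level to every definition.
// Legacy mode (no model attribute anywhere): everything is public.
// Otherwise (simplified assumption): public only when data-export is present.
function assignAccess(dfns) {
  const legacy = !followsDfnModel(dfns);
  return dfns.map(dfn => {
    const exported = 'data-export' in dfn.attributes;
    return {
      linkingText: dfn.linkingText,
      access: (legacy || exported) ? 'public' : 'private'
    };
  });
}

// Example: a spec whose definitions carry no model attribute at all.
const legacySpecDfns = [
  { linkingText: 'first term', attributes: {} },
  { linkingText: 'second term', attributes: { id: 'second-term' } }
];
console.log(assignAccess(legacySpecDfns));
```

The point of peeking at all definitions first is that the decision is made once per spec, so a single `data-dfn-type` attribute anywhere is enough to keep the regular extraction rules for the whole document.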

This makes a few existing dfns public that used to be private, for instance in TC39 specs and css-page-4 (see below for details), but that is precisely what we want!

This would fix #1117 the lazy way, meaning without introducing extra crawling options ;)

css-page-4 - https://drafts.csswg.org/css-page-4/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://drafts.csswg.org/css-page-4/'
  }
}
css-style-attr - https://drafts.csswg.org/css-style-attr/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
...
    url: 'https://drafts.csswg.org/css-style-attr/'
  }
}
ecma-402 - https://tc39.es/ecma402/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://tc39.es/ecma402/'
  }
}
html - https://html.spec.whatwg.org/multipage/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
      heading: {
+         href: 'https://html.spec.whatwg.org/multipage/infrastructure.html#parallelism',
+         id: 'parallelism',
+         number: '2.1.1',
+         title: 'Parallelism'
-         href: 'https://html.spec.whatwg.org/multipage/introduction.html#typographic-conventions',
-         id: 'typographic-conventions',
-         number: '1.9.2',
-         title: 'Typographic conventions'
      },
+       href: 'https://html.spec.whatwg.org/multipage/infrastructure.html#in-parallel',
+       id: 'in-parallel',
-       href: 'https://html.spec.whatwg.org/multipage/introduction.html#x-that',
-       id: 'x-that',
      informative: false,
      linkingText: [
+         'in parallel'
-         'this'
      ],
      localLinkingText: [],
      type: 'dfn'
    },
    {
+       access: 'private',
-       access: 'public',
      definedIn: 'prose',
      for: [],
...
        title: 'Parallelism'
      },
+       href: 'https://html.spec.whatwg.org/multipage/infrastructure.html#immediately',
+       id: 'immediately',
-       href: 'https://html.spec.whatwg.org/multipage/infrastructure.html#in-parallel',
-       id: 'in-parallel',
      informative: false,
      linkingText: [
+         'immediately'
-         'in parallel'
      ],
      localLinkingText: [],
      type: 'dfn'
    },
    {
+       access: 'public',
...
-       access: 'private',
...
rfc7231 - https://httpwg.org/specs/rfc7231.html
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://httpwg.org/specs/rfc7231.html'
  }
}
rfc9110 - https://httpwg.org/specs/rfc9110.html
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://httpwg.org/specs/rfc9110.html'
  }
}
svg-integration - https://svgwg.org/specs/integration/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'dt',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://svgwg.org/specs/integration/'
  }
}
svg-strokes - https://svgwg.org/specs/strokes/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'table',
      for: [],
...
    },
    {
+       access: 'public',
...
-       access: 'private',
...
SVG2 - https://svgwg.org/svg2-draft/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
...
    },
    {
+       access: 'public',
+       definedIn: 'table',
+       for: [
+         'SVGLength'
+       ],
+       heading: {
+         href: 'https://svgwg.org/svg2-draft/types.html#InterfaceSVGLength',
+         id: 'InterfaceSVGLength',
+         number: '4.5.2',
+         title: 'Interface SVGLength'
+       },
+       href: 'https://svgwg.org/svg2-draft/types.html#__svg__SVGLength__SVG_LENGTHTYPE_NUMBER',
+       id: '__svg__SVGLength__SVG_LENGTHTYPE_NUMBER',
+       informative: false,
+       linkingText: [
+         'SVG_LENGTHTYPE_NUMBER'
+       ],
+       localLinkingText: [],
+       type: 'const'
+     },
+     {
+       access: 'public',
+       definedIn: 'table',
+       for: [
...
-       access: 'private',
-       definedIn: 'prose',
-       for: [],
-       heading: {
-         href: 'https://svgwg.org/svg2-draft/types.html#InterfaceSVGLength',
-         id: 'InterfaceSVGLength',
-         number: '4.5.2',
-         title: 'Interface SVGLength'
-       },
-       href: 'https://svgwg.org/svg2-draft/types.html#LengthValue',
-       id: 'LengthValue',
-       informative: false,
-       linkingText: [
-         'value'
-       ],
-       localLinkingText: [],
-       type: 'dfn'
-     },
-     {
-       access: 'public',
-       definedIn: 'table',
-       for: [
-         'SVGLength'
-       ],
...
tc39-decorators - https://tc39.es/proposal-decorators/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://tc39.es/proposal-decorators/'
  }
}
tc39-import-assertions - https://tc39.es/proposal-import-assertions/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://tc39.es/proposal-import-assertions/'
  }
}
tc39-intl-enumeration - https://tc39.es/proposal-intl-enumeration/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://tc39.es/proposal-intl-enumeration/'
  }
}
tc39-intl-numberformat - https://tc39.es/proposal-intl-numberformat-v3/out/numberformat/proposed.html
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://tc39.es/proposal-intl-numberformat-v3/out/numberformat/proposed.html'
  }
}
tc39-shadowrealm - https://tc39.es/proposal-shadowrealm/
Expected values to be strictly deep-equal:
+ actual - expected ... Lines skipped

{
  dfns: [
    {
+       access: 'public',
-       access: 'private',
      definedIn: 'prose',
...
    url: 'https://tc39.es/proposal-shadowrealm/'
  }
}

@dontcallmedom (Member) commented:

That's a wonderfully elegant solution to #1117! I'm surprised to see the HTML spec in the list of impacted specs, though, since it's definitely using the dfn model. Can you comment on that?

@tidoust (Member, Author) commented Nov 25, 2022

Ah, I forgot to remove it from the diff. It is due to the removal of duplicate dfns, which was added in #1110 but is not yet integrated in a released version of Reffy (and so not yet integrated in webref). The Typographic conventions section in HTML defines "this" twice to show how definitions get marked. The second definition is now ignored by Reffy (but still appears in webref data). In an ideal world, we would probably ignore these fake definitions altogether but that seems like a good topic for another day ;)
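
For illustration, ignoring the second of two identical definitions could look like the sketch below. It is not the code from #1110: the function name and the choice of `type`, `for` and `linkingText` as the duplicate key are assumptions, and the `x-this` id in the example is hypothetical (only `x-that` appears in the diff above).

```javascript
// Keeps the first definition for each (type, for, linkingText) combination
// and drops later ones. The key choice is an assumption for this sketch.
function dropDuplicateDfns(dfns) {
  const seen = new Set();
  return dfns.filter(dfn => {
    const key = JSON.stringify([dfn.type, dfn.for, dfn.linkingText]);
    if (seen.has(key)) {
      return false; // Same term already extracted: ignore this one
    }
    seen.add(key);
    return true;
  });
}

// Example modelled on the "this" term defined twice in the HTML spec.
const typographicDfns = [
  { id: 'x-this', type: 'dfn', for: [], linkingText: ['this'] }, // hypothetical id
  { id: 'x-that', type: 'dfn', for: [], linkingText: ['this'] }
];
console.log(dropDuplicateDfns(typographicDfns).map(dfn => dfn.id));
```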

@tidoust tidoust merged commit f97baf1 into main Nov 25, 2022
@tidoust tidoust deleted the dfn-auto-legacy branch November 25, 2022 09:10
@dontcallmedom dontcallmedom linked an issue Nov 25, 2022 that may be closed by this pull request
tidoust added a commit that referenced this pull request Nov 27, 2022
New features:
- Auto-legacy mode for dfns extraction (#1120)
- Prevent duplicates in dfns extracts (#1110)

Dependencies bumped:
- Bump rollup from 3.2.5 to 3.5.0 (#1113, #1116, #1122)
- Bump web-specs from 2.35.0 to 2.36.0 (#1121)
- Bump minimatch from 3.0.4 to 3.1.2 (#1118)
- Bump puppeteer from 19.2.2 to 19.3.0 (#1119)
- Bump ajv from 8.11.0 to 8.11.2 (#1112)
Development

Successfully merging this pull request may close these issues.

Consider flag to force export of dfns in legacy specs