URLs within specification HTML file are not "memorable" #533

Open
davidmalcolm opened this issue Jun 16, 2022 · 6 comments
Labels
2.2.0 editorial Purely editorial.

Comments

@davidmalcolm

davidmalcolm commented Jun 16, 2022

I'm working on a SARIF consumer: updating GCC so that it can read SARIF files and replay their diagnostics in GCC's own output formats (as opposed to issue #531, which concerns GCC as a SARIF producer).

My code can issue complaints about malformed SARIF files, e.g.:

invalid.sarif:6:20: error: not enough strings in ‘arguments’ array for placeholder ‘{2}’ [SARIF v2.1.0 §3.11.11]
    6 |       { "message": { "text" : "the {0} {1} fox jumps over the {2} dog", "arguments": ["quick", "brown"] } }
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

where I'm capturing the fact that this file violates [SARIF v2.1.0 §3.11.11].

I'd like to be able to provide a URL for the specific part of the specification that is violated, both as a hyperlink in the output and as the reportingDescriptor.helpUri property (§3.49.12) when reporting the violation in SARIF form.

Unfortunately, looking at:
https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html
the HTML for that section is:

<h3><a name="_Toc34317472"></a><a name="_Toc33187355"></a><a
name="_Ref508811093">3.11.11 arguments property</a></h3>

i.e. the anchor for section 3.11.11 is named "_Ref508811093". The anchor names appear to have been auto-generated by an output tool rather than chosen to be meaningful.
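
To make the problem concrete, here is a minimal Python sketch of the helpUri a consumer would have to emit today. The spec URL and the anchor name come from the text above; the rule id "SARIF-3.11.11" and the helper make_descriptor are hypothetical names chosen for illustration, and the dict holds only the two properties relevant here rather than a complete reportingDescriptor.

```python
import json

# Spec URL and anchor name are taken from the issue text above.
SPEC_URL = "https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html"
ANCHOR_3_11_11 = "_Ref508811093"  # auto-generated anchor for section 3.11.11


def make_descriptor(rule_id: str, anchor: str) -> dict:
    """Build a minimal reportingDescriptor-like dict pointing at a spec section."""
    return {
        "id": rule_id,
        "helpUri": f"{SPEC_URL}#{anchor}",
    }


# "SARIF-3.11.11" is a hypothetical rule id, not something the spec defines.
descriptor = make_descriptor("SARIF-3.11.11", ANCHOR_3_11_11)
print(json.dumps(descriptor, indent=2))
```

The resulting helpUri ends in "#_Ref508811093", which tells a reader nothing about which section it refers to.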

Is there a way to fix this? Are there canonical URIs for the various parts of the specification?

See e.g. https://www.w3.org/Provider/Style/URI.html

@rillig

rillig commented Sep 29, 2022

Additional question: Are the _Toc anchors from cos01, cos02 and os related in any way?

For a standards document that has excellent section anchors and IDs, see the C++ programming language.

@sthagen
Contributor

sthagen commented Oct 1, 2022

Why not use a map in the implementation language which should be easily derivable from the spec html (table of contents)?
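
A rough Python sketch of what such a map could look like, assuming the spec headings all follow the pattern quoted in the first comment (a heading element containing `<a name="...">` anchors and a title that starts with the section number). The class HeadingAnchorParser and the helper section_url are hypothetical names, and picking the last anchor of each heading is an arbitrary choice; titles that do not start with a digit, such as appendix headings, are skipped by this sketch.

```python
import re
from html.parser import HTMLParser

SPEC_URL = "https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html"


class HeadingAnchorParser(HTMLParser):
    """Collect the <a name="..."> anchors found inside each numbered heading."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.anchors = []
        self.text = []
        self.sections = {}  # section number -> list of anchor names

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.in_heading = True
            self.anchors, self.text = [], []
        elif tag == "a" and self.in_heading:
            name = dict(attrs).get("name")
            if name:
                self.anchors.append(name)

    def handle_data(self, data):
        if self.in_heading:
            self.text.append(data)

    def handle_endtag(self, tag):
        if re.fullmatch(r"h[1-6]", tag) and self.in_heading:
            self.in_heading = False
            # Collapse whitespace, then look for a leading number like "3.11.11".
            title = " ".join("".join(self.text).split())
            match = re.match(r"(\d+(?:\.\d+)*)\s", title)
            if match:
                self.sections[match.group(1)] = self.anchors


def section_url(sections, number):
    """Return a URL for a section, using the last anchor in its heading."""
    return f"{SPEC_URL}#{sections[number][-1]}"


# Heading markup copied from the first comment in this issue.
sample = '''<h3><a name="_Toc34317472"></a><a name="_Toc33187355"></a><a
name="_Ref508811093">3.11.11 arguments property</a></h3>'''

parser = HeadingAnchorParser()
parser.feed(sample)
parser.close()
print(section_url(parser.sections, "3.11.11"))
```

A map built this way is only valid for the exact HTML file it was derived from; it would need to be rebuilt if the anchors were ever regenerated.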

@rillig

rillig commented Oct 1, 2022

> Why not use a map in the implementation language which should be easily derivable from the spec html (table of contents)?

When I'm looking at a URL, I want to know exactly what this URL is about. The anchor _Toc33187355 doesn't convey any information. In contrast, the sections in the C++ standard have helpful names like array.creation that are human-friendly and most probably stable over time.

If OASIS ever decides to regenerate the HTML version of the specification, all existing references to its sections will break. That is not the case for the C++ standard, whose anchors are human-centered and give a strong hint about what they refer to. As long as the content of the C++ standard stays the same, these anchors will survive. The anchors in the OASIS standards, by contrast, already differ between the os version and the previous drafts.

These stable URLs (and, more generally, stable identifiers) are what Tim Berners-Lee described in the Cool URIs article.

> Why not use a map in the implementation language which should be easily derivable from the spec html (table of contents)?

A map in one implementation would be useful only to that implementation. Every other consumer would have to build its own, which limits the usefulness of the OASIS standard.

By the way, POSIX has made the same mistake by using anchors of the form #tag_20_73_04, with the magic numbers 20, 73 and 04. While not as random as the _Toc anchors generated by Microsoft Word, these magic numbers are not stable over time either.

@sthagen
Contributor

sthagen commented Oct 2, 2022

That was the point: Being helpful for the case brought up.
There are OASIS specifications that contain more semantic links, which I like (as editor and reader), but that is because they use a different source format (markdown).
Still, the general assumption that such links will never break is hard to uphold across versions.
The semantic versioning does not match all use cases and even then with a new major version (in the semantic version lingo) anything goes (embrace change).

That being "said": The members of the committee may always consider and adopt such a request for semantic links / anchors.
This is why I labeled the issue as "question".

@michaelcfanning michaelcfanning added 2.2.0 editorial Purely editorial. and removed question labels Jul 13, 2023
@michaelcfanning
Contributor

Let's put this in the editorial backlog for action in 2.2. Clearly an opportunity to do a better job. I have personally experienced a fair amount of pain obtaining links to various doc sections. We're moving to markdown, it appears, and will have more control over the design here.

@schlaman-ms

Document location for issue:

Generic issue - no specific location.
