Rename "XHTML parsing" etc to more-accurate "XML parsing" #2062

Merged on Nov 25, 2016 (4 commits)

sideshowbarker
Contributor

See #2056 (comment)

The current Parsing XHTML documents, Serializing XHTML fragments, and Parsing XHTML fragments sections define requirements for XML processing, not anything specific to “XHTML”.

And in general continued use of the term “the XHTML syntax” buys us nothing at this point, so the change in this PR replaces all references to that with just “XML”.

When most authors see the term “XHTML document” they think about XHTML1, not about anything we define in the current HTML spec. It’s ambiguous. So this change clears away that ambiguity.

@domenic
Member

domenic commented Nov 18, 2016

I think there's a worthy goal here in clarifying that certain parts of the spec currently say "XHTML" but are meant to apply more broadly to all XML in web browsers. However, I'm not sure I'd go as far as this PR does in attempting to eradicate the concept of XHTML entirely. (Also I think it's worth being careful about how you phrase things to avoid pitchfork-wielding mobs; you're not "retiring XHTML" in the sense of asking browsers to get rid of it; you're phasing out the name XHTML.)

In my opinion the name XHTML is still useful, to clarify that we're talking about a specific XML vocabulary. To the extent people think about standards at all when using the word XHTML, I disagree that they think about XHTML1; I hope that most people are aware that the HTML Standard is what defines XHTML these days as something that swallowed both the HTML4 and XHTML1 efforts.

Concretely in terms of this PR, I think most of the changes then are not good, as they just remove specificity. It requires a lot more care to go through and find which parts of the spec are actually talking about XML in general as opposed to the specific XHTML vocabulary. At a glance, the "parsing XHTML documents" and "serializing XHTML fragments" sections are the most obvious changes, but most of the others I find in the PR I'd rather not change.

@domenic
Member

domenic commented Nov 18, 2016

> In my opinion the name XHTML is still useful, to clarify that we're talking about a specific XML vocabulary.

Stated another way, the changes in this PR seem tantamount to me to replacing all mentions of "SVG" with "XML".

@annevk
Member

annevk commented Nov 18, 2016

I don't think it's quite the same as that. The vocabulary is HTML and there's an HTML and an XML syntax. This is true for SVG too. The vocabulary is SVG and there's an HTML and an XML syntax.

Giving the vocabulary a different name in XML is rather weird and mostly a historical thing because it was new at the time and the focus back then was much more on syntax than systems.

@domenic
Member

domenic commented Nov 18, 2016

I see your point. Maybe it's just what I'm used to, but I still think there's value in having a simple name like "XHTML" for "the XML syntax of HTML", and using that in various sections in the spec.

@sideshowbarker sideshowbarker changed the title Retire XHTML, rename "XHTML parsing" etc to "XML…" Rename "XHTML parsing" etc to "XML…" Nov 19, 2016
@sideshowbarker sideshowbarker changed the title Rename "XHTML parsing" etc to "XML…" Rename "XHTML parsing" etc to more-accurate "XML parsing" Nov 19, 2016
@sideshowbarker
Contributor Author

> Stated another way, the changes in this PR seem tantamount to me to replacing all mentions of "SVG" with "XML".

It seems to me that’s not apt, for the same reasons pointed out in #2062 (comment).

To me it seems the way we’re using the “XHTML” label is similar to using a special name like “HSVG” or whatever to label the case where SVG is embedded in a text/html document rather than an XML one.

> I think it's worth being careful about how you phrase things to avoid pitchfork-wielding mobs; you're not "retiring XHTML" in the sense of asking browsers to get rid of it; you're phasing out the name XHTML.

You’re right—I’ve changed the issue title now to be less provocative. (And thanks for calling me on that—it’s counterproductive to be flame-baiting here.)

> I still think there's value in having a simple name like "XHTML" for "the XML syntax of HTML", and using that in various sections in the spec.

I strongly agree there would be if we had evidence showing we have a lot of authors who are actually doing that. But the evidence we actually have shows the opposite.

I’ll post another comment with the details, but in the meantime, I want to explicitly assert something implicit in me raising this PR to begin with, which is: I don’t think the underlying use case is common enough or important enough to merit us continuing to give it a special label.

@domenic
Member

domenic commented Nov 22, 2016

To close the loop here, I withdraw my objection, but I don't want to be the one reviewing this since my instinct is just to leave things as they are, so I'm not a good judge of whether each of the changes is correct.

I think we may want to be more careful about preserving the auto-generated IDs, however. And changing the split-filename has a lot of consequences; either we need to create a redirect, or we should just leave it as-is as a historical note.

Member

@zcorpan zcorpan left a comment

In general I think this is fine, but the prose needs some tweaks here and there and we shouldn't change filenames or ids.

@@ -560,7 +560,7 @@



<h3>HTML vs XHTML</h3>
<h3>HTML vs XML</h3>
Member

I think this changes the expectation about what this section is about, from "HTML as text/html vs HTML as XML" to "HTML vs XML the meta language".

Member

Maybe "HTML vs XML syntax"?

Member

I think in this section we should add a note that HTML's XML syntax was formerly known as XHTML, but that we decided to abandon that terminology since it does not exist for MathML and SVG either.

Member

I think as we transition it's important to acknowledge that XHTML was indeed a thing of sorts.

Contributor Author

-  <h3>HTML vs XHTML</h3>
+  <h3>HTML vs XML</h3>

> I think this changes the expectation about what this section is about, from "HTML as text/html vs HTML as XML" to "HTML vs XML the meta language".

Agreed yeah the change is misleading as is.

> Maybe "HTML vs XML syntax"?

I changed it to “HTML syntax vs XML syntax”.

> I think in this section we should add a note that HTML's XML syntax was formerly known as XHTML, but that we decided to abandon that terminology since it does not exist for MathML and SVG either.

I added a note saying that.

@@ -577,19 +577,19 @@
<code>text/html</code> <span>MIME type</span>, then it will be processed as an HTML document by
Web browsers. This specification defines the latest HTML syntax, known simply as "HTML".</p>

<p>The second concrete syntax is the XHTML syntax, which is an application of XML. When a document
<p>The second concrete syntax is XML. When a document
Member

This is fine.

is transmitted with an <span>XML MIME type</span>, such as <code>application/xhtml+xml</code>,
then it is treated as an XML document by Web browsers, to be parsed by an XML processor. Authors
are reminded that the processing for XML and HTML differs; in particular, even minor syntax errors
will prevent a document labeled as XML from being rendered fully, whereas they would be ignored in
the HTML syntax. This specification defines the latest XHTML syntax, known simply as "XHTML".</p>
the HTML syntax.</p>
Member

fine
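
To make the error-handling difference in the paragraph above concrete, here is a minimal sketch. It uses Python's standard library as a stand-in for browser parsers, which is an assumption: `html.parser` is a lenient tokenizer, not the HTML Standard's parsing algorithm, and the helper names are hypothetical. The same markup with an unclosed `<p>` is rejected outright by the XML parser but still tokenized by the lenient HTML one.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Markup with a minor error: <p> is never closed.
MARKUP = "<html><body><p>Hello</body></html>"


def parses_as_xml(text):
    """Return True if `text` is well-formed XML, False on any syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient tokenizer: records start tags and ignores missing end tags."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(text):
    """Feed `text` to the lenient tokenizer and return the start tags it saw."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags
```

Calling `parses_as_xml(MARKUP)` reports a failure, while `html_start_tags(MARKUP)` still sees all three start tags.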


<p>The DOM, the HTML syntax, and the XHTML syntax cannot all represent the same content. For
<p>The DOM, the HTML syntax, and XML cannot all represent the same content. For
Member

fine

and in the XHTML syntax. Similarly, documents that use the <code>noscript</code> feature can be
represented using the HTML syntax, but cannot be represented with the DOM or in the XHTML syntax.
and in XML. Similarly, documents that use the <code>noscript</code> feature can be
represented using the HTML syntax, but cannot be represented with the DOM or in XML.
Member

fine
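
As a rough illustration of the point that namespaces are representable in the XML syntax, the sketch below parses a small document with Python's `xml.etree.ElementTree` (a stand-in chosen for illustration, not something this PR uses). The sample document and the `qualified_tags` helper are hypothetical. ElementTree exposes each element name as `{namespace}local-name`, so the namespace declared in the markup is visible on every element.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample: HTML elements with an embedded SVG subtree.
DOC = (
    f'<html xmlns="{XHTML_NS}"><body>'
    f'<svg xmlns="{SVG_NS}"><circle r="1"/></svg>'
    "</body></html>"
)


def qualified_tags(xml_text):
    """Return each element's tag in ElementTree's '{namespace}local' form."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]
```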

@@ -2469,10 +2462,6 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

<dd>

<p>Implementations that support <span>the XHTML syntax</span> must support some version of XML,
as well as its corresponding namespaces specification, because that syntax uses an XML
serialization with namespaces. <ref spec=XML> <ref spec=XMLNS></p>
Member

I suppose we still want to require support for Namespaces in XML for UAs that support XML?

Member

Yes, I don't think we should remove this. In fact, we have numerous dependencies on XML/XMLNS in the platform. I doubt it's really optional.

Contributor Author

-    <p>Implementations that support <span>the XHTML syntax</span> must support some version of XML,
-    as well as its corresponding namespaces specification, because that syntax uses an XML
-    serialization with namespaces. <ref spec=XML> <ref spec=XMLNS></p>

> I suppose we still want to require support for Namespaces in XML for UAs that support XML?

> Yes, I don't think we should remove this. In fact, we have numerous dependencies on XML/XMLNS in the platform. I doubt it's really optional.

Restored



<div w-nodev>
<h2 split-filename="xml"><dfn>XML</dfn></h2>
Member

Don't change the filename or the id.

Contributor Author

+  <h2 split-filename="xml"><dfn>XML</dfn></h2>

> Don't change the filename or the id.

OK, restored those

is unsafe if they are defined in an external file (except for <code data-x="">&amp;lt;</code>,
<code data-x="">&amp;gt;</code>, <code data-x="">&amp;amp;</code>, <code data-x="">&amp;quot;</code>
and <code data-x="">&amp;apos;</code>).</p>


<div w-nodev>

<h3>Parsing XHTML documents</h3>
<h3>Parsing XML documents</h3>
Member

id="parsing-xhtml-documents"

Contributor Author

> id="parsing-xhtml-documents"

Added
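
The reason for pinning `id="parsing-xhtml-documents"` can be sketched as follows. This assumes heading IDs are derived by lowercasing the heading text and hyphenating it, which matches the IDs quoted in this thread but is not taken from the actual build tooling; `auto_id` and `resolve_id` are hypothetical names. Under that assumption, renaming a heading silently changes its ID unless an explicit one is supplied.

```python
import re


def auto_id(heading_text):
    """Hypothetical ID generator: lowercase, runs of non-alphanumerics become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", heading_text.lower()).strip("-")


def resolve_id(heading_text, explicit_id=None):
    """An explicit id wins, so existing links keep working after a rename."""
    return explicit_id if explicit_id is not None else auto_id(heading_text)
```

With this sketch, `auto_id("Parsing XML documents")` no longer yields the old fragment, while `resolve_id("Parsing XML documents", "parsing-xhtml-documents")` keeps it.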


<!--//HTMLPARSER-->


<!--en-GB--><h3 id="serialising-xhtml-fragments">Serializing XHTML fragments</h3>
<!--en-GB--><h3 id="serialising-xml-fragments">Serializing XML fragments</h3>
Member

don't change the id

Contributor Author

-  <!--en-GB--><h3 id="serialising-xhtml-fragments">Serializing XHTML fragments</h3>
+  <!--en-GB--><h3 id="serialising-xml-fragments">Serializing XML fragments</h3>

> don't change the id

OK, reverted the id

@@ -109247,7 +109219,7 @@ Hello.&lt;/pre></pre>



<h3>Parsing XHTML fragments</h3>
<h3>Parsing XML fragments</h3>
Member

id

Contributor Author

-  <h3>Parsing XHTML fragments</h3>
+  <h3>Parsing XML fragments</h3>

> id

Added

@annevk
Member

annevk commented Nov 23, 2016

I'm also in favor of this. I added some minor comments on top of those of @zcorpan.

@domenic domenic added the clarification Standard could be clearer label Nov 23, 2016
Member

@annevk annevk left a comment

Looking good.

@@ -560,7 +560,7 @@



<h3>HTML vs XML</h3>
<h3>HTML syntax vs XML syntax</h3>
Member

We should preserve the ID here. Also, "HTML vs XML syntax" seems more natural?

Contributor Author

-  <h3>HTML vs XML</h3>
+  <h3>HTML syntax vs XML syntax</h3>

> We should preserve the ID here.

oofs, yeah—fixed

> Also, "HTML vs XML syntax" seems more natural?

Yes, quite clear in the context—changed to that

is transmitted with an <span>XML MIME type</span>, such as <code>application/xhtml+xml</code>,
then it is treated as an XML document by Web browsers, to be parsed by an XML processor. Authors
are reminded that the processing for XML and HTML differs; in particular, even minor syntax errors
will prevent a document labeled as XML from being rendered fully, whereas they would be ignored in
the HTML syntax.</p>

<p class="note">The XML syntax for HTML was formerly referred to as "XHTML", but this
specification does not use that term (among other reasons, because no corresponding term is used
for the cases of MathML and SVG).</p>
Member

Maybe "no such term is used for the HTML syntaxes of MathML and SVG"?

Contributor Author

> Maybe "no such term is used for the HTML syntaxes of MathML and SVG"?

OK—changed it to that

@@ -577,19 +577,23 @@
<code>text/html</code> <span>MIME type</span>, then it will be processed as an HTML document by
Web browsers. This specification defines the latest HTML syntax, known simply as "HTML".</p>

<p>The second concrete syntax is the XHTML syntax, which is an application of XML. When a document
<p id="the-xml-syntax">The second concrete syntax is XML. When a document
Member

Why was this ID added by the way?

Contributor Author

Just so there would be something to point to for anybody who wanted a specific reference for the XML syntax. But it’s not strictly necessary and not used internally, so I can just remove it if you think we should.

Member

Yeah, let's remove it then. There's a couple of sections that can be referenced with exposed IDs.

Contributor Author

@sideshowbarker sideshowbarker Nov 24, 2016

id="the-xml-syntax"

> Yeah, let's remove it then. There's a couple of sections that can be referenced with exposed IDs.

OK, removed

@sideshowbarker
Contributor Author

> I think we may want to be more careful about preserving the auto-generated IDs, however.

Yeah I shouldn’t have changed those to begin with. But I think we got them all reverted in review.

> And changing the split-filename has a lot of consequences; either we need to create a redirect, or we should just leave it as-is as a historical note.

Yeah, undid that as well

Member

@annevk annevk left a comment

It's not clear whether these nits are worth fixing, but I'm not sure why we should change the tone from the original as well.

example, namespaces cannot be represented using the HTML syntax, but they are supported in the DOM
and in the XHTML syntax. Similarly, documents that use the <code>noscript</code> feature can be
represented using the HTML syntax, but cannot be represented with the DOM or in the XHTML syntax.
and in XML. Similarly, documents that use the <code>noscript</code> feature can be
Member

Should this not be "the XML syntax"? This happens here and a couple times below where what used to say XHTML syntax is now just XML.

Contributor Author

> Should this not be "the XML syntax"? This happens here and a couple times below where what used to say XHTML syntax is now just XML.

Yup, so changed

and in the XHTML syntax. Similarly, documents that use the <code>noscript</code> feature can be
represented using the HTML syntax, but cannot be represented with the DOM or in the XHTML syntax.
and in XML. Similarly, documents that use the <code>noscript</code> feature can be
represented using the HTML syntax, but cannot be represented with the DOM or in XML.
Member

For instance, here.

Contributor Author

got this one

Comments that contain the string "<code data-x="">--&gt;</code>" can only be represented in the
DOM, not in the HTML and XHTML syntaxes.</p>
DOM, not in the HTML syntax or XML.</p>
Member

And here. Should be "in the HTML and XML syntaxes" I think?

Contributor Author

got it and the rest too
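
The sentence under review, that a comment containing `-->` can exist in the DOM but not in either syntax, can be demonstrated with a small sketch. It uses Python's `xml.dom.minidom` as a stand-in DOM (an assumption; browsers are the real target here), and the two helper functions are hypothetical. The comment node is created and stored without complaint, but serializing the tree fails because minidom refuses to write comment data containing `--`.

```python
from xml.dom.minidom import Document


def build_document_with_comment(comment_text):
    """Build a tiny DOM tree whose root element holds one comment node."""
    doc = Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    root.appendChild(doc.createComment(comment_text))
    return doc


def can_serialize(doc):
    """Return True if minidom can write the tree out as XML text."""
    try:
        doc.toxml()
        return True
    except ValueError:
        # minidom refuses to write a comment whose data contains "--".
        return False
```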

@@ -1855,16 +1859,15 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
as <dfn data-x="">object properties</dfn> and <dfn data-x="">CSS properties</dfn> respectively.</p>

<p>Generally, when the specification states that a feature applies to <span>the HTML syntax</span>
or <span>the XHTML syntax</span>, it also includes the other. When a feature specifically only
or <span>XML</span>, it also includes the other. When a feature specifically only
Member

And here.


<p>This specification uses the term <dfn data-x="">document</dfn> to refer to any use of HTML,
ranging from short static documents to long essays or reports with rich multimedia, as well as to
fully-fledged interactive applications. The term is used to refer both to <code>Document</code>
objects and their descendant DOM trees, and to serialized byte streams using the <span data-x="the
HTML syntax">HTML syntax</span> or <span data-x="the XHTML syntax">XHTML syntax</span>, depending
on context.</p>
HTML syntax">HTML syntax</span> or <span>XML</span>, depending on context.</p>
Member

And here.

@zcorpan zcorpan dismissed their stale review November 24, 2016 12:21

Happy with this when Anne's nits are fixed

@zcorpan zcorpan removed their assignment Nov 24, 2016
@zcorpan
Member

zcorpan commented Nov 24, 2016

Oh, can you re-wrap to 100 cols also?

@sideshowbarker
Contributor Author

> Oh, can you re-wrap to 100 cols also?

yup, will do right now



<div w-nodev>
<h2 split-filename="xhtml"><dfn id="xhtml">XML</dfn></h2>
Member

Ah, so if we keep this as "The XML syntax", and use <span>the XML syntax</span> above, it would be all consistent again. So I think we want to do that too.

Contributor Author

+  <h2 split-filename="xhtml"><dfn id="xhtml">XML</dfn></h2>

Yup, so changed

<p>The above technique is also useful in XHTML, since <code>noscript</code> is not supported in
<span>the XHTML syntax</span>.</p>
<p>The above technique is also useful in XML, since <code>noscript</code> is not supported in
XML.</p>
Member

<span>the XML syntax</span>.

Contributor Author

got it

is by essentially "turning off" the parser when scripts are enabled, so that the contents of the
element are treated as pure text and not as real elements. XML does not define a mechanism by
which to do this.</p>
syntax</span>, it has no effect in XML. This is because the way it works is by essentially
Member

<span>the XML syntax</span>

Contributor Author

got it

Comments that contain the string "<code data-x="">--&gt;</code>" can only be represented in the
DOM, not in the HTML and XHTML syntaxes.</p>
DOM, not in the HTML syntax or the XML syntax.</p>
Member

"HTML and XML syntaxes"

or <span>the XHTML syntax</span>, it also includes the other. When a feature specifically only
applies to one of the two languages, it is called out by explicitly stating that it does not apply
to the other format, as in "for HTML, ... (this does not apply to XHTML)".</p>
or the XML syntax, it also includes the other. When a feature specifically only applies to one of
Member

<span>the XML syntax</span>


<p>This specification uses the term <dfn data-x="">document</dfn> to refer to any use of HTML,
ranging from short static documents to long essays or reports with rich multimedia, as well as to
fully-fledged interactive applications. The term is used to refer both to <code>Document</code>
objects and their descendant DOM trees, and to serialized byte streams using the <span data-x="the
HTML syntax">HTML syntax</span> or <span data-x="the XHTML syntax">XHTML syntax</span>, depending
on context.</p>
HTML syntax">HTML syntax</span> or the XML syntax, depending on context.</p>
Member

<span data-x="the XML syntax">XML syntax</span> (no need to change this from the original either)

from the <span>HTML namespace</span> found in XML documents as described in this specification,
so that users can interact with them, unless the semantics of those elements have been
overridden by other specifications.</p>
<p>Web browsers that support XML must process elements and attributes from the <span>HTML
Member

<span>the XML syntax</span>

using a <a href="#writing">custom format</a> inspired by SGML (referred to as <span>the HTML
syntax</span>). Implementations must support at least one of these two formats, although
supporting both is encouraged.</p>
two authoring formats: one based on <span data-x="xhtml">XML</span>, and one using a <a
Member

data-x="the XML syntax"

@@ -2469,8 +2467,8 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute

<dd>

<p>Implementations that support <span>the XHTML syntax</span> must support some version of XML,
as well as its corresponding namespaces specification, because that syntax uses an XML
<p>Implementations that support the XML syntax for HTML must support some version of XML, as
Member

<span>the XML syntax</span>

@@ -9207,8 +9205,8 @@ partial interface <dfn id="document" data-lt="">Document</dfn> {

<p id="no-browsing-context">DOM nodes whose <span>node document</span> does not have a
<span>browsing context</span> are exempt from all document conformance requirements other than the
<a href="#writing">HTML syntax</a> requirements and <a href="#writing-xhtml-documents">XHTML
syntax</a> requirements.</p>
<a href="#writing">HTML syntax</a> requirements and <span data-x="xhtml">XML syntax</span>
Member

I think we should reinstate the "writing-xhtml-documents" ID and subsection (and link it from here).

Contributor Author

> I think we should reinstate the "writing-xhtml-documents" ID and subsection (and link it from here).

OK restored the link and restored the heading as:

<h3 id="writing-xhtml-documents">Writing documents in the XML syntax</h3>

is by essentially "turning off" the parser when scripts are enabled, so that the contents of the
element are treated as pure text and not as real elements. XML does not define a mechanism by
which to do this.</p>
syntax</span>, it has no effect in <span data-x="xhtml">the XML syntax</span>. This is because the
Member

data-x="the XML syntax"

<p>The above technique is also useful in XHTML, since <code>noscript</code> is not supported in
<span>the XHTML syntax</span>.</p>
<p>The above technique is also useful in <span data-x="xhtml">the XML syntax</span>, since
<code>noscript</code> is not supported in <span data-x="xhtml">the XML syntax</span>.</p>
Member

not supported there*

@@ -98404,8 +98402,8 @@ dictionary <dfn>StorageEventInit</dfn> : <span>EventInit</span> {
<h2 split-filename="syntax" id="syntax"><dfn>The HTML syntax</dfn></h2>

<p class="note">This section only describes the rules for resources labeled with an <span>HTML
MIME type</span>. Rules for XML resources are discussed in the section below entitled "<span>The
XHTML syntax</span>".</p>
MIME type</span>. Rules for XML resources are discussed in the <span data-x="xhtml">XML
Member

data-x="the XML syntax"

Member

@annevk annevk left a comment

Many nits left relative to the original text and a couple of referencing errors as far as I can tell.


<p class="note">This section only describes the rules for XML resources. Rules for
<code>text/html</code> resources are discussed in the section above entitled "<span>The HTML
syntax</span>".</p>
Member

I don't see a reason to remove this note or the "Writing XML documents" section following it.

Contributor Author

-  <p class="note">This section only describes the rules for XML resources. Rules for
-  <code>text/html</code> resources are discussed in the section above entitled "<span>The HTML
-  syntax</span>".</p>

> I don't see a reason to remove this note or the "Writing XML documents" section following it.

OK—restored both

@annevk
Member

annevk commented Nov 24, 2016

LGTM, but maybe someone should give it another pass with fresh eyes.

This change replaces references throughout the spec to “the XHTML
syntax” and “XHTML parsing”, etc., with references instead to “the XML
syntax [of HTML]” and “XML parsing”, while adding a couple of notes to
help make clear that the term “the XML syntax” is the same thing the
term “XHTML” was formerly used for.
@sideshowbarker
Contributor Author

> LGTM, but maybe someone should give it another pass with fresh eyes.

To make that easier, I went ahead and squashed the commits so there’s only one diff to look at.

@@ -1852,15 +1856,15 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
as <dfn data-x="">object properties</dfn> and <dfn data-x="">CSS properties</dfn> respectively.</p>

<p>Generally, when the specification states that a feature applies to <span>the HTML syntax</span>
or <span>the XHTML syntax</span>, it also includes the other. When a feature specifically only
or the <span>XML syntax</span>, it also includes the other. When a feature specifically only
Member

The span needs to wrap the "the" as well to xref correctly. (We need to make wattsi fail for this...)

Member

Do we have a Wattsi bug?

Contributor Author

The span needs to wrap the "the" as well to xref correctly. (We need to make wattsi fail for this...)

oofs, thanks for catching that

@@ -109129,14 +109131,11 @@ Hello.&lt;/pre></pre>
parser</span> can also be <span data-x="abort a parser">aborted</span>, which must again be done in
the same way as for an <span>HTML parser</span>.</p>

-  <p>For the purposes of conformance checkers, if a resource is determined to be in <span>the XHTML
-  syntax</span>, then it is an <span data-x="XML documents">XML document</span>.</p>
Member

Why is this no longer necessary? "XML documents" here is the DOM concept, and some document conformance differences are stated in terms of that (e.g. noscript). But it might not be clear how that concept applies to a conformance checker without this paragraph.

Contributor Author

-  <p>For the purposes of conformance checkers, if a resource is determined to be in <span>the XHTML
-  syntax</span>, then it is an <span data-x="XML documents">XML document</span>.</p>

Why is this no longer necessary?

I agree it’s necessary now, after the round of “XML”->“the XML syntax” review changes, so I’ve restored it.

Initially when I had replaced “the XHTML syntax” with just “XML”, this would have become:

For the purposes of conformance checkers, if a resource is determined to be XML, then it is an XML document.

…which didn’t seem to be expressing anything implementable…

@@ -108970,21 +108969,24 @@ Hello.&lt;/pre></pre>
spec=XMLENTITY></p>


-  <h2 split-filename="xhtml"><dfn id="xhtml">The XHTML syntax</dfn></h2>
+  <h2 split-filename="xhtml" id="xhtml"><dfn>The XML syntax</dfn></h2>
Member

This is another subtle change. The ID used to be on the <dfn>. I think that distinction is important for Wattsi.

-  MIME type</span>. Rules for XML resources are discussed in the section below entitled "<span>The
-  XHTML syntax</span>".</p>
+  MIME type</span>. Rules for XML resources are discussed in the <span data-x="the XML syntax">XML
+  syntax</span> section.</p>
Member

Why not put the "the" inside the span and omit data-x? Same for the note below.

-  <p>The above technique is also useful in XHTML, since <code>noscript</code> is not supported in
-  <span>the XHTML syntax</span>.</p>
+  <p>The above technique is also useful in <span>the XML syntax</span>, since <code>noscript</code>
+  is not supported in there.</p>
Member

s/in there/there/

@annevk
Member

annevk commented Nov 25, 2016

I'll push a fix for these remaining nits.

@annevk annevk merged commit 643d1bc into master Nov 25, 2016
@annevk
Member

annevk commented Nov 25, 2016

\o/

@sideshowbarker sideshowbarker deleted the retire-xhtml branch November 25, 2016 13:28
@sideshowbarker
Contributor Author

w00t

@domenic @annevk @zcorpan thanks for your patience; never meant for this one to consume as much of our time as it ended up taking, but I really think this change was due and I’m glad it got reviewed carefully and worded carefully (especially compared to my initial slash-and-burn attempt)

Labels
clarification Standard could be clearer

4 participants