Edits to the section on character escapes. Added references to CharMod Section 4.6.
w3c · Feb 2, 2016 · 823d812
1 parent d6d4a84
Showing 1 changed file with 34 additions and 6 deletions.
diff --git a/index.html b/index.html
@@ -1006,14 +1006,32 @@ <h3>Character Escapes</h3>
           additional equivalent means of representing characters inside a given
           resource. They also allow for the encoding of Unicode characters not
           represented in the character encoding scheme used by the document.</p>
+        <p>See also Section 4.6 of [[!CHARMOD]].</p>
+
+        <!-- examples taken from S4.6 charmod
+        
+		<aside class="example">
+		  <div class="example-header marker"></div>
+		  <p>Some examples of escapes and includes:</p>
+		  <ul>
+		  <li>HTML and XML define 'Numeric Character References' which allow both the escaping of syntax-significance 
+		  and the expression of arbitrary Unicode characters. Expressed as &amp;#x3C; or &amp;#60; the character '&lt;' will not 
+		  be parsed as a markup delimiter.</li>
+          <li>The programming language Java uses '"' to delimit strings. To express '"' within a string, one may escape it as '\"'.</li>
+          <li>XML defines 'CDATA sections' which allow escaping the syntax-significance of all characters between the CDATA 
+           section delimiters. CDATA sections prevent the expression of characters using numeric character references.</li>
+		  </ul>
+		</aside>
+		
+		
+		-->
+
         <p>For example, <span class="qchar">€</span> <span class="uname" translate="no">U+20AC
             EURO SIGN</span> can also be encoded in HTML as the hexadecimal
           numeric character reference <code>&amp;#x20ac;</code> or as the decimal numeric character reference <code>&amp;#8364;</code>.
           In a JavaScript or JSON file, it can appear as <code>\u20ac</code>
           while in a CSS stylesheet it can appear as <code>\20ac</code>. All of
-          these representations encode the same literal character value: <span
-
-            class="qchar">€</span>.</p>
+          these representations encode the same literal character value: <span class="qchar">€</span>.</p>
         <p>Character escapes are normally interpreted before a document is
           processed and strings within the format or protocol are matched.
           Returning to an example we used above: </p>
@@ -1455,9 +1473,19 @@ <h4> Unicode Normalizing Specification Requirements </h4>
       </section>
       <section id="expandingCharacterEscapes">
         <h2>Expanding Character Escapes and Includes</h2>
-        <p>Character escapes, such as HTML's numeric character references (for example, <code>&amp;#x20AC;</code>)
-        or named entity references (<code>&amp;amp;</code>), and other included values that are intended
-        to form part of matched string values require expansion when matching strings.</p>
+        <p>Most document formats and protocols provide a means for 
+		encoding characters or including external data, including text, into a 
+		<a>resource</a>. This is discussed in detail in Section 4.6 of [[!CHARMOD]] 
+		as well as <a href="#characterEscapes">above</a>.</p>
+
+		<p>When performing matching, it is important to know when to interpret character escapes so that
+		   a match succeeds (or fails) appropriately. Normally, escapes, references, and includes are processed
+		   or expanded before performing matching, since these syntaxes exist to allow difficult-to-encode
+		   sequences to be put into a document conveniently. </p>
+		  <p>When processing the syntax of a document format...</p>
+		  <p>When performing a match on syntactic content...</p>
+		  <p>When performing a match on natural language content...</p>
+
         <p class="issue">Edit me!</p>
       </section>
       <section id="handlingCaseFolding">
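
The rule stated in the added paragraphs (expand escapes first, then match) can be sketched in Python using the EURO SIGN example from the first hunk. This is an illustrative sketch under stated assumptions, not part of the commit: the helper names are hypothetical, `html.unescape` is the Python standard library's decoder for HTML character references, and the JavaScript and CSS handlers cover only the simple `\uXXXX` and hex-digit forms quoted in the text (no surrogate pairs, no other escape kinds).

```python
import html
import re

EURO = "\u20ac"  # U+20AC EURO SIGN, the character used in the spec's example


def expand_html(text: str) -> str:
    """Expand HTML numeric and named character references."""
    return html.unescape(text)


def expand_js(text: str) -> str:
    """Expand JavaScript/JSON \\uXXXX escapes (assumption: BMP only, no surrogate pairing)."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


def expand_css(text: str) -> str:
    """Expand CSS hex escapes: a backslash, 1-6 hex digits, and one optional whitespace."""
    return re.sub(
        r"\\([0-9a-fA-F]{1,6})\s?",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


def matches_after_expansion(expanded_a: str, expanded_b: str) -> bool:
    """Compare two strings that have already had their escapes expanded."""
    return expanded_a == expanded_b


# Each escaped form expands to the same literal character.
forms = [
    expand_html("&#x20ac;"),
    expand_html("&#8364;"),
    expand_js(r"\u20ac"),
    expand_css(r"\20ac"),
]
all_equal = all(matches_after_expansion(f, EURO) for f in forms)
```

Comparing the unexpanded strings directly (for example `"&#x20ac;" == "\\u20ac"`) would fail, which is why the expansion step comes before matching in this sketch.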