Commit

[e] (0) Since I'm going to be editing this algorithm some more, let's bite the bullet and do what foolip and anne wanted, which is to normalise newlines early for sanity.

git-svn-id: http://svn.whatwg.org/webapps@6757 340c8d12-0b0e-0410-8428-c7bf67bfef74
Hixie committed Oct 25, 2011
1 parent 833abd6 commit 4320eb4
Showing 3 changed files with 93 additions and 165 deletions.
78 changes: 28 additions & 50 deletions complete.html
@@ -33068,10 +33068,25 @@ <h6 id=parsing-0><span class=secno>4.8.10.13.3 </span>Parsing</h6>

<p>The <dfn id=webvtt-parser-algorithm>WebVTT parser algorithm</dfn> is as follows:</p>

<ol><li><p>Let <var title="">input</var> be the string being parsed,
after conversion to Unicode.</li>
<ol><li>

<p>Let <var title="">input</var> be the string being parsed, after
conversion to Unicode, and with the following transformations
applied:</p>

<ul><li><p>Replace all U+0000 NULL characters by U+FFFD REPLACEMENT
CHARACTERs.</li>

<li><p>Replace all U+0000 NULL characters in <var title="">input</var> by U+FFFD REPLACEMENT CHARACTERs.</li>
<li><p>Replace each U+000D CARRIAGE RETURN U+000A LINE FEED
(CRLF) character pair by a single U+000A LINE FEED (LF)
character.</li>

<li><p>Replace all remaining U+000D CARRIAGE RETURN (CR) characters by
U+000A LINE FEED (LF) characters.</li>

</ul></li>

<li>

<li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
string. In an <a href=#incremental-webvtt-parser>incremental WebVTT parser</a>, when this
@@ -33088,9 +33103,7 @@ <h6 id=parsing-0><span class=secno>4.8.10.13.3 </span>Parsing</h6>


<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is less than six characters
long, then abort these steps. The file is not a <a href=#webvtt-file>WebVTT
@@ -33110,26 +33123,14 @@ <h6 id=parsing-0><span class=secno>4.8.10.13.3 </span>Parsing</h6>
<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>


<li><p><i title="">Header</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
characters</a> that are <em>not</em> U+000D CARRIAGE RETURN (CR)
or U+000A LINE FEED (LF) characters. Let <var title="">line</var>
be those characters, if any.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
characters</a> that are <em>not</em> U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>
@@ -33144,13 +33145,11 @@ <h6 id=parsing-0><span class=secno>4.8.10.13.3 </span>Parsing</h6>


<li><p><i>Cue loop</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
characters</a> that are either U+000D CARRIAGE RETURN (CR) or
U+000A LINE FEED (LF) characters.</li>
characters</a> that are U+000A LINE FEED (LF)
characters.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>end</i>. (In such a case, <var title="">position</var> is also forcibly past the end of <var title="">input</var><!-- since we've just collected newlines, so we
@@ -33200,19 +33199,14 @@ <h6 id=parsing-0><span class=secno>4.8.10.13.3 </span>Parsing</h6>
<li><p>Let <var title="">cue</var>'s <a href=#text-track-cue-identifier>text track cue
identifier</a> be <var title="">line</var>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then discard <var title="">cue</var> and jump
to the step labeled <i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then
discard <var title="">cue</var> and jump to the step labeled <i>cue
@@ -33229,19 +33223,11 @@ <h6 id=parsing-0><span class=secno>4.8.10.13.3 </span>Parsing</h6>
past the end of <var title="">input</var>, then jump to the step
labeled <i>cue text processing</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled <i>cue text
processing</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue text processing</i>.</li>
@@ -33274,19 +33260,11 @@ <h6 id=parsing-0><span class=secno>4.8.10.13.3 </span>Parsing</h6>
past the end of <var title="">input</var>, then jump to the step
labeled <i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue loop</i>.</li>
78 changes: 28 additions & 50 deletions index
@@ -33068,10 +33068,25 @@ The General Relativistic Field Equations</pre>

<p>The <dfn id=webvtt-parser-algorithm>WebVTT parser algorithm</dfn> is as follows:</p>

<ol><li><p>Let <var title="">input</var> be the string being parsed,
after conversion to Unicode.</li>
<ol><li>

<p>Let <var title="">input</var> be the string being parsed, after
conversion to Unicode, and with the following transformations
applied:</p>

<ul><li><p>Replace all U+0000 NULL characters by U+FFFD REPLACEMENT
CHARACTERs.</li>

<li><p>Replace all U+0000 NULL characters in <var title="">input</var> by U+FFFD REPLACEMENT CHARACTERs.</li>
<li><p>Replace each U+000D CARRIAGE RETURN U+000A LINE FEED
(CRLF) character pair by a single U+000A LINE FEED (LF)
character.</li>

<li><p>Replace all remaining U+000D CARRIAGE RETURN (CR) characters by
U+000A LINE FEED (LF) characters.</li>

</ul></li>

<li>

<li><p>Let <var title="">position</var> be a pointer into <var title="">input</var>, initially pointing at the start of the
string. In an <a href=#incremental-webvtt-parser>incremental WebVTT parser</a>, when this
@@ -33088,9 +33103,7 @@ The General Relativistic Field Equations</pre>


<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is less than six characters
long, then abort these steps. The file is not a <a href=#webvtt-file>WebVTT
@@ -33110,26 +33123,14 @@ The General Relativistic Field Equations</pre>
<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>


<li><p><i title="">Header</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
characters</a> that are <em>not</em> U+000D CARRIAGE RETURN (CR)
or U+000A LINE FEED (LF) characters. Let <var title="">line</var>
be those characters, if any.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>
characters</a> that are <em>not</em> U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>
@@ -33144,13 +33145,11 @@ The General Relativistic Field Equations</pre>


<li><p><i>Cue loop</i>: <a href=#collect-a-sequence-of-characters>Collect a sequence of
characters</a> that are either U+000D CARRIAGE RETURN (CR) or
U+000A LINE FEED (LF) characters.</li>
characters</a> that are U+000A LINE FEED (LF)
characters.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>end</i>. (In such a case, <var title="">position</var> is also forcibly past the end of <var title="">input</var><!-- since we've just collected newlines, so we
@@ -33200,19 +33199,14 @@ The General Relativistic Field Equations</pre>
<li><p>Let <var title="">cue</var>'s <a href=#text-track-cue-identifier>text track cue
identifier</a> be <var title="">line</var>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then discard <var title="">cue</var> and jump
to the step labeled <i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then
discard <var title="">cue</var> and jump to the step labeled <i>cue
@@ -33229,19 +33223,11 @@ The General Relativistic Field Equations</pre>
past the end of <var title="">input</var>, then jump to the step
labeled <i>cue text processing</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled <i>cue text
processing</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue text processing</i>.</li>
@@ -33274,19 +33260,11 @@ The General Relativistic Field Equations</pre>
past the end of <var title="">input</var>, then jump to the step
labeled <i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000D CARRIAGE RETURN (CR) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p>If <var title="">position</var> is past the end of <var title="">input</var>, then jump to the step labeled
<i>end</i>.</li>

<li><p>If the character indicated by <var title="">position</var>
is a U+000A LINE FEED (LF) character, advance <var title="">position</var> to the next character in <var title="">input</var>.</li>

<li><p><a href=#collect-a-sequence-of-characters>Collect a sequence of characters</a> that are
<em>not</em> U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF)
characters. Let <var title="">line</var> be those characters, if
any.</li>
<em>not</em> U+000A LINE FEED (LF) characters. Let <var title="">line</var> be those characters, if any.</li>

<li><p>If <var title="">line</var> is the empty string, then jump
to the step labeled <i>cue loop</i>.</li>
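The effect of this change can be sketched in a few lines of Python. This is only an illustration of the preprocessing described in the first step of the algorithm, not code from the spec or from any implementation. The function names `normalise_input` and `collect_line` are hypothetical, and `collect_line` is a simplified stand-in for "collect a sequence of characters that are not U+000A LINE FEED (LF) characters".

```python
from typing import Tuple


def normalise_input(text: str) -> str:
    """Apply the three transformations listed in step 1 (illustrative only)."""
    # Replace U+0000 NULL with U+FFFD REPLACEMENT CHARACTER.
    text = text.replace("\u0000", "\ufffd")
    # Replace each CRLF pair with a single LF. This must happen before
    # the lone-CR replacement, or "\r\n" would turn into two LFs.
    text = text.replace("\r\n", "\n")
    # Replace all remaining CR characters with LF.
    text = text.replace("\r", "\n")
    return text


def collect_line(text: str, position: int) -> Tuple[str, int]:
    """Collect characters that are not LF, starting at `position`.

    Returns the collected line and the new position, which points
    either at an LF character or past the end of the input.
    """
    end = text.find("\n", position)
    if end == -1:
        end = len(text)
    return text[position:end], end


normalised = normalise_input("WEBVTT\r\n\r\nfirst\rsecond\u0000")
line, position = collect_line(normalised, 0)
```

Because every newline is already a single LF after `normalise_input`, the later steps only need to test for one character, which is why the diff can drop the separate "advance past CR" steps.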
