<pre class='metadata'>
Title: CSS Text Module Level 3
Shortname: css-text
Level: 3
Status: ED
Work Status: Refining
Group: csswg
ED: https://drafts.csswg.org/css-text-3/
TR: https://www.w3.org/TR/css-text-3/
Previous Version: https://www.w3.org/TR/2017/WD-css-text-3-20170822/
Previous Version: https://www.w3.org/TR/2013/WD-css-text-3-20131010/
Previous Version: https://www.w3.org/TR/2012/WD-css3-text-20121113/
Issue Tracking: Tracker http://www.w3.org/Style/CSS/Tracker/products/10
Test Suite: http://test.csswg.org/suites/css3-text/nightly-unstable/
Editor: Elika J. Etemad / fantasai, Invited Expert, http://fantasai.inkedblade.net/contact, w3cid 35400
Editor: Koji Ishii, Invited Expert, kojiishi@gluesoft.co.jp, w3cid 45369
Editor: Florian Rivoal, Invited Expert, https://florian.rivoal.net, w3cid 43241
Abstract: This CSS module defines properties for text manipulation and specifies their processing model. It covers line breaking, justification and alignment, white space handling, and text transformation.
At Risk: the ''full-width'' value of 'text-transform'
At Risk: the ''full-size-kana'' value of 'text-transform'
At Risk: the &lt;length> values of the 'tab-size' property
At Risk: the 'text-justify' property
At Risk: the percentage values of 'word-spacing'
At Risk: the 'hanging-punctuation' property
At Risk: the ''line-break/anywhere'' value of the 'line-break' property
Ignored Vars: letter-spacing
Status Text: This publication partially addresses the issues in the <a href="https://drafts.csswg.org/css-text-3/issues-lc-2013">disposition of comments</a> since the <a href="https://www.w3.org/TR/2013/WD-css-text-3-20131010/">October 2013 Last Call Working Draft</a>, and, while a marked improvement over the previous draft, is not considered to be entirely up-to-date at the time of publication. A completed disposition of comments and corresponding draft will be published once the issues are fully addressed and reviewed by the CSSWG and Internationalization WG.
</pre>
<pre class=link-defaults>
spec:css-display-3; type:property; text:display
</pre>
<style type="text/css">
img { vertical-align: middle; }
span[lang] { font-size: 125%; line-height: 1; vertical-align: middle;}
/* Bidi & spaces example */
.egbidiwsaA,.egbidiwsbB,.egbidiwsaB,.egbidiwsbC
{ white-space:pre;font-size:80%;font-family:monospace; vertical-align:2px; margin:1px }
.egbidiwsaA { background:lime;padding:2px; }
.egbidiwsbB { border:2px solid blue }
.egbidiwsaB { background:yellow;border:2px dotted white }
.egbidiwsbC { border:2px dotted red }
.hyphens-ex {
border: thin solid black;
display: inline-block;
padding: 4pt;
}
/* Start Letter-spacing Tutorial */
.ls-ex {
font-size: 200%;
margin-left: 1em;
margin-right: 1em;
}
.ls-fixed-width {
width: 10em;
}
.color-box { background: rgb(224, 203, 82); }
.bad { color: red; }
.good { color: green; }
/* End Letter-spacing Tutorial */
.char { border: 1px dotted gray; }
.quarter { font-size: 25%; }
tt[lang="ja"] { font-family: "MS Gothic", "Osaka", monospace }
div.figure table {
margin :auto;
}
.feedback {
background: #FFEECC;
border-color: orange;
}
.feedback:before {
content: "Info Needed";
color: #FF8800;
}
.data .no { color: red; }
.data .ok { color: green; }
.data th em { display: block; font-size: smaller; font-weight: normal; font-style: italic; }
</style>
<h2 id="intro">
Introduction</h2>
<p>This module describes the typesetting controls of CSS;
that is, the features of CSS that control the translation of
source text to formatted, line-wrapped text.
Various CSS properties provide control over
<a href="#transforming">case transformation</a>,
<a href="#white-space">white space collapsing</a>,
<a href="#white-space">text wrapping</a>,
<a href="#line-breaking">line breaking rules</a>
and <a href="#hyphenation">hyphenation</a>,
<a href="#justification">alignment and justification</a>,
<a href="#spacing">spacing</a>,
and <a href="#edge-effects">indentation</a>.
<div class="note">
<p>Note: Font selection is covered in <a href="https://www.w3.org/TR/css-fonts-3/">CSS Fonts Level 3</a> [[CSS-FONTS-3]].
<p>
<span id="decoration"></span>
<span id="text-decoration"></span>
<span id="line-decoration"></span>
<span id="text-decoration-line"></span>
<span id="text-decoration-color"></span>
<span id="text-decoration-style"></span>
<span id="text-decoration-skip"></span>
<span id="text-underline-position"></span>
<span id="emphasis-marks"></span>
<span id="text-emphasis-style"></span>
<span id="text-emphasis-color"></span>
<span id="text-emphasis"></span>
<span id="text-emphasis-position"></span>
<span id="text-shadow"></span>
Features for decorating text,
such as <a href="https://www.w3.org/TR/css-text-decor-3/#line-decoration">underlines</a>,
<a href="https://www.w3.org/TR/css-text-decor-3/#emphasis-marks">emphasis marks</a>,
and <a href="https://www.w3.org/TR/css-text-decor-3/#text-shadow-property">shadows</a>,
(previously part of this module)
are covered in
<a href="https://www.w3.org/TR/css-text-decor-3/">CSS Text Decoration Level 3</a> [[CSS-TEXT-DECOR-3]].
<p><a href="https://www.w3.org/TR/css-writing-modes-3/#text-direction">Bidirectional</a> and
<a href="https://www.w3.org/TR/css-writing-modes-3/#vertical-intro">vertical</a> text
are addressed in
<a href="https://www.w3.org/TR/css-writing-modes-3/">CSS Writing Modes Level 3</a> [[CSS-WRITING-MODES-3]].
</div>
Further information about the typesetting requirements
of various languages and writing systems around the world
can be found in the <a href="https://www.w3.org/International/core/">Internationalization Working Group</a>’s
<a href="https://www.w3.org/TR/typography/">Typography Index</a> [[TYPOGRAPHY]].
<h3 id="placement">
Module Interactions</h3>
<p>This module, together with [[CSS-TEXT-DECOR-3]],
replaces and extends the text-level features defined in [[!CSS2]] chapter 16.
<p>In addition to the terms defined below,
other terminology and concepts used in this specification are defined
in [[!CSS2]] and [[!CSS-WRITING-MODES-3]].
<h3 id="values">
Values</h3>
This specification follows the <a href="https://www.w3.org/TR/CSS2/about.html#property-defs">CSS property definition conventions</a> from [[!CSS2]].
Value types not defined in this specification are defined in CSS Values &amp; Units [[!CSS-VALUES-3]].
Other CSS modules may expand the definitions of these value types.
In addition to the property-specific values listed in their definitions,
all properties defined in this specification
also accept the <a>CSS-wide keywords</a> as their property value.
For readability they have not been repeated explicitly.
<h3 id="languages">
Languages and Typesetting</h3>
<p><strong class="advisement">
Authors should language-tag their content accurately for the best typographic behavior.
</strong>
<p>The <dfn export>content language</dfn> of an element is the (human) language
the element is declared to be in, according to the rules of the
<a href="https://www.w3.org/TR/CSS2/conform.html#doclanguage">document language</a>.
For example, the rules for determining the <a>content language</a> of an HTML element
are <a href="https://html.spec.whatwg.org/multipage/dom.html#language">defined</a> in [[HTML]],
and the rules for determining the <a>content language</a> of an XML element
are <a href="https://www.w3.org/TR/REC-xml/#sec-lang-tag">defined</a> in [[XML10]].
Note that it is possible for the <a>content language</a> of an element
to be unknown:
e.g. untagged content,
or content in a <a>document language</a> that does not have a language-tagging facility,
is considered to have an unknown <a>content language</a>.
Note: Authors can tag content using the global <code>lang</code> attribute in HTML,
the universal <code>xml:lang</code> attribute in XML,
and the HTTP <code>Content-Language</code> header for content served over HTTP.
Language and writing system conventions can affect
line breaking, hyphenation, justification, glyph selection,
and many other typographic effects.
<strong>In CSS, language-specific typographic tailorings
are only applied when the <a>content language</a> is known (declared).</strong>
Therefore,
higher quality typography requires authors to communicate to the UA
the correct linguistic context of the text in the document.
More information about language tags and their interpretation,
particularly the use of script tags for atypical language + writing-system combinations,
can be found in [[#script-tagging]].
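<div class="example">
<p>As a minimal sketch of why tagging matters
(the selectors and declarations below are illustrative assumptions, not requirements of this module),
an author who marks up content with <code>lang</code> can scope language-sensitive styling with <code>:lang()</code>,
and the UA can apply the matching typographic tailorings:
<pre>
/* Assumes markup such as &lt;p lang="de"&gt;...&lt;/p&gt; and &lt;q lang="ja"&gt;...&lt;/q&gt; */
:lang(de) { hyphens: auto; }      /* hyphenation resources are language-specific */
:lang(ja) { line-break: strict; } /* line-breaking strictness is tailored per language */
</pre>
<p>Without the <code>lang</code> attribute, neither selector matches,
and the UA has no declared language on which to base such tailorings.
</div>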
<h3 id="characters">
Characters and Letters</h3>
<p>The basic unit of typesetting is the <dfn export>character</dfn>.
However, because writing systems are not always as simple as the basic English alphabet,
what a <a>character</a> actually is depends on the context in which the term is used.
For example, in Hangul (the Korean writing system),
each square representation of a syllable
(e.g. <span lang=ko-hang title="Hangul syllable HAN">한</span>=<span lang=ko-Latn>Han</span>)
can be considered a <a>character</a>.
However, the square symbol is really composed of multiple letters each representing a phoneme
(e.g. <span lang=ko-hang title="Hangul letter HIEUH">ㅎ</span>=<span lang=ko-Latn>h</span>,
<span lang=ko-hang title="Hangul letter A">ㅏ</span>=<span lang=ko-Latn>a</span>,
<span lang=ko-hang title="Hangul letter NIEUN">ㄴ</span>=<span lang=ko-Latn>n</span>)
and these also could each be considered a <a>character</a>.
<p>A basic unit of computer text encoding, for any given encoding,
is also called a <a>character</a>,
and depending on the encoding,
a single encoding <a>character</a> might correspond
to the entire pre-composed syllabic <a>character</a> (e.g. <span lang=ko-hang title="Hangul syllable HAN">한</span>),
to the individual phonemic <a>character</a> (e.g. <span lang=ko-hang title="Hangul letter HIEUH">ㅎ</span>),
or to smaller units such as
a base letterform (e.g. <span lang=ko-hang title="Hangul letter IEUNG">ㅇ</span>)
and any combining marks that vary it (e.g. extra strokes that represent aspiration).
<p>In turn, a single encoding <a>character</a> can be represented in the data stream as one or more bytes;
and in programming environments one byte is sometimes also called a <a>character</a>.
<p>Therefore the term <a>character</a> is fairly ambiguous where technical precision is required.
<p>For text layout, we will refer to the <dfn export lt="typographic character unit|typographic character">typographic character unit</dfn>
as the basic unit of text.
Even within the realm of text layout,
the relevant <a>character</a> unit depends on the operation.
For example, line-breaking and letter-spacing will segment
a sequence of Thai characters that include U+0E33 THAI CHARACTER SARA AM differently;
or the behavior of a conjunct consonant in a script such as Devanagari
may depend on the font in use.
So the <a>typographic character</a> represents a unit of the writing system&mdash;<!--
-->such as a Latin alphabetic letter (including its diacritics),
Hangul syllable,
Chinese ideographic character,
Myanmar syllable cluster&mdash;<!--
-->that is indivisible with respect to a particular typographic operation
(line-breaking, first-letter effects, tracking, justification, vertical arrangement, etc.).
<a href="http://www.unicode.org/reports/tr29/">Unicode Standard Annex #29: Text Segmentation</a>
defines a unit called the <dfn>grapheme cluster</dfn>
which approximates the <a>typographic character</a>.
A UA must use the <em>extended grapheme cluster</em>
(not <em>legacy grapheme cluster</em>), as defined in [[!UAX29]],
as the basis for its <a>typographic character unit</a>.
However, the UA should tailor the definitions
as required by typographic tradition,
since the default rules are not always appropriate or ideal,
and is expected to tailor them differently
depending on the operation as needed.
<p class="note">
Note: The rules for such tailorings are out of scope for CSS.
<!--
however W3C currently maintains a wiki page
where some known tailorings are collected.
-->
<div class="example">
The following are some examples of <a>typographic character unit</a> tailorings
required by standard typesetting practice:
<ul>
<li>
<p>In some scripts such as Myanmar or Devanagari,
the <a>typographic character unit</a> for both justification and line-breaking
is an entire syllable,
which can include more than one [[!UAX29]] <a>grapheme cluster</a>.
<li>
<p>In other scripts such as Thai or Lao,
even though for line-breaking the <a>typographic character</a>
matches Unicode’s default <a>grapheme clusters</a>,
for letter-spacing the relevant unit
is <em>less</em> than a [[!UAX29]] <a>grapheme cluster</a>,
and may require decomposition or other substitutions
before spacing can be inserted.
<p>For instance,
to properly letter-space the Thai word คำ (U+0E04 + U+0E33),
the U+0E33 needs to be decomposed into U+0E4D + U+0E32,
and then the extra letter-space inserted before the U+0E32: คํ า.
<p>A slightly more complex example is น้ำ (U+0E19 + U+0E49 + U+0E33).
In this case, normal Thai shaping will first decompose the U+0E33 into U+0E4D + U+0E32
and then swap the U+0E4D with the U+0E49, giving U+0E19 + U+0E4D + U+0E49 + U+0E32.
As before the extra letter-space is then inserted before the U+0E32: นํ้ า.
<li>
<p>Vertical typesetting [[!CSS-WRITING-MODES-3]] can also require tailoring.
For example, when typesetting ''text-orientation/upright'' text,
Tibetan tsek and shad marks are kept with the preceding grapheme cluster,
rather than treated as an independent <a>typographic character unit</a>.
</ul>
</div>
<p>A <dfn export>typographic letter unit</dfn> or <dfn>letter</dfn> for the purpose of this specification
is a <a>typographic character unit</a> belonging to one of the Letter or Number general
categories in Unicode. [[!UAX44]]
See <a href="#character-properties">Character Properties</a>
for how to determine the Unicode properties of a <a>typographic character unit</a>.
<p>The rendering characteristics of a <a>typographic character unit</a> divided
by an element boundary are undefined:
it may be rendered as belonging to either side of the boundary,
or as some approximation of belonging to both.
Authors are forewarned that dividing <a>grapheme clusters</a>
by element boundaries may give inconsistent or undesired results.
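<div class="example">
<p>For instance, in the following hypothetical markup
a combining acute accent (U+0301) is separated from its base letter by an element boundary,
so the <a>grapheme cluster</a> “é” is divided:
<pre>
&lt;!-- Illustrative only: the accent may be styled as belonging to either side of the boundary --&gt;
e&lt;span style="color: red"&gt;&amp;#x301;&lt;/span&gt;
</pre>
<p>Keeping the whole cluster inside one element avoids the undefined rendering:
<pre>
&lt;span style="color: red"&gt;e&amp;#x301;&lt;/span&gt;
</pre>
</div>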
<h2 id="transforming">
Transforming Text</h2>
<h3 id="text-transform-property">
<span id="caps-prop"></span>
<span id="text-transform"></span>
Case Transforms: the 'text-transform' property</h3>
<pre class="propdef">
Name: text-transform
Value: none | [capitalize | uppercase | lowercase ] || full-width || full-size-kana
Initial: none
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property transforms text for styling purposes.
It has no effect on the underlying content,
and must not affect the content of a plain text copy &amp; paste operation.
<p>Values have the following meanings:
<dl dfn-for=text-transform dfn-type=value>
<dt><dfn>none</dfn></dt>
<dd>No effects.</dd>
<dt><dfn>capitalize</dfn></dt>
<dd>Puts the first <a>typographic letter unit</a> of each word, if lowercase, in titlecase;
other characters are unaffected.</dd>
<dt><dfn>uppercase</dfn></dt>
<dd>Puts all <a>letters</a> in uppercase.
<dt><dfn>lowercase</dfn></dt>
<dd>Puts all <a>letters</a> in lowercase.</dd>
<dt><dfn>full-width</dfn></dt>
<dd>Puts all <a>typographic character units</a> in fullwidth form.
If a character does not have a corresponding fullwidth form,
it is left as is.
This value is typically used to typeset Latin letters and digits
as if they were ideographic characters.
<dt><dfn>full-size-kana</dfn></dt>
<dd>Converts all [=small Kana=] characters to the equivalent [=full-size Kana=].
This value is typically used for ruby annotation text,
where authors may want all small Kana to be drawn as large Kana
to compensate for legibility issues at the small font sizes typically used in ruby.
</dl>
<p>For ''capitalize'', what constitutes a “word” is UA-dependent;
[[!UAX29]] is suggested (but not required) for determining such word
boundaries. Authors should not expect ''capitalize'' to follow
language-specific titlecasing conventions (such as skipping articles
in English).
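<div class="example">
<p>As an illustration (the class name is hypothetical),
the following rules transform headings and bylines without changing the underlying text:
<pre>
h1      { text-transform: uppercase; }
.byline { text-transform: capitalize; }
</pre>
<p>Assuming the UA finds word boundaries at the spaces,
a byline whose source text is “the lord of the rings”
would render as “The Lord Of The Rings”:
''capitalize'' does not skip the article or the preposition,
and a plain text copy of the byline still yields the original lowercase string.
</div>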
<div class="example">
<p>The following example converts the ASCII characters
used in abbreviations in Japanese text to their fullwidth variants
so that they lay out and line break like ideographs:
<pre>abbr:lang(ja) { text-transform: full-width; }</pre>
</div>
<p class="note">
Note: As defined in <a href="#order">Text Processing Order of Operations</a>,
transforming text affects line-breaking and other formatting operations.
<p>
The UA must use the full case mappings for Unicode
characters, including any conditional casing rules, as defined in
the Default Case Algorithm section of The Unicode Standard [[!UNICODE]].
If (and only if) the <a>content language</a>
of the element is, according to the rules of the
<a href="https://www.w3.org/TR/CSS2/conform.html#doclanguage">document language</a>,
known,
then any appropriate language-specific rules must be applied as well.
These minimally include, but are not limited to, the language-specific
rules in Unicode's
<a href="http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt">SpecialCasing.txt</a>.
<div class="example">
<p>For example, in Turkish there are two &ldquo;i&rdquo;s, one with
a dot&mdash;&ldquo;İ&rdquo; and &ldquo;i&rdquo;&mdash; and one
without&mdash;&ldquo;I&rdquo; and &ldquo;ı&rdquo;. Thus the usual
case mappings between &ldquo;I&rdquo; and &ldquo;i&rdquo; are
replaced with a different set of mappings to their respective
undotted/dotted counterparts, which do not exist in English. This
mapping must only take effect if the <a>content language</a> is Turkish
(or another Turkic language that uses Turkish casing rules);
in other languages, the usual mapping of &ldquo;I&rdquo;
and &ldquo;i&rdquo; is required. This rule is thus conditionally
defined in Unicode's SpecialCasing.txt file.
</div>
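<div class="example">
<p>A minimal sketch of the difference,
assuming the markup is language-tagged as shown in the comments:
<pre>
/* &lt;span lang="tr"&gt;i&lt;/span&gt; renders as İ (U+0130) */
/* &lt;span lang="en"&gt;i&lt;/span&gt; renders as I (U+0049) */
span { text-transform: uppercase; }
</pre>
<p>The same declaration produces different results
only because the <a>content language</a> differs.
</div>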
<p>The definition of fullwidth and halfwidth forms can be found on the
Unicode Consortium web site at [[!UAX11]].
The mapping to fullwidth form is defined by taking code points with
the <code>&lt;wide&gt;</code> or the <code>&lt;narrow&gt;</code> tag
in their <code>Decomposition_Mapping</code> in [[!UAX44]].
For the <code>&lt;narrow&gt;</code> tag,
the mapping is from the code point to the decomposition (minus <code>&lt;narrow&gt;</code> tag),
and for the <code>&lt;wide&gt;</code> tag,
the mapping is from the decomposition (minus the <code>&lt;wide&gt;</code> tag)
back to the original code point.
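<div class="example">
<p>As a worked illustration of this rule,
consider two simplified entries of the Unicode Character Database
(only the code point, name, and <code>Decomposition_Mapping</code> are shown):
<pre>
FF21  FULLWIDTH LATIN CAPITAL LETTER A   &lt;wide&gt; 0041
FF76  HALFWIDTH KATAKANA LETTER KA       &lt;narrow&gt; 30AB
</pre>
<p>The first entry carries the <code>&lt;wide&gt;</code> tag,
so ''full-width'' maps U+0041 (A) to U+FF21 (Ａ).
The second carries the <code>&lt;narrow&gt;</code> tag,
so ''full-width'' maps U+FF76 (ｶ) to U+30AB (カ).
</div>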
<p>The mappings for small Kana to full-size Kana are defined in [[#small-kana]].
<p>When multiple values are specified and therefore multiple transformations need to be applied,
they are applied in the following order:
<ol>
<li>''capitalize'', ''uppercase'', and ''lowercase''
<li>''full-width''
<li>''full-size-kana''
</ol>
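<div class="example">
<p>For example, under the following rule (the selector is hypothetical),
the source text “abc” is first uppercased to “ABC”
and only then converted to the fullwidth forms “ＡＢＣ” (U+FF21 to U+FF23):
<pre>
abbr.wide { text-transform: uppercase full-width; }
</pre>
</div>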
<p>Text transformation happens after <a href="#white-space-rules">white
space processing</a>, which means that ''full-width'' only transforms
U+0020 spaces to U+3000 within <a>preserved</a> white space.
<p class="note">
Note: A future level of CSS may introduce the ability to create custom mapping
tables for less common text transforms, such as by an ''@text-transform''
rule similar to ''@counter-style'' from [[CSS-COUNTER-STYLES-3]].
<h2 id="white-space-property">
<span id="white-space-collapsing"></span><span id='text-wrap'></span>
White Space and Wrapping: the 'white-space' property</h2>
<pre class="propdef">
Name: white-space
Value: normal | pre | nowrap | pre-wrap | break-spaces | pre-line
Initial: normal
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property specifies two things:
<ul>
<li>whether and how <a>white space</a> inside the element is collapsed
<li>whether lines may <a>wrap</a> at unforced <a>soft wrap opportunities</a>
</ul>
<p>Values have the following meanings, which must be interpreted
according to
the <a href="#white-space-rules">White Space Processing</a> and
<a href="#line-breaking">Line Breaking</a> rules:</p>
<dl dfn-for=white-space dfn-type=value>
<dt><dfn>normal</dfn></dt>
<dd>This value directs user agents to collapse sequences of <a>white space</a>
into a single character (or <a href="#line-break-transform">in some
cases</a>, no character).
Lines may wrap at allowed <a>soft wrap opportunities</a>,
as determined by the line-breaking rules in effect,
in order to minimize inline-axis overflow.
<dt><dfn>pre</dfn></dt>
<dd>This value prevents user agents from collapsing sequences of <a>white space</a>.
<a>Segment breaks</a> such as line feeds
are preserved as <a>forced line breaks</a>.
Lines only break at <a>forced line breaks</a>;
content that does not fit within the block container overflows it.
<dt><dfn>nowrap</dfn>
<dd>Like ''white-space/normal'', this value collapses <a>white space</a>;
but like ''pre'', it does not allow wrapping.
<dt><dfn>pre-wrap</dfn></dt>
<dd>Like ''pre'', this value preserves <a>white space</a>;
but like ''white-space/normal'', it allows wrapping.
<dt><dfn>break-spaces</dfn></dt>
<dd>The behavior is identical to that of ''white-space/pre-wrap'',
except that:
* Any sequence of <a>preserved</a> white space always takes up space,
including at the end of the line.
* A line breaking opportunity exists after every <a>preserved</a> <a>white space</a> character,
including between white space characters.
As preserved spaces take up space and do not hang,
they affect the box's intrinsic sizes ([=min-content size=] and [=max-content size=]).
<p class="note">Note: This value does not guarantee that there will never be any overflow due to spaces:
for example, if the line length is so short that even a single space does not fit,
overflow is unavoidable.</p>
<dt><dfn>pre-line</dfn></dt>
<dd>Like ''white-space/normal'', this value collapses consecutive spaces and allows wrapping,
but preserves <a>segment breaks</a> in the source as <a>forced line breaks</a>.
</dl>
<p>The following informative table summarizes the behavior of various
'white-space' values:</p>
<table class="data">
<colgroup class="header"></colgroup>
<colgroup span=4></colgroup>
<thead>
<tr>
<th></th>
<th>New Lines</th>
<th>Spaces and Tabs</th>
<th>Text Wrapping</th>
<th>End-of-line spaces</th>
</tr>
</thead>
<tbody>
<tr>
<th>''white-space/normal''</th>
<td>Collapse</td>
<td>Collapse</td>
<td>Wrap</td>
<td>Remove</td>
</tr>
<tr>
<th>''pre''</th>
<td>Preserve</td>
<td>Preserve</td>
<td>No wrap</td>
<td>Preserve</td>
</tr>
<tr>
<th>''nowrap''</th>
<td>Collapse</td>
<td>Collapse</td>
<td>No wrap</td>
<td>Remove</td>
</tr>
<tr>
<th>''pre-wrap''</th>
<td>Preserve</td>
<td>Preserve</td>
<td>Wrap</td>
<td>Collapse or hang</td>
</tr>
<tr>
<th>''break-spaces''</th>
<td>Preserve</td>
<td>Preserve</td>
<td>Wrap</td>
<td>Wrap</td>
</tr>
<tr>
<th>''pre-line''</th>
<td>Preserve</td>
<td>Collapse</td>
<td>Wrap</td>
<td>Remove</td>
</tr>
</tbody>
</table>
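<div class="example">
<p>A brief sketch of how these values might be chosen in a style sheet
(the selectors are hypothetical):
<pre>
pre.listing { white-space: pre; }          /* keep every space and line break; never wrap */
.comment    { white-space: pre-wrap; }     /* keep spaces and line breaks, but wrap long lines */
.editor     { white-space: break-spaces; } /* as pre-wrap, but end-of-line spaces take up space and can wrap */
.poem       { white-space: pre-line; }     /* collapse spaces, keep the source line breaks */
td.label    { white-space: nowrap; }       /* collapse spaces and keep the label on one line */
</pre>
</div>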
<p>See <a href="#white-space-rules">White Space Processing Rules</a>
for details on how white space collapses. An informative summary of
collapsing (''white-space/normal'' and ''nowrap'') is presented below:
<ul>
<li>A sequence of segment breaks and other <a>white space</a> between two
Chinese, Japanese, or Yi characters collapses into nothing.
<li>A zero width space before or after a white space sequence
containing a segment break causes the entire sequence of <a>white space</a>
to collapse into a zero width space.
<li>Otherwise, consecutive <a>white space</a> collapses into a single space.
</ul>
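<div class="example">
<p>For instance, given the following source
(indented for editing, with a line break and several spaces between the words),
a paragraph with ''white-space/normal'' or ''nowrap'' renders as “Hello world”,
with a single space between the two words:
<pre>
&lt;p&gt;Hello
      world&lt;/p&gt;
</pre>
</div>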
<p>See <a href="#line-breaking">Line Breaking</a>
for details on wrapping behavior.
<h2 id="white-space-processing">
White Space Processing Details</h2>
<p>The source text of a document often contains formatting
that is not relevant to the final rendering: for example,
<a href="http://rhodesmill.org/brandon/2012/one-sentence-per-line/">breaking the source into segments</a>
(lines) for ease of editing
or adding white space characters such as tabs and spaces to indent the source code.
CSS white space processing allows the author to control interpretation of such formatting:
to preserve or collapse it away when rendering the document.
White space processing in CSS interprets white space characters only for rendering:
it has no effect on the underlying document data.
<p>White space processing in CSS is controlled with the 'white-space' property.
<p id="segment-normalization">
CSS does not define document segmentation rules. Segments can be
separated by a particular newline sequence (such as a line feed or
CRLF pair), or delimited by some other mechanism, such as the SGML
<code>RECORD-START</code> and <code>RECORD-END</code> tokens.
For CSS processing, each document language&ndash;defined segment break
and each line feed (U+000A)
in the text is treated as a <dfn export>segment break</dfn>,
which is then interpreted for rendering as specified by the 'white-space' property.
<p class="note">Note: A document parser might
not only normalize any <a>segment breaks</a>,
but also collapse other space characters or
otherwise process white space according to markup rules.
Because CSS processing occurs <em>after</em> the parsing stage,
it is not possible to restore these characters for styling.
Therefore, some of the behavior specified below
can be affected by these limitations and
may be user agent dependent.</p>
<p class="note">Note: Anonymous blocks consisting entirely of
<a>collapsible</a> <a>white space</a> are removed from the rendering tree.
Thus any such <a>white space</a> surrounding a block-level element is collapsed away.
See [[CSS2]] section
<a href="https://www.w3.org/TR/CSS2/visuren.html#anonymous">9.2.2.1</a>.</p>
<p>
Form feeds (U+000C)
(that are not <a>segment breaks</a>)
are rendered as a zero-width space (U+200B). <!-- thus can disrupt shaping -->
Control characters (<a>Unicode category</a> <code>Cc</code>)
other than tab (U+0009), line feed (U+000A), and form feed (U+000C)
must be rendered as a visible glyph,
which the UA must synthesize if the glyphs found in the font are not visible,
and otherwise treated as any other character
of the Other Symbols (<code>So</code>) <a>general category</a> and Common <a lt="Unicode script">script</a>.
The UA may use a glyph provided by a font specifically for the control character,
substitute the glyphs provided for the corresponding symbol in the Control Pictures block,
generate a visual representation of its codepoint value,
or use some other method to provide an appropriate visible glyph.
As required by [[!UNICODE]],
unsupported <code>Default_ignorable</code> characters must be ignored for rendering.
<h3 id="white-space-rules">
The White Space Processing Rules</h3>
<p>White space processing in CSS affects only
the <dfn export lt="white space|white space characters| document white space|document white space characters">document white space characters</dfn>:
spaces (U+0020), tabs (U+0009), and <a href="#white-space-processing">segment breaks</a>.
<p class="note">
Note: The set of characters considered <a>document white space</a> (part of the document content)
and that considered syntactic white space (part of the CSS syntax)
are not necessarily identical.
However, since both include spaces (U+0020), tabs (U+0009), and line feeds (U+000A),
most authors won't notice any differences.
<h4 id="white-space-phase-1">Phase I: Collapsing and Transformation</h4>
<p>For each inline (including anonymous inlines;
see [[CSS2]] section <a href="https://www.w3.org/TR/CSS2/visuren.html#anonymous">9.2.2.1</a>)
within an inline
formatting context, white space characters are handled as follows,
ignoring <dfn noexport>bidi formatting characters</dfn>
(characters with the <code>Bidi_Control</code> property [[!UAX9]])
as if they were not there:
<ul>
<li id="collapse"><p>If 'white-space' is set to
''white-space/normal'', ''nowrap'', or ''pre-line'',
white space characters are considered <dfn export lt="collapsible white space|collapsible">collapsible</dfn>
and are processed by performing the following steps:</p>
<ol>
<li>All spaces and tabs immediately preceding or following a segment
break are removed.</li>
<li><a>Segment breaks</a> are transformed for
rendering according to the <a href="#line-break-transform">segment break transformation rules</a>.
</li>
<li>Every tab is converted to a space (U+0020).</li>
<li>Any space immediately following another collapsible space&mdash;even
one outside the boundary of the inline containing that space,
provided both spaces are within the same inline formatting
context&mdash;is collapsed to have zero advance width. (It is
invisible, but retains its <a>soft wrap opportunity</a>, if any.)</li>
</ol>
</li>
<li><p>If 'white-space' is set to ''pre'', ''pre-wrap'', or ''break-spaces'',
any sequence of spaces is treated as a sequence of non-breaking spaces.
However, a <a>soft wrap opportunity</a> exists at the end of the sequence.
</li>
</ul>
<p>Then, the entire block is rendered. Inlines are laid out, taking bidi
reordering into account, and <a>wrapping</a> as specified by the
'white-space' property.</p>
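<div class="example">
<p>As a worked illustration of the collapsing steps above,
consider the text “A”, a space, a tab, a line feed, two spaces, and “B”,
styled with ''white-space/normal''.
In the sketch below, <code>␠</code> stands for a space (U+0020),
<code>␉</code> for a tab (U+0009),
and <code>␊</code> for a segment break;
these symbols are notation only and do not appear in the text:
<pre>
source:        A␠␉␊␠␠B
after step 1:  A␊B       (spaces and tabs around the segment break removed)
after step 2:  A␠B       (segment break transformed into a space)
after step 3:  A␠B       (no tabs remain to convert)
after step 4:  A␠B       (no space follows another space)
</pre>
<p>The result renders as “A B”.
</div>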
<div class="example" id="egbidiwscollapse">
<p>The following example illustrates
the interaction of white-space collapsing and bidirectionality.
Consider the following markup fragment, taking special note of spaces
(with varied backgrounds and borders for emphasis and identification):
<pre><code>&lt;ltr&gt;A<span class="egbidiwsaA">&#160;</span>&lt;rtl&gt;<span class="egbidiwsbB">&#160;</span>B<span class="egbidiwsaB">&#160;</span>&lt;/rtl&gt;<span class="egbidiwsbC">&#160;</span>C&lt;/ltr&gt;</code></pre>
<p>where the <code>&lt;ltr&gt;</code> element represents a left-to-right embedding
and the <code>&lt;rtl&gt;</code> element represents a right-to-left embedding.
If the 'white-space' property is set to ''white-space/normal'',
the white-space processing model will result in the following:
<ul style="line-height:1.3">
<li>The space before the B (<span class="egbidiwsbB">&#160;</span>)
will collapse with the space after the A (<span class="egbidiwsaA">&#160;</span>).
<li>The space before the C (<span class="egbidiwsbC">&#160;</span>)
will collapse with the space after the B (<span class="egbidiwsaB">&#160;</span>).
</ul>
<p>This will leave two spaces,
one after the A in the left-to-right embedding level,
and one after the B in the right-to-left embedding level.
The text will then be ordered according to the Unicode bidirectional algorithm,
with the end result being:
<pre>A<span class="egbidiwsaA">&#160;</span><span class="egbidiwsaB">&#160;</span>BC</pre>
<p>Note that there will be two spaces between A and B,
and none between B and C.
This is best avoided by putting spaces outside the element
instead of just inside the opening and closing tags and, where practical,
by relying on implicit bidirectionality instead of explicit embedding levels.
</div>
<h4 id="line-break-transform">
Segment Break Transformation Rules</h4>
<p>When 'white-space' is ''pre'', ''pre-wrap'', ''break-spaces'', or ''pre-line'',
<a>segment breaks</a> are not <a>collapsible</a>
and are instead transformed into a preserved line feed (U+000A).
<p>For other values of 'white-space', <a>segment breaks</a> are <a>collapsible</a>.
As with spaces,
any collapsible <a>segment break</a> immediately following another collapsible <a>segment break</a>
is removed.
Then the remaining <a>segment breaks</a> are
either transformed into a space (U+0020) or removed
depending on the context before and after the break:
<ul>
<li>If the character immediately before or immediately after the segment
break is the zero-width space character (U+200B), then the break
is removed, leaving behind the zero-width space.
<li>Otherwise, if the <a>East Asian Width property</a> [[!UAX11]] of both
the character before and after the segment break is <code>F</code>,
<code>W</code>, or <code>H</code> (not <code>A</code>),
and neither side is Hangul or Emoji (Unicode property <code>Emoji</code>),
then the segment break is removed.
<li>Otherwise, if the <a>content language</a> of the <a>segment break</a>
is Chinese, Japanese, or Yi,
and the character before or after the segment break
is punctuation or a symbol (Unicode <a>general category</a> P* or S*)
and has an <a>East Asian Width property</a> of <code>A</code>
or is Emoji,
and the character on the other side of the segment break is
<code>F</code>, <code>W</code>, or <code>H</code>,
and not Hangul or Emoji,
then the segment break is removed.
<li>Otherwise, the segment break is converted to a space (U+0020).
</ul>
<p class="note">Note: The white space processing rules have already
removed any tabs and spaces after the segment break before these checks
take place.</p>
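<div class="example">
<p>As a minimal sketch of how 'white-space' selects between the two behaviors described in this section
(the class names are hypothetical, chosen only for illustration):

```css
/* Segment breaks are collapsible: each remaining one is either
   removed or converted to a space (U+0020), depending on the
   characters before and after it. */
.prose { white-space: normal; }

/* Segment breaks are not collapsible: each one is transformed
   into a preserved line feed (U+000A). */
.verse { white-space: pre-line; }
```

<p>Under the rules above, a segment break between two Latin letters in <code>.prose</code>
would become a space,
whereas one between two ideographs whose <a>East Asian Width property</a> is <code>W</code>
would be removed.
</div>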
<p class="feedback issue">Comments on how well this would work in practice would
be very much appreciated, particularly from people who work with
Thai and similar scripts.
Note that browser implementations do not currently follow these rules
(although IE does in some cases transform the break).</p>
<h4 id="white-space-phase-2">Phase II: Trimming and Positioning</h4>
<p>As each line is laid out,</p>
<ol>
<li>A sequence of collapsible spaces at the beginning of a line
(ignoring any intervening <a>inline box</a> boundaries)
is removed.
<li>If the tab size is zero, tabs are not rendered.
Otherwise, each tab is rendered as a horizontal shift
that lines up the start edge of the next glyph with the next <a>tab stop</a>.
If this distance is less than 0.5<a href="https://www.w3.org/TR/css-values-3/#ch">ch</a>,
then the subsequent <a>tab stop</a> is used instead.
<dfn lt="tab stop">Tab stops</dfn> occur at points that are multiples of the tab size
from the block's starting content edge.
The tab size is given by the 'tab-size' property.
Note: See [[UAX9]] for <a href="http://unicode.org/reports/tr9/#L1">rules on how U+0009 tabulation interacts with bidi</a>.
<li>A sequence at the end of a line
(ignoring any intervening <a>inline box</a> boundaries)
of <a>collapsible</a> spaces (U+0020)
and/or ideographic spaces (U+3000) whose 'white-space' value collapses spaces
is removed.
<li>If there remains any sequence of <a>white space</a>
and/or ideographic spaces (U+3000)
at the end of a line:
<ul>
<li>If 'white-space' is set to ''pre-wrap'',
the UA must <a>hang</a> this sequence.
It may also visually collapse the character advance widths
of any that would otherwise overflow.
Note: Hanging the white space rather than collapsing it
allows users to see the space when selecting or editing text.
<li>If 'white-space' is set to ''break-spaces'',
hanging or collapsing the advance width of spaces
at the end of the line is not allowed;
those that overflow must <a>wrap</a> to the next line.
</ul>
</ol>
Issue: Add example of hanging white space + same example right-aligned.
<p>White space that was not removed or collapsed during the white space
processing steps is called <dfn>preserved</dfn> white space.</p>
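<div class="example">
<p>The two cases for white space remaining at the end of a line can be sketched as follows.
The class names are hypothetical,
and the comments only restate the requirements listed above.

```css
/* Preserved spaces at the end of a line hang; the UA may also
   visually collapse the advance widths of any that would
   otherwise overflow. */
.editor { white-space: pre-wrap; }

/* Preserved spaces at the end of a line are neither hung nor
   collapsed; those that would overflow wrap to the next line. */
.terminal { white-space: break-spaces; }
```
</div>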
<h3 id="tab-size-property" caniuse="css3-tabsize">
Tab Character Size: the 'tab-size' property</h3>
<pre class="propdef">
Name: tab-size
Value: <<number>> | <<length>>
Initial: 8
Applies to: block containers
Inherited: yes
Computed value: the specified number or absolute length
Animation type: by computed value type
Canonical order: n/a
</pre>
<p>This property determines the tab size used to render preserved tab characters (U+0009).
A <<number>> represents the measure as a multiple of the advance width of the space character (U+0020),
including its associated 'letter-spacing' and 'word-spacing'.
Negative values are not allowed.
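<div class="example">
<p>A short sketch of typical 'tab-size' declarations.
The selectors are hypothetical,
and the <<length>> form is at risk, so it might not be available in all UAs.

```css
/* Tab stops every 4 space-widths, measured with the space
   character's letter-spacing and word-spacing included. */
pre.code { tab-size: 4; }

/* Tab stops at a fixed distance (at-risk <length> value). */
pre.columns { tab-size: 3em; }

/* A tab size of zero: tabs are not rendered. */
pre.compact { tab-size: 0; }
```

<p>As a worked illustration, assume a tab size that resolves to 4ch.
A tab following text that ends 2ch from the block's starting content edge
shifts the next glyph to the tab stop at 4ch.
A tab following text that ends at 3.7ch is only 0.3ch short of that stop,
which is less than 0.5ch,
so the subsequent tab stop at 8ch is used instead.
</div>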
<h2 id="line-breaking">
Line Breaking and Word Boundaries</h2>
<p>When inline-level content is laid out into lines, it is broken across line boxes.
Such a break is called a <dfn>line break</dfn>.
When a line is broken due to explicit line-breaking controls,
or due to the start or end of a block,
it is a <dfn>forced line break</dfn>.
When a line is broken due to content <dfn lt="wrapping|wrap">wrapping</dfn>
(i.e. when the UA creates unforced line breaks in order to fit the content within the measure),
it is a <dfn>soft wrap break</dfn>.
The process of breaking inline-level content into lines is called <dfn export lt="line breaking process | line breaking">line breaking</dfn>.
<p>Wrapping is only performed at an allowed break point,
called a <dfn export>soft wrap opportunity</dfn>.
When wrapping is enabled (see 'white-space'),
the UA must minimize the amount of content overflowing a line
by wrapping the line at a <a>soft wrap opportunity</a>,
if one exists.
<p>In most writing systems,
in the absence of hyphenation, a <a>soft wrap opportunity</a> occurs only at word boundaries.
Many such systems use spaces or punctuation to explicitly separate words,
and <a>soft wrap opportunities</a> can be identified by these characters.
Scripts such as Thai, Lao, and Khmer, however,
do not use spaces or punctuation to separate words.
Although the zero width space (U+200B) can be used as an explicit word delimiter in these scripts,
this practice is not common.
As a result, a lexical resource is needed to correctly identify <a>soft wrap opportunities</a> in such texts.
<p>In some other writing systems,
<a>soft wrap opportunities</a> are based on orthographic syllable boundaries,
not word boundaries.
Some of these systems, such as Javanese and Balinese,
are similar to Thai and Lao in that they
require analysis of the text to find breaking opportunities.
In others, such as Chinese (as well as Japanese, Yi, and sometimes also Korean),
each syllable tends to correspond to a single <a>typographic letter unit</a>,
and thus line breaking conventions allow the line to break
anywhere <em>except</em> between certain character combinations.
Additionally, the level of strictness in these restrictions
varies with the typesetting style.
<p>CSS does not fully define where <a>soft wrap opportunities</a> occur;
however, some controls are provided to distinguish common variations.
<div class="note">
<p>Note: Further information on line breaking conventions can be found in
[[JLREQ]] and [[JIS4051]] for Japanese,
[[ZHMARK]] for Chinese, and
in [[!UAX14]] for all scripts in Unicode.
<p class="feedback issue">Any guidance for appropriate references here would be
much appreciated.
</div>
<h3 id="line-break-details">
Line Breaking Details</h3>
<p>When determining <a>line breaks</a>:
<ul>
<li>Regardless of the 'white-space' value,
lines always break at each <a>preserved</a> forced break character:
for all values, line-breaking behavior defined for
the BK, CR, LF, CM, NL, and SG line breaking classes in [[!UAX14]]
must be honored.
<li>When 'white-space' allows wrapping,
line breaking behavior defined for the WJ, ZW, and GL line-breaking classes in [[!UAX14]]
must be honored.
<li>UAs that allow wrapping at punctuation other than spaces should prioritize breakpoints.
For example, if breaks after slashes are given a lower priority than spaces,
the sequence "check /etc" will never break between the "/" and the "e".
As long as care is taken to avoid such awkward breaks, allowing breaks at
appropriate punctuation other than spaces is recommended, as it results
in more even-looking margins, particularly in narrow measures.
The UA may use the width of the containing block, the text's language,
and other factors in assigning priorities.
<li>Out-of-flow elements do not introduce a <a>forced line break</a>
or <a>soft wrap opportunity</a> in the flow.
<li>The line breaking behavior of a replaced element or other atomic inline
is equivalent to an ideographic character
(Unicode linebreaking class <code>ID</code> [[!UAX14]]),
and additionally, for Web-compatibility, introduces a <a>soft wrap opportunity</a>
between itself and any adjacent U+00A0 NO-BREAK SPACE character.
<li>For <a>soft wrap opportunities</a> created by characters that disappear at the line break (e.g. U+0020 SPACE),
properties on the box directly containing that character control the line breaking at that opportunity.
For <a>soft wrap opportunities</a> defined by the boundary between two characters,
the properties on the nearest common ancestor of the two characters control breaking.
<!-- http://lists.w3.org/Archives/Public/www-style/2008Dec/0043.html -->
<li>For <a>soft wrap opportunities</a> before the first or after the last character of a box,
the break occurs immediately before/after the box (at its margin edge)
rather than breaking the box between its content edge and the content.
<li>Line breaking in/around Ruby is defined in <a href="https://www.w3.org/TR/css-ruby-1/#line-breaks">CSS Ruby</a> [[!CSS-RUBY-1]].
</ul>
<h3 id="word-break-property" caniuse="word-break">
Breaking Rules for Letters: the 'word-break' property</h3>
<pre class="propdef">
Name: word-break
Value: normal | keep-all | break-all
Initial: normal
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property specifies <a>soft wrap opportunities</a> between letters,
i.e. where it is “normal” and permissible to break lines of text.
Specifically it controls whether a <a>soft wrap opportunity</a> exists
between adjacent <a>typographic letter units</a>
(or other <a>typographic character units</a>
belonging to the
<code>NU</code>, <code>AL</code>, <code>AI</code>, or <code>ID</code>
Unicode line breaking classes [[!UAX14]]).
It does not affect rules governing the <a>soft wrap opportunities</a>
created by spaces (including U+3000 IDEOGRAPHIC SPACE) and punctuation.
(See 'line-break' for controls affecting punctuation.)
<div class="example">
<p>For example, in some styles of CJK typesetting, English words are allowed
to break between any two letters, rather than only at spaces or hyphenation points;
this can be enabled with ''word-break:break-all''.
<div class="figure">
<img src="images/break-all.png" alt="A snippet of Japanese text with English in it. The word 'caption' is broken into 'capt' and 'ion' across two lines.">
<p class="caption">An example of English text embedded in Japanese
being broken at an arbitrary point in the word.
</div>
<p>As another example, Korean has two styles of line-breaking:
between any two Korean syllables (''word-break: normal'')
or, like English, mainly at spaces (''word-break: keep-all'').
<pre>
각 줄의 마지막에 한글이 올 때 줄 나눔 <strong>기
준을</strong> “글자” 또는 “어절” 단위로 한다.</pre>
<pre>
각 줄의 마지막에 한글이 올 때 줄 나눔
<strong>기준을</strong> “글자” 또는 “어절” 단위로 한다.</pre>
</div>
<p class="note">
Note: To enable additional break opportunities only in the case of overflow,
see 'overflow-wrap'.
<p>Values have the following meanings:</p>
<dl dfn-for=word-break dfn-type=value>
<dt><dfn>normal</dfn></dt>
<dd>Words break according to their customary rules,
as described <a href="#line-breaking">above</a>.
Korean, which commonly exhibits two different behaviors,
allows breaks between any two consecutive Hangul/Hanja.
<dt><dfn>break-all</dfn></dt>
<dd>Breaking is allowed within “words”:
specifically, in addition to ''word-break/normal'' <a>soft wrap opportunities</a>,
any <a>typographic character units</a> resolving to the
<code>NU</code> (“numeric”), <code>AL</code> (“alphabetic”), or <code>SA</code> (“Southeast Asian”)
line breaking classes [[!UAX14]]
are instead treated as <code>ID</code> (“ideographic characters”)
for the purpose of line-breaking.
Hyphenation is not applied. This option is used mostly in a context where
the text consists predominantly of CJK characters with only short non-CJK excerpts,
and it is desired that the text be better distributed on each line.
<p class=note>Note: This value does not affect whether there are <a>soft wrap opportunities</a> around punctuation characters.
To allow breaks anywhere, see ''line-break: anywhere''.</dd>
<dt><dfn>keep-all</dfn></dt>
<dd>Breaking is forbidden within “words”:
implicit <a>soft wrap opportunities</a> between <a>typographic letter units</a>
(or other <a>typographic character units</a>
belonging to the
<code>NU</code>, <code>AL</code>, <code>AI</code>, or <code>ID</code>
Unicode line breaking classes [[!UAX14]])
are suppressed,
i.e. breaks are prohibited between pairs of such characters
(regardless of 'line-break' settings)
except where opportunities exist due to dictionary-based breaking.
Otherwise this option is equivalent to ''word-break/normal''.
In this style, sequences of CJK characters do not break.
<p class=note>Note: This is the other common behavior for Korean (which uses spaces between words),
and is also useful for mixed-script text where CJK snippets are mixed
into another language that uses spaces for separation.</dd>
</dl>
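<div class="example">
<p>A minimal sketch of how these values might be applied in a style sheet.
The class names are hypothetical;
only the property and its values come from this specification.

```css
/* Korean set on a "word basis": break mainly at spaces. */
:lang(ko) { word-break: keep-all; }

/* Predominantly CJK text with short non-CJK excerpts:
   allow breaks between any two letters. */
.narrow-column { word-break: break-all; }

/* The property is inherited, so restore the customary
   rules on a descendant where needed. */
.narrow-column .product-name { word-break: normal; }
```
</div>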
<p>Symbols that line-break the same way as letters of a particular category
are affected the same way as those letters.
<div class="example">
<p>Here's a mixed-script sample text:
<pre>这是一些汉字, and some Latin,<!--
--> &#x0648; &#x06A9;&#x0645;&#x06CC; &#x0646;&#x0648;&#x0634;&#x062A;&#x0646; &#x0639;&#x0631;&#x0628;&#x06CC;, <!--
-->และตัวอย่างการเขียนภาษาไทย.</pre>
<p>The break-points are determined as follows (indicated by &lsquo;&middot;&rsquo;):
<dl>
<dt>''word-break: normal''</dt>
<dd>
<pre>这·是·一·些·汉·字,·and·some·Latin,<!--
-->·&#x0648;·&#x06A9;&#x0645;&#x06CC;·&#x0646;&#x0648;&#x0634;&#x062A;&#x0646;·&#x0639;&#x0631;&#x0628;&#x06CC;,<!--
-->·และ·ตัวอย่าง·การเขียน·ภาษาไทย.</pre>
<dt>''word-break: break-all''</dt>
<dd>
<pre>这·是·一·些·汉·字,·a·n·d·s·o·m·e·L·a·t·i·n,<!--
-->·&#x0648;·&#xFB90;·&#xFEE4;·&#xFEF0;·&#xFEE7;·&#xFEEE;·&#xFEB7;·&#xFE98;·&#xFEE6;·&#xFECB;·&#xFEAE;·&#xFE91;·&#xFEF0;,<!--
-->·แ·ล·ะ·ตั·ว·อ·ย่·า·ง·ก·า·ร·เ·ขี·ย·น·ภ·า·ษ·า·ไ·ท·ย.</pre>
<dt>''word-break: keep-all''</dt>
<dd>
<pre>这是一些汉字,·and·some·Latin,<!--
-->·&#x0648;·&#x06A9;&#x0645;&#x06CC;·&#x0646;&#x0648;&#x0634;&#x062A;&#x0646;·&#x0639;&#x0631;&#x0628;&#x06CC;,<!--
-->·และ·ตัวอย่าง·การเขียน·ภาษาไทย.</pre>
</dl>
</div>
<div class="example" id=jp-title-break>
<style>
#jp-title-break td samp,
#jp-title-break td pre {
font-size: 1.5em;
width: 9em;
line-height: 2;
padding: 0.5em 1em;
display: block;
margin: auto;
border: solid gray 1px;
background: white;
}
</style>
<p>Japanese is usually typeset allowing line breaks within words.
However, it is sometimes preferred to suppress these wrapping opportunities
and to only allow wrapping at the end of certain sentence fragments.
This is most commonly done in very short pieces of text,
such as headings and table or figure captions.
This can be achieved by marking the allowed wrapping points
with <{wbr}> or U+200B ZERO WIDTH SPACE,
and suppressing the other ones using ''word-break: keep-all''.
For instance, the following markup can produce either of the renderings below,
depending on the value of the 'word-break' property:
<pre><code highlight=markup>
&lt;h1>窓ぎわの&lt;wbr>トットちゃん&lt;/h1>
</code></pre>
<table class=data>
<colgroup span=1></colgroup>
<colgroup span=2></colgroup>
<thead>
<tr>
<td>
<th><code highlight=css>h1 { word-break: normal }</code>
<th><code highlight=css>h1 { word-break: keep-all }</code>
<tbody>
<tr>
<th scope=row>Expected rendering
<td>
<pre lang=ja>
窓ぎわのトットちゃ
ん</pre>
<td>
<pre lang=ja>
窓ぎわの
トットちゃん</pre>
<tr>
<th scope=row>Result in your browser
<td>
<samp lang=ja>
窓ぎわの<wbr>トットちゃん
</samp>
<td>
<samp style="word-break:keep-all" lang=ja>
窓ぎわの<wbr>トットちゃん
</samp>
</table>
</div>
<p>When shaping scripts such as Arabic
are allowed to break within words due to ''word-break/break-all'',
the characters must still be shaped
as if the word were <a href="#word-break-shaping">not broken</a>
(see [[#word-break-shaping]]).
<h3 id="line-break-property">
Line Breaking Rules: the 'line-break' property</h3>
<pre class="propdef">
Name: line-break
Value: auto | loose | normal | strict | anywhere
Initial: auto
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property specifies the strictness of line-breaking rules applied
within an element:
particularly how <a>wrapping</a> interacts with punctuation and symbols.
Values have the following meanings:</p>
<dl dfn-for=line-break dfn-type=value>
<dt><dfn>auto</dfn></dt>
<dd>The UA determines the set of line-breaking restrictions to use,
and it may vary the restrictions based on the length of the line; e.g.,
use a less restrictive set of line-break rules for short lines.
<dt><dfn>loose</dfn></dt>
<dd>Breaks text using the least restrictive set of line-breaking
rules. Typically used for short lines, such as in newspapers.</dd>
<dt><dfn>normal</dfn></dt>
<dd>Breaks text using the most common set of line-breaking rules.</dd>
<dt><dfn>strict</dfn></dt>
<dd>Breaks text using the most stringent set of line-breaking
rules.</dd>
<dt><dfn>anywhere</dfn></dt>
<dd>There is a <a>soft wrap opportunity</a> around every <a>typographic character unit</a>,
including around any punctuation character or preserved spaces,
or in the middle of words,
disregarding any prohibition against line breaks introduced by characters with the GL, WJ, or ZWJ line breaking classes (see [[UAX14]]).
The different wrapping opportunities must not be prioritized.
Hyphenation is not applied.
Note: This value triggers the line breaking rules typically seen in terminals.</dd>
</dl>
<p class="feedback issue">
The rules here are following guidelines from <a href="https://www.w3.org/TR/klreq/#line-break">KLREQ</a> for Korean,
which don't allow the Chinese/Japanese-specific breaks.
However, the resulting behavior could use some review and feedback to make sure it is correct,
particularly when “word basis” breaking is used (''word-break: keep-all'') in Korean.
<p>CSS distinguishes between four levels of strictness in the rules for
text wrapping.
The precise set of rules in effect for each of ''line-break/loose'', ''line-break/normal'', and ''line-break/strict'' is up to the UA
and should follow language conventions.
However, this specification does require that:</p>
<ul>
<li>
The following breaks are forbidden in ''strict'' line breaking
and allowed in ''line-break/normal'' and ''loose'':
<ul>
<li>breaks before Japanese small kana or the Katakana-Hiragana prolonged sound mark:
i.e. characters with the Unicode Line Break property <code>CJ</code>.
(See <a href="http://www.unicode.org/Public/UNIDATA/LineBreak.txt">LineBreak.txt</a> in [[!UNICODE]].)
</ul>
<li>
The following breaks are allowed for ''line-break/normal'' and ''loose'' line breaking
if the <a>content language</a> is Chinese or Japanese,
and are otherwise forbidden:
<ul>
<li>breaks before hyphens:<br>
&#x2010;&nbsp;U+2010, &#x2013;&nbsp;U+2013, &#x301C;&nbsp;U+301C,
&#x30A0;&nbsp;U+30A0
</ul>
<li>
The following breaks are forbidden for ''line-break/normal'' and ''strict'' line breaking
and allowed in ''loose'':
<ul>
<li>breaks before iteration marks:<br>
&#x3005;&nbsp;U+3005, &#x303B;&nbsp;U+303B, &#x309D;&nbsp;U+309D,
&#x309E;&nbsp;U+309E, &#x30FD;&nbsp;U+30FD, &#x30FE;&nbsp;U+30FE
<li>breaks between inseparable characters
such as &#x2025;&nbsp;U+2025 and &#x2026;&nbsp;U+2026,
i.e. characters with the Unicode Line Break property <code>IN</code>.
(See <a href="http://www.unicode.org/Public/UNIDATA/LineBreak.txt">LineBreak.txt</a> in [[!UNICODE]].)
</ul>
<li>
The following breaks are allowed for ''loose''
if the <a>content language</a> is Chinese or Japanese
and are otherwise forbidden:
<ul>
<li>breaks before certain centered punctuation marks:<br>
&#x30FB;&nbsp;U+30FB,
&#xFF1A;&nbsp;U+FF1A, &#xFF1B;&nbsp;U+FF1B, &#xFF65;&nbsp;U+FF65,
&#x203C;&nbsp;U+203C,
&#x2047;&nbsp;U+2047, &#x2048;&nbsp;U+2048, &#x2049;&nbsp;U+2049,
&#xFF01;&nbsp;U+FF01, &#xFF1F;&nbsp;U+FF1F
<li>breaks before suffixes:<br>
Characters with the Unicode Line Break property <code>PO</code>
and the <a>East Asian Width property</a> [[!UAX11]] <code>A</code>, <code>F</code>, or <code>W</code>.
<li>breaks after prefixes:<br>
Characters with the Unicode Line Break property <code>PR</code>
and the <a>East Asian Width property</a> [[!UAX11]] <code>A</code>, <code>F</code>, or <code>W</code>.
</ul>
</ul>
<p class="note">Note: In the requirements listed above,
no distinction is made among the levels of strictness in non-CJK text:
only CJK codepoints are affected,
unless the text is marked as Chinese or Japanese,
in which case some additional common codepoints are affected.
<div class="example">
As UAs can add additional distinctions
between ''line-break/strict''/''line-break/normal''/''line-break/loose'' modes,
these values can exhibit other differences as well.
For example, a UA with sufficiently-advanced Thai language processing ability
could choose to map different levels of strictness in Thai line-breaking
to these keywords,
e.g. disallowing breaks within compound words in ''line-break/strict'' mode
(e.g. breaking ตัวอย่างการเขียนภาษาไทย as ตัวอย่าง·การเขียน·ภาษาไทย)
while allowing more breaks in ''line-break/loose''
(ตัวอย่าง·การ·เขียน·ภาษา·ไทย).
</div>
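<div class="example">
<p>The strictness levels might be applied as in the following sketch.
The class names are hypothetical,
and the comments mention only the differences this specification requires;
a UA can add others.

```css
/* Body text: breaks before small kana and the prolonged
   sound mark (line break class CJ) are forbidden. */
:lang(ja) { line-break: strict; }

/* Short lines: breaks before iteration marks and between
   inseparable characters (class IN) are also allowed. */
:lang(ja) .narrow-column { line-break: loose; }

/* Terminal-like wrapping around every typographic character
   unit; this value is at risk. */
.console { line-break: anywhere; }
```
</div>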
<p class="note">Note: The CSSWG recognizes that in a future edition of the
specification finer control over line breaking may be necessary to
satisfy high-end publishing requirements.
<h2 id="hyphenation">Breaking Within Words</h2>
<p><dfn id=hyphenate lt="hyphenation|hyphenate">Hyphenation</dfn>
allows the controlled splitting of words
to improve the layout of paragraphs,
typically splitting words at syllabic or morphemic boundaries
and visually indicating the split (usually by inserting a hyphen, U+2010).
In some cases, hyphenation may also alter the spelling of a word.
Regardless, hyphenation is a rendering effect only:
it must have no effect on the underlying document content
or on text selection or searching.
<p>Hyphenation occurs when the line breaks at a valid <dfn>hyphenation opportunity</dfn>,
which creates a <a>soft wrap opportunity</a> within the word.
In CSS it is controlled with the 'hyphens' property.
CSS Text Level 3 does not define the exact rules for hyphenation;
however, UAs are strongly encouraged to optimize their line-breaking implementation
to choose good break points and appropriate hyphenation points.
Hyphenation opportunities <em>are</em> considered when calculating
<a lt="min-content size">min-content intrinsic sizes</a>.
<!-- This is because it allows tables to hyphenate instead of overflowing,
which is particularly important in long-word languages like German.
https://bugzilla.mozilla.org/show_bug.cgi?id=418975 -->
<p>CSS also provides the 'overflow-wrap' property, which can allow
arbitrary breaking within words when the text would otherwise overflow
its container.
<h3 id="hyphens-property" caniuse="css-hyphens">
Hyphenation Control: the 'hyphens' property</h3>
<pre class="propdef">
Name: hyphens
Value: none | manual | auto
Initial: manual
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property controls whether hyphenation is allowed to create more
<a>soft wrap opportunities</a> within a line of text.
Values have the following meanings:
<dl dfn-for=hyphens dfn-type=value>
<dt><dfn>none</dfn>
<dd>Words are not hyphenated, even if characters inside
the word explicitly define <a>hyphenation opportunities</a>.
<dt><dfn>manual</dfn>
<dd>Words are only hyphenated where there are characters inside the word
that explicitly suggest <a>hyphenation opportunities</a>.
<div class="example">
<p>In Unicode, U+00AD is a conditional "soft hyphen" and U+2010 is an
unconditional hyphen. Unicode Standard Annex #14 describes the
<a href="http://unicode.org/reports/tr14/#SoftHyphen">role of soft hyphens</a>
in Unicode line breaking [[!UAX14]].
In HTML, &amp;shy; represents the soft hyphen character,
which suggests a hyphenation opportunity.
<pre>ex&amp;shy;ample</pre>
</div>
<dt><dfn>auto</dfn>
<dd>Words may be broken at <a>hyphenation opportunities</a>
determined automatically by a language-appropriate hyphenation resource
in addition to those indicated explicitly by a conditional hyphen.
Automatic <a>hyphenation opportunities</a> within a word must be ignored
if the word contains a conditional hyphen (&amp;shy; or U+00AD),
in favor of the conditional hyphen(s).
However, if, even after breaking at such opportunities,
a portion of that word is still too long to fit on one line,
an automatic hyphenation opportunity may be used.
</dl>
<p>Correct automatic hyphenation requires a hyphenation resource
appropriate to the language of the text being broken.
The UA must therefore only automatically hyphenate text
for which the <a>content language</a> is known
and for which it has an appropriate hyphenation resource.
<p class="advisement">
Authors should correctly tag their content’s <a lt="content language">language</a>
(e.g. using the HTML <code>lang</code> attribute
or the HTTP <code>Content-Language</code> header)
in order to obtain correct automatic hyphenation.
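<div class="example">
<p>A minimal sketch of the three values in use.
The selectors are hypothetical,
and ''hyphens: auto'' is assumed to apply to content
whose language is declared, for instance with the HTML <code>lang</code> attribute.

```css
/* Automatic hyphenation: only takes effect where the content
   language is known and the UA has a hyphenation resource. */
article p { hyphens: auto; }

/* The initial value: break only at explicit opportunities
   such as U+00AD SOFT HYPHEN. */
h1 { hyphens: manual; }

/* Never hyphenate, even at explicit soft hyphens. */
code { hyphens: none; }
```
</div>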
<p>When shaping scripts such as Arabic are allowed to break within words
due to hyphenation,
the characters must still be shaped
as if the word were <a href="#word-break-shaping">not broken</a>
(see [[#word-break-shaping]]).
<div class="example">
<p>For example, if the Uyghur word &ldquo;داميدى&rdquo;
were hyphenated, it would appear as
<img src="images/uyghur-hyphenate-joined.png"
alt="[isolated DAL + isolated ALEF + initial MEEM + medial YEH + hyphen + line-break + final DAL + isolated ALEF MAKSURA]">
not as
<img src="images/uyghur-hyphenate-unjoined.png"
alt="[isolated DAL + isolated ALEF + initial MEEM + final YEH + hyphen + line-break + isolated DAL + isolated ALEF MAKSURA]">
</div>
<h3 id="overflow-wrap-property" caniuse="wordwrap">
Overflow Wrapping: the 'overflow-wrap'/'word-wrap' property</h3>
<pre class="propdef">
Name: overflow-wrap, word-wrap
Value: normal | break-word
Initial: normal
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property specifies whether the UA may break at otherwise disallowed points within a line
to prevent overflow,
when an otherwise-unbreakable string is too long to fit within the line box,
or when sequences of <a>preserved</a> white space would <a>hang</a>.
It only has an effect when
'white-space' allows <a>wrapping</a>. Possible values:</p>
Issue: It has been proposed that this property could also apply when the 'white-space' property
does not allow <a>wrapping</a>,
introducing a break anywhere the line would otherwise overflow,
but without causing any change to intrinsic size computations.
See <a href="https://github.com/w3c/csswg-drafts/issues/1171#issuecomment-295522963">https://github.com/w3c/csswg-drafts/issues/1171#issuecomment-295522963</a>
<dl dfn-for=overflow-wrap dfn-type=value>
<dt><dfn>normal</dfn></dt>
<dd>Lines may break only at allowed break points. However, the restrictions
introduced by ''word-break: keep-all'' may be relaxed to match ''word-break: normal''
if there are no otherwise-acceptable break points in the line.</dd>
<dt><dfn>break-word</dfn></dt>
<dd>An otherwise unbreakable sequence of <a>characters</a>
may be broken at an arbitrary point if
there are no otherwise-acceptable break points in the line.
Shaping characters are still shaped as if the word were not
broken, and grapheme clusters must stay together as one unit.
No hyphenation character is inserted at the break point.
</dl>
<p><a>Soft wrap opportunities</a> introduced by ''overflow-wrap: break-word''
are not considered when calculating <a lt="min-content size">min-content intrinsic sizes</a>.
<p>For legacy reasons, UAs must treat 'word-wrap' as a [=legacy name alias=]
of the 'overflow-wrap' property.
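<div class="example">
<p>As an illustration, the following sketch lets an otherwise-unbreakable string,
such as a long URL, break rather than overflow.
The class names are hypothetical.

```css
/* Break at an arbitrary point only if the line has no
   otherwise-acceptable break point. */
.comment { overflow-wrap: break-word; }

/* Legacy spelling of the same property. */
.comment-legacy { word-wrap: break-word; }
```

<p>No hyphenation character is inserted at such a break,
and these break points do not reduce the element's min-content size.
</div>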
<h3 id="word-break-shaping">
Shaping Across Intra-word Breaks</h3>
<p>When shaping scripts such as Arabic <a>wrap</a>
at unforced <a>soft wrap opportunities</a> within words
(such as when breaking due to
''word-break: break-all'',
''line-break: anywhere'',
''overflow-wrap: break-word'',
or when <a>hyphenating</a>)
the characters must still be shaped
(their joining forms chosen)
as if the word were still whole.
<div class="example">
For example, if the word “نوشتن” is broken between the “ش” and “ت”,
the “ش” still takes its initial form (“ﺷ”), and the “ت” its medial form (“ﺘ”),
forming as in “ﻧﻮﺷ | ﺘﻦ”, not as in “نوش | تن”.
</div>
<h2 id="justification">
Alignment and Justification</h2>
<p>Alignment and justification control how inline content is distributed within a line box.
<h3 id="text-align-property">
Text Alignment: the 'text-align' shorthand</h3>
<pre class="propdef shorthand">
Name: text-align
Value: start | end | left | right | center | justify | match-parent | justify-all
Initial: start
Applies to: block containers
Inherited: yes
Canonical order: n/a
Animation type: discrete
</pre>
<p>This <a>shorthand property</a>
sets the 'text-align-all' and 'text-align-last' properties
and describes how the inline-level content of a block
is aligned along the inline axis
if the content does not completely fill the line box.
Values other than ''justify-all'' or ''match-parent'' are assigned to 'text-align-all'
and reset 'text-align-last' to ''text-align-last/auto''.
Values have the following meanings:
<dl dfn-for=text-align dfn-type=value>
<dt><dfn>start</dfn></dt>
<dd>Inline-level content is aligned to the <a spec=css-writing-modes-3>start</a> edge of the line box.
<dt><dfn>end</dfn></dt>
<dd>Inline-level content is aligned to the <a spec=css-writing-modes-3>end</a> edge of the line box.
<dt><dfn>left</dfn></dt>
<dd>Inline-level content is aligned to the
<a href="https://www.w3.org/TR/css-writing-modes-3/#line-left">line left</a>
edge of the line box.
(In vertical writing modes,
this will be either the physical top or bottom,
depending on 'text-orientation'.) [[CSS-WRITING-MODES-3]]
<dt><dfn>right</dfn></dt>
<dd>Inline-level content is aligned to the
<a href="https://www.w3.org/TR/css-writing-modes-3/#line-right">line right</a>
edge of the line box.
(In vertical writing modes,
this will be either the physical top or bottom,
depending on 'text-orientation'.) [[CSS-WRITING-MODES-3]]
<dt><dfn>center</dfn></dt>
<dd>Inline-level content is centered within the line box.
<dt><dfn>justify</dfn></dt>
<dd>Text is justified according to the method specified by the 'text-justify' property,
in order to exactly fill the line box.
Unless otherwise specified by 'text-align-last',
the last line before a forced break or the end of the block is ''start''-aligned.
<dt><dfn>justify-all</dfn></dt>
<dd>Sets both 'text-align-all' and 'text-align-last' to ''text-align/justify'',
forcing the last line to justify as well.
<dt><dfn>match-parent</dfn></dt>
<dd>This value behaves the same as ''inherit''
(computes to its parent's computed value)
except that an <a>inherited value</a> of ''start'' or ''end''
is interpreted against the parent’s
(or the <a>initial containing block</a>’s, if there is no parent)
'direction' value
and results in a computed value of either ''left'' or ''right''.
When specified on the 'text-align' shorthand,
sets both 'text-align-all' and 'text-align-last' to ''text-align/match-parent''.
</dl>
<p>A block of text is a stack of
<a href="https://www.w3.org/TR/CSS2/visuren.html#line-box">line boxes</a>.
This property specifies how the inline-level boxes within each line box
align with respect to the start and end sides of the line box.
Alignment is not with respect to the
<a href="https://www.w3.org/TR/CSS2/visuren.html#viewport">viewport</a>
or containing block.
<p>In the case of ''justify'', the UA may stretch or shrink any inline boxes
by <a href="#text-justify-property">adjusting</a> their text. (See 'text-justify'.)
If an element's white space is not <a href="#collapse">collapsible</a>,
then the UA is not required to adjust its text for the purpose of justification
and may instead treat the text as having no <a>justification opportunities</a>.
If the UA chooses to adjust the text, then it must ensure
that <a>tab stops</a> continue to line up as required by the
<a href="#white-space-rules">white space processing rules</a>.
<p>If (after justification, if any) the inline contents of a line box are too long to fit within it,
then the contents are <a spec=css-writing-modes-3>start</a>-aligned:
any content that doesn't fit overflows the line box's <a spec=css-writing-modes-3>end</a> edge.
<p>See <a href="#bidi-linebox">Bidirectionality and line boxes</a>
for details on how to determine the <a spec=css-writing-modes-3>start</a> and <a spec=css-writing-modes-3>end</a> edges of a line box.
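<div class="example">
<p>As an informal illustration of the values above
(the selectors and class names are hypothetical),
the following rules justify an article's paragraphs,
force the last line of a pull quote to justify as well,
and let a sidebar resolve an inherited ''text-align/start'' or ''text-align/end'' to a physical side:
<pre>
/* Hypothetical selectors, for illustration only */
article p  { text-align: justify; }      /* last line is start-aligned */
.pullquote { text-align: justify-all; }  /* last line is justified too */
.sidebar   { text-align: match-parent; } /* inherited start/end computes to left/right */
</pre>
</div>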
<h3 id="text-align-all-property">
Default Text Alignment: the 'text-align-all' property</h3>
<pre class="propdef">
Name: text-align-all
Value: start | end | left | right | center | justify | match-parent
Initial: start
Applies to: block containers
Inherited: yes
Computed value: keyword as specified, except for ''match-parent'' which computes as defined above
Canonical order: n/a
Animation type: discrete
</pre>
<p>This longhand of the 'text-align' <a>shorthand property</a>
specifies the inline alignment of all lines of inline content in the block container,
except for last lines overridden by a non-''text-align-last/auto'' value of 'text-align-last'.
See 'text-align' for a full description of values.
<p class="advisement">
Authors should use the 'text-align' shorthand instead of this property.
<h3 id="text-align-last-property" caniuse="css-text-align-last">
Last Line Alignment: the 'text-align-last' property</h3>
<pre class="propdef">
Name: text-align-last
Value: auto | start | end | left | right | center | justify | match-parent
Initial: auto
Applies to: block containers
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property describes how the last line of a block or a line
right before a <a>forced line break</a> is aligned.
<p>If <dfn dfn-for=text-align-last dfn-type=value>auto</dfn> is specified,
content on the affected line is aligned per 'text-align-all'
unless 'text-align-all' is set to ''justify'',
in which case it is start-aligned.
All other values are interpreted as described for 'text-align'.
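<div class="example">
<p>A minimal sketch, assuming a hypothetical <code>.colophon</code> class:
the block below is justified, but its last line
(and any line right before a forced line break) is centered
rather than start-aligned.
<pre>
.colophon {
  text-align: justify;
  /* Declared after the shorthand, since a shorthand can reset its longhands */
  text-align-last: center;
}
</pre>
</div>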
<h3 id="text-justify-property" caniuse="css-text-justify">
Justification Method: the 'text-justify' property</h3>
<pre class="propdef">
Name: text-justify
Value: auto | none | inter-word | inter-character
Initial: auto
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: n/a
Computed value: specified keyword
Animation type: discrete
</pre>
<p>This property selects the justification method used when a line's
alignment is set to ''justify'' (see 'text-align').
The property applies to inlines,
but is inherited from block containers to the root inline box containing their inline-level contents.
It takes the following values:</p>
<dl dfn-for=text-justify dfn-type=value>
<dt><dfn>auto</dfn></dt>
<dd>The UA determines the justification algorithm to follow, based
on a balance between performance and adequate presentation quality.
Since justification rules vary by writing system and language,
UAs should, where possible, use a justification algorithm appropriate to the text.
<p class="example">
For example, the UA could use by default a justification method that is a
simple universal compromise for all writing systems&mdash;such as
primarily expanding <a>word separators</a>
and between CJK <a>typographic letter units</a>
along with secondarily expanding between Southeast Asian <a>typographic letter units</a>.
Then, in cases where the <a>content language</a> of the paragraph is known,
it could choose a more language-tailored justification behavior,
e.g. following [[JLREQ]] for Japanese,
using cursive elongation for Arabic,
using ''inter-word'' for German,
etc.
<div class="figure" id="fig-text-justify-cursive">
<p>
<img alt="Two lines of calligraphic Arabic end together due to a mix of compressed and swash forms."
title="Swash forms elongate the first line while a compressed contextual ligature shortens the second, allowing both to end precisely together."
src="images/text-justify-cursive.png"></p>
<p class="caption">An example of cursively-justified Arabic text,
rendered by <a href="http://www.decotype.com/">Tasmeem</a>.
Like English, Arabic can be justified by adjusting the spacing between words,
but in most styles it can also be justified by calligraphically elongating or compressing the letterforms themselves.
In this example, the upper text is extended to fill the line by the use of elongated (kashida) forms and swash forms,
while the bottom line is compressed slightly by using a stacked combination for the characters between ت and م.
By employing traditional calligraphic techniques,
a typesetter can justify the line while preserving flow and color,
providing a very high quality justification effect.
However, this is by its nature a very script-specific effect.
</div>
<div class="figure" id="fig-text-justify-compromise">
<p>
<img alt="Extra space is distributed partly to spaces and partly among CJK and Thai letters."
src="images/text-justify-compromise.png"></p>
<p class="caption">Mixed-script text with ''text-justify: auto'':
this interpretation uses a universal-compromise justification method,
expanding at spaces as well as between CJK and Southeast Asian letters.
This effectively uses inter-word + inter-ideograph spacing
for lines that have word-separators and/or CJK characters
and falls back to inter-cluster behavior for lines that don't
or for which the space stretches too far.
</div>
<dt><dfn>none</dfn></dt>
<dd>Justification is disabled: there are no <a>justification opportunities</a> within the text.
<div class="figure" id="fig-text-justify-none">
<p>
<img alt="No extra space is inserted."
src="images/text-justify-none.png"></p>
<p class="caption">Mixed-script text with ''text-justify: none''
</div>
<p class="note">Note: This value is intended for use in user stylesheets
to improve readability or for accessibility purposes.
<dt><dfn>inter-word</dfn></dt>
<dd>Justification adjusts spacing at <a>word separators</a> only
(effectively varying the used 'word-spacing' on the line).
This behavior is typical for languages that separate words using spaces,
like English or Korean.
<div class="figure" id="fig-text-justify-interword">
<p>
<img alt="Extra space is equally distributed mainly to spaces."
src="images/text-justify-interword.png"></p>
<p class="caption">Mixed-script text with ''text-justify: inter-word''
</div>
<dt><dfn>inter-character</dfn></dt>
<dd>Justification adjusts spacing between each pair of adjacent <a>typographic character units</a>
(effectively varying the used 'letter-spacing' on the line).
This value is sometimes used in East Asian systems such as Japanese.
<div class="figure" id="fig-text-justify-distribute">
<p>
<img alt="Extra space is equally distributed at points between spaces and letters of all writing systems."
src="images/text-justify-distribute.png"></p>
<p class="caption">Mixed-script text with ''text-justify: inter-character''
</div>
For legacy reasons, UAs must also support the alternate keyword <dfn>distribute</dfn>
with the exact same meaning and behavior.
</dl>
<p class="advisement">
Since optimal justification is language-sensitive,
authors should correctly language-tag their content for the best results.
<p class="note">Note: The guidelines in this level of CSS do not describe a complete
justification algorithm. They are merely a minimum set of requirements
that a complete algorithm should meet. Limiting the set of requirements
gives UAs some latitude in choosing a justification algorithm that
meets their needs and desired balance of quality, speed, and complexity.
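<div class="example">
<p>As a non-normative sketch of how these values might be used
(the selectors are illustrative and not taken from any particular style sheet):
<pre>
/* Author style sheet: let the UA pick a method based on the content language */
p { text-align: justify; text-justify: auto; }

/* Opt specific content into a fixed method */
p:lang(de) { text-justify: inter-word; }
.banner    { text-justify: inter-character; }  /* hypothetical class */

/* User style sheet: disable justification for readability */
* { text-justify: none !important; }
</pre>
</div>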
<h4 id="expanding-text">
Expanding and Compressing Text</h4>
<p>When justifying text, the user agent takes the remaining space between
the ends of a line's contents and the edges of its line box, and
distributes that space throughout its contents so that the contents
exactly fill the line box.
The user agent may alternatively distribute negative space,
putting more content on the line than would otherwise fit under normal spacing conditions.
<p>A <span id="expansion-opportunity"><dfn export>justification opportunity</dfn></span> is
a point where the justification algorithm may alter spacing within the text.
A justification opportunity can be provided by a single <a>typographic character unit</a>
(such as a <a>word separator</a>),
or by the juxtaposition of two <a>typographic character units</a>.
As with controls for <a href="#line-break-details">soft wrap opportunities</a>,
whether a <a>typographic character unit</a> provides a <a>justification opportunity</a>
is controlled by the 'text-justify' value of its parent;
similarly, whether a <a>justification opportunity</a> exists between two consecutive <a>typographic character units</a>
is determined by the 'text-justify' value of their nearest common ancestor.
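<div class="example">
<p>A sketch of how these rules might play out,
assuming a UA in which ''text-justify/inter-character'' provides
an opportunity between every pair of adjacent letters:
<pre>
p    { text-align: justify; text-justify: inter-character; }
span { text-justify: none; }
&lt;p>ab&lt;span>cd&lt;/span>ef&lt;/p></pre>
<p>The nearest common ancestor of “c” and “d” is the <code>span</code>,
so no <a>justification opportunity</a> exists between them.
The nearest common ancestor of “b” and “c”, and of “d” and “e”, is the <code>p</code>,
so the boundaries on either side of the <code>span</code>
are governed by ''text-justify/inter-character''.
</div>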
<p>
Space distributed by justification is <em>in addition to</em>
the spacing defined by the 'letter-spacing' or 'word-spacing' properties.
When such additional space is distributed to a <a>word separator</a> <a>justification opportunity</a>,
it is applied under the same rules as for 'word-spacing'.
Similarly, when space is distributed to a <a>justification opportunity</a> between
two <a>typographic character units</a>,
it is applied under the same rules as for 'letter-spacing'.
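<div class="example">
<p>To illustrate with hypothetical numbers:
given the rule below, if the UA needs to distribute an extra 0.3em
to each <a>word separator</a> on a line in order to fill the line box,
each word separator on that line ends up 0.2em + 0.3em = 0.5em wider
than its normal advance.
<pre>
/* The 0.3em justification amount is an assumed figure; the UA computes it per line */
p { text-align: justify; text-justify: inter-word; word-spacing: 0.2em; }
</pre>
</div>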
<p>A justification algorithm may divide <a>justification opportunities</a> into different priority levels.
All <a>justification opportunities</a> within a given level
are expanded or compressed at the same priority,
regardless of which <a>typographic character units</a> created that opportunity.
For example, if <a>justification opportunities</a> between two Han characters
and between two Latin letters are defined to be at the same level
(as they are in the ''inter-character'' justification style),
they are not treated differently because they originate from different <a>typographic character units</a>.
It is not defined in this level
whether or how other factors
(such as font size, letter-spacing, glyph shape, position within the line, etc.)
may influence the distribution of space to <a>justification opportunities</a> within the line.
<p>The UA may enable or break optional ligatures or use other font features
such as alternate glyphs or glyph compression
to help justify the text under any method.
This behavior is not controlled by this level of CSS.
However, UAs <em>must not</em> break required ligatures
or otherwise disable features required to correctly shape complex scripts.
<p>If a <a>justification opportunity</a> exists within a line,
and <a href="#text-align-property">text alignment</a> specifies
full justification (''text-align/justify'') for that line,
it must be justified.
<h4 id="justify-symbols">
Handling Symbols and Punctuation</h4>
<p>When determining <a>justification opportunities</a>,
a <a>typographic character unit</a> from the Unicode Symbols (S*) and Punctuation (P*) classes
is generally treated the same as a <a>typographic letter unit</a> of the same script
(or, if the character's script property is Common,
then as a <a>typographic letter unit</a> of the dominant script).
<p>However, by typographic tradition there may be additional rules
controlling the justification of symbols and punctuation.
Therefore, the UA may reassign specific characters
or introduce additional levels of prioritization
to handle <a>justification opportunities</a> involving symbols and punctuation.
<p class="example">
For example, there are traditionally no <a>justification opportunities</a>
between consecutive
U+2014 Em Dash ‘—’,
U+2015 Horizontal Bar ‘―’,
U+2026 Horizontal Ellipsis ‘…’,
or U+2025 Two Dot Leader ‘‥’
characters [[JLREQ]];
thus a UA might assign these characters to a “never” prioritization level.
As another example, certain fullwidth punctuation characters
(such as U+301A Left White Square Bracket ‘〚’)
are considered to contain a <a>justification opportunity</a> in Japanese.
The UA might therefore assign these characters to a higher prioritization
level than the opportunities between ideographic characters.
<h4 id="justify-limits">
Unexpandable Text</h4>
<p>If the inline contents of a line cannot be stretched to the full width of the line box,
then they must be aligned as specified by the 'text-align-last' property.
(If 'text-align-last' is ''justify'', then
they must be aligned as for ''center''.)
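<div class="example">
<p>As a sketch of this fallback (the class name is hypothetical):
with the rule below, a line that holds only a single long word
has no <a>word separators</a> to expand,
so under ''text-justify/inter-word'' it cannot be stretched
and is instead aligned per 'text-align-last', here to the end edge.
<pre>
.references {
  text-align: justify;
  text-justify: inter-word;
  text-align-last: end;   /* also used for lines that cannot be stretched */
}
</pre>
</div>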
<h4 id="justify-cursive">
Cursive Scripts</h4>
<p>Justification <em>must not</em> introduce gaps between the joined <a>typographic letter units</a>
of <a>cursive scripts</a> such as Arabic.
If it is able, the UA <em>may</em>
translate space distributed to <a>justification opportunities</a> within a run of such <a>typographic letter units</a>
into some form of cursive elongation for that run.
It otherwise <em>must</em> assume that no <a>justification opportunity</a> exists
between any pair of <a>typographic letter units</a> in <a>cursive script</a>
(regardless of whether they join).
<div class="example">
<p>The following are examples of unacceptable justification:
<div class="figure">
<p><img alt="" src="images/arabic-stretch-spaced.png">
<p class="caption">Adding gaps between every pair of Arabic letters
</div>
<div class="figure">
<p><img alt="" src="images/arabic-stretch-unjoined.png">
<p class="caption">Adding gaps between every pair of unjoined Arabic letters
</div>
</div>
<p>Some font designs allow for the use of the tatweel character for justification.
A UA that performs tatweel-based justification must properly handle the rules for its use.
Note that correct insertion of tatweel characters depends on context, including
the letter-combinations involved, location within the word, and location of the word within the line.
<h4 id="justify-algos">
Minimum Requirements for ''text-justify/auto'' Justification</h4>
<p>For <a value for=text-justify>auto</a> justification, this specification does not define
what all of the <a>justification opportunities</a> are,
how they are prioritized, or
when and how multiple levels of <a>justification opportunities</a> interact.
However, it does require that:
<ul>
<li>
Unless contraindicated by the typographic traditions of the <a>content language</a> or adjacent symbols/punctuation,
each of the following provides a <a>justification opportunity</a>:
<ul>
<li><a>Word separators</a>
<li>The boundary between a <a>typographic character unit</a> of any <a>block scripts</a> and any other <a>typographic character unit</a>
<li>The boundary between a <a>typographic character unit</a> of any <a>clustered scripts</a> and any other <a>typographic character unit</a>
</ul>
<li>
All <a>letters</a> belonging to all <a>block scripts</a> are treated the same,
and all <a>letters</a> belonging to all <a>clustered scripts</a> are treated the same.
For example, no distinction is made between
the justification opportunity between a Han letter followed by another Han letter,
vs. the justification opportunity between a Han letter followed by a Hangul letter.
</ul>
<p>Further information on text justification can be found in (or submitted to) <a href="https://www.w3.org/International/articles/typography/justification">“Approaches to Full Justification”</a>,
which indexes by writing system and language,
and is maintained by the <a href="https://www.w3.org/International/">W3C Internationalization Working Group</a>. [[JUSTIFY]]
<h2 id="spacing">
Spacing</h2>
<p>CSS offers control over text spacing
via the 'word-spacing' and 'letter-spacing' properties, which specify additional space
around <a>word separators</a> or between <a>typographic character units</a>, respectively.
The 'word-spacing' property can now be specified in percentages,
making it possible to, for example, double or eliminate word spacing.
<div class="example">
<p>In the following example, word spacing is halved,
but may expand if needed for text justification.
<pre>p { word-spacing: -50%; }</pre>
</div>
<h3 id="word-spacing-property">
Word Spacing: the 'word-spacing' property</h3>
<pre class="propdef">
Name: word-spacing
Value: normal | <<length-percentage>>
Initial: normal
Applies to: <a>inline boxes</a>
Inherited: yes
Percentages: refers to width of the affected glyph
Computed value: the keyword ''word-spacing/normal'' or a computed <<length-percentage>> value
Animation type: by computed value type
Canonical order: n/a
</pre>
<p>This property specifies additional spacing
between &ldquo;words&rdquo;.
Missing values are assumed to be ''word-spacing:normal''.
Values are interpreted as defined below:
<dl dfn-for=word-spacing dfn-type=value>
<dt><dfn>normal</dfn>
<dd>No additional spacing is applied.
Computes to zero.
<dt><dfn>&lt;length&gt;</dfn>
<dd>Specifies extra spacing <em>in addition to</em>
the intrinsic inter-word spacing defined by the font.
<dt><dfn>&lt;percentage&gt;</dfn>
<dd>Specifies the additional spacing as a percentage of the affected
character's advance width.
</dl>
<p>Additional spacing is applied to each <a>word separator</a>
left in the text after the <a href="#white-space-rules">white space processing rules</a> have been applied,
and should be applied half on each side of the character
unless otherwise dictated by typographic tradition.
Values may be negative, but there may be implementation-dependent limits.
<div class="example">
<p>The following example will make all the spaces between words in Arabic
be rendered as zero-width, and double the width of each space in English:
<pre>
:lang(ar) { word-spacing: -100%; }
:lang(en) { word-spacing: 100%; }</pre>
<p>The following example will <em>add</em> half the width of the
&ldquo;0&rdquo; glyph to the spacing at each word-separator character [[CSS-VALUES-3]]:
<pre>p { word-spacing: 0.5ch; }</pre>
</div>
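<div class="example">
<p>As a worked illustration with hypothetical font metrics:
if the space glyph of the current font has an advance width of 0.25em,
percentage values resolve as noted in the comments below
(the class names are also hypothetical).
<pre>
/* Assumes a space glyph with an advance width of 0.25em */
p.loose { word-spacing: 50%; }    /* adds 0.125em: each space is 0.375em wide */
p.tight { word-spacing: -100%; }  /* subtracts 0.25em: each space is zero-width */
</pre>
</div>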
<p><dfn id="word-separator" lt="word-separator character | word separator">Word-separator characters</dfn>
are <a>typographic character units</a> whose purpose and general usage is to separate words.
In [[UNICODE]] this includes
the space (U+0020), the no-break space (U+00A0), the Ethiopic word space (U+1361),
the Aegean word separators (U+10100, U+10101), the Ugaritic word divider (U+1039F),
and the Phoenician word separator (U+1091F).
If there are no word-separator characters, or if a word-separating
character has a zero advance width (such as the zero width space U+200B),
then the user agent must not create any additional spacing between words.
General punctuation and fixed-width spaces (such as U+3000 and U+2000
through U+200A) are not considered word-separator characters.</p>
<h3 id="letter-spacing-property" caniuse="css-letter-spacing">
Tracking: the 'letter-spacing' property</h3>
<pre class="propdef">
Name: letter-spacing
Value: normal | <<length>>
Initial: normal
Applies to: <a>inline boxes</a>
Inherited: yes
Computed value: an absolute length
Animation type: by computed value type
Canonical order: n/a
</pre>
<p>This property specifies additional spacing (commonly called <dfn export>tracking</dfn>)
between adjacent <a>typographic character units</a>.
Letter-spacing is applied after
<a href="https://www.w3.org/TR/css-writing-modes-3/#text-direction">bidi reordering</a>
and is in addition to any 'word-spacing'.
Depending on the justification rules in effect,
user agents may further increase or decrease the space between <a>typographic character units</a>
in order to <a href="#text-justify-property">justify text</a>.
<p>Values have the following meanings:
<dl dfn-for=letter-spacing dfn-type=value>
<dt><dfn>normal</dfn>
<dd>No additional spacing is applied. Computes to zero.
<dt><dfn>&lt;length&gt;</dfn>
<dd>Specifies <em>additional</em> spacing between <a>typographic character units</a>.
Values may be negative, but there may be implementation-dependent limits.
</dl>
<p>For <a href="https://github.com/w3c/csswg-drafts/issues/1484">legacy reasons</a>,
a computed 'letter-spacing' of zero
yields a <a>resolved value</a> (<code>getComputedStyle()</code> return value)
of ''letter-spacing/normal''.
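<div class="example">
<p>A small sketch of the distinction (the class name is hypothetical):
<pre>
.tight { letter-spacing: 0; }
/* The computed value is a zero length, but per the rule above
   getComputedStyle(el).letterSpacing is expected to report "normal". */
</pre>
</div>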
<p>For the purpose of 'letter-spacing', each consecutive run of atomic
inlines (such as images and inline blocks) is treated as a single
<a>typographic character unit</a>.
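<div class="example">
<p>As an illustration of this rule (the image file names are hypothetical),
the two adjacent images below form a single run of atomic inlines.
Under the rule above, 1em of spacing would be expected
between “a” and the first image and between the second image and “b”,
but none between the two images.
<pre>
p { letter-spacing: 1em; }
&lt;p>a&lt;img src="x.png" alt="">&lt;img src="y.png" alt="">b&lt;/p></pre>
</div>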
<p>Letter-spacing must not be applied at the beginning or at the end of a line.
<div class="example">
<p>Because letter-spacing is not applied at the beginning or end of a line,
text always fits flush with the edge of the block.
<pre>
p { letter-spacing: 1em; }
&lt;p>abc&lt;/p></pre>
<p class="ls-ex good ls-fixed-width color-box" style="text-align: left">a&#x3000;b&#x3000;c</p>
<p class="ls-ex good ls-fixed-width color-box" style="text-align: right">a&#x3000;b&#x3000;c</p>
<p>UAs therefore must not append letter spacing to the right or trailing edge of a line:</p>
<p class="ls-ex bad ls-fixed-width color-box" style="text-align: right">a&#x3000;b&#x3000;c&#x3000;</p>
</div>
<p>Letter spacing between two <a>typographic character units</a> effectively “belongs”
to the innermost element that contains the two <a>typographic character units</a>:
the total letter spacing between two adjacent <a>typographic character units</a> (after bidi reordering)
is specified by and rendered within
the innermost element that <em>contains</em> the boundary between the two <a>typographic character units</a>.
<div class="example">
<p>A given value of 'letter-spacing' only affects the spacing
between characters completely contained within the element for which it is specified:
<pre>
p { letter-spacing: 1em; }
span { letter-spacing: 2em; }
&lt;p>a&lt;span>bb&lt;/span>c&lt;/p></pre>
<p class="ls-ex">a&#x3000;<span class="color-box">b&#x3000;&#x3000;b</span>&#x3000;c</p>
<p>This also means that applying 'letter-spacing' to
an element containing only a single character
has no effect on the rendered result:
<pre>
p { letter-spacing: 1em; }
span { letter-spacing: 2em; }
&lt;p>a&lt;span>b&lt;/span>c&lt;/p></pre>
<p class="ls-ex">a&#x3000;<span class="color-box">b</span>&#X3000;c</p>
<p>An inline box only includes
letter spacing between characters completely contained within that element:
<pre>
p { letter-spacing: 1em; }
&lt;p>a&lt;span>bb&lt;/span>c&lt;/p></pre>
<p class="ls-ex good">a&#x3000;<span class="color-box">b&#x3000;b</span>&#x3000;c</p>
<p>It is incorrect to include the letter spacing on the right or trailing edge of the element:
<p class="ls-ex bad">a&#x3000;<span class="color-box">b&#x3000;b&#x3000;</span>c</p>
<p>Letter spacing is inserted <strong>after</strong> RTL reordering,
so the letter spacing applied to the inner span below has no effect,
since after reordering the "c" doesn't end up next to "&#x5d0;":
<pre>
p { letter-spacing: 1em; }
span { letter-spacing: 2em; }
&lt;!-- abc followed by Hebrew letters alef (&#x5d0;), bet (&#x5d1;) and gimel (&#x5d2;) -->
&lt;!-- Reordering will display these in reverse order. --&gt;
<bdo dir=ltr>&lt;p>ab&lt;span>c&#x5d0;&lt;/span>&#x5d1;&#x5d2;&lt;/p></bdo></pre>
<p class="ls-ex">a&#x3000;b&#x3000;<span class="color-box">c</span>&#x3000;<span class="color-box">&#x5d0;</span>&#x3000;&#x5d1;&#x3000;&#x5d2;</p>
</div>
<p>Letter spacing ignores invisible zero-width formatting characters
(such as those from the Unicode Cf category).
Spacing must be added as if those characters did not exist in the document.
<p class="example">For example, 'letter-spacing' applied to
<code>A&amp;#x200B;B</code> is identical to <code>AB</code>,
regardless of where any element boundaries might fall.
<p>When the effective spacing between two characters is not zero
(due to either <a href="#text-justify-property">justification</a>
or a non-zero value of 'letter-spacing'),
user agents should not apply optional ligatures.
However, kerning should still be applied
(see 'font-kerning' [[CSS-FONTS-3]]).
<div class="example">
For example, if the word “filial” is letter-spaced,
an “fi” ligature should not be used
as it will prevent even spacing of the text.
<figure>
<p><strong style="letter-spacing: 0.5em">filial</strong> vs <strong style="letter-spacing: 0.5em">filial</strong>
</figure>
</div>
<h4 id="cursive-tracking">
Cursive Scripts</h4>
<p>If it is able, the UA <em>may</em> apply letter spacing to <a>cursive scripts</a>
by translating the total extra space to be distributed to a run of such letters
into some form of cursive elongation (or compression, for negative tracking values) for that run
that results in an equivalent total expansion (or compression) of the run.
Otherwise, if the UA cannot expand text from a <a>cursive script</a>
without breaking its cursive connections,
it <em>must not</em> apply spacing
between any pair of that script's <a>typographic letter units</a> at all
(effectively treating each word as a single <a>typographic letter unit</a>
for the purpose of letter-spacing).
Both cases will result in an effective spacing of zero between such letters;
however the former will preserve the sense of stretching out the text.
<div class="example">
<p>Below are some appropriate and inappropriate examples of spacing out Arabic text.
<table class="data">
<thead>
<tr><td><img src="images/arabic-stretch-original.png" alt="">
<td>&mdash;
<td>Original text
<tbody>
<tr><td><img src="images/arabic-stretch-spaced.png" alt="">
<td class="no">BAD
<th>Even distribution of space between each letter.
<em>Notice this breaks cursive joins!</em>
<tr><td><img src="images/arabic-stretch-kashida.png" alt="">
<td class="ok">OK
<th>Distributing &sum;<var>letter-spacing</var> by typographically-appropriate cursive elongation.
<em>The resulting text is as long as the previous evenly-spaced example.</em>
<tr><td><img src="images/arabic-stretch-suppressed.png" alt="">
<td class="ok">OK
<th>Suppressing 'letter-spacing' between Arabic letters.
<em>Notice 'letter-spacing' is nonetheless applied to non-Arabic characters (like spaces).</em>
<tr><td><img src="images/arabic-stretch-unjoined.png" alt="">
<td class="no">BAD
<th>Applying 'letter-spacing' only between non-joined letters.
<em>This distorts typographic color and obfuscates word boundaries.</em>
</table>
</div>
<div class="note">
Note: Proper cursive elongation or compression of a text
can vary depending on the
script, typeface, language,
location within a word, location within a line,
implementation complexity, font capabilities,
and calligraphic preferences,
and may not be possible in certain cases at all.
It may involve the use of shortening ligatures,
swash variants, contextual forms,
elongation glyphs such as U+0640 ARABIC TATWEEL,
or other microtypography.
It is outside the scope of CSS to define rules for these effects.
Authors should avoid applying 'letter-spacing' to cursive scripts
unless they are prepared to accept non-interoperable results.
</div>
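<div class="example">
<p>One way an author might follow this advice,
sketched here with an illustrative (not exhaustive) choice of languages,
is to track headings in general but restore normal spacing
for content tagged as being in a language written in a cursive script:
<pre>
h1 { letter-spacing: 0.15em; }
/* Illustrative language list; extend as appropriate for the content */
h1:lang(ar), h1:lang(fa) { letter-spacing: normal; }
</pre>
</div>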
<h3 id="boundary-shaping">
Shaping Across Element Boundaries</h3>
<p>Text shaping <em>must</em> be broken at inline box boundaries
when any of the following are true
for any box whose boundary separates the two <a>typographic character units</a>:
<ol>
<li>Any of 'margin'/'border'/'padding'
separating the two <a>typographic character units</a> in the inline axis
is non-zero.
<li>'vertical-align' is not ''vertical-align/baseline''.
<li>The boundary is a <a lt="bidi-isolates">bidi isolation boundary</a>.
</ol>
<p>Text shaping <em>must not</em> be broken across inline box boundaries
when there is no effective change in formatting,
or if the only formatting changes do not affect the glyphs
(as in applying <a href="https://www.w3.org/TR/css-text-decoration/">text decoration</a>).
<p>Text shaping <em>should not</em> be broken across inline box boundaries otherwise,
if it is reasonable and possible for that case given the limitations of the font technology.
<div class="example">
An example of reasonable and possible shaping across boundaries
is Arabic shaping:
in many systems this is performed by the font engine,
allowing the font to provide variant glyphs
with potentially very sophisticated contextual shaping.
It's not generally possible to rely on this system across a font change
unless the font engine has an API to provide context,
but it is straightforward and therefore quite reasonable
for an engine to work around this limitation by, for example,
using the zero-width-joiner (U+200D) or zero-width-non-joiner (U+200C)
as appropriate to solicit the correct choice of
initial/medial/final/isolated glyph.
An example of possible but not reasonable shaping across boundaries
is handling a font that is sensitive to 20 characters of context
on either side to choose its glyphs:
passing all the text before <em>and after</em> the string in question,
even through multiple inline boundaries with formatting changes,
is complicated.
The UA <em>could</em> handle such cases,
but is not required to,
as they are not typical or fundamentally required
by any modern writing system.
An example of impossible shaping across boundaries
is a change in font weight partway through the word “and”
in a font where a ligature would replace
all three letters of the word “and”
with an ampersand glyph (“&amp;”).
<!--
It's simply not possible for the UA
to create the effect of a partway-bold single glyph.
-->
</div>
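<div class="example">
<p>As a rough sketch of how the requirements in this section
might map onto author styles
(the class names are hypothetical, and each is assumed to be applied
to an inline element that starts or ends partway through a word):
<pre>
.pad { padding-left: 0.25em; }        /* non-zero inline-axis padding: shaping must be broken */
.sup { vertical-align: super; }       /* not baseline: shaping must be broken */
.ul  { text-decoration: underline; }  /* glyphs unaffected: shaping must not be broken */
.em  { font-weight: bold; }           /* shaping should not be broken, where reasonable and possible */
</pre>
</div>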
<h2 id="edge-effects">
Edge Effects</h2>
<p>Edge effects control
the indentation of lines with respect to other lines in the block ('text-indent')
and how content is measured at the start and end edges of a line ('hanging-punctuation').
<h3 id="text-indent-property" caniuse="css-text-indent">
First Line Indentation: the 'text-indent' property</h3>
<pre class="propdef">
Name: text-indent
Value: [ <<length-percentage>> ] && hanging? && each-line?
Initial: 0
Applies to: block containers
Inherited: yes
Percentages: refers to block container’s own <a>inline-axis</a> <a>inner size</a>
Computed value: computed <<length-percentage>> value, plus any specified keywords
Animation type: by computed value type
Canonical order: <abbr title="follows order of property value definition">per grammar</abbr>
</pre>
<p>This property specifies the indentation applied to lines of inline
content in a block. The indent is treated as a margin applied to
the start edge of the line box.
<p>Unless otherwise specified by the ''each-line'' and/or <a value for=text-indent>hanging</a> keywords,
only lines that are the
<a href="https://www.w3.org/TR/CSS2/selector.html#first-line-pseudo">first formatted line</a> [[!CSS2]]
of an element are affected.
For example, the first line of an anonymous block box is only affected
if it is the first child of its parent element.
<p>Values have the following meanings:</p>
<dl dfn-for=text-indent dfn-type=value>
<dt><dfn>&lt;length&gt;</dfn>
<dd>Gives the amount of the indent as an absolute length.</dd>
<dt><dfn>&lt;percentage&gt;</dfn>
<dd>Gives the amount of the indent as a percentage of the block container's
own logical width.
Percentages must be treated as ''0'' for the purpose of calculating [=intrinsic size contributions=],
but are always resolved normally when performing layout.
<p class="note">Note: This can lead to the element overflowing.
It is not recommended to use percentage indents and intrinsic sizing together.
</dd>
<dt><dfn>each-line</dfn>
<dd>Indentation affects the first line of each block container
and each line after a <a>forced line break</a>
(but not lines after a <a>soft wrap break</a>).
<dt><dfn>hanging</dfn>
<dd>Inverts which lines are affected.</dd>
</dl>
<div class="example">
<p>If 'text-align' is ''start'' and 'text-indent' is ''5em'' in
left-to-right text with no floats present, then the first line of text
will start 5em into the block:</p>
<pre class="output">
Since CSS1 it has been possible to
indent the first line of a block element
5em by setting the 'text-indent' property&nbsp;
to '5em'.
</pre>
<p>If we add the ''text-indent/hanging'' keyword,
then the first line will start flush,
but other lines will be indented 5em:
<pre class="output">
In CSS3 we can instead indent all other
lines of the block element by 5em
by setting the 'text-indent' property
to 'hanging 5em'.
</pre>
</div>
<div class="example">
<p>Since the 'text-indent' property only affects the “first formatted line”,
a line after a forced break will not be indented.
<pre class="output">
For example, in the middle of
this paragraph is an equation,
which is centered:
x + y = z
The first line after the equation
is flush (else it would look like
we started a new paragraph).
</pre>
<p>However, sometimes (as in poetry or code),
it is appropriate to indent each line
that happens to be long enough to wrap.
In the following example, 'text-indent'
is given a value of ''3em hanging each-line'',
giving the third line of the poem a hanging indent
where it soft-wraps at the block's right boundary:
<pre class="output">
In a short line of text
There need be no wrapping,
But when we go on and on and on&nbsp;&nbsp;
   and on,
Sometimes a soft break
Can help us stay on the page.
</pre>
</div>
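<div class="example">
<p>A rule along these lines could produce the poem layout above.
This is a sketch only: the <code>.poem</code> class is hypothetical,
and ''white-space: pre-line'' is an assumption about how the poem's line breaks
are preserved as <a>forced line breaks</a>:
<pre class="css">
/* Hypothetical class name; assumes each verse line ends with a newline in the source. */
.poem {
  white-space: pre-line;                /* keep the author's line breaks as forced breaks */
  text-indent: 3em hanging each-line;   /* indent only lines that follow a soft wrap break */
}
</pre>
</div>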
<p class="note">Note: Since the 'text-indent' property inherits,
when specified on a block element, it will affect descendant
inline-block elements.
For this reason, it is often wise to specify 'text-indent: 0' on
elements that are specified 'display: inline-block'.</p>
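<div class="example">
<p>For instance, assuming a hypothetical <code>.badge</code> class
used for inline-block labels inside indented paragraphs,
the inherited indent can be reset as follows:
<pre class="css">
/* Hypothetical selectors, used for illustration only. */
p { text-indent: 2em; }

p .badge {
  display: inline-block;
  text-indent: 0;   /* otherwise the badge's own first line inherits the 2em indent */
}
</pre>
</div>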
<h3 id="hanging-punctuation-property" caniuse="css-hanging-punctuation">
Hanging Punctuation: the 'hanging-punctuation' property</h3>
<pre class="propdef">
Name: hanging-punctuation
Value: none | [ first || [ force-end | allow-end ] || last ]
Initial: none
Applies to: <a>inline boxes</a>
Inherited: yes
Canonical order: <abbr title="follows order of property value definition">per grammar</abbr>
Computed value: specified keyword(s)
Animation type: discrete
</pre>
<p>This property determines whether a punctuation mark, if one is present,
<a>hangs</a> and may be placed outside the line box (or in the indent)
at the start or at the end of a line of text.
<p class="note">Note: If there is not sufficient padding on the
block container, 'hanging-punctuation' can trigger overflow.</p>
<p>When a punctuation mark <dfn lt="hang">hangs</dfn>, it is not considered
when measuring the line's contents for fit, alignment, or justification.
Depending on the line's alignment/justification, this can
result in the mark being placed outside the line box.
(The interaction of this measurement and kerning is currently UA-defined;
the CSSWG <a href="https://github.com/w3c/csswg-drafts/issues/2397">welcomes advice</a> on this point.)
<p>Values have the following meanings:</p>
<dl dfn-for=hanging-punctuation dfn-type=value>
<dt><dfn>none</dfn></dt>
<dd>No character <a>hangs</a>.</dd>
<dt><dfn>first</dfn></dt>
<dd>An opening bracket or quote at the start of the
<a href="https://www.w3.org/TR/CSS2/selector.html#first-line-pseudo">first
formatted line</a> of an element <a>hangs</a>.
This applies to all characters in the Unicode categories Ps, Pf, Pi
plus the ASCII quote marks “'” U+0027 and “"” U+0022.
<dt><dfn>last</dfn></dt>
<dd>A closing bracket or quote at the end of the
last formatted line of an element <a>hangs</a>.
This applies to all characters in the Unicode categories Pe, Pf, Pi
plus the ASCII quote marks “'” U+0027 and “"” U+0022.
<dt><dfn>force-end</dfn></dt>
<dd>A <a>stop or comma</a> at the end of a line <a>hangs</a>.</dd>
<dt><dfn>allow-end</dfn></dt>
<dd>A <a>stop or comma</a> at the end of a line <a>hangs</a> if it
does not otherwise fit prior to justification.</dd>
</dl>
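<div class="example">
<p>As a sketch of how these values might be combined,
the following rule lets an opening quote hang at the start of the first line
and a closing quote hang at the end of the last line of a quotation.
The selector and the padding amount are assumptions;
the padding is there only to leave room for the hanging marks
so that they are less likely to overflow the block container:
<pre class="css">
/* Hypothetical selector; 1em of padding is an arbitrary illustrative amount. */
blockquote p {
  hanging-punctuation: first last;
  padding: 0 1em;
}
</pre>
</div>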
<p>Non-zero inline-axis borders or padding between
a <a>hang</a>able mark and the edge of the line prevent the mark from hanging.
For example, a period at the end of an inline box with end padding
does not <a>hang</a> at the end edge of a line.
At most one punctuation character may <a>hang</a> at each edge of the line.
<p>A <a lt=hang>hanging</a> punctuation mark
is still enclosed inside its parent inline box,
is still counted as part of the <a>scrollable overflow region</a> [[!CSS-OVERFLOW-3]],
and still participates in text justification:
its character advance is just not measured when determining
how much content fits on the line,
how much the line's contents need to be expanded or compressed for justification,
or how to position the content within the line box for text alignment.
Effectively, the <a>hanging</a> punctuation mark’s character advance
is re-interpreted as an additional negative margin
on the affected edge of its parent <a>inline box</a>;
the line is otherwise laid out as usual.
<p><dfn lt="stop or comma">Stops and commas</dfn> allowed to <a>hang</a> include:
<table class="data">
<tr><td>U+002C <td>&#x002C; <td>COMMA
<tr><td>U+002E <td>&#x002E; <td>FULL STOP
<tr><td>U+060C <td>&#x060C; <td>ARABIC COMMA
<tr><td>U+06D4 <td>&#x06D4; <td>ARABIC FULL STOP
<tr><td>U+3001 <td>&#x3001; <td>IDEOGRAPHIC COMMA
<tr><td>U+3002 <td>&#x3002; <td>IDEOGRAPHIC FULL STOP
<tr><td>U+FF0C <td>&#xFF0C; <td>FULLWIDTH COMMA
<tr><td>U+FF0E <td>&#xFF0E; <td>FULLWIDTH FULL STOP
<tr><td>U+FE50 <td>&#xFE50; <td>SMALL COMMA
<tr><td>U+FE51 <td>&#xFE51; <td>SMALL IDEOGRAPHIC COMMA
<tr><td>U+FE52 <td>&#xFE52; <td>SMALL FULL STOP
<tr><td>U+FF61 <td>&#xFF61; <td>HALFWIDTH IDEOGRAPHIC FULL STOP
<tr><td>U+FF64 <td>&#xFF64; <td>HALFWIDTH IDEOGRAPHIC COMMA
</table>
<p>The UA may include other characters as appropriate.
<p class="note">Note: The CSS Working Group would appreciate it if UAs including
other characters would <a href="#status">inform the working group</a>
of such additions.</p>
<div class="example">
<p>The ''allow-end'' and ''force-end'' values are two variations of
hanging punctuation used in East Asia.</p>
<div>
<div class="sidefigure">
<img src="images/hanging-punctuation-allow-end.png"
width="202" height="51"
alt="hanging-punctuation: allow-end">
</div>
<pre class="css">
p {
  text-align: justify;
  hanging-punctuation: allow-end;
}</pre>
<div class="sidefigure">
<img src="images/hanging-punctuation-force-end.png"
width="202" height="51"
alt="hanging-punctuation: force-end">
</div>
<pre class="css">
p {
  text-align: justify;
  hanging-punctuation: force-end;
}
</pre>
</div>
<p>The punctuation at the end of the first line for ''allow-end''
does not hang, because it fits without hanging.
However, if ''force-end'' is used, it is forced to hang.
The justification measures the line without the hanging punctuation.
Therefore when the line is expanded, the punctuation is pushed outside the line.</p>
</div>
<h3 id="bidi-linebox">
Bidirectionality and Line Boxes</h3>
<p>The <a spec=css-writing-modes-3>start</a> and <a spec=css-writing-modes-3>end</a> edges of a line box
are determined by the <a>inline base direction</a> of the line box.
In most cases, this is given by its containing block's computed 'direction'.
<p>However if its containing block has ''unicode-bidi: plaintext'' [[!CSS-WRITING-MODES-3]],
the line box's <a>inline base direction</a> must be determined
by the <a>inline base direction</a> of the <a>bidi paragraph</a> to which it belongs:
that is, the <a>bidi paragraph</a> for which the line box holds content.
An empty line box
(i.e. one that contains no atomic inlines or
characters other than the line-breaking character, if any),
takes its <a>inline base direction</a> from the preceding line box (if any), or,
if this is the first line box in the containing block,
then from the 'direction' property of the containing block.
<div class="example">
<p>In the following example, assuming the <code>&lt;block&gt;</code>
is a preformatted block (''display: block; white-space: pre'') inheriting
''text-align: start'', every other line is right-aligned:</p>
<pre>
&lt;block style="unicode-bidi: plaintext"&gt;
Latin
&#x0648;·&#x06A9;&#x0645;&#x06CC;
Latin
&#x0648;·&#x06A9;&#x0645;&#x06CC;
Latin
&#x0648;·&#x06A9;&#x0645;&#x06CC;
&lt;/block&gt;</pre>
</div>
<p class="note">Note: The inline base direction determined here
applies to the line box itself, and not to its contents.
It affects 'text-align-all', 'text-align-last', 'text-indent', and 'hanging-punctuation',
i.e. the position and alignment of its contents with respect to its edges.
It does not affect the formatting or ordering of its content.
<div class="example">
<p>In the following example:
<pre>
&lt;para style="display: block; direction: rtl; unicode-bidi:plaintext">
"&lt;quote style="unicode-bidi:plaintext">שלום!&lt;/quote>", he said.
&lt;/para&gt;</pre>
<p>The result should be a left-aligned line looking like this:
<pre>"!&#1513;&#1500;&#1493;&#1501;", he said.</pre>
<p>The line is left-aligned
(despite the containing block having ''direction: rtl'')
because the containing block (the <code>&lt;para></code>) has ''unicode-bidi:plaintext'',
and the line box belongs to a bidi paragraph that is LTR.
This is because that paragraph's first character with a strong direction
is the LTR "h" from "he". The RTL "שלום!" does precede the "he",
but it sits in its own bidi-isolated paragraph that is <em>not</em>
immediately contained by the <code>&lt;para&gt;</code>,
and is thus irrelevant to the line box's alignment.
From the standpoint of the bidi paragraph immediately contained
by the <code>&lt;para&gt;</code> containing block,
the <code>&lt;quote&gt;</code>&rsquo;s bidi-isolated paragraph inside it is,
by definition, just a neutral U+FFFC character,
so the immediately-contained paragraph becomes LTR by virtue
of the "he" following it.
</div>
<div class="example">
<pre>
&lt;fieldset style="direction: rtl">
&lt;textarea style="unicode-bidi:plaintext">

Hello!

&lt;/textarea>
&lt;/fieldset></pre>
<p>As expected, the "Hello!" should be displayed LTR
(i.e. with the exclamation mark on the right end,
despite the <code>&lt;textarea></code>’s ''direction:rtl'')
and left-aligned.
This makes the empty line following it left-aligned as well,
which means that the caret on that line should appear at its
left edge. The first empty line, on the other hand, should
be right-aligned, due to the RTL direction of its containing
paragraph, the <code>&lt;textarea></code>.
</div>
<h2 class="no-num" id="order">Appendix A:
Text Processing Order of Operations</h2>
<p>The following list defines the order of text operations.
(Implementations are not bound to this order as long as the resulting layout is the same.)
<ol>
<li><a href="#white-space-phase-1">white space processing</a> part I (pre-wrapping)
<li><a href="#transforming">text transformation</a>
<li><a href="https://www.w3.org/TR/css-writing-modes-3/#text-combine-horizontal">text combination</a>
<li><a href="https://www.w3.org/TR/css-writing-modes-3/#text-orientation">text orientation</a> [[!CSS-WRITING-MODES-3]]
<li><a href="#wrapping">text wrapping</a> while applying per line:
<ul>
<li><a href="#text-indent-property">indentation</a>
<li><a href="https://www.w3.org/TR/css-writing-modes-3/#text-direction">bidirectional reordering</a> [[!CSS2]] / [[!CSS-WRITING-MODES-3]]
<li><a href="#white-space-phase-2">white space processing</a> part II
<li><a href="https://www.w3.org/TR/css-fonts-3/">font/glyph selection and positioning</a> [[!CSS2]] / [[!CSS-FONTS-3]]
<li>'letter-spacing' and 'word-spacing'
<li><a href="#hanging-punctuation-property">hanging punctuation</a>
</ul>
<li><a href="#justification">justification</a> (which may affect glyph selection and/or text wrapping, looping back into that step)
<li><a href="#text-align-property">text alignment</a>
</ol>
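<div class="example">
<p>As an informal illustration of the list above,
the comments in the following rule note the step at which each declaration takes effect.
The selector and values are hypothetical; only the ordering is the point:
<pre class="css">
/* Hypothetical rule; comments map each declaration to a step in the list above. */
p.sample {
  white-space: normal;             /* white space processing, parts I and II */
  text-transform: capitalize;      /* text transformation, before wrapping */
  text-indent: 2em;                /* indentation, applied per line while wrapping */
  letter-spacing: 0.05em;          /* spacing, after glyph selection and positioning */
  hanging-punctuation: allow-end;  /* hanging punctuation, last of the per-line steps */
  text-align: justify;             /* justification, then text alignment */
}
</pre>
</div>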
<h2 class="no-num" id="plaintext">
Appendix B: Conversion to Plaintext</h2>
<p>This appendix is normative for the purpose of plaintext copy-paste operations.
<!-- https://lists.w3.org/Archives/Public/www-style/2016Oct/0115.html
https://lists.w3.org/Archives/Public/public-editing-tf/2016Apr/0005.html -->
<p>When a CSS-rendered document is converted to a plaintext format,
it is expected that:
<ul>
<li>The 'text-transform' property has no effect.
<li>[[#white-space-phase-1]] is applied and any sequence of
<a>collapsible</a> spaces at the beginning of a <a>block</a>
or immediately following a <a>forced line break</a> is removed.
</ul>
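<div class="example">
<p>As a sketch of these expectations (the markup is illustrative only),
copying the paragraph below as plaintext is expected to yield “hello world”:
the text stays lowercase because 'text-transform' has no effect,
the leading spaces are removed,
and the run of spaces between the words collapses to a single space.
<pre>
&lt;!-- Illustrative markup; assumes the default white-space: normal --&gt;
&lt;p style="text-transform: uppercase"&gt;   hello    world&lt;/p&gt;</pre>
</div>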
<h2 class="no-num" id="default-stylesheet">
Appendix C: Default UA Stylesheet</h2>
<p>This appendix is informative.
It is intended to help UA developers implement a default stylesheet for HTML,
but UA developers are free to ignore or modify it as appropriate.
<div class="example">
<pre>
/* make list items and option elements align together */
li, option { text-align: match-parent; }</pre>
</div>
<p class="feedback issue">If you find any issues, recommendations to add, or corrections,
please send the information to <a href="mailto:www-style@w3.org">www-style@w3.org</a>
with <kbd>[css-text]</kbd> in the subject line.</p>
<h2 class="no-num" id="script-groups">
Appendix D: Scripts and Spacing</h2>
<p><em>This appendix is normative.</em></p>
<p>Typographic behavior varies somewhat by language, but varies drastically by writing system.
This appendix categorizes some common <a lt="Unicode script">scripts</a> in Unicode 6.0
according to their justification and spacing behavior.
Category descriptions are descriptive, not prescriptive;
the determining factor is the prioritization of <a>justification opportunities</a>.
<dl>
<dt><dfn>block scripts</dfn></dt>
<dd>CJK and by extension all Wide characters (see [[!UAX11]]).
The following <a>Unicode scripts</a> are included:
Bopomofo, Han, Hangul, Hiragana, Katakana, and Yi.
Characters of the <a>East Asian Width property</a> <code>W</code> and <code>F</code> are also included,
but <code>A</code> characters are included only if the <a>content language</a> is Chinese, Korean, or Japanese.
<dt><dfn>clustered scripts</dfn></dt>
<dd>Clustered scripts have discrete units
and break only at word boundaries,
but do not use visible word separators.
They prioritize stretching spaces,
but comfortably admit inter-character spacing for justification.
The clustered scripts include, but are not limited to, the following <a>Unicode scripts</a>:
Khmer,
Lao,
Myanmar,
New Tai Lue,
Tai Le,
Tai Tham,
Tai Viet,
Thai
<dt><dfn lt="cursive script">cursive scripts</dfn>
<dd>Cursive scripts do not admit gaps between their letters for either justification or 'letter-spacing'.
The following <a>Unicode scripts</a> are included:
Arabic,
Mandaic,
Mongolian,
N'Ko,
Phags Pa,
Syriac
</dl>
<p>User agents should update this list as they update their Unicode support
to handle as-yet-unencoded cursive scripts in future versions of Unicode,
and are encouraged to ask the CSSWG to update this spec accordingly.
<p class="issue">Should block and clustered scripts be merged?
They have different tolerances for space-justification vs inter-character justification,
but both admit both.
<h2 id="character-properties" class="no-num">Appendix E.
Characters and Properties</h2>
<p>Unicode defines four codepoint-level properties that are referenced
in CSS typesetting:
<dl export>
<dt><dfn lt="Unicode East Asian Width|East Asian Width property"><a href="http://www.unicode.org/reports/tr11/#Definitions">East Asian width property</a></dfn>
<dd>Defined in [[!UAX11]] and given as the <code>East_Asian_Width</code> property
in the Unicode Character Database [[!UAX44]].
<dt><dfn lt="Unicode General Category|Unicode category|General Category"><a href="http://www.unicode.org/reports/tr44/#General_Category_Values">general category</a></dfn>
<dd>Defined in [[!UAX44]] and given as the <code>General_Category</code> property
in the Unicode Character Database [[!UAX44]].
<dt><dfn lt="Unicode Script|Script property"><a href="http://www.unicode.org/reports/tr24/#Values">script property</a></dfn>
<dd>Defined in [[!UAX24]] and given as the <code>Script</code> property
in the Unicode Character Database [[!UAX44]].
(UAs must include any ScriptExtensions.txt assignments in this mapping.)
<dt><a href="http://www.unicode.org/reports/tr50/">Vertical Orientation</a>
<dd>Defined in [[!UTR50]] as the Vertical_Orientation property
and given in the UTR50 data file.
</dl>
<p>Unicode defines properties for individual codepoints, but sometimes
it is necessary to determine the properties of a <a>typographic character unit</a>.
For the purposes of CSS Text,
the properties of a <a>typographic character unit</a> are given by
the base character of its first <a>grapheme cluster</a>—except in two cases:
<ul>
<li><a>Grapheme clusters</a> formed with an Enclosing Mark (<code>Me</code>) of the Common script
are considered to be Other Symbols (<code>So</code>) in the Common script.
They are assumed to have the same Unicode properties as the Replacement Character U+FFFD.
<li><a>Grapheme clusters</a> formed with a Space Separator (<code>Zs</code>) as the base
are considered to be Modifier Symbols (<code>Sk</code>).
They are assumed to have the same East Asian Width property as the base,
but take their other properties from the first combining character in the sequence.
</ul>
<h2 id="script-tagging" class="no-num">Appendix F.
Tagging Content by Writing System</h2>
<p>While most languages have a preferred writing system,
many can also be transcribed into a different writing system.
As a common example, most languages have at least one Latin transcription,
and can thus be written in the Latin writing system.
In these cases the document typically adopts the typographic conventions of the Latin writing system:
for example Japanese “romaji” and Chinese Pinyin use Latin letters and word spaces,
and follow Latin line-breaking and justification practices accordingly.
As another example, historical ideographic Korean
(<code>ko-Hani</code>)
does not use word spaces,
and should therefore be typeset as for Chinese.
<p>Authors can indicate the use of an atypical writing system
with script subtags.
For example, to indicate use of the Latin writing system
for languages which don't natively use it,
the <code>-Latn</code> script subtag can be added,
e.g. <code>ja-Latn</code> for Japanese romaji.
Other subtags exist for other writing systems:
see [[BCP47]], [[ISO15924]], and the <a href="http://unicode.org/iso15924/iso15924-codes.html">ISO15924 script tag registry</a>.
Some common/historical examples follow:
<div class="example">
<dl>
<dt><code>zh-Latn</code>
<dd>Chinese, written in Latin transcription.
<dt><code>ko-Hani</code>
<dd>Korean, written in Hanja (Chinese ideographic characters).
<dt><code>tr-Arab</code>
<dd>Turkish, written in Arabic script.
<dt><code>mn-Cyrl</code>
<dd>Mongolian, written in Cyrillic.
<dt><code>mn-Mong</code>
<dd>Mongolian, written in traditional Mongolian script.
</dl>
</div>
<p>UAs should assume the most common writing system
of the specified <a>content language</a>
when choosing typographic behaviors
such as line-breaking or justification strategies,
but must not assume that writing system
if the author has explicitly indicated a different one.
If the UA has no language-specific knowledge
of a particular language and writing system combination,
it must use the typographic conventions of the specified writing system
(assuming the conventions of a different language if necessary),
not the conventions of that language in a different writing system,
which would be inappropriate to the writing system used in this case.
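<div class="example">
<p>For instance, a document mixing native Japanese with romaji
might tag the two as follows,
so that the UA can apply Latin line-breaking and justification conventions
to the second paragraph.
The markup and its text content are illustrative assumptions:
<pre>
&lt;!-- Illustrative markup only --&gt;
&lt;p lang="ja"&gt;日本語&lt;/p&gt;
&lt;p lang="ja-Latn"&gt;Nihongo&lt;/p&gt;</pre>
</div>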
<p>More advice on language tagging can be found in
the <a href="https://www.w3.org/International/core/">Internationalization Working Group</a>’s
<a href="https://www.w3.org/International/articles/language-tags/">“Language tags in HTML and XML”</a>
and <a href="https://www.w3.org/International/questions/qa-choosing-language-tags">“Choosing a Language Tag”</a>.
<h2 id="small-kana" class=no-num>Appendix G.
Small Kana Mappings</h2>
<style>
.pairs-table th {
text-align: center;
}
</style>
<div class="pairs-table">
<table class=data>
<caption>Small Kana Map to Full-size Kana</caption>
<thead>
<tr>
<th scope=col><dfn noexport for=kana lt="small | small kana">Small</dfn>
<th scope=col><dfn noexport for=kana lt="full-size |full-size kana">Full-size</dfn>
<tbody title="Hiragana">
<!-- Vowels (Hiragana) -->
<tr>
<td>&#x3041; U+3041
<td>&#x3042; U+3042
<tr>
<td>&#x3043; U+3043
<td>&#x3044; U+3044
<tr>
<td>&#x3045; U+3045
<td>&#x3046; U+3046
<tr>
<td>&#x3047; U+3047
<td>&#x3048; U+3048
<tr>
<td>&#x3049; U+3049
<td>&#x304A; U+304A
<!-- Consonant K (Hiragana) -->
<tr>
<td>&#x3095; U+3095
<td>&#x304B; U+304B
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tr>
<td>&#x3096; U+3096
<td>&#x3051; U+3051
<!--
<tr>
<td>
<td>
-->
<!-- Consonant T (Hiragana) -->
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tr>
<td>&#x3063; U+3063
<td>&#x3064; U+3064
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<!-- Consonant Y (Hiragana) -->
<tr>
<td>&#x3083; U+3083
<td>&#x3084; U+3084
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#x3085; U+3085
<td>&#x3086; U+3086
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#x3087; U+3087
<td>&#x3088; U+3088
<!-- Consonant W (Hiragana) -->
<tr>
<td>&#x308E; U+308E
<td>&#x308F; U+308F
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tbody title="Katakana">
<!-- Vowels (Katakana) -->
<tr>
<td>&#x30A1; U+30A1
<td>&#x30A2; U+30A2
<tr>
<td>&#x30A3; U+30A3
<td>&#x30A4; U+30A4
<tr>
<td>&#x30A5; U+30A5
<td>&#x30A6; U+30A6
<tr>
<td>&#x30A7; U+30A7
<td>&#x30A8; U+30A8
<tr>
<td>&#x30A9; U+30A9
<td>&#x30AA; U+30AA
<!-- Consonant K (Katakana) -->
<tr>
<td>&#x30F5; U+30F5
<td>&#x30AB; U+30AB
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#x31F0; U+31F0
<td>&#x30AF; U+30AF
<tr>
<td>&#x30F6; U+30F6
<td>&#x30B1; U+30B1
<!--
<tr>
<td>
<td>
-->
<!-- Consonant S (Katakana) -->
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#x31F1; U+31F1
<td>&#x30B7; U+30B7
<tr>
<td>&#x31F2; U+31F2
<td>&#x30B9; U+30B9
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<!-- Consonant T (Katakana) -->
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tr>
<td>&#x30C3; U+30C3
<td>&#x30C4; U+30C4
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#x31F3; U+31F3
<td>&#x30C8; U+30C8
<!-- Consonant N (Katakana) -->
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tr>
<td>&#x31F4; U+31F4
<td>&#x30CC; U+30CC
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<!-- Consonant H (Katakana) -->
<tr>
<td>&#x31F5; U+31F5
<td>&#x30CF; U+30CF
<tr>
<td>&#x31F6; U+31F6
<td>&#x30D2; U+30D2
<tr>
<td>&#x31F7; U+31F7
<td>&#x30D5; U+30D5
<tr>
<td>&#x31F8; U+31F8
<td>&#x30D8; U+30D8
<tr>
<td>&#x31F9; U+31F9
<td>&#x30DB; U+30DB
<!-- Consonant M (Katakana) -->
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tr>
<td>&#x31FA; U+31FA
<td>&#x30E0; U+30E0
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<!-- Consonant Y (Katakana) -->
<tr>
<td>&#x30E3; U+30E3
<td>&#x30E4; U+30E4
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#x30E5; U+30E5
<td>&#x30E6; U+30E6
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#x30E7; U+30E7
<td>&#x30E8; U+30E8
<!-- Consonant R (Katakana) -->
<tr>
<td>&#x31FB; U+31FB
<td>&#x30E9; U+30E9
<tr>
<td>&#x31FC; U+31FC
<td>&#x30EA; U+30EA
<tr>
<td>&#x31FD; U+31FD
<td>&#x30EB; U+30EB
<tr>
<td>&#x31FE; U+31FE
<td>&#x30EC; U+30EC
<tr>
<td>&#x31FF; U+31FF
<td>&#x30ED; U+30ED
<!-- Consonant W (Katakana) -->
<tr>
<td>&#x30EE; U+30EE
<td>&#x30EF; U+30EF
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tbody title="Half-width Katakana">
<!-- Vowels (hwid) -->
<tr>
<td>&#xFF67; U+FF67
<td>&#xFF71; U+FF71
<tr>
<td>&#xFF68; U+FF68
<td>&#xFF72; U+FF72
<tr>
<td>&#xFF69; U+FF69
<td>&#xFF73; U+FF73
<tr>
<td>&#xFF6A; U+FF6A
<td>&#xFF74; U+FF74
<tr>
<td>&#xFF6B; U+FF6B
<td>&#xFF75; U+FF75
<!-- Consonant T (hwid Katakana) -->
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<tr>
<td>&#xFF6F; U+FF6F
<td>&#xFF82; U+FF82
<!--
<tr>
<td>
<td>
<tr>
<td>
<td>
-->
<!-- Consonant Y (hwid Katakana) -->
<tr>
<td>&#xFF6C; U+FF6C
<td>&#xFF94; U+FF94
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#xFF6D; U+FF6D
<td>&#xFF95; U+FF95
<!--
<tr>
<td>
<td>
-->
<tr>
<td>&#xFF6E; U+FF6E
<td>&#xFF96; U+FF96
</table>
</div>
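<div class="example">
<p>A minimal sketch of how this mapping might be used,
assuming the at-risk ''full-size-kana'' value of 'text-transform' is supported:
the rule below converts small kana in ruby annotations to their full-size equivalents,
so that an annotation containing &#x3083; (U+3083) would be rendered with &#x3084; (U+3084).
Targeting <code>rt</code> is an assumption about the author's markup.
<pre class="css">
/* Assumes ruby annotations are marked up with &lt;rt&gt;. */
rt {
  text-transform: full-size-kana;
}
</pre>
</div>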
<h2 id="priv-sec" class="no-num">
Privacy and Security Considerations</h2>
<p>This specification introduces no new security considerations.
<p>Regarding privacy, this specification can leak which hyphenation and line-breaking dictionaries the user has installed.
<h2 class="no-num" id="acknowledgements">
Acknowledgements</h2>
<p>This specification would not have been possible without the help from:
Ayman Aldahleh, Bert Bos, Tantek Çelik, James Clark, Stephen Deach, John Daggett,
Martin Dürst,
Laurie Anna Edlund, Ben Errez, Yaniv Feinberg, Arye Gittelman, Ian
Hickson, Martin Heijdra, Richard Ishida, Masayasu Ishikawa,
Michael Jochimsen, Eric LeVine, Ambrose Li, Håkon Wium Lie, Chris Lilley,
Ken Lunde, Nat McCully, IM Mincheol, Shinyu Murakami, Paul Nelson,
Chris Pratley, Xidorn Quan, Marcin Sawicki,
Arnold Schrijver, Rahul Sonnad, Alan Stearns, Michel Suignard, Takao Suzuki,
Frank Tang, Chris Thrasher, Etan Wexler, Chris Wilson, Masafumi Yabe
and Steve Zilles.
<h2 class="no-num" id="changes">
Changes</h2>
ISSUE: THIS CHANGES LIST IS WAY INCOMPLETE PLEASE SEE
<a href="https://drafts.csswg.org/css-text-3/issues-lc-2013">Disposition of Comments</a>.
<h3 class="no-num" id="recent-changes">
Changes from the <a href="https://www.w3.org/TR/2013/WD-css-text-3-20131010/">October
2013 CSS3 Text <abbr title="Last Call Working Draft">LCWD</abbr></a></h3>
<ul>
<li>Switched 'tab-size' to use <<number>> so that it is animatable,
and defined it to also account for 'letter-spacing' and 'word-spacing'.
<li>Made 'text-align' a shorthand of 'text-align-last' and the new 'text-align-all' property.
<li>Removed dependence of 'text-align-last' on ''text-align: justify'',
since the problem it solves is now solved by the shorthanding relationship.
<li>Qualified that only lowercase letters are titlecased for ''text-transform: capitalize''; uppercase letters remain unaffected.
<li>For ''word-break: break-all'', switched to UAX14 notion of “letters”,
since that handles symbols better.
<li>Added ''line-break: anywhere''.
<li>Tweaked handling of Ambiguous characters during <a href="#line-break-transform">segment break transformation</a> to respond to language context.
</ul>
<h3 class="no-num" id="changes-2013">
Changes from the <a href="https://www.w3.org/TR/2013/WD-css-text-3-20131010/">October
2013 CSS3 Text <abbr title="Last Call Working Draft">LCWD</abbr></a></h3>
<p>Major changes include:
<ul>
<li>
</ul>
<p>Significant details updated:
<ul>
<li>
</ul>