Finally a tool for typography nerds.
JoliTypo is a tool that fixes microtypography glitches inside your HTML content.
use JoliTypo\Fixer;
$fixer = new Fixer(array('Ellipsis', 'Dash', 'EnglishQuotes', 'CurlyQuote', 'Hyphen'));
$fixed_content = $fixer->fix('<p>"Tell me Mr. Anderson... what good is a phone call... if you\'re unable to speak?" -- Agent Smith, <em>Matrix</em>.</p>');
<p>“Tell me Mr. Ander­son… what good is a phone call… if you’re unable to speak?”—Agent Smith, <em>Matrix</em>.</p>
“Tell me Mr. Anderson… what good is a phone call… if you’re unable to speak?”—Agent Smith, Matrix.
It's designed to be:
- language agnostic (you can fix fr_FR, fr_CA, en_US...; you tell JoliTypo what to fix);
- fully tested;
- easy to integrate into modern PHP projects (Composer and autoload);
- robust (makes use of \DOMDocument instead of parsing HTML with dummy regexp);
- smart enough to avoid JavaScript, code and CSS processing (configurable protected tags list);
- fully open and usable in any project (MIT License).
Just tell the Fixer class which fixers you want to run on your HTML content, then call fix():
use JoliTypo\Fixer;
$fixer = new Fixer(array("FrenchQuotes", "FrenchNoBreakSpace"));
$fixed_content = $fixer->fix('<p>Je suis "très content" de t\'avoir invité sur <a href="http://jolicode.com/">Jolicode.com</a> !</p>');
For ease of use, ready-to-use lists of fixers for each language are given in the per-locale recommendations further down. Microtypography is not a standard or a law; what really matters is consistency, so feel free to use your own lists.
Also, be advised that JoliTypo is intended to be used on HTML content (not full pages) and will remove any <head>, <html> and <body> tags.
Requirements are handled by Composer (libxml and mbstring are required).
composer require jolicode/jolitypo "~0.1.4"
Usage outside Composer is also possible: just add the src/ directory to any PSR-0 compatible autoloader.
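If you have no autoloader at hand, a minimal PSR-0 style loader can be sketched with spl_autoload_register. The helper name psr0Path and the location of src/ (next to the calling script) are assumptions made for this example, not something JoliTypo provides.

```php
<?php
// Hypothetical helper: map a class name to a file path following PSR-0 rules.
function psr0Path(string $class, string $baseDir): string
{
    $class = ltrim($class, '\\');
    $path = '';
    $lastSeparator = strrpos($class, '\\');
    if ($lastSeparator !== false) {
        // Namespace separators become directory separators.
        $namespace = substr($class, 0, $lastSeparator);
        $class = substr($class, $lastSeparator + 1);
        $path = str_replace('\\', '/', $namespace) . '/';
    }
    // PSR-0 also maps underscores in the class name to directories.
    $path .= str_replace('_', '/', $class) . '.php';

    return rtrim($baseDir, '/') . '/' . $path;
}

// Assumption: the JoliTypo src/ directory sits next to this script.
spl_autoload_register(function (string $class): void {
    $file = psr0Path($class, __DIR__ . '/src');
    if (is_file($file)) {
        require $file;
    }
});
```

With this in place, JoliTypo\Fixer would be looked up in src/JoliTypo/Fixer.php.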
The Dash fixer replaces a simple - with an en dash – between numbers (date ranges...) and a double -- with an em dash —.
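To give an idea of what such a rule does on a piece of plain text, here is a rough sketch with two regular expressions. The function name fixDashes is hypothetical and the patterns are simplified; the real fixer may handle more cases.

```php
<?php
// Simplified illustration of the dash rule, not JoliTypo's actual code.
function fixDashes(string $text): string
{
    // A double hyphen (not part of a longer run), with optional
    // surrounding whitespace, becomes an em dash.
    $text = preg_replace('/(?<!-)\s*--\s*(?!-)/u', '—', $text);

    // A single hyphen directly between two digits becomes an en dash.
    return preg_replace('/(?<=\d)-(?=\d)/u', '–', $text);
}

echo fixDashes('speak?" -- Agent Smith, 1999-2003'), "\n";
```

Hyphens inside words (well-known) are left alone because neither pattern matches them.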
The Dimension fixer replaces the letter x between numbers (12 x 123) with the times entity (×, the real math symbol).
The Ellipsis fixer replaces three dots (...) with an ellipsis character (…).
The EnglishQuotes fixer converts dumb quotes " " to smart English-style quotation marks “ ”.
The FrenchQuotes fixer converts dumb quotes " " to smart French-style quotation marks « » and uses a no-break space.
The GermanQuotes fixer converts dumb quotes " " to smart German-style quotation marks „ “ (Anführungszeichen).
Some fonts (Verdana) are typographically incompatible with German.
The FrenchNoBreakSpace fixer replaces some regular spaces with non-breaking spaces, following the French typographic code.
A no-break space is placed before : and a thin no-break space before ; ! and ?.
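As a rough illustration of this rule, the sketch below swaps the regular space that precedes these punctuation marks. The function and constant names are hypothetical, and using U+00A0 and U+202F for the two kinds of space is an assumption of this example.

```php
<?php
// Assumed code points: U+00A0 NO-BREAK SPACE, U+202F NARROW NO-BREAK SPACE.
const NO_BREAK_SPACE = "\u{00A0}";
const THIN_NO_BREAK_SPACE = "\u{202F}";

// Simplified illustration, not JoliTypo's actual code.
function fixFrenchSpaces(string $text): string
{
    // Regular space before a colon becomes a no-break space.
    $text = preg_replace('/ (?=:)/u', NO_BREAK_SPACE, $text);

    // Regular space before ; ! ? becomes a thin no-break space.
    return preg_replace('/ (?=[;!?])/u', THIN_NO_BREAK_SPACE, $text);
}

echo fixFrenchSpaces('Attention : ça marche !'), "\n";
```

Only existing spaces are replaced here; punctuation that has no space before it is left untouched.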
The Hyphen fixer makes use of org_heigl/hyphenator, a tool enabling word hyphenation in PHP.
This Hyphenator uses the pattern files from OpenOffice, which are based on the pattern files created for TeX.
Only some locales are available for this fixer: af_ZA, ca, da_DK, de_AT, de_CH, de_DE, en_GB, en_UK, et_EE, fr, hr_HR, hu_HU, it_IT, lt_LT, nb_NO, nn_NO, nl_NL, pl_PL, pt_BR, ro_RO, ru_RU, sk_SK, sl_SI, sr, zu_ZA.
You can read more about this fixer on the official GitHub repository.
This fixer requires a locale to be set on the Fixer with $fixer->setLocale('fr_FR'); it defaults to en_GB.
Proper hyphenation is mandatory in justified text, and you should avoid word breaking in titles with this line of CSS: hyphens:none;
The CurlyQuote fixer replaces straight quotes ' with curly ones ’.
There is one exception to consider: foot and inch marks (minute and second marks). Purists use the prime ′; this fixer uses the straight quote for compatibility.
Read more about Curly quotes.
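A narrow version of this rule can be sketched by converting only apostrophes that sit between two letters, which leaves foot and inch marks alone. The function name fixCurlyQuotes is hypothetical and the real fixer may cover more cases.

```php
<?php
// Simplified illustration, not JoliTypo's actual code.
function fixCurlyQuotes(string $text): string
{
    // Only a straight quote between two letters is converted,
    // so measurements such as 6'2" are left untouched.
    return preg_replace('/(?<=\p{L})\'(?=\p{L})/u', '’', $text);
}

echo fixCurlyQuotes("I'm 6'2\" tall and that's fine."), "\n";
```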
The Trademark fixer handles the trademark symbol ™, the registered trademark symbol ® and the copyright symbol ©. It replaces the commonly used approximations (r), (c) and (TM). A non-breaking space is also put between numbers and the copyright symbol.
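The replacement itself is simple enough to sketch in a few lines. The function name fixTrademarks is hypothetical, and treating only a number that follows the copyright symbol is an assumption of this example.

```php
<?php
// Simplified illustration, not JoliTypo's actual code.
function fixTrademarks(string $text): string
{
    // Case-insensitive replacement of the usual ASCII approximations.
    $text = str_ireplace(['(tm)', '(r)', '(c)'], ['™', '®', '©'], $text);

    // Bind the copyright symbol to a following number with U+00A0.
    return preg_replace('/© *(?=\d)/u', "©\u{00A0}", $text);
}

echo fixTrademarks('(c) 2014 Acme(TM)'), "\n";
```

A rule this naive would also rewrite the (c) of an enumerated list such as (a), (b), (c), which is one reason to keep your own list of fixers per project.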
It is really easy to make your own Fixers, feel free to extend the provided ones if they do not fit your typographic rules.
Recommended fixers for en_GB:
$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Dash', 'EnglishQuotes', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('en_GB'); // Needed by the Hyphen Fixer
For fr_FR, these rules apply most of the recommendations of "Abrégé du code typographique à l'usage de la presse", ISBN: 9782351130667.
$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Dash', 'FrenchQuotes', 'FrenchNoBreakSpace', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('fr_FR'); // Needed by the Hyphen Fixer
For fr_CA: mostly the same as fr_FR, but the space before punctuation marks is not mandatory.
$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Dash', 'FrenchQuotes', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('fr_CA'); // Needed by the Hyphen Fixer
For de_DE: mostly the same as en_GB, according to Typefacts and Wikipedia.
$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Dash', 'GermanQuotes', 'CurlyQuote', 'Hyphen', 'Trademark'));
$fixer->setLocale('de_DE'); // Needed by the Hyphen Fixer
More to come (contributions welcome!).
$fixer = new Fixer(array('Ellipsis', 'Dimension', 'Dash', 'EnglishQuotes', 'CurlyQuote', 'Hyphen'));
$fixed_content = $fixer->fix("<p>Some user contributed HTML which does not use proper glyphs.</p>");
$fixer->setRules(array('CurlyQuote'));
$fixed_content = $fixer->fix("<p>I'm only replacing single quotes.</p>");
$fixer->setRules(array('Hyphen'));
$fixer->setLocale('en_GB'); // I tell which locale to use for Hyphenation
$fixed_content = $fixer->fix("<p>Very long words like Antidisestablishmentarianism.</p>");
If you want to add your own fixer to the list, you have to implement JoliTypo\FixerInterface.
Then just give JoliTypo its fully qualified class name, or even an instance:
// by FQN
$fixer = new Fixer(array('Ellipsis', 'Acme\\YourOwn\\TypoFixer'));
$fixed_content = $fixer->fix("<p>Content fixed by the 2 fixers.</p>");
// or instances, or both
$fixer = new Fixer(array('Ellipsis', 'Acme\\YourOwn\\TypoFixer', new Acme\YourOwn\PonyFixer("Some parameter")));
$fixed_content = $fixer->fix("<p>Content fixed by the 3 fixers.</p>");
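A custom fixer can be very small, since it only receives a piece of text. The exact method signature of JoliTypo\FixerInterface is not shown in this README, so the sketch below declares a local stand-in interface with a single fix() method; both TextFixerInterface and InterrobangFixer are hypothetical names, and you should check the real interface before copying this.

```php
<?php
// Stand-in for JoliTypo\FixerInterface (assumed shape, check the real one).
interface TextFixerInterface
{
    public function fix($content);
}

// Hypothetical fixer: merges "?!" and "!?" into a single interrobang.
class InterrobangFixer implements TextFixerInterface
{
    public function fix($content)
    {
        return str_replace(['?!', '!?'], '‽', $content);
    }
}

$fixer = new InterrobangFixer();
echo $fixer->fix('You did what?!'), "\n";
```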
Protected tags is a list of HTML tag names that the DOM parser must skip. Nothing inside those tags will be fixed.
$fixer = new Fixer(array('Ellipsis'));
$fixer->setProtectedTags(array('pre', 'a'));
$fixed_content = $fixer->fix("<p>Fixed...</p> <pre>Not fixed...</pre> <p>Fixed... <a>Not Fixed...</a>.</p>");
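To show the idea behind protected tags, here is a sketch that walks the text nodes of an HTML fragment with DOMDocument and skips any node that has a protected ancestor. This is not JoliTypo's implementation; the function name fixTextNodes, the wrapper div and its id are all assumptions of this example.

```php
<?php
// Apply $fix to every text node of a fragment, skipping protected tags.
function fixTextNodes(string $html, callable $fix, array $protectedTags): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate sloppy user HTML
    // The XML prolog hints UTF-8 to libxml; the wrapper div gives one known root.
    $dom->loadHTML('<?xml encoding="UTF-8"><div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//text()') as $textNode) {
        for ($parent = $textNode->parentNode; $parent !== null; $parent = $parent->parentNode) {
            if (in_array($parent->nodeName, $protectedTags, true)) {
                continue 2; // inside a protected tag: leave untouched
            }
        }
        $textNode->data = $fix($textNode->data);
    }

    // Serialize only the children of the wrapper, not <html> or <body>.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}

echo fixTextNodes('<p>fixed</p> <pre>not fixed</pre>', 'strtoupper', ['pre', 'a']), "\n";
```

Working on text nodes rather than on the raw markup is what keeps attributes, scripts and code samples out of reach of the typographic rules.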
- Write tests
- A fixer is run on a piece of text, so there is no HTML to deal with
- Implement JoliTypo\FixerInterface
- Pull request
- PROFIT!!!
- Windows XP: the thin no-break space can't be used; all other spaces are ignored but do not look bad (they render as a normal space).
- Mac OS Snow Leopard: no fixed spaces, half-fixed spaces, em spaces or en spaces, but it does not look bad (normal space).
However, if you use a font (via @font-face, maybe) that contains all those glyphs, there will be no issues.
There is a known issue preventing JoliTypo from working correctly with APC versions older than 3.1.11.
We need to be able to use this tool everywhere; you can help by providing:
- a WordPress plugin (to replace or complement wptexturize)
- a Dotclear plugin
- ...
Also, there is a Todo list 😙
This piece of code is under MIT License. See the LICENSE file.
There are already quite a few tools like this one (including good ones). Sadly, some support only one language, some run regexps on the whole HTML code (which is bad), some are not tested, some are bundled inside a CMS or a library, some do not use proper autoloading, some do not have an open bug tracker... Have a look for yourself:
- http://michelf.ca/projets/php-smartypants/
- http://michelf.ca/projets/php-smartypants/typographer/
- http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/
- https://github.com/Cerdic/textwheel/blob/master/typographie/fr.php
- https://github.com/spip/SPIP/blob/master/ecrire/typographie/fr.php
- https://github.com/dg/texy/blob/master/Texy/modules/TexyTypographyModule.php
- https://github.com/scoates/lexentity
- https://github.com/nofont/Typesetter.js
- https://github.com/judbd/php-typography (fork of php-typography, you can test it here: http://www.roxane-company.com/typonerd/)
- http://mdash.ru/
Thanks to these online resources for helping a developer understand typography:
- [FR] http://typographisme.net/post/Les-espaces-typographiques-et-le-web
- http://daringfireball.net/projects/smartypants/
- [FR] http://www.uzine.net/article1802.html
- [FR] http://dascritch.net/post/2011/05/09/Les-espacements-unicodes
- http://www.punctuationmatters.com/ is a must read
- http://practicaltypography.com/
- [FR] "Abrégé du code typographique à l'usage de la presse", ISBN: 9782351130667
- https://en.wikipedia.org/wiki/Non-English_usage_of_quotation_marks