Typographic replacements in HTML



Tipi is for typographic replacements in HTML.

Status: ACTIVE

Under active development and maintenance.

Ideas behind this project

  • Input is HTML code, output is the same HTML code with changes in typography (entities, spaces, quotes, etc.).
  • You can't parse HTML with regex.
  • The best existing HTML parser and tokenizer for Python is lxml.
  • There are more languages than English in the world. Each of them has different typographic rules.
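The first two ideas can be illustrated with a minimal sketch. It is not tipi's implementation: tipi builds on lxml, whereas this sketch uses the standard-library html.parser so that it runs without dependencies (Python 3). The names TextOnlyReplacer and replace_in_text are hypothetical, and comments and doctypes are not handled. The point is that tags and attributes pass through untouched while a replacement is applied to text nodes only.

```python
from html.parser import HTMLParser


class TextOnlyReplacer(HTMLParser):
    """Re-emit HTML unchanged, replacing '...' with an ellipsis in text only."""

    def __init__(self):
        # Keep entity references as separate events so they are re-emitted verbatim.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Raw text of the start tag, attributes included, exactly as written.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied verbatim as well.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append('</%s>' % tag)

    def handle_data(self, data):
        # The only place where a typographic replacement happens.
        self.out.append(data.replace('...', '\u2026'))

    def handle_entityref(self, name):
        self.out.append('&%s;' % name)

    def handle_charref(self, name):
        self.out.append('&#%s;' % name)


def replace_in_text(html):
    parser = TextOnlyReplacer()
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)


# The attribute value keeps its three dots; only the text node changes.
print(replace_in_text('<p title="a...b">wait...</p>'))
```

A regex applied to the whole string would also rewrite the title attribute, which is why the replacement is driven by parser events here.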



Installation

$ pip install tipi


Usage of tipi is very straightforward:

>>> from tipi import tipi
>>> html = '<p>"Zavolej mi na číslo <strong class="tel">765-876-888</strong>," řekla, a zmizela...</p>'
>>> html = tipi(html, lang='cs')
>>> html
'<p>\u201eZavolej mi na \u010d\xed\xadslo <strong class="tel">765\u2013876\u2013888</strong>,\u201c \u0159ekla, a\xa0zmizela\u2026</p>'
>>> print html
<p>„Zavolej mi na čí­slo <strong class="tel">765–876–888</strong>,“ řekla, a zmizela…</p>

Remember that tipi is designed to work with HTML. In case you need to perform replacements on plaintext, escape it first:

>>> from tipi import tipi
>>> tipi('b -> c')  # this works only by coincidence!
u'b → c'
>>> tipi('a <- b -> c')
u'a  c'
>>> import cgi
>>> html = cgi.escape(u'a <- b -> c')
>>> html
u'a &lt;- b -&gt; c'
>>> tipi(html)
u'a ← b → c'
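
The session above is Python 2. On Python 3, cgi.escape is deprecated and was removed in Python 3.8; html.escape from the standard library is its replacement. A minimal sketch of the same escaping step follows, assuming Python 3; the helper name escape_plaintext is hypothetical and not part of tipi.

```python
import html


def escape_plaintext(text):
    """Escape &, < and > so that plain text can be treated as HTML.

    quote=False matches the default behaviour of the old cgi.escape,
    which left quote characters alone.
    """
    return html.escape(text, quote=False)


escaped = escape_plaintext('a <- b -> c')
print(escaped)  # a &lt;- b -&gt; c
```

The escaped string can then be passed to tipi in the same way as the cgi.escape result shown above.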


Features

  • Support for multiple languages.
  • Language-sensitive replacements for single quotes and double quotes.
  • Ellipses, dashes, non-breaking spaces, ...
  • Arrows (--> turned into →), dimensions (12 × 30).
  • Symbols (trademark, registered, copyright, EUR, ...).
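
To illustrate what language-sensitive quote replacement means, here is a minimal sketch in Python 3. It is not tipi's code: the QUOTES table and the curl_quotes function are hypothetical, only two languages are listed, and the function expects the contents of a single text node rather than raw HTML. The Czech pair matches the output of the usage example above.

```python
import re

# Opening and closing double quotes per language (illustrative subset).
QUOTES = {
    'en': ('\u201c', '\u201d'),  # “English”
    'cs': ('\u201e', '\u201c'),  # „Czech“
}


def curl_quotes(text, lang='en'):
    """Replace paired straight double quotes with the language's quotes."""
    opening, closing = QUOTES[lang]
    return re.sub(
        r'"([^"]*)"',
        lambda match: opening + match.group(1) + closing,
        text,
    )


print(curl_quotes('"ahoj"', lang='cs'))
print(curl_quotes('"hello"', lang='en'))
```

An unpaired quote is left as it is in this sketch; a real implementation would need rules for nesting, apostrophes, and quotes that span several elements.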



License: MIT

© 2013-2014 Jan Javorek

This work is licensed under the MIT license.
