html5charref

Python library for escaping/unescaping HTML5 Named Character References.

The standard library includes the HTMLParser module for unescaping HTML named entities and HTML Unicode escapes. Unfortunately, it doesn't include the named character references that were added in HTML5. This library provides a way to escape and unescape the character references defined in HTML5.

Installation

You can install this project from PyPI:

pip install html5charref

Usage

The main purpose of html5charref is to unescape HTML named entities. It also handles HTML Unicode character escapes such as &#x000a9;.

import html5charref
html = u'This has &copy; and &lt; and &#x000a9; symbols'
print(html5charref.unescape(html))
# u'This has \xa9 and < and \xa9 symbols'
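To make the behaviour above concrete, here is a minimal sketch of how unescaping can work in principle. It is not html5charref's implementation: the function name unescape_sketch, the NAMED table, and the regular expression are all hypothetical, the table holds only four of the many HTML5 names, and the code is written for Python 3.

```python
import re

# Hypothetical mini-table; the real HTML5 list is far larger.
NAMED = {"copy": "\u00a9", "lt": "<", "gt": ">", "amp": "&"}

# Matches &name;, decimal &#169; and hexadecimal &#xa9; references.
CHARREF = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def unescape_sketch(text):
    """Replace known character references in text with their characters."""
    def replace(match):
        body = match.group(1)
        if body[:2] in ("#x", "#X"):
            return chr(int(body[2:], 16))  # hexadecimal code point
        if body.startswith("#"):
            return chr(int(body[1:]))      # decimal code point
        # Unknown names are left exactly as they appeared in the input.
        return NAMED.get(body, match.group(0))
    return CHARREF.sub(replace, text)


result = unescape_sketch("This has &copy; and &lt; and &#x000a9; symbols")
```

Leaving unknown names untouched is a design choice of this sketch, so that text which merely looks like a reference is not damaged. The sketch also does not guard against code points outside the Unicode range.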

You can also use html5charref to find the HTML5 named entity for a given Unicode character.

import html5charref
# The copyright character
print(html5charref.escape_char(u'\u00a9'))
# u'&copy;'
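Looking up the name for a character is essentially a reverse lookup in the entity table. The following sketch illustrates the idea under stated assumptions: the names NAME_TO_CHAR, CHAR_TO_NAME, and escape_char_sketch are hypothetical, the table is a tiny hand-written sample, and returning the character unchanged when no name is known is a choice made here, not documented behaviour of escape_char. It is written for Python 3.

```python
# Hypothetical mini-table of entity names to characters.
NAME_TO_CHAR = {"copy": "\u00a9", "reg": "\u00ae", "lt": "<", "amp": "&"}

# Invert the table once so each lookup is a single dictionary access.
CHAR_TO_NAME = {char: name for name, char in NAME_TO_CHAR.items()}


def escape_char_sketch(char):
    """Return the named reference for char, or char itself if none is known."""
    name = CHAR_TO_NAME.get(char)
    if name is None:
        return char  # assumption: pass unknown characters through unchanged
    return "&" + name + ";"


copyright_ref = escape_char_sketch("\u00a9")
```

In the full HTML5 table several names can map to the same character, so a real reverse lookup has to pick one of them; this sample avoids the issue by having one name per character.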

Updating Named Entity References

Additional named entity references may be added to the HTML5 spec over time. You can update the list maintained by html5charref using the update_charrefs() function, which queries the latest named entity definitions from the W3C HTML5 site.

import html5charref
html5charref.update_charrefs()
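As a rough illustration of what such an update involves, the sketch below parses entity definitions in the JSON shape used by the HTML specification's published entities.json file, where each key is a reference such as "&copy;" and each value carries "codepoints" and "characters". Whether update_charrefs() reads this exact file is an assumption. The names SAMPLE and parse_entities are hypothetical, the sample data is inlined instead of downloaded, and the code is written for Python 3.

```python
import json

# Inline sample in the shape of the spec's entities.json (assumed source).
SAMPLE = r'''
{
  "&copy;": {"codepoints": [169], "characters": "\u00A9"},
  "&lt;": {"codepoints": [60], "characters": "<"},
  "&amp": {"codepoints": [38], "characters": "&"}
}
'''


def parse_entities(json_text):
    """Build a name -> characters table from entities.json-style text."""
    raw = json.loads(json_text)
    table = {}
    for reference, info in raw.items():
        if not reference.endswith(";"):
            continue  # skip legacy forms written without a semicolon
        table[reference[1:-1]] = info["characters"]  # strip "&" and ";"
    return table


charrefs = parse_entities(SAMPLE)
```

A real update would first fetch the JSON text over HTTP and then pass it to a function like this one; the download step is omitted so the example runs offline.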

Licensing

This project is licensed under the MIT license.

Documentation

View the full documentation.
