html to etree

Parse HTML to lxml etree

Convenience methods for parsing HTML documents into an lxml etree.

lxml has limited capabilities for handling different encodings. This library is intended as a reusable utility for parsing HTML responses received as byte strings into ElementTrees, using sane character decoding.

Free software: BSD license
Python versions: 2.7, 3.4+

Features

Parse html to lxml etree
Handle character decoding

Quickstart

Parse HTML given as byte strings:

tree = parse_html_bytes(body=body_bytes, content_type=res.headers.get('content-type'))

Parse HTML given as an already decoded unicode string:

tree = parse_html_unicode(uni_string=body_unicode)
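
To illustrate what "sane character decoding" can involve before a byte string is handed to a parser, here is a minimal standard-library sketch for Python 3. It is not this package's implementation: the helper names `charset_from_content_type` and `decode_html`, the 4096-byte sniffing window, and the fallback order (Content-Type header, then a `<meta>` charset, then UTF-8, then Latin-1) are all assumptions made for the example.

```python
import codecs
import re
from email.message import Message

# Hypothetical helpers for illustration; not part of html_to_etree's API.
_META_CHARSET_RE = re.compile(
    br'<meta[^>]+charset=["\']?\s*([a-zA-Z0-9_\-:.]+)', re.IGNORECASE
)


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased by the email package


def _is_known(encoding):
    """True if Python has a codec registered under this name."""
    try:
        codecs.lookup(encoding)
        return True
    except LookupError:
        return False


def decode_html(body, content_type=None):
    """Decode HTML bytes, returning (text, encoding_used).

    Assumed candidate order: HTTP header, <meta> declaration, UTF-8.
    Latin-1 is the last resort because it maps every byte and cannot fail.
    """
    candidates = []
    header_charset = charset_from_content_type(content_type)
    if header_charset:
        candidates.append(header_charset)
    match = _META_CHARSET_RE.search(body[:4096])
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.append("utf-8")

    for encoding in candidates:
        if not _is_known(encoding):
            continue  # skip bogus or misspelled charset labels
        try:
            return body.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # declared encoding does not match the bytes
    return body.decode("latin-1"), "latin-1"


text, used = decode_html(b"caf\xe9", "text/html; charset=ISO-8859-1")
```

The resulting `text` is an ordinary unicode string, so it could be passed on as `parse_html_unicode(uni_string=text)`. Trying strict decoding for each candidate, rather than trusting the first declared charset, is one way to cope with servers that send a wrong header.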

Credits

This package was created with Cookiecutter and the fluquid/cookiecutter-pypackage project template.
