Skip to content
Html Parser with lxml backend. Implements subset of BeautifulSoup API
Python
Branch: master
Clone or download

Latest commit

Fetching latest commit…
Cannot retrieve the latest commit at this time.

Files

Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
html_wrapper
LICENSE
README.md
requirements.txt
setup.cfg
setup.py

README.md

html_wrapper

html_wrapper implements a small subset of the BeautifulSoup API that I use. It's anywhere from 10x-100x faster than bs4.

Installation

pip3 install html_wrapper

Example

Faster to instantiate and parse HTML. Suits my needs.

In [1]: import bs4

In [2]: import html_wrapper

In [3]: %timeit html_wrapper.HtmlWrapper("<html><body><p>hi</p></body></html>").text
10000 loops, best of 3: 20.4 µs per loop

In [4]: %timeit bs4.BeautifulSoup("<html><body><p>hi</p></body></html>", "lxml").text
1000 loops, best of 3: 232 µs per loop

License

See LICENSE

You can’t perform that action at this time.