extract_content

Extract content from html.

Usage

from extract_content import ContentExtractor

# html holds the page source to analyse (see the input requirement below)
extractor = ContentExtractor()
body, title = extractor.analyse(html)

The argument to ContentExtractor#analyse must be a Unicode string or a UTF-8 encoded byte string.
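
Pages fetched from the web are not always UTF-8, so it may help to normalise the input before analysing it. The following is a minimal sketch, written for Python 3 and using only the standard library. The helper name to_unicode and its declared_encoding parameter are hypothetical and are not part of extract_content.

```python
def to_unicode(raw, declared_encoding=None):
    """Return raw as a Unicode string.

    Hypothetical helper for illustration; not part of extract_content.
    """
    if isinstance(raw, str):
        return raw  # already Unicode, nothing to do

    # Try the encoding the server or page declared first, then UTF-8.
    for encoding in (declared_encoding, "utf-8"):
        if encoding is None:
            continue
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding: fall through to the next candidate.
            continue

    # Last resort: decode as UTF-8 and replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")


# Example: a Latin-1 encoded page converted to a Unicode string.
page_bytes = "<html><title>Café</title><body>Olá</body></html>".encode("latin-1")
html = to_unicode(page_bytes, declared_encoding="latin-1")
```

The resulting html value should then be suitable to pass to extractor.analyse(html), assuming the library accepts Python 3 str objects in the same way it accepts the Unicode type.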

License

  • Copyright: 2012 by najeira.
  • License: BSD.

Based on
