A dataset of popular pages (taken from <dir.yahoo.com>) with manually marked-up semantic blocks.
A dataset of random pages with manually marked-up semantic blocks.
Forked from misja/python-boilerpipe
Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages
Forked from bcoe/sandcastle
Forked from tpopela/vips_java
Implementation of Vision Based Page Segmentation algorithm in Java
Forked from openstack/python-swiftclient
2,848 contributions in the last year
Created a pull request in apache/spark that received 6 comments
This PR sets … default. It had originally been set to true as a workaround for …
Created an issue in channable/vaultenv that received 2 comments
Our last release was in September, and we have added a few new features and fixes since then, so it would be nice to create a new release.