Content Extraction via Text Density (CETD) program provides algorithms to detect and remove the additional content

bluele/cetd

The Content Extraction via Text Density (CETD) program provides algorithms
to detect and remove additional content (e.g. ads, navigation menus, copyright notices, etc.)
around the main content of a webpage.
See http://disnet.cs.bit.edu.cn/
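
The text-density idea can be illustrated with a short sketch. This is not the code in this repository and does not use the cetd binding's API. It assumes density means the number of non-whitespace text characters divided by the number of opening tags, and the name textDensity is hypothetical. The scanner is deliberately naive and is not a real HTML parser.

```go
package main

import (
	"fmt"
	"unicode"
)

// textDensity returns the number of non-whitespace text characters divided
// by the number of opening tags in an HTML fragment (an assumed definition).
// A fragment with no tags is treated as having one, so the ratio is defined.
// Limitations: comments, scripts, and '<' or '>' inside attribute values
// are not handled.
func textDensity(fragment string) float64 {
	chars, tags := 0, 0
	inTag := false
	runes := []rune(fragment)
	for i, r := range runes {
		switch {
		case inTag:
			// Skip everything inside a tag until it closes.
			if r == '>' {
				inTag = false
			}
		case r == '<':
			inTag = true
			// Count only opening tags such as <p>, not </p> or <!-- -->.
			if i+1 < len(runes) && unicode.IsLetter(runes[i+1]) {
				tags++
			}
		case !unicode.IsSpace(r):
			chars++
		}
	}
	if tags == 0 {
		tags = 1
	}
	return float64(chars) / float64(tags)
}

func main() {
	article := "<p>Text density separates long prose from link-heavy page chrome.</p>"
	menu := `<ul><li><a href="/">Home</a></li><li><a href="/about">About</a></li></ul>`
	fmt.Printf("article: %.2f\n", textDensity(article))
	fmt.Printf("menu:    %.2f\n", textDensity(menu))
}
```

Under this assumed definition, a paragraph of prose scores much higher than a navigation menu, because the menu spends most of its markup on tags rather than text. That contrast is what lets a density-based extractor separate main content from the surrounding page elements.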

License
 For RapidXml, refer to http://rapidxml.sourceforge.net/
 All other code is licensed under the GPL version 3; read it at http://www.gnu.org/licenses/gpl.txt

Install the SWIG binding for Go
 $ go get github.com/bluele/cetd
