Skip to content

Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc). In addition, it allows you to perform any manipulation on the extracted text before using it in your own service or application.

khazeshgar/importer

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 

Repository files navigation

Importer

Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a computer file as plain text, whatever its format (HTML, PDF, Word, etc). In addition, it allows you to perform any manipulation on the extracted text before importing/using it in your own service or application.

Website: http://www.norconex.com/collectors/importer/

About

Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc). In addition, it allows you to perform any manipulation on the extracted text before using it in your own service or application.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Java 98.5%
  • HTML 1.3%
  • Other 0.2%