ncrawler

Copy of NCrawler from http://ncrawler.codeplex.com/

Simple and very efficient multithreaded web crawler with pipeline-based processing, written in C#. Contains HTML, text, PDF, and IFilter document processors, plus language detection (Google). It is easy to add pipeline steps that extract, use, or alter information.
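To give a feel for what "pipeline-based processing" means, here is a minimal, self-contained sketch of the idea: each step receives the same page object, reads what earlier steps produced, and adds its own results. All type names here (`PageContext`, `IStep`, `StripTagsStep`, `WordCountStep`, `Pipeline`) are hypothetical and chosen for illustration only. They are not NCrawler's actual classes or interfaces, so check the library source for the real API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical types for illustration only; NOT NCrawler's real API.
public sealed class PageContext
{
    public Uri Url { get; }
    public string RawContent { get; }
    public string Text { get; set; } = "";
    // Free-form bag that steps use to pass results down the pipeline.
    public Dictionary<string, string> Properties { get; } =
        new Dictionary<string, string>();

    public PageContext(Uri url, string rawContent)
    {
        Url = url;
        RawContent = rawContent;
    }
}

// One unit of work in the pipeline.
public interface IStep
{
    void Process(PageContext page);
}

// Crude tag stripper standing in for a real HTML document processor.
public sealed class StripTagsStep : IStep
{
    public void Process(PageContext page)
    {
        page.Text = Regex.Replace(page.RawContent, "<[^>]+>", " ").Trim();
    }
}

// Custom step that uses the text extracted by the previous step.
public sealed class WordCountStep : IStep
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    public void Process(PageContext page)
    {
        int count = page.Text
            .Split(Whitespace, StringSplitOptions.RemoveEmptyEntries)
            .Length;
        page.Properties["WordCount"] = count.ToString();
    }
}

public static class Pipeline
{
    // Runs the steps in order against the same page object.
    public static PageContext Run(PageContext page, params IStep[] steps)
    {
        foreach (IStep step in steps)
        {
            step.Process(page);
        }
        return page;
    }
}

public static class Program
{
    public static void Main()
    {
        var page = new PageContext(
            new Uri("http://example.com/"),
            "<p>Hello <b>crawler</b> world</p>");

        Pipeline.Run(page, new StripTagsStep(), new WordCountStep());

        Console.WriteLine(page.Properties["WordCount"]); // prints 3
    }
}
```

Adding a new step in this sketch means writing one more `IStep` implementation and appending it to the `Pipeline.Run` call, which is the kind of extensibility the description above refers to.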

Build NuGet packages

Create debug packages

.\Build.ps1 -VersionSuffix build002

Create release packages

.\Build.ps1