An open source, feature rich search engine.


DataparkSearch v.4

Full featured web search engine

Documentation and auxiliary files

Discussion group

Feel free to ask any question about the DataparkSearch Engine in the DataparkSearch group on Google Groups.

Features

  • Support for http, https, ftp (passive mode), nntp and news URL schemes.
  • htdb virtual URL scheme for SQL database indexing.
  • Indexes text/html, text/xml, text/plain, audio/mpeg (mp3) and image/gif mime types natively.
  • External parsers support for other document types, including Microsoft Word, Excel, RTF, PowerPoint, Adobe Acrobat PDF and Flash.
  • Can index multilingual sites using content negotiation.
  • Can search all of the word forms using ispell affixes and dictionaries.
  • Synonym, acronym and abbreviation query expansion based on editable dictionaries, specified by language and charset.
  • Stop-word, synonym and acronym lists.
  • Options to query with all words, all words near each other, any words, or Boolean queries. A subset of VQL (Verity Query Language) is supported.
  • Popularity Rank based on a neural network model.
  • Results can be sorted by relevancy (using vector calculation), by popularity rank, either "Goo" (adding weight for incoming links) or "Neo" (neural network model), by last modified time, or by "importance" (a combination of relevancy and popularity rank).
  • Supports a wide range of character sets, with automated character set and language detection.
  • Offers an accent insensitive search option.
  • Provides phrase segmenting (tokenizing) for Chinese, Japanese, Korean and Thai.*
  • Includes an indexer and a web CGI front-end, as well as a search module for Apache web server (mod_dpsearch).
  • Handles Internationalized Domain Names (IDN).
  • Summary Extraction Algorithm automatically sums up each document in several sentences.
  • Uses If-Modified-Since for efficient transfer of only changed files.
  • Can tweak URLs with session IDs and other weird formats, including some JavaScript link decoding.
  • Can perform parallel and multi-threaded indexing for faster updating.
  • Flexible update scheduling, including options for checking some sections of a site more frequently.
  • Handles basic authentication (user name and password) and cookies.
  • Stores a compressed text version of the documents for extracting and viewing.
  • Can specify a default character set and language for a server or subdirectory, or a list of possible languages.
  • Noindex tags: <!--UdmComment-->, <NOINDEX> and <!--noindex-->. Google's special comments <!-- google_ad_section_start -->, <!-- google_ad_section_start(weight=ignore) --> and <!-- google_ad_section_end --> are also treated as tags to include or exclude content.
  • Can specify a content body tag.
  • Spellchecking for query words with aspell.
  • Flexible options and commands to customize search result pages.
  • Effective caching significantly reduces search times.
  • Query logging stores the query, query parameters and the number of results found.
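
To illustrate the noindex tags listed above, here is a minimal Python sketch of how regions wrapped in such tags could be dropped from a page before indexing. It is not DataparkSearch's own code (the engine is written in C) and says nothing about how the indexer is actually implemented. The opening tags come from the feature list. The closing forms (<!--/UdmComment-->, </NOINDEX>, <!--/noindex-->) and the helper name strip_noindex are assumptions made for this example.

```python
import re

# Opening tags are taken from the feature list; the closing forms are
# assumptions made for this sketch.
NOINDEX_PAIRS = [
    (r"<!--\s*UdmComment\s*-->", r"<!--\s*/UdmComment\s*-->"),
    (r"<NOINDEX>", r"</NOINDEX>"),
    (r"<!--\s*noindex\s*-->", r"<!--\s*/noindex\s*-->"),
]


def strip_noindex(html: str) -> str:
    """Remove every region wrapped in one of the noindex tag pairs.

    Hypothetical helper for illustration only; it is not part of DataparkSearch.
    """
    for open_tag, close_tag in NOINDEX_PAIRS:
        # Non-greedy match so that separate regions are removed independently;
        # DOTALL lets a region span several lines.
        pattern = re.compile(open_tag + r".*?" + close_tag,
                             re.IGNORECASE | re.DOTALL)
        html = pattern.sub("", html)
    return html


page = ("<p>Keep this.</p><!--noindex--><p>Skip this.</p>"
        "<!--/noindex--><p>And this.</p>")
print(strip_noindex(page))  # prints "<p>Keep this.</p><p>And this.</p>"
```

A regular expression is enough for flat, well-formed regions like these. Nested or unclosed regions would need a real parser.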

Disclaimer (see LICENSE for details)

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.

Additional permission under GNU GPL version 3 section 7

If you modify this program, or any covered work, by linking or combining it with the OpenSSL project's OpenSSL library (or a modified version of that library), containing parts covered by the terms of the OpenSSL or SSLeay licenses, the Free Software Foundation grants you additional permission to convey the resulting work. Corresponding Source for a non-source form of such a combination shall include the source code for the parts of OpenSSL used as well as that of the covered work.
