
PMS Full-Text Search Engine for .NET Core

License: MIT

A full-text search engine written in C# for .NET Core, with no external dependencies.

The aim of this project is to showcase algorithms, data structures and techniques that are used to create full-text search engines.

Getting Started

On Windows:

  1. Download the source code and build it with the following commands:

    dotnet restore
    dotnet build
  2. Open the folder with the binaries: bin\Debug\netcoreapp2.0

  3. Run the following command, replacing DATA_PATH with the path to the Datasets folder:

    run_test.bat DATA_PATH
  4. If everything goes well, the following messages are printed:

    Log from index construction:

    dotnet Protsyk.PMS.FullText.ConsoleUtil.dll index --input "F:\Sources\FullTextSearch\Datasets"
    PMS Full-Text Search (c) Petro Protsyk 2017-2018
    Indexed documents: 3, time: 00:00:00.1010004

    Dump of the index (for each term in the dictionary, the list of all its occurrences):

    dotnet Protsyk.PMS.FullText.ConsoleUtil.dll print
    PMS Full-Text Search (c) Petro Protsyk 2017-2018
    2017 -> [1,1,9]
    algorithms -> [1,1,19]
    and -> [1,1,20]
    apple -> [3,1,1]
    banana -> [3,1,2]
    build -> [1,1,25]
    c -> [1,1,16]
    data -> [1,1,21]
    demonstrate -> [1,1,18]

    Search with query WORD(pms):

    dotnet Protsyk.PMS.FullText.ConsoleUtil.dll search --query "WORD(pms)"
    {filename:"TestFile001.txt", size:"180", created:"2018-04-02T10:09:41.4208444+02:00"}
    {filename:"TestFile002.txt", size:"29", created:"2018-04-02T10:09:41.4248447+02:00"}
    Documents found: 2, matches: 2, time: 00:00:00.0564721

    Lookup in the dictionary using a pattern, i.e. find all terms that match the pattern:

    dotnet Protsyk.PMS.FullText.ConsoleUtil.dll lookup --pattern "WILD(pet*)"
    Terms found: 1, time: 00:00:00.0704173
    dotnet Protsyk.PMS.FullText.ConsoleUtil.dll lookup --pattern "EDIT(projct, 1)"
    Terms found: 1, time: 00:00:00.0847931
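
The index dump in step 4 maps every term to a list of occurrences. The idea can be illustrated with a minimal positional inverted index. This is a Python sketch, not the project's C# implementation. It assumes each triple means (document id, field id, token position), all 1-based, which the dump suggests but this README does not state. The `tokenize` and `build_index` helpers are hypothetical names.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each term to a list of (doc_id, field_id, position) occurrences.

    Document ids and positions are 1-based. Every document is treated as a
    single field with id 1, which is an assumption made for this sketch.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(documents, start=1):
        for position, term in enumerate(tokenize(text), start=1):
            index[term].append((doc_id, 1, position))
    return dict(index)

docs = ["PMS full-text search", "search engine", "apple banana"]
index = build_index(docs)

# Print the dictionary in sorted order, one term per line.
for term in sorted(index):
    print(term, "->", index[term])
```

With these three toy documents, `apple` and `banana` get the occurrences `(3, 1, 1)` and `(3, 1, 2)`, the same shape as the entries in the dump above.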

Query Language

  • WORD(apple) - single word
  • WILD(app*) - wildcard pattern
  • EDIT(apple, 1) - fuzzy search by Levenshtein (edit) distance; here, terms at most 1 edit away from apple
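
The three term-level queries above can be read as three ways of selecting terms from the dictionary. The Python sketch below illustrates that reading with a linear scan over a small term list. The function names are hypothetical, the second argument of EDIT is assumed to be the maximum edit distance, and Python's `fnmatchcase` accepts more pattern syntax (`?`, `[...]`) than the `*` shown in this README. A real engine would normally avoid scanning every term.

```python
from fnmatch import fnmatchcase

def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]

def lookup_word(terms, word):
    """WORD(word): exact match."""
    return [t for t in terms if t == word]

def lookup_wild(terms, pattern):
    """WILD(pattern): shell-style wildcard match."""
    return [t for t in terms if fnmatchcase(t, pattern)]

def lookup_edit(terms, word, max_distance):
    """EDIT(word, max_distance): terms within the given edit distance."""
    return [t for t in terms if levenshtein(t, word) <= max_distance]

dictionary = ["apple", "apply", "banana", "petro", "project"]
print(lookup_wild(dictionary, "app*"))       # ['apple', 'apply']
print(lookup_edit(dictionary, "projct", 1))  # ['project']
```

The misspelled `projct` is one insertion away from `project`, which is why an EDIT query with distance 1 can still find the term.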

Combining operators

  • OR - boolean or
  • AND - boolean and
  • SEQ - sequence of adjacent words (phrase)

Examples of queries:

  • AND(WORD(apple), OR(WILD(a*), EDIT(apple, 1)))
  • SEQ(WORD(hello), WORD(world))
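
One possible way to evaluate such nested queries over positional occurrences is sketched below in Python. The semantics are assumptions, not taken from the project: OR and AND are applied at the document level, and SEQ requires consecutive token positions within the same document and field. The names `word`, `op_or`, `op_and` and `op_seq` are hypothetical, and the tiny index is made up for the example.

```python
def word(index, term):
    """Occurrences of a term as a set of (doc_id, field_id, position)."""
    return set(index.get(term, []))

def docs_of(occurrences):
    """Document ids that appear in a set of occurrences."""
    return {doc_id for doc_id, _, _ in occurrences}

def op_or(*operands):
    """OR: every occurrence matched by any operand."""
    return set().union(*operands)

def op_and(*operands):
    """AND: occurrences from documents matched by all operands (needs >= 1 operand)."""
    common = set.intersection(*(docs_of(o) for o in operands))
    return {occ for o in operands for occ in o if occ[0] in common}

def op_seq(first, *rest):
    """SEQ: start occurrences of phrases whose words sit at consecutive positions."""
    starts = first
    for offset, operand in enumerate(rest, start=1):
        starts = {(d, f, p) for (d, f, p) in starts if (d, f, p + offset) in operand}
    return starts

# Toy index: document 1 contains "hello world", document 2 "world hello".
index = {
    "hello": [(1, 1, 1), (2, 1, 2)],
    "world": [(1, 1, 2), (2, 1, 1)],
    "apple": [(3, 1, 1)],
}

phrase = op_seq(word(index, "hello"), word(index, "world"))
both = op_and(word(index, "hello"), word(index, "world"))
either = op_or(word(index, "apple"), word(index, "hello"))
print(phrase)                 # {(1, 1, 1)}
print(sorted(docs_of(both)))  # [1, 2]
```

Only document 1 matches the phrase, because document 2 contains both words in the opposite order. AND matches both documents since it ignores positions.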

Data Structures






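Publishing

Self-contained release builds can be produced with dotnet publish. For macOS (runtime identifier osx.10.13-x64):

    dotnet publish -c Release --self-contained -r osx.10.13-x64

For Windows (runtime identifier win10-x64):

    dotnet publish -c Release --self-contained -r win10-x64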