A PDF collection reader with a built-in full-text search engine
Written in Python / Electron / Elm / JavaScript
- Simple UI
- Local database (you have 100% control of your data)
- Easy installation (no need to install external databases)
- Multiplatform (Linux, Mac, Windows)
git clone https://github.com/mknz/mirusan.git
cd ./mirusan
cd ./search
pip install -r requirements.txt
cd ../electron
npm install
npm run compile
npm start
Mirusan automatically detects the input language using Google's language-detection library. The tokenizer or analyzer used for indexing is chosen according to the detected language.
For the following languages, Whoosh's built-in LanguageAnalyzer, or StandardAnalyzer for English, is used (though it currently does not work properly for Arabic):
Arabic
Danish
Dutch
English
Finnish
French
German
Hungarian
Italian
Norwegian
Portuguese
Romanian
Russian
Spanish
Swedish
Turkish
For other languages, an N-gram tokenizer (minsize=1, maxsize=2) is used.
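
The selection rule described above can be sketched roughly as follows. This is an illustration only, not Mirusan's actual code: the names `STEMMED_LANGUAGES`, `choose_analyzer`, and `char_ngrams` are hypothetical, the detector is assumed to return ISO 639-1 codes, and the N-gram function is a plain-Python stand-in that shows what minsize=1, maxsize=2 means rather than a reproduction of Whoosh's tokenizer.

```python
# Hypothetical sketch of the analyzer selection rule; not Mirusan's real code.

# ISO 639-1 codes for the non-English languages listed above
# (assumption: the language detector reports codes in this form).
STEMMED_LANGUAGES = {
    "ar", "da", "nl", "fi", "fr", "de", "hu", "it",
    "no", "pt", "ro", "ru", "es", "sv", "tr",
}


def choose_analyzer(lang_code):
    """Return the name of the analyzer family used for a detected language."""
    if lang_code == "en":
        return "StandardAnalyzer"
    if lang_code in STEMMED_LANGUAGES:
        return "LanguageAnalyzer"
    # Everything else (for example Japanese or Chinese) falls back to N-grams.
    return "NgramTokenizer"


def char_ngrams(text, minsize=1, maxsize=2):
    """Return every character N-gram of length minsize..maxsize, in order."""
    grams = []
    for start in range(len(text)):
        for size in range(minsize, maxsize + 1):
            end = start + size
            if end <= len(text):
                grams.append(text[start:end])
    return grams


if __name__ == "__main__":
    print(choose_analyzer("de"))
    print(choose_analyzer("ja"))
    print(char_ngrams("abc"))
```

With minsize=1 and maxsize=2, every single character and every pair of adjacent characters becomes a searchable term, which is why this fallback suits languages that do not separate words with spaces.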