dougy147/nose

Not Optimal Search Engine

nose is a local search engine that computes tf-idf scores for every term in any number of parsable documents. It can also serve a small PHP webpage locally, where you can browse and search your computer's folders. Inspired by seroost.
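For readers unfamiliar with tf-idf, one common textbook formulation is given below. The exact weighting nose uses is not documented here, so the symbols are assumptions: f_{t,d} is the number of occurrences of term t in document d, N is the number of documents in the corpus, and n_t is the number of documents containing t.

```latex
\mathrm{tf}(t,d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{idf}(t) = \log\frac{N}{n_t}, \qquad
\mathrm{tfidf}(t,d) = \mathrm{tf}(t,d)\cdot\mathrm{idf}(t)
```

Under this formulation, a term that appears in every document gets an idf of 0 and contributes nothing to the ranking.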

It is still in development and has little documentation for now. Use at your own risk.

Quick launch

git clone https://github.com/dougy147/nose
cd nose
./nose -s

Functionalities

| Option | Function |
|--------|----------|
| -i | Index a corpus of directories and/or files |
| -p | Parse files and compute tf-idf |
| -e | Index and parse directories and/or files, each in its own context |
| -f | Find the best-matching files for a query |
| -q | Quick find (computes tf-idf only for the query terms) |
| -s | Serve a local webpage |
| -h | Display the help menu |

Indexing: Takes any number of files and/or directories as input and creates index.nose, a list of all files discovered. The indexer treats the corpus as a whole when tf-idf scores are computed.
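The indexing step can be pictured with a minimal sketch. This is not nose's actual code: build_index is a hypothetical helper, and the one-path-per-line layout of the index file is an assumption.

```bash
# Hypothetical helper (not part of nose): list every regular file found under
# the given files and/or directories, one path per line, into an index file.
build_index() {
    local index_file="$1"
    shift
    mkdir -p "$(dirname "$index_file")"
    # find accepts both files and directories as starting points;
    # sort -u removes paths that were reached twice.
    find "$@" -type f | sort -u > "$index_file"
}

# Example call (paths are illustrative):
#   build_index ./out/indexer/index.nose ~/notes ./README.md
```

Keeping the index file outside the indexed directories avoids listing the index itself.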

Parsing: Parses every file with a supported extension listed in an index (by default ./out/indexer/index.nose). After parsing, it computes the tf-idf scores and stores the results in full_dict.nose.
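As an illustration of what this step computes, here is a minimal awk-based sketch. It is not nose's implementation: tfidf_from_index is a hypothetical name, the tokenizer (lowercase ASCII letters and digits only) and the output layout are assumptions, and it uses the common tf = count / document length, idf = log(N / df) weighting.

```bash
# Hypothetical sketch (not nose's code): read file paths from an index and
# print one "score<TAB>term<TAB>file" line per (term, file) pair.
tfidf_from_index() {
    local index_file="$1"
    local f
    # Collect the indexed paths that still exist as positional parameters.
    set --
    while IFS= read -r f; do
        [ -f "$f" ] && set -- "$@" "$f"
    done < "$index_file"
    [ "$#" -gt 0 ] || return 0
    LC_ALL=C awk '
    {
        seen[FILENAME] = 1
        line = tolower($0)
        gsub(/[^a-z0-9]+/, " ", line)       # crude tokenizer (assumption)
        n = split(line, words, " ")
        for (i = 1; i <= n; i++) {
            w = words[i]
            if (!((w, FILENAME) in count)) df[w]++   # first time w is seen in this file
            count[w, FILENAME]++
            total[FILENAME]++
        }
    }
    END {
        ndocs = 0
        for (d in seen) ndocs++             # empty files are not counted
        for (key in count) {
            split(key, parts, SUBSEP)
            w = parts[1]; d = parts[2]
            tf = count[key] / total[d]
            idf = log(ndocs / df[w])
            printf "%.6f\t%s\t%s\n", tf * idf, w, d
        }
    }' "$@"
}

# Example call: tfidf_from_index ./out/indexer/index.nose
```

The output order is unspecified; pipe it through sort if a stable listing is needed.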

Exploring: Outputs one index per directory. Within each directory, the best-matching documents are listed first.

Finding: Finds the best-matching files for any user query.
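Ranking files against a query can be sketched as follows, in the spirit of the quick-find idea of scoring only the query terms. This is an assumption-laden illustration rather than nose's code: rank_files is a hypothetical name, the tokenizer keeps only lowercase ASCII letters and digits, and each file's score is the sum of the tf-idf values of the query terms it contains.

```bash
# Hypothetical sketch (not nose's code): rank files by the summed tf-idf of
# the query terms, best match first. Output lines are "score<TAB>file".
# Usage: rank_files "query words" file1 file2 ...
rank_files() {
    local query="$1"
    shift
    [ "$#" -gt 0 ] || return 0
    LC_ALL=C awk -v query="$query" '
    BEGIN {
        q = tolower(query)
        gsub(/[^a-z0-9]+/, " ", q)
        nq = split(q, qwords, " ")
        for (i = 1; i <= nq; i++) wanted[qwords[i]] = 1
    }
    {
        seen[FILENAME] = 1
        line = tolower($0)
        gsub(/[^a-z0-9]+/, " ", line)
        n = split(line, words, " ")
        total[FILENAME] += n                # document length in tokens
        for (i = 1; i <= n; i++) {
            w = words[i]
            if (!(w in wanted)) continue    # only query terms are counted
            if (!((w, FILENAME) in count)) df[w]++
            count[w, FILENAME]++
        }
    }
    END {
        ndocs = 0
        for (d in seen) ndocs++
        for (key in count) {
            split(key, parts, SUBSEP)
            w = parts[1]; d = parts[2]
            score[d] += (count[key] / total[d]) * log(ndocs / df[w])
        }
        for (d in score) printf "%.6f\t%s\n", score[d], d
    }' "$@" | LC_ALL=C sort -k1,1nr
}

# Example call: rank_files "search engine" notes/*.txt
```

Files that contain none of the query terms are omitted from the output.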

Serving: Serves your indexes locally at http://127.0.0.1:1111.

Dependencies

  • awk / bc / find / grep
  • pdftotext (to parse PDF files); install with: yay -S python-pdftotext
  • php (server mode)
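Before running nose, the tools listed above can be checked with a short sketch like the one below. missing_deps is a hypothetical helper, not something shipped with nose.

```bash
# Hypothetical helper (not part of nose): print the name of every command
# from the argument list that cannot be found on PATH.
missing_deps() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example call: missing_deps awk bc find grep pdftotext php
```

An empty output means every listed command was found.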

About

local search engine written in bash
