arminbashizade/IR2

IR2

Information Storage and Retrieval course project, phase 2. This program reads an input file containing multiple documents separated by <p>, tokenizes them, stems the tokens using the Paice/Husk algorithm, and builds a dictionary from the results. The dictionary is a Ternary Search Tree (TST) implemented on top of a data structure called Train. The TST was implemented with several different data structures; Train performed best in both memory use and running time.
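As a rough illustration of the indexing step, the sketch below splits the input on <p>, tokenizes each document, and stores every token in a ternary search tree whose terminal nodes hold posting lists. It is not the project's code: the names `Node`, `TernarySearchTree`, and `build_index` are hypothetical, the tree uses plain linked nodes rather than the Train structure, the tokenizer is an assumed lowercase alphanumeric split, document indices are assumed to start at 0, and Paice/Husk stemming is left out.

```python
import re


class Node:
    """One TST node: a character, three children, and optional postings."""
    __slots__ = ("ch", "lo", "eq", "hi", "postings")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.postings = None  # list of document indices if a token ends here


class TernarySearchTree:
    """Hypothetical dictionary mapping tokens to sorted posting lists."""

    def __init__(self):
        self.root = None

    def add(self, token, doc_id):
        if token:
            self.root = self._add(self.root, token, 0, doc_id)

    def _add(self, node, token, i, doc_id):
        ch = token[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._add(node.lo, token, i, doc_id)
        elif ch > node.ch:
            node.hi = self._add(node.hi, token, i, doc_id)
        elif i + 1 < len(token):
            node.eq = self._add(node.eq, token, i + 1, doc_id)
        else:
            if node.postings is None:
                node.postings = []
            # Documents arrive in increasing order, so checking the last
            # entry is enough to avoid duplicates and keep the list sorted.
            if not node.postings or node.postings[-1] != doc_id:
                node.postings.append(doc_id)
        return node

    def get(self, token):
        node, i = self.root, 0
        while node is not None and token:
            ch = token[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 < len(token):
                node, i = node.eq, i + 1
            else:
                return node.postings or []
        return []


def build_index(text):
    """Index documents separated by <p>; stemming is omitted in this sketch."""
    tree = TernarySearchTree()
    for doc_id, doc in enumerate(text.split("<p>")):
        for token in re.findall(r"[a-z0-9]+", doc.lower()):
            tree.add(token, doc_id)
    return tree


index = build_index("the cat sat<p>the dog sat<p>a cat ran")
print(index.get("cat"))
```

In the real program each token would pass through the Paice/Husk stemmer before being added, so the tree would hold stems rather than surface forms.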

The user can type a query to search the document set. The program merges the posting lists of the query's words to find the documents that match the input query. The output is the indices of the documents that contain all of the query's tokens.
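The merge described above can be sketched as a two-pointer intersection of sorted posting lists. This is a minimal example under stated assumptions, not the project's implementation: `intersect` and `search` are hypothetical names, a plain dict stands in for the dictionary, the query is split on whitespace, and stemming of query tokens is omitted.

```python
def intersect(a, b):
    """Return the document indices present in both sorted posting lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search(postings_by_token, query):
    """Find documents containing every token of the query (AND semantics)."""
    tokens = query.lower().split()
    if not tokens:
        return []
    # Start with the shortest lists so intermediate results stay small.
    lists = sorted((postings_by_token.get(t, []) for t in tokens), key=len)
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
        if not result:
            break  # no document can match once the intersection is empty
    return result


# Assumed example postings; a dict stands in for the TST dictionary.
postings = {"cat": [0, 2, 5], "sat": [0, 1, 5], "the": [0, 1, 2, 5]}
print(search(postings, "the cat sat"))
```

Because both lists are sorted, each intersection takes time linear in their combined length, and a token missing from the dictionary yields an empty result immediately.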
