A text-based information retrieval system using the Vector Space Model, developed in partial fulfilment of the Information Retrieval course offered at my university.
We have selected the Amazon Food Reviews Corpus for this project. You can download the corpus from this [drive link](https://drive.google.com/file/d/0BzNf9u6dqAlhTmVzSFdKQVA1V0U/view?usp=sharing). After downloading the file, please extract it to the folder named Code Files.
Now go to the folder Code Files and follow the steps below.
CAUTION: Create an empty file called invindex_semantic.txt before running indexer_syntactic.py.
- Execute the python file indexer_syntactic.py
- Execute the python file norm_syntactic.py
- Execute the python file vectors_syntactic.py
- Enter your query on the CLI (preferably about food since the corpus contains food reviews)
- Enter 'n', the number of documents you want to retrieve.
The Search Engine will then retrieve the top 'n' ranked files.
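
For readers unfamiliar with the Vector Space Model, the ranking step can be sketched as follows. This is a minimal illustration and not the code in this repository: the function names (`tokenize`, `build_index`, `search`), the toy documents, and the weighting scheme (log-scaled term frequency times inverse document frequency, with cosine normalisation) are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def build_index(docs):
    """Return length-normalised TF-IDF vectors per document and the IDF table."""
    n_docs = len(docs)
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for counts in term_counts.values():
        df.update(counts.keys())
    idf = {term: math.log(n_docs / freq) for term, freq in df.items()}

    vectors = {}
    for doc_id, counts in term_counts.items():
        vec = {t: (1 + math.log(c)) * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # Divide by the vector length so the dot product becomes a cosine score.
        vectors[doc_id] = {t: w / norm for t, w in vec.items()} if norm else {}
    return vectors, idf


def search(query, vectors, idf, n):
    """Return the top n (doc_id, score) pairs by cosine similarity to the query."""
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {t: (1 + math.log(c)) * idf[t] for t, c in q_counts.items()}
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))
    if q_norm == 0:
        return []  # no query term carries any weight in the corpus

    scores = {}
    for doc_id, vec in vectors.items():
        score = sum(w * vec.get(t, 0.0) for t, w in q_vec.items()) / q_norm
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    # Toy stand-ins for food reviews; not taken from the actual corpus.
    docs = {
        "d1": "great coffee with rich flavor",
        "d2": "the dog food arrived stale",
        "d3": "coffee beans were stale and bitter",
    }
    vectors, idf = build_index(docs)
    # d3 matches both query terms and ranks first; d1 matches only "coffee".
    for doc_id, score in search("bitter coffee", vectors, idf, n=2):
        print(doc_id, round(score, 3))
```

In this sketch, `build_index` plays the combined role of the indexing and normalisation steps, while `search` corresponds to entering a query and the value of 'n'.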