Inverted Indexing on a corpus along with boolean search retrieval.

Wasiq-Malik/Inverted-Indexing

Inverted Indexer

An Inverted Indexer written in Python

Table of Contents

  1. About The Project
  2. Getting Started
  3. Usage
  4. License

About The Project

This project builds an inverted index for a given corpus. An inverted index maps each unit of content (words, numbers, etc.) to the documents, and the positions within them, where it occurs. A query can then look its terms up directly in the index instead of scanning every document in the corpus.
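
As a rough illustration of the idea (not the code in indexer.py), the sketch below builds a positional inverted index over a tiny in-memory corpus and answers boolean AND / OR queries against it. The function names `build_index`, `search_and`, and `search_or`, the sample documents, and the lowercase whitespace tokenization are all assumptions made for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenization (lowercase + whitespace split). This is an
        # assumption for the sketch, not necessarily what indexer.py does.
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def search_and(index, *terms):
    """Boolean AND: ids of documents that contain every term."""
    postings = [set(index.get(term.lower(), {})) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, *terms):
    """Boolean OR: ids of documents that contain at least one term."""
    return set().union(*(index.get(term.lower(), {}) for term in terms))


# Hypothetical three-document corpus, keyed by document id.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick dog jumps",
}
index = build_index(docs)

print(index["quick"])                     # postings for one term
print(search_and(index, "quick", "dog"))  # documents containing both terms
print(search_or(index, "fox", "lazy"))    # documents containing either term
```

Each lookup touches only the postings of the query terms, which is why the index is faster than rescanning the corpus for every query.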

Built With

Getting Started

To get a local copy up and running, follow these steps.

Prerequisites

Installation

  1. Clone the repo
    git clone https://github.com/Wasiq-Malik/Inverted-Indexing.git
  2. Install Requirements
    pip3 install nltk

Usage

How to Run

  1. Open a terminal and navigate to the cloned repo's directory
    cd "PATH-TO-DIRECTORY"
  2. Place the blocks of your corpus in numbered sub-directories.
    e.g. "PATH-TO-DIRECTORY/1"
  3. Run indexer.py (use python instead if it is aliased to python3 on your system)
    python3 indexer.py
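
To make the directory layout from step 2 concrete, here is a minimal sketch of how numbered block sub-directories could be discovered and walked. The helpers `list_blocks` and `iter_documents` are hypothetical and only illustrate the layout; how indexer.py actually reads the blocks is not described here.

```python
from pathlib import Path


def list_blocks(root):
    """Return the numbered sub-directories of root, sorted numerically."""
    blocks = [
        path for path in Path(root).iterdir()
        if path.is_dir() and path.name.isdecimal()
    ]
    # Sort by integer value so that "10" comes after "2".
    return sorted(blocks, key=lambda path: int(path.name))


def iter_documents(root):
    """Yield (block_number, file_path) for every file in every block."""
    for block in list_blocks(root):
        for path in sorted(block.iterdir()):
            if path.is_file():
                yield int(block.name), path
```

For example, `list(iter_documents("PATH-TO-DIRECTORY"))` would list the files under "PATH-TO-DIRECTORY/1", "PATH-TO-DIRECTORY/2", and so on, skipping any sub-directory whose name is not a number.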

License

Distributed under the MIT License.
