Lab files for the course Information Retrieval

EshaanAgg/IR

Information Retrieval

This repository contains all the relevant scripts and code for the course IR: Information Retrieval.

Assignments

Lab 1

  • Verify Zipf's Law for the provided dataset
  • Implement Porter's Stemmer and a stemmer for Bengali
  • Explore the rules for the above
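The Zipf's Law check in Lab 1 can be sketched as follows. This is a minimal illustration, not the code from this repository: the function names `rank_frequency` and `zipf_slope` are hypothetical, and tokenization is a plain whitespace split with no punctuation handling. If the law holds, the least-squares slope of log(frequency) against log(rank) should be close to -1.

```python
import math
from collections import Counter


def rank_frequency(text):
    """Return term frequencies sorted from most to least frequent.

    Assumption: whitespace tokenization is good enough for a first look.
    """
    tokens = text.lower().split()
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two ranks, otherwise the variance of log(rank) is zero.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance
```

On a real corpus the slope will only approximate -1, and the fit is usually worst at the very highest and very lowest ranks.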

Lab 2

  • Implement pre-processing (tokenization, stop word removal and stemming)
  • Create a document index from the same
  • Implement boolean retrieval on that index for the following languages:
    • English
    • Bengali
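The Lab 2 pipeline (pre-process, index, retrieve) can be sketched as below. This is a simplified illustration under stated assumptions, not the repository's implementation: `STOP_WORDS` is a tiny made-up list, stemming is left out entirely, and all function names are hypothetical. Query terms are assumed to be lowercased already.

```python
from collections import defaultdict

# Assumption: a tiny illustrative stop word list, not the one used in the labs.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to"}


def preprocess(text):
    """Lowercase, split on whitespace, trim punctuation, drop stop words.

    Stemming is deliberately omitted from this sketch.
    """
    tokens = (raw.strip(".,;:!?\"'()") for raw in text.lower().split())
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in preprocess(text):
            index[term].add(doc_id)
    return index


def boolean_and(index, a, b):
    """Documents containing both terms."""
    return index.get(a, set()) & index.get(b, set())


def boolean_or(index, a, b):
    """Documents containing at least one of the terms."""
    return index.get(a, set()) | index.get(b, set())


def boolean_not(index, term, all_ids):
    """Documents that do not contain the term."""
    return set(all_ids) - index.get(term, set())
```

For example, with `docs = {1: "The cat sat in the hat.", 2: "A dog and a cat.", 3: "The dog barked."}`, calling `boolean_and(build_index(docs), "cat", "dog")` returns `{2}`. Sets keep the sketch short; sorted posting lists with a merge step are the more common choice when the index is large.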

Development Setup

  1. Download and unzip the datasets into a data directory named after the language that you want to analyse in the code.
  2. Install the required dependencies with pip install -r requirements.txt
  3. Run the code file for the particular assignment from the root of the project. For example, python assignment2/english.py.
