Performs tokenization, stemming, lemmatization, index creation, index compression and ranked retrieval of Cranfield documents
Updated May 10, 2020 - Python
An information retrieval search engine
BUPT Information and Knowledge Acquisition course project
Backend application for a JavaScript snippet search engine. Data.csv comes from the 30 seconds of code database: https://github.com/30-seconds/30-seconds-of-code/tree/master/snippets
This project implements an in-memory search engine for indexing and retrieving documents from a CSV file using Python and NLTK. It preprocesses text, builds an inverted index, and ranks documents based on relevance to a query using the Okapi BM25 algorithm.
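As a rough illustration of the approach described above, the following sketch builds an in-memory inverted index and ranks documents with Okapi BM25. It is not taken from the project: the `BM25Index` class, the `tokenize` helper, the sample documents, and the parameter values `k1=1.5` and `b=0.75` are all assumptions, and a plain regex tokenizer stands in for the NLTK preprocessing so the example has no external dependencies. The IDF uses the common non-negative variant `log(1 + (N - df + 0.5) / (df + 0.5))`.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Simplified stand-in for NLTK preprocessing (assumption):
    # lowercase the text and keep alphanumeric runs as tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    """Hypothetical in-memory inverted index with BM25 ranking."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1 = k1  # term-frequency saturation (assumed default)
        self.b = b    # document-length normalization (assumed default)
        self.doc_len = []
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        for doc_id, text in enumerate(docs):
            tokens = tokenize(text)
            self.doc_len.append(len(tokens))
            for term, tf in Counter(tokens).items():
                self.postings[term][doc_id] = tf
        self.n_docs = len(docs)
        self.avg_len = sum(self.doc_len) / self.n_docs if self.n_docs else 0.0

    def idf(self, term):
        # Non-negative IDF variant; df is the number of documents containing the term.
        df = len(self.postings.get(term, {}))
        return math.log(1 + (self.n_docs - df + 0.5) / (df + 0.5))

    def search(self, query, top_k=3):
        scores = defaultdict(float)
        for term in tokenize(query):
            if term not in self.postings:
                continue
            idf = self.idf(term)
            for doc_id, tf in self.postings[term].items():
                norm = 1 - self.b + self.b * self.doc_len[doc_id] / self.avg_len
                scores[doc_id] += idf * tf * (self.k1 + 1) / (tf + self.k1 * norm)
        # Highest score first; ties are broken by the lower document id.
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_k]


# Made-up sample documents, standing in for rows read from a CSV file.
docs = [
    "reverse a string in javascript",
    "sort an array of numbers",
    "reverse an array in place",
]
index = BM25Index(docs)
results = index.search("reverse array")
print(results)
```

In this sample, the third document contains both query terms, so it ranks ahead of the two documents that match only one term each.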