Keyword-Search-Engine

Keyword search has become a ubiquitous method for users to access text data in the face of information explosion. Type-ahead search predicts a word or phrase that the user may type in based on the partial string the user has typed. Top-k queries are answered by accessing inverted lists on trie leaf nodes. Since inverted lists are usually large, many compression techniques have been proposed to reduce the storage space and disk I/O time. However, these techniques usually perform decompression operations on the fly, which increases the CPU time.

The Generalized INverted IndeX (GINIX) is a more efficient index structure, which merges consecutive IDs in inverted lists into intervals to save storage space. With this index structure, more efficient algorithms can be devised to perform type-ahead keyword search, by taking the advantage of intervals. Specifically, these algorithms do not require conversions from interval lists back to ID lists. As a result, keyword search using GINIX can be more efficient than those using traditional inverted indices. In this project, top-k queries are answered by building a trie on top of GINIX. Trie data structure ensures faster query processing time and GINIX helps in reducing the storage space required for indexing.

Key features :

Exact Search
Prefix & Token Matching
Ranking
Top-k Answers
Automatic Query Completion

Index Structures :

Inverted Index
Trie
Generalized INverted IndeX (GINIX)
GINIX + Trie

Ranking Scheme : TF*IDF

The steps carried out for information retrieval can be summarized as follows:

Document collection
Preprocessing of documents
Index construction
Process input keywords from user
Convert user query to system specific query
Retrieve results speci=fic to the query using the index
Rank the returned results

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
css		css
js		js
setup		setup
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
auto_search.php		auto_search.php
dblp_logo.png		dblp_logo.png
index.php		index.php

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Keyword-Search-Engine

About

Uh oh!

Releases

Packages

Uh oh!

Languages

remis1889/Keyword-Search

Folders and files

Latest commit

History

Repository files navigation

Keyword-Search-Engine

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages