patniharshit/Wikipedia-Search-Engine

Wikipedia-Search-Engine

A search engine over a Wikipedia dump, with support for field queries.

Requirements

  • Python 2.6 or above
  • Python libraries:
    • Porter Stemmer
    • XML Parser
    • NLTK

Generate the index with:

  ./index.sh  "path_to_wiki_dump"

To search:

  python search.py

Sample Query

  • Plain query
  • Field query: "C:Plane B:Bus T:Air"

Term Field Abbreviations: b:Body, t:Title, e:External Link, c:Category
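
As a rough illustration of how a field query might be split into per-field terms, here is a minimal sketch. The names `FIELD_NAMES` and `parse_query` are hypothetical and are not taken from search.py, which may work differently. The sketch assumes that field prefixes are case-insensitive (the sample query uses uppercase, the abbreviation list lowercase) and it skips stemming.

```python
# Hypothetical mapping from the abbreviations listed above to field names.
FIELD_NAMES = {"b": "body", "t": "title", "e": "external_link", "c": "category"}


def parse_query(query):
    """Split a query string into (field, term) pairs.

    Tokens without a recognised "x:" prefix are treated as plain terms
    and get a field of None. Assumption: prefixes are case-insensitive.
    """
    parsed = []
    for token in query.split():
        prefix, sep, term = token.partition(":")
        field = FIELD_NAMES.get(prefix.lower()) if sep and term else None
        if field is None:
            # Plain term, or an unknown prefix: keep the whole token.
            parsed.append((None, token.lower()))
        else:
            parsed.append((field, term.lower()))
    return parsed


if __name__ == "__main__":
    print(parse_query("C:Plane B:Bus T:Air"))
```

Each pair could then be looked up only in the postings for its field, while plain terms would be looked up across all fields.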

You can download a small dump for a test run from here.