Parsing and indexing Wikipedia articles
This is a coursework project I completed during my studies. The program consists of four main modules:
- Indexer: this module handles indexing documents
- Parser: this module parses the XML data file, creates documents (Page objects), and sends them to the Indexer for indexing
- Searcher: this module handles searching
- WikipediaRetriever: this is a wrapper module for all of the program's functionality, and it is the module you should run to use the program. After starting it you have two options: index the data, or query an index that has already been created.
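
To illustrate how the modules fit together, here is a minimal sketch of the Parser → Indexer → Searcher flow. It is not the project's actual code: the language (Python), the simplified XML layout (`<page>` with direct `<title>` and `<text>` children, no namespaces), the in-memory inverted index, and all method names (`add`, `parse`, `search`, `tokenize`) are assumptions made for illustration. Only the module names and the Page object come from the description above.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for the project's Page object."""
    title: str
    text: str


def tokenize(s):
    # Lowercase and split into alphanumeric terms (an assumed, very simple analyzer).
    return re.findall(r"[a-z0-9]+", s.lower())


class Indexer:
    """Assumed design: an in-memory inverted index mapping term -> set of page ids."""

    def __init__(self):
        self.pages = []
        self.postings = defaultdict(set)

    def add(self, page):
        doc_id = len(self.pages)
        self.pages.append(page)
        for term in tokenize(page.title + " " + page.text):
            self.postings[term].add(doc_id)


def parse(source, indexer):
    """Parser step: stream the XML, build a Page per <page> element, send it to the Indexer."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            title = elem.findtext("title", default="")
            text = elem.findtext("text", default="")
            indexer.add(Page(title, text))
            elem.clear()  # drop the element's children and text once the page is indexed


class Searcher:
    """Searcher step: return titles of pages that contain every query term."""

    def __init__(self, indexer):
        self.indexer = indexer

    def search(self, query):
        terms = tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self.indexer.postings.get(t, set()) for t in terms))
        return [self.indexer.pages[i].title for i in sorted(ids)]


# Simplified sample data; real Wikipedia dumps use namespaces and nest the text deeper.
SAMPLE = b"""<mediawiki>
  <page><title>Alan Turing</title><text>English mathematician and computer scientist</text></page>
  <page><title>Ada Lovelace</title><text>English mathematician and writer</text></page>
</mediawiki>"""

indexer = Indexer()
parse(io.BytesIO(SAMPLE), indexer)
searcher = Searcher(indexer)
print(searcher.search("english mathematician"))  # ['Alan Turing', 'Ada Lovelace']
print(searcher.search("computer"))               # ['Alan Turing']
```

In this sketch the last few lines play the role of WikipediaRetriever: they first build the index and then run queries against it. Streaming the file with `iterparse` avoids building the whole document tree before indexing starts, which matters for large data files.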