Each search engine component is listed below with some of its major include files.
- HTML Parser
- (HtmlParser.h): Parse the HTML for words, title words, and URLs
- Crawler
- (CrawlerApp.h): Start/stop all crawlers
- (CrawlerManager.h): Manage the distribution of URLs between machines
- (Crawler.h): Pop from frontier and crawl webpages
- (GetUrl.h): Download the HTML from the URL
- (Fontier.h): Manage the queue of URLs to be crawled
- Index
- (IndexConstructor.h): Define Index Construction internal interface
- (FileManager.h): Defines interface for interaction with disk file chunks
- (Dictionary.h): Defines interface for index read access
- (DocumentsSerializer.h): Defines interface to serialize document data
- (DictionarySerializer.h): Defines interface to serialize the in-memory term postings list hashtable
- (EndDocSerializer.h): Defines interface to serialize the end doc postings list
- Constraint Solver
- (abstractISR.h): Define AbstractISRs interface
- (constraint_solver.h): Find matching documents given query tree
- Query Compiler
- (query_Compiler.h): Preprocess query string, output final query tree
- (Transformation.h): Transform preprocessed string into query tree
- Ranker
- (ranker.h): Return the total score together with the URL
- (ISRSpan.h): Operations on the spans of ISRs
- Front End / Web Server
- (SearchPlugin.h): Parse the user query, hand off to QueryServer, return HTML with results
- (QueryServer.h): Send the query to each RankServer, wait for results, and merge results
- (RankServer.h): Listen for queries, search index, and return ranked results
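
The merge step described for QueryServer.h can be illustrated with a minimal sketch. It assumes each RankServer returns a list of (URL, score) pairs and that the front end only needs the k best results overall. `RankedResult` and `mergeTopK` are hypothetical names chosen for illustration and are not taken from the actual headers, which may merge results differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result record; the real ranker.h may use a different type.
struct RankedResult {
    std::string url;
    double score;
};

// Merge the per-server result lists and keep the k highest-scoring entries.
// Ties are broken by URL so the output order is deterministic.
std::vector<RankedResult> mergeTopK(
    const std::vector<std::vector<RankedResult>>& perServer, std::size_t k) {
    // Concatenate every server's results into one list.
    std::vector<RankedResult> merged;
    for (const auto& results : perServer) {
        merged.insert(merged.end(), results.begin(), results.end());
    }

    // Higher score first; equal scores fall back to lexicographic URL order.
    auto better = [](const RankedResult& a, const RankedResult& b) {
        if (a.score != b.score) return a.score > b.score;
        return a.url < b.url;
    };

    // Only the first `keep` positions need to be sorted.
    const std::size_t keep = std::min(k, merged.size());
    const auto middle = merged.begin() + static_cast<std::ptrdiff_t>(keep);
    std::partial_sort(merged.begin(), middle, merged.end(), better);
    merged.erase(middle, merged.end());
    return merged;
}
```

Using `std::partial_sort` avoids fully sorting the combined list when only the top k results are shown. A server that returns no results simply contributes an empty list.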