Responsible for managing and selecting the URLs to be crawled
Downloads web pages in accordance with the politeness policies
Parses the HTML content and extracts the metadata, images, links, etc. from the downloaded page
Responsible for maintaining the inverted index of the retrieved/extracted data
Acts as the manager, coordinating the overall crawling process and assigning tasks to the various modules.
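
The five responsibilities above can be sketched as one small program. This is a minimal illustration, not a reference implementation: the names `Frontier`, `LinkParser`, `InvertedIndex`, and `crawl`, the fixed per-host `delay` standing in for a politeness policy, and the injected `fetch` callable are all assumptions made for the example. Passing `fetch` in keeps the sketch free of network code; a real downloader would also need to honor robots.txt and handle errors.

```python
import re
import time
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class Frontier:
    """Manages and selects URLs; a per-host delay models politeness (assumed policy)."""

    def __init__(self, delay=1.0):
        self.queue = deque()
        self.seen = set()
        self.delay = delay
        self.next_allowed = {}  # host -> earliest time of the next request

    def add(self, url):
        if url not in self.seen:  # never enqueue the same URL twice
            self.seen.add(url)
            self.queue.append(url)

    def next_url(self):
        """Return a URL whose host may be contacted now, or None."""
        now = time.monotonic()
        for _ in range(len(self.queue)):
            url = self.queue.popleft()
            host = urlparse(url).netloc
            if self.next_allowed.get(host, float("-inf")) <= now:
                self.next_allowed[host] = now + self.delay
                return url
            self.queue.append(url)  # host is still cooling down
        return None


class LinkParser(HTMLParser):
    """Extracts outgoing links and visible text from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        self.text.append(data)


class InvertedIndex:
    """Maps each term to the set of URLs that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, url, text):
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[term].add(url)


def crawl(seeds, fetch, delay=1.0, max_pages=10):
    """Coordinator: hands work from the frontier to fetcher, parser, and indexer."""
    frontier, index = Frontier(delay), InvertedIndex()
    for seed in seeds:
        frontier.add(seed)
    crawled = 0
    while frontier.queue and crawled < max_pages:
        url = frontier.next_url()
        if url is None:
            time.sleep(delay)  # every queued host is cooling down
            continue
        html = fetch(url)  # hypothetical downloader supplied by the caller
        crawled += 1
        if html is None:
            continue
        parser = LinkParser(url)
        parser.feed(html)
        parser.close()
        index.add(url, " ".join(parser.text))
        for link in parser.links:
            frontier.add(link)
    return index
```

For a quick trial without network access, `fetch` can be the `get` method of a dictionary that maps URLs to HTML strings, for example `crawl(["http://a.test/"], pages.get, delay=0.0)`.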