Objective: build tools that retrieve and parse information from across the internet using a fully automated web crawler.
- Created an automated web crawler using Scrapy, Splash, and Selenium.
- Developed a CrawlSpider to navigate websites with dynamic content and scrape JavaScript-driven pages.
- Worked on middleware to improve the crawler's performance while keeping requests polite to the sites being crawled.
- Technologies used: Python, Scrapy, Splash, Selenium.
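
The politeness logic mentioned above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual middleware: the `PolitenessGate` class, its method names, the `example-bot` user agent, and the delay value are all hypothetical, chosen only to show the two usual ingredients of a polite crawler (respecting robots.txt and spacing out requests per domain). In Scrapy itself, this behaviour is more commonly configured through built-in settings such as `ROBOTSTXT_OBEY`, `DOWNLOAD_DELAY`, and `AUTOTHROTTLE_ENABLED`.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


class PolitenessGate:
    """Hypothetical helper: robots.txt checks plus a per-domain delay."""

    def __init__(self, delay_seconds=1.0, user_agent="example-bot",
                 clock=time.monotonic):
        self.delay_seconds = delay_seconds
        self.user_agent = user_agent
        self.clock = clock            # injectable so the delay logic is testable
        self._last_request = {}       # domain -> time of the last recorded request
        self._robots = {}             # domain -> parsed robots.txt rules

    def load_robots(self, domain, robots_txt):
        """Parse robots.txt text that the caller has already downloaded."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self._robots[domain] = parser

    def allowed(self, url):
        """Return False only if loaded robots.txt rules forbid this URL."""
        parser = self._robots.get(urlparse(url).netloc)
        if parser is None:
            return True               # assumption: no rules loaded means allowed
        return parser.can_fetch(self.user_agent, url)

    def wait_time(self, url):
        """Seconds to wait before the next request to this URL's domain."""
        last = self._last_request.get(urlparse(url).netloc)
        if last is None:
            return 0.0
        return max(0.0, self.delay_seconds - (self.clock() - last))

    def record(self, url):
        """Note that a request to this URL's domain was just sent."""
        self._last_request[urlparse(url).netloc] = self.clock()
```

A crawler loop would call `allowed(url)` before scheduling a request, sleep for `wait_time(url)` seconds, send the request, and then call `record(url)`. Keeping the clock injectable is a design choice made here so that the delay arithmetic can be checked without real sleeping.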