A fully customizable web content crawler for collecting ML datasets.
- text crawler (general text crawler)
- classified text crawler (crawls text contained in buttons, input placeholders, etc.)
- image crawler
- screenshot crawler
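
To make "classified text" concrete, here is a minimal sketch of the idea: text is grouped by the kind of UI element that contains it. It uses only Python's standard-library `html.parser` so it runs without dependencies; it is not the project's Scrapy implementation, and the names `ClassifiedTextParser`, `classify_text`, and the `TEXT_TAGS` label set are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class ClassifiedTextParser(HTMLParser):
    """Group visible text by the UI element that contains it (illustrative only)."""

    # Hypothetical label set; the project's real categories may differ.
    TEXT_TAGS = {"button", "a", "label", "h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self.classified = defaultdict(list)
        self._stack = []  # tracked tags that are currently open

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            # <input> has no inner text; its label lives in the placeholder attribute.
            placeholder = dict(attrs).get("placeholder")
            if placeholder:
                self.classified["input_placeholder"].append(placeholder)
            return
        if tag in self.TEXT_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost matching tag and anything left open inside it.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            # Attribute the text to the innermost tracked element.
            self.classified[self._stack[-1]].append(text)


def classify_text(html):
    """Return a dict mapping element label to the list of texts found in it."""
    parser = ClassifiedTextParser()
    parser.feed(html)
    parser.close()
    return dict(parser.classified)


if __name__ == "__main__":
    sample = (
        "<form><label>Email</label>"
        '<input placeholder="you@example.com">'
        "<button>Sign up</button></form>"
    )
    print(classify_text(sample))
    # {'label': ['Email'], 'input_placeholder': ['you@example.com'], 'button': ['Sign up']}
```

In a Scrapy spider, the same grouping would more likely be done with selectors on the response rather than a hand-written parser; the sketch only shows the shape of the output a classified text crawler could produce.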
Follows the general Bridged contributing guidelines.
The crawlers are powered by Scrapy on Python 3. Selenium will be added later to collect screenshots and to support client-side rendered apps.
(WIP) A tutorial will be provided soon.
Go to ui-dataset for the ML-ready dataset.