This repo contains the source code used at the DNA Capitals Hackathon.
It has two main components:
- Company Info Crawler - Ruby on Rails application that extracts generic information from the target website. It uses Elasticsearch and Sidekiq alongside the application.
- Web Scraper - Python notebooks that use Beautiful Soup to extract specific information from the target website.
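
As a rough illustration of the kind of extraction the Web Scraper notebooks perform, here is a minimal sketch using Beautiful Soup. The sample HTML, the `company-name` and `contact` class names, and the `extract_company_info` helper are all hypothetical and are not taken from the notebooks in this repo; the real selectors depend on the target website.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, inlined so the sketch runs without network access.
SAMPLE_HTML = """
<html>
  <head><title>Example Co</title></head>
  <body>
    <h1 class="company-name">Example Co</h1>
    <a class="contact" href="mailto:info@example.com">Email us</a>
  </body>
</html>
"""


def extract_company_info(html):
    """Return the company name and contact email found in the HTML, or None for each if missing."""
    soup = BeautifulSoup(html, "html.parser")

    # Assumed markup: the name lives in <h1 class="company-name">.
    name_tag = soup.find("h1", class_="company-name")

    # Assumed markup: the contact link is <a class="contact" href="mailto:...">.
    contact_tag = soup.find("a", class_="contact")
    email = None
    if contact_tag is not None and contact_tag.get("href", "").startswith("mailto:"):
        email = contact_tag["href"][len("mailto:"):]

    return {
        "name": name_tag.get_text(strip=True) if name_tag else None,
        "email": email,
    }


info = extract_company_info(SAMPLE_HTML)
print(info)
```

Returning `None` for missing fields, rather than raising, keeps a batch run going when a page does not match the expected layout.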