Site Lab aims to be an open-source replacement for website analysis tools such as BuiltWith, NerdyData, and DataNyze.
Site Lab is a Ruby on Rails application. It uses PostgreSQL as its database and Redis + Sidekiq for background processing.
How Does it Work?
Right now, it's fairly simple:
- The MetaInspector Gem retrieves some basic info about the site/URL
- There is a "Technology" model which stores regular expressions
- Technologies are matched against the source of the sites/URLs
- Much of the processing now happens in the background (via Sidekiq)
More complex analysis is in the works.
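As a rough illustration of the matching step described above, here is a minimal sketch in plain Ruby. It is not the app's actual code: the `Technology` struct stands in for the real model, and the `pattern` field name, the `detect_technologies` helper, and the sample regular expressions are all assumptions made for the example.

```ruby
# Hypothetical stand-in for the app's Technology model:
# a display name plus a regular expression stored as a string.
Technology = Struct.new(:name, :pattern)

# Returns the names of every technology whose pattern matches the page source.
def detect_technologies(source, technologies)
  technologies.select do |tech|
    # Compile the stored string into a case-insensitive regex and test it.
    source =~ Regexp.new(tech.pattern, Regexp::IGNORECASE)
  end.map(&:name)
end

# Sample patterns, invented for illustration only.
TECHNOLOGIES = [
  Technology.new("jQuery", 'jquery[.-]?[\d.]*(\.min)?\.js'),
  Technology.new("Google Analytics", 'google-analytics\.com/(ga|analytics)\.js'),
  Technology.new("WordPress", 'wp-content/')
]

html = '<script src="/assets/jquery-1.11.1.min.js"></script>' \
       '<link href="/wp-content/themes/x/style.css">'

puts detect_technologies(html, TECHNOLOGIES).inspect
# prints ["jQuery", "WordPress"]
```

In the real app this work would presumably run inside a Sidekiq job, with the page source fetched beforehand and the patterns loaded from the database.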
It's a Rails 4.1 app, so you'll need a development environment that supports it (probably via RVM). You'll also need Redis installed and running (probably via Homebrew).
- Clone the repo
- Edit the database.yml file with your info
- Run bundle install to install gems
- Run bundle exec rake db:create to create the DB(s)
- Run bundle exec rake db:seed to load the seed data
- Run foreman start -p 3000 to start the Rails server and Sidekiq locally on port 3000
While you can add sites/URLs one by one in the app, most use cases will involve importing large sets of URLs from files or external sites. With that in mind, I've started a set of Rake tasks for importing URLs. Currently, they include:
- Importing all startups from AngelList for a given market
- Importing all startup/product URLs listed on Product Hunt
- Importing URLs from a text file (placed in app/import)
- Importing all startup URLs from VCDelta
Run rake -T to see the tasks and their required parameters. There is also a sample text file in app/import.
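To give a feel for what the text-file import involves, here is a small sketch of the parsing step in plain Ruby. It assumes one URL per line, which the source does not confirm. The helper names `parse_url_list` and `valid_http_url?` are hypothetical and are not taken from the app's Rake tasks.

```ruby
require "uri"

# Hypothetical check: true only for parseable http(s) URLs that have a host.
def valid_http_url?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: turns raw text (assumed one URL per line)
# into a clean, de-duplicated list of URLs.
def parse_url_list(text)
  text.each_line
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?("#") }            # skip blanks and comments
      .map { |line| line =~ %r{\Ahttps?://}i ? line : "http://#{line}" }  # add a default scheme
      .select { |url| valid_http_url?(url) }                             # drop unparseable entries
      .uniq
end

sample = <<~TEXT
  example.com
  # a comment line

  https://example.org/path
  example.com
  not a url
TEXT

puts parse_url_list(sample).inspect
```

A Rake task could pass the result of `File.read` on a file in app/import to a helper like this and then queue one background job per URL, though how the existing tasks actually do it may differ.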