Crawls the web for phone numbers and metadata of people interested in mobile dating, then saves the results to your configured S3 bucket.
Optionally, it can send an email summarizing the results or queue a Resque job to process them.
Clone the app:
$ git clone https://github.com/dwilkie/dating_crawler.git
And then execute:
$ bundle
Configuration can be specified through environment variables. The following environment variables can be used:
AWS_ACCESS_KEY_ID # required - your aws access key id
AWS_SECRET_ACCESS_KEY # required - your aws secret access key
AWS_S3_BUCKET # required - the bucket in which to upload the results
RACK_ENV # required - specifies your environment. Set to production for deployment
GMAIL_ACCOUNT # optional - the Gmail account to use when sending the results email
GMAIL_PASSWORD # optional - the Gmail password for the account above
RECIPIENT_EMAIL # optional - the recipient of the results email
REDIS_URL # optional - the Redis URL to connect to for queuing the Resque job
RESQUE_QUEUE # optional - the Resque queue in which to queue the job
RESQUE_WORKER # optional - the Resque worker which will run the job
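As a quick sanity check before running the crawler, the variables listed above can be verified with a small Ruby sketch. The helper names (`missing_required`, `enabled_features`) and the grouping of the optional variables into "email" and "resque" features are illustrative assumptions, not part of the app itself.

```ruby
# Required variables, as listed above.
REQUIRED_VARS = %w[AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_S3_BUCKET RACK_ENV].freeze

# Optional variables grouped by feature. The grouping is an assumption
# made for this sketch; the app may treat them differently.
OPTIONAL_GROUPS = {
  email:  %w[GMAIL_ACCOUNT GMAIL_PASSWORD RECIPIENT_EMAIL],
  resque: %w[REDIS_URL RESQUE_QUEUE RESQUE_WORKER]
}.freeze

# Returns the names of required variables that are unset or blank in env.
def missing_required(env = ENV)
  REQUIRED_VARS.select { |name| env[name].to_s.strip.empty? }
end

# Returns the optional features whose variables are all set and non-blank.
def enabled_features(env = ENV)
  OPTIONAL_GROUPS.select do |_feature, names|
    names.all? { |name| !env[name].to_s.strip.empty? }
  end.keys
end
```

Calling `missing_required` with no arguments inspects the real environment; an empty array means all required variables are present. `enabled_features` reports which optional steps appear to be fully configured.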
You can also configure the app by passing the configuration directly, e.g.:
require './app/models/data_fetcher'
DataFetcher.new.fetch!(configuration)
See the source for all available configuration options.
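One way to build such a configuration is to start from the environment and override individual values. This is only a sketch: `build_configuration` is a hypothetical helper, and the key names it produces (lower-cased variable names as symbols) are an assumption, not taken from the project source. Check the source for the option names `fetch!` actually accepts.

```ruby
# Variables to copy into the configuration hash (names from the list above).
CONFIG_VARS = %w[AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_S3_BUCKET].freeze

# Hypothetical helper: builds a hash from env, skipping unset variables,
# then applies explicit overrides. The key naming scheme is an assumption.
def build_configuration(overrides = {}, env = ENV)
  from_env = CONFIG_VARS.each_with_object({}) do |name, config|
    value = env[name]
    config[name.downcase.to_sym] = value unless value.nil?
  end
  from_env.merge(overrides)
end

# Possible usage, assuming these keys match the real options:
#   require './app/models/data_fetcher'
#   DataFetcher.new.fetch!(build_configuration(aws_s3_bucket: 'another-bucket'))
```

Explicit overrides take precedence over the environment, which keeps one-off runs from requiring changes to exported variables.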
$ bundle exec rake data:fetch
require './app/models/data_fetcher'
DataFetcher.new.fetch!
$ heroku create
$ git push heroku master
$ heroku config:add AWS_ACCESS_KEY_ID=aws_access_key_id AWS_SECRET_ACCESS_KEY=aws_secret_access_key AWS_S3_BUCKET=aws_s3_bucket RACK_ENV=production
$ heroku config:add GMAIL_ACCOUNT=someone@gmail.com GMAIL_PASSWORD=secret RECIPIENT_EMAIL=someone@example.com
$ heroku config:add REDIS_URL=redis_url RESQUE_QUEUE=some_worker_queue RESQUE_WORKER=SomeWorker
$ heroku run rake data:fetch
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request