Skip to content

KevinColemanInc/mm-crawler

Repository files navigation

MM Crawler

This project scrapes profiles from meetme.com using docker. Follow the quick start and it will create an API for accessing all of the created profiles to be hosted on netlify.

Scrapes profiles between ages 19 and 35.

Quick start

Required ENV variables

  • MM_EMAIL=
  • MM_PASSWORD=

Steps

Note the locations in app.rb.

  1. docker-compose up --build
  2. ssh into docker host
  3. run ruby app.rb
  4. wait for sidekiq jobs to finish (takes about 1-2 hours)
  5. run ruby consolidate.rb - creates /results/profiles.json with all crawled profile info and relative photo paths

Netlify deploy

In the ~/Desktop/results, run netlify deploy. Make note of the domain.

Workers

  1. nearby_crawler - fetches the profiles nearby and stores them. Queues fetching photo json
  2. get_photo_jsons(member_id) - returns json of photos
  3. get_photo(url) - persists photos

Output folder structure (in Docker):

/results/

  • {member_id}.json
  • {member_id}_{photo_id}.jpg

Releases

No releases published

Packages

No packages published

Languages