MM Crawler

This project uses Docker to scrape profiles from meetme.com. Following the quick start produces a JSON API for accessing all of the crawled profiles, which can then be hosted on Netlify.

It scrapes profiles of members between the ages of 19 and 35.

Quick start

Required ENV variables

MM_EMAIL=
MM_PASSWORD=
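
A minimal sketch of how a script could fail fast when these variables are missing. The helper name `load_credentials` and the `REQUIRED_ENV` constant are hypothetical and not taken from app.rb; only the two variable names come from the list above.

```ruby
# Hypothetical helper (not part of the project): reads the two required
# variables and raises a clear error if either one is unset or blank.
REQUIRED_ENV = %w[MM_EMAIL MM_PASSWORD].freeze

def load_credentials(env = ENV)
  missing = REQUIRED_ENV.select { |key| env[key].to_s.strip.empty? }
  raise ArgumentError, "Missing ENV variables: #{missing.join(', ')}" unless missing.empty?

  { email: env['MM_EMAIL'], password: env['MM_PASSWORD'] }
end
```

Calling `load_credentials` once at startup surfaces a missing variable before any crawl job is queued.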

Steps

Before you start, check the locations defined in app.rb.

docker-compose up --build
SSH into the Docker host
Run ruby app.rb
Wait for the Sidekiq jobs to finish (takes about 1-2 hours)
Run ruby consolidate.rb, which creates /results/profiles.json with all crawled profile info and relative photo paths
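
The consolidation step could look roughly like the sketch below. This is an illustration, not the contents of consolidate.rb: the `consolidate` method, the `photos` key, and the assumption that each `{member_id}.json` file holds a single JSON object are all hypothetical.

```ruby
require 'json'

# Hypothetical consolidation (not the project's consolidate.rb): merges every
# {member_id}.json in results_dir into one profiles.json and attaches the
# relative file names of that member's photos under an assumed 'photos' key.
def consolidate(results_dir)
  profiles = Dir.glob(File.join(results_dir, '*.json')).sort.filter_map do |path|
    member_id = File.basename(path, '.json')
    next if member_id == 'profiles' # skip the output of an earlier run

    profile = JSON.parse(File.read(path))
    photos = Dir.glob(File.join(results_dir, "#{member_id}_*.jpg")).sort
    profile.merge('photos' => photos.map { |photo| File.basename(photo) })
  end

  File.write(File.join(results_dir, 'profiles.json'), JSON.pretty_generate(profiles))
  profiles
end
```

Storing only file names keeps the photo paths relative to the results folder, so they stay valid wherever that folder is deployed.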

Netlify deploy

From the ~/Desktop/results directory, run netlify deploy. Make a note of the domain it reports.

Workers

nearby_crawler - fetches nearby profiles and stores them, then queues a job to fetch each profile's photo JSON
get_photo_jsons(member_id) - returns the photo JSON for a member
get_photo(url) - downloads a photo and persists it
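
The way these workers feed each other can be sketched in plain Ruby. This is a stand-in under stated assumptions, not the project's code: a plain array replaces the Sidekiq queue, `CrawlPipeline` and the `client` object (responding to `nearby`, `photo_jsons` and `download`) are hypothetical, the JSON field names `member_id`, `id` and `url` are guesses, and `get_photo` takes two extra arguments here so that it can build the output file name.

```ruby
require 'json'

# Hypothetical stand-in for the three workers. An array replaces the Sidekiq
# queue; `client` is any object that responds to `nearby`,
# `photo_jsons(member_id)` and `download(url)` (all assumed names).
class CrawlPipeline
  def initialize(client, results_dir)
    @client = client
    @results_dir = results_dir
    @queue = []
  end

  # nearby_crawler: store each profile, then queue a photo-JSON job for it.
  def nearby_crawler
    @client.nearby.each do |profile|
      member_id = profile.fetch('member_id')
      File.write(File.join(@results_dir, "#{member_id}.json"), JSON.generate(profile))
      @queue << [:get_photo_jsons, member_id]
    end
  end

  # get_photo_jsons: fetch the member's photo list and queue one download each.
  def get_photo_jsons(member_id)
    @client.photo_jsons(member_id).each do |photo|
      @queue << [:get_photo, member_id, photo.fetch('id'), photo.fetch('url')]
    end
  end

  # get_photo: persist the image bytes as {member_id}_{photo_id}.jpg.
  def get_photo(member_id, photo_id, url)
    path = File.join(@results_dir, "#{member_id}_#{photo_id}.jpg")
    File.binwrite(path, @client.download(url))
  end

  # Run the crawler, then drain the queue in order like a single worker would.
  def run
    nearby_crawler
    until @queue.empty?
      job, *args = @queue.shift
      public_send(job, *args)
    end
  end
end
```

Splitting the work into one job per profile and one per photo means a failed download can be retried on its own without repeating the nearby crawl.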

Output folder structure (in Docker):

/results/

{member_id}.json
{member_id}_{photo_id}.jpg
