A scraper that runs every 5 hours to find apartments automatically, instead of searching manually every day
.vscode
house-scraper-timer
utils
.dockerignore
.gitignore
Dockerfile
README.md
azure-pipelines.yml
host.json
package-lock.json
package.json
release-notification.sh

README.md

Welcome to house-finder 👋

Documentation

A scheduled task runs on a timer to find apartments automatically, instead of searching manually every day. Tech: Azure Functions, Docker, Puppeteer, S3.

🏠 Homepage

Steps

  1. A function is triggered on a timer.
  2. It fetches the existing data from an S3 bucket.
  3. It goes to website 1 and scrapes all the data.
  4. It compares the scraped data with the existing data from S3 and removes duplicates, keeping only the new listings to reduce overhead.
  5. It makes API calls to binmap to filter out unwanted data.
  6. It goes to website 2, scrapes it, and repeats steps 4 and 5.
  7. It uploads both the latest data and the old data to S3, so they are not scraped again in the future.
  8. It sends an email containing only the latest data.
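
The de-duplication in step 4 and the merge in step 7 could look roughly like the sketch below. This is an illustration only, not the code in this repository: the function names `findNewListings` and `mergeListings` are hypothetical, and it assumes each listing is a plain object that can be identified by a unique `url` field.

```javascript
// Sketch only: helper names and the `url` key are assumptions, not taken from this repo.

/** Return only the scraped listings whose url is not already stored (step 4). */
function findNewListings(existing, scraped) {
  const seen = new Set(existing.map((listing) => listing.url));
  const fresh = [];
  for (const listing of scraped) {
    if (!seen.has(listing.url)) {
      seen.add(listing.url); // also drops duplicates within the same scrape
      fresh.push(listing);
    }
  }
  return fresh;
}

/** Combine stored and new listings into the array to upload back to S3 (step 7). */
function mergeListings(existing, fresh) {
  return [...existing, ...fresh];
}

// Example data standing in for the S3 contents and one scrape result.
const existing = [{ url: 'https://example.com/a', price: 1000 }];
const scraped = [
  { url: 'https://example.com/a', price: 1000 },
  { url: 'https://example.com/b', price: 1200 },
  { url: 'https://example.com/b', price: 1200 },
];

const fresh = findNewListings(existing, scraped);
console.log(fresh.map((listing) => listing.url));
console.log(mergeListings(existing, fresh).length);
```

Only `fresh` would go into the email, while the merged array is what gets written back to the bucket. Using a `Set` keeps the comparison linear in the number of listings.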

Install

npm install

Usage

npm run start

Run tests

npm run test

Author

👤 YIZHUANG

Show your support

Give a ⭐️ if this project helped you!
