Data mining and analysis for old houses for sale.

Wizdore/finn_scraper

Finn Scraper

This application scrapes house listings from Finn, stores them, and notifies the user about the scraping process through Discord. I wrote it to collect data for a data analysis task: Finn doesn't show more than 50 pages of houses at a time, so the scraper collects new house data every day. I did some preliminary data exploration in the Jupyter notebook, but there wasn't enough data to build prediction models, which is why I wrote a script that collects new data daily.

What it does

  • Mines houses-for-sale data from Finn
  • Stores the data in a database in the datastore/ directory
  • Notifies the user over Discord of how many houses have been scraped

Tested on

  • Arch Linux on a laptop
  • Raspberry Pi OS on a Raspberry Pi 4B

The project also includes a bash script, cronjob.sh, that can be used to run the application periodically with crontab on Linux. I'm using pipenv for package and environment management, TinyDB for the database, and a Discord webhook to send messages to a Discord server. The webhook URL needs to be saved in a .env file in the project directory as an environment variable named DISCORD_WEBHOOK; pipenv will load the environment variable from that file.
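
As a rough illustration of the notification step, the sketch below reads DISCORD_WEBHOOK from the environment and posts a short message to it using only the standard library. The function names `build_payload` and `notify`, the message wording, and the User-Agent value are assumptions for this example. The project's own code may use a different HTTP client or message format.

```python
import json
import os
import urllib.request


def build_payload(scraped_count: int) -> dict:
    """Build the JSON body for a Discord webhook message (wording is illustrative)."""
    return {"content": f"Finn Scraper: {scraped_count} houses scraped."}


def notify(scraped_count: int) -> None:
    """POST the message to the webhook URL stored in DISCORD_WEBHOOK."""
    webhook_url = os.environ["DISCORD_WEBHOOK"]  # loaded from .env by pipenv
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(build_payload(scraped_count)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Explicit User-Agent; the value here is an arbitrary placeholder.
            "User-Agent": "finn-scraper-sketch",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()
```

With a .env file containing a line of the form `DISCORD_WEBHOOK=<your webhook URL>`, running the script through `pipenv run` makes the variable available to `os.environ`.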

TODO:

  • Deterministic Build System
  • More robust scraping error reporting
  • Scraping job posts
  • Dashboarding