Watch a website and get alerted by email when it changes

qbalin/site_watcher

site_watcher

This utility works best as a cron job. It loads a given web page and uses an XPath selector to check whether an element is present. If the element is present, the program exits; if not, it sends an email.

Installation

cd site_watcher
yarn install

Config file

To specify which sites to watch, and which element to look for, create a file named config.json in the site_watcher folder:

touch config.json

Open it, and paste something like:

[
  {
    "page": "https://www.lego.com/en-us/product/pirate-ship-31109",
    "selector": "//*[@data-test='product-overview-availability']//descendant::*[text()='Coming Soon']",
    "name": "Lego"
  }
]

You can add as many entries as you like, just make sure their names are unique. This example checks for the text "Coming Soon" on the Lego product page https://www.lego.com/en-us/product/pirate-ship-31109. As long as "Coming Soon" is present, I won't receive an email. Once it is absent, I will receive an email with a link to the site.

A file with a screenshot of the site will be created locally, with the name Lego.png, and the HTML will also be captured in a file named Lego.html. The presence of these files makes the watcher stop looking at this site, so that I do not get flooded with emails.
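The two rules above (unique names, and skipping sites whose files already exist) can be sketched as follows. This is only an illustration, not the watcher's actual code: the helper names `findDuplicateNames` and `pendingEntries` are hypothetical, and the sketch assumes that either the `.png` or the `.html` file is enough to mark a site as already reported.

```javascript
const fs = require('fs');
const path = require('path');

// Returns the names that appear more than once in the config entries.
// (Hypothetical helper, not part of the original script.)
function findDuplicateNames(entries) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { name } of entries) {
    if (seen.has(name)) duplicates.add(name);
    seen.add(name);
  }
  return [...duplicates];
}

// Keeps only the entries with no <name>.png and no <name>.html in `dir`,
// i.e. the sites that have not triggered a notification yet.
// Assumption: either file alone is enough to stop watching a site.
function pendingEntries(entries, dir) {
  return entries.filter(({ name }) =>
    !fs.existsSync(path.join(dir, `${name}.png`)) &&
    !fs.existsSync(path.join(dir, `${name}.html`)));
}

// Example data mirroring the config.json shown above.
const entries = [
  {
    page: 'https://www.lego.com/en-us/product/pirate-ship-31109',
    selector: "//*[@data-test='product-overview-availability']//descendant::*[text()='Coming Soon']",
    name: 'Lego',
  },
];

console.log('Duplicate names:', findDuplicateNames(entries));
console.log('Still watching:', pendingEntries(entries, process.cwd()).map((e) => e.name));
```

Under this assumption, deleting Lego.png and Lego.html would make the watcher check the Lego page again on its next run.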

.env

To send emails, the script needs an account it can log in to with a username and password, and an email address to which change notifications are sent.

touch .env

In the .env file, add this, replacing the dummy values with proper credentials:

EMAIL_ACCOUNT_PASSWORD=passwordToSendingAccount
EMAIL_ACCOUNT=sendingAccount
SEND_TO=emailAddressToNotify

Nodemailer is used to send emails. To configure a Gmail account from which the notifications can be sent, check out this page.
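A missing or empty value in .env is an easy mistake to make, so it can help to check the three variables before attempting to send anything. The sketch below is an assumption about how such a check could look, not the script's actual code: `readMailSettings` and the returned field names are hypothetical, and it expects the .env values to already be loaded into an object such as `process.env`.

```javascript
// The three variables described in the .env section above.
const REQUIRED_SETTINGS = ['EMAIL_ACCOUNT', 'EMAIL_ACCOUNT_PASSWORD', 'SEND_TO'];

// Hypothetical helper: validates the settings and returns them in one object.
// `env` is any object holding the variables, for example process.env.
function readMailSettings(env) {
  const missing = REQUIRED_SETTINGS.filter(
    (key) => typeof env[key] !== 'string' || env[key].trim() === ''
  );
  if (missing.length > 0) {
    throw new Error(`Missing settings in .env: ${missing.join(', ')}`);
  }
  return {
    account: env.EMAIL_ACCOUNT,
    password: env.EMAIL_ACCOUNT_PASSWORD,
    sendTo: env.SEND_TO,
  };
}

// Usage, once the .env file has been loaded into process.env:
// const settings = readMailSettings(process.env);
```

Failing early with the list of missing names is easier to diagnose from a cron log than an authentication error from the mail server.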

Raspberry Pi vs Dev machine

This script should run natively under Raspberry Pi OS, provided Chromium is installed. To run it in debug mode on your development machine, replace the line

const browser = await puppeteer.launch({ executablePath: "/usr/bin/chromium-browser", args: ['--no-sandbox', '--disable-setuid-sandbox'] });

with

const browser = await puppeteer.launch({ headless: false });
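Instead of editing the source each time, the choice between the two launch calls could be driven by a flag. This is a sketch of that idea only: the `launchOptions` helper and the `DEBUG_BROWSER` environment variable are hypothetical and not part of the script; the two option objects are the ones shown above.

```javascript
// Hypothetical helper: returns the options object to pass to puppeteer.launch().
function launchOptions(debug) {
  if (debug) {
    // Development machine: use Puppeteer's default browser, with a visible window.
    return { headless: false };
  }
  // Raspberry Pi OS: use the system Chromium.
  return {
    executablePath: '/usr/bin/chromium-browser',
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  };
}

// DEBUG_BROWSER is an assumed variable name, chosen for this example.
const options = launchOptions(process.env.DEBUG_BROWSER === '1');
console.log(options);

// In the script itself, the result would be used like this:
// const browser = await puppeteer.launch(options);
```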

Tor Support

This blog post explains quite clearly how to set up Tor and add ports to get a different address for each use. Once you have assigned a list of ports, add it to the .env file like so:

TOR_PORTS=9050,9052,9053,9054,9055

A random port will be selected at each run.
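The port selection can be sketched as below. This is an illustration under assumptions, not the script's actual code: the helper names are hypothetical, and the proxy argument assumes Tor is listening as a SOCKS5 proxy on 127.0.0.1, which is its default. `--proxy-server` is a standard Chromium command-line switch.

```javascript
// Parses a value such as "9050,9052,9053" into a list of valid port numbers.
function parseTorPorts(value) {
  return (value || '')
    .split(',')
    .map((part) => part.trim())
    .filter((part) => part !== '')
    .map(Number)
    .filter((port) => Number.isInteger(port) && port > 0 && port < 65536);
}

// Picks one port at random, or returns null when none is configured.
// `random` can be swapped out to make the choice predictable in tests.
function pickTorPort(value, random = Math.random) {
  const ports = parseTorPorts(value);
  if (ports.length === 0) return null;
  return ports[Math.floor(random() * ports.length)];
}

// Builds the Chromium argument that routes traffic through the chosen port.
// Assumption: Tor runs locally and exposes a SOCKS5 proxy on that port.
function torProxyArg(port) {
  return `--proxy-server=socks5://127.0.0.1:${port}`;
}

const port = pickTorPort(process.env.TOR_PORTS);
if (port !== null) {
  console.log('Extra launch argument:', torProxyArg(port));
}
```

The resulting string would be appended to the `args` array passed to `puppeteer.launch()`. When TOR_PORTS is unset, `pickTorPort` returns null and no proxy argument is added.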

Scrape responsibly!

Do not spam the websites you are watching! Watch them once a day, or less. Remember that every connection is energy spent :)

References

On a Raspberry Pi, launch Chromium with either:

const browser = await puppeteer.launch({ executablePath: 'chromium-browser' });

or, using the full path:

const browser = await puppeteer.launch({ executablePath: '/usr/bin/chromium-browser' });
