I built this Python script to get email notifications whenever new job ads are posted on career sites relevant to me.
There is one central, modular script, job_scrape.py, which calls individual functions in /websites/, each scraping a different career site.
Each scraper module works as follows:
- Parse* webpages containing current job listings.
- Process and store job details in a dictionary.
- Compare current postings with those from the last execution.
- Extract new postings and, if applicable, filter on relevant criteria (e.g. location).
- Return a dictionary containing details of the filtered postings, or None when there are no new jobs.
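The compare, filter, and return steps above could look roughly like the sketch below. It is not the code from /websites/: the function names, the JSON state file, the use of the posting URL as dictionary key, and the `location` field are all assumptions made for illustration.

```python
import json
from pathlib import Path


def load_previous(state_file):
    """Load the postings stored by the last run (hypothetical JSON state file)."""
    path = Path(state_file)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_current(state_file, current):
    """Store the current postings so the next run can compare against them."""
    Path(state_file).write_text(json.dumps(current), encoding="utf-8")


def find_new_postings(current, previous, location=None):
    """Return postings absent from the previous run, or None if there are none.

    Both arguments map a posting key (assumed here: its URL) to a dict of details.
    """
    new = {key: job for key, job in current.items() if key not in previous}
    if location is not None:
        # Optional filter step; "location" is an assumed field name.
        new = {key: job for key, job in new.items() if job.get("location") == location}
    return new or None
```

A scraper module would call `load_previous`, then `find_new_postings`, then `save_current`, so that each run only reports postings it has not seen before.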
These results are then joined into formatted email text (MIMEMultipart class) in job_scrape.py. A secured SMTP connection is then opened and the email is sent.
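As a minimal sketch of this step, assuming each scraper result maps a posting URL to a dict with a `title` field: the function names, the subject line, and the choice of STARTTLS (rather than an SSL connection from the start) are assumptions, not taken from job_scrape.py.

```python
import smtplib
import ssl
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(results, sender, recipient):
    """Join per-site results into one plain-text email, or return None if empty."""
    sections = []
    for site, postings in results.items():
        if not postings:  # this scraper returned None: nothing new on that site
            continue
        lines = [f"- {job['title']}: {url}" for url, job in postings.items()]
        sections.append(site + "\n" + "\n".join(lines))
    if not sections:
        return None
    msg = MIMEMultipart()
    msg["Subject"] = "New job postings"  # hypothetical subject line
    msg["From"] = sender
    msg["To"] = recipient
    msg.attach(MIMEText("\n\n".join(sections), "plain", "utf-8"))
    return msg


def send_message(msg, host, port, user, password):
    """Send the message over an SMTP connection upgraded with STARTTLS."""
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port) as server:
        server.starttls(context=context)
        server.login(user, password)
        server.send_message(msg)
```

Returning None from `build_message` when no site has new postings lets the caller skip opening an SMTP connection entirely.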
This script is executed every 24h as a scheduled task on PythonAnywhere.
*Note: For static webpages, the BeautifulSoup package is used to scrape and parse HTML documents. For dynamic webpages, Selenium's WebDriver is used, which starts a headless browser to capture the rendered data.
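For the static case, parsing with BeautifulSoup might look like the sketch below. The HTML structure, the CSS classes, and the `parse_listings` name are invented for illustration; every real career site needs its own selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded career page.
SAMPLE_HTML = """
<ul>
  <li class="job"><a href="/jobs/1">Data Engineer</a><span class="location">Berlin</span></li>
  <li class="job"><a href="/jobs/2">Backend Developer</a><span class="location">Remote</span></li>
</ul>
"""


def parse_listings(html):
    """Collect job details into a dictionary keyed by the posting link."""
    soup = BeautifulSoup(html, "html.parser")
    postings = {}
    for item in soup.select("li.job"):
        link = item.find("a")
        location = item.find("span", class_="location")
        postings[link["href"]] = {
            "title": link.get_text(strip=True),
            "location": location.get_text(strip=True) if location else None,
        }
    return postings
```

For a dynamic page, the same parsing function could be fed the HTML rendered by the headless browser instead of the raw response body.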