AleksaMCode/university-notices-email-notifier

University notices email notifier

A scraper for notices published on the website of the Faculty of Electrical Engineering in Banja Luka. The project scrapes notices from the website and, after ETL processing, sends the data as a JSON file to the configured email address through Yahoo SMTP, using the smtplib library.

Introduction

I've always wanted to build a web scraper, and I recently found some free time to complete this project. Because the website is dynamic, scraping is done with the Selenium API in addition to the Beautiful Soup library. The project is written so that it runs on both Windows and Linux.
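The scraping itself depends on the live site, but the step that follows it, turning scraped notices into the JSON payload, can be sketched offline. The snippet below is only an illustration: the field names (`title`, `date`) and the `dd.mm.yyyy` date format are assumptions, not the structure the project actually uses.

```python
import json
from datetime import datetime


def transform(raw_notices):
    """Normalize scraped notices and drop duplicates (hypothetical fields)."""
    seen = set()
    cleaned = []
    for raw in raw_notices:
        # Collapse stray whitespace left over from the HTML.
        title = " ".join(raw["title"].split())
        # Convert an assumed "dd.mm.yyyy" date into ISO 8601.
        date = datetime.strptime(raw["date"].strip(), "%d.%m.%Y").date().isoformat()
        key = (title, date)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"title": title, "date": date})
    return cleaned


def to_json(notices):
    """Serialize notices; ensure_ascii=False keeps letters such as č or š readable."""
    return json.dumps(notices, ensure_ascii=False, indent=2)
```

Keeping the transform separate from the scraping makes it possible to test the cleanup logic without opening a browser.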

Note:

  • Python 3 must be installed on your machine for any of this to work.
  • Be cautious when changing config.ini because it's tightly coupled with the Python code.
  • The code is tested on both Windows 10 and the latest Linux Mint distribution.

Initial Setup

In this section, I will go over how to set up this project on Linux. However, the majority of the steps also apply to Windows. First, open the command line and navigate to the desired directory, then clone this repository using the git clone command.

$ git clone https://github.com/AleksaMCode/university-notices-email-notifier.git

Next, position yourself inside the project directory, create a virtualenv and then install all the needed packages from the requirements.txt file.

$ cd university-notices-email-notifier
$ virtualenv -p python3 venv
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt

Note:

You can find all of these commands in the init.sh file located inside the resources/scripts directory.

Config file setup

Before using this project, you need to adjust a couple of parameters stored in the config.ini file. First, add the email address (user_email field) at which you wish to receive the notifications. If you use Yahoo SMTP, you only need to update the email and password fields with your own credentials. Detailed instructions on how to set up Yahoo SMTP with your account are given below. If you want to use another email provider, you will also need to update the provider-specific fields, such as the port and the SMTP server. All of this information is stored in the SMTP section of the config file.

[SMTP]
smtp = smtp.mail.yahoo.com
email =
port = 587
password =
user_email =
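As a rough illustration of how these fields might be consumed, the sketch below reads the [SMTP] section with the standard configparser module and rejects empty values. The function name `load_smtp_settings` is hypothetical and the project's own loading code may differ.

```python
import configparser


def load_smtp_settings(path="config.ini"):
    """Read the [SMTP] section and return it as a plain dict."""
    # interpolation=None lets a password contain '%' without extra escaping.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):  # read() returns the list of files it parsed
        raise FileNotFoundError(path)
    section = parser["SMTP"]
    settings = {
        "smtp": section["smtp"],
        "port": section.getint("port"),
        "email": section["email"],
        "password": section["password"],
        "user_email": section["user_email"],
    }
    # Fields left blank in the template come back as empty strings.
    missing = [key for key, value in settings.items() if value in ("", None)]
    if missing:
        raise ValueError("Missing SMTP settings: " + ", ".join(missing))
    return settings
```

Failing early on blank fields gives a clearer error than a rejected login later on.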

Yahoo SMTP

Below you have a table of all the essential details you need:

| SMTP server | Port | Requires SSL | Requires TLS | Authentication | Username | Password |
|---|---|---|---|---|---|---|
| smtp.mail.yahoo.com | 587 | | | | Your Yahoo email address | Your Yahoo Mail App Password, which isn't the same as your account password |

Restrictions:

  • You can send a maximum of 500 emails per day.
  • Some sources claim you can send a maximum of 100 emails per hour.

In order to use the Yahoo SMTP server, you need to create a dedicated App Password. First, go to your account settings, click on Account Security, and then click on the Generate app password link under the Other ways to sign in section. In the popup, enter your app name, which can be anything, and click the Generate password button. You should then see the 16-character app password. Save it for later use, as Yahoo will not show it to you again.
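With the app password in hand, sending a JSON attachment through smtplib could look roughly like the sketch below. It assumes STARTTLS, the usual mode for port 587. The subject line, the attachment file name and the function names are illustrative choices rather than the project's actual ones.

```python
import json
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, notices):
    """Build an email carrying the notices as a JSON attachment."""
    msg = EmailMessage()
    msg["Subject"] = "New university notices"  # illustrative subject
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"{len(notices)} new notice(s) attached.")
    payload = json.dumps(notices, ensure_ascii=False, indent=2).encode("utf-8")
    msg.add_attachment(payload, maintype="application", subtype="json",
                       filename="notices.json")
    return msg


def send(msg, host, port, username, app_password):
    """Upgrade the connection with STARTTLS, log in, and send the message."""
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls()
        server.login(username, app_password)
        server.send_message(msg)
```

A call such as `send(msg, "smtp.mail.yahoo.com", 587, email, password)` would use the values from the [SMTP] section. The password argument must be the app password, not the account password.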

Scheduling scraping

Windows - Task Scheduler

First, you need to create a bat file that runs the notifier.py script with python.exe. Open the directory in which you wish to create the bat file, open PowerShell there, and type the following commands:

New-Item scraper.bat
Set-Content scraper.bat "@echo off`r`n""C:\Users\Username\AppData\Local\Programs\Python\Python310\python.exe"" ""C:\Users\Username\university-notices-email-notifier\notifier.py"""

Note:
You will need to adjust the paths above:

  • Set the first path to the location of your python.exe.
  • Set the second path to the location of the notifier.py script.
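If you would rather not edit the paths by hand, the same bat file can be generated from Python. This is a minimal sketch, assuming it is run from the project directory with the interpreter that should later run the scraper. The helper names `bat_contents` and `write_bat` are hypothetical and not part of the repository.

```python
import sys
from pathlib import Path


def bat_contents(python_exe, script_path):
    """Return the two lines of the batch file, with both paths quoted."""
    return f'@echo off\r\n"{python_exe}" "{script_path}"\r\n'


def write_bat(target="scraper.bat", script="notifier.py"):
    """Write the batch file for the interpreter running this code."""
    script_path = Path(script).resolve()
    # newline="" stops Python from translating the explicit \r\n again.
    with open(target, "w", newline="") as handle:
        handle.write(bat_contents(sys.executable, script_path))
    return Path(target)
```

When this runs inside the activated virtualenv, `sys.executable` points at the virtualenv's interpreter, so the scheduled task sees the installed packages.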

In order to schedule the scraper using Windows Task Scheduler, you will need to:

  • Open the Windows Control Panel, then click on Administrative Tools and double-click on Task Scheduler.
  • Choose the option `Create Task...`.
  • Type a name for this task (description is optional) in the General tab and then click on the Triggers tab.
  • Click New... and, in the New Trigger window that opens, choose to start the task 'One time' starting from 12:00:00 am.
  • In Advanced settings tick 'Repeat task every' and enter your desired frequency.
  • From the 'for a duration of' drop-down menu choose 'Indefinitely' and press OK.
  • Click on the Actions tab and then on the New... button. There you will need to browse and find scraper.bat, which is located inside the resources/scripts directory.
  • Press OK twice.
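The same kind of task can also be registered from a script with the built-in Windows `schtasks` tool instead of the GUI. The sketch below only approximates the steps above: it creates a task that repeats every N minutes rather than a one-time trigger with repetition. The task name is a made-up example.

```python
import subprocess


def schtasks_command(bat_path, minutes=30, task_name="UniversityNoticesNotifier"):
    """Build the schtasks argument list for a task repeating every N minutes."""
    if not 1 <= minutes <= 1439:  # range schtasks accepts for /SC MINUTE
        raise ValueError("minutes must be between 1 and 1439")
    return [
        "schtasks", "/Create",
        "/TN", task_name,          # hypothetical task name
        "/TR", f'"{bat_path}"',    # inner quotes protect paths with spaces
        "/SC", "MINUTE",
        "/MO", str(minutes),
    ]


def register(bat_path, minutes=30):
    """Create the task; only meaningful on Windows."""
    subprocess.run(schtasks_command(bat_path, minutes), check=True)
```

Building the argument list separately from running it makes it easy to print and inspect the command before anything is registered.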

Linux - Cron job

First, open crontab with the command crontab -e. Once you are in the cron editor, add the cron job entry. For example, to run this scraper every 30 minutes, enter:

0,30 * * * * /usr/bin/python /home/script/university-notices-email-notifier/notifier.py

Save your changes and exit the editor. For more details on how to specify frequency, visit this link.

Note:
Don't forget to exit Vim using :wq. :)
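To make the minute field less mysterious, here is a small sketch that builds a crontab entry for a given interval. It only handles intervals that divide 60 evenly, and the default paths are the ones from the example above. The helper `cron_line` is hypothetical.

```python
def cron_line(interval_minutes, python="/usr/bin/python",
              script="/home/script/university-notices-email-notifier/notifier.py"):
    """Return a crontab entry that runs the script every interval_minutes."""
    if interval_minutes < 1 or 60 % interval_minutes != 0:
        raise ValueError("interval must be a divisor of 60")
    if interval_minutes == 1:
        minutes = "*"  # every minute
    else:
        # List the exact minutes of each hour at which the job fires.
        minutes = ",".join(str(m) for m in range(0, 60, interval_minutes))
    return f"{minutes} * * * * {python} {script}"
```

For example, `cron_line(30)` reproduces the entry shown above, and `cron_line(15)` yields a minute field of `0,15,30,45`.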

To-Do List

  • Replace the JSON file attachment with an HTML-formatted email response.
  • Implement a year-specific command for notifications.
  • Implement a year-range command for notifications.
  • Move sensitive information, like password, from config file to environment variables.
  • Implement toast notifications.
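As one possible direction for the environment-variable item, the sketch below prefers a variable over the value from the config file. The variable name `NOTIFIER_SMTP_PASSWORD` and the function are assumptions made for illustration. Nothing like this exists in the project yet.

```python
import os


def resolve_password(config_password="", env_var="NOTIFIER_SMTP_PASSWORD"):
    """Prefer the environment variable; fall back to the config value."""
    password = os.environ.get(env_var) or config_password
    if not password:
        raise RuntimeError(f"Set {env_var} or the password field in config.ini")
    return password
```

With this fallback, existing config files keep working while the password can be removed from them at any time.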