david-jankoski/packt-scraper

Packt Book of the Day Scraper

A small project that pushes a desktop notification about the free book-of-the-day deal on the Packt website.

Packt Website

Packt is a website that offers learning materials on a wide range of IT-related topics. Each day a different book is offered for free as the book-of-the-day deal! To get the book, just head over to their website and create an account, which lets you download the books in a couple of different formats.

Motivation

Usually (at least for me) these things end up in some dark corner of the bookmarks. I might check the website for a couple of days in a row, but then I slowly forget about it and only remember every now and then that there are free learning materials to pick up.
So I decided to piece together a small scraper that gets the book deal of the day and pushes a desktop notification when I log in. That way I stay informed and never miss the chance to grab some nice material on a topic of interest.

Code

The code is really just two R scripts sitting in the R/ dir.

  • packt_book_deal_scraper is the main script, which goes out and grabs the title and the associated cover image of the book deal of the day. It uses the handy notifier package by Gábor Csárdi to push a desktop notification to the user. I was also interested in analysing which languages, technologies and frameworks are offered over time, so it stores all the info in a CSV file as well.

  • check_and_run_scraper is the script that gets executed on each user login. It checks whether today's information is already in the data/ dir and launches the scraper only if it is not.

In the data/ dir some sample data and images can be found.
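
The check-then-run logic described above can be sketched roughly as follows. This is an illustration in Python, not the project's R code: the CSV layout (`date` and `title` columns), the function names, and the `scrape` callback are all assumptions made for the example.

```python
import csv
from datetime import date
from pathlib import Path

# Hypothetical column layout; the project's real CSV may look different.
FIELDNAMES = ["date", "title"]


def already_recorded(csv_path: Path, day: date) -> bool:
    """Return True if the CSV already holds a row for the given day."""
    if not csv_path.exists():
        return False
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return any(row["date"] == day.isoformat() for row in csv.DictReader(handle))


def record_deal(csv_path: Path, day: date, title: str) -> None:
    """Append one deal to the CSV, writing the header on first use."""
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not csv_path.exists()
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        if is_new:
            writer.writeheader()
        writer.writerow({"date": day.isoformat(), "title": title})


def check_and_run(csv_path: Path, day: date, scrape) -> bool:
    """Call `scrape` only when `day` is not stored yet; return True if it ran.

    `scrape` is a stand-in for the real scraper: any zero-argument
    callable that returns the title of the day's book.
    """
    if already_recorded(csv_path, day):
        return False
    record_deal(csv_path, day, scrape())
    return True
```

Calling `check_and_run(Path("data/deals.csv"), date.today(), my_scraper)` twice on the same day would run the scraper only the first time, which is what makes it safe to trigger on every login.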

Scheduling the scraper

I included a short explanation of easy ways to turn this into a scheduled task on Linux and Windows machines in schedule_scraper.

To Do

  • For the analysis part, I need to find a way to automatically detect the keyword in a book title that stands for a certain language or technology. I'm sure that somewhere out there is a nice list of all possible languages and technologies under the sun that titles could be matched against.

  • A possible adaptation of this scraper would be to store a few keywords that one is interested in, e.g. "machine learning" or "data structures c", and fuzzy-match each title against them in order to signal an offer tailored to one's interests.
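
Both ideas could be prototyped along these lines. This is only a sketch in Python (not the project's R code) using the standard library's `difflib`; the function names, the tokenising rule, and the 0.85 threshold are assumptions chosen for illustration and would need tuning on real titles.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lower-case the text and split it into word-like tokens (keeps + and #)."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def detect_keywords(title, known_keywords):
    """Return the known keywords that appear word-for-word in the title."""
    words = tokenize(title)
    found = []
    for keyword in known_keywords:
        parts = tokenize(keyword)
        n = len(parts)
        # Look for the keyword's tokens as a contiguous run in the title.
        if n and any(words[i:i + n] == parts for i in range(len(words) - n + 1)):
            found.append(keyword)
    return found


def fuzzy_score(title, interest):
    """Best similarity (0..1) between the interest and any same-length word window."""
    words = tokenize(title)
    parts = tokenize(interest)
    n = len(parts)
    if n == 0 or len(words) < n:
        return 0.0
    target = " ".join(parts)
    return max(
        SequenceMatcher(None, " ".join(words[i:i + n]), target).ratio()
        for i in range(len(words) - n + 1)
    )


def matching_interests(title, interests, threshold=0.85):
    """Return the interests whose fuzzy score reaches the (assumed) threshold."""
    return [item for item in interests if fuzzy_score(title, item) >= threshold]
```

Comparing whole tokens in `detect_keywords` keeps "java" from matching "JavaScript", while the sliding window in `fuzzy_score` tolerates small spelling differences in a title without letting unrelated words dilute the score.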

In case anyone ever reads this far - thanks for taking the time and please let me know what doesn't work as expected and we'll fix things together.
