Introduction
============

Have you ever come across a website you were really interested in and wanted to check it for updates? Or perhaps you want to know when the documentation of your favorite library or code repository changes? If so, Monitor-Web is your one-stop solution. Monitor-Web tracks changes in your favorite content and alerts you with a proper log of the differences. There is no need to waste time surfing the web to check for updates: simply add the website you wish to monitor and relax. Whenever you want to check, run the program and it will automatically sync and provide you with diff-like output for any changes. It works best for static websites, mainly online HTML ebooks, online documentation, course lists, wikis or similar.

Dependencies
============

Monitor-Web is written in Python and follows a procedural structure. Most of the libraries it uses are part of the standard library; only one is third-party.

The only third-party library it depends on is BeautifulSoup. Development site:-

    http://www.crummy.com/software/BeautifulSoup/

Usage
=====

To start Monitor-Web :-

1. Install BeautifulSoup.
2. Download or clone the repository.
3. Run sync.py.

Politeness
==========

Crawling a webpage repeatedly consumes the target site's bandwidth. It is recommended to wait at least 15 minutes between syncs.
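
The interval rule above could be enforced with a small check before each sync. The following is only a sketch in Python 3 and is not taken from sync.py: the function name ``may_sync``, the constant ``MIN_INTERVAL_SECONDS`` and the idea of storing the last sync time as a Unix timestamp are all assumptions made for illustration.

```python
import time

# Hypothetical constant: the 15-minute minimum recommended above, in seconds.
MIN_INTERVAL_SECONDS = 15 * 60


def may_sync(last_sync, now=None, min_interval=MIN_INTERVAL_SECONDS):
    """Return True if enough time has passed since the last sync.

    last_sync is a Unix timestamp, or None if the site was never synced.
    now defaults to the current time and can be passed in for testing.
    """
    if last_sync is None:
        # Never synced before, so there is nothing to be polite about yet.
        return True
    if now is None:
        now = time.time()
    return (now - last_sync) >= min_interval
```

A caller would skip any site for which ``may_sync`` returns False and try again on a later run.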

Output
======

Monitor-Web writes the diffs to stderr, which can of course be redirected to a file of your choice. On UNIX this can be done as follows :-

    ./sync.py 2> output.diff
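
To illustrate what diff-like output on stderr can look like, here is a minimal Python 3 sketch built on the standard ``difflib`` module. It is an assumption about the general technique, not the code in sync.py; the helper names ``diff_lines`` and ``report_changes`` and the ``(stored)`` / ``(fetched)`` labels are hypothetical.

```python
import difflib
import sys


def diff_lines(old_text, new_text, name="page"):
    """Return unified-diff lines between two versions of a page's text."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=name + " (stored)",
        tofile=name + " (fetched)",
        lineterm="",
    ))


def report_changes(old_text, new_text, name="page", stream=None):
    """Write the diff to stream (stderr by default).

    Returns True if the two versions differ, False otherwise.
    """
    stream = sys.stderr if stream is None else stream
    lines = diff_lines(old_text, new_text, name)
    for line in lines:
        stream.write(line + "\n")
    return bool(lines)
```

Because the diff goes to stderr, a redirection such as ``2> output.diff`` captures it without mixing it with anything printed on stdout.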

Author
======

Aneesh Dogra (lionaneesh-at-gmail-dot-com)