Scrapes news content from relevant news sites, parses it, and tags it to MPs
adodb5
.gitignore
README.md
config.inc.php
lib.inc.php
scrapper.php
setup.sql
simple_html_dom.php
simplepie.inc.php
tags
tags.php


What does it do?

Pokes the following news sites, parses the content, and tags it for use in relevant pages:

  • The Malaysian Insider
  • The Star Online
  • The Malay Mail
  • Utusan Malaysia (you really should look at their HTML)
  • Merdeka Review (Malay language version)
  • Free Malaysia Kini

Dependencies

  • PHP 5.3.x (with command line support)
  • MySQL 5.x + relevant PHP bindings
  • A good sense of humour
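
A quick way to check the first two dependencies before running anything might look like the sketch below. The helper names (have_cmd, php_has_module, check_deps) are made up for illustration. The README does not say which MySQL extension the scraper needs, so the list of module names checked here is an assumption.

```shell
#!/bin/sh
# Sketch only: have_cmd, php_has_module and check_deps are hypothetical helpers,
# not part of this repository.

# Succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the PHP CLI reports the named module as loaded.
# `php -m` prints one loaded module per line.
php_has_module() {
    php -m 2>/dev/null | grep -qi "^$1\$"
}

# Reports the PHP CLI version and whether any common MySQL binding is loaded.
# Assumption: one of mysql, mysqli or pdo_mysql is what "relevant PHP bindings" means.
check_deps() {
    have_cmd php || { echo "php CLI not found"; return 1; }
    php -v | head -n 1
    if php_has_module mysql || php_has_module mysqli || php_has_module pdo_mysql; then
        echo "MySQL bindings: found"
    else
        echo "MySQL bindings: not found"
        return 1
    fi
}
```

Source the file and call check_deps. A non-zero return status means something in the list above appears to be missing.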

How do I use it?

$ php -q scrapper.php

Note that it's not been daemonized yet.

Errors are logged to error.log. All other output goes to stdout.
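
Since the scraper is not daemonized, one way to keep it running is a small wrapper loop. The following is a sketch under assumptions, not something shipped with this repository: SCRAPER_CMD and run_scraper_loop are hypothetical names, and the interval is whatever suits you.

```shell
#!/bin/sh
# Sketch only: SCRAPER_CMD and run_scraper_loop are hypothetical names.

# Command to run on each pass; defaults to the invocation shown above.
SCRAPER_CMD="${SCRAPER_CMD:-php -q scrapper.php}"

# run_scraper_loop INTERVAL MAX_RUNS
#   INTERVAL  seconds to sleep between runs
#   MAX_RUNS  number of runs; 0 means keep going until interrupted
run_scraper_loop() {
    interval="$1"
    max_runs="$2"
    runs=0
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        # Left unquoted on purpose so the command string splits into words.
        # A failed run is reported on stderr and the loop carries on.
        $SCRAPER_CMD || echo "scraper exited with status $?" >&2
        runs=$((runs + 1))
        # Sleep only if another run will follow.
        if [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `run_scraper_loop 900 0 >> scrape.out` would run the scraper every 15 minutes and append its stdout to scrape.out (a made-up file name). The scraper's own errors should still end up in error.log.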