Scrapes news content from relevant news sites, parses it, and tags it to MPs
adodb5
.gitignore
README.md
config.inc.php
lib.inc.php
scrapper.php
setup.sql
simple_html_dom.php
simplepie.inc.php
tags
tags.php

README.md

What does it do?

Pokes the following news sites, parses the content, and tags it for use in relevant pages:

  • The Malaysian Insider
  • The Star Online
  • The Malay Mail
  • Utusan Malaysia (you really should look at their HTML)
  • Merdeka Review (Malay language version)
  • Free Malaysia Kini

Dependencies

  • PHP 5.3.x (with command line support)
  • MySQL 5.x + relevant PHP bindings
  • A good sense of humour

How do I use it?

$ php -q scrapper.php

Note that it's not been daemonized yet.

Errors are logged in error.log. Other output is to stdout.
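
Since the scraper is not daemonized, one way to keep it running is a small shell wrapper that re-runs it at a fixed interval. The sketch below is only an illustration and is not part of this repository: the `run_every` function, the 900-second interval, and the `scrape.log` file name are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not part of this project): run a command repeatedly.
#   $1      seconds to sleep after each run
#   $2      number of runs; 0 means loop until interrupted
#   $3...   the command to run, with its arguments
run_every() {
    interval=$1
    max_runs=$2
    shift 2
    runs=0
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        # Keep looping even if one run fails; report the exit status on stderr.
        "$@" || echo "command exited with status $?" >&2
        runs=$((runs + 1))
        sleep "$interval"
    done
}

# Example (assumed interval and log file name):
#   run_every 900 0 php -q scrapper.php >> scrape.log
```

Redirecting stdout as in the commented example keeps the regular output in one place, while errors still go to error.log as described above. A cron entry would do the same job without a long-running shell.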
