A web crawler in PHP
PHP
tree: 7f97f7258c

.gitignore
CONFIG_db_example.php
LIB_db_functions.php
LIB_encoding.php
LIB_exclusion_list.php
LIB_http.php
LIB_parse.php
LIB_resolve_addresses.php
LIB_simple_spider.php
LICENCE.txt
README.md
example_db.sql
listNodes_graphml_04_byDomain02_centralGov.php
spider.php

README.md

phpWebCrawler

A web crawler in PHP.

Note that an additional file, CONFIG_db.php, is required. It sets the database server, name and password, as well as various other global options. An example file, CONFIG_db_example.php, is included; copy it to CONFIG_db.php and edit the values for your environment.
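As a rough illustration only, a CONFIG_db.php might look something like the sketch below. Every constant name and value here is a hypothetical placeholder and not taken from the repository; the names the crawler actually expects are the ones defined in CONFIG_db_example.php, so start from that file.

```php
<?php
// Hypothetical sketch of CONFIG_db.php.
// All constant names and values are illustrative assumptions;
// consult CONFIG_db_example.php for the names the crawler actually reads.

define('DB_SERVER', 'localhost');    // database server host (assumed name)
define('DB_NAME', 'crawler');        // database name (assumed name)
define('DB_USER', 'crawler_user');   // database user (assumed; not mentioned in this README)
define('DB_PASSWORD', 'change-me');  // database password (assumed name)
```

Because this file holds credentials, it is worth keeping it out of version control.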

TODO:

  • Interface with the Public Suffix List, so that registrable domains are parsed correctly for the domains table
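
The TODO item above could be approached along the following lines. This is a minimal sketch, not code from the repository: the function name `registrableDomain` and the tiny hard-coded suffix set are assumptions for illustration, and the sketch handles only exact-match rules (the real Public Suffix List also has wildcard and exception rules, which are ignored here).

```php
<?php
// Hypothetical helper, not part of this repository.
// Returns the registrable domain (public suffix plus one label),
// or null when the host is itself a public suffix.
// $suffixes is a lookup array whose keys are public suffixes.
function registrableDomain(string $host, array $suffixes): ?string
{
    $labels = explode('.', strtolower(trim($host, '.')));
    $count = count($labels);

    // Default rule: if nothing matches, treat the last label as the suffix.
    $suffixLength = 1;

    // Try candidates from longest to shortest so the longest rule wins.
    for ($i = 0; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (isset($suffixes[$candidate])) {
            $suffixLength = $count - $i;
            break;
        }
    }

    if ($count <= $suffixLength) {
        return null; // nothing left in front of the public suffix
    }

    // Keep the suffix plus the one label immediately before it.
    return implode('.', array_slice($labels, $count - $suffixLength - 1));
}

// Illustrative subset of suffix rules; a real list would be loaded from the
// published Public Suffix List file.
$suffixes = array_flip(['com', 'uk', 'co.uk', 'gov.uk']);

echo registrableDomain('www.example.co.uk', $suffixes), "\n";
```

With a naive "last two labels" rule, `www.example.co.uk` would be stored as `co.uk`; matching against the suffix list first yields `example.co.uk` instead, which is the value the domains table needs.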