WWW-Crawl

The WWW::Crawl module provides a simple web crawling utility for extracting links and other resources from web pages within a single domain. It can be used to recursively explore a website and retrieve URLs, including those found in HTML href attributes, form actions, external JavaScript files, and JavaScript window.open links.

WWW::Crawl will not stray outside the supplied domain.

INSTALLATION & TESTING

To run author tests, set the environment variable RELEASE_TESTING.

Installation tests are only run if Test::Mock::HTTP::Tiny is installed. If you wish to run the full set of tests, ensure Test::Mock::HTTP::Tiny is installed before installing WWW::Crawl.

To install this module, run the following commands:

	perl Makefile.PL
	make
	make test
	make install
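
The author tests mentioned above can be enabled for a single run by setting RELEASE_TESTING only for the test step. The following is a minimal sketch rather than part of the distribution: the function name run_author_tests is hypothetical, and the value 1 is an assumption, since this README only says that the variable must be set.

```shell
# Hypothetical helper: build the module, then run the test suite with
# author tests enabled. Run it from the unpacked distribution directory.
run_author_tests() {
    # The VAR=value prefix passes RELEASE_TESTING to this one "make test"
    # command only. The value 1 is an assumption.
    perl Makefile.PL && make && RELEASE_TESTING=1 make test
}
```

Calling run_author_tests and then make install would complete the installation. Writing the assignment as a command prefix, instead of using export, keeps the variable from affecting later commands.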

SUPPORT AND DOCUMENTATION

After installing, you can find documentation for this module with the
perldoc command.

    perldoc WWW::Crawl

You can also look for information at:

    RT, CPAN's request tracker (report bugs here)
        https://rt.cpan.org/NoAuth/Bugs.html?Dist=WWW-Crawl

    MetaCPAN
        https://metacpan.org/release/WWW-Crawl


LICENSE AND COPYRIGHT

This software is Copyright (c) 2023 by Ian Boddison.

This program is released under the following license:

  Perl
