NAME
    scrape.pl - simple HTML scraping from the command line

ABSTRACT
    This is a simple program to extract data from HTML by specifying CSS3 or
    XPath selectors.

SYNOPSIS
        scrape.pl URL selector selector ...

        # Print page title
        scrape.pl http://perl.org title
        # The Perl Programming Language - www.perl.org

        # Print links with titles, make links absolute
        scrape.pl http://perl.org a //a/@href --uri=2

        # Print all links to JPG images, make links absolute
        scrape.pl http://perl.org 'a[href$="jpg"]'

DESCRIPTION
    This program fetches an HTML page and extracts the nodes matched by the
    given XPath or CSS selectors. Each selector produces one output column.

    If URL is `-', input will be read from STDIN.

OPTIONS
    --sep
        Separator character to use between columns. Default is tab.

    --uri COLUMNS
        Numbers of the columns to convert into absolute URIs, for cases where
        the automatic handling of known attributes does not cover everything
        you want.

    --no-uri
        Switches off the automatic translation to absolute URIs for known
        attributes like `href' and `src'.

REPOSITORY
    The public repository of this module is
    http://github.com/Corion/App-scrape.

SUPPORT
    The public support forum of this program is http://perlmonks.org/.

AUTHOR
    Max Maischein `firstname.lastname@example.org'

COPYRIGHT (c)
    Copyright 2011-2011 by Max Maischein `email@example.com'.

LICENSE
    This module is released under the same terms as Perl itself.
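
To illustrate what the synopsis example `scrape.pl http://perl.org a //a/@href --uri=2` is described as doing (one column with the link text, one with the `href`, the second column made absolute), here is a minimal sketch in Python using only the standard library. It is not how scrape.pl is implemented, which is a Perl program; the names `LinkCollector` and `scrape_links` are hypothetical, and the sketch handles only `a` elements rather than arbitrary CSS or XPath selectors.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished (text, href) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.rows.append(("".join(self._text).strip(), self._href))
            self._href = None


def scrape_links(html, base_url, sep="\t"):
    """Return one line per link: text, separator, absolute URL.

    The sep argument plays the role of --sep; resolving the href
    against base_url is an assumption about what --uri amounts to.
    """
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [sep.join((text, urljoin(base_url, href)))
            for text, href in parser.rows]


if __name__ == "__main__":
    sample = ('<p><a href="/docs.html">Docs</a> '
              '<a href="http://example.com/x.jpg">Pic</a></p>')
    for line in scrape_links(sample, "http://perl.org/"):
        print(line)
```

Relative links are resolved against the page URL, while links that are already absolute pass through unchanged, which is why a tool of this kind needs to know the URL the page was fetched from.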