The HTML-Parser distribution is a collection of modules that parse and extract information from HTML documents.



HTML-Parser-XS
--------------

This is an alpha release of a new XS-based implementation of
HTML::Parser.  It should be completely compatible with the old
HTML-Parser-2.xx, but it improves parsing speed and adds several
new options that control how parsing takes place.




HTML-Parser
-----------

This is a collection of modules that parse HTML text documents. These
modules used to be part of the libwww-perl distribution, but are now
unbundled in order to facilitate a separate development track.  Bug
reports and discussions about these modules can still be sent to the
<libwww@perl.org> mailing list.  Remember to also take a look at the
HTML-Tree module collection, which creates and extracts information
from HTML syntax trees.

The modules present in this collection are:

  HTML::Parser - The parser base class.  It receives arbitrarily sized
        chunks of the HTML text and tokenizes them by calling
        appropriate methods on itself.

  HTML::Entities - Provides functions to encode and decode text with
        embedded HTML <entities>.

  HTML::Filter - An HTML::Parser subclass that filters HTML text. You
        will need to make a subclass if you want it to do more than
        cat(1).

  HTML::HeadParser - A lightweight HTML::Parser subclass that extracts
        information from the <HEAD> section of an HTML document.

  HTML::LinkExtor - An HTML::Parser subclass that extracts links from
        an HTML document.

  HTML::TokeParser - An alternative interface to the basic parser
        that does not require subclassing.  Most simple parsing needs
        are probably best attacked with this module.
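
As a rough illustration of the non-subclassing interface described
above, the following sketch uses HTML::TokeParser to pull link targets
and link text out of a document, and HTML::Entities to encode and decode
entity references.  The sample markup, the URLs and the variable names
are invented for this example; treat it as a minimal sketch rather than
a recommended way to structure a real program.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities encode_entities);
use HTML::TokeParser;

# Sample document; markup and URLs are made up for illustration.
my $html = <<'EOT';
<html><head><title>Demo &amp; test</title></head>
<body>
<a href="http://www.example.com/">Example   site</a>
<a name="anchor-only">No href here</a>
<a href="/about.html">About &lt;us&gt;</a>
</body></html>
EOT

# HTML::TokeParser accepts a reference to a string holding the document.
my $p = HTML::TokeParser->new(\$html)
    or die "Cannot create parser: $!";

my @links;
while (my $tag = $p->get_tag("a")) {
    # For a start tag, element 1 of the returned array is the attribute hash.
    my $href = $tag->[1]{href};
    next unless defined $href;

    # Text up to the closing </a>, with whitespace collapsed.
    my $text = $p->get_trimmed_text("/a");
    push @links, [$href, $text];
}

printf "%s -> %s\n", @$_ for @links;

# Round trip through the entity helpers.
my $encoded = encode_entities("a < b & c");
my $decoded = decode_entities("caf&eacute; &lt;ok&gt;");
print "$encoded\n";
```

The same job could be done by subclassing HTML::Parser or by using
HTML::LinkExtor; the token-based loop is shown here only because it
needs no subclass.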



PREREQUISITES

In order to install and use this package you will need Perl version
5.004 or better.  If you intend to use HTML::HeadParser, you also need
to have the libwww-perl distribution installed.



INSTALLATION

Just follow the usual procedure:

   perl Makefile.PL
   make
   make test
   make install



COPYRIGHT

  © 1995-1999 Gisle Aas. All rights reserved.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.