CPAN in a nutshell

Data and Data Types

List/Array modules

Handy utility functions can be found in the core module (since 5.7.3) List::Util. Functions missing in this module are gathered in the CPAN module List::MoreUtils. This module also obsoletes the CPAN modules List::MoreUtil and List::Utils.
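As a small illustration of the kind of helpers List::Util provides, here is a minimal sketch using only functions from the core module. The sample list and variable names are made up for the example.

```perl
use strict;
use warnings;
use List::Util qw(first max min reduce sum);

# Made-up sample data for illustration
my @numbers = (3, 9, 4, 12, 7);

my $total     = sum @numbers;                   # adds all elements
my $largest   = max @numbers;                   # numerically largest element
my $smallest  = min @numbers;                   # numerically smallest element
my $first_big = first { $_ > 5 } @numbers;      # first element matching the block
my $product   = reduce { $a * $b } @numbers;    # folds the list pairwise

print "sum=$total max=$largest min=$smallest first>5=$first_big product=$product\n";
```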

Serialization

Before choosing a serialization module, one needs to choose a serialization format first. Some differences in the formats are:

Completeness of serialization

Some serialization formats aim to serialize all aspects of Perl. All serialization formats can handle the basic types (scalars, lists, hashes), but for other types (regexps, blessed objects, even code references) one needs to choose either a Perl-generating serializer (e.g. Data::Dumper), a YAML-generating serializer, or Storable. In particular, the Bencode and JSON formats cannot handle all Perl types.

Inter-language interaction

If the serialized data should be passed to other programming languages, then a standardized serialization format like YAML, JSON, or Bencode should be used. The output of Data::Dumper (pure Perl code) is especially unsuited, and Storable is probably unsuited as well (deserializers probably do not exist outside the Perl world). While XML is a standard, there's probably no standard for serializing data structures into XML data.

Readability for humans

Most serialization formats are plain text. Some are more readable than others. In the author's opinion, the readability of serialization formats ranks as follows, from most to least readable: YAML, Perl-based serialization, JSON, Bencode, XML, Storable (which is a binary format).

Speed

A benchmark with results can be found here: http://idisk.mac.com/christian.hansen/Public/perl/serialize.pl (if the URL is not available, then try the archive URL http://web.archive.org/web/20060507091304/http://idisk.mac.com/christian.hansen/Public/perl/serialize.pl).

Another comparison which is primarily aimed at caching may be found here: http://cpan.robm.fastmail.fm/cache_perf.html. Note that caching is almost always used together with some serialization means.

Security

Some serializer modules allow the deserialization of code references, which is possibly dangerous. But no module does this by default; typically a switch has to be turned on.

One serialization format in particular has to use Perl's eval or Safe::reval function for deserialization: Data::Dumper.

Storable may be configured to eval code references, which is possibly a dangerous operation. This option is off by default.

YAML

YAML as a format is neat: it's human readable and portable across languages, but unfortunately too complex.

The first YAML module is the pure perl implementation YAML. It is somewhat slower than the competitors written in C/XS.

YAML::Syck is a C based parser for the YAML language. It's API compatible with YAML (has the same Load/Dump methods), but it's not as configurable as YAML. Performance seems to be quite good, and it also seems to be more standard-compliant than YAML. For example, there's no way to get YAML to dump/load utf-8 files, but in YAML::Syck it's at least configurable.

YAML::XS (part of the YAML-LibYAML distribution) is also a C based parser using libyaml instead of libsyck. This parser also supports YAML 1.1. The 0.27 version has issues with latin1, but this is possibly solved in the forthcoming 0.28 release.

YAML::Tiny is a small parser/generator which only implements a subset of YAML.

CPAN::Meta::YAML handles a subset of YAML, just enough to parse CPAN Meta files. It has the advantage of being in the perl core since 5.13.9.

JSON/Javascript

JSON is a human readable format which is based on javascript syntax. It's not as complex as YAML (in particular, it is unable to serialize many perl types like blessed objects, regexps, code references, self-referential data...), but has the other advantages of YAML. A pure perl implementation is JSON.

JSON::XS is another C based parser/generator for JSON. It seems to be the only one which handles Unicode correctly.

JSON::Syck is a C based parser for the JSON language.

JSON::Any is a wrapper which uses whichever JSON parser/generator is installed. Using this module is a safe way to protect oneself from the API changes in JSON and JSON::XS.

JSON::PC is now declared as non-maintained and buggy.

Data::JavaScript serializes data structures from perl to javascript, can handle circular structures, but output is quite verbose.

Data::JavaScript::Anon is also a serializer for perl data to javascript syntax, cannot handle circular references, but the output is more terse compared to Data::JavaScript.

CBOR

CBOR::XS: CBOR (Concise Binary Object Representation) is a binary serialization which follows the JSON data model, and adds some additional capabilities, like serialization of perl objects. According to the manpage, it is faster than JSON::XS and Storable.

XML

XML::Dumper is a serialisation module for dumping Perl objects from/to XML. As usual with XML, the output is very verbose and processing is probably slower than with other modules. It's possible to include a DTD in the generated XML document. There's no support for code references.

XML::Simple can also be used for serializing/deserializing.

XML::Smart: the weak point of XML::Simple is that one has to carefully specify what goes into a hash or array or scalar, otherwise the structure of the resulting data structure is somewhat undefined. XML::Smart tries to solve this problem by using a universal data type which can be used either as hash or array or scalar. However, I never tried this module in reality.

Bencode

The following modules can handle Bencode data: Convert::Bencode, Convert::Bencode_XS.

Standard-less formats

Storable is in the core since 5.7.3. It's fast (because it is implemented in XS and uses a binary format), but it's not human readable. The file format is forward compatible and the module versions are backward compatible, which means you can read all Storable-generated files with a recent Storable version, but only a Storable file generated by an old version can be read by all Storable versions. Storable has neat features like customizable per-class hooks and the ability to (de)serialize code references. Storable can (de)serialize using strings and filehandles; the latter helps in reducing the memory footprint when (de)serializing large data structures.
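A minimal sketch of a Storable round trip, once through an in-memory string (freeze/thaw) and once through a file (store/retrieve). The data structure and the temporary file are made up for the example.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw store retrieve);
use File::Temp qw(tempfile);

# Made-up sample data for illustration
my $data = { name => 'example', list => [1, 2, 3], nested => { ok => 1 } };

# Round trip through a binary string held in memory
my $frozen = freeze($data);
my $copy   = thaw($frozen);

# Round trip through a (temporary) file
my ($fh, $filename) = tempfile(UNLINK => 1);
close $fh;
store($data, $filename);
my $from_file = retrieve($filename);

print "in-memory: $copy->{list}[2], from file: $from_file->{nested}{ok}\n";
```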

Data::Dumper is in the core since 5.005. The output is human-readable as it's perl code. Data::Dumper comes with an optional XS part, but also has a pure-perl part. Deserializing is done through eval(), which is potentially dangerous; Safe::reval should be preferred. Deserializing is a little cumbersome. Data::Dumper is also able to serialize code references.
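The following sketch shows one possible way to do a Data::Dumper round trip. It uses a plain string eval for brevity, which is only acceptable for trusted input; the sample data is made up.

```perl
use strict;
use warnings;
use Data::Dumper;

# Made-up sample data for illustration
my $data = { name => 'example', list => [1, 2, 3] };

my $dump = do {
    local $Data::Dumper::Terse    = 1;   # emit a bare expression without "$VAR1 ="
    local $Data::Dumper::Sortkeys = 1;   # stable key order
    local $Data::Dumper::Indent   = 1;   # compact indentation
    Dumper($data);
};

# A string eval executes arbitrary Perl code: only use it on trusted input.
my $copy = eval $dump;
die "deserialization failed: $@" if $@;

print $dump;
print "second list element: $copy->{list}[1]\n";
```

For untrusted input, evaluating the dump in a Safe compartment, or choosing a data-only format, is the more careful route.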

Sereal::Decoder and Sereal::Encoder are fast and space-efficient. More objectives are described in https://github.com/Sereal/Sereal/blob/master/README.pod.

Data::Dump::Streamer may be worth a look (XXX).

Data::Denter is a predecessor of YAML.

Data::Serializer is a meta class for handling a couple of serialization methods together with encryption, compression and checksumming.

FreezeThaw is ancient, pure-perl, and much slower than other modules.

The benchmark by Christian Hansen also mentions the module RPC::XML.

PHP::Serialization handles a serialization format which is popular in the PHP world. But note that the current version 0.31 is not recommended by the maintainer, due to some severe known bugs. Another PHP serializer is PHP::Session::Serializer::PHP.

Data::Pond deals with the "Perl-based open notation for data". It's somewhat similar to Data::Dumper, as the generated serialization data is perl code and may be eval'ed, but it also has a deserialization method to avoid the possibly dangerous eval operation. Also, Data::Pond can only serialize a subset of Perl data, probably to make the format language-independent.

Data::MessagePack is a binary "packer" for perl data structures. See http://cpanratings.perl.org/dist/Data-MessagePack for reviews on this serializer.

Google's protocol buffers format is implemented by Google::ProtocolBuffers.

Validation

Data::FormValidator is a rather old validation module which works with CGI-like data, but may also be used for validation of generic hashes.

Kwalify implements the schema language of the same name, which originally appeared in the Ruby community.

Data::Rx and Data::Schema are new players in the data validation business.

JSON::Schema is a module handling a schema language originally written for JSON data, but may also be used for simple Perl data structures.

Sah is another schema for data structures.

Hash tools

To get ordered hashes, use either Tie::IxHash (which is old and seems to be rock-stable), Tie::InsertOrderHash (has problems with older perls, e.g. 5.005) or Tie::Hash::Indexed (this is an XS implementation of Tie::IxHash, and, according to the documentation, usually twice as fast as Tie::IxHash). And there are even more: Tie::StoredOrderHash (which has a comparison with the others), Tie::LLHash, Tie::Hash::ButMoreFun.

For case insensitive hashes, use either Tie::CPHash or Hash::Case.

Locking hash keys and/or values is done with the core module Hash::Util. See also "Constants".

To create a "flat" representation from a nested hash structure Hash::Flatten may be used. Its Pod documentation mentions more similar modules: CGI::Expand and Tie::MultiDim. Hash::Fold is similar to Hash::Flatten, but is built using Moose.

Iterators

Check for Iterator and Object::Realize::Later (lazy evaluation???). Object::Iterate introduces an iterate method to loop conveniently over a list-like object.

Traverse

Data::Rmap traverses complex data structures and makes it possible to modify the elements in-place. Related modules are: Data::Dmap, Data::Walk (which implements a File::Find-like interface), Data::Transformer, Data::Traverse, Data::Visitor.

Date and time modules

A complete implementation is offered by DateTime, with a plethora of plugins and additions (parsers, formatters, locale and time zone support...). A list of DateTime-related modules may be found at http://datetime.perl.org/?Modules.

A classic one is Date::Calc, a rather complete implementation of date/time functions with functional (using somewhat non-perlish naming) and object-oriented interfaces. It's written in C, but the functionality is also available in the pure perl Date::Calc::PP module (and historically there was another pure perl replacement Date::Pcalc).

Time::Moment is a C based module which is faster than DateTime. It has functionality for constructing a date object from epoch, parameters, and iso8601-like strings; and for arithmetic, comparisons and strftime-like formatting.

The standard perl functions and modules: time, localtime, strftime and mktime in POSIX, Time::Local.
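A minimal sketch of how the core pieces fit together: Time::Local turns broken-down time into an epoch value, and POSIX's strftime formats the result of gmtime. The chosen date is arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Arguments: sec, min, hour, day of month, month (0 = January), year
my $epoch = timegm(0, 0, 12, 15, 2, 2010);    # 15 March 2010, 12:00:00 UTC

# Format the epoch value as an ISO 8601-like string in UTC
my $iso = strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($epoch));

# Simple arithmetic on epoch seconds: one day later
my $next_day = strftime('%Y-%m-%d', gmtime($epoch + 24 * 60 * 60));

print "$iso\n";
print "$next_day\n";
```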

Other efforts are: Time::Piece, Date::Handler, Date::Manip ... A description of the many date and time modules may be found here http://www.perl.com/pub/a/2003/03/13/datetime.html (but the page is unfortunately outdated --- DateTime is not mentioned)

Faking time

fixedtime changes functions time, gmtime and localtime to a specified delta. It is a lexical pragma and therefore only works with perl 5.10 and better.

Time::Fake does the same thing, but is not lexically scoped and therefore works also with older perls.

Time::Warp seems to do similar things (to be evaluated).

Class support

Autogenerating accessors

Class::Struct: in the perl core, has a strange syntax and I remember other issues (no backward compatibility). Not very well suited for OO, it's more for C-struct like things.

Class::Accessor: stable. Has variants for fast access Class::Accessor::Fast and lvalue access Class::Accessor::Lvalue (but note that lvalue access has problems in perl and is still marked as experimental).

Class::Accessor::Fast::XS is compatible with Class::Accessor::Fast, but implemented in XS and somewhat faster.

accessors: simple syntax, has also some variants. A possible downside is that the internal hash members are preceded with a dash. This makes serialized objects (e.g. as YAML) not-so-nice to read and write.

Class::AccessorMaker: ???

For an explanation what Object::Tiny does, see http://use.perl.org/~Alias/journal/34329. This module only implements read-only accessors, no mutators. The author claims that Object::Tiny is smaller and faster than Class::Accessor::Fast.

Moose does everything for OO, it seems, including generating accessors.

Coat is inspired by Moose and implements a subset of Moose's features. Coat is self-contained.

Mouse is another Moose clone missing some features but is (according to the documentation) much faster than Moose. Other even tinier versions of Moose are Moo and Mo.

Inside-out technique

Class::InsideOut::Manual::About explains how Class::InsideOut works and has a list of other modules implementing the inside-out technique.

Sorting

Sort::Key is (according to the author) the fastest way to sort an array by key.

Sort::Naturally simultaneously sorts lexically and numerically.

Sort::Versions, CPAN::Version, version, Version::Compare may be used to compare and sort version numbers.

Weak references

Nowadays the preferred way is to use weaken in Scalar::Util, a core perl module. Former modules were WeakRef ...
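A minimal sketch of weaken breaking a reference cycle between a parent and a child structure. The hash layout is made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Made-up parent/child structures that point at each other
my $parent = { name => 'parent', children => [] };
my $child  = { name => 'child',  parent   => $parent };
push @{ $parent->{children} }, $child;

# Without weakening, the cycle would keep both hashes alive forever.
weaken($child->{parent});
my $is_weak = isweak($child->{parent});
print "back-reference is weak\n" if $is_weak;

# Dropping the last strong reference frees the parent;
# the weak back-reference then becomes undef.
undef $parent;
print "parent is gone\n" unless defined $child->{parent};
```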

Constants

The core Perl pragma constant is able to create sigil-less constants. Being sigil-less is problematic because such constants are not easily embeddable in strings, and may not be used on the left-hand side of a fat comma or as a hash key in a simple way. The perl interpreter may do good optimizations with constants created using this pragma.

Readonly, Readonly::XS, and Const::Fast are modules which create constants (or read-only variables) which have sigils, meaning that the disadvantages above disappear. The performance though is worse, with Readonly being the slowest and Const::Fast being the fastest, but they probably cannot be as fast as constants created by the constant pragma.
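The sigil-less pitfalls can be sketched as follows, together with the usual workarounds. The constant names and values are made up for the example.

```perl
use strict;
use warnings;
use constant PI    => 3.14159;
use constant COLOR => 'red';

# Constants do not interpolate directly; the @{[ ... ]} idiom is one workaround.
my $message = "pi is roughly @{[ PI ]}";

# Left of a fat comma a bareword is auto-quoted: the key is the string 'COLOR'.
my %by_name = (COLOR => 1);

# Parentheses (or a leading +) force the constant to be evaluated instead.
my %by_value = (COLOR() => 1);
my $lookup   = $by_value{+COLOR};

print "$message\n";
print join(',', keys %by_name), ' vs ', join(',', keys %by_value), "\n";
```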

Data::Lock also uses Perl's internal feature to make a variable immutable, therefore also being quite fast. Attribute::Constant adds some syntactic sugar to this module.

Another module exposing Perl's internal readonly flag is Scalar::Readonly. It works on scalars only, though.

The core module Hash::Util has a set of functions to "lock" the keys and/or values of a hash.
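A minimal sketch of locking with Hash::Util: lock_keys fixes the set of allowed keys, while lock_hash additionally makes the values read-only. The configuration hash is made up for the example.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys unlock_keys lock_hash);

# Made-up configuration hash
my %config = (host => 'localhost', port => 8080);

# With locked keys, existing values may change but new keys are rejected.
lock_keys(%config);
$config{port} = 9090;
my $typo_rejected = !eval { $config{prot} = 1; 1 };
print "unknown key rejected\n" if $typo_rejected;

# With a fully locked hash, values are read-only as well.
unlock_keys(%config);
lock_hash(%config);
my $write_rejected = !eval { $config{port} = 1; 1 };
print "write rejected\n" if $write_rejected;
```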

Cloning

Storable has the function dclone(). Most data types can be cloned, including code references (if Deparse/Eval is turned on, but with some restrictions regarding lexicals); but some data types cannot, like regexps.
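A minimal sketch of dclone producing an independent deep copy; the data structure is made up for the example.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Made-up nested structure
my $original = { list => [1, 2, 3], nested => { flag => 1 } };
my $clone    = dclone($original);

# Changes to the clone do not affect the original.
push @{ $clone->{list} }, 4;
$clone->{nested}{flag} = 0;

printf "original has %d elements, clone has %d\n",
    scalar @{ $original->{list} }, scalar @{ $clone->{list} };
```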

Clone is a module dedicated for cloning.

Scalar::Util::Clone is another clone module. Its documentation claims that it is faster than Storable's dclone.

Data::Clone is another XS-based clone module. It has a different policy than Storable, in that not every data type is deeply cloned. In particular, blessed objects are not cloned unless marked as clonable (i.e. by defining a "clone" method). Reviews at CPAN ratings suggest that it's faster than Storable.

Accessing

Data::Path provides XPath-like access methods for data structures. A similar approach is Data::DPath, which has a comparison table between the two modules.

For just accessing a deep data structure Data::Diver may also be used. Using this module it is not necessary to differentiate between subhashes and subarrays. Autovivification is avoided.

Comparing

A number of modules are available for comparing complex data structures, for example Data::Compare. This one lets you ignore some hash elements in the comparison, if needed. List::Compare and Array::Compare may only do comparisons on lists/arrays. Using Array::Compare is not recommended anymore, as it uses the rather heavy-weight Moose module.

It is also possible to do comparisons by serializing data structures (e.g. using Data::Dumper or Storable) and doing a string compare of the serialized data. Make sure that the output is canonical, e.g. by using Sortkeys(1) with Data::Dumper or by setting $Storable::canonical.
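Comparison through canonical serialization could be sketched like this. The helper names same_by_dumper and same_by_storable are hypothetical, and the sample data is made up.

```perl
use strict;
use warnings;
use Data::Dumper;
use Storable qw(freeze);

# Hypothetical helper: compare two structures via sorted Data::Dumper output
sub same_by_dumper {
    my ($left, $right) = @_;
    local $Data::Dumper::Sortkeys = 1;    # canonical hash key order
    local $Data::Dumper::Indent   = 1;
    return Dumper($left) eq Dumper($right);
}

# Hypothetical helper: compare two structures via canonical Storable images
sub same_by_storable {
    my ($left, $right) = @_;
    local $Storable::canonical = 1;       # canonical hash key order
    return freeze($left) eq freeze($right);
}

my $x = { a => 1, b => [2, 3] };
my $y = { b => [2, 3], a => 1 };
my $z = { a => 1, b => [2, 4] };

print "x/y equal\n"  if same_by_dumper($x, $y)  && same_by_storable($x, $y);
print "x/z differ\n" if !same_by_dumper($x, $z) && !same_by_storable($x, $z);
```

Note that this compares the serialized representation, so values that differ only in how they are stored (for instance the number 1 versus the string '1') may be reported as different.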

See also below for comparing within test scripts (Test::Differences, Test::Deep, "is_deeply" in Test::More).

To compare files, use the core module File::Compare.

Merging

The following modules deal with merging data structures: Hash::Merge, Hash::Merge::Simple, Data::Merger, Data::ModeMerge.

Caching

See CHI, Cache, Cache::Cache, and others.

Database Interfaces

Object persistence

Storable (see above) and all other (de)serialization modules. See also http://poop.sourceforge.net for a comprehensive list of perl object persistence modules.

Database Abstractions

Class::DBI is popular. An offspring of this module is DBIx::Class. There are also Alzabo, Rose::DB. See also http://poop.sourceforge.net/, as database abstraction layers and object persistence systems are mostly the same thing. Unfortunately this page is somewhat outdated (from 2003) and it does not even mention DBIx::Class.

SQL Abstractions

Schema handling

SQL::Translator seems to do everything: schema translation from everything to everything, including diagrams and creating diffs. The object model seems to be clean, separating parsers, producers, filters and schema objects. Unfortunately it also has everything as PREREQ_PM, including GD, GraphViz, Excel-related modules... Another downside is that certain operations are quite slow, e.g. parsing a schema from an SQL file or database. Supported database engines are MySQL, Oracle, PostgreSQL, SQLite and many others.

DBIx::DBSchema may be used for translating an existing database schema (which will be read from a database) and creating SQL statements for another database engine. Notably, there's no support for parsing a schema from an SQL file. Supported database engines are MySQL, PostgreSQL and SQLite, while Sybase and Oracle are partially supported (from the docs).

MySQL::Diff has the limited task of showing differences between two mysql schemas (either from file or from a database) and creating a series of ALTER TABLE statements to bring one schema to the other. Development of this module stopped many years ago, meaning that newer MySQL versions (4.x, 5.x ...) are not supported without manual tweaks.

SQL Builder

SQL::Abstract is inspired by DBIx::Abstract, but does only the SQL generation part. SQL::Abstract::Limit adds a portable LIMIT emulation to that module.

The SQL::Maker documentation claims that it's more extensible than SQL::Abstract. SQL::Maker is inspired by DBIx::Skinny.

With SQL::Interpolate one still writes SQL statements, but perl variables may be magically interpolated into them.

More approaches: SQL::DB::Schema, SQL::Query and many more at CPAN.

File database systems

File database systems or dbm systems are simple database systems with usually just a key-value relation and only one table per file.

A short overview of the standard DBM systems which come with perl can be found in the AnyDBM_File manpage. DB_File has the most "+" in the AnyDBM_File table, but has some problems: the database file format changes across berkeley db versions, and there are reportedly many problems with corrupted databases.

DBM::Deep is a pure-perl implementation which can also handle deep nested structures as values (other DBM implementations can only handle scalars as values).

BerkeleyDB interfaces berkeley db like DB_File, but has a much richer API (e.g. support for transactions).

BDB allows asynchronous access to berkeley db.

CDB_File is a "constant" database. This means that the database is created only once, but reading is very fast. A pure perl variant is CDB_Perl.

MLDBM sits on top of the other DBM implementations and enables storing of deep nested structures. Unlike DBM::Deep, storing of object information is also possible.

MLDBM::Sync::SDBM_File can be used to overcome the size limitations of SDBM_File.

DB_File::Lock adds a locking layer to DB_File.

MLDBM::Sync adds a locking layer to MLDBM files.

Search/fulltext engines

WAIT, a fulltext engine. CPAN has support for this module.

Senna, an interface to the Senna fulltext search engine (from Pod: "a fast, embeddable search engine that allows fulltext search capabilities").

CLucene and Lucene are interfaces to the CLucene C++ search engine. From looking at reports and ratings, it seems that Lucene is better supported.

Plucene, a Perl port of the Lucene search engine.

Search::FreeText: free text indexing module for medium-to-large text corpuses

Text::English: comes with the perlindex script which lets users search over all installed Perl documentation, including installed CPAN modules.

More: Search::InvertedIndex, Sphinx::Search, SWISH, KinoSearch, Search::Indexer ...

Database utilities

For turning a database query result into a HTML table, one can use DBIx::XHTML_Table or HTML::Table::FromDatabase.

Development Support

Test modules

Generic test modules

A comprehensive overview of test modules can be found at http://qa.perl.org/test-modules.html (An overview of the testing modules available on CPAN).

Recommended Test module: Test::More. Provides more functionality, better extensibility and more diagnostics than the old Test module. Downside: Test::More is part of the core only since 5.7.3.
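A minimal Test::More script might look like the following sketch. The function under test, parse_pair, is a hypothetical helper defined only for this example.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical function under test: turns "key=value" into a one-element hash
sub parse_pair {
    my ($string) = @_;
    my ($key, $value) = split /=/, $string, 2;
    return { $key => $value };
}

is(1 + 1, 2, 'simple scalar comparison');
like('CPAN in a nutshell', qr/nutshell/, 'regexp match');
is_deeply(parse_pair('a=1'), { a => '1' }, 'deep structure comparison');
```

On failure, is and is_deeply report the expected and the actual value, which is one of the diagnostics improvements over the old Test module.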

Worthwhile additions to Test::More: Test::Differences to display a diff-like output of the comparison of two strings or data structures.

Test::NoWarnings: make sure that there are no warnings generated in your test suite. This is the opposite of Test::Warn.

Test::Warn: test the creation of warnings.

Test::Distribution: test if a distribution is correct and complete (POD checking, all modules compile, existence of standard files like README). Simple to use.

Test::Deep: compare deep structures with a lot of features. For easier tests, is_deeply from Test::More should suffice. For large data structures, using Data::Compare may be faster than is_deeply.

Test::Without::Module: emulates the non-existence of modules (e.g. for optional features). Modules with similar functionality: Devel::Hide, Module::Mask. Test::Without::Module had some problems with Tk, which are solved now. Devel::Hide works well. Module::Mask not tested yet.

Test::LongString helps in comparing long strings by only showing relevant output. Test::Differences may also be used for this: it shows the differences in complex objects in a diff style.

Test::HTTPStatus: a very simple module for just checking the return value (status code) of a URL.

Test::URI: Check various parts of Uniform Resource Locators.

Test::HTML::Content: Perl extension for testing HTML output (e.g. expected links, content, xpath ...).

Test::HTML::Lint: run lint checks on HTML.

Test::HTML::Tidy: similar to Test::HTML::Lint, but uses the external tidy library.

Test::WWW::Mechanize: bring WWW::Mechanize and Test::More together

Test::WWW::Simple: simple content tests.

CGI::Test: ???

Other test modules

Test::XML: Compare XML in perl tests.

Development, Perl Interna

Tracing

Simple tracing can be done with Devel::Trace. Similar output is generated by the "t" (trace) option of the normal perl debugger. Devel::TraceCalls allows programmable tracing by subs or packages. The Devel::CallTrace module is another one, which includes in its Pod documentation a list of similar modules, so look there for a short description for any of them.

Carp::Always (formerly called Carp::Indeed) turns every warn/die into a stack trace including used parameters. Seems to be more powerful than Acme::JavaTrace and Devel::SimpleTrace, which do not dump parameters.

Profiling

Devel::NYTProf is currently the state of the art profiler in the Perl world. It allows subroutine- and statement-level profiling. Reports are created as a series of HTML pages, but other programs (e.g. kcachegrind) may also be used.

Devel::FastProf and Devel::SmallProf: profilers on line basis. The first one has less impact on the execution time of the script (according to the author, 3 to 5 times slower) and has less profiling output, and the latter is much slower (about 50x), but has more profiling output. At least Devel::FastProf turns out to be quite useful.

Devel::DProf: ancient, has problems with complicated stuff (e.g. Tk scripts, saw segmentation faults and garbled profile files). Profiles only on sub basis.

There are other profilers, which are untested: Devel::Profiler, some apache-specific profilers like Apache::Profiler, Devel::Profile, Devel::DProfLB, DashProfiler ...

Memory tracing

Devel::Leak: counts used scalars, so it's possible to determine whether there's an increase of used SVs.

Devel::Leak::Object (was recommended at cpanratings).

Devel::TrackObjects: track the usage of objects, which can also be used to find memory leaks.

Devel::FindRef is designed to find memory leaks by creating a detailed report of all still-active scalars with all references to them.

Compatibility support

Perl::MinimumVersion determines the minimal required perl version for given perl code. Unfortunately it's far from perfect, many modern perl constructs are not yet recognized. See http://rt.cpan.org/Public/Bug/Display.html?id=28916.

The corelist script, part of the Module::CoreList module distribution, knows when modules were part of the perl core, if ever.

Module prerequisites

Module::PrintUsed prints the list of used modules at the end of a script run.

Perl::PrereqScanner gets a list of used modules by scanning perl files.

Another helper module: Module::Extract::Use.

Versioning control systems

VCS (old, unmaintained), VCI (new, at the moment actively developed) and others. TBD: Look which one is the best nowadays.

The Git module is included in the git core distribution.

Git::PurePerl is a pure perl interface to Git repositories.

Regexp tools

Regexp::Assemble: combine multiple regexps into one. A faster but more restricted module doing the same is Regexp::Trie. Other modules doing similar things are Regexp::Optimizer and Regex::PreSuf. Regexp::Debugger shows nicely how the regexp engine is working, step by step.

Documentation

Pod

XXX podlators, Pod::Simple, various pod2html modules, App::Pod2CpanHtml XXX

Pod::Simple: a robust Pod parser. It's part of the perl distribution since 5.10.0.

Pod::POM: converts Pod documents into an object model, which can in turn be translated into different markups like HTML, text, or again Pod. Unfortunately the Pod parser has a couple of issues (see RT bug tickets).

Tk::Pod: a Tk Pod viewer widget.

File Handle Input/Output

IO

Reading/writing to a scalar

Since 5.8.0, this can be done with three-arg open and a scalar ref instead of a filename. For former perl versions, use either IO::Scalar or IO::String. See http://cpanratings.perl.org/dist/IO-String for a rating which compares IO::String and IO::Scalar (where the former seems to be better than the latter).
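A minimal sketch of in-memory file handles using three-arg open with a scalar reference; the buffer contents are made up for the example.

```perl
use strict;
use warnings;

# Writing: everything printed to $out ends up in $buffer
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open in-memory handle: $!";
print  {$out} "first line\n";
printf {$out} "%s line\n", 'second';
close $out;

# Reading: treat the scalar's contents like a file
open(my $in, '<', \$buffer) or die "cannot open in-memory handle: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines read\n";
print $buffer;
```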

Compression

gzip

PerlIO::gzip and PerlIO::via::gzip both provide transparent gzip compression and decompression while reading from/writing to file handles.

IO::Compress::Base and IO::Uncompress::Base are base classes for a number of modules dealing with compression, including gzip, zip, and bzip2. These modules are in core since perl 5.9.4.
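As a sketch of two modules from this family, IO::Compress::Gzip and IO::Uncompress::Gunzip can compress and decompress a string held in memory. The sample text is made up.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip   $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Made-up, highly repetitive sample text
my $text = "CPAN in a nutshell\n" x 100;
my ($compressed, $restored);

# Scalar references are used as in-memory input and output buffers
gzip(\$text => \$compressed)
    or die "gzip failed: $GzipError";
gunzip(\$compressed => \$restored)
    or die "gunzip failed: $GunzipError";

printf "%d bytes compressed to %d bytes\n", length $text, length $compressed;
print "round trip ok\n" if $restored eq $text;
```

The same functions also accept file names or file handles in place of the scalar references.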

File Name Systems Locking

File type detection

File::Type uses exclusively an internal database. In its documentation it explains its advantages over File::MimeInfo and File::MMagic. Unfortunately the module seems to be rather unmaintained since 2004.

MIME::Types: the module maps file extensions to mime types. It does not look at the file's magic. It is supposed to work just like the MIME type recognition of Apache. A similar module with a simpler interface is Media::Type::Simple.

LWP::MediaTypes is part of libwww-perl. Like MIME::Types, it deduces the mime type from the file name.

File::MimeInfo::Magic needs the mime-info database from the freedesktop project.

File::MMagic uses either a small internal database (with about 100 file formats) or the host's magic file as found in /etc/magic or elsewhere. Unfortunately the location of the magic file is different from system to system, so it would be nice if there was an "auto-detect" option.

An XS version of File::MMagic is available as File::MMagic::XS. Its documentation presents a benchmark where it is more than 20x faster than the pure-perl version.

File::Type::WebImages only determines common image file types ("web image file types") using magic recognition. Its documentation lists advantages over File::Type and File::MMagic.

File::LibMagic is a perl interface to the libmagic library, which means that you get the same results as the file(1) command but without the overhead of calling and parsing an external application.

File system

Copying files

File::Copy is in the core. There's no support for recursive copying or file attribute preservation.

File::NCopy is able to make recursive copies and preserve file attributes.

File::Copy::Recursive is an alternative module for copying files recursively.

Moving files

File::Copy has a move function which also can handle moving across filesystem boundaries (which the perl function rename cannot).
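A minimal sketch of copy and move from File::Copy, run inside a temporary directory so nothing else is touched. The file names are made up for the example.

```perl
use strict;
use warnings;
use File::Copy qw(copy move);
use File::Spec;
use File::Temp qw(tempdir);

# Work in a throwaway directory; file names are made up
my $dir    = tempdir(CLEANUP => 1);
my $source = File::Spec->catfile($dir, 'source.txt');
my $backup = File::Spec->catfile($dir, 'backup.txt');
my $moved  = File::Spec->catfile($dir, 'moved.txt');

open(my $fh, '>', $source) or die "cannot create $source: $!";
print {$fh} "hello\n";
close $fh or die "cannot close $source: $!";

copy($source, $backup) or die "copy failed: $!";   # source stays in place
move($source, $moved)  or die "move failed: $!";   # source disappears

print "backup exists: ", (-e $backup ? 'yes' : 'no'), "\n";
print "source exists: ", (-e $source ? 'yes' : 'no'), "\n";
```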

File::PerlMove provides a shell command pmv to rename files using perl expressions. Another implementation doing this is rename (originally by Larry Wall, and currently exists in numerous variants), and sometimes part of operating system installs (e.g. available for Debian or as a FreeBSD port). And another one is File::Rename.

Walking a file hierarchy

File::Find is in the core.
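A minimal sketch of File::Find collecting all .pm files below a directory. The directory tree is created in a temporary location and is made up for the example.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Build a small made-up tree to walk
my $root = tempdir(CLEANUP => 1);
make_path(File::Spec->catdir($root, 'lib', 'Foo'));
for my $parts (['lib', 'Foo.pm'], ['lib', 'Foo', 'Bar.pm'], ['README']) {
    my $path = File::Spec->catfile($root, @$parts);
    open(my $fh, '>', $path) or die "cannot create $path: $!";
    close $fh;
}

# Inside the callback, $_ is the base name (find has changed into the
# containing directory) and $File::Find::name is the full path.
my @modules;
find(sub {
    push @modules, $File::Find::name if -f $_ && /\.pm\z/;
}, $root);

print "$_\n" for sort @modules;
```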

File::Find::Rule is different XXX. Many plugins.

Other modules here are: File::Next, Path::Class::Rule, Path::Iterator::Rule, File::Find::Rule, File::Find::Object, File::Find::Iterator. These are covered in this rjbs article: http://rjbs.manxome.org/rubric/entry/1981. rjbs also mentions the following modules which do not do lazy iteration, so these are probably slow on large trees: File::Find::Declare, File::Find::Match, File::Find::Node. And finally there's File::Find::Wanted.

Directory::Iterator::XS is a recursive directory lister, implemented as XS.

File::Zglob provides globbing like the core File::Glob, but adds zsh's "**" pattern, which can be used to recursively descend into subdirectories.

File::Locate uses the locate/slocate database which is usually updated every night on Unix boxes.

A speed comparison: http://rjbs.manxome.org/rubric/entry/1981.

Changing the directory

For a simple change of the current working directory, use the perl built-in chdir.

File::pushd allows a temporary change of the current working directory, which is automatically reverted once the created guard object goes out of scope.

File::chdir can do the same, but it uses a localized variable $CWD instead.

File::Tools provides a number of commonly found file-handling Unix tools as Perl functions; amongst them also pushd and popd. Note that this version is not capable of automatically reverting the directory.

Cwd::Guard works similarly to File::pushd. It's using fchdir under the hood, so it should be safe against directory renames while the guard is in effect.

File::cd is another module for temporary directory changes with a different interface: here a callback function is supplied for the code which should run within the changed directory.

Free disk space

Filesys::DfPortable is probably the best choice. It supports both Windows and Unix systems.

Filesys::Df and Filesys::Statvfs seem to work on Unix systems. Both need a C compiler. Filesys::Df gives more portability, while Filesys::Statvfs can return more information about the file system.

Filesys::DiskSpace: no usage of external commands, just using syscalls through syscall.ph. Support seems to have been stalled for many years (there are test failures and old bug reports on rt.cpan.org).

Filesys::DiskFree parses the output of the Unix tool df. It does not work with modern BSD systems, for instance. The module is very old and seems to be unsupported.

Locking

Perl already has core support for flock(). But there are modules which are wrappers around it, especially for creating lock files, or implement locking a different way.
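
A minimal sketch of the core approach, assuming a lock file in a temporary directory (the file name example.lock is made up for this illustration):

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);
my $lockfile = "$dir/example.lock";    # hypothetical lock file name

# Open (and create if necessary) the lock file without truncating it.
open(my $fh, '>>', $lockfile) or die "Cannot open $lockfile: $!";

# LOCK_NB makes the call return immediately instead of blocking.
my $got_lock = flock($fh, LOCK_EX | LOCK_NB);

if ($got_lock) {
    # ... critical section goes here ...
    flock($fh, LOCK_UN) or warn "Cannot unlock $lockfile: $!";
}
else {
    warn "Somebody else holds the lock on $lockfile\n";
}

close $fh;
```

Without LOCK_NB the call would block until the lock becomes available. Note that the lock file itself is not removed here; the wrapper modules below take care of such details.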

File::Flock is a wrapper around flock(). It knows about blocking and non-blocking locks and has a convenient OO interface which removes the lock file if the object goes out of scope. The distribution also includes File::Flock::Subprocess, which handles locks in subprocesses for operating systems which do not do this normally (e.g. Solaris), but this makes the dependency chain a little bit longer.

File::Flock::Tiny is another, probably smaller wrapper around flock().

File::Basic::Flock is similar to File::Flock, but the user has to create the lock file himself.

File::FcntlLock uses fcntl() instead of flock() for locking. One advantage is that fcntl() can do range locking. Disadvantage is that this method is not available everywhere.

The locking algorithm of LockFile::Simple does not use any system calls like flock() or lockf(). It even has support for locking over NFS. It is reasonably safe from race conditions, but not absolutely.

The following modules have also some support for locking: Paranoid::Lockfile (but check the RT queue here!).

File::Lockfile is not recommended, as race conditions are easily possible (checked with at least v1.0.5).

Sys::RunAlone is a convenient wrapper around flock() for scripts which should run only once at a time, optionally with a retry option and being silent or loud, if there's another script running.

Graphics

Graphic libraries

The classic one is GD. It is not exactly easy to install, because it has libgd as a prerequisite, which is not always pre-installed. To complicate matters, there are incompatible versions of GD and libgd (1 and 2). Older versions of libgd1 had gif support, which was removed for patent reasons. So many OS distributions still use the older version to retain backwards compatibility. Maybe these things will settle as the patent expired and newer libgd2 and GD2 versions come with gif support again. libgd2 also has true color support, and there is also truetype/freetype font support.

Image::Magick (also known as PerlMagick) is, compared to GD, slow but creates superior results. It's said that the API is not stable and tends to change often. It's also not so easy to install because of the dependency on the imagemagick program/library. An alternative to ImageMagick is its offspring GraphicsMagick (http://www.graphicsmagick.org/), which also comes with a Perl binding.

Imager is a library which comes with the graphic library bundled in the perl module distribution. So you only need the low-level libraries like libpng and libjpeg installed on your system.

Image::Imlib2: apparently an interface to the imlib2 library. Not tested yet.

Prima is originally a GUI toolkit, but it can handle image operations also in a non-windowing environment.

Not a perl module, but still usable from a perl script/module is netpbm (to be found on sourceforge). netpbm is a collection of filter programs which can be piped together. For example: to resize a gif image just use: giftoppm file.gif | pnmresize <size> | ppmtogif > newfile.gif.

GraphViz is a graph library. Feed in a graph or tree and get a graph as gif or postscript or Perl/Tk canvas output or ...

Cairo is a 2d vector graphics library and able to create png files, amongst other formats.

SVG

Image::LibRSVG converts svg graphics into popular image formats.

SVG::GD vs. GD::SVG: use the GD API to create SVG graphics. [XXX: What are the differences between both?]

SVG provides an API to create SVG graphics.

Cairo is a 2d vector graphics library and able to create SVG graphics, amongst other formats.

Image information

Image::Size and Image::Info may be used to get information about an image file (like size, mime format etc.).

Image::EXIF is an interface to EXIF information. Some EXIF information can also be found in Image::Info. There's also Image::ExifTool, which is (contrary to Image::EXIF) written in pure perl.

GD can be misused to determine if a file is an image at all by loading the image into a GD::Image object (meaning at least that it's an image format supported by GD), and has a getBounds method to get the dimensions of the image. But this is not very efficient, especially for large images.

Image::Magick provides a "Ping" method which gets only basic information (size and format) from the image without loading it completely, so it's more efficient than GD.

Charts

GD::Graph is a nice package to create plots, charts, pie diagrams etc. via the GD module.

GD::Graph3D is like GD::Graph, but provides 3d-looking graphs.

Chart::ThreeD::Pie is another module to create 3d pie diagrams via GD.

Chart::Clicker is based on the Cairo graphics library and many people think that its output is extremely beautiful.

Language Extensions

Exporting

The standard way for exporting symbols from a package to another is the core module Exporter. See "COMPARISONS" in Sub::Exporter explaining the strengths and weaknesses of Exporter, Exporter::Lite, Exporter::Easy, Exporter::Simple (this one's broken for perl 5.12 and newer and seems to be not fixed anymore), Perl6::Export, Perl6::Export::Attrs, Exporter::Renaming, Class::Exporter, Exporter::Tidy.
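
As a small illustration of the core Exporter, here is a sketch of a module which exports functions only on request. The package My::Util and its functions are hypothetical; the package is defined inline (inside a BEGIN block) only to keep the example in a single file.

```perl
use strict;
use warnings;

BEGIN {
    # Hypothetical module, normally living in its own file My/Util.pm.
    package My::Util;
    use Exporter 'import';               # inherit Exporter's import method
    our @EXPORT_OK = qw(double triple);  # exported only if requested

    sub double { return 2 * $_[0] }
    sub triple { return 3 * $_[0] }

    # Pretend the file was already loaded, so "use My::Util" works below.
    $INC{'My/Util.pm'} = __FILE__;
}

package main;

use My::Util qw(double);    # only double() is imported into main

print double(21), "\n";
```

Putting names into @EXPORT_OK instead of @EXPORT avoids polluting the namespace of the calling package by default.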

Interfacing with C code

The "standard" way of interfacing with C code or a C library is the XS language, part of core perl. See perlxs and perlxstut for more information.

SWIG (http://www.swig.org/) is a language-agnostic approach, supporting also other programming languages like PHP, Python, Tcl and Ruby.

Using FFI, C::DynaLib or P5NCI C library functions may be accessed without the need to write intermediate C code at all. Under Windows, Win32::API may be used to access the Windows API.

Last, with Inline::C C code may be embedded in Perl code.

Interfacing with other languages

See the Inline family of modules.

Mail and Usenet News

Sending and Receiving

Mail::Send and Mail::Mailer are used for composing and sending simple messages. The modules are old and work somehow, but one has to actually read the source rather than the documentation to learn all the caveats. There's no support for constructing MIME mails. There are multiple backends (sendmail, mail, smtp ...) built in.

Mail::Sendmail: XXX check this one. Just for sending.

Net::SMTP: A low-level module for accessing a SMTP server for sending mails. No support for constructing messages.

Mail::Sender a higher-level module around Net::SMTP ???

Mail::Box: a comprehensive suite of mail manipulating modules. This includes: message construction, mbox/mail folder manipulation (including access to POP3 mail accounts), sending mails. Support for MIME mails.

Mail::Cclient also gives access to various mailbox formats (including IMAP mail accounts). The module relies on an additional C library.

Mail::IMAPTalk is an IMAP client interface.

Generating

The mentioned modules have mostly only support for generating plain messages without MIME and attachments. For more complex emails with attachments you need something like MIME::Lite, MIME::Tools or Email::MIME. There are claims that at least MIME::Lite is buggy and that Email::MIME should be preferred.

Miscellaneous

Module installation

The classic module is CPAN, which comes with core perl. An alternative is CPANPLUS, which was temporarily part of the perl core (but not anymore since 5.16.0). App::cpanminus (aka cpanm) is another alternative.

CPAN::Mini is a module for creating a minimal mirror of the CPAN repository.

Logic programming

A list can be found at http://www.perlmonks.org/index.pl?node_id=424075. A discussion at http://www.mail-archive.com/sw-design@metaperl.com/msg00115.html.

Building software

ExtUtils::MakeMaker is the old-fashioned standard way of building perl modules and applications. Unfortunately it depends on the existence of a make tool on the system. For a pure-perl build system try Module::Build.

Module::Install is a build description system which provides a nice DSL and is extensible using plugins. It's self-contained; distributions are expected to ship Module::Install and all dependencies. Under the hood it is using ExtUtils::MakeMaker and relies on a make tool.

Dist::Zilla is a distribution builder.

A make replacement as a perl module: Make, together with a command line tool pmake.

Commands::Guarded is an interesting module for guarded execution of commands (XXX better description!).

PANT is meant as an ant replacement to help to automate a build environment. It is usually used as a meta script which calls external make commands and has some built-ins like NewerFile, CopyFile etc.

Cons is a make replacement. Cons files are written in pure perl.

Slay::Makefile provides a make variant called slaymake which can run both shell commands and perl code.

Parallel::Depend is a dependency system which has a focus on parallel execution. Its dependency description looks remotely like a makefile. It's able to run both shell commands and perl code.

Parsers

Marpa::XS is a parser for grammars which can be written in BNF.

See http://egparser.sourceforge.net/ about a comparison between Parse::RecDescent, Parse::Yapp, and Perl-byacc.

CRC

String::CRC32: fast, written in C.

Digest::Crc32: very slow, written in pure perl.

There are also Digest::CRC and String::CRC.

Password generators

There are String::Random and String::MkPasswd.

Logging

A list of available logging modules can be found at: http://search.cpan.org/dist/Log-Dispatch/lib/Log/Dispatch.pm#RELATED_MODULES.

The documentation page of Log::Fast, a logger written for speed, has a speed comparison against other logging modules at: http://search.cpan.org/dist/Log-Fast/lib/Log/Fast.pm#DESCRIPTION.

Logfile parsing

David Newcum lists in http://cpanratings.perl.org/user/interiot a number of modules which do logfile parsing: Logfile::Access, http://www.oreilly.com/catalog/perlwsmng/chapter/ch08.html, Regexp::Log, HTTPD::Log::Filter, AWStats' awstats.pl, http://www.fourmilab.ch/fourmilog/archives/2005-03/000500.html, and, as the worst example, Apache::ParseLog.

System administration

Sysadm::Install: a box full of tools typical for automated installation tasks. Looks very nice.

Commands::Guarded: this is more a paradigm than a toolbox: every step has a should-condition and an action. So it's possible to restart a script after a previous failure, and errors are detected not by checking error codes, but by checking the should-condition.

Rex stands for remote execution and is a collection of modules to help with administering remote servers.
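
The paradigm can be sketched in plain perl. The function guarded_step and its named arguments are hypothetical and do not reflect the actual interface of Commands::Guarded; the sketch only shows the idea of checking a condition before and after running an action.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: run the action only if the condition is not yet met,
# then verify the condition instead of inspecting error codes.
sub guarded_step {
    my (%args) = @_;
    return 'already satisfied' if $args{ensure}->();
    $args{using}->();
    die "Step '$args{name}' did not reach its goal\n"
        unless $args{ensure}->();
    return 'done';
}

my $base = tempdir(CLEANUP => 1);
my $dir  = "$base/data";

# Running the same step twice simulates restarting a script after a failure.
my @results;
for my $run (1, 2) {
    push @results, guarded_step(
        name   => 'create data directory',
        ensure => sub { -d $dir },
        using  => sub { mkdir $dir or die "mkdir $dir: $!" },
    );
}

print join(', ', @results), "\n";
```

The second run finds the directory already in place and skips the action, which is what makes such scripts safely restartable.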

Geo modules

Distance calculation

The core module Math::Trig has a great_circle_distance function.
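
A short sketch, following the example in the Math::Trig documentation. The helper name NESW, the coordinates and the earth radius of 6378 km are taken from that example and are approximations only.

```perl
use strict;
use warnings;
use Math::Trig qw(great_circle_distance deg2rad);

# Math::Trig expects radians, and the "latitude" is measured from the
# North Pole (hence 90 - latitude). Arguments: longitude, latitude in degrees.
sub NESW {
    my ($lon, $lat) = @_;
    return (deg2rad($lon), deg2rad(90 - $lat));
}

my @london = NESW( -0.5, 51.3);
my @tokyo  = NESW(139.8, 35.7);

# The last argument is the sphere radius; the result is in the same unit.
my $km = great_circle_distance(@london, @tokyo, 6378);

printf "London - Tokyo: about %.0f km\n", $km;
```

The result is only as accurate as the assumption of a perfectly spherical earth.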

Geo::Distance offers a couple of different algorithms for distance calculation. Geo::Distance::XS is a faster XS implementation of this module.

GIS::Distance also offers different algorithms for distance calculation. The returned values are objects here. GIS::Distance::Fast is a faster XS implementation of this module.

Geocoding

Geo::Coder::Google is an interface to the Google Geocoding API. Results seem to be good, at least for the Germany area. There are variants for the two API versions: Geo::Coder::Google::V2 and Geo::Coder::Google::V3. There's also an alternative implementation for the V3 API only: Geo::Coder::Googlev3.

Geo::Coder::GoogleMaps is a similar module which just uses a different interface to the API (JSON instead of XML). The results should be of the same quality as those of Geo::Coder::Google.

Geo::Coder::PlaceFinder is a similar interface to the Yahoo PlaceFinder geocoding service. It requires an app id. The previous geocoding service was served by Geo::Coder::Yahoo, but the old API was shut down in 2011.

Geo::Coder::Bing is an interface to the geocoding API from http://www.bing.com. It does not require any API key. Results seem to be good, at least for the Germany area.

Geo::Coder::OSM is another interface to an API using OpenStreetMap data. No API key is required.

Geo::Coder::Cloudmade is an interface to the Cloudmade API, which is using data from the OpenStreetMap project. An API key is required.

Geo::Coder::US is an interface to the free US geo data.

Geo::Coder::Mapquest is an interface to the beta Mapquest Geocoding Web Service. The service requires an API key. According to the manpage of version 0.01, the results are often not good for the US area, and there are no results for addresses outside the US.

Other

Geo::Google is an interface to Google maps.

Weather

Working modules

Yahoo::Weather does not need an API key, seems to still work, and has data for international locations. It provides current weather conditions, forecast, sunrise/sunset data.

Weather::Underground does not need an API key, seems to still work, and has data for international locations. It provides current weather conditions, no forecast, sunrise/sunset and moonrise/moonset data.

Geo::METAR parses METAR weather data. It does not do fetching at all. A sample source for METAR data is http://weather.noaa.gov/cgi-bin/mgetmetar.pl?cccc=$ICAO (put in an icao airport code).

Non-working modules

Weather::Com: the Pod says "Note that weather.com is retiring the XML Data Feed."

Weather::Google does not work anymore, because Google ceased the iGoogle weather API. See https://rt.cpan.org/Ticket/Display.html?id=79260.

Geo::Weather is US centric and does not seem to work anyway. See also https://rt.cpan.org/Ticket/Display.html?id=1820.

http://search.cpan.org/perldoc?weather looks like a convenient script, but unfortunately the location data is stored in the script itself, and there does not seem to be a way how to change this from cmdline.

Unknown status

There are more modules dealing with weather. These are all untested:

Terminal input

Use Term::ReadKey for one-key-at-a-time input (visible or invisible). Useful also for invisible password input.

Term::ReadLine: a perl module similar to readline. Enables line editing and history handling. Comes in two flavours: Term::ReadLine::Perl and Term::ReadLine::Gnu (XXX what's the difference between those two?)

IO::Prompt: from the SYNOPSIS it's a nice module for prompting a string and getting an answer. Has a lot of useful options (defaults, input filters ...). But it does not seem to run under Windows.

Progress indicators

Term::ProgressBar draws a progress bar to the terminal. Term::ProgressBar::Quiet does the same, but checks first if the output device is a terminal at all (useful for scripts which may run as cron jobs, for instance).

Time::Progress may also draw a progress bar and has a rich set of formatting parameters (e.g. showing the ETA etc.).

Tk::ProgressBar for a graphical progress bar. IWL::ProgressBar for a web progress bar based on the IWL library.

Term::Spinner draws a classic spinner (a rotating bar). Term::Twiddler does the same thing, but seems to allow for more configuration. For multiple spinners, one can use Term::MultiSpinner.

More in the same league: Term::Activity, a huge framework ProgressMonitor, Acme::Spinner, Acme::Spinners (with a set of different spinners).

AI

A long link list for AI and NLP stuff can be found here: http://perlmonks.org/?node_id=399498.

Bio

A long link list for Bioinformatics stuff can be found here: http://perlmonks.org/?node_id=399498.

Job Queues

Distributed systems are TheSchwartz and Gearman.

Networking Devices IPC

Network Utilities

IP address matching

Net::IP::Match::Regexp is a module to match IP addresses against ranges. This module contains also a list of similar modules with a comparison (Net::IP::Match, Net::IP::Match::XS, Net::IP::Resolver).

Another module for parsing, manipulating and looking up IP network blocks is Net::Netmask.
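
What these modules do at their core can be sketched with the core Socket module. The function names ipv4_to_int and ip_in_cidr are hypothetical helpers for this illustration; only IPv4 and CIDR notation are handled.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Convert a dotted-quad IPv4 address into a 32 bit integer.
sub ipv4_to_int {
    my ($ip) = @_;
    die "Not a dotted-quad IPv4 address: $ip\n"
        unless $ip =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;
    my $packed = inet_aton($ip);
    die "Cannot parse $ip\n" unless defined $packed;
    return unpack 'N', $packed;
}

# Return 1 if $ip lies within the block given as "network/prefixlength".
sub ip_in_cidr {
    my ($ip, $cidr) = @_;
    my ($net, $bits) = split m{/}, $cidr, 2;
    die "Bad prefix length in $cidr\n"
        unless defined $bits && $bits =~ /\A\d+\z/ && $bits <= 32;
    # Build the netmask; a prefix length of 0 matches everything.
    my $mask = $bits == 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    return (ipv4_to_int($ip) & $mask) == (ipv4_to_int($net) & $mask) ? 1 : 0;
}

for my $ip (qw(192.168.1.17 192.168.2.1)) {
    printf "%s %s 192.168.1.0/24\n", $ip,
        ip_in_cidr($ip, '192.168.1.0/24') ? 'is in' : 'is not in';
}
```

The CPAN modules are preferable when many ranges have to be checked, as they can precompile the ranges into a faster matcher.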

SSH

Net::SSH is a simple wrapper around the ssh(1) command. Cannot send proper command line arguments to remote. Does not support arbitrary ssh options. It seems that there's no active development anymore.

Net::SSH::Perl is a ssh implementation written in perl and does not need an ssh installed.

Net::SSH::Expect also uses the system's ssh like Net::SSH, but adds also the capabilities of Expect into it.

Net::SSH2 is a wrapper built around libssh2.

Net::OpenSSH is a new player. The Pod documentation of this module has a comparison against the other available ssh implementations.

Option Parameter Config Processing

Config/Ini files

YAML is both good for (de)serialization and config files. API is very simple to use. Note the overwhelming bug list on http://rt.cpan.org, but for normal configuration files it should work OK.

Config::IniFiles: most people recommend this module for Windows-ini styled configuration files. More modules in this game: Config::Tiny, Config::IniHash, Config::Mini, and Config::INI::Simple. Config::Any::INI can read ini files, but not write them.

XML::Simple: out-of-the-box serialization is not possible without fiddling with a lot of options

Config::Model: a configuration framework for parsing and analyzing all kinds of configuration files. Includes generating a curses screen to let the user input the options. Has many features like automatic comment creation, help screens, different levels (novice/intermediate/...).

String Language Text Processing

Best reference is probably http://perl-xml.sourceforge.net/, containing the Perl XML FAQ.

Some benchmarks can be found at http://www.xmltwig.com/article/simple_benchmark/.

Parser

XML::Parser: the mother of all XML processing modules. It works, though on a very low level (compared with other fields where Perl is strong). I suspect it's just the fact that a DOM/XML tree does not easily fit into the hash/array/scalar world of Perl.

XML::Parser::EasyTree: like XML::Parser with Tree style, but the generated tree is more readable.

XML::Parser::Lite: a pure perl module implementing a complete XML parser, without any other dependencies.

XML::Tiny is also a pure perl parser implementing parsing of a rather deliberately chosen subset of XML. Therefore, especially if you don't have control over the XML source, it is not recommended to use it in real production (and if you have control over the source, most times it is preferable to use another data serialization format like YAML, JSON, Storable or so).

XML::SAX::PurePerl: a pure perl XML parser with SAX2 interface

Parser/Generators

XML::LibXML, an interface to the libxml2 library. The library comes with a number of methods to access an XML document: using DOM functions with XPath support, or using a push parser (SAX), or a pull parser (XML::LibXML::Reader).

XML::Simple, a high-level interface. For serious work there's a lot of options to fiddle with, but the documentation is extensive and marks the options which are important and which are not.

XML::Mini, an implementation of a DOM-like API. Module seems to be buggy and unmaintained.

XML::Twig is feature-rich, has support for in-memory and chunked parsing, xpath operations, good documentation, a section on competitors. The author admits that it is slower than XML::LibXML, but uses less memory. Never used it, though.

XML::Bare is, according to the docs, a fast XML parser which implements only a subset of XML. Its doc has a comparison table about the parsing performance of the existing XML parsers.

Normalization/Canonicalization

There are XML::Normalize::LibXML and XML::CanonicalizeXML available. The latter implements W3C recommendations.

XML::LibXML::Document has also canonicalisation methods (toStringC14N...).

Feeds

Parsing and generating RSS: XML::RSS generally works fine, but may be quite slow on parsing some types of RSS feeds (taking ~5 seconds on normal computers). XML::RSS::Parser is more light-weight than XML::RSS and provides only parsing capabilities. It seems to be significantly faster than XML::RSS (probably dependent on the used XML::SAX driver). XML::RSS::Tools provides client capabilities and expects an XSLT stylesheet which does the conversion of rss content. XML::RSS::Feed runs on top of XML::RSS and provides caching abilities.

Parsing and generating Atom: XML::Atom...

Utilities

Tk::XMLViewer displays an XML tree in a Tk window.

Templating systems

A comparison of web centered templating systems can be found at http://perl.apache.org/docs/tutorials/tmpl/comparison/comparison.html. And here's another one: http://www.perl.com/pub/2001/08/21/templating.html. Here's a benchmark: http://search.cpan.org/perldoc?bench_various_templaters.pl (part of http://search.cpan.org/dist/Template-Alloy/). The examples directory of HTML::Template::Compiled also includes a benchmark script, but without numbers in it.

Standalone templating systems

There are too many around there. Just to name some:

Template (Template-Toolkit), a very popular and feature-rich system. Good documentation, good support.

HTML::Template, another popular system. Not as feature-rich as Template-Toolkit. Focus on HTML.

HTML::Template::Compiled uses the same templating language as HTML::Template, but internally compiles the template to perl code, so it's faster in persistent systems (about 2-4x).

Text::ScriptTemplate, an asp/jsp-like system to embed perl in a template. Small and useful.

Template::Alloy is, according to its documentation, a fast templating system which can emulate some of the other popular systems, including Template-Toolkit and HTML::Template. Unfortunately the XS version, Template::Alloy::XS, has not worked for a long time.

Text::Xslate is a scalable template engine for Perl5, and according to its documentation, very fast (claiming to be 100x faster than TT2).

Meta templating systems

Any::Template is a wrapper around the major templating systems using the same API for all.

Apache

Apache::ASP: a framework with a jsp/asp-like templating system (XXX check), to be used in conjunction with mod_perl.

EmbPerl (HTML::Embperl). More than just templating XXX?

Mason (HTML::Mason). More than just templating XXX?

Apache::SimpleTemplate, a mod_perl handler for using jsp/asp-like templates. It seems to be rather lightweight in comparison to Apache::ASP.

PDF handling

PDF::Create is a pure-perl module and works fine for generating pdf documents, including ones with simple vector graphics.

PDF::API2 is big and complete. But it seems to be slower than PDF::Create.

In CAM::PDF's SEE ALSO Pod section there's a comparison between a couple of PDF-related modules.

Cairo is a 2d vector graphics library and able to create PDF files, amongst other formats.

There are some templating solutions (PDF::Reuse, PDF::Template). And there's also (non-perl) pdflatex!

Approximate matching

String::Approx uses the Levenshtein edit distance. The core is implemented in C. Older versions of String::Approx (version 2) also have a slower perl implementation.

String::Similarity returns a similarity index of two strings.

String::Trigram finds similar strings by trigram (or 1, 2, 4, etc.-gram) method.

Text::LevenshteinXS calculates the Levenshtein edit distance (which is used in String::Approx, for example) between two strings. Text::Levenshtein is a pure perl implementation.
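
For illustration, the Levenshtein edit distance itself can be computed with a few lines of pure perl using the classic dynamic programming approach. This is a minimal sketch with a hypothetical function name, not the implementation used by any of the modules mentioned here, and it is much slower than the XS versions.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Number of single-character insertions, deletions and substitutions
# needed to turn $s into $t. Only two rows of the matrix are kept.
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    my @prev = (0 .. scalar @t);    # distances from the empty prefix of $s
    for my $i (1 .. scalar @s) {
        my @cur = ($i);
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            $cur[$j] = min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution or match
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

print levenshtein('kitten', 'sitting'), "\n";    # prints 3
```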

More CPAN modules: Text::Brew, Text::Fuzzy, Text::Compare, Text::Dice, Text::Levenshtein::XS, Text::Levenshtein::Damerau::PP, Text::Levenshtein::Damerau::XS, Text::WagnerFischer. See also https://metacpan.org/pod/distribution/Text-Dice/ex/benchmark.pl for benchmark results.

Just for completeness: there's also the command line agrep, see http://en.wikipedia.org/wiki/Agrep.

Stripping accents

Text::Unidecode: highly recommended, handles all Unicode characters. There are also Text::Unaccent, Text::StripAccents and Text::Undiacritic.

User Interfaces

GUI toolkits

Tk is mature and runs on X11, Windows and MacOSX under XDarwin. Further alternatives are: Wx, Gtk2, Qt, Prima, Win32::GUI etc. (A GUI toolkit comparison can be found at http://www.perlmonks.org/?node_id=108708). X11::Protocol provides a low-level interface to the X11 windowing system --- good in conjunction with a higher level GUI toolkit like Tk.

Qt is only for the library version 2.0, while the current version is at 4.0.

IWL is a widget library for the web.

Console based GUI toolkits

Low level: Curses.

Higher level: CDK, Curses::UI, Curses::Widgets.

Here's a comparison between Curses::UI and Curses::Application (sorry, German only): http://groups.google.com/group/de.comp.lang.perl.misc/browse_thread/thread/6e8da30ef98fb386#0df9b3146ac61a05

GUI builder

For Tk: ZooZ is actively developed. It is not exactly of the type "drag'n'drop", but seems to be useful. specPerl (not on CPAN) is old, but probably useful. guido (on Sourceforge) has no active development anymore, and is not in a useful state.

Tk widgets

Date widgets

Tk::Date: a highly configurable text-only entry widget.

Tk::DateEntry: a BrowseEntry-like widget which pops up a calendar.

Tk::DatePick: a date widget which is only selectable using back/forward buttons (one for each date, month, year).

Tk::MiniCalendar: the calendar view looks like Tk::DateEntry, but the widget is not linked to a BrowseEntry widget.

(Multicolumn) Listboxes, Tables

Tk::Listbox: a core Tk widget. Cannot use multiple columns, other than by using a fixed-width font to fake multiple columns. Since Tk804 it's possible to define a limited set of per-entry configuration (foreground color, others?).

Tk::HList: a core Perl/Tk widget, which has headers (through Tk::ResizeButton also clickable and useable to resize columns), multiple columns, different styles for every cell, the possibility to use widgets or images in cells.

Tk::MListbox: XXX, are there others?

Tk::Table: a pure Perl/Tk table implementation. Rather complete, but slow on large tables, as each cell is represented by a widget.

Tk::TableMatrix: an XS extension to create tables. Complete and much faster than Tk::Table.

Error handling

If the standard Tk module Tk::ErrorDialog is loaded, then fatal errors (only) will go into a dialog instead of to STDERR.

Tk::Stderr redirects all STDERR output to a toplevel window. The window may be iconified and will auto-deiconify if there's new STDERR output.

Tk::Carp works similarly to CGI::Carp by redirecting warn and die calls, in this case to Tk dialogs. Note that this module does not catch STDERR (unlike Tk::Stderr).

World Wide Web

Client

LWP, LWP::UserAgent, LWP::Simple (higher level, has also HTTP methods, but no methods like POP3, SMTP).

LWP: top-level module, use this for CPAN updates

LWP::UserAgent: use this for doing requests in the WWW space

LWP::Simple: use this if you want a very simple interface

IO::All: for an even simpler interface

WWW::Curl is a HTTP client built on top of libcurl. According to the manpage LWP is still preferred to use, unless one needs more speed or parallelism.

There are other light-weight HTTP implementations, which are not as complete as LWP, but smaller in size and module number and probably also faster. These are Furl, HTTP::Lite and HTTP::Client (where the latter needs HTTP::Lite as a dependency and the version 1.51 seems to be broken). HTTP::MHTTP is also a very low-level implementation, which is probably faster than LWP, but for example lacks out-of-the-box support for name-based virtual hosts.

HTTP::GHTTP needs a non-standard library to be installed, and a C compiler. It's probably faster than LWP.

There are a number of extensions to LWP::UserAgent, like LWP::UserAgent::WithCache and LWP::UserAgent::Cache::Memcached (with caching support), LWPx::ParanoidAgent (a user agent with additional restrictions, e.g. refusing to connect to internal addresses), LWP::UserAgent::Determined and LWP::UserAgent::ExponentialBackoff (user agents retrying a few times in case of transient errors) ...

LWP::ParallelUA is an implementation on top of LWP to allow parallel fetching. Unfortunately this module does not work anymore with recent LWP installations.

libnet (Net::FTP) etc. (more low-level, has FTP, but no HTTP (?), POP3, SMTP etc.).

WWW::Mechanize: a class on top of LWP::UserAgent, can be used to mechanize WWW requests. WWW::Automate is similar, but outdated and not supported anymore. HTTP::Recorder is a supporting tool for WWW::Mechanize: it acts like a proxy and spits out WWW::Mechanize scripts. It would be useful for automated test systems if it were supported (but look at http://rt.cpan.org and the overwhelming bug list).

WWW::Mechanize::Shell is a shell for driving WWW::Mechanize and creating scripts for this module.

Web::Scraper is a web scraping toolkit using HTML/CSS selectors or XPath expressions.

Scrappy is a powerful web harvester and spider.

Server

HTTP::Proxy to create a HTTP proxy.

HTTP::Daemon for a pure perl HTTP daemon. Other pure perl http daemons: tinyhttpd, httpi (links missing). For perl support in the Apache server see mod_perl and the CGI support in Apache.

Starman - a high-performance preforking PSGI/Plack web server.

Monoceros - a PSGI/Plack server with event driven connection manager, preforking workers, inherited from Starman.

Web Frameworks

There's a wiki page listing some of the existing frameworks for web development: http://www.perlfoundation.org/perl5/index.cgi?web_frameworks

Catalyst --- The Elegant MVC Web Application Framework

Maypole. A MVC system for Apache or cgi-bin.

See also HTML::Mason and maybe Apache::ASP, which are more than just templating systems (is this true XXX?).

Jifty.

Dancer - lightweight yet powerful web application framework.

Plack - PSGI toolkit, may also be used to create web sites.

CGI based frameworks

CGI::Application

CMS

Contentment --- a perl based Web CMS (check it XXX).

Wiki

CGI::Wiki: nice, modular, not out-of-the-box, but relatively easily configurable.

Kwiki: popular.

http://www.twiki.org (not a CPAN module). Also popular, feature-rich, slow.

URI handling

URI has full-flavored support for different URI schemes. One major problem: URI only supports the "&" query param separator and not the ";" separator. The latter is recommended by RFC XXX; this also means that URI cannot correctly parse query parameter strings generated by CGI.
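
To illustrate the separator issue, here is a minimal sketch in plain perl which accepts both "&" and ";" as separators. The function parse_query is a hypothetical helper, not part of URI or CGI, and it ignores details like character encodings.

```perl
use strict;
use warnings;

# Split a query string into [key, value] pairs, accepting both separators.
sub parse_query {
    my ($query) = @_;
    my @pairs;
    for my $pair (split /[&;]/, $query) {
        next if $pair eq '';
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;                                # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # decode %XX escapes
        }
        push @pairs, [$key, $value];
    }
    return @pairs;
}

for my $pair (parse_query('a=1;b=hello+world&c=%41')) {
    print "$pair->[0] = $pair->[1]\n";
}
```

Returning a list of pairs instead of a hash keeps repeated parameters and their order intact.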

CGI has limited support for URI handling through the url() and query_string() functions and the new() constructor.

URI::Query handles just the querystring part.

Apache/mod_perl support

Compression

A number of modules solve the problem "automatically (gzip) compress outgoing data if the client supports it". Probably the most comprehensive mod_perl handler (just by looking at the documentation size) is Apache::Dynagzip (which is an Apache::Filter-implementing filter). More light-weight solutions are Apache::Compress (also in conjunction with Apache::Filter) and Apache::GzipChain (in conjunction with Apache::OutputChain). All three can handle static files. Apache::Dynagzip has recipes for working with CGI or Apache::Registry scripts.

CGI::Compress::Gzip is only suitable for CGI scripts. It automatically turns the standard output into a gzip-compressed stream, if suitable.