Initial checkin.

commit c0f208f514a2fd38a69eb1f57c1cbdadf7399535 0 parents
@chromatic authored
7 CREDITS
@@ -0,0 +1,7 @@
+Please add yourself to this file in alphabetical order in this format:
+
+N: <name>
+E: <email address>
+
+N: chromatic
+E: chromatic@wgz.org
36 README
@@ -0,0 +1,36 @@
+What is Modern Perl?
+--------------------
+
+Perl is a popular, powerful, and widely used programming language. Over its
+twenty year lifespan, it's powered millions of systems worldwide, moving
+trillions of dollars. More importantly, it's helped countless people get their
+work done effectively.
+
+The Perl community has a reputation for clever solutions -- and a reputation
+for institutional knowledge that isn't always clear to novices and neophytes.
+Modern Perl, written with this knowledge, can be very clean, very maintainable,
+and even more powerful than you imagine.
+
+That knowledge should be available to everyone. This book will teach you how
+to program Perl, but it will also teach you how to *think* about Perl, so that
+you can use it to your full advantage.
+
+http://www.modernperlbooks.com/mt/2009/01/why-modern-perl.html
+
+
+Contributing to Modern Perl
+---------------------------
+
+For now, this draft work is licensed under a Creative Commons
+Attribution-Noncommercial-No Derivative Works 3.0 United States License. For
+more details, see:
+
+ http://creativecommons.org/licenses/by-nc-nd/3.0/us/
+
+Please feel free to point people to this repository. Suggestions and
+contributions are welcome. Please do not redistribute with modifications.
+
+This book will be available under a less restrictive license when it comes out
+in print from Onyx Neon Press:
+
+ http://www.onyxneon.com/
395 outline.pod
@@ -0,0 +1,395 @@
+=head1 Modern Perl
+
+The justification for the book -- or, how to think about Perl to make learning
+and understanding it easier.
+
+=head2 The Perl Philosophy
+
+Fundamental features of the language and its design.
+
+=over
+
+=item context
+
+L<context_philosophy>
+
+=item expressivity
+
+=item implicit ideas
+
+L<implicit_ideas>
+
+=item perldoc
+
+L<perldoc>
+
+=back
+
+=head2 The CPAN
+
+Evolution happens outside the core I<on purpose>.
+
+=head2 The Perl Community
+
+How and where to find other people to learn and to share.
+
+=head2 The Language Itself
+
+A basic explanation for elements of the language itself.
+
+=head3 Syntactic Elements
+
+An explanation of the individual syntactic elements which make up the language.
+
+=head3 Names
+
+L<names>
+
+Mostly identifiers. How they work, what's valid, and what's not.
+
+=head3 Values
+
+L<values>
+
+Strings, numbers, everything without a sigil.
+
+=head3 Operators
+
+This has to be more than just a list of operators (and how do you explain the
+difference between perlop and perlfunc?).
+
+=head3 Control Flow
+
+Loops, jumps, and labels. Defer calling functions? Exceptions??
+
+=head3 Variables
+
+How did I miss this one? Probably don't want to get into references yet.
+Should mention magic global variables.
+
+=head3 Barewords
+
+There are several types. Do special tokens such as BEGIN, INIT, and
+__END__/__DATA__ count?
+
+=head3 Functions
+
+Anything invocable. Again, probably don't want to get into references.
+
+=head3 Objects and Methods
+
+How do you discuss these without discussing references?
+
+=head2 Data Types
+
+Containers; Perl's built-in data types.
+
+=head3 Scalars
+
+Should be simple, provided that the reference part stays put. Should mention
+numerification/stringification, magical auto-increment.
+
+=head3 Arrays
+
+Make it clear that arrays and lists are very different things. Discuss
+distinction between indexes and length? Avoid $# altogether? (Don't have to
+be comprehensive, just clear.)
+
+C<$var1, $var2, $var3> -- why arrays are useful
+
+=head3 Hashes
+
+Insertion order is appropriate. So is stringification.
+
+The "variable variable name" problem.
+
+=head3 References
+
+Stringification and magical increment/decrement don't work. Brace
+disambiguation is ugly, but usually suffices.
+
+=head2 Operators
+
+The types of things you find in C<perldoc -f>.
+
+=head3 Context
+
+This is perhaps the most important insight into Perl on its own as a language,
+so it belongs in a position where everyone will read it. It might need its own
+short chapter.
+
+=head3 Operator Types
+
+Not sure what exactly this will be; perhaps infix, prefix, and built-in
+functions.
+
+=head3 Coercion
+
+Here's why there's a specific operators chapter. Working effectively with Perl
+means taking advantage of operators to enforce stringy, numeric, etc contexts
+on dynamic values.
+
+=head2 Functions
+
+=head3 Parameters
+
+Positional, named, reference. Most people use positional; mention the others.
+
+=head3 Context
+
+Not sure this belongs here, but having brought it up in the context of
+operators, it might be appropriate. Certainly return context is interesting.
+
+=head3 Scope
+
+May be more appropriate conceptually before discussing parameter handling, but
+seems to flow better here. Difference between global and lexical.
+
+=head3 Anonymous Functions
+
+Must come after references and functions. Mention typeglobs? The name doesn't
+matter to Perl. Any CV is invokable.
+
+=head3 Closures
+
+Combines scopes, lexicals, and functions. This is the time to talk about
+binding and closing over. Delayed computation. Encapsulation. Abstraction.
+
+=head2 Regular Expressions (*)
+
+This won't be a fun chapter, but it's a necessary chapter.  What's the minimum
+I can write to be effective?
+
+=head3 Basic Matching
+
+Literals, quantifiers, metacharacters, alternations, character classes.
+
+=head3 Regexp Flags
+
+/e, /x, /s, /m
+
+=head3 Compiled and Composed Regexes
+
+C<qr//>
+
+=head3 Named Captures
+
+Describe Perl 5.10 features; mention older syntax in passing.
+
+=head3 Assertions and Extended Regexes
+
+Briefly describe lookbehind and lookahead. Leave it at that level of detail.
+Mention the limitations of regular expressions? New section?
+
+=head2 Useful Operators
+
+Things you use all the time but might bite you.
+
+=head3 IO
+
+print, say, open, readline
+
+=head3 Array Ops
+
+push, pop, shift, unshift, splice
+
+=head3 List Ops
+
+map, grep, for
+
+slicing
+
+Ranges?
+
+sort
+
+List::Util
+
+=head3 Hash Operations
+
+keys, values, each (watch the iterator problem)
+
+=head3 Packages and Modules
+
+A separate chapter?
+
+use
+require
+import()
+
+=head2 Objects
+
+Perhaps start with syntactic elements and then show Mouse? Vice versa? Must
+cover packages and modules first, at least.
+
+=head3 Packages
+
+Maybe need an encapsulation and modules chapter -- BEGIN and import() plus
+exporting. What's the best way to export?
+
+=head3 Blessed References
+
+Very bare-bones object stuff. Should mention constructors.
+
+=head3 Inheritance
+
+Really not a fan.
+
+=head3 Other Forms of Code Reuse
+
+Mixins, roles.
+
+=head3 Mouse
+
+Note that this is the single place in which I want to describe something other
+than the default language behavior. Hoist this up earlier?
+
+=head3 Reflection
+
+C<isa()> and C<can()>
+
+=head2 Style
+
+This may need to merge with idioms. It also might get cut down too. There's a
+lot to say in here. I don't want to repeat Perl Best Practices, but I do want
+to show off some of the power of the language when used appropriately. Cutting
+this up between Idioms and Beautiful Features may work best.
+
+=head3 Testing
+
+Might explain this much earlier, right after syntactic features, so that people
+can use it to explore on their own (and the text can use it to show off
+examples).
+
+=head2 Making the Most of Perl / Writing Real Programs
+
+Things you must use to get the most out of Perl. Context would appear here,
+but it's so fundamental to understanding the language that it has to come much
+earlier.
+
+=head3 Perldoc and POD
+
+Documentation and using the documentation.
+
+=head3 CPAN
+
+Probably deserves its own chapter.
+
+=head3 C<strict> and C<warnings> modes, C<use 5.010>
+
+Probably will have introduced these earlier.
+
+=head3 Lexical Pragmas
+
+Swap this with the previous? Depends on flow of narrative.
+
+=head3 Code Generation
+
+Don't want to get too far afield into C<Devel::Declare> and don't want to push
+C<AUTOLOAD> too heavily, but C<BEGIN>-time manipulations are very, very
+powerful.
+
+=head3 Taint Mode
+
+General security?
+
+Exceptions?
+
+=head3 Attributes
+
+This is getting esoteric, and there are grotty bits of Attribute::Handlers I
+don't want to explain. However, they're useful.
+
+=head2 Idioms
+
+A handful of common idioms often found in well-written Perl programs.
+
+=head3 Dispatch Table
+
+"I know what action I want to take based on input I get."
+
+=head3 Iteration
+
+The difference between C<for> and C<while>.
+
+=head3 C<map>-based Transformations
+
+Thinking in lists.
+
+=head3 Or-Else/Dor-Else
+
+The orcish maneuver.
+
+=head3 Early Exit Guard with Postfix Conditional
+
+return if...
+return unless...
+
+=head2 What to Avoid
+
+Here are features of Perl I wish would go away. I can't say "The internals",
+can I? Explain how it's so painful you wish it weren't there or how it's
+almost impossible to get right.
+
+=head3 Dative Syntax
+
+Barewords plus parsing sugar = avoid.
+
+=head3 Method/Sub Equivalence
+
+The best you can do is treat them differently.
+
+=head3 Tie
+
+You may run into this. You shouldn't have to.
+
+=head3 Prototypes
+
+They don't work the way you think they do.
+
+=head3 Typeglobs and Reflection
+
+Use Mouse instead.
+
+=head2 Avoid When Possible
+
+Features that don't work quite right but you can't quite avoid.
+
+=head3 Reference Syntax
+
+Don't double $$, collapse multiple arrows, use braces copiously.
+
+=head3 Barewords
+
+You can almost always avoid them.
+
+=head3 Global Variables
+
+You only need to know a few; localize the rest.
+
+=head3 Missing Defaults
+
+No C<strict>, C<warnings> by default. Sigh. What else?
+
+=head3 Blessed Hashes
+
+Covered in OO; mention here.
+
+=head3 C<isa()> and C<can()>
+
+Maybe covered in OO?
+
+=head3 C<AUTOLOAD>
+
+In general, you can avoid this.
+
+=head3 C<UNIVERSAL>
+
+In conjunction with attributes....
+
+=head2 What's Missing
+
+Core Modules
+Date/Time
+Cookbooky Stuff
+Build.PL/Module::Build ?
8 sections/anonymous_functions.pod
@@ -0,0 +1,8 @@
+=head3 Anonymous Functions
+
+function references
+anonymous functions
+autopromotion of blocks into anonymous functions
+ - grep/map/sort
+ - sort is a weird example though, because it can take a name
+ - Test::Exception
15 sections/closures.pod
@@ -0,0 +1,15 @@
+=head3 Closures
+
+Named closures
+ - variable will not stay shared
+Anonymous closures
+
+Delayed computation with closures
+ - iterators
+ - thunks
+
+Encapsulation with closures
+ - closure/object equivalence (? too advanced ?)
+
+Abstraction with closures
+ - I have no idea what this means anymore
130 sections/context_philosophy.pod
@@ -0,0 +1,130 @@
+=head3 Context
+
+Z<context_philosophy>
+
+Like spoken languages, Perl has the notion of X<context>. The meaning of an
+idea depends on its surroundings. This may sound strange and foreign in a
+programming language -- surely predictability is more important than
+expressivity sometimes -- but you're already proficient with how Perl uses
+context. At least, you understand it in written English.
+
+Context in Perl means that certain operators have different behavior if you
+want zero, one, or many things from them. That is to say, it's possible (and
+likely) that Perl code will do something different if you say "Fetch me zero
+results; I don't care about any results" than if you say "Fetch me one result"
+or "Fetch me many results."
+
+Likewise, certain contexts make it clear that you expect a numeric value, or a
+value that's either true or false.
+
+Context can be tricky if you try to write or read Perl code as a series of
+single expressions standing apart from their environment. You may find
+yourself slapping your forehead after a long debugging session only to find
+that your assumptions about context were incorrect.
+
+However, if you're cognizant of contexts, you'll find that they often make your
+code clearer, more concise, and more flexible.
+
+=head4 Void, Scalar, and List Context
+
+One of the axes of context governs I<how many> items you expect. To express
+this context, evaluate an expression as an rvalue and see what you do with the
+results.
+
+Suppose you have a function called C<some_expensive_operation()> which performs
+an expensive calculation and can produce many, many results. If you call the
+function on its own and never do anything with its return value, this is X<void
+context>:
+
+ some_expensive_operation();
+
+Calling the function and assigning its return value to a single scalar
+evaluates the function in X<scalar context>:
+
+ my $single_result = some_expensive_operation();
+
+Assigning the results of calling the function to an array or a list, or using
+it in a list, evaluates the function in X<list context>:
+
+ my @all_results = some_expensive_operation();
+ my ($single_element) = some_expensive_operation();
+ process_list_of_results( some_expensive_operation() );
+
+The second line of the previous example may look confusing; the parentheses
+there give a hint to the compiler that although there's only a scalar, this
+assignment should occur in list context. It's equivalent to assigning to a
+scalar and a temporary array, and then throwing away the array:
+
+ my ($single_element, @rest) = some_expensive_operation();
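+
+As a minimal sketch of how a function can observe these three contexts, the
+built-in C<wantarray> operator reports the context of the current call. The
+C<context_name()> function here is hypothetical, written only for this
+illustration:
+
+ use 5.010;
+ use strict;
+ use warnings;
+
+ # hypothetical helper: names the context of its own call
+ sub context_name
+ {
+     my $want = wantarray;
+
+     return 'list'   if $want;          # true in list context
+     return 'scalar' if defined $want;  # false but defined in scalar context
+     return 'void';                     # undef in void context
+ }
+
+ my $scalar  = context_name();
+ my ($first) = context_name();
+ my @all     = context_name();
+
+ say "$scalar $first @all";             # prints "scalar list list"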
+
+Evaluating a function or expression in list context apart from assignment can
+occasionally produce confusion. Remember that argument lists and lists
+themselves -- especially lists used in hash initializers -- propagate list
+context to the expressions they contain. Thus both of these calls to
+C<some_expensive_operation()> are in list context, even if it's not immediately
+obvious:
+
+ process_list_of_results( some_expensive_operation() );
+
+ my %results =
+ (
+     cheap_operation     => $cheap_operation_results,
+     expensive_operation => some_expensive_operation(), # OOPS!
+ );
+
+The latter example probably will not do what you expect, at least if you expect
+it to return a single item. See the L<scalar_operator> operator for the
+solution.
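+
+A brief sketch of both the problem and the fix, assuming a hypothetical
+C<three_results()> function which returns an array (so that scalar context
+yields the number of elements):
+
+ use 5.010;
+ use strict;
+ use warnings;
+
+ # hypothetical stand-in for some_expensive_operation()
+ sub three_results
+ {
+     my @results = qw( a b c );
+     return @results;
+ }
+
+ # list context: the pairs are ( expensive => 'a', b => 'c' )
+ my %oops  = ( expensive => three_results() );
+
+ # scalar context: the single value is the element count
+ my %fixed = ( expensive => scalar three_results() );
+
+ say join ', ', sort keys %oops;        # prints "b, expensive"
+ say $fixed{expensive};                 # prints "3"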
+
+One easy way to think about amount context is to compare it to
+subject-verb number agreement in English. You already know how to do this,
+even if you don't remember English classes. If you can spot the error in the
+sentence "Perl are a fun language", you've trained your mind to spot similar
+types of errors.
+
+=head4 Numeric, String, and Boolean Context
+
+Another type of context determines how Perl understands a piece of data -- not
+I<how many> pieces of data you want, but what the data means. You've probably
+already noticed that Perl's flexible about figuring out if you have a number or
+a string and converting between the two as you want them. The type contexts
+help explain how it does so.
+
+Suppose you want to compare the contents of two strings. The C<eq> operator
+tells you if the strings contain the same information:
+
+ say "Catastrophic crypto fail!" if $alice eq $bob;
+
+You may have had a baffling experience where you I<know> that the strings are
+different, but they still compare the same:
+
+ my $alice = 'alice';
+ say "Catastrophic crypto fail!" if $alice == 'Bob'; # OOPS
+
+In exchange for not having to declare (or at least track) explicitly what
+I<type> of data a variable contains or a function produces, Perl offers
+specific type contexts that tell the compiler how to treat a given value at a
+specific time. That is, the C<eq> operator treats its operands as strings by
+enforcing X<string context> on them. The C<==> operator enforces numeric
+context.
+
+Perl will do its best to convert from strings to numbers and back depending on
+the operators you use; it's important to use the proper operators for the type
+of context you want.
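+
+A small sketch of the difference, in which the C<numeric> warning is disabled
+only to keep the demonstration quiet (normally that warning is the useful
+hint):
+
+ use 5.010;
+ use strict;
+ use warnings;
+
+ my $alice = 'alice';
+
+ {
+     no warnings 'numeric';
+
+     # numeric context: both strings numify to 0
+     say 'numerically equal' if $alice == 'Bob';
+ }
+
+ # string context: the characters differ
+ say 'different strings' if $alice ne 'Bob';
+
+ # the same pair of values can compare differently in each context
+ say 'same number'       if '10' == 10.0;
+ say 'different strings' if '10' ne '10.0';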
+
+There's also a X<boolean context> which occurs when you use a value in a
+conditional statement -- in the previous examples, the results of the C<eq> and
+C<==> operators were in this boolean context.
+
+In rare circumstances, you may need to force an explicit context where no
+appropriately typed operator exists. To force a numeric context, add zero to a
+variable. To force a string context, concatenate a variable with the null
+string. To force a boolean context, double the negation operator:
+
+ my $numeric_x = 0 + $x; # forces numeric context
+ my $stringy_x = '' . $x; # forces string context
+ my $boolean_x = !!$x; # forces boolean context
+
+In general, type contexts are less difficult to understand and see than the
+amount contexts. Once you realize they exist, you'll rarely make mistakes with
+them.
31 sections/cpan.pod
@@ -0,0 +1,31 @@
+=head2 The CPAN
+
+Explaining the CPAN
+Explaining how to use it?
+
+Where do I cover features of Perl that make it useful without explaining them?
+
+Modularity:
+ - packages
+ - modules
+ - objects
+
+Community standards:
+ - installability
+ - testing
+ - documentation standards
+ - CPAN accoutrements:
+ - CPAN testers
+ - RT accounts
+ - ratings
+ - history and documentation
+ - dependencies
+ - rough standards on licensing
+
+Usability:
+ - nearly configuration-free running
+ - CPAN client included in core
+ - strong core culture of use and recommendation
+
+Broad applicability:
+ - 7258 uploaders/66400 modules/17268 distributions
24 sections/implicit_ideas.pod
@@ -0,0 +1,24 @@
+=head3 Implicit Ideas
+
+Z<implicit_ideas>
+
+Like many spoken languages, Perl provides shortcuts to use where conciseness is
+appropriate. One such implicit idea is the notion of L<context>, where both
+the compiler and a programmer reading the code can understand the expected
+number of results or the type of an operation from existing information without
+adding explicit additional information to disambiguate.
+
+The best example of such features is the default variable, C<$_>. It's most
+notable in its I<absence>: many of Perl's built-in operations perform their
+work on the contents of C<$_> in the absence of an explicit other variable.
+You can still use C<$_> as the variable, but it's unnecessary.
+
+For example, the C<chomp> operator removes any trailing newline sequence from
+the given string:
+
+ my $uncle = "Bob\n";
+ say "'$uncle'";
+ chomp $uncle;
+ say "'$uncle'";
+
+Without an explicit variable, C<chomp> removes the trailing newline sequence
+from C<$_>. Thus, these two lines of code are equivalent:
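+
+A minimal sketch of that pair, assuming invented data; each statement runs
+once per array element, with C<$_> aliased to the element:
+
+ use 5.010;
+ use strict;
+ use warnings;
+
+ my @explicit = ( "Bob\n", "Joe\n" );
+ my @implicit = ( "Bob\n", "Joe\n" );
+
+ chomp $_ for @explicit;    # names the default variable
+ chomp    for @implicit;    # relies on it silently
+
+ say 'equivalent' if "@explicit" eq "@implicit";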
109 sections/names.pod
@@ -0,0 +1,109 @@
+=head3 Names
+
+Z<names>
+
+Perl X<names> or X<identifiers> follow simple rules. A variable name, a
+function name, a package name, a class name, and a filehandle name must all
+start with a letter or an underscore. They may optionally include any
+combination of letters, numbers, and underscores. These are all valid Perl
+identifiers:
+
+ my $name;
+ my @_private_names;
+ my %Names_to_Addresses;
+
+ sub anAwkwardName3;
+
+ package a_less_awkward_name;
+
+These are invalid Perl identifiers:
+
+ my $invalid name;
+ my @3;
+ my %~flags;
+
+ package a-lisp-style-name;
+
+When the C<utf8> pragma is in effect, you may use any valid UTF-8 characters in
+identifiers, provided that they still start with a letter or underscore and
+optionally contain one or more alphanumeric or underscore characters.
+
+The rules are different for X<symbolic lookups>, however. These rules only
+apply to literal names found in source code. Perl's dynamic nature makes it
+possible to refer to entities with names generated at runtime or provided as
+input to a program. This is more difficult than the straightforward approach,
+but it's also more flexible -- if slightly more dangerous. In particular,
+invoking subroutines or methods indirectly or looking up namespaced symbols
+lets you bypass Perl's parser, which is the only part of Perl that enforces
+these grammatical rules. This can produce confusing code, however.
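+
+As a hedged illustration, the following sketch installs and calls a function
+whose name the parser would reject as a literal identifier. The name
+C<a-lisp-style-name> and the returned string are invented for this example;
+symbolic references require disabling part of C<strict>:
+
+ use 5.010;
+ use strict;
+ use warnings;
+
+ my $name = 'a-lisp-style-name';
+ my $direct;
+
+ {
+     no strict 'refs';
+
+     # install a function under a name generated at runtime
+     *{ "main::$name" } = sub { return 'called indirectly' };
+
+     # look it up symbolically and invoke it
+     $direct = &{ "main::$name" }();
+ }
+
+ # a method-style lookup by name also sidesteps the identifier rules
+ my $by_lookup = main->can( $name )->();
+
+ say $direct;
+ say $by_lookup;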
+
+=head4 Variable Names and Sigils
+
+X<Variable names> always have a leading sigil which suggests the type of the
+variable. X<Scalar variables> have a leading dollar sign (C<$>) character.
+X<Array variables> have a leading at sign (C<@>) character. X<Hash variables>
+have a leading percent sign (C<%>) character:
+
+ my $scalar;
+ my @array;
+ my %hash;
+
+These sigils offer a sort of namespacing for the variables, where it's possible
+(though often confusing) to have variables of the same name but different
+types:
+
+ my ($bad_name, @bad_name, %bad_name);
+
+Perl won't get confused, but the person reading the code will.
+
+Perl 5 uses X<variant sigils>, where the sigil on a variable may change
+depending on what you do with it. For example, to access a (scalar) element of
+an array or a hash, the sigil changes to the dollar sign (C<$>) character:
+
+ my $hash_element = $hash{ $key };
+ my $array_element = $array[ $index ];
+
+Using a scalar element of an aggregate as an lvalue in an expression imposes
+L<scalar context> on the rvalue.
+
+Similarly, accessing multiple elements of a hash or an array -- an operation
+known as L<slicing> -- uses the at symbol (C<@>) as the leading sigil and
+enforces list context:
+
+ my @hash_elements = @hash{ @keys };
+ my @array_elements = @array[ @elements ];
+
+The most reliable way to determine the type of a variable -- scalar, array, or
+hash -- is to look at the operations performed on it. Scalars support all
+basic operations, such as string, numeric, and boolean manipulations. Arrays
+support indexed access through square brackets. Hashes support keyed access
+through curly brackets.
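+
+A short sketch of these sigil changes, using invented data:
+
+ use 5.010;
+ use strict;
+ use warnings;
+
+ my @array = qw( zero one two three );
+ my %hash  = ( a => 1, b => 2, c => 3 );
+
+ my $element = $array[1];            # one scalar element
+ my $value   = $hash{b};             # one scalar value
+
+ my @some    = @array[ 1, 2 ];       # array slice, list context
+ my @values  = @hash{ qw( a c ) };   # hash slice, list context
+
+ # assigning to a scalar element imposes scalar context on the right side
+ $array[0]   = @some;                # stores the element count, 2
+
+ say "$element $value @some @values $array[0]";
+ # prints "one 2 one two 1 3 2"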
+
+=head4 Package-Qualified Names
+
+Occasionally you may need to refer to functions or variables in a separate
+namespace. Often you will need to refer to a class by its X<fully-qualified
+name>. These names are collections of package names joined by double colons
+(C<::>). That is, C<My::Fine::Package> refers to a logical collection of
+variables and functions.
+
+While the standard naming rules apply to package names, by convention
+user-defined packages all start with uppercase letters. The Perl core reserves
+lowercase package names for built-in pragmas, such as C<strict> and
+C<warnings>.  Community guidelines, not Perl itself, enforce this policy.
+
+Namespaces do not nest in Perl 5.  Perl enforces no logical relationship
+between C<Some::Package> and C<Some::Package::Refinement>.  Consider choosing
+naming and code-organization schemes in which the apparent relationship
+between the names of classes or namespaces reflects a real relationship in the
+code.
+
+=begin footnote
+
+This isn't I<entirely> true. When Perl looks up a symbol in
+C<Some::Package::Refinement>, it looks in the C<main::> symbol table for a
+symbol representing the C<Some::> namespace, then in there for the C<Package::>
+namespace, and so on.  This is merely a storage mechanism, however.  It has no
+further implications for the relationships between parent and child or sibling
+packages.
+
+=end footnote
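+
+A minimal sketch of that lack of nesting, with invented package contents:
+
+ use 5.010;
+ use strict;
+ use warnings;
+
+ package Some::Package;
+
+ sub hello { return 'hello' }
+
+ package Some::Package::Refinement;
+
+ sub goodbye { return 'goodbye' }
+
+ package main;
+
+ # the similar names create no relationship; nothing is inherited
+ my $outer = Some::Package->can( 'hello' )             ? 'found' : 'missing';
+ my $inner = Some::Package::Refinement->can( 'hello' ) ? 'found' : 'missing';
+
+ say "$outer $inner";                # prints "found missing"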
23 sections/perl_community.pod
@@ -0,0 +1,23 @@
+=head2 The Perl Community
+
+perl.org
+ - lists
+ - beginners
+ - project pages
+ - hosting
+ - dev.perl.org
+
+cpan.org
+ - see the CPAN section
+
+perlmonks.org
+ - a community of Perl programmers, centered around answering questions
+
+use.perl.org
+perlsphere.net
+planet.perl.org
+perlbuzz.com
+
+development
+ - rt.perl.org
+ - p5p mailing list
65 sections/perldoc.pod
@@ -0,0 +1,65 @@
+=head3 Perldoc
+
+Z<perldoc>
+
+One of Perl's most useful and least appreciated features is the C<perldoc>
+utility. This program is part of every full Perl 5 installation. It displays
+the documentation of every Perl module installed on the system -- whether a
+core module or one installed from the CPAN -- as well as thousands of pages of
+Perl's copious core documentation.
+
+=begin notetip
+
+If you prefer an online version, U<http://perldoc.perl.org/> hosts recent
+versions of the Perl documentation.
+
+=end notetip
+
+The default behavior of C<perldoc> is to display the documentation for a named
+module or a specific section of the core documentation:
+
+ $ B<perldoc Modern::Perl>
+ $ B<perldoc perlfaq>
+
+The first command-line example extracts documentation written for the
+C<Modern::Perl> module and displays it in a form appropriate for your screen.
+Per long-held community guidelines, CPAN modules tend to have documentation
+organized along similar lines, with a description of the module, sample uses,
+then more detailed explanations of how to use the code in your own projects.
+While the amount of documentation varies by author, the form of the
+documentation is remarkably consistent.
+
+The second command-line example displays a pure documentation file, the table
+of contents to the Frequently Asked Questions about Perl documents. Browsing
+this file will help you understand what Perl is capable of and how to solve
+common problems in Perl.
+
+=begin notetip
+
+Similarly, C<perldoc perltoc> will display the table of contents for I<all> of
+the Perl documentation. Browse that file too.
+
+=end notetip
+
+The C<perldoc> utility has many more abilities (see C<perldoc perldoc>). Two
+of the most useful are the C<-q> and the C<-f> flags. The C<-q> flag takes a
+keyword or keywords and searches only the Perl FAQ, displaying all results.
+Thus C<perldoc -q sort> returns three questions: I<How do I sort an array by
+(anything)?>, I<How do I sort a hash (optionally by value instead of key)?>,
+and I<How can I always keep my hash sorted?>.
+
+The C<-f> flag displays the core documentation for a built-in Perl function.
+C<perldoc -f sort> explains the behavior of the C<sort> operator. If you don't
+know the name of the function you want, use C<perldoc perlfunc> to see a list
+of functions.
+
+=begin notetip
+
+C<perldoc perlop> and C<perldoc perlsyn> document Perl's symbolic operators and
+syntactic constructs; C<perldoc perldiag> explains what Perl's warning messages
+mean.
+
+=end notetip
+
+Perl's built-in documentation system is L<POD>, and you can and should use that
+to document your own code, so that C<perldoc> works with it as well as it does
+with the core documentation.
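+
+As a minimal sketch of what such documentation might look like, here is a
+hypothetical C<My::Greeter> module (the name and its function are invented for
+this example) which mixes POD with code. In a real file, the POD directives
+must begin in the first column rather than being indented as they are here:
+
+ package My::Greeter;
+
+ use strict;
+ use warnings;
+
+ =head1 NAME
+
+ My::Greeter - produce a friendly greeting
+
+ =head1 SYNOPSIS
+
+     use My::Greeter;
+     print My::Greeter::greet( 'world' );
+
+ =head2 greet( $name )
+
+ Returns a greeting for C<$name>.
+
+ =cut
+
+ # everything between =head1 and =cut is documentation, skipped by perl
+ sub greet
+ {
+     my $name = shift;
+     return "Hello, $name!\n";
+ }
+
+ 1;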
518 sections/regular_expressions.pod
@@ -0,0 +1,518 @@
+=head2 Regular Expressions
+
+Perl's powerful ability to manipulate text comes in part from its inclusion of
+a computing concept known as X<regular expressions>. A regular expression (or
+X<regexp>) is a I<pattern> which describes characteristics of a string of text.
+A X<regular expression engine> interprets a pattern, applying it to strings of
+text to identify those which match.
+
+The C<perlre> documentation describes Perl regular expressions in copious
+detail, including advanced features not covered here.
+
+=head3 Literals
+
+The simplest regexps are simple substring patterns:
+
+ my $name = 'Chatfield';
+ say "Found a hat!" if $name =~ /hat/;
+
+In this snippet of code, the regular expression is C<hat>. Formally, this
+means "the C<h> character, followed by the C<a> character, followed by the C<t>
+character, appearing anywhere in the string." The forward slashes delineate
+the regexp (think of them as a form of quoting operator) and the C<=~> operator
+applies the regexp on its right to the string or expression on the left. In
+scalar context, this operator evaluates to a true value if the match succeeded.
+A negated form of this operator exists: C<!~>.
+
+=head3 The C<qr//> Operator and Regexp Combinations
+
+Regexps are first-class entities in modern Perl; you can build them with the
+C<qr//> operator:
+
+ my $hat = qr/hat/;
+ say 'Found a hat!' if $name =~ /$hat/;
+
+... and combine them into larger and more complex patterns:
+
+ my $hat = qr/hat/;
+ my $field = qr/field/;
+
+ say 'Found a hat in a field!' if $name =~ /$hat$field/;
+ # or
+ like( $name, qr/$hat$field/, 'Found a hat in a field!' );
+
+=begin note
+
+The C<like()> function from C<Test::More> works much like C<is()>, except that
+its second argument is a regular expression object produced by C<qr//>.
+
+=end note
+
+This is more useful as regexps grow more complex.
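+
+A self-contained sketch of such a combination as a test file; the composed
+C<$hat_in_field> pattern, the extra strings, and the test descriptions are
+assumptions added for illustration:
+
+ use strict;
+ use warnings;
+ use Test::More tests => 3;
+
+ my $name  = 'Chatfield';
+ my $hat   = qr/hat/;
+ my $field = qr/field/;
+
+ # build a larger pattern from smaller compiled pieces
+ my $hat_in_field = qr/$hat$field/;
+
+ like(   $name,       $hat_in_field, 'found a hat in a field' );
+ like(   'hatfield',  $hat_in_field, 'lowercase name matches too' );
+ unlike( 'hat field', $hat_in_field, 'an intervening space prevents a match' );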
+
+=head3 Quantifiers
+
+Regexp literals have little value over the use of the C<index> operator, yet
+there is some benefit to clarity.  The use of the binding operators (C<=~> and
+C<!~>) indicates that the purpose is to apply a pattern to a string.  Regular
+expressions get more powerful through the use of X<regexp quantifiers>, which
+allow you to specify how often a regexp component may appear in a matching
+string. The simplest quantifier is the X<zero or one quantifier>, or C<?>:
+
+ my $cat_or_ct = qr/ca?t/;
+
+ like( 'cat', $cat_or_ct, "'cat' matches /ca?t/" );
+ like( 'ct', $cat_or_ct, "'ct' matches /ca?t/" );
+
+Any atom in a regular expression followed by the C<?> character means "match
+zero or one of this atom." Thus this regular expression matches if there are
+no C<a> characters immediately following a C<c> character and immediately
+preceding a C<t> character, and it matches if there is one and only one C<a>
+character between the C<c> and C<t> characters.
+
+The X<one or more quantifier>, or C<+>, matches only if there is at least one
+of the preceding atom in the appropriate place in the string to match:
+
+ my $one_or_more_a = qr/ca+t/;
+
+ like( 'cat', $one_or_more_a, "'cat' matches /ca+t/" );
+ like( 'caat', $one_or_more_a, "'caat' matches /ca+t/" );
+ like( 'caaat', $one_or_more_a, "'caaat' matches /ca+t/" );
+ like( 'caaaat', $one_or_more_a, "'caaaat' matches /ca+t/" );
+
+ unlike( 'ct', $one_or_more_a, "'ct' does not match /ca+t/" );
+
+There is no theoretical limit to the number of quantified atoms which can
+match.
+
+The X<zero or more quantifier> is C<*>; it matches if there are zero or more
+instances of the quantified atom in the string to match:
+
+ my $zero_or_more_a = qr/ca*t/;
+
+ like( 'cat', $zero_or_more_a, "'cat' matches /ca*t/" );
+ like( 'caat', $zero_or_more_a, "'caat' matches /ca*t/" );
+ like( 'caaat', $zero_or_more_a, "'caaat' matches /ca*t/" );
+ like( 'caaaat', $zero_or_more_a, "'caaaat' matches /ca*t/" );
+ like( 'ct', $zero_or_more_a, "'ct' matches /ca*t/" );
+
+This may seem useless, but it combines nicely with other regexp features to
+indicate that you don't care about what may or may not be in that particular
+position in the string to match.
+
+Finally, you can specify the number of times an atom may match with X<numeric
+quantifiers>.  C<{n}> means that a match must occur exactly I<n> times:
+
+ # equivalent to qr/cat/;
+ my $only_one_a = qr/ca{1}t/;
+
+ like( 'cat', $only_one_a, "'cat' matches /ca{1}t/" );
+
+C<{n,}> means that a match must occur at least I<n> times, but may occur more
+times:
+
+ # equivalent to qr/ca+t/;
+ my $at_least_one_a = qr/ca{1,}t/;
+
+ like( 'cat', $at_least_one_a, "'cat' matches /ca{1,}t/" );
+ like( 'caat', $at_least_one_a, "'caat' matches /ca{1,}t/" );
+ like( 'caaat', $at_least_one_a, "'caaat' matches /ca{1,}t/" );
+ like( 'caaaat', $at_least_one_a, "'caaaat' matches /ca{1,}t/" );
+
+C<{n,m}> means that a match must occur at least I<n> times and cannot occur
+more than I<m> times:
+
+ my $one_to_three_a = qr/ca{1,3}t/;
+
+ like( 'cat', $one_to_three_a, "'cat' matches /ca{1,3}t/" );
+ like( 'caat', $one_to_three_a, "'caat' matches /ca{1,3}t/" );
+ like( 'caaat', $one_to_three_a, "'caaat' matches /ca{1,3}t/" );
+ unlike( 'caaaat', $one_to_three_a, "'caaaat' does not match /ca{1,3}t/" );
+
+=head3 Metacharacters
+
+Regular expressions get more powerful as individual regexp atoms get more
+general. For example, the C<.> character in a regular expression means "match
+any character except a newline". If you wanted to search a list of dictionary
+words for every word which might match 7 Down ("Rich soil") in a crossword
+puzzle, you might write:
+
+ for my $word (@words)
+ {
+ next unless $word =~ /l..m/;
+ say "Possibility: $word";
+ }
+
+Of course, if your list of potential matches were anything other than a list of
+words, this metacharacter could cause false positives, as it also matches
+punctuation characters, whitespace, numbers, and many other characters besides
+word characters. A better option is to use the C<\w> metacharacter, which
+represents all alphanumeric characters and the underscore:
+
+ next unless $word =~ /l\w\wm/;
+
+Use the C<\d> metacharacter to match digits:
+
+ # not a robust phone number matcher
+ next unless $potential_phone_number =~ /\d{3}-\d{3}-\d{4}/;
+ say "I have your number: $potential_phone_number";
+
+Use the C<\s> metacharacter to match whitespace, whether a literal space, a tab
+character, a carriage return, a form-feed, or a newline:
+
+ my $two_three_letter_words = qr/\w{3}\s\w{3}/;
+
+These three metacharacters have negated forms. To match any character
+I<except> a word character, use C<\W>. To match a non-digit character, use
+C<\D>. These are somewhat rare in practice, but they can be very expressive.
+The non-space metacharacter, C<\S>, is more common.
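+
+A minimal sketch of the negated forms in use. The sample strings are made up
+for illustration:
+

```perl
use strict;
use warnings;

# Hypothetical input line with mixed whitespace between the fields.
my $line = "order 66:  ship\tnow";

# \S+ grabs each run of non-whitespace characters.
my @fields = $line =~ /\S+/g;

# \D finds anything that is not a digit, such as the dash.
my $has_non_digit = '555-1212' =~ /\D/ ? 1 : 0;

# \W finds anything that is not a word character; 'loam' has none.
my $has_non_word = 'loam' =~ /\W/ ? 1 : 0;

print join( '|', @fields ), "\n";
print "$has_non_digit $has_non_word\n";
```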
+
+If the range of allowed characters in these four groups isn't specific enough,
+you can specify your own X<character classes> by enclosing them in square
+brackets:
+
+ my $vowels = qr/[aeiou]/;
+ my $maybe_cat = qr/c${vowels}t/;
+
+=begin notetip
+
+The curly braces around the name of the scalar variable C<$vowels> are a hint to
+the parser to disambiguate the name. Without that, Perl would interpret the
+variable name as C<$vowelst>, which causes a compile-time error about an
+unknown variable.
+
+=end notetip
+
+If the characters in your character set form a contiguous range, you can use
+the hyphen character (C<->) as a shortcut to express that range.
+
+ my $letters_only = qr/[a-zA-Z]/;
+
+Move the hyphen to the start of the class to include it:
+
+ my $interesting_punctuation = qr/[-!?]/;
+
+Just as the word and digit class metacharacters (C<\w> and C<\d>) have
+negations, so you can negate a character class. Use the caret (C<^>) as the
+first element of the character class to mean "anything I<except> these
+characters":
+
+ my $not_a_vowel = qr/[^aeiou]/;
+
+=begin notetip
+
+As you might expect, use a caret anywhere but this position to make it a member
+of the character class. A dash in a negated character class must I<follow> the
+negating caret.
+
+=end notetip
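+
+The placement rules for the caret and the dash can be sketched as follows. The
+variable names here are hypothetical:
+

```perl
use strict;
use warnings;

# Negated class: the dash directly after the caret is a literal dash.
my $not_vowel_or_dash = qr/[^-aeiou]/;

# A caret anywhere but the first position is a literal caret.
my $caret_or_vowel = qr/[aeiou^]/;

my @results = (
    ( 'x' =~ $not_vowel_or_dash ? 1 : 0 ),   # consonant is allowed
    ( '-' =~ $not_vowel_or_dash ? 1 : 0 ),   # dash is excluded
    ( 'e' =~ $not_vowel_or_dash ? 1 : 0 ),   # vowel is excluded
    ( '^' =~ $caret_or_vowel    ? 1 : 0 ),   # literal caret is a member
    ( 'x' =~ $caret_or_vowel    ? 1 : 0 ),   # consonant is not a member
);

print "@results\n";
```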
+
+=head3 Greediness
+
+The C<+> and C<*> quantifiers by themselves are X<greedy quantifiers>; they
+match as many times as possible. This is particularly pernicious when using
+the tempting-but-troublesome "match any amount of anything" pattern C<.*>:
+
+ # a poor regexp
+ my $hot_meal = qr/hot.*meal/;
+
+ say 'Found a hot meal!' if 'I have a hot meal' =~ $hot_meal;
+ say 'Found a hot meal!'
+ if 'I did some one-shot, piecemeal work!' =~ $hot_meal;
+
+That's bad enough, but the problem is more obvious when you expect to match a
+short portion of a string. Greediness always tries to match as much of the
+input string as possible I<first>, backing off only when it's obvious that the
+match will not succeed. Thus you may not be able to fit all of the results
+into the four boxes in 7 Down if you go looking for "loam" with:
+
+ my $seven_down = qr/l${letters_only}*m/;
+
+You'll get C<Alabama>, C<Belgium>, and C<Bethlehem> for starters. The soil
+might be nice there, but they're all too long -- and the matches start in the
+middle of the words.
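+
+To see exactly which portion of each word the greedy pattern grabs, this
+sketch wraps the pattern in capturing parentheses (explained in the Capturing
+section) and records the matched text. The word list is an assumption chosen
+to mirror the examples above:
+

```perl
use strict;
use warnings;

my $letters_only = qr/[a-zA-Z]/;
my $seven_down   = qr/l${letters_only}*m/;

my %matched;
for my $word (qw( loam Alabama Belgium Bethlehem )) {
    # The parentheses capture whatever the whole pattern matched.
    $matched{$word} = $1 if $word =~ /($seven_down)/;
}

print "$_: $matched{$_}\n" for sort keys %matched;
```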
+
+X<Regexp anchors> force a match at a specific position in a string. The
+X<start of string anchor>, or C<\A>, ensures that any match will start at the
+beginning of the string:
+
+ # also matches "lammed", "lawmaker", and "layman"
+ my $seven_down = qr/\Al${letters_only}{2}m/;
+
+Similarly, the X<end of string anchor>, or C<\Z>, ensures that any match will
+I<end> at the end of the string.
+
+ # also matches "loom", which is close enough
+ my $seven_down = qr/\Al${letters_only}{2}m\Z/;
+
+If you're not fortunate enough to have a Unix word dictionary file available,
+the X<word boundary metacharacter>, or C<\b>, matches only at the boundary
+between a word character (C<\w>) and a non-word character (C<\W>):
+
+ my $seven_down = qr/\bl${letters_only}{2}m\b/;
+
+=begin notetip
+
+As with Perl itself, there's more than one way to write a regular expression. Consider
+choosing the most expressive and maintainable one.
+
+=end notetip
+
+Sometimes you can't anchor a regular expression. In those cases, you can turn
+a greedy quantifier into a parsimonious quantifier by appending the C<?>
+quantifier:
+
+ my $minimal_greedy_match = qr/hot.*?meal/;
+
+In this case, the regular expression engine will prefer the I<shortest>
+possible potential match, increasing the number of characters identified by the
+C<.*?> token combination only if the current number fails to match. Because
+C<*> matches zero or more times, the minimal potential match for this token
+combination is zero characters:
+
+ say 'Found a hot meal' if 'ilikeahotmeal' =~ /$minimal_greedy_match/;
+
+If this isn't what you want, use the C<+> quantifier to match one or more
+items:
+
+ my $minimal_greedy_at_least_one = qr/hot.+?meal/;
+
+ unlike( 'ilikeahotmeal', $minimal_greedy_at_least_one );
+
+ like( 'i like a hot meal', $minimal_greedy_at_least_one );
+
+The C<?> quantifier modifier also applies to the C<?> (zero or one matches)
+quantifier as well as the range quantifiers. In this case, it causes the
+regexp to match as few times as possible.
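+
+A small sketch comparing greedy and non-greedy forms of the range and
+zero-or-one quantifiers. It uses capturing parentheses (see the Capturing
+section) to show what each pattern matched; the sample string is arbitrary:
+

```perl
use strict;
use warnings;

my $string = 'caaat';

my ($greedy_range)    = $string =~ /(ca{1,3})/;    # takes as many as allowed
my ($minimal_range)   = $string =~ /(ca{1,3}?)/;   # takes as few as allowed
my ($greedy_option)   = $string =~ /(ca?)/;        # prefers one 'a'
my ($minimal_option)  = $string =~ /(ca??)/;       # prefers zero 'a's

print "$greedy_range $minimal_range $greedy_option $minimal_option\n";
```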
+
+In general, the greedy patterns C<.+> and C<.*> are tempting but dangerous
+tools. For simple programs which need little maintenance, they may be quick
+and easy to write, but non-greedy matching seems to match human expectations
+better. If you find yourself writing a lot of regular expressions with greedy
+matches, test them thoroughly with a comprehensive and automated test suite
+with representative data to lessen the possibility of unpleasant surprises.
+
+=head3 Capturing
+
+It's often useful to match part of a string and use it later; perhaps you want
+to extract an address or an American telephone number from a string:
+
+ my $area_code = qr/\(\d{3}\)/;
+ my $local_number = qr/\d{3}-?\d{4}/;
+ my $phone_number = qr/$area_code\s?$local_number/;
+
+=begin notetip
+
+The parentheses in C<$area_code> need preceding backslashes to escape them for
+reasons which will become obvious in a moment.
+
+=end notetip
+
+Given a string, C<$contact_info>, which contains contact information, you can
+apply the C<$phone_number> regular expression and X<capture> any matches into a
+variable with X<named captures>:
+
+ if ($contact_info =~ /(?<phone>$phone_number)/)
+ {
+ say "Found a number $+{phone}";
+ }
+
+That construct can look like a big wad of punctuation, but it's fairly simple
+when you can recognize it in one chunk:
+
+ (?<capture name> ... )
+
+The parentheses enclose the entire capture. The C<< ?< name > >> construct
+must follow the left parenthesis. It provides a name for the capture buffer.
+The rest of the construct within the parentheses is a regular expression. If
+and when the regexp matches this fragment, Perl stores the captured portion of
+the string in the special variable C<%+>: a hash where the key is the name of
+the capture buffer and the value is the portion of the string which matched the
+buffer's regexp.
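+
+As a sketch of how C<%+> fills in when a pattern holds more than one named
+capture, consider the following. The contact string and the capture names
+C<area> and C<local> are invented for illustration:
+

```perl
use strict;
use warnings;

# Hypothetical contact string; the number is made up.
my $contact_info = 'Call the office at (503) 555-1212 after noon';

my %found;
if ($contact_info =~ /\((?<area>\d{3})\)\s?(?<local>\d{3}-?\d{4})/) {
    # Copy %+ right away; the next successful match overwrites it.
    %found = %+;
}

print "area: $found{area}, local: $found{local}\n";
```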
+
+Parentheses are special to Perl 5 regular expressions; by default they perform
+the same grouping behavior as parentheses do in regular Perl code. They also
+enclose captures. To use literal parentheses in a regular expression, you must
+preface them with a backslash, just as in the C<$area_code> variable.
+
+Named captures are new in Perl 5.10, but captures have existed in Perl for many
+years. You may encounter X<anonymous captures> as well:
+
+ if ($contact_info =~ /($phone_number)/)
+ {
+ say "Found a number $1";
+ }
+
+In this code, the parentheses enclose the fragment to capture, but there is no
+regexp directive giving the I<name> of the capture. In this case, Perl stores
+the captured substring in a series of magic variables starting with C<$1> and
+continuing for as many capture groups as are present in the regexp. The I<first>
+matching capture that Perl finds goes into C<$1>, the second into C<$2>, and so
+on.
+
+While the syntax for named captures is longer than for anonymous captures, they
+provide additional clarity. You do not have to count the number of opening
+parentheses to figure out whether a particular capture is C<$4> or C<$5>, and
+composing regexps from smaller regexps is much easier, as they're less
+sensitive to changes in position or the presence or absence of capturing in
+individual atoms.
+
+=begin notetip
+
+Name collisions are still possible with named captures, though that's less
+frequent than number collisions with anonymous captures. Consider avoiding the
+use of captures of any kind in regexp fragments; save them for top-level regexps
+themselves.
+
+=end notetip
+
+=head3 Grouping and Alternation
+
+Previous examples have all applied quantifiers to simple atoms. They can also
+apply to more complex subpatterns as a whole:
+
+ my $pork = qr/pork/;
+ my $beans = qr/beans/;
+
+ like( 'pork and beans', qr/\A$pork?.*?$beans/,
+ 'maybe pork, definitely beans' );
+
+If you expand the regexp manually, the results may surprise you:
+
+ like( 'pork and beans', qr/\Apork?.*?beans/,
+ 'maybe pork, definitely beans' );
+
+This still matches, but consider a more specific pattern:
+
+ my $pork = qr/pork/;
+ my $and = qr/and/;
+ my $beans = qr/beans/;
+
+ like( 'pork and beans', qr/\A$pork? $and? $beans/,
+ 'maybe pork, maybe and, definitely beans' );
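+
+The difference between an interpolated fragment and its manual expansion shows
+up when the optional part is absent. In this sketch, which assumes a test
+string of just C<beans>, the quantifier applies to the whole interpolated
+C<$pork> fragment, but only to the letter C<k> in the hand-expanded pattern:
+

```perl
use strict;
use warnings;

my $pork = qr/pork/;

# Interpolating a qr// object keeps it grouped, so ? covers all of 'pork'.
my $interpolated = 'beans' =~ /\A$pork?beans/ ? 1 : 0;

# Expanded by hand, ? covers only the 'k'; 'por' is still required.
my $expanded = 'beans' =~ /\Apork?beans/ ? 1 : 0;

# An explicit non-capturing group restores the grouping.
my $grouped = 'beans' =~ /\A(?:pork)?beans/ ? 1 : 0;

print "$interpolated $expanded $grouped\n";
```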
+
+It can be useful to express a regexp in terms of "this or that". This is the
+purpose of the X<alternation> metacharacter, C<|>.
+
+ my $rice = qr/rice/;
+ my $beans = qr/beans/;
+
+ like( 'rice', qr/$rice|$beans/, 'Found some rice' );
+ like( 'beans', qr/$rice|$beans/, 'Found some beans' );
+
+The alternation metacharacter indicates that either adjacent fragment may
+match. Be careful about what you interpret as a regexp fragment, however:
+
+ like( 'rice', qr/rice|beans/, 'Found some rice' );
+ like( 'beans', qr/rice|beans/, 'Found some beans' );
+ unlike( 'ricb', qr/rice|beans/, 'Found no weird hybrid' );
+
+It's possible to interpret the pattern C<rice|beans> as meaning C<ric>,
+followed by either C<e> or C<b>, followed by C<eans> -- but that's incorrect.
+Alternations always include the entire fragment to the nearest regexp
+delimiter, whether the start or end of the pattern, an enclosing parenthesis,
+another alternation character, or a square bracket. In the interest of
+reducing confusion, consider using named fragments in variables
+(C<$rice|$beans>) or grouping alternation candidates in X<non-capturing
+groups>:
+
+ my $starches = qr/(?:pasta|potatoes|rice)/;
+
+The C<(?:)> sequence is like capturing parentheses, except that it doesn't
+capture. It only performs grouping. Thus it doesn't interfere with counting
+for positional captures.
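+
+A brief sketch of the effect on capture numbering. The sample string and
+variable names are assumptions:
+

```perl
use strict;
use warnings;

my $meal = 'potatoes and gravy';

# Non-capturing group: the only capture is the word after 'and', so it is $1.
my ($side) = $meal =~ /\A(?:pasta|potatoes|rice) and (\w+)/;

# Capturing group: the starch takes $1 and pushes the other word to $2.
my ($starch, $topping) = $meal =~ /\A(pasta|potatoes|rice) and (\w+)/;

print "$side / $starch, $topping\n";
```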
+
+=begin notetip
+
+If you print a compiled regular expression, you'll see that its stringification
+includes an enclosing non-capturing group; C<qr/rice|beans/> stringifies as
+C<(?-xism:rice|beans)>.
+
+=end notetip
+
+Note also that you can have more than two alternatives in a pattern.
+
+=head3 Other Escape Sequences
+
+Perl interprets several characters in regular expressions as X<metacharacters>.
+They mean something different than literal characters. For example, square
+brackets always denote a character class and parentheses group and optionally
+capture pattern fragments.
+
+If you want to refer to a I<literal> instance of a metacharacter, you must
+X<escape> it with a backslash (C<\>). Thus C<\(> refers to a single left
+parenthesis and C<\]> refers to a single right square bracket. C<\.> refers to
+a literal period character instead of the "match anything but an explicit
+newline character" atom.
+
+Other useful metacharacters that often need escaping are the pipe character
+(C<|>) and the dollar sign (C<$>). Don't forget about the quantifiers either:
+C<*>, C<+>, and C<?> also qualify.
+
+If this all makes you think your patterns will be full of backslashes, get familiar
+with the X<metacharacter disabling characters>. The C<\Q> metacharacter
+disables metacharacter processing until it reaches the C<\E> sequence. This is
+especially useful when taking match text from a source you don't control when
+writing the program:
+
+ my ($text, $literal_text) = @_;
+
+ return $text =~ /\Q$literal_text\E/;
+
+In this case, the C<$literal_text> argument can contain anything -- the string
+C<** ALERT **>, for example -- and Perl will not interpret the zero-or-more
+quantifier as a quantifier. Instead, it will escape every non-word character,
+treat the pattern as C<\*\*\ ALERT\ \*\*>, and attempt to match literal
+asterisk characters.
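+
+The following sketch wraps the snippet above in a subroutine and compares it
+with an unquoted match. The name C<contains_literal> and the sample strings are
+hypothetical; C<quotemeta> is the builtin function that performs the same
+escaping as C<\Q>:
+

```perl
use strict;
use warnings;

# Hypothetical wrapper around the \Q...\E match.
sub contains_literal {
    my ($text, $literal_text) = @_;
    return $text =~ /\Q$literal_text\E/ ? 1 : 0;
}

# The asterisks match literally.
my $found = contains_literal( '** ALERT ** disk full', '** ALERT **' );

# With quoting, 'a*' means a literal 'a' followed by a literal '*'.
my $quoted = contains_literal( 'aaa ALERT', 'a* ALERT' );

# Without quoting, 'a*' is a quantified atom and matches 'aaa'.
my $unquoted = 'aaa ALERT' =~ /a* ALERT/ ? 1 : 0;

# quotemeta() shows the escaped form of the string.
my $escaped = quotemeta('** ALERT **');

print "$found $quoted $unquoted $escaped\n";
```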
+
+=begin note
+
+Be careful about processing regular expressions from untrusted user input,
+however. It's possible to craft a malicious regular expression which can
+perform an effective denial-of-service attack against your program.
+
+=end note
+
+=head3 Assertions
+
+The regexp anchors (C<\A> and C<\Z>) are a form of X<regexp assertion>, which
+requires that a condition is present but doesn't actually match a character in
+the string. That is, the regexp C<qr/\A/> will I<always> match, no matter what
+the string contains. The metacharacters C<\b> and C<\B> are also assertions.
+
+X<Zero-width assertions> match a I<pattern>, not just a condition in the
+string. Most importantly, they do not consume the portion of the string that
+they match. For example, to find a cat on its own, you might use a word
+boundary assertion:
+
+ my $just_a_cat = qr/cat\b/;
+
+... but if you want to find a non-disastrous feline, you might use a
+X<zero-width negative look-ahead assertion>:
+
+ my $safe_feline = qr/cat(?!astrophe)/;
+
+This construct, C<(?!...)>, matches the phrase C<cat> only if the phrase
+C<astrophe> does not immediately follow.
+
+There's also a X<zero-width positive look-ahead assertion>:
+
+ my $disastrous_feline = qr/cat(?=astrophe)/;
+
+... which matches the phrase C<cat> only if the phrase C<astrophe> immediately
+follows. This may seem useless, as a normal regular expression can accomplish
+the same thing, but consider if you want to find all non-catastrophic words in
+the dictionary which start with C<cat>. One possibility is:
+
+ my $safe_feline = qr/cat(?!astrophe)/;
+
+ while (<$words>)
+ {
+ chomp;
+ next unless /\A(?<some_cat>$safe_feline.*)\Z/;
+ say "Found a non-catastrophe '$+{some_cat}'";
+ }
+
+Note the curious behavior of the construct. Because the assertion is
+zero-width, it consumes none of the source string. Thus the anchored C<.*\Z>
+pattern fragment must be present; otherwise the capture would only capture the
+C<cat> portion of the source string.
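+
+A self-contained sketch of the same idea, using a small in-memory word list
+(an assumption standing in for the dictionary filehandle) to show both what the
+negative look-ahead filters out and what the capture holds with and without
+the trailing C<.*>:
+

```perl
use strict;
use warnings;

my $safe_feline = qr/cat(?!astrophe)/;

# Hypothetical stand-in for the dictionary file.
my @words = qw( cat catalog catastrophe category );

# Keep only words starting with 'cat' not followed by 'astrophe'.
my @safe = grep { /\A$safe_feline.*\Z/ } @words;

# With .* inside the capture, the whole word is captured.
my ($with_tail) = 'catalog' =~ /\A($safe_feline.*)\Z/;

# Without it, the zero-width assertion adds nothing to the capture.
my ($without_tail) = 'catalog' =~ /\A($safe_feline)/;

# The positive look-ahead also consumes nothing beyond 'cat'.
my ($peeked) = 'catastrophe' =~ /(cat(?=astrophe))/;

print "@safe\n$with_tail $without_tail $peeked\n";
```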
29 sections/scoping.pod
@@ -0,0 +1,29 @@
+=head3 Scope
+
+lexical scope
+ - my
+ - most important scoping
+ - not in symbol table; resolved at compile time
+ - allows closures (whether you intend them or not)
+
+dynamic scope
+ - local/our
+ - relies on global variables
+ - visible from outside namespace/lexical scope
+ - our subtly broken
+
+state scoping
+ - state
+ - lexical scoping but retains value after initialized
+ - interaction with closures?
+
+implicit lexicalization
+ - my $_
+ - what is the scope of C< for my $i ( ... ) >?
+ - which constructs localize $_ and which do not?
+
+necessary localization
+ - $/, $", other globals
+ - $!, $@
+ - sometimes $_
+ - an exception: $| (? - I think so)
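+
+A tentative sketch of the first three kinds of scope in the outline above. All
+of the names (C<$separator>, C<join_fields>, C<with_tabs>, C<next_id>, and
+C<make_counter>) are invented for illustration, and C<state> assumes Perl 5.10
+or newer:
+

```perl
use strict;
use warnings;
use feature 'state';

our $separator = ',';    # package global

sub join_fields { return join $separator, @_ }

sub with_tabs {
    # Dynamic scope: callees see the new value until this sub returns.
    local $separator = "\t";
    return join_fields(@_);
}

sub next_id {
    # State variable: initialized once, keeps its value between calls.
    state $id = 0;
    return ++$id;
}

sub make_counter {
    # Lexical scope: each closure captures its own $count.
    my $count = shift;
    return sub { return $count++ };
}

my $tabbed  = with_tabs( 'a', 'b' );
my $comma   = join_fields( 'a', 'b' );    # global value is restored
my @ids     = ( next_id(), next_id() );
my $counter = make_counter(5);
my @counts  = ( $counter->(), $counter->() );

print "$comma @ids @counts\n";
```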
5 sections/values.pod
@@ -0,0 +1,5 @@
+=head3 Values
+
+Z<values>
+
+