References

Perl usually does what you expect, even if what you expect is subtle. Consider what happens when you pass values to functions:
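A minimal sketch of such a call, assuming a hypothetical function named reverse_name (only $name and the string Chuck come from the text):

```perl
use strict;
use warnings;

# reverse_name() is a hypothetical name chosen for this sketch.
sub reverse_name {
    my $name = shift;            # a copy of the caller's value
    $name    = reverse $name;    # scalar context: reverses the string
    return $name;
}

my $name     = 'Chuck';
my $reversed = reverse_name( $name );

print "$name\n";        # Chuck
print "$reversed\n";    # kcuhC
```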

You probably expect that, outside of the function, $name contains Chuck, even though the value passed into the function gets reversed into kcuhC--and that's what happens. The $name outside the function is a separate scalar from the $name inside the function, and each one has a distinct copy of the string. Modifying one has no effect on the other.

This is useful and desirable default behavior. If you had to make explicit copies of every value before you did anything to them which could possibly cause changes, you'd write lots of extra, unnecessary code to defend against well-meaning but incorrect modifications.

Other times it's useful to modify a value in place. If you have a hash full of data that you want to pass to a function to update or to delete a key/value pair, creating and returning a new hash for each change could be troublesome (to say nothing of inefficient).

Perl 5 provides a mechanism by which you can refer to a value without making a copy of that value. Any changes made through that reference will update the value in place, such that all references to that value will see the new value. A reference is a first-class, built-in scalar data type in Perl 5. It's not a string, an array, or a hash. It's a scalar which refers to another first-class data type.

Scalar References

The reference operator is the backslash (\). In scalar context, it creates a single reference which refers to another value. In list context, it creates a list of references. Thus you can take a reference to $name from the previous example:
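A short sketch of both contexts; the names $name_ref, $x, $y, and @refs are illustrative:

```perl
use strict;
use warnings;

my $name     = 'Chuck';
my $name_ref = \$name;         # scalar context: a single reference

my ( $x, $y ) = ( 1, 2 );
my @refs      = \( $x, $y );   # list context: same as ( \$x, \$y )
```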

To access the value to which a reference refers, you must dereference it. Dereferencing requires you to add an extra sigil for each level of dereferencing:
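As a sketch, a hypothetical reverse_in_place function can modify the caller's string through a reference:

```perl
use strict;
use warnings;

# reverse_in_place() is a hypothetical name for this sketch.
sub reverse_in_place {
    my $name_ref = shift;
    $$name_ref   = reverse $$name_ref;    # read and write through the reference
    return;
}

my $name = 'Chuck';
reverse_in_place( \$name );
print "$name\n";    # kcuhC
```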

The double scalar sigil dereferences a scalar reference.

Complex references may require a curly-brace block to disambiguate portions of the expression. This is optional for simple dereferences, though it can be messy:
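A sketch of an in-place reversal written with explicit blocks; here ${ $name_ref } is equivalent to $$name_ref, and the variable names are illustrative:

```perl
use strict;
use warnings;

my $name     = 'Chuck';
my $name_ref = \$name;

# The braces delimit the expression that yields the reference.
${ $name_ref } = reverse ${ $name_ref };
```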

If you forget to dereference a scalar reference, it will stringify or numify. The string value will be of the form SCALAR(0x93339e8), and the numeric value will be the 0x93339e8 portion. This value encodes the type of reference (in this case, SCALAR) and the location in memory of the value to which it refers.
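A sketch of both conversions. The address differs from run to run, so no exact output appears here:

```perl
use strict;
use warnings;

my $name     = 'Chuck';
my $name_ref = \$name;

my $as_string = "$name_ref";      # of the form SCALAR(0x...)
my $as_number = $name_ref + 0;    # the address as a plain number

printf "%s\n0x%x\n", $as_string, $as_number;
```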

Array References

You can also create references to arrays, or array references. This is useful for several reasons:

  • To pass and return arrays from functions without flattening

  • To create multi-dimensional data structures

  • To avoid unnecessary array copying

  • To hold anonymous data structures

To take a reference to a declared array, use the reference operator:
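For example, with an illustrative @cards array:

```perl
use strict;
use warnings;

my @cards     = qw( K Q J 10 9 8 );
my $cards_ref = \@cards;
```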

Now $cards_ref contains a reference to the array. Any modifications made through $cards_ref will modify @cards and vice versa.

You may access the entire array as a whole with the @ sigil, whether to flatten the array into a list or count the number of elements it contains:
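A sketch of both uses, assuming the illustrative @cards array of six cards:

```perl
use strict;
use warnings;

my @cards     = qw( K Q J 10 9 8 );
my $cards_ref = \@cards;

my $card_count = @$cards_ref;    # scalar context: the number of elements
my @card_copy  = @$cards_ref;    # list context: a flattened copy
```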

You may also access individual elements by using the dereferencing arrow (->):
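For instance, with the same illustrative array of cards:

```perl
use strict;
use warnings;

my @cards     = qw( K Q J 10 9 8 );
my $cards_ref = \@cards;

my $first_card = $cards_ref->[0];     # K
my $last_card  = $cards_ref->[-1];    # 8
```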

The arrow is necessary to distinguish between a scalar named $cards_ref and an array named @cards_ref from which you wish to access a single element.

Slice an array through its reference with the curly-brace dereference grouping syntax:
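A sketch of such a slice; @high_cards is an illustrative name:

```perl
use strict;
use warnings;

my @cards     = qw( K Q J 10 9 8 );
my $cards_ref = \@cards;

# The array sigil marks a list operation on the dereferenced array.
my @high_cards = @{ $cards_ref }[ 0 .. 2, -1 ];    # K Q J 8
```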

In this case, you may omit the curly braces, but the visual grouping they (and the whitespace) provide helps readability.

You may also create anonymous arrays in place without using named arrays. Surround a list of values or expressions with square brackets:
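For example (the list contents are placeholders):

```perl
use strict;
use warnings;

# No named array exists; only the reference gives access to the data.
my $suits_ref = [qw( Monkeys Robots Dinosaurs Cheese )];
```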

This array reference behaves the same as named array references, except that the anonymous array brackets always create a new reference, while taking a reference to a named array always refers to the same array with regard to scoping. That is to say:
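A sketch of the named-array case, with @meals and its contents chosen for illustration:

```perl
use strict;
use warnings;

my @meals      = qw( soup sandwiches pizza );
my $sunday_ref = \@meals;
my $monday_ref = \@meals;

# Both references point at @meals, so both see the new element.
push @meals, 'ice cream sundae';
```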

... both $sunday_ref and $monday_ref now contain a dessert, while:
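A sketch of the anonymous-array case, again with illustrative contents for @meals:

```perl
use strict;
use warnings;

my @meals      = qw( soup sandwiches pizza );
my $sunday_ref = [ @meals ];    # a new array holding copies of the elements
my $monday_ref = [ @meals ];    # another, separate new array

# Only @meals changes; the two anonymous arrays are independent copies.
push @meals, 'berry pie';
```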

... neither $sunday_ref nor $monday_ref contains a dessert. Within the square brackets used to create the anonymous array, the @meals array flattens in list context.

Hash References

To create a hash reference, use the reference operator on a named hash:
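For example, with an illustrative %colors hash:

```perl
use strict;
use warnings;

my %colors = (
    black => 'negro',
    blue  => 'azul',
    red   => 'rojo',
);

my $colors_ref = \%colors;
```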

Access the keys or values of the hash by prepending the reference with the hash sigil %:
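A sketch using the same illustrative color hash; keys and values come back in no particular order:

```perl
use strict;
use warnings;

my %colors     = ( black => 'negro', blue => 'azul', red => 'rojo' );
my $colors_ref = \%colors;

my @english_colors = keys   %$colors_ref;
my @spanish_colors = values %$colors_ref;
```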

You may access individual values of the hash (to store, delete, check the existence of, or retrieve) by using the dereferencing arrow:
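A sketch of the four operations on the illustrative color hash:

```perl
use strict;
use warnings;

my %colors     = ( black => 'negro', blue => 'azul', red => 'rojo' );
my $colors_ref = \%colors;

my $rojo     = $colors_ref->{red};            # retrieve
$colors_ref->{green} = 'verde';               # store
my $has_pink = exists $colors_ref->{pink};    # check existence (false here)
delete $colors_ref->{blue};                   # delete
```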

You may also use hash slices by reference:
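For instance, assuming the illustrative color hash contains the three keys requested:

```perl
use strict;
use warnings;

my %colors     = ( red => 'rojo', blue => 'azul', green => 'verde' );
my $colors_ref = \%colors;

my @colors  = qw( red blue green );
my @colores = @{ $colors_ref }{@colors};    # rojo azul verde
```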

Note the use of curly brackets to denote a hash indexing operation and the use of the array sigil to denote a list operation on the reference.

You may create anonymous hashes in place with curly braces:
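For example (keys and values are placeholders):

```perl
use strict;
use warnings;

my $food_ref = {
    'ice cream' => 'helado',
    candy       => 'dulces',
    cheese      => 'queso',
};
```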

As with anonymous arrays, anonymous hashes create a new anonymous hash on every execution.

Function References

Perl 5 supports first-class functions. A function is a data type just as is an array or hash, at least when you use function references. This enables many advanced features, such as closures. As with other data types, you may create a function reference by using the reference operator on the name of a function:
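A sketch with a hypothetical bake_cake function:

```perl
use strict;
use warnings;

# bake_cake() is a hypothetical function for this sketch.
sub bake_cake { return 'Baking a wonderful cake!' }

my $cake_ref = \&bake_cake;    # the & sigil names the function itself
```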

Without the function sigil (&), you will take a reference to the function's return value or values.

You may also create anonymous functions:
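For example ($pie_ref and the returned string are illustrative):

```perl
use strict;
use warnings;

my $pie_ref = sub { return 'Making a delicious pie!' };
```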

The use of the sub keyword without a name compiles the function as normal, but does not install it in the current namespace. The only way to access this function is through the reference.

You may invoke the function reference with the dereferencing arrow:
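A sketch that invokes one named and one anonymous function through references; all names here are illustrative:

```perl
use strict;
use warnings;

sub bake_cake { return 'Baking a wonderful cake!' }

my $cake_ref = \&bake_cake;
my $pie_ref  = sub { return 'Making a delicious pie!' };

my $cake_message = $cake_ref->();
my $pie_message  = $pie_ref->();
```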

Think of the empty parentheses as denoting an invocation dereferencing operation in the same way that square brackets indicate an indexed lookup and curly brackets cause a hash lookup. You may pass arguments to the function within the parentheses:
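For instance, with a hypothetical $bake_something_ref that takes the item to bake:

```perl
use strict;
use warnings;

my $bake_something_ref = sub {
    my $item = shift;
    return "Baking $item";
};

my $message = $bake_something_ref->( 'cupcakes' );    # Baking cupcakes
```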

You may also use function references as methods with objects (moose); this is most useful when you've already looked up the method:
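A sketch using a hypothetical RobotMaid class written in plain Perl (no object system module is assumed) and the built-in can method to look up the method:

```perl
use strict;
use warnings;

{
    # RobotMaid is a hypothetical class for this sketch.
    package RobotMaid;

    sub new     { my $class = shift; return bless {}, $class }
    sub cleanup { my ( $self, $room ) = @_; return "Cleaning the $room" }
}

my $robot_maid = RobotMaid->new;
my $clean      = $robot_maid->can( 'cleanup' );    # a function reference

# Invoke the reference as a method; $robot_maid becomes the invocant.
my $report = $robot_maid->$clean( 'kitchen' );     # Cleaning the kitchen
```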

Filehandle References

Filehandles can be references as well. When you use open's (and opendir's) lexical filehandle form, you deal with filehandle references. Stringifying this filehandle produces something of the form GLOB(0x8bda880).

Internally, these filehandles are objects of the class IO::Handle. When you load that module, you can call methods on filehandles:
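A sketch using the autoflush, print, and close methods. Writing to an in-memory buffer rather than a file on disk is an assumption made here to keep the example self-contained:

```perl
use strict;
use warnings;
use IO::Handle;

my $buffer = '';
open my $out_fh, '>', \$buffer
    or die "Can't open in-memory filehandle: $!";

$out_fh->autoflush( 1 );
$out_fh->print( "Hello, filehandle methods\n" );
$out_fh->close;
```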

You may see old code which takes references to typeglobs, such as:
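A sketch of the idiom, assuming a temporary file from the core File::Temp module as the target. Code of that vintage would typically use the two-argument form of open, while this sketch uses the safer three-argument form:

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

my ( $tmp_fh, $file ) = tempfile( UNLINK => 1 );
close $tmp_fh;

local *FH;
open FH, '>', $file or die "Can't write to '$file': $!";
my $fh = \*FH;    # a reference to the typeglob holding the filehandle

print {$fh} "written through a typeglob reference\n";
close $fh;
```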

This idiom predates lexical filehandles, introduced as part of Perl 5.6.0 in March 2000, so you know how old that code is. You may still use the reference operator on typeglobs to take references to package-global filehandles such as STDIN, STDOUT, STDERR, or DATA--but these represent global data anyhow. For all other filehandles, prefer lexical filehandles.

Besides the benefit of using lexical scope instead of package or global scope, lexical filehandles allow you to manage the lifespan of filehandles. This is a nice feature of how Perl 5 manages memory and scopes.

Reference Counts

How does Perl know when it can safely release the memory for a variable and when it needs to keep it around? How does Perl know when it's safe to close the file opened in this inner scope:
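A sketch of such an inner scope. It writes to a temporary file (an assumption, via the core File::Temp module) so that the effect of the implicit close is observable afterward:

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
close $tmp_fh;

{
    open my $fh, '>', $path or die "Can't write to '$path': $!";
    print {$fh} "written in the inner scope\n";
}    # no explicit close; $fh goes out of scope here

my $size = -s $path;    # non-zero: the data was flushed when $fh went away
```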

Perl 5 uses a memory management technique known as reference counting. Every value in the program has an attached counter. Perl increases this counter every time something takes a reference to the value, whether implicitly or explicitly. Perl decreases that counter every time a reference goes away. When the counter reaches zero, Perl can safely recycle that value.

Within the inner block in the example, there's one $fh. (Multiple lines in the source code refer to it, but there's only one reference to it: $fh itself.) $fh is only in scope in the block and does not get assigned to anything outside of the block, so when the block ends, its reference count reaches zero. The recycling of $fh calls an implicit close() method on the filehandle, which closes the file.

You don't have to understand the details of how all of this works. You only need to understand that your actions in taking references and passing them around affect how Perl manages memory--with one caveat (circular_references).

References and Functions

When you use references as arguments to functions, document your intent carefully. Modifying the values of a reference from within a function may surprise calling code, which expects no modifications.

If you need to perform a destructive operation on the contents of a reference without affecting the data to which it refers, copy its values to a new variable:
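A sketch with a hypothetical drain_and_sum function, which empties an array as it works but operates only on a copy:

```perl
use strict;
use warnings;

# drain_and_sum() is a hypothetical function for this sketch.
sub drain_and_sum {
    my $numbers_ref = shift;
    my @temp        = @$numbers_ref;    # shallow copy of the caller's array

    my $sum = 0;
    $sum += shift @temp while @temp;    # destructive, but only on the copy
    return $sum;
}

my @scores = ( 3, 9, 4 );
my $total  = drain_and_sum( \@scores );    # 16; @scores is unchanged
```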

This is only necessary in a few cases, but it's good policy to be explicit in those cases to avoid surprises for the callers. If your references are more complex (nested_data_structures), consider the use of the core module Storable and its dclone (deep cloning) function.
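A sketch of deep cloning with dclone; the nested $menu structure is illustrative:

```perl
use strict;
use warnings;
use Storable 'dclone';

my $menu = {
    breakfast => [ 'eggs', 'toast' ],
    dinner    => [ 'pizza' ],
};

my $copy = dclone( $menu );           # copies the nested arrays too
push @{ $copy->{dinner} }, 'salad';   # does not touch $menu
```

A shallow copy such as { %$menu } would share the inner arrays, so the push would have changed both structures.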
