# Introduction to Programming Perl

<img src="https://github.com/keithjtamu/perl/blob/master/image/perl_camel.jpg?raw=1" align="right" height="80" width="80" />

Texas A&amp;M High Performance Research Computing (TAMU HPRC)
<br />
Keith Jackson
<br />
<a href="https://hprc.tamu.edu/training/intro_perl.html">course page</a>
<br />
<a href="http://perldoc.perl.org">perldoc.perl.org</a>
<br />
<a href="https://github.com/keithjtamu/perl">samples in github</a>





# HPRC Info

| Item | Value |
| -------------: | :------------- |
|  **Homepage** | <a href="https://hprc.tamu.edu/">hprc.tamu.edu</a>  |
|  **E-mail** | <a href="mailto:help@hprc.tamu.edu">help@hprc.tamu.edu</a> |
| **Phone** | 979-845-0219 |
| **Address** | <a href="http://aggiemap.tamu.edu?bldg=0425">Henderson Hall, 222 Jones St</a> |

# Agenda <a class="anchor" id="TOC"></a>

<img src="https://github.com/keithjtamu/perl/blob/master/image/agenda.jpg?raw=1" align="right" />

- [executing your program](#executing)
- [finding documentation](#documentation)
- [statement syntax](#syntax)
- [variables](#variables), [constants](#constants), and [operators](#operators)
- [control flow](#control)
- [error messages](#errors)
- [I/O](#IO)
- [regular expressions](#regex)
- [references](#references)

# How to Run a Perl Program  <a class="anchor" id="executing"></a>


Perl program files usually end with the "**.pl**" suffix.  Run a program from the command line with the **perl** command:

```bash
   perl sum.pl
```

See <a href="https://github.com/keithjtamu/perl/blob/master/sum.pl">sum.pl</a>.

# Connect to Titan vm

* Open MobaXterm
* Open local terminal
* `ssh titan.tamu.edu`
* Use your NetID password

```
pwd
cd Introduction_to_Perl
ls
cat sum.pl
perl sum.pl
```

# Making Script Executable

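To run a Perl script (or a shell script) directly as a program, the first line must begin with **<tt>#!</tt>** (the _shebang_, also spelled _sha-bang_), followed by the path to the interpreter.  You can give the full path to **perl** or use **<tt>/usr/bin/env perl</tt>**, which looks up **perl** on your search path.  The file also needs execute permission, which **chmod +x** sets: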

```
./sum.pl
ls -l sum.pl
chmod +x sum.pl
ls -l sum.pl
./sum.pl
```

# Run Perl with **-e**

You can run short programs with **-e** option:

```bash
   perl -e 'printf("%.15f\n", 2*atan2(1,0))'
```


In [None]:
%%bash

perl -e 'printf("%.15f\n", 2*atan2(1,0))'

3.141592653589793


# Testing With **<tt>eval perl</tt>**

```bash
eval perl
@a = ('red', 'green', 'blue');
print @a, "\n";
print @a . "\n";
print "@a\n";
```

In [None]:
%%bash

eval perl
@a = ('red', 'green', 'blue');
print @a, "\n";
print @a . "\n";
print "@a\n";


redgreenblue
3
red green blue


# REPL Perl Shells

**REPL** = Read-Eval-Print-Loop

- **perlconsole**
  - installed on ada cluster **/scratch/training/Perl/bin/perlconsole**
  - available from [CPAN](http://search.cpan.org/~sukria/perlconsole-0.4/perlconsole)
- **perli**
  - portable
  - available from GitHub
- **ips**
  - requires rlwrap
  - three lines of code (from [comment](https://stackoverflow.com/a/22840242) )
  
```bash
#!/bin/sh
echo 'This is Interactive Perl shell'
/bin/rlwrap -A -pgreen -S"perl> " /usr/bin/perl -wnE'say eval()//$@'
```


# Documentation <a class="anchor" id="documentation"></a>

1. Unix man pages
```bash
man perl
man perlop
man perlfunc
perldoc -f sort
```
2. Websites, such as http://perldoc.perl.org

   (get specific man pages, like <b>perlop</b> at http://perldoc.perl.org/perlop.html)

# Perl Statement Syntax <a class="anchor" id="syntax"></a>

- statements separated by semicolons: ";"
- statement blocks surrounded by curly braces: "{ }"
- comments preceded by pound sign (hash) : "#"



In [None]:
%%perl

# set the top value of our loop
$max = 5;

$i = 1;    # initialize i
while ($i <= $max)
{
   $a[$i] = $i ** 2 + 1;
   printf("%3d %8.2f\n", $i, $a[$i]);
   $i++;
}

  1     2.00
  2     5.00
  3    10.00
  4    17.00
  5    26.00


# Variable Names <a class="anchor" id="variables"></a>

Names in Perl:
  - start with a letter or underscore
  - contain letters, digits, and underscores "\_"
  - are case sensitive

Major types:
  - \$ scalars (single value)
  - @ lists
  - % hash tables

# Scalars

- Start with dollar sign "\$"
- Types include:
  - integer
  - floating point
  - string
  - binary data
  - reference (like a pointer)
- Perl is not strongly typed  


In [None]:
%%perl

$a = 2.75;           # scalar, floating point

$color = 'yellow';   # scalar, string

print "a is $a\n";
print "color is $color\n";

a is 2.75
color is yellow


# Lists

- start with an at symbol "@"
- also called "arrays"
- one-dimensional
- index zero-based
- not strongly typed (can contain mixture of scalar types)
- index syntax uses square brackets

In [None]:
%%perl

@a = ( 2, 3, 5, 7, 11 );
print "a sub three is $a[3]\n";
$a[5] = 13;
print "a is @a\n";

# cast list to scalar to get length
$len = @a;
print "len is $len\n";

# dollar hash gives you index of last element
$last_index = $#a;

print "last index is $last_index\n"

a sub three is 7
a is 2 3 5 7 11 13
len is 6
last index is 5


# Hash Tables

- start with percent sign "\%"
- implemented as a list with special properties
- key-value pairs
- keys are unique
- keys are strings (any other scalar used as a key is converted to a string)
- values can be any scalar value
- index syntax uses curly braces

In [None]:
%%perl

%h = ( 'name' => "Sam", phone => "555-1212");

print "name is $h{name}\n";
print "phone is $h{phone}\n";
$h{age} = 27;

@keys = keys %h;
print "h has keys: @keys\n";

name is Sam
phone is 555-1212
h has keys: phone age name
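
A minimal sketch of a few more hash operations, building on the cell above. The **exists()** and **delete()** functions are standard Perl built-ins that the slide does not mention; the values are made up for illustration.

```perl
use strict;
use warnings;

my %h = ( name => "Sam", phone => "555-1212" );

# Keys are stored as strings, so the number 1 and the string "1" name the same key.
$h{1}   = 'number one';
$h{"1"} = 'string one';      # overwrites the previous entry
my $count = keys %h;         # keys in scalar context gives the number of keys

# exists() tests for a key without creating it; delete() removes a key.
my $has_age = exists $h{age} ? 'yes' : 'no';
delete $h{phone};

# Sort the keys to get a repeatable order.
my @sorted = sort keys %h;

print "count = $count\n";
print "has age? $has_age\n";
print "keys: @sorted\n";
```

Sorting the keys is a common way to get stable output, since **keys** returns them in no particular order.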


# Same Name, Different Variables

- same name can be reused for scalars, lists, hashes, subroutines, file handles, etc.
- each of the following variables refer to something different:

```
$x = 5;
$X = -3.2;
@x = ( 3, 7, 23 );
%x = ( height => 23, width => 14 );
&x($arg1, $arg2);
```

In [None]:
%%perl

# it is bad practice to use the same name for list and scalar
@ary = (3, 5, 7, 9);
$ary = 22;

print "@ary\n";
print "$ary\n";
print "$ary[2]\n"



3 5 7 9
22
7


# Assignment

- use single equal sign
- left-hand side must be variable (memory location)
- otherwise known as "L-value"

```
$a = 2.75;          # scalar, floating point
$color = 'yellow';  # scalar, string
# array of four strings
@ary = ( "Perl", "C++", "Java", "FORTRAN", );
# hash, two keys (strings), numeric values
%ht = ( "AAPL" => 282.52, "MSFT" => 24.38, );
```

# Operator Assignment

- ***var* *op*= *val*** same as ***var* = *var* *op* *val***

```
$a += 10;          # same as $a = $a + 10
$b *= 2.5;         # multiply $b by 2.5
$name .= ', Jr.';  # append to $name
$mode &= 0722;     # apply bitwise mask to $mode
```
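
The operator-assignment forms above can be tried out as a small runnable sketch. The variable names and starting values here are chosen only for illustration.

```perl
use strict;
use warnings;

my $total = 5;
$total += 10;              # same as $total = $total + 10

my $scale = 4;
$scale *= 2.5;             # multiply in place

my $name = 'Sam Smith';
$name .= ', Jr.';          # append to the string

my $mode = 0755;
$mode &= 0722;             # keep only the bits set in both values

# %04o prints the mode in octal, the usual notation for permission bits
printf("%d %g %s %04o\n", $total, $scale, $name, $mode);
```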

# Numeric Constants <a class="anchor" id="constants"></a>

```
10       # decimal integer
0722     # octal integer
0xF3E9   # hexadecimal integer
-2.532   # floating point
6.022E23 # floating point (scientific)
```
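
As a quick check of the notations above, this sketch stores each constant and prints it back. Perl converts every literal to an ordinary number, so the octal and hexadecimal values print in decimal.

```perl
use strict;
use warnings;

my @values = (
    10,          # decimal integer
    0722,        # octal: 7*64 + 2*8 + 2 = 466
    0xF3E9,      # hexadecimal: 62441
    -2.532,      # floating point
    6.022E23,    # floating point (scientific)
);

print "$_\n" for @values;
```

Note that a leading zero changes the meaning of the literal: **0722** and **722** are different numbers.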

# String Constants

```
'simple'       # single quotes
"one\ttwo"     # double quotes
<<"END_TEXT";  # here document (dbl quotes)
1\t15kg\t3:25
2\t9kg\t0:22
END_TEXT
```

See <a href="https://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Operators">https://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Operators</a>.

# Quotes

- single quotes: taken literally
- double quotes: interpolate variables and escape sequences
  - \n newline
  - \t tab
- here document: multiline; the quotes around the terminating tag select single- or double-quote behavior

In [None]:
%%perl

$name     = 'Jupyter';
$greeting = "Hello, my name is $name.";
$mistake  = 'Hello, my name is $name.';

print "greeting: ", $greeting, "\n";
print "mistake : ", $mistake, "\n";


greeting: Hello, my name is Jupyter.
mistake : Hello, my name is $name.


In [None]:
%%perl

$name = 'Jupyter'; $age = 3; $favcolor = 'maroon';

print <<"END_STATEMENT";   # semicolon ends the print statement here, not after the terminator
 This is a paragraph about $name, who is $age years old.
 $name has a favorite color: $favcolor.

 And now, a tab-aligned table:

 15\t23\t83
 22\t 4\t 6
END_STATEMENT


 This is a paragraph about Jupyter, who is 3 years old.
 Jupyter has a favorite color: maroon.

 And now, a tab-aligned table:

 15	23	83
 22	 4	 6


# Operators <a class="anchor" id="operators"></a>

- numeric operators are basically like C/C++/Java, plus **\*\*** for exponent
- string operators **.** (concatenate) and **x** (repeat)
- regular expression operators **=~**, **!~**, **s///**, **tr///**, etc.
- logical operators (from C/C++/Java or English)
- relational operators (numeric and stringwise are different)

See: http://perldoc.perl.org/perlop.html

# Numeric operators

```
$x + 3      -4.3 / $z       2 ** 10
$i++        --$j            $f % $mod
```

# Relational Operators

```
# numeric
$x == 3      $a >= -4.3       $m != $n

# stringwise
$y eq "AT"   $title lt 'm'    $q ne 'B'
```
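
A short sketch of why the numeric and stringwise operators are kept separate: the same two values can compare differently depending on which set is used. The sample values are illustrative.

```perl
use strict;
use warnings;

my ($m, $n) = ("10", "10.0");

# Numeric comparison converts both sides to numbers first.
my $num_equal = ($m == $n) ? 1 : 0;        # 1: both are the number ten

# String comparison looks at the characters.
my $str_equal = ($m eq $n) ? 1 : 0;        # 0: "10" and "10.0" differ

my $num_less = (9 < 10)       ? 1 : 0;     # 1: nine is less than ten
my $str_less = ("9" lt "10")  ? 1 : 0;     # 0: "9" sorts after "1"

print "num_equal=$num_equal str_equal=$str_equal\n";
print "num_less=$num_less str_less=$str_less\n";
```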

# Sort Comparison Operators
<img src="https://github.com/keithjtamu/perl/blob/master/image/scales.png?raw=1" align="right" />

| Result | Meaning                 |
| ---: | :-----                |
| -1 | Left is less than right |
|  0 | Left equals right |
|  1 | Left is greater than right |

```
# numeric      stringwise
$x <=> $y      $s cmp $t
```
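
These operators are most often used inside a **sort** block, where Perl supplies the two items being compared as **$a** and **$b**. A minimal sketch with made-up numbers:

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

my @by_string  = sort { $a cmp $b } @nums;   # compares as text
my @by_number  = sort { $a <=> $b } @nums;   # compares as numbers
my @descending = sort { $b <=> $a } @nums;   # swap $a and $b to reverse

my $cmp = (3 <=> 7);                         # left is less than right

print "by string:  @by_string\n";
print "by number:  @by_number\n";
print "descending: @descending\n";
print "3 <=> 7 is $cmp\n";
```

Sorting with **cmp** puts "100" before "9" because the comparison is character by character.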

# Logical Operators

```
# C-style
$ready && ($y > 2)    !$done     $e || $r

# lower precedence
$ready and $y > 2    not $done    $e or $r

# ternary conditional
($d != 0) ? ($n / $d) : "Inf"
```

# Other Operators

```
# string concatenation
"First " . $item

# string repetition
"AB" x 10

# range operator
1..10        0..$#ary
```

In [None]:
%%perl

$first = "Juan"; $last = "Lopez";

$full = $first . " " . $last;

print "$last, $first or $full\n";

$howdies = "Howdy! " x 10;

print $howdies, "\n";

@list = (1..55);

print "@list\n";

Lopez, Juan or Juan Lopez
Howdy! Howdy! Howdy! Howdy! Howdy! Howdy! Howdy! Howdy! Howdy! Howdy! 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55


# Perl Control Statements <a class="anchor" id="control"></a>

- Conditionals:
  - **if**/**elsif**/**else**
  - **unless** (inverse of "**if**")
  - experimental **given**/**when** like **switch**/**case**
- Loops:
  - **for(;;)**
  - **foreach ()**
  - **while()**
  - **until()**

In [23]:
%%perl

$grade = 44;

if ($grade >= 90)
{
    $letter = 'A';
}
elsif ($grade >= 80)
{
    $letter = 'B';
}
elsif ($grade >= 70)
{
    $letter = 'C';
}
else
{
    $letter = 'F';
}

printf("A grade of %d is a(n) %s\n", $grade, $letter);

A grade of 44 is a(n) F


In [24]:
%%perl

$grade = 88;

unless ($grade >= 70)
{
    $letter = 'F';
}
else
{
    $letter = ($grade >= 90) ? 'A' : (($grade >= 80) ? 'B' : 'C')
}

printf("A grade of %d is a(n) %s\n", $grade, $letter);

A grade of 88 is a(n) B


In [26]:
%%perl

# Newer features require explicit version request
use v5.14;

no if ($] >= 5.018), 'warnings' => 'experimental';


my $grade = 88;
my $letter;

# "given" similar to "switch" in C/C++/Java
given ($grade)
{
    # "when" similar to "case"
    when ($_ >= 90) { $letter = 'A' }
    when ($_ >= 80) { $letter = 'B' }
    when ($_ >= 70) { $letter = 'C' }
    default { $letter = 'F' }
}


printf("A grade of %d is a(n) %s\n", $grade, $letter);

A grade of 88 is a(n) B


# Conditional After Statement

- put statement *before* conditional
- does not need braces
- no way to handle "else"
- can be confusing

```
print "f is even\n" if ($f % 2 == 0);

print "not capitalized\n"
    unless ($name =~ /^[A-Z]/);
```

# Logical Operator as Conditional

- Perl does "lazy" logical evaluation
- does not need braces
- no way to handle "else"
- can be confusing

```
($y != 0) &&           # C-style
    ($ratio = $x / $y);   # parentheses needed: && binds tighter than =
    
(-f $myfile) or        # word style
    die "File $myfile does not exist!";
```
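
Operator precedence matters when a logical operator guards an assignment. The sketch below, with illustrative values, shows that **&&** binds more tightly than **=** (so the assignment needs parentheses), while **and** binds more loosely than **=**.

```perl
use strict;
use warnings;

my ($x, $y) = (10, 4);

# && binds tighter than =, so the assignment must be parenthesized.
my $ratio;
($y != 0) && ($ratio = $x / $y);

# "and" binds more loosely than =, so no parentheses are needed.
my $inverse;
$x != 0 and $inverse = $y / $x;

# Lazy evaluation: the right side never runs when the left side is false.
my $skipped = 'untouched';
($y == 0) && ($skipped = 'changed');

print "ratio=$ratio inverse=$inverse skipped=$skipped\n";
```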

# While Loops

- **while(*condition*)**
- The condition is tested before each pass through the loop body
  - if true, runs the loop body
  - if false, continues after loop

```perl
while ($answer ne 'q')
{
  ...
  # need something to eventually make condition false
}
```

In [30]:
%%perl

$i = 0;
while ($i < 7)
{
   printf("%d %d\n", $i, $i *23);
   $i += 2;
}

0 0
2 46
4 92
6 138


# C-Style **for** Loops

1. initialize
2. test
3. increment

```
for ($i = 1; $i < 10; $i++) {
  print "$i\n";
}
```

Equivalent:

```
$i = 1;
while ($i < 10) {
   print "$i\n";
   $i++;
}
```

In [32]:
%%perl

for ($i = 1; $i <= 10; $i++)
{
   printf("%3d %4d\n", $i, $i ** 2);
}

  1    1
  2    4
  3    9
  4   16
  5   25
  6   36
  7   49
  8   64
  9   81
 10  100


# Foreach Loops

- specify **foreach *var* (*list*)**
- loop variable gets each value in the list

In [33]:
%%perl

@colors = ('red', 'orange', 'yellow', 'green', 'blue', 'purple');

# $col is loop variable, @colors is list
foreach $col (@colors)
{
    print "color is $col\n"
}

color is red
color is orange
color is yellow
color is green
color is blue
color is purple


# Spanning a List

- iterate over indices, from 0 to $#array

```
foreach $i (0..$#ary)
{
   printf("%3d %5.2f\n", $i, $ary[$i]);
}
```

In [34]:
%%perl

foreach $i (0..20)
{
   $sq[$i] = $i ** 2;
   $cb[$i] = $i ** 3;
}

print "@sq\n";
print "@cb\n";

0 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400
0 1 8 27 64 125 216 343 512 729 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000


# foreach (keys %hash)

- **keys(*hash*)** generates list of keys (random order)



In [36]:
%%perl

%cost = ( AAPL => 174.13, IBM => 139.03, MSFT => 112.44);

foreach $name (keys %cost)
{
   if ($cost{$name} > 150)
   {
      print "Sell $name\n"
   }
}

Sell AAPL
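
When the order of the keys matters, one option is to sort them before looping. This sketch reuses the prices from the cell above; the sort-by-value block is an addition for illustration.

```perl
use strict;
use warnings;

my %cost = ( AAPL => 174.13, IBM => 139.03, MSFT => 112.44 );

# Alphabetical order of the keys
my @by_name = sort keys %cost;

# Order the keys by their values, cheapest first
my @by_cost = sort { $cost{$a} <=> $cost{$b} } keys %cost;

foreach my $name (@by_cost)
{
   printf("%-5s %7.2f\n", $name, $cost{$name});
}
```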


# while ( ( k, v ) = each %hash)

- **each(*hash*)** iterates through key, value pairs (random order)



In [37]:
%%perl

%cost = ( AAPL => 174.13, IBM => 139.03, MSFT => 112.44);

while (($k, $v) = each %cost)
{
   if ($v > 150)
   {
      print "Sell $k\n"
   }
}

Sell AAPL


# Map Function

- iterates over list
- loop variable is $_
- syntax can be tricky


In [38]:
%%perl

%cost = ( AAPL => 154.48, IBM => 146.78, MSFT => 74.26);

map { print "Sell $_\n" if ($cost{$_} > 150) } (keys %cost);

Sell AAPL
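
**map** also returns a list, one result per input item, which is how it is more commonly used. The sketch below pairs it with **grep** (a standard built-in that keeps only the items for which the block is true), using the same prices as the cell above.

```perl
use strict;
use warnings;

my %cost = ( AAPL => 154.48, IBM => 146.78, MSFT => 74.26 );

# map transforms each element; $_ holds the current item
my @squares = map { $_ ** 2 } (1..5);

# grep selects elements; here, the symbols priced above 150
my @sell = grep { $cost{$_} > 150 } sort keys %cost;

# map builds one message per selected symbol
my @messages = map { sprintf("Sell %s", $_) } @sell;

print "@squares\n";
print "$_\n" for @messages;
```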


# Common Syntax Errors <a class="anchor" id="errors"></a>

- missing braces around block
- missing prefix symbol before variable name
- using one equal sign instead of two, meaning assignment instead of test
- missing close quote

In [45]:
%%perl

# get a random number between 0 and 1
$remaining = rand(1.0);

if ($remaining > 0.5)
   print("have remaining work to do\n");

# if ($remaining > 0.5){
#    print("have remaining work to do\n");}

syntax error at - line 6, near ")
   print"
Execution of - aborted due to compilation errors.



In [49]:
%%perl

@ary = (2, 3, 5, 7, 11, 13, 17, 19);

ary[20] = 33;
# $ary[20] = 33;

for ($i = 0; $i <= $#ary; $i++)
{
   print $ary[$i], "\n";
}

syntax error at - line 4, near "ary["
Execution of - aborted due to compilation errors.



In [52]:
%%perl

# use warnings;

@ary = (2, 3, 5, 7, 11, 13, 17, 19);

for ($i = 0; $i <= $#ary; $i++)
{
   if ($ary[$i] = 7)
  #  if ($ary[$i] == 7)
   {
       print "value is 7\n";
   }
   print $ary[$i], "\n";
}

value is 7
7
value is 7
7
value is 7
7
value is 7
7
value is 7
7
value is 7
7
value is 7
7
value is 7
7


# Errors and Warnings
<img src="https://github.com/keithjtamu/perl/blob/master/image/ConfusedComputer.png?raw=1" align="right" />

- Warning is “non-fatal”, can still keep going
- Error can be at:
  - Compile time, e.g., syntax errors
  - Run time:
    - Numeric, e.g., division by zero
    - Reference type, e.g., hash vs. list
    - Object method

# Warnings

- start perl with **-w** option on command line
- **use warnings;** pragma

Either one turns on extra warnings about common problems.  Perl prints each warning but keeps running the program.

See <a href="https://github.com/keithjtamu/perl/blob/master/bounds.pl">bounds.pl</a>.

In [55]:
%%perl

my @a = (1, 2);

my $b = $a[0] + $a[1] + $a[2];

print "b = $b\n";

b = 3
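
The cell above runs silently because warnings are off. Here is a sketch of the same calculation with **use warnings** enabled. To make the result easy to inspect, it collects warnings in an array through the standard **$SIG{\_\_WARN\_\_}** hook instead of letting them go to STDERR; that hook and the variable names are assumptions of this example, not part of the original cell.

```perl
use strict;
use warnings;

# Collect warnings in @caught instead of printing them to STDERR.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my @a = (1, 2);

# $a[2] does not exist, so it is undef and counts as 0 in the sum.
my $sum = $a[0] + $a[1] + $a[2];

print "sum = $sum\n";
print "warnings caught: ", scalar(@caught), "\n";
print "first warning: $caught[0]" if @caught;
```

The program still finishes and the sum is unchanged; the difference is that Perl now reports the use of an uninitialized value.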


# "should be =="

```
print "c is ten\n" if ($c = 10);
```

```
$ ./twoeq.pl
Found = in conditional, should be == at ./twoeq.pl line 8.
```

See <a href="https://github.com/keithjtamu/perl/blob/master/twoeq.pl">twoeq.pl</a>.

In [60]:
%%perl

my $c = int(rand(20));

print "c is $c\n";
print "c is ten\n" if ($c = 10);
print "c is $c\n";


c is 18
c is ten
c is 10


# Syntax Errors

```
while val < 10
```

```
$ perl synerr.pl
syntax error at synerr.pl line 4, near "while val "
syntax error at synerr.pl line 8, near "}"
Execution of synerr.pl aborted due to compilation errors.
```

See <a href="https://github.com/keithjtamu/perl/blob/master/synerr.pl">synerr.pl</a>.

# Runaway Strings

```
$ ./closeq.pl
Scalar found where operator expected at ./closeq.pl line 8, near "print "$a"
  (Might be a runaway multi-line "" string starting on line 3)
        (Do you need to predeclare print?)
Backslash found where operator expected at ./closeq.pl line 8, near "$a\"
        (Missing operator before \?)
String found where operator expected at ./closeq.pl line 8, at end of line
        (Missing semicolon on previous line?)
syntax error at ./closeq.pl line 8, near "print "$a"
Can't find string terminator '"' anywhere before EOF at ./closeq.pl
```

See <a href="https://github.com/keithjtamu/perl/blob/master/closeq.pl">closeq.pl</a>.

# DIY Diagnostics

- do your own checks before errors are generated:

```
die "denominator zero " if ($d == 0);
$r = $n / $d;

(-f $myfile)
   or die "no such file $myfile";
open(FD, "<$myfile");
```

- **die()** prints message to STDERR and exits program
- **warn()** prints message to STDERR and continues running
- see also <a href="https://perldoc.perl.org/Carp.html">Carp</a> library
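
A minimal sketch of **die()** and **warn()** working together. It assumes a hypothetical helper, `safe_ratio`, and uses an **eval** block (standard Perl, not covered on this slide) to trap the **die()** so the program can continue.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: refuses to divide by zero.
sub safe_ratio {
    my ($n, $d) = @_;
    die "denominator zero\n" if ($d == 0);   # trailing newline suppresses "at ... line N"
    return $n / $d;
}

my $good = safe_ratio(6, 3);

# eval { } traps die(); the message is left in $@
my $bad   = eval { safe_ratio(1, 0) };
my $error = $@;

# warn() reports to STDERR and the program keeps running
warn "recovered from: $error" if $error;

print "good = $good\n";
print "bad is ", (defined $bad ? $bad : 'undefined'), "\n";
```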

# System Error String **$!**

If a system call fails, look at **$!** variable

```
open FH, "<", $myfile or
    die "open $myfile: $! ";
```

```
$ perl nofile.pl
open needfile: No such file or directory at nofile.pl line 4.
```

See <a href="https://github.com/keithjtamu/perl/blob/master/nofile.pl">nofile.pl</a>.

# Variable Scope

- using the “**my**” declaration makes a variable local to the statement block or file.
- don’t use “**local**” declaration unless you understand it—the “**my**” declaration is almost always what you want.
- the “**our**” declaration is used for declaring global variables within packages (modules).

# Examples of Local Variables

- put multiple declared variables in list, using parentheses

```
my $a;
my @f;
my $x = "initial value";
my ($i, $j, $k);
foreach my $item (@ilist) {
    $sum += $item;
}
```


# Example of Scope

```
my @numlist = (3, 4, 5);

foreach my $item (@numlist) {
    print "item = $item\n";
}
print "item = $item\n";    # the loop's $item is out of scope here, so this prints "item = "
```

See <a href="https://github.com/keithjtamu/perl/blob/master/scopeex.pl">scopeex.pl</a>.
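
A small sketch of how **my** limits a variable to its enclosing block. The same name is declared three times and each declaration is a separate variable; the values are illustrative.

```perl
use strict;
use warnings;

my $item = 'outer';
my $sum  = 0;

# The loop declares its own $item, visible only inside the loop body.
foreach my $item (3, 4, 5) {
    $sum += $item;
}

# A bare block also creates a scope.
my $inside;
{
    my $item = 'inner';
    $inside  = $item;
}

# Back outside, the original $item is unchanged.
print "item = $item, inside = $inside, sum = $sum\n";
```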

# Strict Pragma

Put “**use strict**” pragma at top to require use of “**my**” declarations.

```
use strict;
$x = 15;
my $y = 19;
print "y = $y\n";
```


In [61]:
%%perl

use strict;
$x = 15;
my $y = 19;
print "y = $y\n";


Global symbol "$x" requires explicit package name (did you forget to declare "my $x"?) at - line 3.
Execution of - aborted due to compilation errors.



# File Handles <a class="anchor" id="IO"></a>

- STDIN, STDOUT, and STDERR correspond to the stdin, stdout, and stderr of C/C++.
- A simple Perl filehandle is a name by itself, typically all caps.
- Filehandles can be scalar variables, too.
- Objects from **IO::File** and similar library modules have advantages.

# Printing

- **print()** prints a list of strings
- **printf()** formats output as in C/C++/Java
- **syswrite()** unbuffered, low-level (useful for binary data)

# When Using Filehandles
- **print *FH* *LIST***
- **printf *FH* *format-string*, *arg1*, ...**
- **syswrite *FH*, *data*, *length***

(note that only **syswrite()** takes a comma after the filehandle)

# Print Examples

```
print "Hello, world!\n";
print STDOUT "Hello, world!\n"; # same
print STDERR "File not found:", $fname, "\n";
printf MYRPT "%d items processed\n", $count;
printf MYRPT ("%d items processed\n", $count);      # same
```
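
To try the filehandle forms without creating a file, this sketch opens a filehandle on a scalar variable (an in-memory file, a standard Perl feature), so everything printed lands in `$report`. The handle and variable names are made up for the example.

```perl
use strict;
use warnings;

# Open a write handle on a string instead of a file on disk.
my $report = '';
open(my $rpt, '>', \$report) or die "open in-memory file: $!";

my $count = 42;

print  $rpt "Hello, world!\n";                 # no comma after the filehandle
printf $rpt "%d items processed\n", $count;
printf $rpt ("%-6s|\n", 'left');               # parentheses are optional

close($rpt);

print $report;
```

Writing `print $rpt, "text"` with a comma would print the handle itself to STDOUT rather than writing to it, which is a common slip.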

# Reading Input

- **< >**
  - input from the files named on the command line, or from STDIN if there are none
  - buffered input
- **< *FH* >**
  - input from *FH*
  - buffered input
- **sysread(*FH*, *data*, *count*)**
  - low-level read from *FH*
  - unbuffered

# Opening a File

- the open function opens a file for reading, writing, appending, or more
- can specify a bareword file handle name or a scalar variable

```
open(MYINFO, "<info.dat") or
    die("open info.dat: $! ");
open $fh, ">logfile" or die $!;
```

# Choosing a Mode

| mode  | normal | bi-directional |
| ---:  | :----: | :-----------:  |
| read  |  **<** |  **+<**        |
| write |  **>** |  **+>**        |
| append |  **>>** |  **+>>**        |

```
open(MYINFO, "+<info.dat") or
    die("open info.dat: $! ");
open $fh, ">>", $fname or die $!;
```
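
The write, append, and read modes can be seen together in a short sketch. It uses the three-argument form of **open** (handle, mode, file name) and a temporary file from the core **File::Temp** module so that nothing is left behind; the file contents are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($tmp, $fname) = tempfile(UNLINK => 1);
close($tmp);

# ">" truncates the file and writes
open(my $out, '>', $fname) or die "open $fname for writing: $!";
print $out "first\n";
close($out);

# ">>" keeps the existing contents and adds to the end
open(my $app, '>>', $fname) or die "open $fname for appending: $!";
print $app "second\n";
close($app);

# "<" reads
open(my $in, '<', $fname) or die "open $fname for reading: $!";
my @lines = <$in>;
close($in);

chomp(@lines);
print "lines: @lines\n";
```

Keeping the mode in its own argument avoids surprises when a file name happens to start with a character such as **>** or **|**.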

# Putting it Together

```
open RAW, "<", $rfile or die "open $rfile: $!";
open $ofh, ">>results" or die "open results: $!";
while ($line = <RAW>) {
    @useful = myprocess($line);
    printf $ofh "%d,%6.2f,%s\n", @useful;
}
```

# Reading From Commands

- put pipe "**|**" at the beginning of the command to write to its input (command acts as a filter)
- put pipe "**|**" at the end of the command to read its output (command acts as a source)

```
# run the ps command and pipe the output to PS
open(PS, "/bin/ps -auxww|") or die $!;

# read all of the output (dropping processes run by root or apache)
@lines = grep !/^(root|apache)/, <PS>;

# print the header line plus the first ten remaining processes
print @lines[0..10];

```

In [63]:
%%perl

open(PS, "/bin/ps -auxww|") or die $!;

@lines = grep !/^(root|apache)/, <PS>;

print @lines[0..10];

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
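
The same idea can be written with a lexical filehandle and the list form of a pipe open, where the mode **-|** means "read from this command". So that the sketch runs anywhere, the command here is Perl itself (**$^X** holds the path of the running interpreter) printing three made-up lines, rather than **ps**.

```perl
use strict;
use warnings;

# Run a command and read its output through a pipe.
# The command is a tiny Perl one-liner standing in for ps.
open(my $cmd, '-|', $^X, '-e', 'print "alpha\n", "root\n", "beta\n"')
    or die "cannot run command: $!";

# Keep every line that does not start with "root"
my @lines = grep { !/^root/ } <$cmd>;

# close() on a pipe waits for the command and reports a failed exit
close($cmd) or warn "command exited with status $?";

chomp(@lines);
print "kept: @lines\n";
```

Passing the command as a list skips the shell, so arguments with spaces or special characters need no extra quoting.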


# File Encodings

Specify a particular character encoding for reading and writing

```
my $handle   = undef;
my $filename = "/some/path/to/a/textfile/goes/here";
my $encoding = ":encoding(UTF-8)";

open($handle, ">> $encoding", $filename)
    || die "$0: can't open $filename for appending: $!";
print $handle @lines;
```

# Working With Binary Files

- Use [**sysopen()**](http://perldoc.perl.org/functions/sysopen.html), [**sysread()**](http://perldoc.perl.org/functions/sysread.html), and [**syswrite()**](http://perldoc.perl.org/functions/syswrite.html)
- Do not mix with [**<>**](http://perldoc.perl.org/perlop.html#I%2fO-Operators), [**print()**](http://perldoc.perl.org/functions/print.html), [**printf()**](http://perldoc.perl.org/functions/printf.html), and [**say()**](http://perldoc.perl.org/functions/say.html) which use buffered I/O
- See the [Perl pack() tutorial](https://perldoc.perl.org/perlpacktut.html#Packing-and-Unpacking-C-Structures) to learn how to read fixed-length records properly aligned in memory to the correct data types

# Regular Expressions <a class="anchor" id="regex"></a>

- Regular expressions are patterns designed to concisely match a set of strings which follow the rules of the given pattern
- Regular expressions have a long history in Unix (**ed**, **grep**, **vi**, **awk**)
- Perl extends the traditional regular expressions, adding new rules

# Quick Examples

```
$name =~ /Mich/;     # Michael, Michelle, Michigan, ...

$shell =~ /^[abck]sh/;     # csh, ksh (not bash)

$fname !~ /.*\.[ch]$/;     # not a C source

$command =~ s?^?/usr/bin/?;   # prepend dir

$dosfile =~ tr/A-Z/a-z/;      # change to lowercase
```

# Main Regexp Operators

|**Operator** | **Use** | **Return** |
| ------------- |:-------------:| -----:|
| qr/_pattern_/ | precompile | regexp |
| /_pattern_/    | match      | boolean |
| m{_pattern_}    | match      | boolean |
| s/_pattern_/_replacement_/ | substitute  | count of replacements |
| s{_pattern_}{_replacement_} | substitute  | count of replacements |
| tr/_set1_/_set2_/ | transliterate  | count of replacements |
| y?_set1_?_set2_? | transliterate  | count of replacements |
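
A short sketch of the return values listed in the table, using a made-up string. Each operator is applied to `$text` and its result is saved so it can be printed.

```perl
use strict;
use warnings;

my $text  = 'red green blue green';

# qr// precompiles a pattern that can be reused later
my $color = qr/gr\w+/;

# a match returns true or false
my $matched = ($text =~ $color) ? 'yes' : 'no';

# s///g returns the number of substitutions made
my $subs = ($text =~ s/green/GREEN/g);

# tr/// returns the number of characters it translated
my $letters = ($text =~ tr/A-Z/a-z/);

print "matched=$matched subs=$subs letters=$letters\n";
print "text is now: $text\n";
```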


# [**split()**](https://perldoc.perl.org/functions/split.html) and [**grep()**](https://perldoc.perl.org/functions/grep.html)

- **split()** function divides a string using regexp for the separator pattern

```
split(/[:,]/, 'a:fg:x:::2,2:3 KB');
```

- **grep()** function finds matches in a list

```
grep /^A.*s$/, qw(Adams Aaron Avons arts);
```

In [65]:
%%perl

$input = 'Adams:Aaron:Avons:arts';
@full = split(/:/, $input);
print "full = (@full)\n";

@picks = grep /^A.*s$/, @full;
print "picks = (@picks)\n";

full = (Adams Aaron Avons arts)
picks = (Adams Avons)
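
One detail of **split()** worth seeing in action is how it treats empty fields. The sketch below uses the string from the slide above; the optional third argument (a limit) is standard Perl but not shown on the slide.

```perl
use strict;
use warnings;

# Adjacent separators produce empty fields in the middle of the list.
my @fields = split(/[:,]/, 'a:fg:x:::2,2:3 KB');
my $count  = @fields;
print "$count fields: ", join('|', @fields), "\n";

# Trailing empty fields are dropped by default...
my @short = split(/:/, 'a:b::');

# ...unless a negative limit is given.
my @full  = split(/:/, 'a:b::', -1);

print "default keeps ", scalar(@short), ", limit -1 keeps ", scalar(@full), "\n";
```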


# Metacharacters

| Character | Use |
|------|-----|
| \\ | Quote the next metacharacter |
| ^  | Match start of line |
| .  | Match any one character |
| $  | Match end of line |
| &#124; | Alternation |
| () | Grouping |
| \[ \] | Character class |

See [Regexp tutorial](http://perldoc.perl.org/perlre.html)

# Quantifiers

| Character | Use |
|------|-----|
| &#42;| 0 or more |
| +  | 1 or more |
| ?  | 0 or 1 |
| {n}  | exactly _n_ |
| {n,} | _n_ or more |
| {n,m} | at least _n_ but no more than _m_ |

See [Regexp tutorial](http://perldoc.perl.org/perlre.html)

# Escape Sequences

| Character | Use |
|------|-----|
| \\t \\n \\033 | C-style control characters |
| \\l  | lowercase next character |
| \\u  | uppercase next character |
| \\L  | lowercase until **\\E** |
| \\U  | uppercase until **\\E** |
| \\E  | end case modification |
| \\Q  | disable metacharacters until **\\E** |

See [Regexp tutorial](http://perldoc.perl.org/perlre.html)
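
The case-modification and quoting escapes work in double-quoted strings as well as in patterns. A minimal sketch with made-up values:

```perl
use strict;
use warnings;

my $name = 'mARY';

# \L lowercases everything up to \E, then \u capitalizes the first letter
my $fixed = "\u\L$name\E";

# \U uppercases only the text before \E
my $shout = "\Uhello\E there";

# \Q...\E makes metacharacters in an interpolated variable match literally
my $literal = 'a.c';
my $loose = ('abc' =~ /^$literal$/)     ? 1 : 0;   # "." matches any character
my $exact = ('abc' =~ /^\Q$literal\E$/) ? 1 : 0;   # "." must be a real dot

print "$fixed / $shout / loose=$loose exact=$exact\n";
```

Wrapping user-supplied text in **\\Q** ... **\\E** is a common precaution when it is interpolated into a pattern.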

# Character Classes

| Character | Use |
|------|-----|
| \\w | word character: **\[a-zA-Z0-9\_\]** |
| \\W  | non-word character: **\[^a-zA-Z0-9\_\]** |
| \\s  | whitespace character (space, tab, newline) |
| \\S  | non-whitespace character |
| \\d  | digit character: **\[0-9\]** |
| \\D  | non-digit character: **\[^0-9\]** |
| \\1 \\2 \\3  | back references to groupings with parens |

See [Regexp tutorial](http://perldoc.perl.org/perlre.html)

# Capture Buffers

- **\( \)** grouping is saved in capture buffers, referenced as
  **\\1**, **\\2**, ... inside the pattern, or (better, outside the pattern) as **\$1**, **\$2**, ...
    
```
$line =~ /^(\w+) (\d+)\s*(\w+)?$/;
$name = $1; $count = $2; $optlabel = $3;

$fullname =~ s/^(\w+), (\w+)$/$2 $1/;

($fn, $ln) = ($N =~ /^(\w+) \w+ (\w+)$/);
```
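
Here is a runnable sketch of the same capture patterns applied to made-up input. It checks that the match succeeded before reading **$1**, **$2**, and **$3**, since those variables keep their old values after a failed match.

```perl
use strict;
use warnings;

my $line = 'widget 42 blue';
my ($name, $count, $optlabel);

# Only trust $1, $2, $3 when the match succeeds.
if ($line =~ /^(\w+) (\d+)\s*(\w+)?$/) {
    ($name, $count, $optlabel) = ($1, $2, $3);
}

# Swap "Last, First" into "First Last"
my $fullname = 'Lopez, Juan';
$fullname =~ s/^(\w+), (\w+)$/$2 $1/;

# A match in list context returns the captured groups directly.
my ($fn, $ln) = ('Juan Carlos Lopez' =~ /^(\w+) \w+ (\w+)$/);

# \1 inside the pattern refers back to the first group: find a doubled word.
my $doubled = ('the the cat' =~ /\b(\w+) \1\b/) ? $1 : '';

print "$name $count $optlabel\n";
print "$fullname | $fn $ln | doubled: $doubled\n";
```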

# Process Counting

```
use IO::File;     # import module

# open passwd file for reading
my $pwtbl = new IO::File "</etc/passwd";

# run ps command, open output as a pipe for reading
open my $pscom, "ps hauxww |";

# get real name from fifth column, indexed to login in first column
my %realname = ();
while (<$pwtbl>) {
    @fields = split(/:/);
    next unless ($fields[2] > 1000);
    $realname{$fields[0]} = $fields[4];
}

# scan processes, counting by login
my %numprocs = ();
while (<$pscom>) {
   my ($login) = (m/^(\w+)/);
   next unless (exists $realname{$login});
   $numprocs{$login} = 0
        unless (exists $numprocs{$login});
   $numprocs{$login}++;
}

# print a summary of totals by user
foreach my $login (sort keys %numprocs) {
    printf("%4d procs for %9s (%s)\n",
        $numprocs{$login}, $login,
        $realname{$login});
}
```



# References <a class="anchor" id="references"></a>

<img src="https://github.com/keithjtamu/perl/blob/master/image/pointer-dog.gif?raw=1" />

# References

Perl references are scalars which contain a pointer to:

- another scalar
- an array
- a hash table
- a subroutine
- a typeglob

# References to Variables

```
$sc_ref = \$number;   # scalar

$ar_ref = \@namelist; # array

$hs_ref = \%lookup;   # hash

$sb_ref = \&mysub;    # subroutine
```

# References to Anonymous

```
$ar_ref = [ 4, 3, 3, 7 ];       # array

$hs_ref = { m => 6, n => 9 };   # hash

$sb_ref = sub { return(shift(@_) + 1) };   # subroutine
```

# Dereferencing

Dereference by:
    
1. use the type symbol (**\$**, **@**, **%**, **&**) followed by the reference variable (in curly braces **{ }** if necessary)
2. access an element by inserting an arrow **->** between the reference variable and the element specifier: square brackets **\[ \]** for arrays, curly braces **{ }** for hashes, and parens **\( \)** around subroutine arguments

# Bracing References

```
$sc_ref = \$number;   # scalar

printf("%d\n", ${$sc_ref});
printf("%d\n", $number);

$ar_ref = \@namelist; # array

push(@{$ar_ref}, "Henry");
push(@namelist, "Henry");

$hs_ref = \%lookup;   # hash

@logins = keys %{$hs_ref};
@logins = keys %lookup;

$sb_ref = \&mysub;    # subroutine

$rc = &{$sb_ref}($arg1, $arg2);
$rc = mysub($arg1, $arg2);
```

# Leave Off the Braces

You don't always have to surround the reference variable with braces, as long as doing so doesn't create ambiguity.

```
@{$ar_ref}    @$ar_ref
%{$hs_ref}    %$hs_ref
```

# Subelements

- You can use braces:

```
$ar_ref = \@namelist; # array

$fourth = ${$ar_ref}[3];
$fourth = $namelist[3];

foreach $i (0..$#{$ar_ref}) …
foreach $i (0..$#namelist) …
     
$hs_ref = \%lookup; # hash

$myid = ${$hs_ref}{$login};
$myid = $lookup{$login};
```

- Or you can use arrows:

```
$ar_ref = \@namelist; #array

$fourth = $ar_ref->[3];
$fourth = $namelist[3];


$hs_ref = \%lookup; # hash

$myid = ${$hs_ref}{$login};
$myid = $hs_ref->{$login};
$myid = $lookup{$login};

$wrong = $hs_ref{$login};         # WRONG: looks in a hash named %hs_ref

$sb_ref = \&mysub; # subroutine

$rc = &{$sb_ref}($arg1, $arg2);
$rc = $sb_ref->($arg1, $arg2);
$rc = mysub($arg1, $arg2);

$wrong = $sb_ref($arg1, $arg2);   # WRONG: syntax error
$wrong = sb_ref($arg1, $arg2);    # WRONG: calls a subroutine named sb_ref
```
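
The dereferencing forms can be tried out in a short sketch. The data and the body of `mysub` are made up for illustration; the variable names follow the examples above.

```perl
use strict;
use warnings;

my @namelist = ('Ann', 'Bob');
my %lookup   = ( ann => 1001, bob => 1002 );
sub mysub { return $_[0] + $_[1] }

my $ar_ref = \@namelist;
my $hs_ref = \%lookup;
my $sb_ref = \&mysub;

# Changes made through a reference affect the original variable.
push(@$ar_ref, 'Henry');

my $third      = $ar_ref->[2];       # element access with an arrow
my $last_index = $#{$ar_ref};        # last index through the reference
my @logins     = sort keys %$hs_ref; # whole hash, braces left off
my $id         = $hs_ref->{bob};     # hash element with an arrow
my $rc         = $sb_ref->(2, 3);    # call the subroutine

print "names: @namelist\n";
print "third=$third last_index=$last_index logins=@logins id=$id rc=$rc\n";
```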

# Multi-Dimensional Arrays

For 2-dimensional arrays, create a list of references to lists:

```
@table =
(
   [  2, -1,  3 ],
   [  0, 10, -9 ],
   [ 18,  3,  4 ],
);

# These are all the same
$x = ${$table[1]}[2];
$x = $table[1]->[2];
$x = $table[1][2];
```

See [Perl data structures tutorial](https://perldoc.perl.org/perldsc.html)
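
A sketch of looping over the table above by row and column. Each row is an array reference, so **$#{ $table[$r] }** gives the last column index of row `$r`; the row-sum calculation is an illustrative addition.

```perl
use strict;
use warnings;

my @table =
(
   [  2, -1,  3 ],
   [  0, 10, -9 ],
   [ 18,  3,  4 ],
);

# Add up each row
my @row_sums;
foreach my $r (0..$#table)
{
   my $sum = 0;
   foreach my $c (0..$#{ $table[$r] })
   {
      $sum += $table[$r][$c];
   }
   push(@row_sums, $sum);
}

# Elements can be assigned with the same syntax
$table[1][2] = 7;

print "row sums: @row_sums\n";
print "row 1 is now: @{ $table[1] }\n";
```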

# Get More Info at Course Page

See <a href="https://hprc.tamu.edu/training/intro_perl.html">https://hprc.tamu.edu/training/intro_perl.html</a>

- The Intermediate Scripting slides cover some Perl and bash
- The Extended Perl Class has slides from past years, when the short course was taught over several days

[back to top](#TOC)