# Regular Expressions
## Introduction

## Regular Expressions
* Formally, they define a regular language
    * One that can be recognized with a Deterministic Finite Automaton (DFA)
* Informally, they define a pattern to be matched
    * Later we will talk about more advanced uses, like substitution
* Regular expression is commonly shortened to **regexp**, **regex**, or **re**

## Applications of RegEx Matching
* Finding duplicated words in text
    * I went to **the the** store
* Recognizing dates and phone numbers in text 
    <img src="fb_messenger.png" alt="Facebook messenger, with phone number automatically underlined" style="height:25vh">
* Validating input
    <img src="fb_verify.png" alt="Facebook signup screen with message prompting to enter a valid email address">

## Applications of RegEx Substitution
* Removing duplicate words
    * I went to **the the** store $\rightarrow$ I went to **the** store
* Fixing common case errors
    * We will be learning **javascript** this semester $\rightarrow$ We will be learning **JavaScript** this semester
* Reformatting dates and telephone numbers
    * 1-410-455-1000 $\rightarrow$ +1 (410) 455-1000
* Inserting links around phone numbers, emails, etc.
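
As a preview of substitution, the sketch below applies Perl's `s///` operator to two of the examples above. It relies on features that are only introduced later in these slides (groups, `\d`, quantifiers, backreferences), and the variable names are illustrative assumptions rather than part of the lecture code.

```perl
use strict;
use warnings;
use feature qw(say);

# Sample strings copied from the bullets above
my $sentence = "I went to the the store";
my $phone    = "1-410-455-1000";

# Remove a duplicated word: capture a word, then require the same word again
(my $deduped = $sentence) =~ s/\b(\w+)\s+\1\b/$1/;

# Reformat the number: capture each digit group and rearrange the pieces
(my $formatted = $phone) =~ s/^(\d)-(\d{3})-(\d{3})-(\d{4})$/+$1 ($2) $3-$4/;

say $deduped;    # I went to the store
say $formatted;  # +1 (410) 455-1000
```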

## RegExes You Might Already Use
* File searching (file globbing)
    * ls *.png
    * rm *.bak
    * cp Lecture??.html ../
* Find & Replace in word processors/text editors
    * Most allow you to select to use RegEx matching or not
* Search Engines
    * Either allow regexes or borrow concepts from them

## Why Regular Expressions *Seem* Intimidating
* Cryptic and compact
* Whitespace sensitive (typically)
* No standard
    * Differences between implementations
* Some characters are overloaded
* Multiple solutions usually exist
* Can be time-consuming to iteratively tune a regex

## Programming Language Support
* Almost all programming languages support regexes, either natively or through a module/library
* C++
```c++
#include <regex>
```
* Python
```python
import re
```
* Java
```java
import java.util.regex.Pattern;
```

## Perl & RegExs
* Perl is a programming language developed with text processing in mind
    * Current version is Perl 5, which is available on GL
* Regular Expressions are a native part of Perl
* The syntax developed for Perl is extremely popular and many other languages support it
    * Known as Perl Compatible Regular Expressions (PCRE)

## Perl in This Class
* We will use Perl to learn regular expressions, but you are only responsible for a very small subset of Perl
* All code will take one of the following two forms

```
foreach my $n (@names) {
  print $n if $n =~ /REGEX_HERE/;
}
```

or 

```
while (<>) {
    print if /REGEX_HERE/;
}

```

In [None]:
# Enables say, which prints its arguments followed by a newline
# It activates a feature that will be standard in Perl 6
use feature qw(say);

## Data for today's examples
* We will be using the <a href="http://www.rollingstone.com/music/lists/the-500-greatest-songs-of-all-time-20110407">500 Greatest Songs of All Time</a>, as published by Rolling Stone Magazine in 2010
* For convenience, we have stored each song and artist as a string in a large array

In [None]:
my @songs = ("Like a Rolling Stone by Bob Dylan","(I Can\'t Get No) Satisfaction by The Rolling Stones","Imagine by John Lennon","What\'s Going On by Marvin Gaye","Respect by Aretha Franklin","Good Vibrations by The Beach Boys","Johnny B. Goode by Chuck Berry","Hey Jude by The Beatles","Smells Like Teen Spirit by Nirvana","What\'d I Say by Ray Charles","My Generation by The Who","A Change Is Gonna Come by Sam Cooke","Yesterday by The Beatles","Blowin\' in the Wind by Bob Dylan","London Calling by The Clash","I Want to Hold Your Hand by The Beatles","Purple Haze by The Jimi Hendrix Experience","Maybellene by Chuck Berry","Hound Dog by Elvis Presley","Let It Be by The Beatles","Born to Run by Bruce Springsteen","Be My Baby by The Ronettes","In My Life by The Beatles","People Get Ready by The Impressions","God Only Knows by The Beach Boys","(Sittin\' On) The Dock of the Bay by Otis Redding","Layla by Derek and the Dominos","A Day in the Life by The Beatles","Help! by The Beatles","I Walk the Line by Johnny Cash","Stairway to Heaven by Led Zeppelin","Sympathy for the Devil by The Rolling Stones","River Deep - Mountain High by Ike & Tina Turner","You\'ve Lost That Lovin\' Feelin\' by The Righteous Brothers","Light My Fire by The Doors","One by U2","No Woman, No Cry by Bob Marley & The Wailers","Gimme Shelter by The Rolling Stones","That\'ll Be the Day by Buddy Holly & The Crickets","Dancing in the Street by Martha and the Vandellas","The Weight by The Band","Waterloo Sunset by The Kinks","Tutti-Frutti by Little Richard","Georgia On My Mind by Ray Charles","Heartbreak Hotel by Elvis Presley","Heroes by David Bowie","All Along the Watchtower by The Jimi Hendrix Experience","Bridge Over Troubled Water by Simon & Garfunkel","Hotel California by Eagles","The Tracks Of My Tears by Smokey Robinson and the Miracles","The Message by Grandmaster Flash and the Furious Five","When Doves Cry by Prince and The Revolution","When A Man Loves A Woman by Percy Sledge","Louie Louie by The 
Kingsmen","Long Tall Sally by Little Richard","Anarchy in the U.K. by Sex Pistols","A Whiter Shade of Pale by Procol Harum","Billie Jean by Michael Jackson","The Times They Are a-Changin\' by Bob Dylan","Let\'s Stay Together by Al Green","Whole Lotta Shakin\' Going On by Jerry Lee Lewis","Bo Diddley by Bo Diddley","For What It\'s Worth by Buffalo Springfield","She Loves You by The Beatles","Sunshine of Your Love by Cream","Redemption Song by Bob Marley & The Wailers","Jailhouse Rock by Elvis Presley","Tangled Up in Blue by Bob Dylan","Crying by Roy Orbison","Walk On By by Dionne Warwick","Papa\'s Got a Brand  Bag by James Brown","California Girls by The Beach Boys","Superstition by Stevie Wonder","Summertime Blues by Eddie Cochran","Whole Lotta Love by Led Zeppelin","Strawberry Fields Forever by The Beatles","Mystery Train by Elvis Presley","I Got You (I Feel Good) by James Brown","Mr. Tambourine Man by The Byrds","You Really Got Me by The Kinks","I Heard It Through the Grapevine by Marvin Gaye","Blueberry Hill by Fats Domino","Norwegian Wood (This Bird Has Flown) by The Beatles","Every Breath You Take by The Police","Crazy by Patsy Cline","Thunder Road by Bruce Springsteen","Ring of Fire by Johnny Cash","My Girl by The Temptations","California Dreamin\' by The Mamas & The Papas","In The Still Of The Nite by The Five Satins","Suspicious Minds by Elvis Presley","Blitzkrieg Bop by Ramones","I Still Haven\'t Found What I\'m Looking For by U2","Good Golly, Miss Molly by Little Richard","Blue Suede Shoes by Carl Perkins","Great Balls of Fire by Jerry Lee Lewis","Roll Over Beethoven by Chuck Berry","Love And Happiness by Al Green","Fortunate Son by Creedence Clearwater Revival","Crazy by Gnarls Barkley","You Can\'t Always Get What You Want by The Rolling Stones","Voodoo Child (Slight Return) by The Jimi Hendrix Experience","Be-Bop-A-Lula by Gene Vincent & His Blue Caps","Hot Stuff by Donna Summer","Living For The City by Stevie Wonder","The Boxer by Simon & 
Garfunkel","Mr. Tambourine Man by Bob Dylan","Not Fade Away by Buddy Holly & The Crickets","Little Red Corvette by Prince","Brown Eyed Girl by Van Morrison","I\'ve Been Loving You Too Long (To Stop Now) by Otis Redding","I\'m So Lonesome I Could Cry by Hank Williams","That\'s All Right by Elvis Presley","Up On the Roof by The Drifters","You Send Me by Sam Cooke","Honky Tonk Women by The Rolling Stones","Take Me To The River by Al Green","Crazy in Love (feat. Jay-Z) by Beyonc\xc3\xa9","Shout (Parts 1 & 2) by The Isley Brothers","Go Your Own Way by Fleetwood Mac","I Want You Back by The Jackson 5","Stand By Me by Ben E. King","The House of the Rising Sun by The Animals","It\'s A Man\'s Man\'s Man\'s World by James Brown","Jumpin\' Jack Flash by The Rolling Stones","Will You Love Me Tomorrow by The Shirelles","Shake, Rattle and Roll by Big Joe Turner","Changes by David Bowie","Rock and Roll Music by Chuck Berry","Born to Be Wild by Steppenwolf","Maggie May by Rod Stewart","With or Without You by U2","Who Do You Love? 
by Bo Diddley","Won\'t Get Fooled Again by The Who","In the Midnight Hour by Wilson Pickett","While My Guitar Gently Weeps by The Beatles","Your Song by Elton John","Eleanor Rigby by The Beatles","Family Affair by Sly & The Family Stone","I Saw Her Standing There by The Beatles","Kashmir by Led Zeppelin","All I Have to Do Is Dream by The Everly Brothers","Please, Please, Please by James Brown","Purple Rain by Prince and The Revolution","I Wanna Be Sedated by Ramones","Everyday People by Sly & The Family Stone","Rock Lobster by The B-52\'s","Me And Bobby McGee by Janis Joplin","Lust For Life by Iggy Pop","Cathy\'s Clown by The Everly Brothers","Eight Miles High by The Byrds","Earth Angel by The Penguins","Foxey Lady by The Jimi Hendrix Experience","A Hard Day\'s Night by The Beatles","Rave On by Buddy Holly & The Crickets","Proud Mary by Creedence Clearwater Revival","The Sound of Silence by Simon & Garfunkel","I Only Have Eyes for You by The Flamingos","(We\'re Gonna) Rock Around The Clock by Bill Haley & His Comets","Moment of Surrender by U2","I\'m Waiting for the Man by The Velvet Underground","Bring the Noise by Public Enemy","Folsom Prison Blues by Johnny Cash","I Can\'t Stop Loving You by Ray Charles","Nothing Compares 2 U by Sinead O\'Connor","Bohemian Rhapsody by Queen","Fast Car by Tracy Chapman","Let\'s Get It On by Marvin Gaye","Papa Was a Rollin\' Stone by The Temptations","Losing My Religion by R.E.M.","Both Sides Now by Joni Mitchell","99 Problems by Jay-Z","Dream On by Aerosmith","Dancing Queen by ABBA","God Save the Queen by Sex Pistols","Paint It, Black by The Rolling Stones","I Fought the Law by The Bobby Fuller Four","Don\'t Worry Baby by The Beach Boys","Free Fallin\' by Tom Petty","September Gurls by Big Star","Love Will Tear Us Apart by Joy Division","Hey Ya! by OutKast","Green Onions by Booker T. & The MG\'s","Save the Last Dance for Me by The Drifters","The Thrill Is Gone by B.B. 
King","Please Please Me by The Beatles","Desolation Row by Bob Dylan","Who\'ll Stop the Rain by Creedence Clearwater Revival","I Never Loved a Man (the Way I Love You) by Aretha Franklin","Back in Black by AC/DC","Stayin\' Alive by Bee Gees","Knocking On Heaven\'s Door by Bob Dylan","Free Bird by Lynyrd Skynyrd","Rehab by Amy Winehouse","Wichita Lineman by Glen Campbell","There Goes My Baby by The Drifters","Peggy Sue by Buddy Holly","Sweet Child o\' Mine by Guns N\' Roses","Maybe by The Chantels","Don\'t Be Cruel by Elvis Presley","Hey Joe by The Jimi Hendrix Experience","Flash Light by Parliament","Loser by Beck","Bizarre Love Triangle by Order","Come Together by The Beatles","Positively 4th Street by Bob Dylan","Try a Little Tenderness by Otis Redding","Lean on Me by Bill Withers","Reach Out I\'ll Be There by The Four Tops","Bye Bye Love by The Everly Brothers","Gloria by Them","In My Room by The Beach Boys","96 Tears by ? and the Mysterians","Caroline, No by The Beach Boys","by Prince \xe2\x80\x93","Rockin\' in the Free World by Neil Young","Your Cheatin\' Heart by Hank Williams","Do You Believe In Magic by The Lovin\' Spoonful","Jolene by Dolly Parton","Boom Boom by John Lee Hooker","Spoonful by Howlin\' Wolf","Walk away Ren\xc3\xa9e by The Left Banke","Walk on the Wild Side by Lou Reed","Oh, Pretty Woman by Roy Orbison","Dance To The Music by Sly & The Family Stone","Hoochie Coochie Man by Muddy Waters","Fire and Rain by James Taylor","Should I Stay or Should I Go by The Clash","Good Times by Chic","Mannish Boy by Muddy Waters","Moondance by Van Morrison","Just Like a Woman by Bob Dylan","Sexual Healing by Marvin Gaye","Only the Lonely by Roy Orbison","We Gotta Get Out Of This Place by The Animals","Paper Planes by M.I.A.","I\'ll Feel A Whole Lot Better by The Byrds","Everyday by Buddy Holly & The Crickets","I Got a Woman by Ray Charles","Planet Rock by Afrika Bambaataa & the Soulsonic Force","I Fall to Pieces by Patsy Cline","Son of a Preacher Man by Dusty 
Springfield","The Wanderer by Dion","Stand! by Sly & The Family Stone","Rocket Man (I Think It\'s Going To Be A Long, Long Time) by Elton John","Love Shack by The B-52\'s","Gimme Some Lovin\' by The Spencer Davis Group","(Your Love Keeps Lifting Me) Higher And Higher by Jackie Wilson","The Night They Drove Old Dixie Down by The Band","Hot Fun In The Summertime by Sly & The Family Stone","Rapper\'s Delight by The Sugarhill Gang","Chain Of Fools by Aretha Franklin","Paranoid by Black Sabbath","Money Honey by The Drifters","Mack the Knife by Bobby Darin","All the Young Dudes by Mott the Hoople","Paranoid Android by Radiohead","Highway To Hell by AC/DC","Heart of Glass by Blondie","Mississippi by Bob Dylan","Wild Thing by The Troggs","I Can See for Miles by The Who","Oh, What A Night by The Dells","Hallelujah by Jeff Buckley","Higher Ground by Stevie Wonder","Ooo Baby Baby by Smokey Robinson and the Miracles","He\'s a Rebel by The Crystals","Sail Away by Randy man","Walking In The Rain by The Ronettes","Tighten Up by Archie Bell & The Drells","Personality Crisis by York Dolls","Sunday Bloody Sunday by U2","Jesus Walks by Kanye West","Roadrunner by The Modern Lovers","He Stopped Loving Her Today by George Jones","Sloop John B by The Beach Boys","Sweet Little Sixteen by Chuck Berry","Something by The Beatles","Somebody to Love by Jefferson Airplane","Born in the U.S.A. 
by Bruce Springsteen","I\'ll Take You There by The Staple Singers","Ziggy Stardust by David Bowie","Pictures of You by The Cure","Chapel Of Love by The Dixie Cups","Ain\'t No Sunshine by Bill Withers","Seven Nation Army by The White Stripes","You Are the Sunshine of My Life by Stevie Wonder","Help Me by Joni Mitchell","Call Me by Blondie","(What\'s So Funny \'Bout) Peace, Love And Understanding by Elvis Costello & The Attractions","Smokestack Lightning by Howlin\' Wolf","Summer Babe (Winter Version) by Pavement","Walk This Way by Run-D.M.C.","Money (That\'s What I Want) by Barrett Strong","Can\'t Buy Me Love by The Beatles","Stan (feat. Dido) by Eminem","She\'s Not There by The Zombies","Train in Vain by The Clash","Tired Of Being Alone by Al Green","Black Dog by Led Zeppelin","Street Fighting Man by The Rolling Stones","Get Up, Stand Up by Bob Marley & The Wailers","Heart of Gold by Neil Young","Sign \'O\' The Times by Prince","One Way or Another by Blondie","Like a Prayer by Madonna","One More Time by Daft Punk","Da Ya Think I\'m Sexy? 
by Rod Stewart","Blue Eyes Crying In The Rain by Willie Nelson","Ruby Tuesday by The Rolling Stones","With a Little Help from My Friends by The Beatles","Say It Loud (I\'m Black and I\'m Proud) by James Brown","That\'s Entertainment by The Jam","Why Do Fools Fall In Love by Frankie Lymon & The Teenagers","Lonely Teardrops by Jackie Wilson","What\'s Love Got to Do With It by Tina Turner","Iron Man by Black Sabbath","Wake Up Little Susie by The Everly Brothers","In Dreams by Roy Orbison","I Put A Spell On You by Screamin\' Jay Hawkins","Comfortably Numb by Pink Floyd","Don\'t Let Me Be Misunderstood by The Animals","Alison by Elvis Costello","Wish You Were Here by Pink Floyd","Many Rivers to Cross by Jimmy Cliff","School\'s Out by Alice Cooper","Take Me Out by Franz Ferdinand","Heartbreaker by Led Zeppelin","Cortez the Killer by Neil Young & Crazy Horse","Fight the Power by Public Enemy","Dancing Barefoot by Patti Smith Group","Baby Love by The Supremes","Good Lovin\' by The Rascals","Get Up (I Feel Like Being a) Sex Machine by James Brown","For Your Precious Love by The Impressions","The End by The Doors","That\'s The Way of the World by Earth, Wind and Fire","We Will Rock You by Queen","I Can\'t Make You Love Me by Bonnie Raitt","Subterranean Homesick Blues by Bob Dylan","Spirit in the Sky by Norman Greenbaum","Sweet Jane by The Velvet Underground","Wild Horses by The Rolling Stones","Beat It by Michael Jackson","Beautiful Day by U2","Walk This Way by Aerosmith","Maybe I\'m Amazed by Paul McCartney","You Keep Me Hangin\' On by The Supremes","Baba O\'Riley by The Who","The Harder They Come by Jimmy Cliff","Runaround Sue by Dion","Jim Dandy by Lavern Baker","Piece of My Heart by Big Brother & The Holding Company","La Bamba by Ritchie Valens","California Love (remix) (feat. Dr. Dre & Roger Troutman) by 2Pac","Candle in the Wind by Elton John","That Lady (Parts 1 & 2) by The Isley Brothers","Spanish Harlem by Ben E. 
King","The Loco-Motion by Little Eva","The Great Pretender by The Platters","All Shook Up by Elvis Presley","Tears in Heaven by Eric Clapton","Watching The Detectives by Elvis Costello","Bad Moon Rising by Creedence Clearwater Revival","Sweet Dreams (Are Made of This) by Eurythmics","Little Wing by The Jimi Hendrix Experience","Nowhere To Run by Martha and the Vandellas","Got My Mojo Working by Muddy Waters","Killing Me Softly With His Song by Roberta Flack","All You Need Is Love by The Beatles","Complete Control by The Clash","The Letter by The Box Tops","Highway 61 Revisited by Bob Dylan","Unchained Melody by The Righteous Brothers","How Deep Is Your Love by Bee Gees","White Room by Cream","Personal Jesus by Depeche Mode","I\'m A Man by Bo Diddley","The Wind Cries Mary by The Jimi Hendrix Experience","I Can\'t Explain by The Who","Marquee Moon by Television","Wonderful World by Sam Cooke","Brown Eyed Handsome Man by Chuck Berry","Another Brick In The Wall (Part II) by Pink Floyd","Fake Plastic Trees by Radiohead","Maps by Yeah Yeah Yeahs","Hit the Road Jack by Ray Charles","Pride (in the Name of Love) by U2","Radio Free Europe by R.E.M.","Goodbye Yellow Brick Road by Elton John","Tell It Like It Is by Aaron Neville","Bitter Sweet Symphony by The Verve","Whipping Post by The Allman Brothers Band","Ticket to Ride by The Beatles","Ohio by Crosby, Stills, Nash & Young","I Know You Got Soul by Eric B. 
& Rakim","Tiny Dancer by Elton John","Roxanne by The Police","Just My Imagination (Running Away With Me) by The Temptations","Baby I Need Your Loving by The Four Tops","Summer in the City by The Lovin\' Spoonful","O-o-h Child by The Five Stairsteps","Can\'t Help Falling in Love by Elvis Presley","Remember (Walkin\' In The Sand) by The Shangri-Las","(Don\'t Fear) The Reaper by Blue \xc3\x96yster Cult","Thirteen by Big Star","Sweet Home Alabama by Lynyrd Skynyrd","Enter Sandman by Metallica","Tonight\'s the Night by The Shirelles","Thank You (Falettinme Be Mice Elf Agin) by Sly & The Family Stone","C\'mon Everybody by Eddie Cochran","Umbrella (feat. Jay-Z) by Rihanna","Visions of Johanna by Bob Dylan","We\'ve Only Just Begun by Carpenters","In Bloom by Nirvana","Sweet Emotion by Aerosmith","Monkey Gone to Heaven by Pixies","I Feel Love by Donna Summer","Ode to Billie Joe by Bobbie Gentry","The Girl Can\'t Help It by Little Richard","Young Blood by The Coasters","I Can\'t Help Myself by The Four Tops","The Boys Of Summer by Don Henley","Juicy by The Notorious B.I.G","Suite: Judy Blue Eyes by Crosby, Stills & Nash","Nuthin\' but a \'G\' Thang by Dr. 
Dre","It\'s Your Thing by The Isley Brothers","Piano Man by Billy Joel","Blue Suede Shoes by Elvis Presley","William, It Was Really Nothing by The Smiths","American Idiot by Green day","Tumbling Dice by The Rolling Stones","Smoke on the Water by Deep Purple","Year\'s Day by U2","Everybody Needs Somebody To Love by Solomon Burke","(White Man) In Hammersmith Palais by The Clash","Ain\'t That a Shame by Fats Domino","Midnight Train To Georgia by Gladys Knight & The Pips","Ramble On by Led Zeppelin","Mustang Sally by Wilson Pickett","Alone Again Or by Love","Beast of Burden by The Rolling Stones","Love Me Tender by Elvis Presley","I Wanna Be Your Dog by The Stooges","Push It by Salt-N-Pepa","Pink Houses by John Mellencamp","In Da Club by 50 Cent","Come Go With Me by The Del-Vikings","I Shot The Sheriff by Bob Marley & The Wailers","I Got You Babe by Sonny & Cher","Come as You Are by Nirvana","Pressure Drop by Toot and the Maytals","Leader of the Pack by The Shangri-Las","Heroin by The Velvet Underground","Penny Lane by The Beatles","The Twist by Chubby Checker","Cupid by Sam Cooke","Paradise City by Guns N\' Roses","My Sweet Lord by George Harrison","Sheena Is a Punk Rocker by Ramones","All Apologies by Nirvana","Soul Man by Sam & Dave","Kiss by Prince and The Revolution","Rollin\' Stone by Muddy Waters","Get Ur Freak On by Missy Elliott","Big Pimpin\' (feat. UGK) by Jay-Z","Respect Yourself by The Staple Singers","Rain by The Beatles","Standing In The Shadows Of Love by The Four Tops","Surrender by Cheap Trick","Runaway by Del Shannon","Welcome to the Jungle by Guns N\' Roses","Into The Mystic by Van Morrison","Where Did Our Love Go by The Supremes","Do Right Woman, Do Right Man by Aretha Franklin","How Soon Is Now? 
by The Smiths","Last Nite by The Strokes","I Want To Know What Love Is by Foreigner","Sabotage by Beastie Boys","Super Freak by Rick James","Since U Been Gone by Kelly Clarkson","White Rabbit by Jefferson Airplane","Cry Me a River by Justin Timberlake","Lady Marmalade by Labelle","Young Americans by David Bowie","I\'m Eighteen by Alice Cooper","Just Like Heaven by The Cure","Under the Boardwalk by The Drifters","Clocks by Coldplay","I Love Rock \'N Roll by Joan Jett and The Blackhearts","I Will Survive by Gloria Gaynor","Time to Pretend by MGMT","Ignition (Remix) by R. Kelly","Brown Sugar by The Rolling Stones","Running On Empty by Jackson Browne","The Rising by Bruce Springsteen","Miss You by The Rolling Stones","Buddy Holly by Weezer","Shop Around by Smokey Robinson and the Miracles")

In [None]:
say join "\n", @songs[0..5];

## Literals

* Literals match exactly
    * Every character matches only with itself


In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Love/;
}
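
A small sketch of what "matches exactly" implies: a literal can match anywhere inside the string, but the case of every character has to agree. The first two strings come from the song list; the lowercase one is a made-up sample for contrast.

```perl
use strict;
use warnings;
use feature qw(say);

my @samples = (
  "Baby Love by The Supremes",    # contains "Love"
  "Good Lovin' by The Rascals",   # contains "Lovi", not "Love"
  "love song",                    # hypothetical: lowercase "l"
);

# Keep only the strings in which the literal L-o-v-e appears
my @hits = grep { /Love/ } @samples;

say for @hits;   # Baby Love by The Supremes
```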

## Meta-Characters
* Meta-Characters are defined by each regex implementation
* They do not match themselves, but instead have special meaning
* We will go into detail about each of them, but some examples are 
    * .
    * \
    * [
    * ]
    * (
    * )

## Matching a Single Character
* There are three main ways to match a single character using meta-characters
    * The dot character
    * Programmer-defined character class
    * Built-in character class

## The Dot Character
* Matches any single character **except** newlines
    * This behavior can usually be changed to also match newlines

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Lov./;
}
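
The newline exception for the dot is easy to overlook, so here is a minimal sketch of it. In Perl the behavior is switched with the `s` modifier placed after the final delimiter; the two-line string is a made-up sample, not part of the song data.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical two-line string
my $text = "Love\nSong";

# By default the dot refuses to match the newline between the two words
my $plain  = ($text =~ /Love.Song/)  ? "match" : "no match";

# With the s modifier the dot matches any character, newline included
my $single = ($text =~ /Love.Song/s) ? "match" : "no match";

say "default: $plain";    # default: no match
say "with /s: $single";   # with /s: match
```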

## Character Class
* Often matching any character is not the desired outcome
* To specify a set of characters to match, enclose them in square brackets []
* Characters can either be enumerated like 
    * [abcde]
* Or defined as a range by using a hyphen inside the brackets
    * [a-z]

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Some[bt]/;
}

## More Character Class Examples

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /[&!]/;
} 

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /[1-9]/;
}

## Negation of a Character Class
* By placing a caret symbol (**^**) as the first character after the bracket, the meaning becomes *match anything but this character class*
* [^0-9] matches any character besides a digit

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Lov[^e]/;
}
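
One detail worth checking by hand: a negated class still has to consume a character, so it cannot match at the end of the string. The sketch below uses made-up sample strings and a class that excludes both `e` and the digits.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical samples chosen to exercise each case
my @samples = ("Love", "Lovin'", "Lov", "Lov3");

my %result;
foreach my $s (@samples) {
  # After "Lov" there must be one character that is neither "e" nor a digit
  $result{$s} = ($s =~ /Lov[^e0-9]/) ? "match" : "no match";
  say "$s: $result{$s}";
}
```

Only `Lovin'` matches: `Love` and `Lov3` are followed by an excluded character, and `Lov` has no following character at all.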

## Built-in Character Classes
* Many specific character classes are used over and over
    * It becomes very time-consuming to always type out [0-9] or [a-zA-Z]
* These common classes have shortcuts that can be used to refer to them
    * \w matches any letter, digit, or underscore
    * \W matches any character that is not a letter, digit, or underscore
    * \d matches any digit
    * \D matches any character that is not a digit
    * \s matches any whitespace character (space, tab, newline, etc.)
    * \S matches any non-whitespace character

## Built-in Character Class Examples

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /\d\d/;
}

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /by\s\d\d/;
}
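
To see exactly which characters each shortcut accepts, the following sketch tests a short made-up string one character at a time. The string and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical sample: letters, underscore, hyphen, digit, space, tab, "!"
my @chars = split //, "a_Z-9 \t!";

# Collect the characters accepted by each built-in class
my $word        = join "", grep { /\w/ } @chars;
my $digit       = join "", grep { /\d/ } @chars;
my $space_count = grep { /\s/ } @chars;   # scalar context gives a count
my $non_word    = grep { /\W/ } @chars;

say "\\w accepts: $word";                 # a_Z9 (note the underscore)
say "\\d accepts: $digit";                # 9
say "\\s accepts $space_count characters"; # the space and the tab
say "\\W accepts $non_word characters";    # hyphen, space, tab, "!"
```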

## Alternation
* Sometimes we want to match from a set of not just characters, but entire strings
* The pipe character **|** can be used to indicate alternation
* Important note about ordering: The match that starts earliest in the text is used, regardless of where its alternative appears in the regex. Only when several alternatives can match at the same position does Perl prefer the leftmost alternative

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /You|Me/;
}
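
The ordering note can be checked directly. This sketch wraps each alternation in parentheses so the matched text can be captured (grouping is covered in the next section); the string `Lovin'` is a made-up sample used to show what happens when two alternatives could match at the same position.

```perl
use strict;
use warnings;
use feature qw(say);

my $song = "You Send Me by Sam Cooke";

# "You" starts earlier in the text than "Me", so both orderings find "You"
my ($first)  = $song =~ /(You|Me)/;
my ($second) = $song =~ /(Me|You)/;
say "$first / $second";   # You / You

# When both alternatives can match at the same position,
# Perl tries them left to right and keeps the first that succeeds
my ($short) = "Lovin'" =~ /(Lov|Lovin)/;
my ($long)  = "Lovin'" =~ /(Lovin|Lov)/;
say "$short / $long";     # Lov / Lovin
```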

## Grouping
* We will see shortly that it is very useful to group together certain parts of an expression
    * This will also be useful when we talk about substitution
* To group anything together, wrap it in parentheses **( )**

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /(Your|My)self/;
}
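
Grouping also limits how far an alternation reaches. As a rough comparison, the sketch below runs the pattern with and without parentheses over three titles taken from the song list.

```perl
use strict;
use warnings;
use feature qw(say);

my @titles = (
  "Your Song by Elton John",
  "Respect Yourself by The Staple Singers",
  "My Girl by The Temptations",
);

# Without a group, the alternatives are "Your" and "Myself"
my @ungrouped = grep { /Your|Myself/ } @titles;

# With a group, the alternatives are "Your" and "My", each followed by "self"
my @grouped = grep { /(Your|My)self/ } @titles;

say scalar(@ungrouped), " ungrouped matches";  # 2 ungrouped matches
say scalar(@grouped), " grouped match";        # 1 grouped match
```

The ungrouped pattern accepts `Your Song` because `Your` alone is a complete alternative, while the grouped pattern only accepts `Respect Yourself`.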

## Quantifiers
* Quantifiers allow us to specify how many times a particular character, class, or group should occur
* There are four main types of quantifiers
    * ? must occur 0 or 1 times
    * \* must occur 0 or more times
    * \+ must occur 1 or more times
    * Curly braces can be used to provide a custom range, e.g. {2,} for 2 or more times

## Quantifier Examples

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Loving?/;
}

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Lov.*You/;
}

## Quantifier Examples

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Danc\w+/;
}

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Lov\w{2,}\sYou/;
}
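
The curly-brace forms are easiest to compare side by side. The sketch below is an illustration on made-up words, and it anchors each pattern with `^` and `$` (see the Anchors section) so that the count applies to the whole string.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical samples with 0, 1, 2, 3, and 5 characters after "Lov"
my @samples = ("Lov", "Love", "Loved", "Loving", "Lovingly");

my @exactly_two  = grep { /^Lov\w{2}$/ }   @samples;  # {2}   exactly 2
my @two_or_more  = grep { /^Lov\w{2,}$/ }  @samples;  # {2,}  2 or more
my @one_to_three = grep { /^Lov\w{1,3}$/ } @samples;  # {1,3} between 1 and 3

say "{2}:   @exactly_two";    # Loved
say "{2,}:  @two_or_more";    # Loved Loving Lovingly
say "{1,3}: @one_to_three";   # Love Loved Loving
```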

## Greedy and Non-Greedy Quantifiers
* By default, all quantifiers will attempt to match as much text as they can
* To change this behavior, add an extra **?** symbol after the quantifier
* The non-greedy quantifiers are:
    * ??
    * \*?
    * \+?
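
A short sketch of the difference, using a title from the song list that contains two parenthesized parts. The parentheses in the text are matched with `\(` and `\)` (escaping is covered later), and an outer group captures whatever the pattern matched.

```perl
use strict;
use warnings;
use feature qw(say);

my $song = "California Love (remix) (feat. Dr. Dre & Roger Troutman) by 2Pac";

# Greedy: .* runs as far as it can, so the match ends at the LAST ")"
my ($greedy) = $song =~ /(\(.*\))/;

# Non-greedy: .*? stops as soon as it can, so the match ends at the FIRST ")"
my ($lazy) = $song =~ /(\(.*?\))/;

say $greedy;   # (remix) (feat. Dr. Dre & Roger Troutman)
say $lazy;     # (remix)
```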

## Anchors 
* Anchors allow us to specify that the match must occur at the start or end of the string
    * Caret **^** as the first symbol of a regex forces the match to occur at the beginning of a string
    * Dollar-sign **\$** as the last symbol of the regex forces the match to occur at the end of a string

## Anchor Examples

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /^Danc/;
}

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /^Love/;
}

## Anchor Examples

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /Brothers$/;
}

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /es$/;
}

## Matching Spaces
* We saw earlier that the \s character class matches any whitespace character
* To match a particular whitespace character (very useful in data processing) use the following:
    * \t tab character
    * \n newline character
    * \r carriage return
    * \f form feed
* \b matches a word boundary: a zero-width position with a word character (\w) on one side and a non-word character, or the start or end of the string, on the other
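
The sketch below illustrates two of these: `\t` to split a tab-separated record and `\b` to require a whole word. The tab-separated record is a hypothetical data format, not one used elsewhere in these slides; the titles are from the song list.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical tab-separated record: title, then artist
my $record = "Hey Jude\tThe Beatles";
my ($title, $artist) = split /\t/, $record;
say "$title is by $artist";   # Hey Jude is by The Beatles

my @titles = ("Blowin' in the Wind", "Rain", "Heroin");

# Plain /in/ matches the letters anywhere, even inside longer words
my @substring  = grep { /in/ } @titles;

# /\bin\b/ requires a boundary on both sides, so only the standalone word counts
my @whole_word = grep { /\bin\b/ } @titles;

say scalar(@substring), " titles contain 'in'";             # 3
say scalar(@whole_word), " title contains the word 'in'";   # 1
```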

## Escaping Meta-Characters
* To match a meta-character as a literal, use the backslash to escape it
    * \. matches a literal period in the string

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /\./;
}
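
Forgetting the backslash rarely produces an error; the pattern just matches more than intended. A minimal sketch, where the second string is a made-up sample chosen to trigger the accidental match:

```perl
use strict;
use warnings;
use feature qw(say);

my @samples = (
  "Anarchy in the U.K. by Sex Pistols",
  "UNKNOWN ARTIST",                     # hypothetical sample
);

# Unescaped dots match any character, so "UNKN" satisfies U.K. as well
my @unescaped = grep { /U.K./ } @samples;

# Escaped dots match only literal periods
my @escaped = grep { /U\.K\./ } @samples;

say scalar(@unescaped), " matches without escaping";  # 2
say scalar(@escaped), " match with escaping";         # 1
```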

## Backreferences
* One of the most powerful uses of grouping is to require that the text matched by a group appear again later in the expression
* Each group is assigned a number by the regular expression engine
    * To refer back to that group, use backslash followed by the number, e.g. \1

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /(\w)\1/;
}

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /(\b\w+\b)\s\1/;
}

## Backreference Ordering
* If there are multiple groups in a regex, they are numbered in the order of their left (opening) parentheses
* This can get confusing; here is a helpful chart presented by Dan Hood
<img src="registers.png" alt="Diagram showing numbering of groups, the outermost group being labeled 1, followed by 2, 3, etc.">
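
The numbering rule can also be checked in code. In Perl, a match in list context returns the captured groups in number order, which this sketch uses to show which group receives which text; the pattern itself is an illustrative assumption, not taken from the chart.

```perl
use strict;
use warnings;
use feature qw(say);

my $song = "Hey Jude by The Beatles";

# Left parentheses in order: 1 = whole title, 2 = first word,
# 3 = second word, 4 = artist
my ($title, $first, $second, $artist) =
  $song =~ /^((\w+) (\w+)) by (.+)$/;

say "1: $title";    # 1: Hey Jude
say "2: $first";    # 2: Hey
say "3: $second";   # 3: Jude
say "4: $artist";   # 4: The Beatles
```

The outer group opens first, so it is number 1 even though it closes after groups 2 and 3.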

## Modifiers
* Modifiers are placed after the final delimiter in Perl to change the behavior of the regex
    * Other languages may use flags or arguments to functions
* Common modifiers are
    * i - perform a case-insensitive match
    * g - matches all instances in a string, not just the first
        * important if you want to substitute all matches 

In [None]:
foreach my $s (@songs) {
  say $s if $s =~ /of/i;
}
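
The effect of `g` is clearest with a string that contains the pattern several times. The sketch below counts matches and then compares a substitution with and without the modifier, using a title from the song list; the replacement word is only an illustration.

```perl
use strict;
use warnings;
use feature qw(say);

my $title = "It's A Man's Man's Man's World";

# In list context, /g returns every match instead of stopping at the first
my @all = $title =~ /Man/g;
say scalar(@all), " matches";   # 3 matches

# Without g, only the first occurrence is replaced
(my $first_only = $title) =~ s/Man/Woman/;

# With g, every occurrence is replaced
(my $every = $title) =~ s/Man/Woman/g;

say $first_only;   # It's A Woman's Man's Man's World
say $every;        # It's A Woman's Woman's Woman's World
```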