## Introduction to Regular Expression Syntax

Let's learn a little regular expression syntax.  First, a tip: many ready-made regular expressions are available online.  If you get stuck, search for "regular expression" plus what you want to accomplish, and you'll often find good examples.  But let's dive into the rules!

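- \+ means 1 or more of the preceding character or group
- \* means 0 or more of the preceding character or group
- ? means 0 or 1 of the preceding character or group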

In [2]:
import pandas as pd
s1 = pd.Series(['bt','bot','boot','booot'])

In [3]:
s1.str.fullmatch('bo+t')

0    False
1     True
2     True
3     True
dtype: bool

In [4]:
s1.str.fullmatch('bo*t')

0    True
1    True
2    True
3    True
dtype: bool

In [14]:
s1.str.fullmatch('bo?t')

0     True
1     True
2    False
3    False
dtype: bool
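
As a cross-check, the same three quantifiers can be tried with Python's built-in `re` module, which accepts the same pattern syntax.  This is only a minimal sketch: `full_matches` is a hypothetical helper written for this illustration, not a function from pandas or `re`.

```python
import re

words = ['bt', 'bot', 'boot', 'booot']

def full_matches(pattern, strings):
    """Return one boolean per string: True if the whole string matches."""
    return [re.fullmatch(pattern, s) is not None for s in strings]

plus_result = full_matches(r'bo+t', words)      # one or more 'o'
star_result = full_matches(r'bo*t', words)      # zero or more 'o'
optional_result = full_matches(r'bo?t', words)  # zero or one 'o'

print(plus_result)
print(star_result)
print(optional_result)
```

The three lists should line up with the pandas results above: only `+` rejects 'bt', and only `?` rejects 'boot' and 'booot'.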

- a|b means a or b

In [5]:
s2 = pd.Series(['bat','bot','bit'])

In [10]:
s2.str.fullmatch('bat|bit')

0     True
1    False
2     True
dtype: bool

- parentheses group part of an expression, so the grouped part is evaluated first

In [11]:
s2.str.fullmatch('b(a|i)t')

0     True
1    False
2     True
dtype: bool
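
To see why the parentheses matter, it may help to compare a grouped and an ungrouped pattern.  The sketch below uses Python's built-in `re` module rather than pandas, and the extra test strings 'ba' and 'it' are added here purely for illustration.

```python
import re

words = ['bat', 'bot', 'bit', 'ba', 'it']

# With parentheses: 'b', then 'a' or 'i', then 't'.
grouped = [re.fullmatch(r'b(a|i)t', w) is not None for w in words]

# Without parentheses, | splits the whole pattern: 'ba' or 'it'.
ungrouped = [re.fullmatch(r'ba|it', w) is not None for w in words]

print(grouped)    # [True, False, True, False, False]
print(ungrouped)  # [False, False, False, True, True]
```

Because | applies to everything on either side of it, parentheses are what limit the choice to a single position in the string.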

- . means any character other than a newline

In [13]:
s2.str.fullmatch('b.t')

0    True
1    True
2    True
dtype: bool
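
A short sketch of the edge cases for the dot, again using Python's built-in `re` module as a stand-in for the pandas call.  The test strings are made up for this illustration.  It shows that the dot rejects a newline, and that putting a backslash in front of the dot makes it match only a literal period.

```python
import re

# The dot accepts any single character except a newline.
matches_dash = re.fullmatch(r'b.t', 'b-t') is not None       # True
matches_newline = re.fullmatch(r'b.t', 'b\nt') is not None   # False

# An escaped dot matches only a literal period.
literal_hit = re.fullmatch(r'b\.t', 'b.t') is not None       # True
literal_miss = re.fullmatch(r'b\.t', 'bat') is not None      # False

print(matches_dash, matches_newline, literal_hit, literal_miss)
```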

- \d represents any digit
- \\. means a literal period (the backslash removes the special meaning of .)

In [26]:
s3 = pd.Series(['3.45', '23.0.5', '45.0','555-234-5555'])

In [27]:
s3.str.fullmatch(r'\d+\.\d\d')  # raw string keeps the backslashes intact

0     True
1    False
2    False
3    False
dtype: bool

In [28]:
s3.str.fullmatch(r'\d+(\.\d+)?')  # the decimal part is optional

0     True
1    False
2     True
3    False
dtype: bool

In [29]:
s3.str.fullmatch(r'\d\d\d-\d\d\d-\d\d\d\d')

0    False
1    False
2    False
3     True
dtype: bool

- {n} means exactly n of the preceding character or group

In [30]:
s3.str.fullmatch(r'\d{3}-\d{3}-\d{4}')

0    False
1    False
2    False
3     True
dtype: bool
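
The digit patterns above can be checked outside pandas as well.  The following is a minimal sketch with Python's built-in `re` module; the variable names are chosen here for illustration.  It also uses raw strings (the `r'...'` prefix), which keep backslashes such as the one in `\d` from being interpreted by Python before the pattern reaches the regular expression engine.

```python
import re

values = ['3.45', '23.0.5', '45.0', '555-234-5555']

# Two spellings of the same phone-number pattern.
phone_long = r'\d\d\d-\d\d\d-\d\d\d\d'
phone_short = r'\d{3}-\d{3}-\d{4}'   # {n} repeats the preceding item n times

long_result = [re.fullmatch(phone_long, v) is not None for v in values]
short_result = [re.fullmatch(phone_short, v) is not None for v in values]

# Digits, optionally followed by a period and more digits.
decimal_result = [re.fullmatch(r'\d+(\.\d+)?', v) is not None for v in values]

# In a raw string, \d stays as two characters: a backslash and a 'd'.
raw_length = len(r'\d')

print(long_result)
print(short_result)
print(decimal_result)
print(raw_length)
```

Both phone patterns should accept only the last value, which suggests that `{n}` is simply a shorter way to write the repeated `\d`.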

That's enough to get you started.  There are some excellent regular expression cheat sheets online.  We'll leave a link for one that we like.