The **`re.compile()`** function creates and returns Regex (pattern) objects.

Regular expressions use the backslash character (`\`) to indicate special forms (special sequences such as `\d`) or to allow special characters (metacharacters such as `.` or `(`) to be used without invoking their special meaning. This collides with Python’s use of the same character for escape sequences in string literals. Hence, raw strings are used (e.g. `r"\n"`) so that backslashes do not have to be escaped a second time.
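As a minimal sketch of the raw-string point above, the snippet below compares a normal string literal with a raw one. The variable names and the sample text are illustrative assumptions, not part of the original answers.

```python
import re

# A normal string literal needs a doubled backslash; a raw string does not.
escaped = '\\d+'
raw = r'\d+'

# Both spellings produce the same pattern text.
same_pattern = (escaped == raw)

# In a normal literal '\n' is a single newline character; in a raw literal
# it is two characters: a backslash followed by the letter n.
newline_len = len('\n')
raw_len = len(r'\n')

digits = re.search(raw, 'order 66').group()
print(same_pattern, newline_len, raw_len, digits)
```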

The `re.search(pattern, string)` function returns a Match object if the pattern is found anywhere in the string; otherwise it returns `None`.
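A short sketch of both outcomes, assuming a hypothetical pattern and sample strings chosen only for illustration:

```python
import re

# Hypothetical pattern: three digits, a hyphen, four digits.
phone_regex = re.compile(r'\d{3}-\d{4}')

hit = phone_regex.search('Call 555-1234 today')
miss = phone_regex.search('No number here')

# Check for None before calling methods on the result.
if hit is not None:
    print('Found:', hit.group())
print('No match returns None:', miss is None)
```

Because a failed search returns `None`, calling `.group()` on the result without a check would raise `AttributeError`.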

For a successful match, the Match object's `group()` method returns the actual text that matched the pattern.

In the regex **`r'(\d\d\d)-(\d\d\d-\d\d\d\d)'`**, group 0 covers the entire match, whereas group 1 covers **`(\d\d\d)`** and group 2 covers **`(\d\d\d-\d\d\d\d)`**.
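The group numbering can be checked with a small sketch. The sample sentence and variable names are assumptions made for illustration.

```python
import re

phone_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phone_regex.search('My number is 415-555-4242.')

whole = mo.group(0)   # entire match (same as mo.group())
area = mo.group(1)    # first parenthesised group
main = mo.group(2)    # second parenthesised group
both = mo.groups()    # tuple of groups 1..n, excluding group 0

print(whole, area, main, both)
```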

The **`\.`**, **`\(`** and **`\)`** escape sequences in the raw string passed to `re.compile()` match a literal period and literal opening and closing parentheses.
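A brief sketch of escaped metacharacters, using made-up sample text: the escaped parentheses match literal characters, while the unescaped ones still create groups.

```python
import re

# \( and \) match literal parentheses; the inner ( ) still forms group 1.
paren_regex = re.compile(r'\((\d\d\d)\) (\d\d\d-\d\d\d\d)')
mo = paren_regex.search('Office: (415) 555-4242.')
full = mo.group(0)
area = mo.group(1)

# \. matches a literal period instead of "any character".
dot_regex = re.compile(r'\d+\.\d+')
decimal = dot_regex.search('version 3.14 and 314').group()

print(full, area, decimal)
```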

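If the regex pattern has no groups, `findall()` returns a list of the matched strings. If the pattern has groups, it returns a list of tuples of strings, one tuple per match and one string per group (with exactly one group, it returns a plain list of that group's strings).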
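The difference in return type can be seen in a small sketch. The sample text and variable names are illustrative assumptions.

```python
import re

text = 'Cell: 415-555-9999 Work: 212-555-0000'

# No groups: a list of whole-match strings.
no_groups = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d').findall(text)

# Several groups: a list of tuples, one string per group.
with_groups = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)').findall(text)

print(no_groups)
print(with_groups)
```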

In regular expressions, `|` is the `OR` (alternation) operator: it matches either the expression on its left or the one on its right.
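A minimal sketch of alternation, with hypothetical names and text. When both alternatives occur in the string, `search()` returns whichever appears first.

```python
import re

hero_regex = re.compile(r'Batman|Tina Fey')

first = hero_regex.search('Batman and Tina Fey').group()
second = hero_regex.search('Tina Fey and Batman').group()

print(first, '/', second)
```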

In regular expressions, the `?` character matches zero or one occurrence of the preceding group.
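As a sketch of the optional group, assuming a made-up pattern and sample strings:

```python
import re

# (wo)? makes the group "wo" optional: zero or one occurrence.
bat_regex = re.compile(r'Bat(wo)?man')

zero = bat_regex.search('The Adventures of Batman').group()
one = bat_regex.search('The Adventures of Batwoman').group()
two = bat_regex.search('The Adventures of Batwowoman')  # two repeats: no match

print(zero, one, two)
```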

In regular expressions, `*` matches zero or more occurrences of the preceding group, whereas `+` matches one or more occurrences of the preceding group.
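The two quantifiers differ only on the zero-occurrence case, which this sketch (with illustrative names and strings) makes visible:

```python
import re

star_regex = re.compile(r'Bat(wo)*man')  # zero or more "wo"
plus_regex = re.compile(r'Bat(wo)+man')  # one or more "wo"

star_zero = star_regex.search('Batman').group()
star_many = star_regex.search('Batwowowoman').group()
plus_zero = plus_regex.search('Batman')            # needs at least one "wo"
plus_one = plus_regex.search('Batwoman').group()

print(star_zero, star_many, plus_zero, plus_one)
```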

`{4}` means that the preceding group must repeat exactly 4 times, whereas `{4,5}` means that the preceding group must repeat at least 4 and at most 5 times, inclusive.
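A small sketch of counted repetition, assuming a hypothetical `(Ha)` group. Because the range form is greedy, it takes five repeats when five are available.

```python
import re

exact_regex = re.compile(r'(Ha){4}')     # exactly four repeats
range_regex = re.compile(r'(Ha){4,5}')   # four or five repeats

too_short = exact_regex.search('HaHaHa')                  # only three repeats
exact_hit = exact_regex.search('HaHaHaHaHaHa').group()
range_hit = range_regex.search('HaHaHaHaHaHa').group()

print(too_short, exact_hit, range_hit)
```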

1. **`\w`** – Matches a word character (letter, digit, or underscore); equivalent to `[a-zA-Z0-9_]` for ASCII text
2. **`\d`** – Matches a digit character; equivalent to `[0-9]` for ASCII text
3. **`\s`** – Matches a whitespace character (space, tab, newline, etc.)

1. **`\W`** – Matches any non-word character; equivalent to `[^a-zA-Z0-9_]` for ASCII text
2. **`\D`** – Matches any non-digit character; equivalent to `[^0-9]` for ASCII text
3. **`\S`** – Matches any non-whitespace character
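The six shorthand classes listed above can be compared side by side in a short sketch. The sample string is an assumption chosen so that each class finds something different.

```python
import re

text = 'Room 42, floor_3!'

digits = re.findall(r'\d', text)        # single digit characters
words = re.findall(r'\w+', text)        # runs of letters, digits, underscore
spaces = re.findall(r'\s', text)        # single whitespace characters

non_digits = re.findall(r'\D+', text)   # runs of anything except digits
non_words = re.findall(r'\W+', text)    # runs of anything except word characters
non_spaces = re.findall(r'\S+', text)   # runs of anything except whitespace

print(digits, words, spaces)
print(non_digits, non_words, non_spaces)
```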

**`.*`** is greedy: it matches the longest string that satisfies the pattern. **`.*?`** is non-greedy: it matches the shortest string that satisfies the pattern.
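A minimal sketch of the two modes on the same input. The angle-bracket text is an illustrative assumption.

```python
import re

text = '<To serve man> for dinner.>'

# Greedy: runs to the last '>' in the string.
greedy = re.search(r'<.*>', text).group()

# Non-greedy: stops at the first '>' it can.
lazy = re.search(r'<.*?>', text).group()

print(greedy)
print(lazy)
```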

The character class syntax is either **`[a-z0-9]`** or **`[0-9a-z]`**; both match any single lowercase letter or digit.
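The order of ranges inside a character class does not matter, which a quick sketch can confirm. The sample string is made up for illustration.

```python
import re

text = 'abc123 XYZ def-456'

letters_first = re.findall(r'[a-z0-9]+', text)
digits_first = re.findall(r'[0-9a-z]+', text)

# Uppercase letters, spaces and the hyphen are not in the class,
# so they split the runs.
print(letters_first)
print(letters_first == digits_first)
```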

We can pass **`re.IGNORECASE`** as a flag (the second argument to `re.compile()`) to make a regular expression case-insensitive.
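A short sketch of the flag's effect, with a hypothetical pattern and sample sentences:

```python
import re

insensitive = re.compile(r'robocop', re.IGNORECASE)
sensitive = re.compile(r'robocop')

mixed = insensitive.search('RoboCop is part man, part machine.').group()
upper = insensitive.search('ROBOCOP protects the innocent.').group()
no_flag = sensitive.search('ROBOCOP protects the innocent.')  # case matters here

print(mixed, upper, no_flag)
```

The match keeps the casing found in the text; the flag changes what is accepted, not what is returned. `re.I` is the short alias for the same flag.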

The dot character **`.`** matches any character in the input except the newline character **`\n`**. By passing **`re.DOTALL`** as a flag to **`re.compile()`**, you can make the dot match all characters, including the newline character.
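As a sketch of the difference, assuming a two-line sample string invented for this example:

```python
import re

text = 'Serve the public trust.\nProtect the innocent.'

# Default: '.' stops at the first newline.
first_line = re.compile(r'.*').search(text).group()

# re.DOTALL: '.' also matches the newline, so the match spans both lines.
everything = re.compile(r'.*', re.DOTALL).search(text).group()

print(repr(first_line))
print(repr(everything))
```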

The output will be **`'X drummers, X pipers, five rings, X hen'`**
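A substitution that yields an output like the one above can be sketched as follows. The input string is an assumption reconstructed from the quoted output, and `num_regex` is a hypothetical name.

```python
import re

# \d+ matches each run of digits; sub() replaces every match with 'X'.
num_regex = re.compile(r'\d+')

text = '12 drummers, 11 pipers, five rings, 3 hen'  # assumed input
result = num_regex.sub('X', text)

print(result)
```

The word "five" is left alone because it contains no digit characters.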

**`re.VERBOSE`** allows whitespace and comments to be added to the pattern string passed to **`re.compile()`**.
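A minimal sketch of a verbose pattern next to its compact equivalent. The phone-number layout and the variable names are assumptions for illustration.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment.
verbose_regex = re.compile(r'''
    (\d{3})   # area code
    -         # separator
    (\d{3})   # first three digits
    -         # separator
    (\d{4})   # last four digits
''', re.VERBOSE)

compact_regex = re.compile(r'(\d{3})-(\d{3})-(\d{4})')

text = 'Call 415-555-4242 now'
verbose_groups = verbose_regex.search(text).groups()
compact_groups = compact_regex.search(text).groups()

print(verbose_groups == compact_groups, verbose_groups)
```

To match a literal space or `#` in verbose mode, escape it with a backslash or put it inside a character class.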

In [12]:
import re
pattern = r'^\d{1,3}(,\d{3})*$'
pagex = re.compile(pattern)
for ele in ['42','1,234', '6,368,745','12,34,567','1234']:
    print('Output:',ele, '->', pagex.search(ele))

Output: 42 -> <re.Match object; span=(0, 2), match='42'>
Output: 1,234 -> <re.Match object; span=(0, 5), match='1,234'>
Output: 6,368,745 -> <re.Match object; span=(0, 9), match='6,368,745'>
Output: 12,34,567 -> None
Output: 1234 -> None


In [13]:
import re
pattern = r'[A-Z][a-z]*\sWatanabe'
namex = re.compile(pattern)
for name in ['Haruto Watanabe','Alice Watanabe','RoboCop Watanabe','haruto Watanabe','Mr. Watanabe','Watanabe','Haruto watanabe']:
    print('Output: ',name,'->',namex.search(name))

Output:  Haruto Watanabe -> <re.Match object; span=(0, 15), match='Haruto Watanabe'>
Output:  Alice Watanabe -> <re.Match object; span=(0, 14), match='Alice Watanabe'>
Output:  RoboCop Watanabe -> <re.Match object; span=(4, 16), match='Cop Watanabe'>
Output:  haruto Watanabe -> None
Output:  Mr. Watanabe -> None
Output:  Watanabe -> None
Output:  Haruto watanabe -> None


In [14]:
import re
pattern = r'(Alice|Bob|Carol)\s(eats|pets|throws)\s(apples|cats|baseballs)\.'
casex = re.compile(pattern,re.IGNORECASE)
for ele in ['Alice eats apples.','Bob pets cats.','Carol throws baseballs.','Alice throws Apples.','BOB EATS CATS.','RoboCop eats apples.'
,'ALICE THROWS FOOTBALLS.','Carol eats 7 cats.']:
    print('Output: ',ele,'->',casex.search(ele))

Output:  Alice eats apples. -> <re.Match object; span=(0, 18), match='Alice eats apples.'>
Output:  Bob pets cats. -> <re.Match object; span=(0, 14), match='Bob pets cats.'>
Output:  Carol throws baseballs. -> <re.Match object; span=(0, 23), match='Carol throws baseballs.'>
Output:  Alice throws Apples. -> <re.Match object; span=(0, 20), match='Alice throws Apples.'>
Output:  BOB EATS CATS. -> <re.Match object; span=(0, 14), match='BOB EATS CATS.'>
Output:  RoboCop eats apples. -> None
Output:  ALICE THROWS FOOTBALLS. -> None
Output:  Carol eats 7 cats. -> None
