# *<u> Regular Expressions </u>*
* You rarely need to write your own regex from scratch; check for an existing one first.
* Repositories of existing regular expressions:
    - https://regex101.com
    - http://www.regexlib.com
    - https://www.regular-expressions.info

In [2]:
import re

In [3]:
pattern = '02215'

In [4]:
'Hurray' if re.fullmatch(pattern, '02215') else 'Go Back Home Kid'

'Hurray'

In [5]:
'Hurray' if re.fullmatch(pattern, '51220') else 'Go Back Home Kid'

'Go Back Home Kid'
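The conditional expressions above work because `re.fullmatch` returns a match object (which is truthy) when the entire string matches the pattern, and `None` otherwise. A minimal sketch that makes this explicit; the helper name `check_zip` is a hypothetical one used only for illustration:

```python
import re

def check_zip(candidate, pattern='02215'):
    """Return True only if the entire candidate string matches pattern."""
    # fullmatch returns a Match object on success and None on failure.
    return re.fullmatch(pattern, candidate) is not None

print(check_zip('02215'))   # True
print(check_zip('51220'))   # False
print(check_zip('022155'))  # False: the extra trailing character breaks the full match
```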

## *<u> Regular Expression Meta Characters </u>*

# [] {} () \ * + ^ $ ? . |

### \d => Any digit (0 - 9)
### \D => Any character that is not a digit.
### \s => Any whitespace character, such as a space, tab or newline.
### \S => Any character that is not a whitespace character.
### \w => Any word character (also called an alphanumeric character): any uppercase or lowercase letter, any digit or an underscore.
### \W => Any character that is not a word character.
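As a quick check of the six predefined character classes listed above, the sketch below pairs each class with one character it should accept. The sample characters in `samples` are chosen here for illustration and are not from the original notebook.

```python
import re

# Each predefined class paired with a single character it should match.
samples = {
    r'\d': '7',    # a digit
    r'\D': 'x',    # not a digit
    r'\s': '\t',   # whitespace (a tab)
    r'\S': 'x',    # not whitespace
    r'\w': '_',    # word character (the underscore counts)
    r'\W': '!',    # not a word character
}

for pattern, text in samples.items():
    result = 'Match' if re.fullmatch(pattern, text) else 'No Match'
    print(f'{pattern!r} against {text!r}: {result}')

# A class and its uppercase counterpart are complements of each other.
print('Match' if re.fullmatch(r'\d', 'x') else 'No Match')
```

Every pair in the loop should report a match, while the final line should not, because `'x'` is not a digit.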

In [6]:
'Valid Zip Code' if re.fullmatch(r'\d{5}', '02215') else 'Not a Valid Zip Code'
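# The r prefix makes this a raw string, so Python passes the backslash to the regex engine unchanged.
# \d is the character class that matches any single digit.
# {5} is a quantifier: it repeats the preceding \d exactly 5 times.
# fullmatch succeeds only if the entire string consists of exactly 5 digits.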

'Valid Zip Code'

In [7]:
'Valid Zip Code' if re.fullmatch(r'\d{5}', '022145') else 'Not a Valid Zip Code'

'Not a Valid Zip Code'

In [8]:
'Valid Zip Code' if re.fullmatch(r'\d{5}', '0215') else 'Not a Valid Zip Code'

'Not a Valid Zip Code'
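The `r` prefix used in the zip code cells above only changes how Python reads the string literal; the regex engine receives the same characters either way. A small sketch comparing the raw string with its doubled-backslash equivalent (the variable names `raw` and `escaped` are illustrative):

```python
import re

raw = r'\d{5}'       # raw string: the backslash is kept as-is
escaped = '\\d{5}'   # ordinary string: the backslash must be doubled

print(raw == escaped)                        # True
print(len(raw))                              # 5
print(bool(re.fullmatch(escaped, '02215')))  # True
```

Raw strings are usually the more readable choice for patterns that contain backslashes.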

In [11]:
'Valid' if re.fullmatch('[A-Z][a-z]*', 'Wally') else 'Invalid'
# [A-Z]: a custom character class that matches any one uppercase letter from A to Z.
# [a-z]: a custom character class that matches any one lowercase letter from a to z.
# *: a quantifier that applies to the [a-z] immediately before it and matches zero or more lowercase letters.
# Together, the pattern matches a string that starts with an uppercase letter followed by zero or more lowercase letters.

'Valid'

In [12]:
'Valid' if re.fullmatch('[A-Z][a-z]*', 'W') else 'Invalid' 

'Valid'

In [13]:
'Valid' if re.fullmatch('[A-Z][a-z]*', 'W1234') else 'Invalid' 

'Invalid'

In [14]:
'Valid' if re.fullmatch('[A-Z][a-z]*', 'eva') else 'Invalid' 

'Invalid'

In [15]:
'Valid' if re.fullmatch('[^a-z]*', 'W') else 'Invalid' 
# A caret (^) at the start of a custom character class negates it: [^a-z] matches any character that is NOT a lowercase letter.
# 'W' is not a lowercase letter, so [^a-z]* matches it and the result is valid.

'Valid'

In [16]:
'Valid' if re.fullmatch('[^a-z]*', 'b') else 'Invalid' 

'Invalid'
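Because `[^a-z]*` accepts zero or more characters that are not lowercase letters, it also matches digits, punctuation and even the empty string. A short sketch with a few extra inputs chosen here for illustration:

```python
import re

# Inputs picked to show what the negated class accepts and rejects.
for text in ['W', 'W1234', '', 'b', 'Wb']:
    result = 'Valid' if re.fullmatch('[^a-z]*', text) else 'Invalid'
    print(f'{text!r}: {result}')
```

The first three inputs contain no lowercase letters and are valid; `'b'` and `'Wb'` each contain one and are invalid.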

In [17]:
'Valid' if re.fullmatch('[*+$]', '*') else 'Invalid' 
# Inside a custom character class [], metacharacters such as *, + and $ are treated as literal characters.
# Here '*' is one of the characters listed in the class, so the match is valid.

'Valid'

In [18]:
'Valid' if re.fullmatch('[*+$]', '+') else 'Invalid' 
 

'Valid'

In [19]:
'Valid' if re.fullmatch('[*+$]', '$') else 'Invalid' 

'Valid'

In [20]:
'Valid' if re.fullmatch('[*+$]', '%') else 'Invalid' 

'Invalid'
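Outside a custom character class, the same metacharacters keep their special meaning and need a backslash to be matched literally. The sketch below contrasts the two cases and uses the standard library's `re.escape` to add the backslashes automatically; the inputs are illustrative.

```python
import re

# Outside a character class, * is a quantifier and must be escaped to match literally.
print(bool(re.fullmatch(r'\*', '*')))   # True
print(bool(re.fullmatch(r'\$', '$')))   # True

# An unescaped * with nothing before it is not a valid pattern.
try:
    re.fullmatch('*', '*')
except re.error as err:
    print('Invalid pattern:', err)

# re.escape inserts the backslashes needed to match a string literally.
escaped = re.escape('*+$')
print(bool(re.fullmatch(escaped, '*+$')))  # True
```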

### At Least One Match Using +

### The + and * quantifiers are called GREEDY quantifiers because they match as many characters as they can.

In [21]:
'Valid' if re.fullmatch('[A-Z][a-z]+', 'Eva') else 'Invalid' 
# Here the string must start with an uppercase letter, followed by at least one (+) lowercase letter.

'Valid'

In [22]:
'Valid' if re.fullmatch('[A-Z][a-z]+', 'EVA') else 'Invalid' 

'Invalid'

In [23]:
'Valid' if re.fullmatch('[A-Z][a-z]+', 'EVa') else 'Invalid' 

'Invalid'

In [24]:
'Valid' if re.fullmatch('[A-Z][a-z]+', 'E') else 'Invalid'

'Invalid'
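The only difference between `*` and `+` is whether zero repetitions are allowed, and both are greedy. The sketch below compares them on the same inputs and then illustrates greediness with two capturing groups; the groups and the sample string `'12345'` are assumptions added here for illustration.

```python
import re

# * accepts zero lowercase letters, + requires at least one.
for text in ['E', 'Eva']:
    star = bool(re.fullmatch('[A-Z][a-z]*', text))
    plus = bool(re.fullmatch('[A-Z][a-z]+', text))
    print(f'{text!r}: * -> {star}, + -> {plus}')

# Greedy behavior: the first group takes every digit it can,
# leaving nothing for the second group.
match = re.fullmatch(r'(\d+)(\d*)', '12345')
print(match.groups())  # ('12345', '')
```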

### Zero or One Match Using ?
### The ? quantifier matches zero or one occurrence of the subexpression that precedes it.

In [25]:
'Match' if re.fullmatch('labell?ed', 'labelled') else 'No Match'

'Match'

In [26]:
'Match' if re.fullmatch('labelll?ed', 'labelled') else 'No Match'

'Match'

In [27]:
'Match' if re.fullmatch('labell?ed', 'labellled') else 'No Match'

'No Match'

In [28]:
'Match' if re.fullmatch('labell?ed', 'labeed') else 'No Match'

'No Match'

In [29]:
'Match' if re.fullmatch('labell?ed', 'labeled') else 'No Match'

'Match'
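In `labell?ed`, the `?` applies only to the second `l`, so the pattern accepts both spellings of the word and nothing else. A brief sketch that also compares it with the equivalent `{0,1}` range quantifier:

```python
import re

for word in ['labeled', 'labelled', 'labellled']:
    with_question_mark = bool(re.fullmatch('labell?ed', word))
    with_range = bool(re.fullmatch('labell{0,1}ed', word))
    print(f'{word}: ? -> {with_question_mark}, {{0,1}} -> {with_range}')
```

Both patterns should agree on every input: the first two words match and the third, with three `l` characters, does not.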

### At Least n Matches Using {n,}: like GREATER THAN OR EQUAL TO

In [30]:
'Match' if re.fullmatch(r'\d{3,}', '123') else 'No Match'

'Match'

In [31]:
'Match' if re.fullmatch(r'\d{3,}', '12') else 'No Match'


'No Match'

In [32]:
'Match' if re.fullmatch(r'\d{3,}', '123456') else 'No Match'


'Match'

### Between n and m Matches Using {n,m}: limiting the count to a range

In [33]:
'Match' if re.fullmatch(r'\d{3,6}', '123456') else 'No Match'


'Match'

In [34]:
'Match' if re.fullmatch(r'\d{3,6}', '123') else 'No Match'


'Match'

In [35]:
'Match' if re.fullmatch(r'\d{3,6}', '12') else 'No Match'


'No Match'

In [36]:
'Match' if re.fullmatch(r'\d{3,6}', '1234569') else 'No Match'


'No Match'
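To see the open-ended `{3,}` and the bounded `{3,6}` quantifiers side by side, the sketch below runs both patterns over the same digit strings used in the cells above:

```python
import re

for text in ['12', '123', '123456', '1234569']:
    at_least_three = bool(re.fullmatch(r'\d{3,}', text))
    three_to_six = bool(re.fullmatch(r'\d{3,6}', text))
    print(f'{text!r}: at least 3 -> {at_least_three}, 3 to 6 -> {three_to_six}')
```

The two patterns differ only on the seven-digit input, which satisfies `{3,}` but exceeds the upper bound of `{3,6}`.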

# *<u> Self Check </u>*
* Create and test a regular expression that matches a street address consisting of a number with one or more digits followed by two words of one or more characters each. The tokens should be separated by one space each, as in 123 Main Street.

In [61]:
street = r'\d+ [A-Z][a-z]* [A-Z][a-z]*'

In [62]:
'Match' if re.fullmatch(street, '1038 Vivekananda Road') else 'No Match'

'Match'

In [63]:
'Match' if re.fullmatch(street, 'Vivekananda Road') else 'No Match'

'No Match'
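A few more test cases help confirm what the `street` pattern accepts. Note that it requires each word to start with an uppercase letter, which is stricter than the exercise wording. The extra addresses below are made up for illustration.

```python
import re

street = r'\d+ [A-Z][a-z]* [A-Z][a-z]*'

# Each address paired with whether the pattern is expected to match it.
cases = {
    '123 Main Street': True,
    '1038 Vivekananda Road': True,
    'Vivekananda Road': False,   # no leading number
    '123 Main': False,           # only one word
    '123  Main Street': False,   # two spaces between tokens
    '123 main street': False,    # words are not capitalized
}

for address, expected in cases.items():
    actual = re.fullmatch(street, address) is not None
    print(f'{address!r}: {"Match" if actual else "No Match"}')
    assert actual == expected
```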