# Task Automation

### **Regular Expressions**

Regular expressions let you specify a pattern to search for. In this case, the pattern is a phone number.

First we import the **regular expressions module**, re.

In [2]:
import re

In [3]:
message = 'Call me at 617-990-9998, or if I dont answer there give me a try at 781-560-2210'
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
matchObject = phoneNumRegex.search(message)
print('The first number found is\n', matchObject.group())

The first number found is
 617-990-9998


After the pattern is compiled, phoneNumRegex.search(message) returns a match object for the first instance of that pattern, or None if the pattern does not occur in the string.
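Because `search` returns `None` when nothing matches, calling `.group()` on the result directly would raise an `AttributeError`. The following is a minimal sketch of a guard against that; the helper name `first_phone_number` is hypothetical and introduced only for illustration.

```python
import re

phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

def first_phone_number(text):
    """Return the first phone number found in text, or None if there is none.

    Hypothetical helper, not part of the re module.
    """
    matchObject = phoneNumRegex.search(text)
    if matchObject is None:
        return None
    return matchObject.group()

print(first_phone_number('Call me at 617-990-9998'))
print(first_phone_number('No number here'))
```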

In [4]:
message = 'Call me at 617-990-9998, or if I dont answer there give me a try at 781-560-2210'
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
print(phoneNumRegex.findall(message))

['617-990-9998', '781-560-2210']


After the pattern is compiled, phoneNumRegex.findall(message) returns a list of strings, one for every instance of that pattern.
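If you need to know where each number sits in the text as well as what it is, `finditer` is one option: it yields a match object per match instead of a plain string. This is a small sketch reusing the same message and pattern; the variable name `found` is just an illustrative choice.

```python
import re

message = 'Call me at 617-990-9998, or if I dont answer there give me a try at 781-560-2210'
phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

# finditer yields one match object per non-overlapping match, left to right.
# start() gives the index in message where that match begins.
found = [(m.start(), m.group()) for m in phoneNumRegex.finditer(message)]

for position, number in found:
    print(position, number)
```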


* You can also use parentheses to group parts of a pattern. In this case the groups are (area code)-(phone number). group(1) returns the first group and group(2) the second.

In [5]:
message = 'Call me at 617-990-9998, or if I dont answer there give me a try at 781-560-2210'
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
matchObject = phoneNumRegex.search(message)

print(matchObject.group(1))
print(matchObject.group(2))

617
990-9998
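Two related behaviours are worth seeing next to the grouped pattern: `groups()` returns all groups at once as a tuple, and `findall` returns a list of tuples (one per match) when the pattern contains groups. A short sketch with the same message; the names `areaCode` and `mainNumber` are illustrative.

```python
import re

message = 'Call me at 617-990-9998, or if I dont answer there give me a try at 781-560-2210'
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
matchObject = phoneNumRegex.search(message)

# group(0), like group() with no argument, is the whole match.
wholeMatch = matchObject.group(0)

# groups() returns every group of the first match as a tuple.
areaCode, mainNumber = matchObject.groups()

# With groups in the pattern, findall returns tuples instead of strings.
allPairs = phoneNumRegex.findall(message)

print(wholeMatch)
print(areaCode, mainNumber)
print(allPairs)
```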


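* The pipe character (|) matches one of several alternatives. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'. Inside parentheses, the alternatives can share a common prefix, such as 'Bat' in the example here.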

In [6]:
batRegex = re.compile(r'Bat(man|mobile|copter|bat)')
matchObject = batRegex.search('Batmobile lost a wheel')
print(matchObject.group())
print(matchObject.group(1))

Batmobile
mobile
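The r'Batman|Tina Fey' pattern mentioned above can be tried directly. When both alternatives appear in the text, `search` returns whichever occurs first in the string. A small sketch; `heroRegex` and the sample sentences are illustrative.

```python
import re

heroRegex = re.compile(r'Batman|Tina Fey')

# Both names appear in each string; the one that occurs first is returned.
first = heroRegex.search('Batman and Tina Fey')
second = heroRegex.search('Tina Fey and Batman')

print(first.group())
print(second.group())
```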


* A specific number of repetitions can be matched using curly brackets. Here (Ha){3} matches exactly three repetitions of 'Ha':

In [9]:
haRegex = re.compile(r'(Ha){3}')
matchObject1 = haRegex.search('HaHaHaHaHaHa')
print(matchObject1.group())

HaHaHa
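Curly brackets also accept a range, written {min,max}. By default the match is greedy, so it takes as many repetitions as the range allows; adding a ? after the closing bracket makes it take as few as possible. A brief sketch, with `greedyRegex` and `lazyRegex` as illustrative names:

```python
import re

text = 'HaHaHaHaHaHa'  # six repetitions of 'Ha'

# Greedy: match between 3 and 5 repetitions, preferring the most.
greedyRegex = re.compile(r'(Ha){3,5}')
greedyMatch = greedyRegex.search(text)

# Non-greedy: same range, but prefer the fewest repetitions.
lazyRegex = re.compile(r'(Ha){3,5}?')
lazyMatch = lazyRegex.search(text)

print(greedyMatch.group())
print(lazyMatch.group())
```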


* We can also match zero or more repetitions using the asterisk (*)

* In the pattern here, the group "wo" may appear zero or more times between "Bat" and "man"

In [23]:
batRegex = re.compile(r'Bat(wo)*man')

In [24]:
matchObject1 = batRegex.search('The Adventures of Batman')
matchObject1.group()

'Batman'

In [25]:
matchObject2 = batRegex.search('The Adventures of Batwoman')
matchObject2.group()

'Batwoman'

In [26]:
matchObject3 = batRegex.search('The Adventures of Batwowowowoman')
matchObject3.group()

'Batwowowowoman'

While the asterisk (*) means match zero or more repetitions, the plus (+) means match one or more. With the plus, the group must appear at least once.

In [27]:
batRegex = re.compile(r'Bat(wo)+man')

In [28]:
matchObject1 = batRegex.search('The Adventures of Batwoman')
matchObject1.group()

'Batwoman'

In [29]:
matchObject2 = batRegex.search('The Adventures of Batwowowowoman')
matchObject2.group()

'Batwowowowoman'

In [30]:
matchObject3 = batRegex.search('The Adventures of Batman')
matchObject3 is None

True
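To see the difference between the asterisk and the plus in one place, the two patterns can be run over the same inputs. This is only a comparison sketch; `starRegex`, `plusRegex`, and `results` are illustrative names.

```python
import re

starRegex = re.compile(r'Bat(wo)*man')  # zero or more 'wo'
plusRegex = re.compile(r'Bat(wo)+man')  # one or more 'wo'

results = {}
for word in ['Batman', 'Batwoman', 'Batwowowowoman']:
    # Record whether each pattern matched this word.
    starMatched = starRegex.search(word) is not None
    plusMatched = plusRegex.search(word) is not None
    results[word] = (starMatched, plusMatched)
    print(word, starMatched, plusMatched)
```

Only 'Batman' separates the two: it contains no "wo", which the asterisk allows and the plus does not.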

### Project: **email and phone number finder**

In [3]:
#  email and phone number finder
import re

x = input('Enter text here..')

Phone_Number_Regex = re.compile(r'\d{3}-\d{3}-\d{4}')
match_Object = Phone_Number_Regex.findall(x)
print(match_Object)

# \. matches a literal dot; \W+ would accept any run of non-word characters.
Email_Regex = re.compile(r'\w+@\w+\.\w+')
match_Object2 = Email_Regex.findall(x)
print(match_Object2)

Enter text here..my work number is 786-123-4456 you may also find me at 777-898-0007. my email is sadie@jhon.com or alternatively sadie@comcast.net
['786-123-4456', '777-898-0007']
['sadie@jhon.com', 'sadie@comcast.net']
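For reuse and testing, the same finder can be wrapped in a function that takes the text as an argument instead of calling input(). This is a sketch under stated assumptions: `find_contacts` is a hypothetical name, and the email pattern is a deliberately simple approximation (it allows dots, plus signs, and hyphens in the local part and multi-part domains), not a full validator.

```python
import re

PHONE_REGEX = re.compile(r'\d{3}-\d{3}-\d{4}')
# Simplified email pattern (an assumption for this sketch, not a complete one).
# The domain group is non-capturing so findall returns the whole address.
EMAIL_REGEX = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')

def find_contacts(text):
    """Return (phone_numbers, emails) found in text, each as a list of strings."""
    return PHONE_REGEX.findall(text), EMAIL_REGEX.findall(text)

sample = ('my work number is 786-123-4456 you may also find me at 777-898-0007. '
          'my email is sadie@jhon.com or alternatively sadie@comcast.net')
phones, emails = find_contacts(sample)
print(phones)
print(emails)
```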
