# Regex Cheat Sheet

A regular expression is a sequence of characters that defines a search pattern. Such patterns are typically used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. The technique was developed in theoretical computer science and formal language theory.
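
The three uses named above (find, find and replace, validation) map roughly onto `re.search`, `re.sub` and `re.fullmatch`. The following is a minimal sketch; the sample string and the date format are assumptions chosen for illustration.

```python
import re

sample = 'Order 66 shipped on 2024-01-15'  # hypothetical sample string

# Find: re.search returns a match object, or None when nothing matches
found = re.search(r'\d{4}-\d{2}-\d{2}', sample)
print(found.group(0) if found else 'no date')

# Find and replace: re.sub returns a new string with every match replaced
masked = re.sub(r'\d', '#', sample)
print(masked)

# Validation: re.fullmatch succeeds only if the whole string fits the pattern
is_valid = re.fullmatch(r'\d{4}-\d{2}-\d{2}', '2024-01-15') is not None
print(is_valid)
```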

### Search for the pattern 'brand'. If found, print 'found'; otherwise print 'not found'


In [18]:
import re

text = 'Apple is a tech brand, not a fruit'
result = re.search(r'brand', text)

# re.search returns None when there is no match, so test the match object itself
if result:
    print(result.group(0))
    print('found')
else:
    print('not found')

brand
found


### Split the given text at @

In [13]:
phrase = 'the list of given email are harry@gmail.com , ron@hotmail.com'
result=re.split(r'@',phrase)
result


['the list of given email are harry', 'gmail.com , ron', 'hotmail.com']
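
Splitting at `@` separates the local part from the domain but leaves the surrounding words attached. If the goal is the addresses themselves, `re.findall` with an address-shaped pattern may be a better fit. The pattern below is a deliberately simplified assumption, not a complete email validator.

```python
import re

phrase = 'the list of given email are harry@gmail.com , ron@hotmail.com'

# Simplified shape: local part, '@', domain label, '.', rest of the domain
email_pattern = r'[\w.+-]+@[\w-]+\.[\w.-]+'
emails = re.findall(email_pattern, phrase)
print(emails)
```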

### Remove the punctuation from the given sentence

In [25]:
test_phrase = 'hi! are you doing good?'
result= re.sub(r'[!?]','', test_phrase)
print(result)

hi are you doing good
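
The class `[!?]` covers only the two marks that occur in this sentence. A more general sketch, assuming ASCII punctuation is all that needs removing, builds the class from `string.punctuation`:

```python
import re
import string

test_phrase = 'hi! are you doing good?'

# re.escape protects characters such as ] and \ that are special inside a class
punct_class = '[' + re.escape(string.punctuation) + ']'
result = re.sub(punct_class, '', test_phrase)
print(result)
```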


### Write test patterns that match a sequence of
- lowercase letters,
- uppercase letters,
- lowercase or uppercase letters,
- one uppercase letter followed by lowercase letters

In [53]:
import re
test_phrase = 'Regex Is difficult to grasp in short period of time'

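pattern1 = r'[a-z]+'        # one or more lowercase letters
pattern2 = r'[A-Z]+'        # one or more uppercase letters
pattern3 = r'[A-Za-z]+'     # letters of either case ('|' inside [] would be a literal pipe)
pattern4 = r'[A-Z][a-z]+'   # one uppercase letter followed by lowercase letters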

result1 = re.findall(pattern1, test_phrase)
result2 = re.findall(pattern2, test_phrase)
result3 = re.findall(pattern3, test_phrase)
result4 = re.findall(pattern4, test_phrase)

print(result1)
print(result2)
print(result3)
print(result4)






['egex', 's', 'difficult', 'to', 'grasp', 'in', 'short', 'period', 'of', 'time']
['R', 'I']
['Regex', 'Is', 'difficult', 'to', 'grasp', 'in', 'short', 'period', 'of', 'time']
['Regex', 'Is']
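
A common slip when combining ranges is to write `|` inside a character class. Inside `[...]` the pipe is an ordinary character rather than alternation, as this small sketch with a hypothetical input shows:

```python
import re

sample = 'a|b'  # hypothetical input containing a literal pipe

# Inside [...], '|' is an ordinary character, so the pipe is matched too
with_pipe = re.findall(r'[A-Z|a-z]+', sample)
print(with_pipe)

# Listing the ranges side by side matches letters only
letters_only = re.findall(r'[A-Za-z]+', sample)
print(letters_only)
```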


### Write test patterns that match a sequence of
- non-digits
- whitespace
- non-whitespace
- alphanumeric characters
- non-alphanumeric characters

In [158]:
import re
test_phrase = 'This is a string with some numbers 1233 and a symbol #hashtag'
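pattern1 = r'[^0-9]+'  # non-digits
pattern2 = r'\s+'      # whitespace
pattern3 = r'\S+'      # non-whitespace
pattern4 = r'\w+'      # alphanumeric characters (and underscore)
pattern5 = r'\W+'      # anything that is not alphanumeric or underscore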

result1 = re.findall(pattern1, test_phrase)
result2 = re.findall(pattern2, test_phrase)
result3 = re.findall(pattern3, test_phrase)
result4 = re.findall(pattern4, test_phrase)
result5 = re.findall(pattern5, test_phrase)

print(result1)
print(result2)
print(result3)
print(result4)
print(result5)

['This is a string with some numbers ', ' and a symbol #hashtag']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
['This', 'is', 'a', 'string', 'with', 'some', 'numbers', '1233', 'and', 'a', 'symbol', '#hashtag']
['This', 'is', 'a', 'string', 'with', 'some', 'numbers', '1233', 'and', 'a', 'symbol', 'hashtag']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' #']


### Find the date from the given text

In [18]:
text="is your birthday in 12-12-1992"
pattern = r'\d{1,2}-\d{1,2}-\d{4}'  # day-month-year, e.g. 12-12-1992
result = re.findall(pattern, text)
print(result)

['12-12-1992']


### Write a regex that finds every word that starts with 'bat'

In [6]:
import re
text="Here comes the batman in batmobile"
pattern = r'\bbat\w*'  # word boundary, then 'bat', then the rest of the word
result = re.findall(pattern, text)
print(result)

['batman', 'batmobile']


### Write a regex that finds all the words that end with 'at'

In [8]:
import re
text="The cat in the hat sat flat on the mat."
# \w*at\b keeps the whole word and requires 'at' to sit at the end of it
match=re.findall(r"\b\w*at\b",text)
match

['cat', 'hat', 'sat', 'flat', 'mat']

### Find the first name and the last name of the person

In [12]:
text="First Name: Travis Last Name: Baker"
# Capture the word that follows each label instead of hard-coding the names
match1=re.findall(r'First Name:\s*(\w+)',text)
match2=re.findall(r'Last Name:\s*(\w+)',text)
print(match1, match2)


['Travis'] ['Baker']


### Find all the words with an internal o

In [157]:
text="the quick brown fox over "
# \w+o\w+ requires at least one word character on each side of the o
match1=re.findall(r'\b\w+o\w+\b',text)
match1

['brown', 'fox']

### Find the words that begin or end with o

In [36]:
text="quick brown fox jumps over a lazy dog"
# \b anchors the o to the start or the end of a word
match1=re.findall(r"\bo\w*|\w*o\b",text)
match1

['over']

### Print the answer in the given format. Hint: use re.sub
'the quick =brown= =fox= jumped =over= the lazy =dog='


In [127]:
text="the quick brown fox jumped over the lazy dog"
# Wrap every word that contains an o in '=' signs
result= re.sub(r'\b(\w*o\w*)\b',r'=\1=', text)
print(result)

the quick =brown= =fox= jumped =over= the lazy =dog=


### Get the month from the given text

In [150]:
text="bill gate was born in 28-10-1955 and will was born in 1-02-2000"
pattern = r'-(\d{1,2})-'  # capture only the digits between the two hyphens
result = re.findall(pattern, text)
print(result)


['10', '02']
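
When more than one part of the date is needed, named groups keep the pieces labelled. This is a sketch that assumes every date in the text is written day-month-year with a four-digit year; the group names `day`, `month` and `year` are illustrative choices.

```python
import re

text = "bill gate was born in 28-10-1955 and will was born in 1-02-2000"

# Named groups label each part of a day-month-year date
date_pattern = re.compile(r'(?P<day>\d{1,2})-(?P<month>\d{1,2})-(?P<year>\d{4})')

dates = [m.groupdict() for m in date_pattern.finditer(text)]
for d in dates:
    print(d['day'], d['month'], d['year'])
```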


### Split before every o

In [56]:
text="python is an object oriented langugae"
# Splitting on ' o' consumes the space and the o, so both are lost from the pieces
result=re.split(r'\so',text)
result


['python is an', 'bject', 'riented langugae']

In [59]:
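# A lookahead matches the empty position before each o, so nothing is consumed
# (splitting on an empty match requires Python 3.7 or later)
h=re.compile("(?=o)")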

h.split(text)

['pyth', 'on is an ', 'object ', 'oriented langugae']
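
The mirror image of a lookahead split is a lookbehind split, which cuts just after a character instead of just before it. A minimal sketch with a hypothetical comma-separated input (like the lookahead version, it relies on Python 3.7 or later because one of the split points is an empty match):

```python
import re

csv_line = 'one, two,three'  # hypothetical input

# (?<=,) asserts that a comma sits just before the split point, so the comma
# stays attached to the piece on its left; \s* then consumes any spaces
pieces = re.split(r'(?<=,)\s*', csv_line)
print(pieces)
```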

### Write a pattern that finds dogs and dog

In [53]:
import re
test_string = "I like dogs but my dog doesn't like me."
# 's?' makes the plural optional; \b also matches at the very start of the string
match1=re.findall(r"\bdogs?\b",test_string)
match1


['dogs', 'dog']

### Find all the p's and q's in the test string below.

In [63]:

test_string = "Quick, pizzaz shop is closing. Is this a path to queensland?"
mat=re.findall(r'[pqQ]',test_string)
mat


['Q', 'p', 'p', 'p', 'q']

### Find everything except the letter t

In [77]:
test_string = 'the quick brown fox jumped over the lazy dog'
# '+' rather than '*' avoids the empty matches that occur at each t and at the end
mat=re.findall(r'[^t]+',test_string)
mat

['he quick brown fox jumped over ', 'he lazy dog']

### Find all the ^ characters in the following test sentence.

In [83]:


test_string = """You can match the characters not listed within the class by complementing the set. 

This is indicated by including a ^ as the first character of the class; 

^ outside a character class will simply match the ^ character. 

For example, [^5] will match any character except 5"""

mat=re.findall(r"\^",test_string)
mat



['^', '^', '^', '^']

### Find all three digit prices in the following test sentence. 

In [80]:
test_string = 'The Mac book cost over $999, while the windows system can be bought for less than $550.'
# Exactly three digits after a dollar sign; \b rejects longer numbers
match=re.findall(r'\$(\d{3})\b',test_string)
print(match)

['999', '550']


### Find all prices in the following test sentence.

In [159]:
test_string = """The iPhone X costs over $999, while the Android cost around $550.

Apple's MacBook Pro costs $1200, while razer blade cost $2500.

A new charger for iphone cost over $30.

Bose headphone cost around $390

"""

# Require the dollar sign so that other numbers in the text are not reported as prices
match=re.findall(r'\$(\d+)',test_string)
print(match)



['999', '550', '1200', '2500', '30', '390']
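
Real price strings often carry thousands separators or cents. The sketch below extends the idea to those cases; the sample sentence and the exact price format (commas every three digits, optional decimal part) are assumptions for illustration.

```python
import re

# Hypothetical sentence with a thousands separator and decimal prices
sample = 'MacBook Pro costs $1,200.50, a charger costs $30 and a case costs $15.99'

# Digits, optional ',ddd' groups, optional decimal part
price_pattern = r'\$(\d+(?:,\d{3})*(?:\.\d+)?)'

# Strip the commas before converting each captured price to a number
prices = [float(p.replace(',', '')) for p in re.findall(price_pattern, sample)]
print(prices)
```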


### Find every occurrence of 'better' in the given text

In [84]:
robot_string = '''The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
'''

match1=re.findall(r"\bbetter\b",robot_string)
match1

['better',
 'better',
 'better',
 'better',
 'better',
 'better',
 'better',
 'better']

### Find the words that start with 'bat' and end with 'man'

In [119]:
import re
test_string = 'Batman is the best movie, batwoman never heard about it.'

# The word must start with 'Bat' or 'bat' and also end with 'man'
match1=re.findall(r"\b[Bb]at\w*man\b",test_string)
match1



['Batman', 'batwoman']

### Remove the year range from the given string


In [125]:
str1="Another You (1991-2000)"

# Match the parenthesised year range as a whole rather than as a set of characters
result= re.sub(r'\(\d{4}-\d{4}\)','', str1)
print(result)

Another You 


### Write a regex to find the url_name only

In [148]:
name="http://www.zombie-bites.com" # expected output: zombie-bites
pattern = r'www\.([a-z0-9-]+)\.'  # capture the label between 'www.' and the next dot
result = re.findall(pattern, name)
print(result)


['zombie-bites']
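
For URLs in general, a regex is not the only option. As an alternative technique, the standard library's `urllib.parse` can isolate the host first, after which plain string handling is enough. This sketch assumes the name of interest is the first host label after an optional `www`.

```python
from urllib.parse import urlparse

name = "http://www.zombie-bites.com"

host = urlparse(name).hostname  # the host part of the URL, without scheme or path
labels = host.split('.')

# Drop a leading 'www' label if present, then take the first remaining label
if labels[0] == 'www':
    labels = labels[1:]
url_name = labels[0]
print(url_name)
```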
