# Regular Expressions
## re - Python's regular expression module

### re.match(pattern, string)

It looks for the pattern only at the beginning of the string. It returns a match object if the pattern is found there, and None otherwise.

In [1]:
import re

result = re.match('AV', 'AV Analytics Vidhya AV')

In [2]:
result

<re.Match object; span=(0, 2), match='AV'>

Put “r” at the start of the pattern string to make it a Python raw string, in which backslashes are not treated as escape characters.

In [3]:
result = re.match(r'AV', 'AV Analytics Vidhya AV')
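The pattern 'AV' contains no backslashes, so the raw-string prefix makes no difference there. A minimal sketch of a case where it does matter, assuming a pattern that uses the word-boundary token \b (not part of the original example):

```python
import re

text = 'AV Analytics Vidhya AV'

# Raw string: \b reaches the regex engine as a word-boundary assertion.
raw_match = re.search(r'\bAV\b', text)

# Normal string: Python turns '\b' into the backspace character first,
# so the pattern looks for backspace + 'AV' + backspace and finds nothing.
plain_match = re.search('\bAV\b', text)

print(raw_match)
print(plain_match)
```

The first call finds 'AV' at the start of the text, while the second returns None.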

Use the group() method to return the matched string.

In [4]:
result.group(0)

'AV'

The string does not start with ‘Analytics’, so match() returns None, and calling group() on that result raises an error.

In [5]:
result = re.match(r'Analytics', 'AV Analytics Vidhya AV')

In [7]:
result.group(0)

AttributeError: 'NoneType' object has no attribute 'group'
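To avoid this error, the result can be checked before group() is called. A minimal sketch, where the variable name matched_text is only illustrative:

```python
import re

result = re.match(r'Analytics', 'AV Analytics Vidhya AV')

# re.match returns None when the pattern is not found at the start,
# so test the result before using any match-object methods.
if result is None:
    matched_text = None
    print('No match at the start of the string')
else:
    matched_text = result.group(0)
    print(matched_text)
```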

The start() and end() methods return the start and end positions of the matched pattern in the string.

In [29]:
result = re.match(r'AV', 'AV Analytics Vidhya AV')

In [31]:
result.end()

2

In [30]:
result.start()

0
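The two positions can also be read together with span(), and they work directly as slice indices because end() is the position just past the match. A short sketch under those assumptions:

```python
import re

text = 'AV Analytics Vidhya AV'
result = re.match(r'AV', text)

# span() returns (start, end) as a tuple.
start, end = result.span()

# Slicing the original string with these positions gives the matched text.
matched = text[start:end]

print(start, end, matched)
```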

### re.search(pattern, string)

It is similar to match(), but it does not restrict matches to the beginning of the string. Unlike the previous method, searching for the pattern ‘Analytics’ here returns a match.

In [32]:
result = re.search(r'Analytics', 'AV Analytics Vidhya AV')

In [34]:
result.group(0)

'Analytics'

But it only returns the first occurrence of the search pattern.
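A small sketch of that behaviour: the string contains 'AV' twice, but search() reports only the first position. The call to re.finditer is added here purely for comparison and is not covered elsewhere in this section.

```python
import re

text = 'AV Analytics Vidhya AV'

# search() stops at the first occurrence, even though 'AV' appears twice.
first = re.search(r'AV', text)
first_span = first.span()

# For comparison: finditer() yields a match object for every occurrence.
all_spans = [m.span() for m in re.finditer(r'AV', text)]

print(first_span)
print(all_spans)
```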

### re.findall(pattern, string)

It returns a list of all non-overlapping matches of the pattern, wherever they occur in the string. If we use findall() to search for ‘AV’ in the given string, it returns both occurrences of AV. Because it scans the whole string, it covers what re.match() and re.search() find, although it returns strings rather than match objects.

In [35]:
result = re.findall(r'AV', 'AV Analytics Vidhya AV')

In [36]:
result

['AV', 'AV']
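If only matches at the start or end of the string are wanted, the pattern itself can be anchored. A minimal sketch, assuming the standard ^ and $ anchors, which this section does not otherwise introduce:

```python
import re

text = 'AV Analytics Vidhya AV'

everywhere = re.findall(r'AV', text)            # every occurrence
at_start = re.findall(r'^AV', text)             # only at the beginning
at_end = re.findall(r'AV$', text)               # only at the end
none_at_start = re.findall(r'^Analytics', text) # empty list, not None

print(everywhere, at_start, at_end, none_at_start)
```

Note that findall() returns an empty list when nothing matches, so the result can be used without a None check.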

### re.split(pattern, string, [maxsplit=0])

This method splits <i>string</i> at each occurrence of the given pattern.

In [37]:
result = re.split(r'y', 'Analytics')

In [38]:
result

['Anal', 'tics']

Above, we have split the string “Analytics” by “y”. The split() method has another argument, “maxsplit”, with a default value of zero. With the default, it makes as many splits as possible; if we give maxsplit a non-zero value, it makes at most that many splits and returns the rest of the string as the final element. Let’s look at the example below:

In [39]:
result=re.split(r'i','Analytics Vidhya')

In [40]:
result

['Analyt', 'cs V', 'dhya']

In [41]:
result=re.split(r'i','Analytics Vidhya',maxsplit=1)

In [42]:
result

['Analyt', 'cs Vidhya']
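Because the separator is a pattern, split() can also handle several delimiters at once. A sketch assuming a made-up input string with commas and semicolons:

```python
import re

# Hypothetical input mixing two kinds of separator.
text = 'AV, Analytics;Vidhya, AV'

# Split on a comma or semicolon, plus any whitespace that follows it.
all_parts = re.split(r'[,;]\s*', text)

# With maxsplit=2 only the first two separators are used;
# the remainder is kept intact as the last element.
first_two = re.split(r'[,;]\s*', text, maxsplit=2)

print(all_parts)
print(first_two)
```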

### re.sub(pattern, repl, string)

It searches for a pattern and replaces every match with a new substring. If the pattern is not found, the string is returned unchanged.

In [43]:
result=re.sub(r'India','the World','AV is largest Analytics community of India')

In [44]:
result

'AV is largest Analytics community of the World'
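Two further cases, sketched with the same sentence: a pattern that does not occur, and the optional count argument of re.sub, which limits how many matches are replaced. The pattern 'Europe' is just an illustrative value.

```python
import re

text = 'AV is largest Analytics community of India'

# 'Europe' does not occur, so the original string comes back unchanged.
unchanged = re.sub(r'Europe', 'the World', text)

# count=1 replaces only the first lowercase 'a' (the one in 'largest').
replaced_once = re.sub(r'a', 'A', text, count=1)

print(unchanged)
print(replaced_once)
```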

### re.compile(pattern)

We can compile a regular expression pattern into a pattern object, which can then be used for matching. This lets us reuse the same pattern in several searches without rewriting it.

In [45]:
pattern=re.compile('AV')

In [46]:
result=pattern.findall('AV Analytics Vidhya AV')

In [47]:
result

['AV', 'AV']

In [48]:
result2=pattern.findall('AV is largest analytics community of India')

In [49]:
result2

['AV']
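A compiled pattern object offers the same operations as the module-level functions (search, findall, sub and so on). The sketch below also passes the re.IGNORECASE flag to compile(), an option not shown above, so that one pattern matches both 'Analytics' and 'analytics':

```python
import re

# Compile once, with case-insensitive matching, then reuse.
pattern = re.compile(r'analytics', re.IGNORECASE)

found = pattern.findall('AV Analytics Vidhya AV')
first = pattern.search('AV is largest analytics community of India')
replaced = pattern.sub('data', 'AV Analytics Vidhya AV')

print(found)
print(first.group(0))
print(replaced)
```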